from:"daniel.santos at pobox dot com"

[Bug c++/98441] [11 Regression] member function pointer incorrectly parsed as having trailing return type

2020-12-29 Thread daniel.santos at pobox dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98441

--- Comment #6 from Daniel Santos  ---
(In reply to Jonathan Wakely from comment #5)
> That's why you're asked to provide the output of 'gcc -v' by the
> instructions at https://gcc.gnu.org/bugs/ (because we can't guess that your
> 10.2.0 is different from ours).

You're correct; my apologies.  Sorry for the extra work!

[Bug c++/98441] [11 Regression] member function pointer incorrectly parsed as having trailing return type

2020-12-29 Thread daniel.santos at pobox dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98441

--- Comment #4 from Daniel Santos  ---
Created attachment 49850
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49850&action=edit
Gentoo gcc 10.2.0-r2 patches

(In reply to Jonathan Wakely from comment #3)
> (In reply to Daniel Santos from comment #0)
> > However, it builds on GCC 9 and is alleged to build on MSVC.  The above
> > example is simplified from the original sources:
> 
> Are you sure this fails with 10.2.0? I only see it fail with 11.0 and not
> gcc version 10.2.1 20201125 (but I didn't try a newer build from the gcc-10
> branch).

Well, yes, but this is Gentoo gcc-10.2.0-r2, so it includes a patchset
(attached).  In that, 34_all_fundecl-ICE-PR95820.patch contains the following:

It's an unofficial backport of PR95820 where gcc ICEs on
invalid syntax. As creduce frequently end up in these ICEs
as in #730406 let's backport it to gcc-10.

https://gcc.gnu.org/PR95820
https://bugs.gentoo.org/730406
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -12029,14 +12029,11 @@ grokdeclarator (const cp_declarator *declarator,

             /* Handle a late-specified return type.  */
             tree late_return_type = declarator->u.function.late_return_type;
-            if (funcdecl_p
-                /* This is the case e.g. for
-                   using T = auto () -> int.  */
-                || inner_declarator == NULL)
+            if (true)
               {
                 if (tree auto_node = type_uses_auto (type))
                   {
-                    if (!late_return_type)
+                    if (!late_return_type && funcdecl_p)
                       {
                         if (current_class_type
                             && LAMBDA_TYPE_P (current_class_type))

[Bug c++/98441] member function pointer incorrectly parsed as having trailing return type

2020-12-24 Thread daniel.santos at pobox dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98441

--- Comment #1 from Daniel Santos  ---
Also, I build gcc with:

-O42 -ffast-math -ffuzzy-dice -felide-function-bodies -pipe-clogged

but that shouldn't make a difference.

[Bug c++/98441] New: member function pointer incorrectly parsed as having trailing return type

2020-12-24 Thread daniel.santos at pobox dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98441

            Bug ID: 98441
           Summary: member function pointer incorrectly parsed as having
                    trailing return type
           Product: gcc
           Version: 10.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: daniel.santos at pobox dot com
  Target Milestone: ---

When declaring a pointer to a member function using auto& as the
function's return type, we get a bad parse:

struct a {
    int& mfn();
};

void fn()
{
    int&  (a::*myvar1)(void) = &a::mfn;
    auto& (a::*myvar2)(void) = &a::mfn;
    auto  (a::*myvar3)(void) = &a::mfn;
}

Results in:

: In function 'void fn()':
:8:5: error: 'myvar2' function with trailing return type has 'auto&' as
its type rather than plain 'auto'
    8 |     auto& (a::*myvar2)(void) = &a::mfn;
      |     ^~~~

However, it builds on GCC 9 and is alleged to build on MSVC.  The above example
is simplified from the original sources:

https://github.com/freeorion/freeorion/blob/v0.4.10.1/python/UniverseWrapper.cpp#L193
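
For comparison, here is a minimal sketch of spellings that keep `auto&` out of the declarator entirely.  It reuses `struct a` from the example above; the body of `mfn`, the `value` member, the two alias names and the helper `call_all` are hypothetical, added only so the snippet can be compiled and exercised.  Whether these forms sidestep the error on the affected build is an assumption and has not been checked there.

```cpp
#include <type_traits>

struct a {
    int value = 0;               // hypothetical member so mfn() has something to return
    int& mfn() { return value; } // the report only declares mfn(); this body is assumed
};

// Spell the return type out explicitly (same form as myvar1 in the report).
using explicit_pmf = int& (a::*)();

// Or let the compiler deduce the whole pointer-to-member type.
using deduced_pmf = decltype(&a::mfn);

static_assert(std::is_same<explicit_pmf, deduced_pmf>::value,
              "both spellings name the same pointer-to-member type");

// Hypothetical helper: writes through the reference returned via each pointer.
int call_all(a& obj)
{
    explicit_pmf p1 = &a::mfn;
    deduced_pmf  p2 = &a::mfn;
    auto         p3 = &a::mfn;   // plain auto deduces int& (a::*)()

    (obj.*p1)() += 1;
    (obj.*p2)() += 10;
    (obj.*p3)() += 100;
    return obj.value;
}
```

All three pointers have the same type, so the choice between them is only about which spelling a given compiler accepts.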

[Bug go/68931] gccgo fails to build using MUSL libc

2020-10-17 Thread daniel.santos at pobox dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68931

--- Comment #5 from Daniel Santos  ---
Created attachment 49393
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49393&action=edit
Patch for musl compatibility

The root problem is that musl defines off64_t and loff_t as preprocessor
macros.  These end up in gen-sysinfo.go as "// undefinedmacro" entries. 
Perhaps the real solution would be for -fdump-go-spec to attempt to determine
when a macro is for a type?

There is also a problem currently with struct sysinfo being defined multiple
times due to musl defining it in sys/sysinfo.h instead of including
linux/sysinfo.h -- I've just altered the musl header for now.
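
To illustrate why a macro alias is awkward here, a small sketch: to the compiler proper, a macro alias and a typedef name the same type, and only the preprocessor can tell them apart.  The names `demo_off64_macro` and `demo_off64_typedef` are hypothetical stand-ins, not musl's actual definitions.

```cpp
#include <sys/types.h>
#include <type_traits>

// Hypothetical stand-ins: one alias made with the preprocessor (the way the
// comment above says musl provides off64_t), one made with a typedef.
#define demo_off64_macro off_t
typedef off_t demo_off64_typedef;

// Both spellings name the same type once preprocessing is done, so a
// compile-time "does sizeof (T) work?" probe cannot tell them apart.
static_assert(std::is_same<demo_off64_macro, demo_off64_typedef>::value,
              "macro alias and typedef name the same type");

// Only the preprocessor can see the difference.
bool macro_alias_is_visible_to_cpp()
{
#ifdef demo_off64_macro
    return true;
#else
    return false;
#endif
}

bool typedef_is_visible_to_cpp()
{
#ifdef demo_off64_typedef
    return true;
#else
    return false;
#endif
}
```

A tool that only looks at typedefs would therefore miss the macro form, while a tool that only looks at macros would see a definition without knowing it names a type.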

[Bug go/68931] gccgo fails to build using MUSL libc

2020-10-17 Thread daniel.santos at pobox dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68931

Daniel Santos  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |daniel.santos at pobox dot com

--- Comment #4 from Daniel Santos  ---
Hello.  This is still a problem in 10.2 and I'm definitely going to need to
solve this.  As near as I can tell, the underlying issue is libgo/config.h
incorrectly being populated with HAVE_OFF64_T.

configure:11206: checking for off64_t
configure:11206:
/home/daniel/proj/embedded/openwrt/head/build_dir/toolchain-mipsel_24kc_gcc-10.2.0_musl/gcc-10.2.0-final/./gcc/xgcc
-B/home/daniel/proj/embedded/openwrt/head/build_dir/toolchain-mipsel_24kc_gcc-10.2.0_musl/gcc-10.2.0-final/./gcc/
-B/home/daniel/proj/embedded/openwrt/head/staging_dir/toolchain-mipsel_24kc_gcc-10.2.0_musl/mipsel-openwrt-linux-musl/bin/
-B/home/daniel/proj/embedded/openwrt/head/staging_dir/toolchain-mipsel_24kc_gcc-10.2.0_musl/mipsel-openwrt-linux-musl/lib/
-isystem
/home/daniel/proj/embedded/openwrt/head/staging_dir/toolchain-mipsel_24kc_gcc-10.2.0_musl/mipsel-openwrt-linux-musl/include
-isystem
/home/daniel/proj/embedded/openwrt/head/staging_dir/toolchain-mipsel_24kc_gcc-10.2.0_musl/mipsel-openwrt-linux-musl/sys-include
   -c -Os -pipe -mno-branch-likely -mips32r2 -mtune=24kc -g3 -fno-caller-saves
-fno-plt -fhonour-copts -Wno-error=unused-but-set-variable
-Wno-error=unused-result -msoft-float -Wformat -Werror=format-security
-D_LARGEFILE_SOURCE -D_LARGEFILE64_SOURCE  conftest.c >&5
configure:11206: $? = 0
configure:11206:
/home/daniel/proj/embedded/openwrt/head/build_dir/toolchain-mipsel_24kc_gcc-10.2.0_musl/gcc-10.2.0-final/./gcc/xgcc
-B/home/daniel/proj/embedded/openwrt/head/build_dir/toolchain-mipsel_24kc_gcc-10.2.0_musl/gcc-10.2.0-final/./gcc/
-B/home/daniel/proj/embedded/openwrt/head/staging_dir/toolchain-mipsel_24kc_gcc-10.2.0_musl/mipsel-openwrt-linux-musl/bin/
-B/home/daniel/proj/embedded/openwrt/head/staging_dir/toolchain-mipsel_24kc_gcc-10.2.0_musl/mipsel-openwrt-linux-musl/lib/
-isystem
/home/daniel/proj/embedded/openwrt/head/staging_dir/toolchain-mipsel_24kc_gcc-10.2.0_musl/mipsel-openwrt-linux-musl/include
-isystem
/home/daniel/proj/embedded/openwrt/head/staging_dir/toolchain-mipsel_24kc_gcc-10.2.0_musl/mipsel-openwrt-linux-musl/sys-include
   -c -Os -pipe -mno-branch-likely -mips32r2 -mtune=24kc -g3 -fno-caller-saves
-fno-plt -fhonour-copts -Wno-error=unused-but-set-variable
-Wno-error=unused-result -msoft-float -Wformat -Werror=format-security
-D_LARGEFILE_SOURCE -D_LARGEFILE64_SOURCE  conftest.c >&5
conftest.c: In function 'main':
conftest.c:146:22: error: expected expression before ')' token
  146 | if (sizeof ((off64_t)))
  |  ^
configure:11206: $? = 1
configure: failed program was:
| /* confdefs.h */
| #define PACKAGE_NAME "package-unused"
| #define PACKAGE_TARNAME "libgo"
| #define PACKAGE_VERSION "version-unused"
| #define PACKAGE_STRING "package-unused version-unused"
| #define PACKAGE_BUGREPORT ""
| #define PACKAGE_URL ""
| #define STDC_HEADERS 1
| #define HAVE_SYS_TYPES_H 1
| #define HAVE_SYS_STAT_H 1
| #define HAVE_STDLIB_H 1
| #define HAVE_STRING_H 1
| #define HAVE_MEMORY_H 1
| #define HAVE_STRINGS_H 1
| #define HAVE_INTTYPES_H 1
| #define HAVE_STDINT_H 1
| #define HAVE_UNISTD_H 1
| #define HAVE_DLFCN_H 1
| #define LT_OBJDIR ".libs/"
| #define USE_LIBFFI 1
| #define HAVE_GETIPINFO 1
| #define HAVE_SCHED_H 1
| #define HAVE_SEMAPHORE_H 1
| #define HAVE_SYS_FILE_H 1
| #define HAVE_SYS_MMAN_H 1
| #define HAVE_SYSCALL_H 1
| #define HAVE_SYS_EPOLL_H 1
| #define HAVE_SYS_INOTIFY_H 1
| #define HAVE_SYS_PTRACE_H 1
| #define HAVE_SYS_SYSCALL_H 1
| #define HAVE_SYS_USER_H 1
| #define HAVE_SYS_UTSNAME_H 1
| #define HAVE_SYS_SELECT_H 1
| #define HAVE_SYS_SOCKET_H 1
| #define HAVE_NET_IF_H 1
| #define HAVE_NET_IF_ARP_H 1
| #define HAVE_NET_ROUTE_H 1
| #define HAVE_NETPACKET_PACKET_H 1
| #define HAVE_SYS_PRCTL_H 1
| #define HAVE_SYS_MOUNT_H 1
| #define HAVE_SYS_VFS_H 1
| #define HAVE_SYS_STATFS_H 1
| #define HAVE_SYS_TIMEX_H 1
| #define HAVE_SYS_SYSINFO_H 1
| #define HAVE_UTIME_H 1
| #define HAVE_LINUX_FS_H 1
| #define HAVE_LINUX_PTRACE_H 1
| #define HAVE_LINUX_REBOOT_H 1
| #define HAVE_NETINET_IP_H 1
| #define HAVE_NETINET_IF_ETHER_H 1
| #define HAVE_NETINET_ICMP6_H 1
| #define HAVE_LINUX_FILTER_H 1
| #define HAVE_LINUX_IF_ADDR_H 1
| #define HAVE_LINUX_IF_ETHER_H 1
| #define HAVE_LINUX_IF_TUN_H 1
| #define HAVE_LINUX_NETLINK_H 1
| #define HAVE_LINUX_RTNETLINK_H 1
| #define HAVE_STRERROR_R 1
| #define HAVE_STRSIGNAL 1
| #define HAVE_WAIT4 1
| #define HAVE_MINCORE 1
| #define HAVE_SETENV 1
| #define HAVE_UNSETENV 1
| #define HAVE_DL_ITERATE_PHDR 1
| #define HAVE_MEMMEM 1
| #define HAVE_ACCEPT4 1
| #define HAVE_DUP3 1
| #define HAVE_EPOLL_CREATE1 1
| #define HAVE_FACCESSAT 1
| #define HAVE_FALLOC

[Bug lto/93772] ICE in cgraph.c with lto when symbol not defined

2020-02-16 Thread daniel.santos at pobox dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93772

--- Comment #2 from Daniel Santos  ---
(In reply to Andrew Pinski from comment #1)
> See https://gcc.gnu.org/wiki/A_guide_to_testcase_reduction on how to reduce
> the sources down to something which you might be able to share with us.

Hello Andrew.  I can give it a try, but I'm up against time constraints at the
moment. :(  This looks like a nice guide btw!  I've been a bit of a stranger on
gcc lately.

Also this project has a mix of C and C++ sources, the latter built as follows:

g++ -DHAVE_CONFIG_H -O2 -march=native -g3 -flto -std=gnu++11 -Wall -Wextra
-Wno-unused-parameter -Wno-ignored-qualifiers -MT my_file.o -MD -MP -MF
.deps/my_file.Tpo -c -o my_file.o my_file.cc

[Bug lto/93772] New: ICE in cgraph.c with lto when symbol not defined

2020-02-16 Thread daniel.santos at pobox dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93772

            Bug ID: 93772
           Summary: ICE in cgraph.c with lto when symbol not defined
           Product: gcc
           Version: 9.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: lto
          Assignee: unassigned at gcc dot gnu.org
          Reporter: daniel.santos at pobox dot com
                CC: marxin at gcc dot gnu.org
  Target Milestone: ---

I'm getting an ICE when I try to link while a symbol is used but not defined. 
Unfortunately, the project is closed source, so this is about as much as I can
give:

from cgraph.c:

bool
cgraph_edge::possibly_call_in_translation_unit_p (void)
{
   ...

  /* Otherwise we need to lookup prevailing symbol (symbol table is not merged,
     yet) and see if it is a definition.  In fact we may also resolve aliases,
     but that is probably not too important.  */
  symtab_node *node = callee;
  for (int n = 10; node->previous_sharing_asm_name && n ; n--)
    node = node->previous_sharing_asm_name;
  if (node->previous_sharing_asm_name)
    node = symtab_node::get_for_asmname (DECL_ASSEMBLER_NAME (callee->decl));
  gcc_assert (TREE_PUBLIC (node->decl)); /* <-- line 3825 */
  return node->get_availability () >= AVAIL_AVAILABLE;
}


sources built with:
gcc -DHAVE_CONFIG_H  -O2 -march=native -g3 -flto -ansi -std=c89
-Wall -Wextra -Wno-unused-parameter -Wno-ignored-qualifiers  -Werror=implicit
-MT gpio/my_file.o -MD -MP -MF gpio/.deps/my_file.Tpo -c -o gpio/my_file.o
gpio/my_file.c

link command:
g++ -O2 -march=native -g3 -flto -std=gnu++11 -Wall -Wextra
-Wno-unused-parameter -Wno-ignored-qualifiers -flto -O2 -o my_file  -lrt -lpthread -Wl,-Bstatic -lboost_date_time -lboost_program_options
-lboost_regex -lboost_system -lboost_thread -Wl,-Bdynamic -lftdi -lusb
-lusb-1.0

during IPA pass: cp
lto1: internal compiler error: in possibly_call_in_translation_unit_p, at
cgraph.c:3825
0x583655 cgraph_edge::possibly_call_in_translation_unit_p()
/usr/src/debug/sys-devel/gcc-9.2.0-r4/gcc-9.2.0/gcc/cgraph.c:3825
0x583655 ipa_read_edge_info
/usr/src/debug/sys-devel/gcc-9.2.0-r4/gcc-9.2.0/gcc/ipa-prop.c:4378
0x583655 ipa_read_node_info
/usr/src/debug/sys-devel/gcc-9.2.0-r4/gcc-9.2.0/gcc/ipa-prop.c:4455
0x583655 ipa_prop_read_section
/usr/src/debug/sys-devel/gcc-9.2.0-r4/gcc-9.2.0/gcc/ipa-prop.c:4539
0x583655 ipa_prop_read_jump_functions()
/usr/src/debug/sys-devel/gcc-9.2.0-r4/gcc-9.2.0/gcc/ipa-prop.c:4565
0xda927f ipa_read_summaries_1
/usr/src/debug/sys-devel/gcc-9.2.0-r4/gcc-9.2.0/gcc/passes.c:2842
0xd79167 read_cgraph_and_symbols
/usr/src/debug/sys-devel/gcc-9.2.0-r4/gcc-9.2.0/gcc/lto/lto.c:2972
0xd79167 lto_main()
/usr/src/debug/sys-devel/gcc-9.2.0-r4/gcc-9.2.0/gcc/lto/lto.c:3387
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <https://bugs.gentoo.org/> for instructions.
lto-wrapper: fatal error: g++ returned 1 exit status
compilation terminated.
/usr/lib/gcc/x86_64-pc-linux-gnu/9.2.0/../../../../x86_64-pc-linux-gnu/bin/ld:
error: lto-wrapper failed
collect2: error: ld returned 1 exit status



$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/x86_64-pc-linux-gnu/9.2.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: /tmp/portage/sys-devel/gcc-9.2.0-r4/work/gcc-9.2.0/configure
--host=x86_64-pc-linux-gnu --build=x86_64-pc-linux-gnu --prefix=/usr
--bindir=/usr/x86_64-pc-linux-gnu/gcc-bin/9.2.0
--includedir=/usr/lib/gcc/x86_64-pc-linux-gnu/9.2.0/include
--datadir=/usr/share/gcc-data/x86_64-pc-linux-gnu/9.2.0
--mandir=/usr/share/gcc-data/x86_64-pc-linux-gnu/9.2.0/man
--infodir=/usr/share/gcc-data/x86_64-pc-linux-gnu/9.2.0/info
--with-gxx-include-dir=/usr/lib/gcc/x86_64-pc-linux-gnu/9.2.0/include/g++-v9
--with-python-dir=/share/gcc-data/x86_64-pc-linux-gnu/9.2.0/python
--enable-languages=c,c++,d,go,fortran --enable-obsolete --enable-secureplt
--disable-werror --with-system-zlib --enable-nls --without-included-gettext
--enable-checking=release --with-bugurl=https://bugs.gentoo.org/
--with-pkgversion='Gentoo 9.2.0-r4 p5' --disable-esp --enable-libstdcxx-time
--with-build-config=bootstrap-lto --enable-shared --enable-threads=posix
--enable-__cxa_atexit --enable-clocale=gnu --enable-multilib
--with-multilib-list=m32,m64 --disable-altivec --disable-fixed-point
--enable-targets=all --enable-libgomp --disable-libmudflap --disable-libssp
--enable-systemtap --enable-vtable-verify --enable-lto --with-isl
--disable-isl-version-check --enable-default-pie --enable-default-ssp
Thread model: posix
gcc version 9.2.0 (Gentoo 9.2.0-r4 p5)

[Bug target/88617] ICE in ix86_compute_frame_layout, at config/i386/i386.c:11238 since r248029

2019-08-27 Thread daniel.santos at pobox dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88617

--- Comment #3 from Daniel Santos  ---
(In reply to Martin Liška from comment #2)
> @Daniel: Can you please take a look?

My apologies for missing this one!  I'll take a look.

[Bug go/68931] gccgo fails to build using MUSL libc

2019-04-24 Thread daniel.santos at pobox dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68931

Daniel Santos  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |daniel.santos at pobox dot com

--- Comment #3 from Daniel Santos  ---
Confirmed.  Also present with gcc 7.3.0 and musl 1.1.19 (in addition to some
multiply defined structs)

[Bug translation/90163] untranslated placeholder in warn_once_call_ms2sysv_xlogues

2019-04-19 Thread daniel.santos at pobox dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90163

--- Comment #2 from Daniel Santos  ---
Yes, this is mine.  Does this only become untranslatable when feature is
"static call chains"?

iiuc, static call chains are only used with nested functions (a GNU C
extension) and closure functions -- is this correct?  So using the descriptor
"static call chains" is probably bad in the first place.  We might be able to
just change this to "nested functions".

But given that, what would be the ideal way to present this?

This is one of those things that *could* be implemented but was initially
deemed to be more work than it would be worth.  It would be better to disable
-mcall-ms2sysv-xlogues for the affected functions, but this is pretty much
for Wine and, to my knowledge, we haven't encountered this error in the wild.

[Bug driver/81519] Enhancement: Add --help=target-distcc or similar to dump clean, optimal CFLAGS without using -march=native

2018-12-01 Thread daniel.santos at pobox dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81519

--- Comment #11 from Daniel Santos  ---
(In reply to Eric Gallager from comment #9)
> (In reply to Daniel Santos from comment #7)
> > (In reply to Martin Liška from comment #4)
> > > Ok, so I've briefly investigated source code and providing such 
> > > information
> > > is definitely not a simple task :/
> > > 
> > > I would recommend to fix PR39851 and then one will just compare output of
> > > following 2 invocations:
> > > 
> > > gcc --help=target  -Q
> > > gcc --help=target -march=native -Q 
> > > 
> > > Will it work for you?
> > > 
> > > Note that fully understand which ISA extensions are enable when is also
> > > quite complex.
> > 
> > I've thought about this some more and I'm starting to think that all of this
> > can be determined with a script that iteratively calls gcc --help=target -Q
> > with various machine flags to determine which -mno-* flags are really needed
> > and which -m flags include others.  So in effect, I'm thinking that we
> > can produce optimal C(XX)FLAGS with a script and your PR39851 fix.  I'll
> > have to test this out.
> > 
> > Thanks
> 
> If you come up with a script, do you want to put it in contrib?

Interesting.  I thought I replied to this already, but maybe that was on a
different bug report?  Anyway, I have a script at
https://github.com/daniel-santos/distccflags but it has a flaw
(https://github.com/daniel-santos/distccflags/issues/2) that I need to fix. 
Other than that, it's fairly user-friendly.  It would probably be a lot cleaner
re-written in perl.

I tried to push this off onto distcc but they wanted it re-written in C.
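
The comparison idea quoted above (diffing two option listings) can be sketched as a small filter over two saved listings.  This assumes each relevant line has the shape "  -mflag   [enabled]" or "  -mname=   value", which is an assumption about what `gcc --help=target -Q` prints; the function names are hypothetical, and this is not a description of how distccflags itself works.

```cpp
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Parse a saved option listing into option -> value.  Lines whose first
// token does not start with '-' (headings, blank lines) are ignored.
std::map<std::string, std::string> parse_listing(const std::string& text)
{
    std::map<std::string, std::string> opts;
    std::istringstream in(text);
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        std::string name, value;
        if (!(fields >> name) || name[0] != '-')
            continue;
        fields >> value;            // stays empty when no value is shown
        opts[name] = value;
    }
    return opts;
}

// Return "option value" strings for every option whose value in the second
// listing differs from (or is missing in) the baseline listing.
std::vector<std::string> changed_options(const std::string& baseline,
                                         const std::string& tuned)
{
    const std::map<std::string, std::string> base = parse_listing(baseline);
    std::vector<std::string> out;
    for (const auto& kv : parse_listing(tuned)) {
        auto it = base.find(kv.first);
        if (it == base.end() || it->second != kv.second)
            out.push_back(kv.first + " " + kv.second);
    }
    return out;
}
```

Feeding it the saved output of the two invocations would list only the settings that -march=native changes; deciding which of those are implied by others is the iterative part and is not attempted here.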

[Bug target/71958] x86_64-w64-mingw32, ICE when '-mx32' is used

2018-11-23 Thread daniel.santos at pobox dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71958

--- Comment #8 from Daniel Santos  ---
Thank you!

[Bug target/71958] x86_64-w64-mingw32, ICE when '-mx32' is used

2018-11-23 Thread daniel.santos at pobox dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71958

--- Comment #6 from Daniel Santos  ---
(In reply to Martin Liška from comment #5)
> Dansan: Can you please update Known to work?

Hi Martin,

I don't have bugzilla admin access.  I'm actually missing my gcc git repo due
to a faulty backup when I created my new system, so I can't verify it right now,
but I believe it was fixed in 8.1.

[Bug target/87928] [7/8/9 Regression] ICE in ix86_compute_frame_layout, at config/i386/i386.c:11161 since r228607

2018-11-18 Thread daniel.santos at pobox dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87928

--- Comment #10 from Daniel Santos  ---
(In reply to Uroš Bizjak from comment #9)
> Fixed everywhere.

Thank you Uros, great work!

It's an easy mistake to assume that you're "on one system/ABI or another" and
forget about function-level attributes.  But I've never been too crazy about
the way some globals and macros are sort-of hidden in .md or .opt files and
generated in the build (like ix86_abi).

[Bug libgcc/86290] New: Go cross build fails, "with libgcc_s.so.1 [...] not found"

2018-06-23 Thread daniel.santos at pobox dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86290

            Bug ID: 86290
           Summary: Go cross build fails, "with libgcc_s.so.1 [...] not
                    found"
           Product: gcc
           Version: 7.3.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: libgcc
          Assignee: unassigned at gcc dot gnu.org
          Reporter: daniel.santos at pobox dot com
  Target Milestone: ---
              Host: x86_64-pc-linux-gnu
            Target: mipsel-unknown-linux-gnu

Created attachment 44313
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=44313&action=edit
full build log

I actually need Go on a MIPS32 machine so I'm building a toolchain with
Gentoo's crossdev (had to baby it some) and I ran into this error.  I built it
with gcc-5.4.0 because I forgot to switch out my system compiler, but we're
past that at this point anyway.


/tmp/portage/cross-mipsel-unknown-linux-gnu/gcc-7.3.0-r3/work/build/./gcc/gccgo
-B/tmp/portage/cross-mipsel-unknown-linux-gnu/gcc-7.3.0-r3/work/build/./gcc/
-B/usr/mipsel-unknown-linux-gnu/bin/ -B/usr/mipsel-unknown-linux-gnu/lib/
-isystem /usr/mipsel-unknown-linux-gnu/include -isystem
/usr/mipsel-unknown-linux-gnu/sys-include   -g -O2 -minterlink-mips16 
-static-libstdc++ -static-libgcc -Wl,-O1 -Wl,--as-needed -L
../mipsel-unknown-linux-gnu/libgo -L ../mipsel-unknown-linux-gnu/libgo/.libs -L
../mipsel-unknown-linux-gnu/libgcc -o cgo
/tmp/portage/cross-mipsel-unknown-linux-gnu/gcc-7.3.0-r3/work/gcc-7.3.0/gotools/../libgo/go/cmd/cgo/ast.go
/tmp/portage/cross-mipsel-unknown-linux-gnu/gcc-7.3.0-r3/work/gcc-7.3.0/gotools/../libgo/go/cmd/cgo/doc.go
/tmp/portage/cross-mipsel-unknown-linux-gnu/gcc-7.3.0-r3/work/gcc-7.3.0/gotools/../libgo/go/cmd/cgo/gcc.go
/tmp/portage/cross-mipsel-unknown-linux-gnu/gcc-7.3.0-r3/work/gcc-7.3.0/gotools/../libgo/go/cmd/cgo/godefs.go
/tmp/portage/cross-mipsel-unknown-linux-gnu/gcc-7.3.0-r3/work/gcc-7.3.0/gotools/../libgo/go/cmd/cgo/main.go
/tmp/portage/cross-mipsel-unknown-linux-gnu/gcc-7.3.0-r3/work/gcc-7.3.0/gotools/../libgo/go/cmd/cgo/out.go
/tmp/portage/cross-mipsel-unknown-linux-gnu/gcc-7.3.0-r3/work/gcc-7.3.0/gotools/../libgo/go/cmd/cgo/util.go
zdefaultcc.go
/usr/libexec/gcc/mipsel-unknown-linux-gnu/ld: warning: libgcc_s.so.1, needed by
../mipsel-unknown-linux-gnu/libgo/.libs/libgo.so, not found (try using -rpath
or -rpath-link)
../mipsel-unknown-linux-gnu/libgo/.libs/libgo.so: undefined reference to
`_Unwind_RaiseException@GCC_3.0'
../mipsel-unknown-linux-gnu/libgo/.libs/libgo.so: undefined reference to
`_Unwind_GetIPInfo@GCC_4.2.0'
../mipsel-unknown-linux-gnu/libgo/.libs/libgo.so: undefined reference to
`_Unwind_GetTextRelBase@GCC_3.0'
../mipsel-unknown-linux-gnu/libgo/.libs/libgo.so: undefined reference to
`_Unwind_Resume_or_Rethrow@GCC_3.3'
../mipsel-unknown-linux-gnu/libgo/.libs/libgo.so: undefined reference to
`_Unwind_Resume@GCC_3.0'
../mipsel-unknown-linux-gnu/libgo/.libs/libgo.so: undefined reference to
`_Unwind_SetGR@GCC_3.0'
../mipsel-unknown-linux-gnu/libgo/.libs/libgo.so: undefined reference to
`_Unwind_SetIP@GCC_3.0'
../mipsel-unknown-linux-gnu/libgo/.libs/libgo.so: undefined reference to
`_Unwind_GetRegionStart@GCC_3.0'
../mipsel-unknown-linux-gnu/libgo/.libs/libgo.so: undefined reference to
`_Unwind_GetLanguageSpecificData@GCC_3.0'
../mipsel-unknown-linux-gnu/libgo/.libs/libgo.so: undefined reference to
`_Unwind_GetDataRelBase@GCC_3.0'
../mipsel-unknown-linux-gnu/libgo/.libs/libgo.so: undefined reference to
`_Unwind_Backtrace@GCC_3.3'
collect2: error: ld returned 1 exit status

The link succeeds if I pass -lgcc_s.  So should libgcc/config/t-slibgcc-libgcc
be writing -lgcc_s for this arch, or is something else just looking for
libgcc.so.1 instead of libgcc_s.so.1 and then lying about it?

[Bug target/85994] Comparison failure in 64-bit libgcc _{sav,res}ms64.o on Solaris/x86

2018-06-22 Thread daniel.santos at pobox dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85994

--- Comment #11 from Daniel Santos  ---
Thank you Rainer!

[Bug libgcc/85621] savms/resms have executable stack (lack GNU-stack marking)

2018-05-02 Thread daniel.santos at pobox dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85621

Daniel Santos  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |daniel.santos at pobox dot com

--- Comment #3 from Daniel Santos  ---
Hello guys.

These save/restore stubs are for 64-bit Wine and I don't know of anything else
that will use them.

I'm having a bit of trouble finding documentation on this section, but I did
find a thread suggesting that it should be documented
(https://sourceware.org/ml/gnu-gabi/2016-q1/msg1.html :)  I guess I need to
dig into binutils sources for info?  What exactly does adding this section do?

Thanks

[Bug debug/83917] [8 Regression] with -mcall-ms2sysv-xlogues, stepping into x86 tail-call restore stub gives bad backtrace

2018-02-28 Thread daniel.santos at pobox dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83917

--- Comment #9 from Daniel Santos  ---
You are AWESOME!! :)

[Bug debug/83917] [8 Regression] with -mcall-ms2sysv-xlogues, stepping into x86 tail-call restore stub gives bad backtrace

2018-02-25 Thread daniel.santos at pobox dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83917

--- Comment #5 from Daniel Santos  ---
(In reply to Jakub Jelinek from comment #4)
> Patch posted: http://gcc.gnu.org/ml/gcc-patches/2018-02/msg01294.html

My apologies on dropping the ball here and thanks for picking it up! :)

[Bug debug/83917] [8 Regression] with -mcall-ms2sysv-xlogues, stepping into x86 tail-call restore stub gives bad backtrace

2018-01-20 Thread daniel.santos at pobox dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83917

--- Comment #3 from Daniel Santos  ---
(In reply to Richard Biener from comment #1)
> Testcase would be nice.

*sigh* Yes, I've seen that there are tests that run gdb through expect, but I
haven't learned how to use that yet.


(In reply to Jakub Jelinek from comment #2)
> So, is this about debug info (which I believe shouldn't be needed), or
> missing unwind info?

This is only about the debug info.

> I presume the mingw unwind info isn't done through .cfi_* directives, so
> would need to be written by hand.

This doesn't really target Windows at all, but rather Wine.  In fact, it is
disabled on Windows because the SEH code in gcc/config/i386/winnt.c doesn't
support REG_CFA_EXPRESSION.  That said, I haven't actually *tested* this with
C++, which is bad except that I am not currently aware of any such use-case --
Wine is almost entirely written in C.  Nonetheless, it would seem some C++
tests are in order ... that might be an understatement...

> How hard is that, and is this really a
> regression, those snippets didn't exist before?

True, these stubs did not exist before, but the user always had the ability to
step through such functions and get a valid backtrace from the debugger, so I
would think it to be a regression.

[Bug debug/83917] New: [8 Regression] with -mcall-ms2sysv-xlogues, stepping into x86 tail-call restore stub gives bad backtrace

2018-01-17 Thread daniel.santos at pobox dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83917

            Bug ID: 83917
           Summary: [8 Regression] with -mcall-ms2sysv-xlogues, stepping
                    into x86 tail-call restore stub gives bad backtrace
           Product: gcc
           Version: 8.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: debug
          Assignee: unassigned at gcc dot gnu.org
          Reporter: daniel.santos at pobox dot com
  Target Milestone: ---
            Target: x86-64-*-*

Here is an example.

0x7bc8fd8e  465 if ((status = alloc_object_attributes(
attr, ,  ))) return status;
1: x/2i $rip
=> 0x7bc8fd8e <NtCreateMutant+78>:  jmpq   0x7bcaebbc <__sse_resms64fx_16>
   0x7bc8fd93 <NtCreateMutant+83>:  nopl   0x0(%rax,%rax,1)
Wine-gdb>
__sse_resms64fx_16 () at
/home/daniel/proj/sys/gcc/git/libgcc/config/i386/resms64fx.h:39
39  mov -0x60(%rsi),%r14
1: x/2i $rip
=> 0x7bcaebbc <__sse_resms64fx_16>: mov    -0x60(%rsi),%r14
   0x7bcaebc0 <__sse_resms64fx_15>: mov    -0x58(%rsi),%r13
Wine-gdb> bt
#0  __sse_resms64fx_16 () at
/home/daniel/proj/sys/gcc/git/libgcc/config/i386/resms64fx.h:39
#1  0x006900750062005c in ?? ()
#2  0x006d005c0064006c in ?? ()
#3  0x003b00320033006d in ?? ()
#4  0x0077005c003a0043 in ?? ()
#5  0x003c006e0069 in ?? ()
#6  0x00034850 in ?? ()
#7  0x in ?? ()

At this point, I'm not emitting any debug information in the stubs.

[Bug tree-optimization/83784] New: Missed optimization with bitfield

2018-01-10 Thread daniel.santos at pobox dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83784

            Bug ID: 83784
           Summary: Missed optimization with bitfield
           Product: gcc
           Version: 8.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: daniel.santos at pobox dot com
  Target Milestone: ---

Created attachment 43095
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43095&action=edit
test case

The layout of bitfields in memory is, of course, undefined in the C standard
and is implementation-dependent.  But when I happen to guess how gcc will lay
it out correctly, I would like for these pack and unpack functions to
compile-out.  I'm only doing this because I happen to need to be able to know
what 32-bit portion of a 64-bit value has one of the fields (for futex
operations) and bitfields are syntactically easier to work with.  But due to
this flaw, I have to go back to shifting, ANDing, ORing, etc.

The attached test case is probably not as simple as it could be as I'm testing
both 32 and 64-bit code on x86, but the below is probably a decent summary
(for 64-bits):

union u
{
    unsigned long ulong_val;
    struct {
        unsigned long a:4;
        unsigned long b:60;
    };
};

union u pack(union u in)
{
    union u ret;
    ret.ulong_val   = in.b;
    ret.ulong_val <<= 4;
    ret.ulong_val  |= in.a;
    return ret;
}

The above pack function compiles into the no-op I would expect:
pack:
.LFB12:
        .cfi_startproc
        movq    %rdi, %rax
        ret
        .cfi_endproc


But if I use three bitfields, my pack function is no longer a no-op:

union u
{
    unsigned long ulong_val;
    struct {
        unsigned long a:4;
        unsigned long b:30;
        unsigned long c:30;
    };
};

union u pack( union u in )
{
    union u ret;
    ret.ulong_val   = in.c;
    ret.ulong_val <<= 30;
    ret.ulong_val  |= in.b;
    ret.ulong_val <<= 4;
    ret.ulong_val  |= in.a;
    return ret;
}

And here's the output (with hex immediates for ANDs)
pack:
.LFB11:
        .cfi_startproc
        movq    %rdi, %rax
        movq    %rdi, %rdx
        andl    $0xf, %edi
        shrq    $34, %rax
        shrq    $4, %rdx
        salq    $30, %rax
        andl    $0x3fff, %edx
        orq     %rdx, %rax
        salq    $4, %rax
        orq     %rdi, %rax
        ret
        .cfi_endproc


Possibly related to bug #15596 and maybe even a duplicate of bug #35363, but
I'm uncertain.  I have only tested on gcc 5.4.0 and 8 from git so far and only
x86, but I'm going to *guess* this is a tree-optimization issue and not the x86
backend.
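
For reference, the shift-and-mask fallback mentioned above might look like the sketch below.  The field widths (4, 30, 30) follow the three-field example, and the placement of `a` in the low bits matches what the pack function above assumes; the names `fields`, `field_mask`, `pack_bits` and `unpack_bits` are hypothetical.

```cpp
#include <cstdint>

// Plain struct instead of bitfields; widths mirror a:4, b:30, c:30.
struct fields {
    std::uint64_t a; // low 4 bits of the packed word
    std::uint64_t b; // next 30 bits
    std::uint64_t c; // top 30 bits
};

// Mask with the low `bits` bits set (bits must be less than 64).
constexpr std::uint64_t field_mask(unsigned bits)
{
    return (std::uint64_t{1} << bits) - 1;
}

// Combine the three fields into one 64-bit word: c, then b, then a,
// from the most significant end down.
inline std::uint64_t pack_bits(const fields& f)
{
    return ((f.c & field_mask(30)) << 34)
         | ((f.b & field_mask(30)) << 4)
         |  (f.a & field_mask(4));
}

// Split a 64-bit word back into its three fields.
inline fields unpack_bits(std::uint64_t word)
{
    fields f;
    f.a = word & field_mask(4);
    f.b = (word >> 4) & field_mask(30);
    f.c = (word >> 34) & field_mask(30);
    return f;
}
```

Because the layout is explicit, the 32-bit half that holds a given field is known without guessing how the compiler arranges bitfields, at the cost of the syntactic convenience.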

[Bug c/83117] [8 Regression] FAIL: gcc.target/x86_64/abi/ms-sysv/ms-sysv.c (test for excess errors)

2017-11-27 Thread daniel.santos at pobox dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83117

Daniel Santos  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |daniel.santos at pobox dot com

--- Comment #8 from Daniel Santos  ---
(In reply to Jakub Jelinek from comment #6)
> The warning is nothing new, GCC has been warning for that for years.  What
> my patch did is just better optimization, so the compiler can see the UB.
> 
> Try:
> extern long do_test_aligned ();
> 
> static long (*const do_test_v1) (long a, ...) = (void *) do_test_aligned;
> 
> extern void check_results (long);
> 
> int test (long a)
> {
>   long ret;
> 
>   ret = do_test_v1 (a);
>   ret += (long (*) (long a, ...)) do_test_aligned;
>   check_results (ret);
> }
> 
> We've warned about the latter, but not the former, since we weren't able to
> fold a const var to its initializer.
> 
> So, either the tests shouldn't use const on these, something like:
> -  out << "static __attribute__ ((ms_abi)) long (*const do_test_"
> +  out << "static __attribute__ ((ms_abi)) long (*do_test_"
> or they should use const volatile, or -w, or should use proper prototypes.
> 
> Daniel needs to decide what to do, it isn't obviously clear what the intent
> is.

Sorry for my slow response.  do_test and do_test_aligned are assembly hacks
that verify that the actual test functions do not alter registers that are
volatile for sysv_abi, but non-volatile for ms_abi (the ms to sysv clobbers). 
So the intention was to lie to the compiler about how to call the function, but
now your patch got smart and figured out that I was lying. :(

So in this example, the function msabi_00_v1 is the real function that is being
tested:

  init_test (msabi_00_v1, "msabi_00_v1", ALIGNMENT_NOT_TESTED,
SHRINK_WRAP_NONE, a);
  ret = do_test_v1 (a);
  check_results (ret);

More specifically, the assembly proxy stubs:
1. store rdi, rsi and xmm6-15,
2. populate them with random data,
3. pop the return address and store it, and
4. call the function specified in the prior init_test call.
Upon return, they:
5. store the new values of rdi, rsi and xmm6-15 (for later comparison),
6. restore them to what they were originally, and
7. jump to the original return address.

The test program is single-threaded and there is no recursion of these
hacked calls, so I'm just using globals.

Is there a way to disable this warning with -Wno-xxx?  Otherwise, what is the
proper way to lie to a compiler?  I want the compiler to construct the function
call since that is part of what is being tested.

[Bug target/82827] [8 regression] i386/pr82002-2a.c fail

2017-11-03 Thread daniel.santos at pobox dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82827

Daniel Santos  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |daniel.santos at pobox dot com

--- Comment #3 from Daniel Santos  ---
do'h! Sorry, I thought I had them all xfailing!  Will be fixed shortly...

[Bug tree-optimization/82485] [8 Regression] ICE in sp_valid_at, at config/i386/i386.c:13232

2017-10-30 Thread daniel.santos at pobox dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82485

--- Comment #4 from Daniel Santos  ---
Can you please mark this as a duplicate of pr82002?  I have a fix submitted. 
Thanks!

[Bug target/82712] [8 Regression] ICE in sp_valid_at, at config/i386/i386.c:11383

2017-10-30 Thread daniel.santos at pobox dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82712

--- Comment #1 from Daniel Santos  ---
Could you please close this as a duplicate of pr82002?  I've got a (full) fix
submitted now.  Thanks.

[Bug target/82002] [8 Regression] ICE in sp_valid_at, at config/i386/i386.c:13233

2017-10-28 Thread daniel.santos at pobox dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82002

--- Comment #6 from Daniel Santos  ---
Was about to submit a patch set for this that added this nifty mechanism to
track a scratch register for pro/epilogue use and automatically (re)use it when
you call choose_baseaddr.  Then I realized that I could circumvent the whole
thing by emitting the SSE saves or stub call in one of two places based upon
the offset size with much less new complexity.  Will be testing shortly.

[Bug target/82268] [8 regression] i386/pr82196-1.c fail

2017-10-20 Thread daniel.santos at pobox dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82268

Daniel Santos  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |daniel.santos at pobox dot com

--- Comment #2 from Daniel Santos  ---
Well crap.  I wasn't aware of --with-arch and --with-cpu configure options. 
Could you modify gcc/testsuite/gcc.target/i386/pr82196-1.c and see if this
works?

--- a/gcc/testsuite/gcc.target/i386/pr82196-1.c
+++ b/gcc/testsuite/gcc.target/i386/pr82196-1.c
@@ -1,5 +1,5 @@
 /* { dg-do compile { target lp64 } } */
-/* { dg-options "-msse -mcall-ms2sysv-xlogues -O2" } */
+/* { dg-options "-mno-avx -msse -mcall-ms2sysv-xlogues -O2" } */
 /* { dg-final { scan-assembler "call.*__sse_savms64f?_12" } } */
 /* { dg-final { scan-assembler "jmp.*__sse_resms64f?x_12" } } */

Also, it would be helpful if you can show me what the command line for the test
was in the log file (prior to the FAILs).  Thanks

[Bug tree-optimization/82485] [8 Regression] ICE in sp_valid_at, at config/i386/i386.c:13232

2017-10-10 Thread daniel.santos at pobox dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82485

Daniel Santos  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |daniel.santos at pobox dot com

--- Comment #3 from Daniel Santos  ---
(In reply to Jakub Jelinek from comment #2)
> Daniel, can you please have a look at this?  Thanks.

This looks like a duplicate of pr82002, which I've been a bit slow to resolve. 
The fix to this error is a really small patch:

@@ -15682,7 +15682,7 @@ ix86_expand_epilogue (int style)
 the stack pointer, if we will restore SSE regs via sp.  */
   if (TARGET_64BIT
  && m->fs.sp_offset > 0x7fffffff
- && sp_valid_at (frame.stack_realign_offset)
+ && sp_valid_at (frame.stack_realign_offset + 1)
  && (frame.nsseregs + frame.nregs) != 0)
{
  pro_epilogue_adjust_stack (stack_pointer_rtx, stack_pointer_rtx,


The problem is that if you throw in an ms to sysv call, it breaks due to the
SP offset overflowing and needing to use a temp register.  So I've
rewritten choose_baseaddr so that when this is needed, it can track the value
of a scratch register so that we don't end up repeating the calculation for
each access beyond the 32-bit range.  Anyway, I'll try to finish that up this
week.

[Bug c/47781] warnings from custom printf format specifiers

2017-09-30 Thread daniel.santos at pobox dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=47781

--- Comment #19 from Daniel Santos  ---
(In reply to Martin Sebor from comment #18)
> The Linux kernel also has a bunch of printf format extensions that GCC
> doesn't know anything about:
> https://www.kernel.org/doc/Documentation/printk-formats.txt.

Further, the printf format extensions in the kernel are designed so as not to
trigger warnings, and so are often two-character combinations: a standard
format specifier followed by a modifying character.  I think that I ran a
script once to count how much extra memory the two bytes vs a single byte take
and it ended up in the tens of kilobytes.  While this may not sound like much,
remember that kernel data is never paged out and on some embedded systems,
it actually does make a difference.

Should GCC begin supporting custom printf format specifiers, then I would
propose we begin changing them in the kernel to take advantage of that small
savings.

[Bug target/82196] -mcall-ms2sysv-xlogues stubs sometimes use wrong MOV instruction

2017-09-17 Thread daniel.santos at pobox dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82196

Daniel Santos  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |FIXED

--- Comment #3 from Daniel Santos  ---
Fixed in 8 dev branch.

[Bug other/39851] gcc -Q --help=target does not list extensions selected by -march=

2017-09-16 Thread daniel.santos at pobox dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=39851

--- Comment #17 from Daniel Santos  ---
Thanks for all your work on this Martin.  I've put a script up on my github
account (https://github.com/daniel-santos/distccflags), updated the Gentoo
Distcc instructions and sent distcc a mail to notify them.

Thanks!

[Bug target/82196] -mcall-ms2sysv-xlogues stubs sometimes use wrong MOV instruction

2017-09-13 Thread daniel.santos at pobox dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82196

--- Comment #1 from Daniel Santos  ---
Created attachment 42163
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=42163&action=edit
proposed fix minus tests

[Bug target/82196] New: -mcall-ms2sysv-xlogues stubs sometimes use wrong MOV instruction

2017-09-12 Thread daniel.santos at pobox dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82196

Bug ID: 82196
   Summary: -mcall-ms2sysv-xlogues stubs sometimes use wrong MOV
instruction
   Product: gcc
   Version: 8.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: daniel.santos at pobox dot com
  Target Milestone: ---
Target: x86_64-*-*

The test for whether we use movaps or vmovaps is in
libgcc/config/i386/i386-asm.h and tests the cpp macros __SSE2__ and __AVX__,
which is an error.  This results in at least one situation where movaps is used
when vmovaps is available on the build target.  I would presume that the
alternative condition can exist which would result in a bad opcode.

[Bug target/82002] [8 Regression] ICE in sp_valid_at, at config/i386/i386.c:13233

2017-09-10 Thread daniel.santos at pobox dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82002

--- Comment #5 from Daniel Santos  ---
(In reply to Daniel Santos from comment #4)
> The alternative that I can see is to modify choose_baseaddr so that it can
> init and utilize an auxiliary register (like r11).

I guess this would be called a "scratch" register.

[Bug target/82002] [8 Regression] ICE in sp_valid_at, at config/i386/i386.c:13233

2017-09-10 Thread daniel.santos at pobox dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82002

--- Comment #4 from Daniel Santos  ---
Created attachment 42147
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=42147&action=edit
incomplete patch set with test

(In reply to Jakub Jelinek from comment #3)
> Of course there is none.  Which is why e.g. pro_epilogue_adjust_stack has
> code to handle the case when Pmode is not SImode and offset is not
> x86_64_immediate_operand.  So whatever generated this insn also needs to
> test for sp + offset not being a valid address and load the offset into some
> hard register first and use sp + that_reg.  pro_and_epilogue pass is after
> reload, so we can't wait for RA to handle it for us.

Thanks for the help here Jakub.  Being new to gcc (and obviously the x86
backend), I'm learning about issues that weren't "on my radar", so sorry for
dragging you guys through some of this as well.  This came about because I
moved the stack realignment boundary from the start of the function's frame to
after the GP reg saves so that we could use aligned MOVs for SSE regs.  Prior
to this, we just used the frame pointer with possibly unaligned MOVs and that
offset was never very large.

The bad operand is being generated when ix86_emit_save_sse_regs_using_mov calls
choose_baseaddr.  I wouldn't at all mind using the frame pointer with possibly
unaligned MOVs for a case like this, but I'm not sure what the best solution
is.  This would mean that the realignment boundary
ix86_frame::stack_realign_offset could represent two different locations,
either after reg_save_offset or at stack_pointer_offset.  This would require
adding an alternative calculation to the if (stack_realign_fp) else block in
ix86_compute_frame_layout (basically, re-add the old way it was calculated), and
-ms2sysv-xlogues might have to either be disabled or modified since it uses
choose_baseaddr to init rax/rsi prior to calling the stub.

The alternative that I can see is to modify choose_baseaddr so that it can init
and utilize an auxiliary register (like r11).  In this case, I'm thinking that
it might make sense to do something global to track what regs are available
rather than passing 'style' everywhere to know whether or not r11 is live and
also track auxiliary register(s) such as this so we can init it once and then
use it several times.

I know we're approaching the end of stage1, so don't want to shake things up
too much now.  Please let me know what you think.  I might post this to the
list for opinions too.  I'm attaching what I have of the fix for this -- it
solves the problem that was posted, but it's still broken when we call a sysv
function from an ms function.

Thanks

[Bug target/82169] New: Dynamically determine best strategy for -mcall-ms2sysv-xlogues

2017-09-10 Thread daniel.santos at pobox dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82169

Bug ID: 82169
   Summary: Dynamically determine best strategy for
-mcall-ms2sysv-xlogues
   Product: gcc
   Version: 8.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: daniel.santos at pobox dot com
  Target Milestone: ---
Target: x86-64-*-*

The new -mcall-ms2sysv-xlogues is nice but it would be ideal for its use to be
determined automatically based upon optimization flags, profile-guided
optimizations, cold/hot markings, etc.  I don't know that it is currently
possible or worthwhile, but an analysis using processor_costs or some such to
see if it's faster than inline saves might also improve it.

Probably 9.0 would be a sane target release.

[Bug target/82002] [8 Regression] ICE in sp_valid_at, at config/i386/i386.c:13233

2017-08-30 Thread daniel.santos at pobox dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82002

--- Comment #2 from Daniel Santos  ---
Another problem when we throw in an ms to sysv call:

$ cat /home/daniel/proj/sys/gcc/git/gcc/testsuite/gcc.target/i386/pr82002-2a.c
/* { dg-do compile { target lp64 } } */
/* { dg-options "-Ofast -mstackrealign -mabi=ms" } */

void __attribute__((sysv_abi)) a (char *);
void
b ()
{
  char c[10000000000];
  c[1099511627776] = 'b';
  a (c);
  a (c);
}


spawn
/home/daniel/proj/sys/gcc/builds/pr82002-minimal-x86_64-pc-linux-gnu/gcc/xgcc
-B/home/daniel/proj/sys/gcc/builds/pr82002-minimal-x86_64-pc-linux-gnu/gcc/
/home/daniel/proj/sys/gcc/git/gcc/testsuite/gcc.target/i386/pr82002-2a.c
-fno-diagnostics-show-caret -fdiagnostics-color=never -Ofast -mstackrealign
-mabi=ms -S -o pr82002-2a.s
/home/daniel/proj/sys/gcc/git/gcc/testsuite/gcc.target/i386/pr82002-2a.c: In
function 'b':
/home/daniel/proj/sys/gcc/git/gcc/testsuite/gcc.target/i386/pr82002-2a.c:12:1:
error: unrecognizable insn:
(insn/f 36 35 37 2 (set (mem/c:V4SF (plus:DI (reg/f:DI 7 sp)
(const_int 10000000016 [0x2540be410])) [2  S16 A128])
(reg:V4SF 27 xmm6))
"/home/daniel/proj/sys/gcc/git/gcc/testsuite/gcc.target/i386/pr82002-2a.c":7 -1
 (expr_list:REG_DEAD (reg:V4SF 27 xmm6)
(expr_list:REG_CFA_EXPRESSION (set (mem/c:V4SF (plus:DI (reg/f:DI 7 sp)
(const_int 10000000016 [0x2540be410])) [2  S16 A128])
(reg:V4SF 27 xmm6))
(nil
during RTL pass: cprop_hardreg
/home/daniel/proj/sys/gcc/git/gcc/testsuite/gcc.target/i386/pr82002-2a.c:12:1:
internal compiler error: in extract_insn, at recog.c:2306
0x5c1958 _fatal_insn(char const*, rtx_def const*, char const*, int, char
const*)
/home/daniel/proj/sys/gcc/git/gcc/rtl-error.c:108
0x5c1974 _fatal_insn_not_found(rtx_def const*, char const*, int, char const*)
/home/daniel/proj/sys/gcc/git/gcc/rtl-error.c:116
0xba05a9 extract_insn(rtx_insn*)
/home/daniel/proj/sys/gcc/git/gcc/recog.c:2306
0xba15e8 extract_constrain_insn(rtx_insn*)
/home/daniel/proj/sys/gcc/git/gcc/recog.c:2206
0xbaaaf6 copyprop_hardreg_forward_1
/home/daniel/proj/sys/gcc/git/gcc/regcprop.c:801
0xbab8a4 execute
/home/daniel/proj/sys/gcc/git/gcc/regcprop.c:1308


I guess we don't have a 64-bit offset instruction for (v)movaps :)

[Bug driver/81519] Enhancement: Add --help=target-distcc or similar to dump clean, optimal CFLAGS without using -march=native

2017-08-29 Thread daniel.santos at pobox dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81519

--- Comment #7 from Daniel Santos  ---
(In reply to Martin Liška from comment #4)
> Ok, so I've briefly investigated source code and providing such information
> is definitely not a simple task :/
> 
> I would recommend to fix PR39851 and then one will just compare output of
> following 2 invocations:
> 
> gcc --help=target  -Q
> gcc --help=target -march=native -Q 
> 
> Will it work for you?
> 
> Note that fully understand which ISA extensions are enable when is also
> quite complex.

I've thought about this some more and I'm starting to think that all of this
can be determined with a script that iteratively calls gcc --help=target -Q
with various machine flags to determine which -mno-* flags are really needed
and which -m flags include others.  So in effect, I'm thinking that we can
produce optimal C(XX)FLAGS with a script and your PR39851 fix.  I'll have to
test this out.

Thanks

[Bug target/82002] [8 Regression] ICE in sp_valid_at, at config/i386/i386.c:13233

2017-08-29 Thread daniel.santos at pobox dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82002

--- Comment #1 from Daniel Santos  ---
(In reply to Martin Liška from comment #0)
> Starting from r251321 we ICE on:
> 
> $ cat stack-check.ii
> void a (char *);
> void
> b ()
> {
>   char c[10000000000];
>   c[1099511627776] = 'b';
>   a (c);
>   a (c);
> }
> 
> $ g++ stack-check.ii -Ofast -mstackrealign -mabi=ms

Thanks for the report!  I added a new check to catch things that shouldn't
happen, and this is good because it invokes a code path that hadn't been
exercised yet.

  if (TARGET_64BIT
      && m->fs.sp_offset > 0x7fffffff
      && sp_valid_at (frame.stack_realign_offset)
      && (frame.nsseregs + frame.nregs) != 0)
    {
      pro_epilogue_adjust_stack (stack_pointer_rtx, stack_pointer_rtx,
                                 GEN_INT (m->fs.sp_offset
                                          - frame.sse_reg_save_offset),
                                 style,
                                 m->fs.cfa_reg == stack_pointer_rtx);
    }

The 3rd test in that if statement used to be m->fs.sp_valid, but I changed the
way we manage that so that it's valid for some offsets but not others.  I think
that this should be sp_valid_at (frame.stack_realign_offset + 1) however --
stack-grows-down math is still new and weird to me.  I'll spend some more time
with this tomorrow, but I think that one change is correct.

[Bug driver/81519] Enhancement: Add --help=target-distcc or similar to dump clean, optimal CFLAGS without using -march=native

2017-08-25 Thread daniel.santos at pobox dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81519

--- Comment #6 from Daniel Santos  ---
(In reply to Martin Liška from comment #4)
> Ok, so I've briefly investigated source code and providing such information
> is definitely not a simple task :/

Sorry for my late response and thanks for looking into this.  I too was a bit
daunted when I started to investigate this just for the i386 back-end as there
were a lot of twists and turns, 64-bit bitmasks that were out of bits requiring
hack-arounds to add new processors, etc.

Maybe a nice-to-have would be a generic middle-end mechanism to manage this for
all back-ends in a universal fashion so that this type of thing would be much
easier to handle.  Of course, that would require an awfully large amount of
analysis in order to create a design that works *and* makes sense for everyone,
but such a design could also do a better job of things such as representing the
lineage of a microarchitecture, accessing the processor_costs, etc.

[Bug target/81850] [mingw/cygwin] -mabi=sysv ignored

2017-08-23 Thread daniel.santos at pobox dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81850

Daniel Santos  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |INVALID

--- Comment #2 from Daniel Santos  ---
Hmm, interestingly I cannot reproduce the error, so I must have made a mistake
somewhere:

$ cat abitest.c
void __attribute__((sysv_abi))(*volatile foo) (void);
void bar (void) {
  foo ();
}

/d/builds/x86_64-pc-cygwin/head-minimal/gcc/xgcc
-B/d/builds/x86_64-pc-cygwin/head-minimal/gcc -c -mabi=sysv -o abitest.o
abitest.c && objdump -dr abitest.o

abitest.o: file format pe-x86-64


Disassembly of section .text:

0000000000000000 <bar>:
   0:   55                      push   %rbp
   1:   48 89 e5                mov    %rsp,%rbp
   4:   48 8d 05 00 00 00 00    lea    0x0(%rip),%rax        # b <bar+0xb>
                        7: R_X86_64_PC32        foo-0x8
   b:   48 8b 00                mov    (%rax),%rax
   e:   ff d0                   callq  *%rax
  10:   90                      nop
  11:   5d                      pop    %rbp
  12:   c3                      retq
  13:   90                      nop


... which is correct.

[Bug target/81850] [mingw/cygwin] -mabi=sysv ignored

2017-08-18 Thread daniel.santos at pobox dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81850

--- Comment #1 from Daniel Santos  ---
I have a patch that I've tested and will be submitting it shortly (I can't
change the assigned to field yet).

[Bug testsuite/80759] gcc.target/x86_64/abi/ms-sysv FAILs

2017-08-07 Thread daniel.santos at pobox dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80759

--- Comment #63 from Daniel Santos  ---
Created attachment 41943
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=41943&action=edit
test patch for uncaught exception in generator

(In reply to Dominique d'Humieres from comment #62)
> Created attachment 41937 [details]
> Log file for ms-sysv.exp
> 
> Log file generated with
> 
> make -k check-gcc RUNTESTFLAGS="ms-sysv.exp"

Thanks Dominique.  I'm still not seeing all of what I want to see because, when
you ran this test, the code generator gcc/testsuite/gcc/ms-sysv-generate.exe
already existed and so was not rebuilt.  I'm curious to know the full command
line when it was built.  Could you apply this patch and run it again?  I should
probably make sure that we rebuild the generator on each new run anyway.

As for the generator, it's not catching an exception that it should.  I had
been catching it by value because I always knew the type that was being thrown,
but maybe there is some type of weirdness in your particular libstdc++
implementation, so this patch also changes the catch to by-reference.

Thanks,
Daniel

[Bug testsuite/80759] gcc.target/x86_64/abi/ms-sysv FAILs

2017-07-31 Thread daniel.santos at pobox dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80759

--- Comment #61 from Daniel Santos  ---
(In reply to Dominique d'Humieres from comment #60)
> At revision r250610 I still see
> 
> WARNING: Could not generate
> /opt/gcc/build_w/gcc/testsuite/gcc/ms-sysv/ms-sysv-generated.h

Thank you for the report.  Perhaps I should have had the test error out here
instead of issuing a warning.  Can you please post the part of the log file
where the generator fails?

The various failures building ms-sysv.c are expected if the generator failed.
The contents of ms-sysv-generated.h are part of each test run.  The -DGEN_ARGS= is
actually ignored by the program -- it's just there to make each test
description unique while reflecting the arguments passed to the generator (to
create the ms-sysv-generated.h used to build that test).

Thanks!
Daniel

[Bug target/25967] Add attribute naked for x86

2017-07-31 Thread daniel.santos at pobox dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=25967

--- Comment #19 from Daniel Santos  ---
(In reply to Uroš Bizjak from comment #18)
> Implemented for gcc 8.

Awesome!  There are actually a number of times over the years that I've wished
this were implemented, thanks! :)

[Bug target/25967] Add attribute naked for x86

2017-07-28 Thread daniel.santos at pobox dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=25967

--- Comment #12 from Daniel Santos  ---
For those interested in a work-around, you can define an __attribute__((used))
function and then within that function use inline assembly to declare your real
function.  This can get messy depending upon how portable you need your
code to be.  Here is an example:

static void __attribute__((used)) dummy ()
{
  __asm__ ("\n"
"   .globl myfunc\n"
#ifdef __ELF__
"   .type myfunc,@function\n"
#endif
"myfunc:\n"
"   \n"
"   ret\n   # you must do your own ret.\n"
  );
}

I used this in the ms to system v function call tests:
https://github.com/gcc-mirror/gcc/blob/master/gcc/testsuite/gcc.target/x86_64/abi/ms-sysv/ms-sysv.c#L172

[Bug driver/81519] Enhancement: Add --help=target-distcc or similar to dump clean, optimal CFLAGS without using -march=native

2017-07-25 Thread daniel.santos at pobox dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81519

--- Comment #3 from Daniel Santos  ---
(In reply to Martin Liška from comment #1)
> I can take a look later for GCC 8.0.

Thank you Martin!  I still don't understand enough of gcc to be able to do this
in any reasonable time frame and I've only worked with the i386 backend thus
far.

[Bug other/39851] gcc -Q --help=target does not list extensions selected by -march=

2017-07-22 Thread daniel.santos at pobox dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=39851

Daniel Santos  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |daniel.santos at pobox dot com

--- Comment #5 from Daniel Santos  ---
Confirmed.

$ diff -u0 <(gcc -Q -march=sandybridge --help=target) <(gcc -march=native -Q
--help=target)
--- /dev/fd/63  2017-07-22 14:49:57.839642336 -0500
+++ /dev/fd/62  2017-07-22 14:49:57.839642336 -0500
@@ -16 +16 @@
-  -maes    [disabled]
+  -maes    [enabled]
@@ -26 +26 @@
-  -mavx    [disabled]
+  -mavx    [enabled]
@@ -49 +49 @@
-  -mcx16   [disabled]
+  -mcx16   [enabled]
@@ -62 +62 @@
-  -mfxsr   [disabled]
+  -mfxsr   [enabled]
@@ -79 +79 @@
-  -mmmx    [disabled]
+  -mmmx    [enabled]
@@ -89 +89 @@
-  -mno-sse4    [enabled]
+  -mno-sse4    [disabled]
@@ -95 +95 @@
-  -mpclmul [disabled]
+  -mpclmul [enabled]
@@ -97 +97 @@
-  -mpopcnt [disabled]
+  -mpopcnt [enabled]
@@ -112 +112 @@
-  -msahf   [disabled]
+  -msahf   [enabled]
@@ -116,2 +116,2 @@
-  -msse    [disabled]
-  -msse2   [disabled]
+  -msse    [enabled]
+  -msse2   [enabled]
@@ -119,4 +119,4 @@
-  -msse3   [disabled]
-  -msse4   [disabled]
-  -msse4.1 [disabled]
-  -msse4.2 [disabled]
+  -msse3   [enabled]
+  -msse4   [enabled]
+  -msse4.1 [enabled]
+  -msse4.2 [enabled]
@@ -126 +126 @@
-  -mssse3  [disabled]
+  -mssse3  [enabled]
@@ -135 +135 @@
-  -mtune=  
+  -mtune=  sandybridge
@@ -142 +142 @@
-  -mxsave  [disabled]
+  -mxsave  [enabled]
@@ -144 +144 @@
-  -mxsaveopt   [disabled]
+  -mxsaveopt   [enabled]

[Bug driver/81519] New: Enhancement: Add --help=target-distcc or similar to dump clean, optimal CFLAGS without using -march=native

2017-07-22 Thread daniel.santos at pobox dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81519

Bug ID: 81519
   Summary: Enhancement: Add --help=target-distcc or similar to
dump clean, optimal CFLAGS without using -march=native
   Product: gcc
   Version: 8.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: driver
  Assignee: unassigned at gcc dot gnu.org
  Reporter: daniel.santos at pobox dot com
  Target Milestone: ---

To be clear, there is a working solution and that is to run a command such as
this:

gcc -v -E -x c -march=native -mtune=native - < /dev/null 2>&1 | grep cc1 | perl
-pe 's/^.* - //g;'

The problem is that it is very verbose and redundant.  You can filter out the
-mno-* flags, but I'm not certain that that is always correct and still leaves
many more -m flags.  Ideally, a command such as `gcc -Q -x c
-march=native --help=target-distcc' would emit something like

CFLAGS="-march=sandybridge --param l1-cache-size=32 --param
l1-cache-line-size=64 --param l2-cache-size=8192 -mtune=sandybridge"

In addition to the above, the output should contain any -m and -mno- extension
flags that actually differ from the arch, and should not include extensions
that are already implied by another extension.

Calculating this from a script would actually be possible if bug #39851 were
fixed, but there is currently no way (that I am aware of) to get an output of
the extensions each arch consists of.  These extensions are listed in invoke.texi
by marketing name, but this would still be a parsing challenge and is subject
to human error since it isn't parsed from the backend code.

Alternatively, I presume that cleaning up the driver code to omit redundant
options could result in the first example being sufficient, but I don't
understand the implications of that and I suspect would likely break other
things.

[Bug target/80969] [8 Regression] ICE in ix86_expand_prologue, at config/i386/i386.c:14606

2017-07-19 Thread daniel.santos at pobox dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80969

--- Comment #4 from Daniel Santos  ---
Created attachment 41794
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=41794&action=edit
proposed fix (still needs cleanup and tests)

This still needs cleanup and tests as well as some explanations, but it appears
to fix the problem and passes regression tests.  I haven't tested on a machine
with avx512f, however.  So there is still work and testing to be done.

This re-works the way the realigned stack is calculated and also reduces wasted
stack space due to realignment in some cases.  One drawback is that I have some
code somewhat duplicated in ix86_compute_frame_layout and I would like to
spend some more time with it to see if I can refactor it.

I will better document the changes when I have some more time.

[Bug target/80969] [8 Regression] ICE in ix86_expand_prologue, at config/i386/i386.c:14606

2017-07-02 Thread daniel.santos at pobox dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80969

--- Comment #3 from Daniel Santos  ---
Thank you for the report Martin.  I apologize for my slow start on this, I've
been a bit under the weather.  So when I wrote the code for using aligned SSE
saves with realigned (non-DRAP) stack pointer and the -mcall-ms2sysv-xlogues
feature, I had failed to anticipate the need for a stack alignment greater than
16 bytes for the function body.  This ICE is actually occurring from the
realigned stack changes, but this problem also exists with the
-mcall-ms2sysv-xlogues feature and both need to be fixed.

For the sake of the ABI, we only need to save the XMM portion of each SIMD
register (https://msdn.microsoft.com/en-us/library/9z1stfyw.aspx) and we only
need a 16-byte alignment for those, but this function body needs its locals
64-byte aligned.

On a side note, with the introduction of AVX-512, I wonder if there are some
opportunities to make more efficient (compact) use of the stack with holes
upwards of 56 bytes in the save area.  I should probably learn AVX/2/512
better.

[Bug testsuite/80759] gcc.target/x86_64/abi/ms-sysv FAILs

2017-06-26 Thread daniel.santos at pobox dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80759

--- Comment #56 from Daniel Santos  ---
(In reply to r...@cebitec.uni-bielefeld.de from comment #55)
> > --- Comment #54 from Daniel Santos  ---
> > Created attachment 41627 [details]
> >   --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=41627&action=edit
> > darwin fixup (on top of v6) -- second attempt
> [...]
> > The macro has only two uses, so if you prefer, I can remove it and just 
> > replace
> > it with inline #if blocks, e.g.,
> >
> > #ifdef __MACH__
> > "   mov " ASMNAME(test_data) "@GOTPCREL(%%rip), %%rax\n"
> > #else
> > "   lea " ASMNAME(test_data) "(%%rip), %%rax\n"
> > #endif
> 
> I'm fine either way, with a slight preference for the macro version (the
> less code duplication, the better ;-)

That's fine with me. :)  


> I've tested this patch last night on both x86_64-apple-darwin11.4.2 and
> i386-pc-solaris2.12 and it worked just fine on both!
> 
> Thanks a lot.
> 
>   Rainer

Wonderful! I presume that we still need libgcc buy-off?  I'll put together a
ChangeLog and post it to gcc-patches tomorrow.

Thanks!
Daniel

[Bug testsuite/80759] gcc.target/x86_64/abi/ms-sysv FAILs

2017-06-25 Thread daniel.santos at pobox dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80759

Daniel Santos  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #41605|0                           |1
        is obsolete|                            |

--- Comment #54 from Daniel Santos  ---
Created attachment 41627
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=41627&action=edit
darwin fixup (on top of v6) -- second attempt

So I've learned that some_symbol@GOTPCREL(%%rip) resolves to the address of
the GOT *entry* for that symbol, which has to be dereferenced to get the
address of the object itself.  I was able to test this on my machine by
changing #ifdef __MACH__ to #ifndef and this patch is working using the GOT.

I've re-written do_test_body and added a macro LOAD_TEST_DATA_ADDR(dest) in
hopes to make both the sources fairly readable and the resulting assembly also
readable.  To simplify the routine, I changed mem_to_regs/regs_to_mem to use
r10 instead of rax so that I don't have to save and restore it.

Of course this is sub-optimal code, but the execution of the test program is by
no means the bottleneck -- I'm trying to keep it as simple and maintainable as
possible!

The macro has only two uses, so if you prefer, I can remove it and just replace
it with inline #if blocks, e.g.,

#ifdef __MACH__
"   mov " ASMNAME(test_data) "@GOTPCREL(%%rip), %%rax\n"
#else
"   lea " ASMNAME(test_data) "(%%rip), %%rax\n"
#endif
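
For reference, a minimal sketch of how such a LOAD_TEST_DATA_ADDR(dest) macro
could be written.  The STR helpers and the load_test_data_addr_template()
function are assumptions added only for illustration, and ASMNAME is assumed
here to expand to a string literal that carries __USER_LABEL_PREFIX__; the
attached patch may well differ.

```c
/* Sketch only: builds the text of the asm template, it does not execute it.  */
#define STR2(x) #x
#define STR(x) STR2(x)

/* Assumed string-producing form of ASMNAME for use inside an asm string.  */
#ifdef __USER_LABEL_PREFIX__
# define ASMNAME(name) STR(__USER_LABEL_PREFIX__) #name
#else
# define ASMNAME(name) #name
#endif

#ifdef __MACH__
/* Darwin: load the object's address out of its GOT entry.  */
# define LOAD_TEST_DATA_ADDR(dest) \
  "\tmov\t" ASMNAME(test_data) "@GOTPCREL(%%rip), " dest "\n"
#else
/* Elsewhere: compute the rip-relative address directly.  */
# define LOAD_TEST_DATA_ADDR(dest) \
  "\tlea\t" ASMNAME(test_data) "(%%rip), " dest "\n"
#endif

/* Hypothetical helper: return the template text for loading into rax.  */
const char *
load_test_data_addr_template (void)
{
  return LOAD_TEST_DATA_ADDR ("%%rax");
}
```

Inside an extended asm statement each "%%" is emitted as a single "%", so the
non-Darwin variant would assemble as a plain lea of test_data(%rip).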

Thanks!
Daniel

[Bug testsuite/80759] gcc.target/x86_64/abi/ms-sysv FAILs

2017-06-24 Thread daniel.santos at pobox dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80759

--- Comment #53 from Daniel Santos  ---
(In reply to r...@cebitec.uni-bielefeld.de from comment #52)
> Unfortunately, the patch doesn't work, apart from the
> 
> +# define PCREL "@GETPCREL"
> 
> -> @GOTPCREL typo ;-)

Ah hah! That would explain why I couldn't use that addressing on gnu/linux, I
was looking for the Global Effset Table! :)

> At -O0 -g3, it SEGVs at
> 
> Program received signal SIGSEGV, Segmentation fault.
> 0x in ?? ()
> 1: x/i $pc
> => 0x0: 
> (gdb) where
> #0  0x in ?? ()
> #1  0x000100031c58 in do_test_body0 ()
> at
> /vol/gcc/src/hg/trunk/solaris/gcc/testsuite/gcc.target/x86_64/abi/ms-sysv/ms-
> sysv.c:178
> Backtrace stopped: previous frame inner to this frame (corrupt stack?)
> 
> where %rip is 0x0.  This happens because most of the addresses are off
> by 0x680 bytes.  Here's the disassembly:
> 
> (gdb) x/12i 0x000100031c58-42
>    0x100031c2e: push   %rbp
>    0x100031c2f: mov    %rsp,%rbp
>    0x100031c32: lea    0x1b407(%rip),%rax        # 0x10004d040
>    0x100031c39: callq  0x10003247c
>    0x100031c3e: lea    0x1b4db(%rip),%rax        # 0x10004d120
>    0x100031c45: callq  0x1000324ea
>    0x100031c4a: pop    %rax
>    0x100031c4b: mov    %rax,0x1b696(%rip)        # 0x10004d2e8
>    0x100031c52: callq  *0x1b688(%rip)        # 0x10004d2e0
>    0x100031c58: mov    0x1bd09(%rip),%rcx        # 0x10004d968
> 
> Here are the addresses that are supposed to be used:
> 
> %p0
> 
> (gdb) p/x _data.regdata[0]
> $11 = 0x10004d6c0
> 
> %p1
> 
> (gdb) p/x _data.regdata[1]
> $12 = 0x10004d7a0
> 
> %p4
> 
> (gdb) p/x _data.retaddr
> $13 = 0x10004d968
> 
> %p3
> 
> (gdb) p/x _data.fn
> $14 = 0x10004d960
> 
> Only the second use of %p4 is right.
> 
>   Rainer

Great! When I correct the GOTPCREL typo, I can build this on gnu/linux and I
get a variation of the same problem.  So apparently GOTPCREL allows you to
specify the address of the object, but not an address plus offset -- which is
why gcc emits that on Darwin in the first place.  All is becoming clear.

Also, I lied about needing all registers in do_test_(un)aligned; I forgot that
this is called as an ms_abi function.  I can clobber rax, r10 and r11 prior to
calling the test function and rcx, rdx, and r8-11 after the test function has
returned.  So I have plenty of registers to accommodate this.
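
To spell out the register reasoning above, here is a small sketch based on the
Microsoft x64 calling convention: a general-purpose register is scratch before
the call if it is volatile and does not carry an argument, and scratch after
the return if it is volatile and does not carry the return value.  The function
and table names are made up for this illustration.

```c
#include <stdbool.h>
#include <string.h>

/* General-purpose registers that are volatile (caller-saved) under the
   Microsoft x64 calling convention.  */
static const char *const ms_volatile_gprs[] = {
  "rax", "rcx", "rdx", "r8", "r9", "r10", "r11"
};

/* Integer argument registers of the same convention.  */
static const char *const ms_arg_gprs[] = { "rcx", "rdx", "r8", "r9" };

#define COUNT(a) (sizeof (a) / sizeof ((a)[0]))

static bool
in_list (const char *reg, const char *const *list, size_t n)
{
  for (size_t i = 0; i < n; i++)
    if (strcmp (reg, list[i]) == 0)
      return true;
  return false;
}

/* Scratch before the call: volatile, but not holding an argument.  */
bool
free_before_call (const char *reg)
{
  return in_list (reg, ms_volatile_gprs, COUNT (ms_volatile_gprs))
	 && !in_list (reg, ms_arg_gprs, COUNT (ms_arg_gprs));
}

/* Scratch after the return: volatile, but not holding the return value.  */
bool
free_after_return (const char *reg)
{
  return in_list (reg, ms_volatile_gprs, COUNT (ms_volatile_gprs))
	 && strcmp (reg, "rax") != 0;
}
```

With these definitions, rax, r10 and r11 come out as free before the call, and
rcx, rdx and r8-r11 as free afterwards, which matches the lists given above.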

[Bug testsuite/80759] gcc.target/x86_64/abi/ms-sysv FAILs

2017-06-21 Thread daniel.santos at pobox dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80759

--- Comment #50 from Daniel Santos  ---
Created attachment 41605
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=41605=edit
darwin fixup (on top of v6)

(In reply to r...@cebitec.uni-bielefeld.de from comment #49)
> 
> No worries at all: don't even think about this stuff before you're well
> again!

Thank you, but this is chronic and comes and goes.  The trend line seems to be
heading upward for the moment, however.

>[...]
> > In hopes of making your review easier, below is a delta between this new 
> > (v6)
> > patch set and your last posted patches.
> 
> The new patch works fine for me on both x86_64-pc-linux-gnu (as
> expected) and i386-pc-solaris2.12.
> 
> On x86_64-apple-darwin11.4.2, there are a couple of issues, some of which
> I'd already resolved before you posted the revised patch.
> 
> * Initially all tests SEGVed like this (e.g. with -p0 and compiled with -O2):
> 
> Program received signal SIGSEGV, Segmentation fault.
> 0x00010004664d in regs_to_mem ()
> 1: x/i $pc
> => 0x10004664d :   movaps %xmm6,(%rax)
> (gdb) where
> #0  0x00010004664d in regs_to_mem ()
> #1  0x0001000465df in do_test_body ()
> #2  0x00010002f227 in do_tests_ ()
> #3  0x0001000468e3 in main ()
> 
>   Here, %rax is 0x0.
> 
>   This happens because some setup happens between do_test_body0 and
>   do_test_body, and do_test_aligned jumps directly to do_test_body:
> 
>         .globl _do_test_body0
>         .no_dead_strip _do_test_body0
> _do_test_body0:
>         movq    _test_data@GOTPCREL(%rip), %rax
> 
>         .globl _do_test_body
> _do_test_body:
> 
>         # Save registers.
>         lea     (%rax), %rax
>         call    _regs_to_mem
> 
>   By that jump, you bypass the setup of %rax and make the test FAIL.  I
>   managed to avoid this by changing the jmp to do_test_body0 instead.
>   This gets me past this failure, and works on Linux/x86_64, too.
>   However, this makes the tests FAIL on Solaris/x86, supposedly due to
>   the -fomit-frame-pointer/-fno-omit-frame-pointer difference (though I
>   haven't looked more closely).

Thanks again for your help on this.  All of this asm is a big ABI hack and
presumes I'm working with the 64-bit SystemV ABI, but Apple's ABI appears to
differ somewhat (I may have found a good description of that here:
https://developer.apple.com/library/content/documentation/DeveloperTools/Conceptual/MachOTopics/1-Articles/x86_64_code.html).

My hope in declaring my own global symbol in the assembly and then explicitly
RET-ing was to bypass any ABI-specific setup and tear-down, most notably the
hard frame pointer.  In the case of Darwin, this doesn't appear to work since
the asm template instantiation of my "global + offset" seems to work quite
differently and wants to store the base address in a register that is already
being used for something else (actually, every register except XMM0-5 and
XMM16+ are volatile at this point).  I had expected it to generate something
akin to:

lea test_data + 224(%rip), %rax
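
To make the "global + offset" operand concrete, here is a sketch of a layout in
which 224 is the offset of the second register block, with the offsets checked
at compile time.  Only the 224 comes from the text above; the struct and field
names echo the gdb session quoted elsewhere in this thread, while the field
types, the three-block layout and the 0x2a0/0x2a8 offsets are assumptions made
for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register save block: ten 16-byte SSE slots plus eight
   8-byte general-purpose slots, 224 bytes in total.  */
struct reg_block
{
  _Alignas (16) unsigned char xmm[10][16];
  uint64_t gpr[8];
};

/* Hypothetical layout of the global the asm template refers to.  */
struct test_data_layout
{
  struct reg_block regdata[3];
  uint64_t fn;		/* Stands in for the function pointer.  */
  uint64_t retaddr;	/* Stands in for the saved return address.  */
};

/* Hard-coded offsets of the kind an assembly file would use.  */
#define REGDATA_SIZE	224
#define OFF_REGDATA(n)	((n) * REGDATA_SIZE)
#define OFF_FN		0x2a0
#define OFF_RETADDR	0x2a8

/* Compile-time checks that the constants agree with the C layout.  */
_Static_assert (sizeof (struct reg_block) == REGDATA_SIZE,
		"register block size changed");
_Static_assert (offsetof (struct test_data_layout, regdata[1])
		== OFF_REGDATA (1), "regdata[1] offset changed");
_Static_assert (offsetof (struct test_data_layout, fn) == OFF_FN,
		"fn offset changed");
_Static_assert (offsetof (struct test_data_layout, retaddr) == OFF_RETADDR,
		"retaddr offset changed");

/* Byte offset of register block N from the start of the object.  */
size_t
regdata_offset (unsigned n)
{
  return OFF_REGDATA (n);
}
```

Under these assumptions, "test_data + 224" is simply the address of regdata[1].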

It would be nice if the "naked" function attribute were available for the i386
back-end, then I wouldn't have to screw around with trying to hack-away the
ABI. (maybe a worthwhile future venture)

The attached patch (on top of v6) *might* solve the problem on Darwin, but I
don't understand exactly how GOTPCREL works, other than it's using a global
offset table for linking.  Hopefully, the linker can translate this directly
into a constant rip-rel offset.  What I'm doing here is that instead of feeding
addresses to the asm template, I'm giving it the offsets and schlepping
together an address operand from that, e.g.:

lea %p0 + test_data@GOTPCREL(%%rip), %%rax

Now if this fix *does* work, then I might need to investigate if this is a
performance problem for Darwin -- why use an extra instruction to copy the
address to a register before modifying it?  If it doesn't work then it's
probably because it really *needs* two instructions.  I'm curious what the
disassembly of the linked program looks like.

> 
> * With the do_test_body0 jump, I hit the next issue on Darwin with -O0:
>   the test SEGVs here:
> 
> Program received signal SIGSEGV, Segmentation fault.
> 0x000100031c4e in do_test_body0 ()
> at
> /vol/gcc/src/hg/trunk/solaris/gcc/testsuite/gcc.target/x86_64/abi/ms-sysv/ms-
> sysv.c:163
> 163   __asm__ ("\n"
> 1: x/i $pc
> => 0x100031c4e:  mov    %rax,0x2a8(%rdi)
> (gdb) where
> (gdb) p/x $rdi
> $1 = 0x5dc3340b214ef45c

Yeah, this won't work because the reality is that all GP registers are volatile
at this point, but gcc will generate code that clobbers them based upon the
alleged ABI of the function -- which is a lie. :)  If the attached patch
doesn't work, then I think it's best to just move the assembly back into
do-test.S and hard-code the offsets in a shared header file, so we can use them
as a macro from do-test.S and also check them with asserts in

[Bug testsuite/80759] gcc.target/x86_64/abi/ms-sysv FAILs

2017-06-19 Thread daniel.santos at pobox dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80759

--- Comment #48 from Daniel Santos  ---
Created attachment 41588
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=41588=edit
proposed fix v6 2/2 (libgcc)

The only thing this changes from your patches is some macro names and testing
HAVE_GAS_HIDDEN.  From what I can tell ".hidden" is an ELF thing.  (I guess I
really need to learn the ins and outs of ELF and DWARF much better.)  There
isn't currently a HAVE_GAS_SIZE in gcc/configure.ac, but it looks like .size
syntax can vary across assemblers as well, so hopefully just the #ifdef __ELF__
is a sufficient test for that.

[Bug testsuite/80759] gcc.target/x86_64/abi/ms-sysv FAILs

2017-06-19 Thread daniel.santos at pobox dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80759

Daniel Santos  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #41533|0                           |1
        is obsolete|                            |
  Attachment #41544|0                           |1
        is obsolete|                            |

--- Comment #47 from Daniel Santos  ---
Created attachment 41587
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=41587=edit
proposed fix v6 1/2 (testsuite)

I'm sorry for the delay again.  I've been having some health problems
infringing upon my hacking time.

I wanted to study the use of __USER_LABEL_PREFIX__ to make sure I understand
the implications.  I'm not completely clear on whether or not this is
automatically applied when a back end uses gen_rtx_SYMBOL_REF (or some such),
but my guess is that it is.  It is also plausible to omit Darwin support for
now, as I've learned that 64-bit Wine isn't yet working for Darwin either.  If
there are further problems, then that might be the smartest way to go since I
don't have access to such a machine with which I can test, experiment, debug,
etc.  But if this does the trick, then all the better.

I changed the C() macro to ASMNAME() just because I prefer helpful names and I
decided to yank out all of the FUNC_BEGIN/FUNC_END macros from ms-sysv.c and
just use #if directives directly in the string definition.  There's no sense in
maintaining a separate set of asm support macros dealing in strings when
there's only one use site.
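
As a side note on why a prefix macro like ASMNAME goes through two levels, here
is a small sketch.  DEMO_PREFIX is a made-up stand-in for
__USER_LABEL_PREFIX__ so that the effect is visible on any platform, and
PASTE_DIRECT, STR and the two functions are likewise only for illustration.

```c
/* Stand-in for __USER_LABEL_PREFIX__ (assumed to be "_" as on Darwin).  */
#define DEMO_PREFIX _

#define STR2(x) #x
#define STR(x) STR2(x)

/* One level: operands of ## are pasted without being macro-expanded.  */
#define PASTE_DIRECT(prefix, name) prefix ## name

/* Two levels: the arguments are expanded before they reach ##.  */
#define ASMNAME2(prefix, name) prefix ## name
#define ASMNAME1(prefix, name) ASMNAME2(prefix, name)
#define ASMNAME(name) ASMNAME1(DEMO_PREFIX, name)

/* Pastes the macro's own name instead of its value.  */
const char *
direct_name (void)
{
  return STR (PASTE_DIRECT (DEMO_PREFIX, regs_to_mem));
}

/* Pastes the expanded prefix, giving the decorated symbol name.  */
const char *
expanded_name (void)
{
  return STR (ASMNAME (regs_to_mem));
}
```

Here direct_name() yields "DEMO_PREFIXregs_to_mem" while expanded_name() yields
"_regs_to_mem", which is why the extra ASMNAME1 step is needed.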

I also noticed a possible "gotcha" with the #if __x86_64__ and __SSE2__ -- not
that I would necessarily expect it to happen.

In hopes of making your review easier, below is a delta between this new (v6)
patch set and your last posted patches.





diff --git a/gcc/testsuite/gcc.target/x86_64/abi/ms-sysv/do-test.S
b/gcc/testsuite/gcc.target/x86_64/abi/ms-sysv/do-test.S
index 40e119b6cc3..4a4f2e42c61 100644
--- a/gcc/testsuite/gcc.target/x86_64/abi/ms-sysv/do-test.S
+++ b/gcc/testsuite/gcc.target/x86_64/abi/ms-sysv/do-test.S
@@ -23,40 +23,42 @@ a copy of the GCC Runtime Library Exception along with this
program;
 see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 .  */

-#ifdef __x86_64__
-
-# ifdef __ELF__
-#  define ELFFN_BEGIN(fn)   .type fn,@function
-#  define ELFFN_END(fn) .size fn,.-fn
-# else
-#  define ELFFN_BEGIN(fn)
-#  define ELFFN_END(fn)
-# endif
-
-#define C2(X, Y)  X ## Y
-#define C1(X, Y)  C2(X, Y)
+#if defined(__x86_64__) && defined(__SSE2__)
+
+/* These macros currently support GNU/Linux, Solaris and Darwin.  */
+
+#ifdef __ELF__
+# define FN_TYPE(fn) .type fn,@function
+# define FN_SIZE(fn) .size fn,.-fn
+#else
+# define FN_TYPE(fn)
+# define FN_SIZE(fn)
+#endif
+
 #ifdef __USER_LABEL_PREFIX__
-# define C(X) C1(__USER_LABEL_PREFIX__, X)
+# define ASMNAME2(prefix, name)prefix ## name
+# define ASMNAME1(prefix, name)ASMNAME2(prefix, name)
+# define ASMNAME(name) ASMNAME1(__USER_LABEL_PREFIX__, name)
 #else
-# define C(X) X
+# define ASMNAME(name) name
 #endif

-# define FUNC(fn)  \
-   .globl C(fn);   \
-   ELFFN_BEGIN(C(fn)); \
-C(fn):
+#define FUNC_BEGIN(fn) \
+   .globl ASMNAME(fn); \
+   FN_TYPE (ASMNAME(fn));  \
+ASMNAME(fn):

-#define FUNC_END(fn) ELFFN_END(fn)
+#define FUNC_END(fn) FN_SIZE(ASMNAME(fn))

-# ifdef __AVX__
-#  define MOVAPS vmovaps
-# else
-#  define MOVAPS movaps
-# endif
+#ifdef __AVX__
+# define MOVAPS vmovaps
+#else
+# define MOVAPS movaps
+#endif

.text

-FUNC(regs_to_mem)
+FUNC_BEGIN(regs_to_mem)
MOVAPS  %xmm6, (%rax)
MOVAPS  %xmm7, 0x10(%rax)
MOVAPS  %xmm8, 0x20(%rax)
@@ -78,7 +80,7 @@ FUNC(regs_to_mem)
retq
 FUNC_END(regs_to_mem)

-FUNC(mem_to_regs)
+FUNC_BEGIN(mem_to_regs)
MOVAPS  (%rax), %xmm6
MOVAPS  0x10(%rax),%xmm7
MOVAPS  0x20(%rax),%xmm8
@@ -101,8 +103,7 @@ FUNC(mem_to_regs)
 FUNC_END(mem_to_regs)

 # NOTE: Not MT safe
-FUNC(do_test_unaligned)
-   #.cfi_startproc
+FUNC_BEGIN(do_test_unaligned)
# The below alignment checks are to verify correctness of the test
# its self.

@@ -112,7 +113,7 @@ FUNC(do_test_unaligned)
jne L0
int $3  # Stack not unaligned

-FUNC(do_test_aligned)
+FUNC_BEGIN(do_test_aligned)
# Verify that incoming stack is aligned
pushf
test$0xf, %rsp
@@ -120,8 +121,7 @@ FUNC(do_test_aligned)
int $3  # Stack not aligned
 L0:
popf
-   jmp C(do_test_body)
-#.cfi_endproc
+   jmp ASMNAME(do_test_body)
 FUNC_END(do_test_aligned)
 FUNC_END(do_test_unaligned)

diff --git a/gcc/testsuite/gcc.target/x86_64/abi/ms-sysv/ms-sysv.c
b/gcc/testsuite/gcc.target/x86_64/abi/ms-sysv/ms-sysv.c
index 2880f6fc9a2..81c9c1ffdac

[Bug testsuite/80759] gcc.target/x86_64/abi/ms-sysv FAILs

2017-06-13 Thread daniel.santos at pobox dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80759

Daniel Santos  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #41544|proposed fix v5 addendum    |proposed fix v5 addendum
        description|(only partially tested)     |

--- Comment #40 from Daniel Santos  ---
Comment on attachment 41544
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=41544
proposed fix v5 addendum

OK, my tests have completed successfully.

[Bug testsuite/80759] gcc.target/x86_64/abi/ms-sysv FAILs

2017-06-12 Thread daniel.santos at pobox dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80759

Daniel Santos  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #41543|0                           |1
        is obsolete|                            |

--- Comment #38 from Daniel Santos  ---
Created attachment 41543
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=41543=edit
proposed fix v5 addendum (only partially tested)

I've only run check on RUNTESTFLAGS="ms-sysv.exp" so far and I have a full
regression test running right now, but I *think* this is correct.  I'm
presuming that using .hidden is a no-no as well, at least from what I can tell
it's elf-specific, but I'm not sure what else to do with it other than #ifdef
__ELF__.  (I googled 'hidden elf' and got a lot of interesting fiction...)  So
I'm sorry to just ask you to see if it blows up on Solaris & Darwin w/o gas.

I'm also unsure about my changes to libgcc/config.host as I just don't have a
broad understanding of all of the *nix platforms out there.

Feedback greatly appreciated!

Thanks,
Daniel

--- Comment #39 from Daniel Santos  ---
Created attachment 41544
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=41544=edit
proposed fix v5 addendum (only partially tested)

Removed a stray carriage return...

[Bug testsuite/80759] gcc.target/x86_64/abi/ms-sysv FAILs

2017-06-12 Thread daniel.santos at pobox dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80759

--- Comment #37 from Daniel Santos  ---
(In reply to Daniel Santos from comment #36)
> tutor!  :)  This is assembly with cpp, so the gas .macro could be replaced
> with a cpp macro, but is that acceptable considering that it would result in
> multiple instructions on the same line delimited by semicolons instead of
> "\n\t"?  So should I just copy & paste the instructions and be done with it?

I forgot to note that even though the gas .macro takes an argument for base
offset, in all uses that offset is 0x60.

[Bug testsuite/80759] gcc.target/x86_64/abi/ms-sysv FAILs

2017-06-12 Thread daniel.santos at pobox dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80759

--- Comment #36 from Daniel Santos  ---
Thank you for all of your work on this.  The .cfi directives shouldn't be *too*
critical -- I've barely scratched the surface of learning DWARF and, iirc, the
last time I stepped through these stubs in gdb it still wasn't always able to
determine the call frame correctly (although I could be thinking of stepping
through the assembly code in the test program).  I suppose this can be an issue
for somebody debugging Wine code at some future date, but I have no qualms with
removing them for now, and possibly redoing them later in a more portable way
(one that actually provides the debugger with everything it needs).

Also, you *had* mentioned this linking problem in the past and I apologize for
losing track of it.  I have not actually done a thorough study of ABIs used in
other *nix operating systems, but my guess would be that all 64-bit platforms
that GCC supports use the SystemV ABI except for Windows (Cygwin & MinGW)?
This is a question outside of my expertise, so please let me know if the below
solution is amenable.

I should also note that while this optimization isn't meant for Windows and
would likely almost never appear in code built for Windows (unless somebody is
trying to link to objects/libs built for *nix), support on Windows is
explicitly disabled due to the SEH unwind emit code not supporting
REG_CFA_EXPRESSION, which it requires.  So we don't need the stubs on Windows
anyway.

diff --git a/libgcc/config.host b/libgcc/config.host
index 7711abf2704..f0f0d6c0916 100644
--- a/libgcc/config.host
+++ b/libgcc/config.host
@@ -1355,6 +1355,14 @@ esac
 case ${host} in
 i[34567]86-*-* | x86_64-*-*)
case ${host} in
+   *-*-cygwin* | *-*-mingw*)
+   ;;
+   *)
+   tmake_file="${tmake_file} i386/t-msabi"
+   ;;
+   esac
+
+   case ${host} in
*-musl*)
tmake_file="${tmake_file} i386/t-cpuinfo-static"
;;
@@ -1365,11 +1373,12 @@ i[34567]86-*-* | x86_64-*-*)
;;
 esac

+
 case ${host} in
 i[34567]86-*-linux* | x86_64-*-linux* | \
   i[34567]86-*-kfreebsd*-gnu | x86_64-*-kfreebsd*-gnu | \
   i[34567]86-*-gnu*)
-   tmake_file="${tmake_file} t-tls i386/t-linux i386/t-msabi
t-slibgcc-libgcc"
+   tmake_file="${tmake_file} t-tls i386/t-linux t-slibgcc-libgcc"
if test "$libgcc_cv_cfi" = "yes"; then
tmake_file="${tmake_file} t-stack i386/t-stack-i386"
fi


As for the stubs, I don't think there's a real need to stay tied to gas
extensions -- truth be told, this is the first non-inline x86 assembly code I
have written (the last time I wrote assembly was on a MOS 6502/6510), so I'm
sorry to have forced you to become my unwitting tutor!  :)
This is assembly with cpp, so the gas .macro could be replaced with a cpp
macro, but is that acceptable considering that it would result in multiple
instructions on the same line delimited by semicolons instead of "\n\t"?  So
should I just copy & paste the instructions and be done with it?

Thanks,
Daniel

[Bug testsuite/80759] gcc.target/x86_64/abi/ms-sysv FAILs

2017-06-10 Thread daniel.santos at pobox dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80759

Daniel Santos  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #41532|0                           |1
        is obsolete|                            |

--- Comment #32 from Daniel Santos  ---
Created attachment 41533
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=41533=edit
41532: proposed fix v5

Slight change, I forgot this little bit:

@@ -66,8 +66,6 @@ if { (![istarget x86_64-*-*] && ![istarget i?86-*-*])
 return
 }

-global GCC_RUNTEST_PARALLELIZE_DIR
-
 proc runtest_ms_sysv { cflags generator_args } {
 global GCC_UNDER_TEST HOSTCXX HOSTCXXFLAGS tmpdir srcdir subdir \
   TEST_ALWAYS_FLAGS runtests

[Bug testsuite/80759] gcc.target/x86_64/abi/ms-sysv FAILs

2017-06-10 Thread daniel.santos at pobox dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80759

Daniel Santos  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #41486|0                           |1
        is obsolete|                            |
  Attachment #41487|0                           |1
        is obsolete|                            |
  Attachment #41488|0                           |1
        is obsolete|                            |
  Attachment #41489|0                           |1
        is obsolete|                            |
  Attachment #41490|0                           |1
        is obsolete|                            |

--- Comment #31 from Daniel Santos  ---
Created attachment 41532
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=41532=edit
proposed fix v4

OK, so here goes version 4!  Hopefully we're getting closer.

(In reply to r...@cebitec.uni-bielefeld.de from comment #30)
> Fine.  Imagine your formatting/coding style changes: when they are
> intermixed with functional changes, every reviewer has to check
> thoroughly which part is which.  If you separate them (which is easy for
> you to do) and submit them separately, the formatting stuff will be
> obvious or nearly so, and you save everyone time by having to review
> only the beef of the functional changes, in the end also giving you
> faster review.

Yes.  I had at least split out all of the formatting stuff into the first
patch, but I had attached them from git format-patch and selected the patch
checkbox in Bugzilla, so maybe it wasn't as obvious as it didn't show the
subject or text in the mbox file (it would have probably been better to copy &
paste the subject line of each patch).  I've separated out only what's relevant
to this bug report: what breaks on Solaris, the ugly hard-coded offsets and
replacing the custom parallelization code.  Note that I have included the
changes to split the test jobs into smaller pieces so that potential
RTL-checking timeout issues are resolved.

> > Please also note that I did seek guidance when putting this exp file 
> > together
> > (back in December)  I was following Mike Stump's direction, but you were
> > probably on vacation or something. :) 
> > https://gcc.gnu.org/ml/gcc/2016-12/msg00145.html
> 
> Might be.  However, I've been appointed testsuite maintainer at a time
> when nobody else was around.  Later, Mike stepped up again who's way
> more experienced here than I am.  In the case at hand, I happen to be
> both victim of your patches' fallout and testsuite maintainer ;-)

lol!!  Well I think you're a bit more knowledgeable about some parts of the
test harness, as his original direction left all of my tests running one for
each -j!  Once I figured that out, I tried to pester Mike and the mailing list
for a solution, but ended up just hacking out that custom parallelization code.
But it's all good because I'm learning, and that keeps me happy. :)

> > I've also been motivated to expand the tests by a change somebody else made 
> > to
> > my original patch that I wasn't confident the original tests would fully 
> > check
> > (been worried about it, but it all looks good).  I'll get a cleaned up patch
> > for you soon.
> 
> Excellent, thanks for your patience.  I know the first times through the
> system can be hard and tedious...

HAH! Thanks for YOUR patience! :)  This patch doesn't include those extended
tests, but at least I have run them and feel more confident.  I'll submit those
changes separately.


> Here's the complete output from and amd64-pc-solaris2.12 build with your
> patch included:
> 
> spawn /var/gcc/regression/trunk/12-gcc-64/build/gcc/xgcc
> -B/var/gcc/regression/trunk/12-gcc-64/build/gcc/
> /vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.target/x86_64/abi/ms-sysv/ms-
> sysv.c -fno-diagnostics-show-caret -fdiagnostics-color=never -O2
> -DGEN_ARGS=-p1 -t64
> -I/var/gcc/regression/trunk/12-gcc-64/build/gcc/testsuite/gcc6/ms-sysv
> -I/vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.target/x86_64/abi/ms-sysv
> -Wall -Wall
> /vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.target/x86_64/abi/ms-sysv/do-
> test.S -lm -o ./ms-sysv.exe^M
> In file included from
> /vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.target/x86_64/abi/ms-sysv/ms-
> sysv.c:158:0:^M
> /var/gcc/regression/trunk/12-gcc-64/build/gcc/testsuite/gcc6/ms-sysv/ms-sysv-
> generated.h: In function 'msabi_02_1':^M
> /var/gcc/regression/trunk/12-gcc-64/build/gcc/testsuite/gcc6/ms-sysv/ms-sysv-
> generated.h:42:1: error: bp cannot be used in asm here^M
> compiler exited with status 1
> 
> I suspect the problem is not explicitly passing -fno-omit-frame-pointer,
> though.  gcc/config/i386/sol2.h has
> 
> #define USE_IX86_FRAME_POINTER 1
> #define USE_X86_64_FRAME_POINTER 1
> 
> which does this implicitly...
> 
>   Rainer

Awesome!  This is what I've done in this patch

 # Detect when hard frame pointers are enabled (or required) so we know not

[Bug testsuite/80759] gcc.target/x86_64/abi/ms-sysv FAILs

2017-06-09 Thread daniel.santos at pobox dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80759

--- Comment #29 from Daniel Santos  ---
(In reply to r...@cebitec.uni-bielefeld.de from comment #28)
> As I've said before, the parallelization of ms-sysv.exp runs may be a
> bonus, but is certainly separate from this PR and thus should be split
> out:

Yes, you are right of course.  I was trying to kill too many birds with one
stone and I somehow omitted a bit of your patch for the function size thing,
sorry about that.  Some of this gets complicated though: if you want me to use
dg-runtest, then a few other changes must be made as well, but obviously not as
many as I had included.  I'll get this sorted out.

Please also note that I did seek guidance when putting this exp file together
(back in December).  I was following Mike Stump's direction, but you were
probably on vacation or something. :)
https://gcc.gnu.org/ml/gcc/2016-12/msg00145.html

I've also been motivated to expand the tests by a change somebody else made to
my original patch that I wasn't confident the original tests would fully check
(been worried about it, but it all looks good).  I'll get a cleaned up patch
for you soon.


(In reply to r...@cebitec.uni-bielefeld.de from comment #27)

> * Also as I'd reported before, with the fix above, I still get a couple
>   of FAILures:
> 
>
> FAIL: gcc.target/x86_64/abi/ms-sysv/ms-sysv.c  -O2 "-DGEN_ARGS=-p0\ -t64"
> (test for excess errors)
> Excess errors:
> /var/gcc/regression/trunk/12-gcc/build/gcc/testsuite/gcc/ms-sysv/ms-sysv-
> generated.h:30:1: error: bp cannot be used in asm here
> 
>   Full compiler output is
> 
> In file included from
> /vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.target/x86_64/abi/ms-sysv/ms-
> sysv.c:158:0:
> /var/gcc/regression/trunk/12-gcc/build/gcc/testsuite/gcc/ms-sysv/ms-sysv-
> generated.h: In function 'msabi_02_0':
> /var/gcc/regression/trunk/12-gcc/build/gcc/testsuite/gcc/ms-sysv/ms-sysv-
> generated.h:30:1: error: bp cannot be used in asm here
> 
>   At least some of the tests PASS now :-)

Well this is a problem and is unexpected.  Can you please post the relevant
portion of the log file?  What I really need to see is the command line to
build ms-sysv.c.  I'm going to *guess* that the problem is that
TEST_ALWAYS_FLAGS contains something that enables hard frame pointers and that
I need this little change:

 # Detect when hard frame pointers are enabled (or required) so we know not
 # to generate bp clobbers.
-if [regexp "^(.+ +| *)-(O0|fno-omit-frame-pointer|p|pg)( +.*)?$" \
-  $cflags match] then {
+if [regexp "^( *|.* )-(O0|fno-omit-frame-pointer|p|pg)( *| +.*)$" \
+  "$TEST_ALWAYS_FLAGS $cflags" match] then {
set generator_args "$generator_args --omit-rbp-clobbers"
 }

We could also just pass --omit-rbp-clobbers to the generator in all cases,
although it would weaken the tests.
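
To show which flag strings the revised pattern accepts, here is a sketch that
runs the same expression through POSIX regcomp/regexec as an extended regex
(the original is a Tcl regexp, so this is only an approximation of the .exp
code).  The function name and the fixed buffer size are assumptions of the
example.

```c
#include <regex.h>
#include <stdbool.h>
#include <stdio.h>

/* The pattern from the snippet above, as a POSIX extended regex.  */
static const char hard_fp_pattern[] =
  "^( *|.* )-(O0|fno-omit-frame-pointer|p|pg)( *| +.*)$";

/* Return true when the combined flags contain an option that enables (or
   requires) a hard frame pointer, i.e. when --omit-rbp-clobbers is needed.  */
bool
needs_omit_rbp_clobbers (const char *always_flags, const char *cflags)
{
  char combined[1024];
  regex_t re;
  bool match;

  /* Mirrors "$TEST_ALWAYS_FLAGS $cflags" from the Tcl code.  */
  int n = snprintf (combined, sizeof combined, "%s %s", always_flags, cflags);
  if (n < 0 || (size_t) n >= sizeof combined)
    return false;	/* Too long for this sketch; treat as no match.  */

  if (regcomp (&re, hard_fp_pattern, REG_EXTENDED | REG_NOSUB) != 0)
    return false;
  match = regexec (&re, combined, 0, NULL, 0) == 0;
  regfree (&re);
  return match;
}
```

For example, needs_omit_rbp_clobbers ("-fno-omit-frame-pointer", "-O2") is
true, which is the case the old check on $cflags alone would have missed,
while plain "-O2" or "-O2 -pipe" do not match.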

Thanks,
Daniel

[Bug testsuite/80759] gcc.target/x86_64/abi/ms-sysv FAILs

2017-06-07 Thread daniel.santos at pobox dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80759

--- Comment #26 from Daniel Santos  ---
Created attachment 41490
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=41490=edit
proposed fix v3 part 5

I'm currently running a few jobs to try to measure the difference in load
average and running time of tests.  I can think of a whole lot of better
solutions to this issue, but I suppose I will need to throw a few numbers out
there if I'm going to assert it's a problem.  With or without this last patch,
a lot of other problems should be solved.

[Bug testsuite/80759] gcc.target/x86_64/abi/ms-sysv FAILs

2017-06-07 Thread daniel.santos at pobox dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80759

--- Comment #25 from Daniel Santos  ---
Created attachment 41489
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=41489=edit
proposed fix v3 part 4

[Bug testsuite/80759] gcc.target/x86_64/abi/ms-sysv FAILs

2017-06-07 Thread daniel.santos at pobox dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80759

--- Comment #24 from Daniel Santos  ---
Created attachment 41488
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=41488=edit
proposed fix v3 part 3

[Bug testsuite/80759] gcc.target/x86_64/abi/ms-sysv FAILs

2017-06-07 Thread daniel.santos at pobox dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80759

--- Comment #23 from Daniel Santos  ---
Created attachment 41487
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=41487=edit
proposed fix v3 part 2

[Bug testsuite/80759] gcc.target/x86_64/abi/ms-sysv FAILs

2017-06-07 Thread daniel.santos at pobox dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80759

Daniel Santos  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #41397|0                           |1
        is obsolete|                            |
  Attachment #41398|0                           |1
        is obsolete|                            |

--- Comment #22 from Daniel Santos  ---
Created attachment 41486
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=41486=edit
proposed fix v3 part 1

Rainer,

I thought I would post this here before posting to the list since I still don't
have a useable i686 build to test with.  Either way, I *think* all of the
Solaris problems should be fixed.  This patch set addresses a number of other
issues as well and ends with a proposed approach to tune parallelization.

[Bug testsuite/80759] gcc.target/x86_64/abi/ms-sysv FAILs

2017-06-01 Thread daniel.santos at pobox dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80759

--- Comment #21 from Daniel Santos  ---
(In reply to r...@cebitec.uni-bielefeld.de from comment #20)
> > failures, but if you call dg-runtest, you are using gcc's hack-daptation of
> > parallelization.  However, your patch doesn't remove *my* hack-daptation of
> > parallelization, so we end up with two different parallelization schemes 
> > that
> > step on each other's toes.
> >
> > Another problem with the already present parallelization is that it bunches
> > tests into groups of 10 per job which will perform very poorly for these 
> > tests.
> >
> > (https://github.com/gcc-mirror/gcc/blob/master/gcc/testsuite/lib/gcc-defs.exp#L170).
> >  I presume this is to reduce disk I/O and it makes sense from that 
> > standpoint
> > (I don't want to know what it would take to get a ramdisk/tmpfs in a
> > platform-neutral fashion.)
> 
> My basic point still stands: running your ms-sysv tests sequentially
> takes just a few minutes even on an old and (by today's standards) slow
> CPU, so there's absolutely no point investing lots of effort and
> complexity to parallelize what already runs adequately fast sequentially!

There is plenty of point!  It may be fast without --enable-checking=rtl, but
it's very slow with it.  A very large portion of my development lifecycle was
spent waiting for tests to run.  Using --enable-checking=rtl caught SO many
errors that didn't cause (or might not have caused) an ICE without it.

Now at the time, all I had was a Phenom and when I had more than 64 tests run
in a single function it was very slow (I presume due to thrashing the data
cache) which is the reason the generator splits the tests out into multiple
functions that run 64 tests each.  I have a nice quad i7 now, so it's going
faster.  But one thing I hadn't gotten back to yet was adding more extensive
tests using features and optimizations that affect the stack (-fsplit-stack,
-pg, etc.).


> > However, I'm learning a little more about how the test harness works, and it
> > MAY be possible to call gcc_parallel_test_enable 0 at the start of
> > ms-sysv.exp and be able to use all of the built-in dg-runtest, et al.
> > functions!  If I can get this to work (and not break something else in the
> > process), then we may be on to a pathway to clean up ms-sysv.exp a little
> > bit -- that is except for a few outstanding (possibly surmountable) issues:
> >
> > 1.) Can the default time-out of 5 minutes be changed?  I need 20 minutes for
> > the slowest processors and a whole HOUR when full tests are enabled.
> 
> Sure it can: for one there are dg-timeout (and preferably
> dg-timeout-factor) per testcase.  I still wonder why you'd need that, though: if
> all your tests together take no more than a few minutes, why would you
> need to increase the timeout at all?  Which processor would this be that
> takes 20 minutes or even an hour to run the tests *and complete all
> other tests well within the five minute timeout*?

Thanks for that!  I *may* have a better solution (described below).  But the 5
minute timeout even happens on my new i7 when --enable-checking=rtl is on.

> In fact, every test that takes more than about a minute on a reasonably
> current CPU is frowned upon because under parallel testing/load such
> tests tend to run into the timeout.

The time is taken during compilation and also during linking when -flto is used
(again, with rtl checking).

> If your tests regularly exceed the timeout, there's something wrong with
> them: you need to split them so individual tests complete within the
> minute just mentioned.

Hah! That is actually the direction I had decided to go and have been testing
it the last few days. :)  However, they can still run longer than 1 minute
(with rtl checking).  I can split them apart even further with a few changes to
the code generator.

> If really necessary in a setup, it is possible to set
> board_info(unix,gcc,timeout) in a global site.exp file, e.g. to deal
> with really slow/ancient systems.  This would be necessary without your
> test anyway.
> 
> > 2.) The test description should include the generator flags and not just the
> > CFLAGS.  Is that possible from dg-runtest, et. al.?  I suppose it's always
> > possible to add them to CFLAGS with -DGEN_FLAGS="-p0-12" as a hack.
> 
> That's a requirement actually: the summary lines for different runs of a
> test must differ so you can tell them apart if one of them fails.  How
> this is done in the end is primarily a cosmetic issue, though.

I seem to have worked this out, although I have to do a regsub to replace
spaces in the generator_args with an escaped space and it prints a little ugly,
but at least it works.

set escaped_generator_args [regsub -all " " $generator_args "\\ "]
set cflags "$cflags\"-DGEN_ARGS=$escaped_generator_args\""

> > I guess you can see why I said that I was "semi-content" to leave it like it
> > is. :)  But I'm also glad to better understand how the

[Bug testsuite/80759] gcc.target/x86_64/abi/ms-sysv FAILs

2017-05-28 Thread daniel.santos at pobox dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80759

--- Comment #18 from Daniel Santos  ---
I intended to respond to your comments from 6 days ago sooner, but better late
than never!  Again, sorry for the delay.

(In reply to r...@cebitec.uni-bielefeld.de from comment #14)
> You need to make certain to have the necessary 32-bit libraries and
> headers.  Apart from that, configure --target=i686-pc-linux-gnu
> --enable-targets=all should be enough, together with CC='gcc -m32'
> CXX='g++ -m32'.  I don't pass --enable-multilib, this happens by default.

I think I'll need to figure this part out soon, as I posted earlier about
strange problems when calling exec.  As it turns out, there are a lot of
variables in the generated script that aren't being populated,
ORIGINAL_AS_FOR_TARGET being one of them.  The "exec: --: invalid option"
problem occurs because ORIGINAL_AS_FOR_TARGET expands to an empty string and
the first thing after "exec" is the argument "--32" that is intended for the AS
program.

If it turns out that this is because I AM missing some 32-bit library, then we
still have a bug somewhere as some configure script should fail (but it's
failing to fail :).  I build my x86_64 Gentoo systems with ABI_X86="64 32", so
I should have all 32 bit libraries for pretty much everything.

> >> However, I still don't understand why you are jumping through all these
> >> hoops in ms-sysv.exp doing the compilations etc. manually rather than
> >> just relying on dg-runtest or similar.  This would avoid all this
> >> multilib trouble nicely, and massively reduce ms-sysv.exp.
> >
> > Well quite frankly because dg-runtest, et al. don't offer support for
> > tests that use code generators.  The generated headers using the default
> > options are
> 
> Why would they need to?  You just generate the headers in advance and
> then invoke dg-runtest to compile and run the ms-sysv.c test proper.

The side-effect of this is that we build the code generator for every -j job
and run it prior to every call to dg-runtest on each job.  The building takes a
few seconds and generating the header is almost instant, but it's still a lot
of I/O.  Some of this will likely get mitigated by the I/O scheduler, when a
previously written header hasn't been committed to disk yet and a second write
removes the previous one from the I/O queue, but it's still a lot of waste
IMHO.  And as I mentioned before, if you build with -j48, you'll still end up
wasting a lot of disk space, I think around half a gig of space.

The second problem is that using dg-runtest assigns tests to jobs in bunches of
10.  At current, there are 4 tests and they would all be assigned to a single
CPU, even though each CPU would have to build and run the code generator.

Now I have been digging deeper into gcc-defs.exp, and I may have a clean
solution to this problem and possibly a way to make the entire test harness
perform better job distribution, but I think I should fix the real problems
here first and do this other work as subsequent patches.  If my guess is
correct, with a few changes to gcc-defs.exp, I can use dg-runtest (a la your
patch) AND eliminate all of the parallelization code I've written, while still
eliminating all unnecessary builds and execution of the code generator and
distributing the tests evenly across CPUs.


> > between 4.4 and 6 MiB in size and there are more things that need to be
> > tested (-fsplit-stack to name one) that aren't tested now.  I would also
> > like to add a feature where defining an environment variable generates
> > more comprehensive tests that I wouldn't want to run for every test (as it
> > could take hours with --enable-checking=all,rtl).
> 
> I'd strongly suggest only invoking the basic tests during a regular
> testsuite run and control additional test modes with an environment
> variable as you suggest.

Yes, I agree.  Also, I meant "--enable-checking=yes,rtl" above, not "all"
(yikes!).  I've been building with --enable-checking=yes,rtl because I chose
an ambitious goal for my first gcc project and I am very paranoid about breaking
something.  I want to do everything I can to detect any flaws in my code.  I am
adding more tests that are triggered by GCC_TEST_RUN_EXPENSIVE for now.


> > The most behaviorally similar test currently in the tree is
> > gcc/testsuite/gcc.dg/compat/struct-layout-1.exp, which builds a generator
> > (using remote_exec), runs the generator (remote_exec again) to generate
> > sources for all tests and then builds and runs each test using
> > compat-execute.
> 
> Right, but the test executions proper are done with
> ${tool}_target_compile, which also underly dg-test/dg-runtest.

True, but this test will only use 4-5 CPUs, no matter how many -j you
throw at it.  I hope to fix this problem with a later patch (as mentioned
above).


> I can't help but get the feeling that you're doing very much premature
> optimization here.

lol! Well, I admit that's one of my hang-ups

[Bug testsuite/80759] gcc.target/x86_64/abi/ms-sysv FAILs

2017-05-26 Thread daniel.santos at pobox dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80759

--- Comment #17 from Daniel Santos  ---
(In reply to Rainer Orth from comment #15)
> Created attachment 41404 [details]
> Switch ms-sysv to more regular dg functions

You may be surprised to learn how many faulty assumptions you may have about
how the gcc test harness works and manages parallelization.  I did try your
patch and it didn't work.  I didn't go too deeply into trying to analyze the
failures, but if you call dg-runtest, you are using gcc's hack-daptation of
parallelization.  However, your patch doesn't remove *my* hack-daptation of
parallelization, so we end up with two different parallelization schemes that
step on each other's toes.

Another problem with the already present parallelization is that it bunches
tests into groups of 10 per job which will perform very poorly for these tests
(https://github.com/gcc-mirror/gcc/blob/master/gcc/testsuite/lib/gcc-defs.exp#L170).
I presume this is to reduce disk I/O and it makes sense from that standpoint
(I don't want to know what it would take to get a ramdisk/tmpfs in a
platform-neutral fashion.)

However, I'm learning a little more about how the test harness works, and it
MAY be possible to call gcc_parallel_test_enable 0 at the start of ms-sysv.exp
and be able to use all of the built-in dg-runtest, et al. functions!  If I can
get this to work (and not break something else in the process), then we may be
on to a pathway to clean up ms-sysv.exp a little bit -- that is except for a
few outstanding (possibly surmountable) issues:

1.) Can the default time-out of 5 minutes be changed?  I need 20 minutes for
the slowest processors and a whole HOUR when full tests are enabled.

2.) The test description should include the generator flags and not just the
CFLAGS.  Is that possible from dg-runtest, et al.?  I suppose it's always
possible to add them to CFLAGS with -DGEN_FLAGS="-p0-12" as a hack.

I guess you can see why I said that I was "semi-content" to leave it like it
is. :)  But I'm also glad to better understand how the test harness
parallelization works.  Maybe it's possible to make a small modification to
gcc-defs.exp to get it to divvy out a single test per job instead of 10.

[Bug testsuite/80759] gcc.target/x86_64/abi/ms-sysv FAILs

2017-05-25 Thread daniel.santos at pobox dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80759

--- Comment #16 from Daniel Santos  ---
Sorry for my delayed response.  I'm working on adding extended tests triggered
by an environment variable because I needed to better validate somebody else's
changes to my -mcall-ms2sysv-xlogues feature, and I encountered a flaw in my
test program when the header is generated with a -p range of 7 or higher.

(In reply to r...@cebitec.uni-bielefeld.de from comment #14)
> You need to make certain to have the necessary 32-bit libraries and
> headers.  Apart from that, configure --target=i686-pc-linux-gnu
> --enable-targets=all should be enough, together with CC='gcc -m32'
> CXX='g++ -m32'.  I don't pass --enable-multilib, this happens by default.

I seem to have encountered some type of bug in something here (autoconf
maybe?).  When the build gets to running configure on libgcc, it fails with:

configure: error: cannot compute suffix of object files: cannot compile
See `config.log' for more details.
make[1]: *** [Makefile:13549: configure-target-libgcc] Error 1
make[1]: Leaving directory
'/home/daniel/proj/sys/gcc/builds/testsuite-fixes-i686-pc-linux-gnu'
make: *** [Makefile:898: all] Error 2


The failure is actually when it's calling exec with -- and the option isn't
recognized:


configure:3469:
/home/daniel/proj/sys/gcc/builds/testsuite-fixes-i686-pc-linux-gnu/./gcc/xgcc
-B/home/daniel/proj/sys/gcc/builds/testsuite-fixes-i686-pc-linux-gnu/./gcc/
-B/usr/local/i686-pc-linux-gnu/bin/ -B/usr/local/i686-pc-linux-gnu/lib/
-isystem /usr/local/i686-pc-linux-gnu/include -isystem
/usr/local/i686-pc-linux-gnu/sys-include -o conftest -g -O2   conftest.c  >&5
/home/daniel/proj/sys/gcc/builds/testsuite-fixes-i686-pc-linux-gnu/./gcc/as:
line 106: exec: --: invalid option
exec: usage: exec [-cl] [-a name] [command [arguments ...]] [redirection ...]
configure:3472: $? = 1
configure:3660: checking for suffix of object files
configure:3682:
/home/daniel/proj/sys/gcc/builds/testsuite-fixes-i686-pc-linux-gnu/./gcc/xgcc
-B/home/daniel/proj/sys/gcc/builds/testsuite-fixes-i686-pc-linux-gnu/./gcc/
-B/usr/local/i686-pc-linux-gnu/bin/ -B/usr/local/i686-pc-linux-gnu/lib/
-isystem /usr/local/i686-pc-linux-gnu/include -isystem
/usr/local/i686-pc-linux-gnu/sys-include -c -g -O2  conftest.c >&5
/home/daniel/proj/sys/gcc/builds/testsuite-fixes-i686-pc-linux-gnu/./gcc/as:
line 106: exec: --: invalid option
exec: usage: exec [-cl] [-a name] [command [arguments ...]] [redirection ...]
configure:3686: $? = 1


I don't fancy myself an autotools expert, so I'm not yet certain that it's a
bug.  I'm using Gentoo with

app-shells/bash-4.3_p48-r1
sys-devel/autoconf-2.64 (among others)
sys-devel/binutils-2.26.1

My spidey-sense tells me to try out binutils 2.27 or some such.

I'll get back on the rest of this soon.

[Bug testsuite/80759] gcc.target/x86_64/abi/ms-sysv FAILs

2017-05-21 Thread daniel.santos at pobox dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80759

--- Comment #12 from Daniel Santos  ---
Created attachment 41398
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=41398&action=edit
proposed fix v2 part 2

Formatting, comments and other aesthetic changes.

[Bug testsuite/80759] gcc.target/x86_64/abi/ms-sysv FAILs

2017-05-21 Thread daniel.santos at pobox dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80759

Daniel Santos  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #41396|0                           |1
        is obsolete|                            |

--- Comment #11 from Daniel Santos  ---
Created attachment 41397
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=41397&action=edit
proposed fix v2 part 1

Oops, I forgot to re-add my parallelize.exp script to the commit.

[Bug testsuite/80759] gcc.target/x86_64/abi/ms-sysv FAILs

2017-05-21 Thread daniel.santos at pobox dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80759

Daniel Santos  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #41386|0                           |1
        is obsolete|                            |

--- Comment #10 from Daniel Santos  ---
Created attachment 41396
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=41396&action=edit
proposed fix v2 part 1

So I've moved the body of do_test_(un)aligned into GCC inline asm where I can use
the template to pass the offsets within test_data.  This splits up the assembly
and makes it a bit harder to decipher, but it also cleans up access to
test_data struct members.

I've hard-coded the "-m64" into the CFLAGS for now and this should be fine
since the test only runs on 64-bit x86 and only when the remote is native.  If
I ever figure out where this is usually fed from, I'll swap that out. :)

Anyway, if you can test it again for me and let me know what you think I would
appreciate it.  I've got some other code formatting changes I want to send with
it, but I separated it out from this patch to simplify reading.  I'll post the
second patch anyway though.

Thanks,
Daniel

[Bug testsuite/80759] gcc.target/x86_64/abi/ms-sysv FAILs

2017-05-20 Thread daniel.santos at pobox dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80759

--- Comment #9 from Daniel Santos  ---
Thank you again for the assistance.

(In reply to r...@cebitec.uni-bielefeld.de from comment #8)
> Daniel,
>
> > Would you be so kind as to test this on Solaris for me please?  I don't have
> > access to a Solaris machine and I've never set it up before, so I wouldn't
> > even know where to start to try to build an OpenSolaris VM.
>
> sure, though there's no need at all (except for the .struct part) to do
> the testing on Solaris.  I believe there are ready-made Solaris/x86
> VirtualBox images, though.

I've found a few, so I'm going to try them out when I get some time.  Oracle even
has something on their downloads.  I haven't used Solaris since the early
aughts.

> For the multilib problem, you can easily
> configure gcc for i686-pc-linux-gnu with --enable-targets=all on a
> Linux/x86_64 box (with a few necessary 32-bit development packages
> added), so the default multilib is non-x86_64, while the x86_64 multilib
> is only used with -m64.

Hmm, I seem to be having problems getting this to work.  Would I configure with
--target=i686-pc-linux-gnu --enable-targets=all --enable-multilib?

> However, I still don't understand why you are jumping through all these
> hoops in ms-sysv.exp doing the compilations etc. manually rather than
> just relying on dg-runtest or similar.  This would avoid all this
> multilib trouble nicely, and massively reduce ms-sysv.exp.

Well quite frankly because dg-runtest, et al. don't offer support for tests
that use code generators.  The generated headers using the default options are
between 4.4 and 6 MiB in size and there are more things that need to be tested
(-fsplit-stack to name one) that aren't tested now.  I would also like to add a
feature where defining an environment variable generates more comprehensive
tests that I wouldn't want to run for every test (as it could take hours with
--enable-checking=all,rtl).

The most behaviorally similar test currently in the tree is
gcc/testsuite/gcc.dg/compat/struct-layout-1.exp, which builds a generator
(using remote_exec), runs the generator (remote_exec again) to generate sources
for all tests and then builds and runs each test using compat-execute.
Calls to remote_exec are not automatically parallelized.  I don't fully
understand how the gcc/testsuite/lib/compat.exp library works, but I'm guessing
that calls to compat-execute are parallelized by dejagnu.

The scheme that struct-layout-1 uses builds the generator and creates sources
for all of the tests in each job directory (i.e.,
gcc/testsuite/gcc{,1,2,3,4,5,6,etc.}/gcc.dg-struct-layout-1).  They take up
1.21 MiB per job, so -j48 results in 58 MiB of space usage.  My generator and
generated sources are larger, and currently take about 11.65 MiB per job, so
-j48 would eat 559 MiB of disk space, even though there are only 6 tests at the
moment.  This could be mitigated if there was a way to build and run the
generator only once and have the output go to a directory shared across jobs,
but I'm not yet aware of any such existing mechanism.
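
The disk-usage figures above are simply the per-job footprint multiplied by the number of parallel jobs.  A minimal sketch of that arithmetic in C, where total_mib is a hypothetical helper name used only for illustration and is not part of the testsuite:

```c
/* Total scratch space, in MiB, when every parallel job directory keeps
   its own copy of the generator and the generated sources.
   total_mib is a hypothetical name chosen for this sketch.  */
double
total_mib (double per_job_mib, int jobs)
{
  return per_job_mib * jobs;
}
```

With the numbers quoted above, total_mib (1.21, 48) comes to roughly 58 MiB and total_mib (11.65, 48) to roughly 559 MiB.  If the generator could run once into a shared directory, the cost would stay at the per-job figure regardless of -j.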

This doesn't mean that my approach is the only solution.  In fact, I built this
with Mike Stump's counsel and later discovered that when I ran multiple jobs,
each test was run once per job, so -j8 would run all of the tests 8 times,
rather than split them apart!  That's when I added the parallelization scheme.

So if you have some better ideas on how to accomplish this then please do
present them.  Or maybe I'm misunderstanding something about the way
dg-runtest, gcc_target_compile, etc. work in relation to parallelism?  My
understanding is that if I use them in succession for a single test run (i.e.,
build the generator, run the generator, build & run the test) that they could
end up being run on different jobs and then fail.

> One or two nits about PR management, btw.: it is good practice to take
> the PR if you're working on it.  And just add the URL to the patch
> submission into the URL field.
>
> Thanks.
>
>   Rainer

I very much appreciate hints and guidance about proper PR management, coding
standards, etiquette, procedures, norms, etc.! I'm still pretty new to this
project but I find it really enjoyable.  However, I don't seem to have the
privileges to change those fields.  Do I need to seek advanced privileges from
somebody?

[Bug target/78962] i386: Missed optimization: unaligned SSE movs with force_align_arg_pointer

2017-05-19 Thread daniel.santos at pobox dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78962

Daniel Santos  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

--- Comment #1 from Daniel Santos  ---
Fixed

https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=e67d3d383449dc4680615e44098577f45944e4dc

[Bug testsuite/80759] gcc.target/x86_64/abi/ms-sysv FAILs

2017-05-18 Thread daniel.santos at pobox dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80759

--- Comment #7 from Daniel Santos  ---
(In reply to r...@cebitec.uni-bielefeld.de from comment #3)
> > Well, this was my introduction to DejaGnu and the test harness.  I found 
> > that
> > none of these support doing a build when there is more than one object file 
> > --
> > in my case, I'm linking the output of both ms-sysv.c and do_test.S.  
> > However,
> 
> Huh?  What about dg-additional-sources, which seems to be exactly what
> you need and is even documented in sourcebuild.texi ;-)

I think my limitation is the generated headers.  I was examining how
struct-layout-1.exp works (as it is the most similar test to mine) and it
generates all of its sources for each job, but they are pretty small.  I don't
suppose you know of a way to generate my sources (headers) in one place and
then access them when ms-sysv.exp runs do you?  I'm semi-content with leaving
it as it is (presuming all of the flaws can be fixed), but it would be nice to
be able to use the standard functions.

On the flip side, each of these tests is fairly slow (especially with rtl
checking) and I understand that the way dg splits up tests might not
parallelize well with fewer, slower tests.

Daniel

[Bug testsuite/80759] gcc.target/x86_64/abi/ms-sysv FAILs

2017-05-18 Thread daniel.santos at pobox dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80759

--- Comment #6 from Daniel Santos  ---
Created attachment 41386
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=41386&action=edit
proposed fix

Rainer,

Would you be so kind as to test this on Solaris for me please?  I don't have
access to a Solaris machine and I've never set it up before, so I wouldn't even
know where to start to try to build an OpenSolaris VM.

Also, I don't know if Mike will accept adding the new
gcc/testsuite/lib/parallelize.exp, if not I'll just reintegrate it into the
ms-sysv.exp file.

[Bug testsuite/80759] gcc.target/x86_64/abi/ms-sysv FAILs

2017-05-18 Thread daniel.santos at pobox dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80759

--- Comment #5 from Daniel Santos  ---
OK, I think I've got these fixed but I need to rerun my tests now.  Somebody
else discovered another flaw that caused the test to break with -j1 (when
parallelization wasn't being used).  I hate that I've had to go so far off of
the beaten path with this test, but I don't see another way to do it right now
unless I pre-generated all of the headers (for each parallelization directory),
but they are about 9 MB each so I think that would be a very bad idea -- if you
ran -j48, you could get 1.7 GiB of stupid headers.

(In reply to r...@cebitec.uni-bielefeld.de from comment #4)
> Exactly: I'd avoid that if you can.  It will only complicate things.

The only advantage is that I can get the offsets of the test_data struct
members cleanly using GCC inline asm, but for now I've just replaced them with hard
offsets and put asserts in the main() of the C file to validate that those
offsets haven't changed.
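
A minimal sketch of that kind of check, assuming a hypothetical test_data layout (the real struct and its offsets are not shown in this thread, so the field names and the OFF_* values below are illustrative only):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the test_data struct shared with the
   assembly file; the real layout is not shown in this thread.  */
struct test_data
{
  uint64_t regs[6];   /* 6 * 8 bytes, starts at offset 0 */
  uint64_t retaddr;   /* offset 48 */
  uint64_t flags;     /* offset 56 */
};

/* Offsets as they would be hard-coded in the assembly source
   (assumed values matching the stand-in struct above).  */
#define OFF_REGS     0
#define OFF_RETADDR  48
#define OFF_FLAGS    56

/* Abort if the C layout and the hard-coded offsets have drifted apart;
   return 1 when they agree.  */
int
check_test_data_offsets (void)
{
  assert (offsetof (struct test_data, regs) == OFF_REGS);
  assert (offsetof (struct test_data, retaddr) == OFF_RETADDR);
  assert (offsetof (struct test_data, flags) == OFF_FLAGS);
  return 1;
}
```

Calling check_test_data_offsets () at the top of main() makes the test fail loudly if someone reorders the struct without updating the assembly.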

Daniel

[Bug testsuite/80759] gcc.target/x86_64/abi/ms-sysv FAILs

2017-05-15 Thread daniel.santos at pobox dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80759

--- Comment #2 from Daniel Santos  ---
Actually, I just realized that it won't help to move do_test.S into ms-sysv.c
as inline asm because each test still needs a unique ms-sysv-generated.h header
that's generated by the output of gen.cc.  Although I suppose it's possible to
generate all of the headers in advance (with unique names) and then set the
header name in the options with -DGEN_HEADER_NAME=name.h -- that will have its
own set of issues.
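
A minimal sketch of how such a -DGEN_HEADER_NAME=name.h option could be consumed, assuming the usual two-level stringizing idiom.  The GEN_STR macros and gen_header_name() are hypothetical names, and limits.h is only a stand-in fallback so the sketch compiles on its own:

```c
/* Normally supplied on the command line, e.g. -DGEN_HEADER_NAME=name.h.
   The fallback to limits.h is a stand-in for this sketch only.  */
#ifndef GEN_HEADER_NAME
#define GEN_HEADER_NAME limits.h
#endif

/* Two-level expansion so the macro's value, not its name, is
   turned into a string literal.  */
#define GEN_STR_(x) #x
#define GEN_STR(x) GEN_STR_ (x)

/* Computed include: expands to #include "limits.h" by default.  */
#include GEN_STR (GEN_HEADER_NAME)

/* Report which header was selected, e.g. for the test description.  */
const char *
gen_header_name (void)
{
  return GEN_STR (GEN_HEADER_NAME);
}
```

One of the issues with this approach is that any component of the file name that happens to be a defined macro is expanded before it is stringized, so the generated names would need to avoid such identifiers.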

[Bug testsuite/80759] gcc.target/x86_64/abi/ms-sysv FAILs

2017-05-15 Thread daniel.santos at pobox dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80759

--- Comment #1 from Daniel Santos  ---
(In reply to Rainer Orth from comment #0)
> It seems to me that ms-sysv.exp is seriously misguided in trying to do all
> its compilations manually instead of using
> dg-test/dg-runtest/gcc_target_compile
> which would nicely avoid all those issues.

Well, this was my introduction to DejaGnu and the test harness.  I found that
none of these support doing a build when there is more than one object file --
in my case, I'm linking the output of both ms-sysv.c and do_test.S.  However,
this test started out with multiple .c files and I was able to reduce it down
to one.  I'm going to see if there's a way to cleanly do my assembly inline and
reduce it down to a single translation unit which dg-runtest, et al. will work
with.  Otherwise, I'll have to fix the .struct, CFLAGS and multiple warnings.

If I'm wrong about the single object file thing, please point me in the right
direction.

> The new gcc.target/x86_64/abi/ms-sysv tests FAIL in various e.g. on
> i386-pc-solaris2.*
> and i686-pc-linux-gnu:
> 
> * In those 32-bit-default configurations, the 32-bit multilib is skipped as
>   unsupported as expected (although the UNSUPPORTED entry in gcc.sum occurs
>   e.g. 45 times for -j48 testing instead of only once),

Sadly, I discovered that by not using the standard test functions that I had to
cook up my own parallelism scheme, otherwise all of my tests would run once for
each -j!  I think that this issue is fixable though.

[Bug middle-end/80735] New: IPA: SRA inhibits constant propagation of structs across multiple function calls

2017-05-13 Thread daniel.santos at pobox dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80735

            Bug ID: 80735
           Summary: IPA: SRA inhibits constant propagation of structs
                    across multiple function calls
           Product: gcc
           Version: 7.1.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: middle-end
          Assignee: unassigned at gcc dot gnu.org
          Reporter: daniel.santos at pobox dot com
                CC: mjambor at suse dot cz
  Target Milestone: ---

Created attachment 41350
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=41350&action=edit
test_case.c

I've finally managed a simple test case for this long-standing missed
optimization.  Given the test code built with -O2 -fno-inline:

static const struct foo {
  long a;
  long b;
} f = {8, 8};

static long a (const struct foo *foo) {return foo->b;}
static long b (const struct foo *foo) {return a (foo);}
long c (void) {return b (&f);}

Result:
 :
   0:   48 89 f8                mov    %rdi,%rax
   3:   c3                      retq

0010 :
  10:   bf 08 00 00 00          mov    $0x8,%edi
  15:   eb e9                   jmp    0

0020 :
  20:   eb ee                   jmp    10

Although we got isra for foo::b, I had expected a() to consist of only mov
$0x8, %eax; retq, and b() just be a jump to a().  But when we disable ipa-sra
we get the expected result (-O2 -fno-inline -fno-ipa-sra):

 :
   0:   b8 08 00 00 00          mov    $0x8,%eax
   5:   c3                      retq

0010 :
  10:   eb ee                   jmp    0

0020 :
  20:   eb ee                   jmp    10

If we replace the struct with an array or a pointer to a long then SRA does not
interfere with the constant propagation (-O2 -fno-inline):

static const long f[2] = {8, 8};
static long a (const long foo[]) {return foo[1];}
static long b (const long foo[]) {return a (foo);}
long c (void){return b (f);}

Result:
 :
   0:   b8 08 00 00 00          mov    $0x8,%eax
   5:   c3                      retq

0010 :
  10:   eb ee                   jmp    0

0020 :
  20:   eb ee                   jmp    10

I'm still very new to this part of GCC, but I'm guessing that when we do the
SRA, we toss out the original aggregate.  If so, then we aren't reserving the
possibility that all of the function's callers could get cloned with a constant
for the aggregate, which would (probably always?) be better than just plucking
the scalar out of the aggregate.  I'll be digesting tree-sra.c and the cgraph
to try and figure this one out, but if anybody understands this better then I
would appreciate some hints.  :)

[Bug testsuite/79867] [cygwin] LD_LIBRARY_PATH ignored, contaminating (nearly?) all test results

2017-03-07 Thread daniel.santos at pobox dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79867

--- Comment #1 from Daniel Santos  ---
Minor correction: LD_LIBRARY_PATH is used to resolve lib names when dlopen() is
called, but not for load-time linking.

There are also a few other complications on Cygwin.  DLLs (including libgcc)
are stored in /usr/bin and not /usr/lib.  Windows searches the PATH to find
dlls, and you can't remove /usr/bin from the PATH and have anything work.  So
any type of real solution will likely require Cygwin to change the install
location of gcc's libs, otherwise there is no assurance that we've loaded the
correct library (should something be amiss in the build tree).

Finally, there is no ldconfig on Cygwin at this time.

[Bug testsuite/79867] New: [cygwin] LD_LIBRARY_PATH ignored, contaminating (nearly?) all test results

2017-03-04 Thread daniel.santos at pobox dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79867

            Bug ID: 79867
           Summary: [cygwin] LD_LIBRARY_PATH ignored, contaminating
                    (nearly?) all test results
           Product: gcc
           Version: 7.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: testsuite
          Assignee: unassigned at gcc dot gnu.org
          Reporter: daniel.santos at pobox dot com
  Target Milestone: ---

Running the testsuite resulted in 15,308 instances of the error message
"cyggfortran-4.dll: cannot open shared object file: No such file or directory"
(similar for libatomic).  I could see in the log file that LD_LIBRARY_PATH was
set correctly and that the file did exist in that path.  After much
experimentation, wailing, gnashing of teeth and a few emails to the Cygwin list
(https://cygwin.com/ml/cygwin/2017-03/msg00059.html) I learned that
LD_LIBRARY_PATH is *not* used in Cygwin.  (I presume because there's no
practical way to get the Windows kernel to use it when loading libraries.)  I
couldn't even debug it with gdb because the link failure happened before the
image's entry point was even called.

This means that when we run the testsuite on Cygwin, we're expecting to test
using what we have built, but are actually linking in libgcc, libgfortran,
libstdc++, etc. from the environment and NOT from the build tree.  Any
Cygwin-specific regression in those libraries that arises between a.) the
release that is installed and b.) the version being tested will not be exposed
by any normal execution of the testsuite.

I'm still fairly green with DejaGnu and Tcl/expect, but perhaps the solution is
to

1. Add a platform-abstracted mechanism to DejaGnu for altering the executable
and library search paths, and
2. Deprecate and replace all usage (reading or modifying) of the
LD_LIBRARY_PATH, SHLIB_PATH and PATH environment variables in tests.

Then on Cygwin, DejaGnu can internally track the executable and library search
paths separately as they are modified and set the PATH environment variable
accordingly.

It would also be nice to have an interim hack for running tests on Cygwin.

[Bug bootstrap/79771] [7 Regression] in-tree zlib breaks build

2017-03-03 Thread daniel.santos at pobox dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79771

--- Comment #3 from Daniel Santos  ---
I'm guessing that either they didn't test on Cygwin or they tested on a
pre-release version or I have some local/environmental issue, although my
environment was just recently generated.

Upstream is at 1.2.11 and the latest zlib available in Cygwin is 1.2.8-3, which
does not have this patch, but I am not an expert in Cygwin.  _wopen is an
ms-proprietary function and I presumed that it not being exported in Cygwin was
intentional, although I could be wrong.  It's probably a good idea to send it
upstream.

[Bug regression/79771] New: in-tree zlib breaks build

2017-02-28 Thread daniel.santos at pobox dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79771

            Bug ID: 79771
           Summary: in-tree zlib breaks build
           Product: gcc
           Version: 7.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: regression
          Assignee: unassigned at gcc dot gnu.org
          Reporter: daniel.santos at pobox dot com
  Target Milestone: ---

Built from the head on Windows 7 using cygwin64.

Configured with:
/home/daniel/proj/sys/gcc/work0/configure --host=x86_64-pc-cygwin
--build=x86_64-pc-cygwin --target=x86_64-pc-cygwin
--prefix=/home/daniel/local/gcc-head-test-unpatched-cygwin64
--enable-stage1-checking=yes,rtl --enable-libssp --enable-libada --enable-lto
--enable-gold=yes --enable-ld=yes --enable-bootstrap



make[3]: Entering directory
'/home/daniel/proj/sys/gcc/work0/build/head-test-unpatched-cygwin64/gcc'
/home/daniel/proj/sys/gcc/work0/build/head-test-unpatched-cygwin64/./prev-gcc/xg++
-B/home/daniel/proj/sys/gcc/work0/build/head-test-unpatched-cygwin64/./prev-gcc/
-B/home/daniel/local/gcc-head-test-unpatched-cygwin64/x86_64-pc-cygwin/bin/
-nostdinc++
-B/home/daniel/proj/sys/gcc/work0/build/head-test-unpatched-cygwin64/prev-x86_64-pc-cygwin/libstdc++-v3/src/.libs
-B/home/daniel/proj/sys/gcc/work0/build/head-test-unpatched-cygwin64/prev-x86_64-pc-cygwin/libstdc++-v3/libsupc++/.libs

-I/home/daniel/proj/sys/gcc/work0/build/head-test-unpatched-cygwin64/prev-x86_64-pc-cygwin/libstdc++-v3/include/x86_64-pc-cygwin

-I/home/daniel/proj/sys/gcc/work0/build/head-test-unpatched-cygwin64/prev-x86_64-pc-cygwin/libstdc++-v3/include
 -I/home/daniel/proj/sys/gcc/work0/libstdc++-v3/libsupc++
-L/home/daniel/proj/sys/gcc/work0/build/head-test-unpatched-cygwin64/prev-x86_64-pc-cygwin/libstdc++-v3/src/.libs
-L/home/daniel/proj/sys/gcc/work0/build/head-test-unpatched-cygwin64/prev-x86_64-pc-cygwin/libstdc++-v3/libsupc++/.libs
-no-pie   -g -O2 -gtoggle -DIN_GCC -fno-exceptions -fno-rtti
-fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings
-Wcast-qual -Wmissing-format-attribute -Woverloaded-virtual -pedantic
-Wno-long-long -Wno-variadic-macros -Wno-overlength-strings -Werror -fno-common
 -DHAVE_CONFIG_H -static-libstdc++ -static-libgcc -Wl,--stack,12582912 -o
f951.exe \
fortran/arith.o fortran/array.o fortran/bbt.o fortran/check.o
fortran/class.o fortran/constructor.o fortran/cpp.o fortran/data.o
fortran/decl.o fortran/dump-parse-tree.o fortran/error.o fortran/expr.o
fortran/interface.o fortran/intrinsic.o fortran/io.o fortran/iresolve.o
fortran/match.o fortran/matchexp.o fortran/misc.o fortran/module.o
fortran/openmp.o fortran/options.o fortran/parse.o fortran/primary.o
fortran/resolve.o fortran/scanner.o fortran/simplify.o fortran/st.o
fortran/symbol.o fortran/target-memory.o  fortran/convert.o
fortran/dependency.o fortran/f95-lang.o fortran/trans.o fortran/trans-array.o
fortran/trans-common.o fortran/trans-const.o fortran/trans-decl.o
fortran/trans-expr.o fortran/trans-intrinsic.o fortran/trans-io.o
fortran/trans-openmp.o fortran/trans-stmt.o fortran/trans-types.o
fortran/frontend-passes.o libbackend.a main.o libcommon-target.a libcommon.a
../libcpp/libcpp.a ../libdecnumber/libdecnumber.a -L./../zlib -lz libcommon.a
../libcpp/libcpp.a -lintl  ../libbacktrace/.libs/libbacktrace.a
../libiberty/libiberty.a ../libdecnumber/libdecnumber.a  attribs.o \
 -lmpc -lmpfr -lgmp   -L./../zlib -lz
./../zlib/libz.a(libz_a-gzlib.o):gzlib.c:(.text+0x646): undefined reference to
`_wopen'
./../zlib/libz.a(libz_a-gzlib.o):gzlib.c:(.text+0x646): relocation truncated to
fit: R_X86_64_PC32 against undefined symbol `_wopen'
collect2: error: ld returned 1 exit status
make[3]: *** [/home/daniel/proj/sys/gcc/work0/gcc/fortran/Make-lang.in:97:
f951.exe] Error 1
make[3]: Leaving directory
'/home/daniel/proj/sys/gcc/work0/build/head-test-unpatched-cygwin64/gcc'
make[2]: *** [Makefile:4585: all-stage2-gcc] Error 2
make[2]: Leaving directory
'/home/daniel/proj/sys/gcc/work0/build/head-test-unpatched-cygwin64'
make[1]: *** [Makefile:21590: stage2-bubble] Error 2
make[1]: Leaving directory
'/home/daniel/proj/sys/gcc/work0/build/head-test-unpatched-cygwin64'
make: *** [Makefile:21802: bootstrap] Error 2

This appears to be due to 1e5dce21ca754450ca2bb1481aed0f5ee62560b8 (in the git
mirror) committed on 2017-01-13.  I think the problem is caused by this chunk:

@@ -239,7 +239,7 @@ local gzFile gz_open(path, fd, mode)

 /* open the file with the appropriate flags (or just use fd) */
 state->fd = fd > -1 ? fd : (
-#ifdef _WIN32
+#ifdef WIDECHAR
 fd == -2 ? _wopen(path, oflag, 0666) :
 #endif
 open((const char *)path, oflag, 0666));

WIDECHAR is defined in gzguts.h as follows:

@@ -35,6 +39,10 @@
 #  include 
 #endif

+#if defined(_WIN32) || defined(__CYGWIN__)
+#  define WIDECHAR
+#endif
+
 #ifdef WINAPI_FAMILY
 #  define open _open
 #  define read _read


I don't know Cygwin's architecture well, but I'm guessing that we don'

[Bug rtl-optimization/78962] New: i386: Missed optimization: unaligned SSE movs with force_align_arg_pointer

2016-12-31 Thread daniel.santos at pobox dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78962

            Bug ID: 78962
           Summary: i386: Missed optimization: unaligned SSE movs with
                    force_align_arg_pointer
           Product: gcc
           Version: 5.3.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: rtl-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: daniel.santos at pobox dot com
  Target Milestone: ---

void b (void);  /* Normal System V function.  */
__attribute__((ms_abi, force_align_arg_pointer)) void a (void)
{
b ();
}

When using __attribute__((force_align_arg_pointer)) on 64-bit code, the stack
pointer is re-aligned, but the SSE movs emitted in pro/epilogues to save/restore
registers clobbered when a Microsoft ABI function calls a System V function are
still unaligned.  I posted a patchset for this problem at
https://gcc.gnu.org/ml/gcc-patches/2016-12/msg01924.html.  Uros advised me to
resubmit when the next stage 1 rolls around, so I'm filing a bug for now.

[Bug c/61939] warn when attribute((aligned(x))) is ignored

2016-10-19 Thread daniel.santos at pobox dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61939

--- Comment #2 from Daniel Santos  ---
(In reply to Vedran Miletic from comment #1)
> #include <vector>
> #include <numeric>
> float f(std::vector<float>& A, std::vector<float>& B)
> {
>   __builtin_assume_aligned(A.data(), 64);
>   __builtin_assume_aligned(B.data(), 64);
>   return std::inner_product(A.begin(), A.end(), B.begin(), 0.f);
> }

You are doing it wrong. __builtin_assume_aligned() returns void* and you must
use its return value for it to be effective. So your code should be something
like this:

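float f(std::vector<float>& A, std::vector<float>& B)
{
  // The builtin returns void*, so C++ needs an explicit cast.
  float *a_data = static_cast<float *>(__builtin_assume_aligned(A.data(), 64));
  float *b_data = static_cast<float *>(__builtin_assume_aligned(B.data(), 64));
  // Range over A is [a_data, a_data + A.size()); b_data is the start of B.
  return std::inner_product(a_data, a_data + A.size(), b_data, 0.f);
}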

Of course, this assumes that the buffer that your vector<> implementation
supplies is 64 byte aligned.
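
The same point as a small, self-contained C sketch (the function name
dot_aligned64 is made up for this example).  It assumes a GCC-compatible
compiler and that the caller really does pass 64-byte-aligned buffers; the
alignment hint is attached to the pointers the builtin returns, so the loop
reads through those rather than through the original arguments:

```c
#include <stddef.h>

/* Dot product of two float arrays that the caller promises are 64-byte
   aligned.  The hint only takes effect through the returned pointers.  */
float
dot_aligned64 (const float *a, const float *b, size_t n)
{
  const float *pa = __builtin_assume_aligned (a, 64);
  const float *pb = __builtin_assume_aligned (b, 64);
  float sum = 0.0f;

  for (size_t i = 0; i < n; i++)
    sum += pa[i] * pb[i];
  return sum;
}
```

Buffers obtained from something like C11 aligned_alloc(64, size) satisfy the
promise; passing a pointer that is not actually 64-byte aligned is undefined
behaviour.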

[Bug target/54829] bad optimization: sub followed by cmp w/ zero (x86 & ARM)

2016-08-11 Thread daniel.santos at pobox dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54829

Daniel Santos  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |INVALID

--- Comment #10 from Daniel Santos  ---
(In reply to Richard Earnshaw from comment #8)
> Unfortunately, computers don't do infinite precision arithmetic by default.
> That would perform a different comparison in that it checks that r0 > r1,
> not whether r0 - r1 > 0.  The difference, for signed comparisons, is when
> overflow occurs.

Looks like I've forgotten to close this bug after retesting. Thank you again
Richard for clarifying this. When modifying the compare function as described
above, optimization is perfect on all targets I've tested (x86, ARM and MIPS).
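
To make the difference concrete, here is a small C sketch of the two styles
of compare function (names are made up for this example, and it uses int
rather than the long of the original test case).  The subtraction is done in
unsigned arithmetic so that the wraparound is well defined rather than
undefined signed overflow:

```c
#include <limits.h>

/* Subtraction-based compare: only correct when a - b cannot overflow.
   Converting an out-of-range unsigned result back to int is
   implementation-defined (GCC wraps modulo 2^N).  */
int
cmp_by_subtraction (int a, int b)
{
  return (int) ((unsigned) a - (unsigned) b);
}

/* Comparison-based three-way compare: correct for all inputs.  */
int
cmp_by_comparison (int a, int b)
{
  return (a > b) - (a < b);
}
```

For a = INT_MIN and b = 1, cmp_by_comparison reports "less than" while
cmp_by_subtraction wraps around to INT_MAX and reports "greater than", which
is the case Richard describes.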

[Bug c/68507] New: attribute ms_abi (on Linux) bloats by pushing/popping xmm6-15 needlessly

2015-11-23 Thread daniel.santos at pobox dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68507

            Bug ID: 68507
           Summary: attribute ms_abi (on Linux) bloats by pushing/popping
                    xmm6-15 needlessly
           Product: gcc
           Version: 4.9.3
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: daniel.santos at pobox dot com
  Target Milestone: ---

Created attachment 36814
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=36814=edit
simple test case

When a function declared ms_abi calls a function that is not (and is thus
implicitly sysv_abi on Linux), the ms_abi function saves & restores xmm6-16
needlessly, causing quite a lot of bloat in Wine.

I'm still on 4.9.3, so I'm building (Gentoo) 5.2.0 now to see if the bug is
still there. When Debian released Wine 1.8-rc1 today, all of the code was
bloated like this too.

[Bug target/68507] attribute ms_abi (on Linux) bloats by pushing/popping xmm6-15 needlessly

2015-11-23 Thread daniel.santos at pobox dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68507

--- Comment #3 from Daniel Santos  ---
(In reply to Andrew Pinski from comment #2)
> I think there is an ABI difference with respect of xmm6-16 between sysv ABI
> and windows ABIs.  Can you provide why you think this is not a bug?

Ehem, uh no. I was recently informed that sysv_abi treats xmm registers as
temp, but I need to read up on this topic. If this is really true then this bug
report would be invalid (but still very sad).

[Bug c/68507] attribute ms_abi (on Linux) bloats by pushing/popping xmm6-15 needlessly

2015-11-23 Thread daniel.santos at pobox dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68507

--- Comment #1 from Daniel Santos  ---
Correction: xmm6-15, I can't type today. And here is the output on gcc 4.9.3:

$ objdump -dSr test_case.o

test_case.o: file format elf64-x86-64


Disassembly of section .text:

 :
   0:   55                      push   %rbp
   1:   48 89 e5                mov    %rsp,%rbp
   4:   57                      push   %rdi
   5:   56                      push   %rsi
   6:   48 81 ec a0 00 00 00    sub    $0xa0,%rsp
   d:   0f 29 34 24             movaps %xmm6,(%rsp)
  11:   0f 29 7c 24 10          movaps %xmm7,0x10(%rsp)
  16:   44 0f 29 44 24 20       movaps %xmm8,0x20(%rsp)
  1c:   44 0f 29 4d 80          movaps %xmm9,-0x80(%rbp)
  21:   44 0f 29 55 90          movaps %xmm10,-0x70(%rbp)
  26:   44 0f 29 5d a0          movaps %xmm11,-0x60(%rbp)
  2b:   44 0f 29 65 b0          movaps %xmm12,-0x50(%rbp)
  30:   44 0f 29 6d c0          movaps %xmm13,-0x40(%rbp)
  35:   44 0f 29 75 d0          movaps %xmm14,-0x30(%rbp)
  3a:   44 0f 29 7d e0          movaps %xmm15,-0x20(%rbp)
  3f:   e8 00 00 00 00          callq  44
                        40: R_X86_64_PC32   wool_sweaters-0x4
  44:   0f 28 34 24             movaps (%rsp),%xmm6
  48:   0f 28 7c 24 10          movaps 0x10(%rsp),%xmm7
  4d:   44 0f 28 44 24 20       movaps 0x20(%rsp),%xmm8
  53:   44 0f 28 4d 80          movaps -0x80(%rbp),%xmm9
  58:   44 0f 28 55 90          movaps -0x70(%rbp),%xmm10
  5d:   44 0f 28 5d a0          movaps -0x60(%rbp),%xmm11
  62:   44 0f 28 65 b0          movaps -0x50(%rbp),%xmm12
  67:   44 0f 28 6d c0          movaps -0x40(%rbp),%xmm13
  6c:   44 0f 28 75 d0          movaps -0x30(%rbp),%xmm14
  71:   44 0f 28 7d e0          movaps -0x20(%rbp),%xmm15
  76:   48 81 c4 a0 00 00 00    add    $0xa0,%rsp
  7d:   5e                      pop    %rsi
  7e:   5f                      pop    %rdi
  7f:   5d                      pop    %rbp
  80:   c3                      retq

[Bug target/68507] attribute ms_abi (on Linux) bloats by pushing/popping xmm6-15 needlessly

2015-11-23 Thread daniel.santos at pobox dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68507

--- Comment #4 from Daniel Santos  ---
According to § 3.2.1 "Registers and the Stack Frame" of the System V
Application Binary Interface for AMD64:

Registers %rbp, %rbx and %r12 through %r15 “belong” to the calling function and
the called function is required to preserve their values. In other words, a
called function must preserve these registers’ values for its caller. Remaining
registers “belong” to the called function. If a calling function wants to
preserve such a register value across a function call, it must save the value
in its local stack frame.

And for Microsoft's "x64" calling convention
(https://msdn.microsoft.com/en-us/library/9z1stfyw.aspx), the xmm6-xmm15
registers are considered non-volatile, so it would appear that this is the
correct behaviour, barring some extensive whole-program analysis that can
guarantee that the xmm registers are not destroyed.
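
A minimal C sketch of the situation, with made-up function names (scale,
scale_from_ms) and macros (MS_ABI, SYSV_ABI).  It assumes GCC or Clang
targeting x86-64; on other targets the attributes are compiled out so the
example still builds:

```c
#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
#  define MS_ABI   __attribute__((ms_abi))
#  define SYSV_ABI __attribute__((sysv_abi))
#else
#  define MS_ABI
#  define SYSV_ABI
#endif

/* System V callee: free to clobber every xmm register.  */
SYSV_ABI double
scale (double x)
{
  return x * 2.0;
}

/* Microsoft x64 callee: must hand xmm6-xmm15 back unchanged, so unless the
   compiler can prove the call leaves them alone it has to save and restore
   them around the call.  */
MS_ABI double
scale_from_ms (double x)
{
  return scale (x) + 1.0;
}
```

If scale were only declared here and defined in another translation unit,
the compiler would have no way to see which registers it clobbers, which is
the case the disassembly in comment #1 shows.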

[Bug target/68507] attribute ms_abi (on Linux) bloats by pushing/popping xmm6-15 needlessly

2015-11-23 Thread daniel.santos at pobox dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68507

Daniel Santos  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |INVALID

--- Comment #5 from Daniel Santos  ---
closing

[Bug target/54829] bad optimization: sub followed by cmp w/ zero (x86 & ARM)

2015-02-14 Thread daniel.santos at pobox dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54829

--- Comment #9 from Daniel Santos daniel.santos at pobox dot com ---
I apologize for my late response.

(In reply to Richard Earnshaw from comment #8)
> Unfortunately, computers don't do infinite precision arithmetic by default.
> That would perform a different comparison in that it checks that r0 > r1,
> not whether r0 - r1 > 0.  The difference, for signed comparisons, is when
> overflow occurs.
>
> Consider the case where (in your original code) a has the value INT_MIN (ie
> -2147483648) and b has the value 1.
>
> Now clearly a < b and by the normal rules of arithmetic (infinite precision)
> we would expect a - b to be less than zero.
>
> However, INT_MIN - 1 cannot be represented in a 32-bit long value and
> becomes INT_MAX due to overflow; the result is that for these values
> a - b > 0!
>
> On ARM and x86, the flag setting that results from a subtract operation is,
> in effect a comparison of the original operands, rather than a comparison of
> the result; that is on ARM
>
>   subs rd, rn, rm
>
> is equivalent to
>
>   cmp rn, rm
>
> except that the register rd is not written by the comparison.
>
> Power PC is different: its subtract and compare instruction really does use
> the result of the subtraction to form the comparison.

Thank you very much for your work on this. In re-examining, I'm suspecting that
this may be an invalid bug. :( I have modified the test program slightly:

extern void print_gt(void);
extern void print_lt(void);
extern void print_eq(void);

void cmp_and_branch(long a, long b)
{
    /* Three-way compare without subtraction, so it cannot overflow.  */
    long diff = a > b ? 1 : (a < b ? -1 : 0);

    if (diff > 0) {
        print_gt();
    } else if (diff < 0) {
        print_lt();
    } else {
        print_eq();
    }
}

I thought that I had originally tried this and gotten worse results (although
the diff was being done via a complicated -findirect-inlining situation), but
this version of the program leaves a finite number of options. When compiled on
x86_64 and ARM, both are flawless:

ARM
cmp_and_branch:
        @ args = 0, pretend = 0, frame = 0
        @ frame_needed = 0, uses_anonymous_args = 0
        @ link register save eliminated.
        cmp     r0, r1
        bgt     .L2
        blt     .L5
        b       print_eq
.L2:
        b       print_gt
.L5:
        b       print_lt
        .size   cmp_and_branch, .-cmp_and_branch
        .ident  "GCC: (Gentoo 4.8.3 p1.1, pie-0.5.9) 4.8.3"
        .section        .note.GNU-stack,"",%progbits


x86_64
cmp_and_branch:
.LFB0:
        .cfi_startproc
        cmpq    %rsi, %rdi
        jg      .L2
        jl      .L5
        jmp     print_eq
        .p2align 4,,10
        .p2align 3
.L2:
        jmp     print_gt
        .p2align 4,,10
        .p2align 3
.L5:
        jmp     print_lt
        .cfi_endproc

I don't want to close this bug just yet; I want to retest in my other code. This
will certainly help clean up some of my code!!
