[PATCH, i386]: Remove mode of address_operand predicate from prefetch patterns
Hello!

The mode of address_operand predicate is ignored in ix86_legitimate_address_p.

2012-08-13  Uros Bizjak  ubiz...@gmail.com

    * config/i386/i386.md (prefetch): Do not assert mode of operand 0.
    (*prefetch_sse_<mode>): Do not set mode of address_operand predicate.
    Rename to ...
    (*prefetch_sse): ... this.
    (*prefetch_3dnow_<mode>): Do not set mode of address_operand predicate.
    Rename to ...
    (*prefetch_3dnow): ... this.

Bootstrapped and regression tested on x86_64-pc-linux-gnu {,-m32} and committed to mainline SVN.

Uros.

Index: i386.md
===
--- i386.md (revision 191240)
+++ i386.md (working copy)
@@ -17800,12 +17800,10 @@
   int locality = INTVAL (operands[2]);

   gcc_assert (rw == 0 || rw == 1);
-  gcc_assert (locality >= 0 && locality <= 3);
-  gcc_assert (GET_MODE (operands[0]) == Pmode
-              || GET_MODE (operands[0]) == VOIDmode);
+  gcc_assert (IN_RANGE (locality, 0, 3));
+
   if (TARGET_PRFCHW && rw)
     operands[2] = GEN_INT (3);
-
   /* Use 3dNOW prefetch in case we are asking for write prefetch not
      supported by SSE counterpart or the SSE prefetch is not available
      (K6 machines).  Otherwise use SSE prefetch as it allows specifying
@@ -17816,8 +17814,8 @@
     operands[1] = const0_rtx;
 })

-(define_insn "*prefetch_sse_<mode>"
-  [(prefetch (match_operand:P 0 "address_operand" "p")
+(define_insn "*prefetch_sse"
+  [(prefetch (match_operand 0 "address_operand" "p")
              (const_int 0)
              (match_operand:SI 1 "const_int_operand"))]
   "TARGET_PREFETCH_SSE"
@@ -17827,7 +17825,7 @@
   };

   int locality = INTVAL (operands[1]);
-  gcc_assert (locality >= 0 && locality <= 3);
+  gcc_assert (IN_RANGE (locality, 0, 3));

   return patterns[locality];
 }
@@ -17837,8 +17835,8 @@
         (symbol_ref "memory_address_length (operands[0])"))
    (set_attr "memory" "none")])

-(define_insn "*prefetch_3dnow_<mode>"
-  [(prefetch (match_operand:P 0 "address_operand" "p")
+(define_insn "*prefetch_3dnow"
+  [(prefetch (match_operand 0 "address_operand" "p")
              (match_operand:SI 1 "const_int_operand" "n")
              (const_int 3))]
   "TARGET_3DNOW || TARGET_PRFCHW"
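For readers who have not met the IN_RANGE idiom that replaces the two-sided comparison in the patch above, here is a minimal standalone sketch. The helper names in_range and check_locality and the use of 64-bit integer types are assumptions made for illustration; GCC's own macro lives in its system headers and may differ in detail.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for GCC's IN_RANGE macro: true when
   LOWER <= VALUE <= UPPER (LOWER <= UPPER is assumed).  Subtracting
   LOWER in unsigned arithmetic folds both comparisons into one,
   because any VALUE below LOWER wraps around to a very large
   unsigned number.  */
static int
in_range (int64_t value, int64_t lower, int64_t upper)
{
  return (uint64_t) value - (uint64_t) lower
         <= (uint64_t) upper - (uint64_t) lower;
}

/* Mirrors the locality check in the prefetch expander.  */
static void
check_locality (int locality)
{
  assert (in_range (locality, 0, 3));
}
```

Calling check_locality with any of 0..3 passes silently; a value such as -1 or 4 trips the assertion, which is the behaviour the original pair of comparisons was meant to have.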
Re: [PATCH] Fix up _mm_f{,n}m{add,sub}_s{s,d} (PR target/54564)
On Thu, Sep 13, 2012 at 5:52 PM, Jakub Jelinek ja...@redhat.com wrote:

The fma-*.c testcases show that these intrinsics probably mean to preserve the high elements (other than the lowest) of the first argument of the fmaintrin.h *_s{s,d} intrinsics in the destination (the HW insns preserve the destination register there, but that varies: for 132 and 213 it is the first one (but the negation performed for _mm_fnm*_s[sd] breaks it anyway), for 231 it is the last one). What the expander did was to put an uninitialized pseudo there, so we ended up with pretty random content. Before H.J.'s http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=190492 it happened to work by accident, but when things changed slightly and reload chose a different alternative, this broke.

The following patch fixes it by tweaking the header so that the first argument is not negated (we negate the second one instead), as we don't want to negate the high elements if e.g. for whatever reason the combiner doesn't match it. It fixes the expander to use a dup of the X operand as the high element provider for the pattern, removes the 231 alternatives (because those provide different destination high elements) and removes the commutative marker (again, that would mean different high elements).

Can we introduce an additional *fmai_fmadd_<mode>_1 pattern (and others) that would cover the missing 231 alternative?

2012-09-13  Jakub Jelinek  ja...@redhat.com

    PR target/54564
    * config/i386/sse.md (fmai_vmfmadd_<mode>): Use (match_dup 1)
    instead of (match_dup 0) as second argument to vec_merge.
    (*fmai_fmadd_<mode>, *fmai_fmsub_<mode>): Likewise. Remove third
    alternative.
    (*fmai_fnmadd_<mode>, *fmai_fnmsub_<mode>): Likewise. Negate
    operand 2 instead of operand 1, but put it as first argument
    of fma.
    * config/i386/fmaintrin.h (_mm_fnmadd_sd, _mm_fnmadd_ss,
    _mm_fnmsub_sd, _mm_fnmsub_ss): Negate the second argument instead
    of the first.

OK, but the header change should also be reviewed by H.J.

Thanks,
Uros.
Re: [RFC PATCH, go]: Port to ALPHA arch - sysinfo.go fixup
On Wed, Jul 6, 2011 at 7:34 PM, Ian Lance Taylor i...@google.com wrote:

This seems like a reasonable patch to me, but technically speaking it is incomplete. Go should have IEEE floating point behaviour by default. I believe Java is the same. Ideally there would be a target-independent way for a frontend to request this mode by default.

It's a little bit odd because as far as I know every other backend does default to proper IEEE arithmetic, and only deviates when using -ffast-math or equivalent.

sh*-*-* also needs -mieee to handle NaN and Inf; spu-*-* simply doesn't support them.

Uros.
Re: Remove unused t-* fragments
On Wed, Jul 6, 2011 at 10:14 PM, Joseph S. Myers jos...@codesourcery.com wrote:

This patch removes three unused t-* makefile fragments. (t-pa is unused because no target uses it explicitly and all PA targets define nonempty tmake_file; t-$cpu_type is only used implicitly if tmake_file is empty after config.gcc.)

Bootstrapped with no regressions on x86_64-unknown-linux-gnu. OK to commit?

2011-07-06  Joseph Myers  jos...@codesourcery.com

    * config/i386/t-crtpic, config/i386/t-svr3dbx, config/pa/t-pa:
    Remove.

OK for x86.

Thanks,
Uros.
[go]: Port to ALPHA arch - epoll problems
On Tue, Jul 5, 2011 at 10:12 PM, Ian Lance Taylor i...@google.com wrote:

What remains is a couple of unrelated failures in the testsuite:

Epoll unexpected fd=0
pollServer: unexpected wakeup for fd=0 mode=w
panic: test timed out

../../../gcc-svn/trunk/libgo/testsuite/gotest: line 388: 7123 Aborted ./a.out -test.short -test.timeout=$timeout $@
FAIL: http
gmake[2]: *** [http/check] Error 1

2011/07/05 18:43:28 Test RPC server listening on 127.0.0.1:50334
2011/07/05 18:43:28 Test HTTP RPC server listening on 127.0.0.1:49010
2011/07/05 18:43:28 rpc.Serve: accept:accept tcp 127.0.0.1:50334: Resource temporarily unavailable
FAIL: rpc
gmake[2]: *** [rpc/check] Error 1

2011/07/05 18:44:22 Test WebSocket server listening on 127.0.0.1:40893
Epoll unexpected fd=0
pollServer: unexpected wakeup for fd=0 mode=w
panic: test timed out

../../../gcc-svn/trunk/libgo/testsuite/gotest: line 388: 12993 Aborted ./a.out -test.short -test.timeout=$timeout $@
FAIL: websocket
gmake[2]: *** [websocket/check] Error 1

../../../gcc-svn/trunk/libgo/testsuite/gotest: line 388: 13945 Segmentation fault ./a.out -test.short -test.timeout=$timeout $@
FAIL: compress/flate
gmake[2]: *** [compress/flate/check] Error 1

Any ideas how to attack these?

None of these look familiar to me. An "Epoll unexpected fd" error means that epoll returned information about a file descriptor which the program didn't ask about. Not sure why that would happen. Particularly for fd 0, since epoll is only used for network connections, which fd 0 presumably is not.

The way to look into these is to cd to TARGET/libgo and run make GOTESTFLAGS=--keep http/check (or whatever/check). That will leave a directory gotest in your libgo directory. The executable a.out in that directory is the test case. You can debug the test case using gdb in more or less the usual way. It's a bit painful to set breakpoints by function name, but setting breakpoints by file:line works fine. Printing variables works as well as it ever does, but the variables are printed in C form rather than Go form.

It turned out that the EpollEvent definition in libgo/syscalls/epoll/socket_epoll.go is non-portable (if not outright dangerous...). The definition does have a FIXME comment, but does not take into account the effects of __attribute__((__packed__)) from system headers. Unlike the alpha header, the x86 sys/epoll.h header adds __attribute__((__packed__)) to the struct epoll_event definition.

To illustrate the problem, please run the following test:

--cut here--
#include <stdint.h>
#include <stdio.h>

typedef union epoll_data
{
  void *ptr;
  int fd;
  uint32_t u32;
  uint64_t u64;
} epoll_data_t;

struct epoll_event
{
  uint32_t events;
  epoll_data_t data;
};

struct packed_epoll_event
{
  uint32_t events;
  epoll_data_t data;
} __attribute__ ((__packed__));

struct fake_epoll_event
{
  uint32_t events;
  int32_t fd;
  int32_t pad;
};

int
main ()
{
  struct epoll_event *ep;
  struct packed_epoll_event *pep;

  struct fake_epoll_event fep;

  fep.events = 0xfe;
  fep.fd = 9;
  fep.pad = 0;

  ep = (struct epoll_event *) &fep;
  pep = (struct packed_epoll_event *) &fep;

  printf ("%#x %i\n", ep->events, ep->data.fd);
  printf ("%#x %i\n", pep->events, pep->data.fd);
  return 0;
}
--cut here--

./a.out
0xfe 0
0xfe 9

So, the first line simulates alpha, the second simulates x86_64. 32-bit targets are OK in both cases:

./a.out
0xfe 9
0xfe 9

By changing the definition of EpollEvent to the form that suits alpha:

type EpollEvent struct {
  Events uint32;
  Pad int32;
  Fd int32;
};

both timeouts got fixed and the correct FD was passed to and from the syscall.

Uros.
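The layout difference behind the two output lines above can also be checked without pointer casts, by asking the compiler for the member offsets directly. This is only a sketch: the struct name natural_epoll_event and the helper functions are made up for illustration, and the packed attribute assumes a GCC-compatible compiler.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef union epoll_data
{
  void *ptr;
  int fd;
  uint32_t u32;
  uint64_t u64;
} epoll_data_t;

/* Layout without the packed attribute (what the mail reports for alpha).  */
struct natural_epoll_event
{
  uint32_t events;
  epoll_data_t data;
};

/* Layout with the packed attribute (what the mail reports for x86).  */
struct packed_epoll_event
{
  uint32_t events;
  epoll_data_t data;
} __attribute__ ((__packed__));

/* Offset of the data member when natural alignment applies; on a
   typical 64-bit target this is 8 because the union is 8-byte aligned.  */
static size_t
natural_data_offset (void)
{
  return offsetof (struct natural_epoll_event, data);
}

/* Offset of the data member in the packed variant: always right
   after the 4-byte events field.  */
static size_t
packed_data_offset (void)
{
  return offsetof (struct packed_epoll_event, data);
}

/* The packed offset and size are fixed; the natural offset is never
   smaller than the packed one.  */
static void
check_epoll_layout (void)
{
  assert (packed_data_offset () == 4);
  assert (sizeof (struct packed_epoll_event) == 12);
  assert (natural_data_offset () >= packed_data_offset ());
}
```

A Go struct that places Fd directly after Events only matches the packed layout, which fits the observation that the original EpollEvent definition worked on x86_64 but not on alpha.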
[go]: Many valgrind errors (use of uninit value, jump depends on uninit value) in the testsuite
On Tue, Jul 5, 2011 at 10:12 PM, Ian Lance Taylor i...@google.com wrote:

What remains is a couple of unrelated failures in the testsuite:

../../../gcc-svn/trunk/libgo/testsuite/gotest: line 388: 13945 Segmentation fault ./a.out -test.short -test.timeout=$timeout $@
FAIL: compress/flate
gmake[2]: *** [compress/flate/check] Error 1

Any ideas how to attack these?

None of these look familiar to me.

The compress/flate test sometimes passes and sometimes doesn't. I have run the resulting executable through valgrind, and there are many (i.e. hundreds of) warnings about uses and calls that depend on uninitialized variables, also on x86_64.

ATM, I would like to just report the problems found with valgrind; given their number, it looks to me that something is wrong with the library.

Uros.
Re: Improve Solaris mudflap support (PR libmudflap/49550)
Hello!

diff --git a/libmudflap/testsuite/libmudflap.c/pass47-frag.c b/libmudflap/testsuite/libmudflap.c/pass47-frag.c
--- a/libmudflap/testsuite/libmudflap.c/pass47-frag.c
+++ b/libmudflap/testsuite/libmudflap.c/pass47-frag.c
@@ -8,3 +8,5 @@ int main ()
              tolower (buf[4]) == 'o' && tolower ('X') == 'x' &&
              isdigit (buf[3])) == 0 && isalnum ('4'));
 }
+
+/* { dg-warning "cannot track unknown size extern .__ctype." "Solaris __ctype declared without size" { target *-*-solaris2.* } 0 } */

This is handled differently throughout the mudflap testsuite:

/* Ignore a warning that is irrelevant to the purpose of this test. */
/* { dg-prune-output ".*mudflap cannot track unknown size extern.*" } */

Uros.
Re: [PATCH] Fix UNRESOLVED gcc.dg/graphite/pr37485.c
Hello!

Committed.

Richard.

2011-07-07  Richard Guenther  rguent...@suse.de

    * gcc.dg/graphite/pr37485.c: Add -floop-block.

Heh, you were faster by a minute!

Uros.
Re: PATCH [1/n] X32: Add initial -x32 support
On Thu, Jul 7, 2011 at 2:59 PM, H.J. Lu hjl.to...@gmail.com wrote: Hi Paolo, DJ, Nathanael, Alexandre, Ralf, Is the change . * configure.ac: Support --enable-x32. * configure: Regenerated. diff --git a/gcc/configure.ac b/gcc/configure.ac index 5f3641b..bddabeb 100644 --- a/gcc/configure.ac +++ b/gcc/configure.ac @@ -611,6 +611,11 @@ AC_ARG_ENABLE(multilib, [], [enable_multilib=yes]) AC_SUBST(enable_multilib) +# With x32 support +AC_ARG_ENABLE(x32, +[ --enable-x32 enable x32 library support for multiple ABIs], Looks like a very very generic switch for a global configury ... we already have --with-multilib-list (SH only), why not extend that to also work for x86_64? Richard. +[], [enable_x32=no]) + # Enable __cxa_atexit for C++. AC_ARG_ENABLE(__cxa_atexit, [AS_HELP_STRING([--enable-__cxa_atexit], [enable __cxa_atexit for C++])], OK? Thanks. Here is the updated patch to use --with-multilib-list=x32. Paolo, DJ, Nathanael, Alexandre, Ralf, Is the configure.ac change --- * configure.ac: Mention x86-64 for --with-multilib-list. * configure: Regenerated. * doc/install.texi: Document --with-multilib-list=x32. diff --git a/gcc/configure.ac b/gcc/configure.ac index 5f3641b..a73f758 100644 --- a/gcc/configure.ac +++ b/gcc/configure.ac @@ -795,7 +795,7 @@ esac], [enable_languages=c]) AC_ARG_WITH(multilib-list, -[AS_HELP_STRING([--with-multilib-list], [select multilibs (SH only)])], +[AS_HELP_STRING([--with-multilib-list], [select multilibs (SH and x86-64 only)])], :, with_multilib_list=default) diff --git a/gcc/doc/install.texi b/gcc/doc/install.texi index 49aac95..a5d266c 100644 --- a/gcc/doc/install.texi +++ b/gcc/doc/install.texi @@ -1049,8 +1049,10 @@ sysv, aix. @item --with-multilib-list=@var{list} @itemx --without-multilib-list Specify what multilibs to build. -Currently only implemented for sh*-*-*. +Currently only implemented for sh*-*-* and x86-64-*-linux*. +@table @code +@item sh*-*-* @var{list} is a comma separated list of CPU names. 
These must be of the form @code{sh*} or @code{m*} (in which case they match the compiler option for that processor). The list should not contain any endian options - @@ -1082,6 +1084,12 @@ only little endian SH4AL: --with-multilib-list=sh4al,!mb/m4al @end smallexample +@item x86-64-*-linux* +If @var{list} is @code{x32}, x32 run-time library will be enabled. By +default, x32 run-time library is disabled. + +@end table + @item --with-endian=@var{endians} Specify what endians to use. Currently only implemented for sh*-*-*. --- OK? Thanks. -- H.J. --- 2011-07-06 H.J. Lu hongjiu...@intel.com * config.gcc: Support --with-multilib-list=x32 for x86 Linux targets. * configure.ac: Mention x86-64 for --with-multilib-list. * configure: Regenerated. * config/i386/gnu-user64.h (SPEC_64): Support x32. (SPEC_32): Likewise. (ASM_SPEC): Likewise. (LINK_SPEC): Likewise. (TARGET_THREAD_SSP_OFFSET): Likewise. (TARGET_THREAD_SPLIT_STACK_OFFSET): Likewise. (SPEC_X32): New. * config/i386/i386.h (TARGET_X32): New. (TARGET_LP64): New. (LONG_TYPE_SIZE): Likewise. (POINTER_SIZE): Likewise. (POINTERS_EXTEND_UNSIGNED): Likewise. (OPT_ARCH64): Support x32. (OPT_ARCH32): Likewise. * config/i386/i386.opt (mx32): New. * config/i386/kfreebsd-gnu64.h (GNU_USER_LINK_EMULATIONX32): New. (GLIBC_DYNAMIC_LINKERX32): Likewise. * config/i386/linux64.h (GNU_USER_LINK_EMULATIONX32): Likewise. (GLIBC_DYNAMIC_LINKERX32): Likewise. * config/i386/t-linux-x32: New. * config/linux.h (UCLIBC_DYNAMIC_LINKERX32): New. (BIONIC_DYNAMIC_LINKERX32): Likewise. (GNU_USER_DYNAMIC_LINKERX32): Likewise. * doc/install.texi: Document --with-multilib-list=x32. * doc/invoke.texi: Document -mx32. Hi Uros, This new version only adds a comment to configure.ac. OK to install? OK. Thanks, Uros.
Re: PATCH: Support -mx32 in GCC tests
On Fri, Jul 8, 2011 at 1:03 AM, H.J. Lu hjl.to...@gmail.com wrote:

Here is the updated patch. I will wait for Uros's comments.

I attached the wrong file. Here is the updated patch.

--- a/gcc/testsuite/g++.dg/abi/bitfield3.C
+++ b/gcc/testsuite/g++.dg/abi/bitfield3.C
@@ -4,7 +4,7 @@
 // Cygwin and mingw32 default to MASK_ALIGN_DOUBLE. Override to ensure
 // 4-byte alignment.
 // { dg-options "-mno-align-double" { target i?86-*-cygwin* i?86-*-mingw* } }
-// { dg-require-effective-target ilp32 }
+// { dg-require-effective-target ia32 }

Please rather change dg-do run command to:

+// { dg-do ... { target { { i?86-*-* x86_64-*-* } && ia32 } } }

and remove dg-require-effective-target entirely. This will ease grepping for certain target considerably.

+++ b/gcc/testsuite/g++.dg/ext/attrib8.C
+++ b/gcc/testsuite/g++.dg/ext/tmplattr1.C
+++ b/gcc/testsuite/g++.dg/inherit/override-attribs.C
+++ b/gcc/testsuite/g++.dg/opt/life1.C
+++ b/gcc/testsuite/g++.dg/opt/nrv12.C
+++ b/gcc/testsuite/g++.old-deja/g++.ext/attrib1.C
+++ b/gcc/testsuite/g++.old-deja/g++.ext/attrib2.C
+++ b/gcc/testsuite/g++.old-deja/g++.ext/attrib3.C
+++ b/gcc/testsuite/g++.old-deja/g++.pt/asm2.C
+++ b/gcc/testsuite/gcc.dg/tree-ssa/loop-28.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/prefetch-3.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/prefetch-4.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/prefetch-5.c

... and many more.

Same here.

--- a/gcc/testsuite/gcc.dg/20020103-1.c
+++ b/gcc/testsuite/gcc.dg/20020103-1.c
@@ -1,6 +1,6 @@
 /* Verify that constant equivalences get reloaded properly, either by being
    spilled to the stack, or regenerated, but not dropped to memory. */
-/* { dg-do compile { target { { i?86-*-* rs6000-*-* alpha*-*-* x86_64-*-* } || { powerpc*-*-* && ilp32 } } } } */
+/* { dg-do compile { target { { i?86-*-* rs6000-*-* alpha*-*-* x86_64-*-* } || { powerpc*-*-* && ia32 } } } } */

Wrong change.

--- a/gcc/testsuite/gcc.dg/pr25023.c
+++ b/gcc/testsuite/gcc.dg/pr25023.c
@@ -1,7 +1,7 @@
 /* PR debug/25023 */
 /* { dg-do compile } */
 /* { dg-options "-O2" } */
-/* { dg-options "-O2 -mtune=i686" { target { { i?86-*-* || x86_64-*-* } && ilp32 } } } */
+/* { dg-options "-O2 -mtune=i686" { target { { i?86-*-* || x86_64-*-* } && ia32 } } } */

Please also remove || in the target string.

--- a/gcc/testsuite/gcc.dg/lower-subreg-1.c
+++ b/gcc/testsuite/gcc.dg/lower-subreg-1.c
@@ -1,4 +1,4 @@
-/* { dg-do compile { target { { { ! mips64 } && { ! ia64-*-* } } && { ! spu-*-* } } } } */
+/* { dg-do compile { target { { { { ! mips64 } && { ! ia64-*-* } } && { ! spu-*-* } } && { ! { { i?86-*-* x86_64-*-* } && x32 } } } } } */
 /* { dg-options "-O -fdump-rtl-subreg1" } */
 /* { dg-require-effective-target ilp32 } */

This change is still present in updated patch, please change according to Mike's comments. I'd prefer skip-if there, BTW.

BTW: What about using ... { ! ia32 } instead of ... { x32 || lp64 } in

+/* { dg-do compile { target { { i?86-*-* x86_64-*-* } && { x32 || lp64 } } } } */

This will IMO future-proof the testcases.

Otherwise, the patch looks OK to me.

Uros.
[PATCH, fortran]: Fix PR 48926, gfortran.dg/coarray/image_index_1.f90 -fcoarray=single -O2 (test for excess errors)
Hello!

gfc_get_corank returns integer value, not bool. This problem was triggered by --enable-build-with-cxx configured build.

2011-07-09  Uros Bizjak  ubiz...@gmail.com

    PR fortran/48926
    * expr.c (gfc_get_corank): Change return value to int.
    * gfortran.h (gfc_get_corank): Update function prototype.

Patch was regression tested on x86_64-pc-linux-gnu {,-m32} with --enable-build-with-cxx. Approved by Tobias Burnus in the PR.

Patch was committed to mainline, will be committed to 4.6 branch soon.

Uros.

Index: expr.c
===
--- expr.c (revision 176083)
+++ expr.c (working copy)
@@ -4143,7 +4143,7 @@
 }


-bool
+int
 gfc_get_corank (gfc_expr *e)
 {
   int corank;
Index: gfortran.h
===
--- gfortran.h (revision 176083)
+++ gfortran.h (working copy)
@@ -2734,7 +2734,7 @@
 bool gfc_is_proc_ptr_comp (gfc_expr *, gfc_component **);

 bool gfc_is_coindexed (gfc_expr *);
-bool gfc_get_corank (gfc_expr *);
+int gfc_get_corank (gfc_expr *);
 bool gfc_has_ultimate_allocatable (gfc_expr *);
 bool gfc_has_ultimate_pointer (gfc_expr *);
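Why a bool return type is harmful for a function that computes a count can be shown in a few lines. This is a sketch with hypothetical function names, not the gfortran code: with a true Boolean type (as in C++, or C99 with stdbool.h) every nonzero value is converted to 1, so a corank of 2 would be reported as 1.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical miniature of the bug: the count is an int, but it is
   returned through a bool, so any nonzero value becomes 1.  */
static bool
corank_as_bool (int corank)
{
  return corank;
}

/* The fixed variant keeps the full value.  */
static int
corank_as_int (int corank)
{
  return corank;
}

/* Demonstrates the difference for a corank of 2.  */
static void
check_corank (void)
{
  assert (corank_as_bool (2) == 1);
  assert (corank_as_int (2) == 2);
}
```

This is presumably why the problem only showed up in the --enable-build-with-cxx configuration, where bool is a real Boolean type rather than an integer typedef.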
Re: [rfc, i386] Convert output_mi_thunk to rtl
On Sun, Jul 10, 2011 at 3:34 AM, Richard Henderson r...@redhat.com wrote:

I developed this patch while working on the dwarf2 pass series. This was before I bypassed the entire problem by removing the !deep branch prediction paths.

Ideally, we'd do this generically from gimple. Less ideally, but still better, is to always emit rtl, and support that in the middle end without so many hacks in the back end.

Looks good to me!

+  reload_completed = 1;
+  epilogue_completed = 1;

Do we really need these? Perhaps a comment should be added here; it is not obvious at first sight...

+      tmp_regno = CX_REG;
       if ((ccvt & (IX86_CALLCVT_FASTCALL | IX86_CALLCVT_THISCALL)) != 0)
         tmp_regno = AX_REG;

if (...)
  tmp_regno = AX_REG;
else
  tmp_regno = CX_REG;

Uros.
Re: PATCH [2/n] X32: Turn on 64bit and check models for x32
On Sat, Jul 9, 2011 at 11:22 PM, H.J. Lu hongjiu...@intel.com wrote:

This patch turns on 64bit and check models for x32. OK for trunk?

Thanks.

H.J.
---
2011-07-09  H.J. Lu  hongjiu...@intel.com

    * config/i386/i386.c (ix86_option_override_internal): Turn on
    OPTION_MASK_ISA_64BIT for TARGET_X32. Only allow small and
    small PIC models for TARGET_X32.

OK.

Thanks,
Uros.
Re: PATCH [3/n] X32: Promote pointers to Pmode
On Sat, Jul 9, 2011 at 11:28 PM, H.J. Lu hongjiu...@intel.com wrote:

X32 psABI requires promoting pointers to Pmode when passing/returning in registers. OK for trunk?

Thanks.

H.J.
--
2011-07-09  H.J. Lu  hongjiu...@intel.com

    * config/i386/i386.c (ix86_promote_function_mode): New.
    (TARGET_PROMOTE_FUNCTION_MODE): Likewise.

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 04cb07d..c852719 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -7052,6 +7061,23 @@ ix86_function_value (const_tree valtype, const_tree fntype_or_decl,
   return ix86_function_value_1 (valtype, fntype_or_decl, orig_mode, mode);
 }

+/* Pointer function arguments and return values are promoted to
+   Pmode.  */
+
+static enum machine_mode
+ix86_promote_function_mode (const_tree type, enum machine_mode mode,
+                            int *punsignedp, const_tree fntype,
+                            int for_return)
+{
+  if (for_return != 1 && type != NULL_TREE && POINTER_TYPE_P (type))
+    {
+      *punsignedp = POINTERS_EXTEND_UNSIGNED;
+      return Pmode;
+    }
+  return default_promote_function_mode (type, mode, punsignedp, fntype,
+                                        for_return);
+}

Please rewrite the condition to:

if (for_return == 1)
  /* Do not promote function return values.  */
  ;
else if (type != NULL_TREE ...)

Also, please add some comments.

Your comment also says that pointer return arguments are promoted to Pmode. The documentation says that:

FOR_RETURN allows to distinguish the promotion of arguments and return values. If it is `1', a return value is being promoted and `TARGET_FUNCTION_VALUE' must perform the same promotions done here. If it is `2', the returned mode should be that of the register in which an incoming parameter is copied, or the outgoing result is computed; then the hook should return the same mode as `promote_mode', though the signedness may be different.

You bypass promotions when FOR_RETURN is 1.

Uros.
[PATCH, i386]: ix86_trampoline_init: use offset everywhere
Hello! A small cleanup, no functional change. This allows us to assert that generated code length is less than TRAMPOLINE_SIZE also for 32bit targets. 2011-07-11 Uros Bizjak ubiz...@gmail.com * config/i386/i386.c (ix86_trampoline_init): Switch arms of if expr. Use offset everywhere. Always assert that offset = TRAMPOLINE_SIZE. Tested on x86_64-pc-linux-gnu {,-m32}, committed to mainline. Uros. Index: i386.c === --- i386.c (revision 176159) +++ i386.c (working copy) @@ -22683,54 +22683,14 @@ static void ix86_trampoline_init (rtx m_tramp, tree fndecl, rtx chain_value) { rtx mem, fnaddr; + int opcode; + int offset = 0; fnaddr = XEXP (DECL_RTL (fndecl), 0); - if (!TARGET_64BIT) -{ - rtx disp, chain; - int opcode; - - /* Depending on the static chain location, either load a register -with a constant, or push the constant to the stack. All of the -instructions are the same size. */ - chain = ix86_static_chain (fndecl, true); - if (REG_P (chain)) - { - if (REGNO (chain) == CX_REG) - opcode = 0xb9; - else if (REGNO (chain) == AX_REG) - opcode = 0xb8; - else - gcc_unreachable (); - } - else - opcode = 0x68; - - mem = adjust_address (m_tramp, QImode, 0); - emit_move_insn (mem, gen_int_mode (opcode, QImode)); - - mem = adjust_address (m_tramp, SImode, 1); - emit_move_insn (mem, chain_value); - - /* Compute offset from the end of the jmp to the target function. -In the case in which the trampoline stores the static chain on -the stack, we need to skip the first insn which pushes the -(call-saved) register static chain; this push is 1 byte. */ - disp = expand_binop (SImode, sub_optab, fnaddr, - plus_constant (XEXP (m_tramp, 0), - MEM_P (chain) ? 9 : 10), - NULL_RTX, 1, OPTAB_DIRECT); - - mem = adjust_address (m_tramp, QImode, 5); - emit_move_insn (mem, gen_int_mode (0xe9, QImode)); - - mem = adjust_address (m_tramp, SImode, 6); - emit_move_insn (mem, disp); -} - else + if (TARGET_64BIT) { - int offset = 0, size; + int size; /* Load the function address to r11. 
Try to load address using the shorter movl instead of movabs. We may want to support @@ -22757,20 +22717,22 @@ ix86_trampoline_init (rtx m_tramp, tree offset += 10; } - /* Load static chain using movabs to r10. */ - mem = adjust_address (m_tramp, HImode, offset); - /* Use the shorter movl instead of movabs for x32. */ + /* Load static chain using movabs to r10. Use the +shorter movl instead of movabs for x32. */ if (TARGET_X32) { + opcode = 0xba41; size = 6; - emit_move_insn (mem, gen_int_mode (0xba41, HImode)); } else { + opcode = 0xba49; size = 10; - emit_move_insn (mem, gen_int_mode (0xba49, HImode)); } + mem = adjust_address (m_tramp, HImode, offset); + emit_move_insn (mem, gen_int_mode (opcode, HImode)); + mem = adjust_address (m_tramp, ptr_mode, offset + 2); emit_move_insn (mem, chain_value); offset += size; @@ -22780,10 +22742,56 @@ ix86_trampoline_init (rtx m_tramp, tree mem = adjust_address (m_tramp, SImode, offset); emit_move_insn (mem, gen_int_mode (0x90e3ff49, SImode)); offset += 4; +} + else +{ + rtx disp, chain; - gcc_assert (offset = TRAMPOLINE_SIZE); + /* Depending on the static chain location, either load a register +with a constant, or push the constant to the stack. All of the +instructions are the same size. */ + chain = ix86_static_chain (fndecl, true); + if (REG_P (chain)) + { + switch (REGNO (chain)) + { + case AX_REG: + opcode = 0xb8; break; + case CX_REG: + opcode = 0xb9; break; + default: + gcc_unreachable (); + } + } + else + opcode = 0x68; + + mem = adjust_address (m_tramp, QImode, offset); + emit_move_insn (mem, gen_int_mode (opcode, QImode)); + + mem = adjust_address (m_tramp, SImode, offset + 1); + emit_move_insn (mem, chain_value); + offset += 5; + + mem = adjust_address (m_tramp, QImode, offset); + emit_move_insn (mem, gen_int_mode (0xe9, QImode)); + + mem = adjust_address (m_tramp, SImode, offset + 1); + + /* Compute offset from the end of the jmp to the target function. 
+In the case in which the trampoline stores the static chain on +the stack, we need to skip the first insn which pushes the +(call-saved) register static chain; this push is 1 byte
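The "use offset everywhere" idea from the patch above can be illustrated with a standalone byte-buffer version of the 32-bit trampoline. This is a sketch only: it covers just the case where the static chain lives in %ecx, writes into a plain buffer instead of emitting RTL, and the names build_trampoline32, store_le32 and TRAMPOLINE_SIZE_32 (with the value 10) are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

enum { TRAMPOLINE_SIZE_32 = 10 };  /* Assumed size, for this sketch only.  */

/* Store a 32-bit value in little-endian byte order.  */
static void
store_le32 (uint8_t *p, uint32_t v)
{
  p[0] = (uint8_t) (v & 0xff);
  p[1] = (uint8_t) ((v >> 8) & 0xff);
  p[2] = (uint8_t) ((v >> 16) & 0xff);
  p[3] = (uint8_t) ((v >> 24) & 0xff);
}

/* Write "movl $chain_value, %ecx; jmp fnaddr" into BUF, which is
   assumed to be placed at address TRAMP_ADDR.  Every store is
   addressed through the running OFFSET, so the final size check
   works for any instruction sequence.  Returns the bytes written.  */
static int
build_trampoline32 (uint8_t *buf, uint32_t tramp_addr,
                    uint32_t chain_value, uint32_t fnaddr)
{
  int offset = 0;

  /* movl $imm32, %ecx: opcode 0xb9 followed by the immediate.  */
  buf[offset] = 0xb9;
  store_le32 (buf + offset + 1, chain_value);
  offset += 5;

  /* jmp rel32: opcode 0xe9 followed by a displacement that is
     relative to the end of the jmp instruction.  */
  buf[offset] = 0xe9;
  store_le32 (buf + offset + 1, fnaddr - (tramp_addr + offset + 5));
  offset += 5;

  assert (offset <= TRAMPOLINE_SIZE_32);
  return offset;
}
```

For a trampoline at 0x1000 jumping to 0x2000, the displacement is 0x2000 - 0x100a = 0xff6, matching the constant 10 that the removed code added to the trampoline address in the register case.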
Re: AMD bdver2 enablement.
Hello!

2011-07-11  Harsha Jagasia  harsha.jaga...@amd.com

    AMD bdver2 Enablement
    * config.gcc (i[34567]86-*-linux* | ...): Add bdver2.
    (case ${target}): Add bdver2.
    * config/i386/driver-i386.c (host_detect_local_cpu): Let
    -march=native recognize bdver2 processors.
    * config/i386/i386-c.c (ix86_target_macros_internal): Add
    bdver2 def_and_undef.
    * config/i386/i386.c (struct processor_costs bdver2_cost): New
    bdver2 cost table.
    (m_BDVER2): New definition.
    (m_AMD_MULTIPLE): Includes m_BDVER2.
    (initial_ix86_tune_features): Add bdver2 tuning.
    (processor_target_table): Add bdver2 entry.
    (static const char *const cpu_names): Add bdver2 entry.
    (ix86_option_override_internal): Add bdver2 instruction sets.
    (ix86_issue_rate): Add bdver2.
    (ix86_adjust_cost): Add bdver2.
    (has_dispatch): Add bdver2.
    * config/i386/i386.h (TARGET_BDVER2): New definition.
    (enum target_cpu_default): Add TARGET_CPU_DEFAULT_bdver2.
    (enum processor_type): Add PROCESSOR_BDVER2.
    * config/i386/i386.md (define_attr "cpu"): Add bdver2.
    * config/i386/i386.opt (mdispatch-scheduler): Add bdver2 to
    description.

OK, with a small change - see below.

@@ -1813,8 +1900,10 @@ const struct processor_costs *ix86_cost
 #define m_ATHLON_K8 (m_K8 | m_ATHLON)
 #define m_AMDFAM10 (1<<PROCESSOR_AMDFAM10)
 #define m_BDVER1 (1<<PROCESSOR_BDVER1)
+#define m_BDVER2 (1<<PROCESSOR_BDVER2)
 #define m_BTVER1 (1<<PROCESSOR_BTVER1)
-#define m_AMD_MULTIPLE (m_K8 | m_ATHLON | m_AMDFAM10 | m_BDVER1 | m_BTVER1)
+#define m_BDVER (m_BDVER1 | m_BDVER2)
+#define m_AMD_MULTIPLE (m_ATHLON_K8 | m_AMDFAM10 | m_BDVER | m_BTVER1)

 #define m_GENERIC32 (1<<PROCESSOR_GENERIC32)
 #define m_GENERIC64 (1<<PROCESSOR_GENERIC64)
@@ -1856,8 +1945,8 @@ static unsigned int initial_ix86_tune_fe
   ~m_386,

   /* X86_TUNE_USE_SAHF */
-  m_ATOM | m_PPRO | m_K6_GEODE | m_K8 | m_AMDFAM10 | m_BDVER1 | m_BTVER1
-  | m_PENT4 | m_NOCONA | m_CORE2I7 | m_GENERIC,
+  m_ATOM | m_PPRO | m_K6_GEODE | m_K8 | m_AMDFAM10 | m_BDVER1 | m_BDVER2
+  | m_BTVER1 | m_PENT4 | m_NOCONA | m_CORE2I7 | m_GENERIC,

Please use newly introduced m_BDVER in tune flags instead of m_BDVER1 | m_BDVER2.

Thanks,
Uros.
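The request to use m_BDVER in the tune flags is easier to see with a reduced model of the mask scheme. The sketch below is not the i386.c code: the enum is a hypothetical five-entry subset with made-up values, and tune_use_sahf and tune_enabled are illustrative names.

```c
#include <assert.h>

/* Hypothetical subset of enum processor_type; the numeric values
   are illustrative only.  */
enum processor_type
{
  PROCESSOR_K8,
  PROCESSOR_AMDFAM10,
  PROCESSOR_BDVER1,
  PROCESSOR_BDVER2,
  PROCESSOR_BTVER1
};

/* One bit per processor, as in the m_* macros quoted above.  */
#define m_K8       (1u << PROCESSOR_K8)
#define m_AMDFAM10 (1u << PROCESSOR_AMDFAM10)
#define m_BDVER1   (1u << PROCESSOR_BDVER1)
#define m_BDVER2   (1u << PROCESSOR_BDVER2)
#define m_BTVER1   (1u << PROCESSOR_BTVER1)

/* Group mask: a later bdverN only has to be added here, and every
   tune entry written in terms of m_BDVER picks it up.  */
#define m_BDVER    (m_BDVER1 | m_BDVER2)

/* A tune entry written with the group mask.  */
static const unsigned int tune_use_sahf
  = m_K8 | m_AMDFAM10 | m_BDVER | m_BTVER1;

/* Nonzero if the feature described by FEATURE_MASK is on for CPU.  */
static int
tune_enabled (unsigned int feature_mask, enum processor_type cpu)
{
  return (feature_mask & (1u << cpu)) != 0;
}

/* Both Bulldozer variants are covered through the single group mask.  */
static void
check_bdver_tuning (void)
{
  assert (tune_enabled (tune_use_sahf, PROCESSOR_BDVER1));
  assert (tune_enabled (tune_use_sahf, PROCESSOR_BDVER2));
}
```

With the group mask, the tune table entries stay one token long per processor family instead of growing with every new family member.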
Re: Use of vector instructions in memmov/memset expanding
Hello!

Please don't use -m32/-m64 in testcases directly. You should use

/* { dg-do compile { target { ! ia32 } } } */

for 64bit insns and

/* { dg-do compile { target { ia32 } } } */

for 32bit insns.

Also, there is no need to add -mtune if -march is already specified; -mtune will follow -march.

To scan for the %xmm register, you don't have to add -dp to the compile flags. -dp will also dump the pattern name to the file, so unless you are looking for a specific pattern name, you should omit -dp.

Uros.
Re: PATCH [3/n] X32: Promote pointers to Pmode
On Wed, Jul 13, 2011 at 3:17 PM, H.J. Lu hjl.to...@gmail.com wrote:

PING.

2011-07-10  H.J. Lu  hongjiu...@intel.com

    * config/i386/i386.c (ix86_promote_function_mode): New.
    (TARGET_PROMOTE_FUNCTION_MODE): Likewise.

You have discussed this with rth, the final approval should be from him.

Uros.
[PATCH, testsuite]: Use istarget everywhere
Hello! Attached patch converts several places where string match or regexp on $target_triplet is used with istarget. The patch also removes quotes around target string. 2011-07-13 Uros Bizjak ubiz...@gmail.com * lib/g++.exp (g++_init): Use istarget. Remove target_triplet global. * lib/obj-c++.exp (obj-c++_init): Ditto. * lib/file-format.exp (gcc_target_object_format): Ditto. * lib/target-supports-dg.exp (dg-require-dll): Ditto. * lib/target-supports-dg-exp (check_weak_available): Ditto. (check_visibility_available): Ditto. (check_effective_target_tls_native): Ditto. (check_effective_target_tls_emulated): Ditto. (check_effective_target_function_sections): Ditto. Tested on x86_64-pc-linux-gnu {,-m32}, committed to mainline SVN. Uros. Index: lib/g++.exp === --- lib/g++.exp (revision 176236) +++ lib/g++.exp (working copy) @@ -188,7 +188,6 @@ global TOOL_EXECUTABLE TOOL_OPTIONS global GXX_UNDER_TEST global TESTING_IN_BUILD_TREE -global target_triplet global gcc_warning_prefix global gcc_error_prefix @@ -263,7 +262,7 @@ set gcc_warning_prefix warning: set gcc_error_prefix error: -if { [string match *-*-darwin* $target_triplet] } { +if { [istarget *-*-darwin*] } { lappend ALWAYS_CXXFLAGS ldflags=-multiply_defined suppress } Index: lib/obj-c++.exp === --- lib/obj-c++.exp (revision 176236) +++ lib/obj-c++.exp (working copy) @@ -210,7 +210,6 @@ global TOOL_EXECUTABLE TOOL_OPTIONS global OBJCXX_UNDER_TEST global TESTING_IN_BUILD_TREE -global target_triplet global gcc_warning_prefix global gcc_error_prefix @@ -270,7 +269,7 @@ set gcc_warning_prefix warning: set gcc_error_prefix error: -if { [string match *-*-darwin* $target_triplet] } { +if { [istarget *-*-darwin*] } { lappend ALWAYS_OBJCXXFLAGS ldflags=-multiply_defined suppress } @@ -299,7 +298,7 @@ # we need to add the include path for the gnu runtime if that is in # use. # First, set the default... 
-if { [istarget *-*-darwin*] } { +if { [istarget *-*-darwin*] } { set nextruntime 1 } else { set nextruntime 0 Index: lib/scanasm.exp === --- lib/scanasm.exp (revision 176236) +++ lib/scanasm.exp (working copy) @@ -461,10 +461,10 @@ } } -if { [istarget hppa*-*-*] } { +if { [istarget hppa*-*-*] } { set pattern [format {\t;[^:]+:%d\n(\t[^\t]+\n)+%s:\n\t.PROC} \ $line $symbol] -} elseif { [istarget mips-sgi-irix*] } { +} elseif { [istarget mips-sgi-irix*] } { set pattern [format {\t\.loc [0-9]+ %d 0( [^\n]*)?\n\t\.set\t(no)?mips16\n\t\.ent\t%s\n\t\.type\t%s, @function\n%s:\n} \ $line $symbol $symbol $symbol] } else { Index: lib/file-format.exp === --- lib/file-format.exp (revision 176236) +++ lib/file-format.exp (working copy) @@ -24,17 +24,16 @@ proc gcc_target_object_format { } { global gcc_target_object_format_saved -global target_triplet global tool if [info exists gcc_target_object_format_saved] { verbose gcc_target_object_format returning saved $gcc_target_object_format_saved 2 -} elseif { [string match *-*-darwin* $target_triplet] } { +} elseif { [istarget *-*-darwin*] } { # Darwin doesn't necessarily have objdump, so hand-code it. set gcc_target_object_format_saved mach-o -} elseif { [string match hppa*-*-hpux* $target_triplet] } { +} elseif { [istarget hppa*-*-hpux*] } { # HP-UX doesn't necessarily have objdump, so hand-code it. 
- if { [string match hppa*64*-*-hpux* $target_triplet] } { + if { [istarget hppa*64*-*-hpux*] } { set gcc_target_object_format_saved elf } else { set gcc_target_object_format_saved som Index: lib/target-libpath.exp === --- lib/target-libpath.exp (revision 176236) +++ lib/target-libpath.exp (working copy) @@ -272,11 +272,11 @@ proc get_shlib_extension { } { global shlib_ext -if { [ istarget *-*-darwin* ] } { +if { [istarget *-*-darwin*] } { set shlib_ext dylib -} elseif { [ istarget *-*-cygwin* ] || [ istarget *-*-mingw* ] } { +} elseif { [istarget *-*-cygwin*] || [istarget *-*-mingw*] } { set shlib_ext dll -} elseif { [ istarget hppa*-*-hpux* ] } { +} elseif { [istarget hppa*-*-hpux*] } { set shlib_ext sl } else { set shlib_ext so Index: lib/go-torture.exp === --- lib/go-torture.exp (revision 176236
Re: [build] Move crtfastmath to toplevel libgcc
On Thu, Jul 14, 2011 at 12:09 PM, Rainer Orth r...@cebitec.uni-bielefeld.de wrote:

Andreas Schwab sch...@redhat.com writes:

Same on ia64:

Configuration mismatch!
Extra parts from gcc directory: crtbegin.o crtbeginS.o crtend.o crtendS.o
Extra parts from libgcc: crtbegin.o crtend.o crtbeginS.o crtendS.o crtfastmath.o

Alpha needs the same fix. I need the following patch to bootstrap the compiler:

--cut here--
Index: gcc/config.gcc
===
--- gcc/config.gcc (revision 176282)
+++ gcc/config.gcc (working copy)
@@ -757,6 +757,7 @@
 extra_options=${extra_options} alpha/elf.opt
 target_cpu_default=MASK_GAS
 tmake_file=${tmake_file} alpha/t-alpha alpha/t-ieee alpha/t-linux
+ extra_parts=$extra_parts crtfastmath.o
 ;;
 alpha*-*-freebsd*)
 tm_file=${tm_file} ${fbsd_tm_file} alpha/elf.h alpha/freebsd.h
--cut here--

Uros.
Re: PATCH [5/n] X32: Supprot 32bit address
On Sun, Jul 10, 2011 at 12:20 AM, H.J. Lu hongjiu...@intel.com wrote:

TARGET_MEM_REF only works on ptr_mode. That means base and index parts of x86 address operand in x32 mode may be in ptr_mode. This patch supports 32bit base and index parts in x32 mode. OK for trunk? Thanks. H.J.
---
2011-07-09 H.J. Lu hongjiu...@intel.com

* config/i386/i386.c (ix86_simplify_base_index_disp): New.
(ix86_decompose_address): Support 32bit address in x32 mode.
(ix86_legitimate_address_p): Likewise.
(ix86_fixup_binary_operands): Likewise.

Why don't you handle translations in TARGET_LEGITIMIZE_ADDRESS (or maybe also LEGITIMIZE_RELOAD_ADDRESS)?

Uros.
Re: PATCH [5/n] X32: Supprot 32bit address
On Fri, Jul 15, 2011 at 3:03 PM, H.J. Lu hjl.to...@gmail.com wrote: On Fri, Jul 15, 2011 at 5:49 AM, Uros Bizjak ubiz...@gmail.com wrote: On Sun, Jul 10, 2011 at 12:20 AM, H.J. Lu hongjiu...@intel.com wrote: TARGET_MEM_REF only works on ptr_mode. That means base and index parts of x86 address operand in x32 mode may be in ptr_mode. This patch supports 32bit base and index parts in x32 mode. OK for trunk? Thanks. H.J. --- 2011-07-09 H.J. Lu hongjiu...@intel.com * config/i386/i386.c (ix86_simplify_base_index_disp): New. (ix86_decompose_address): Support 32bit address in x32 mode. (ix86_legitimate_address_p): Likewise. (ix86_fixup_binary_operands): Likewise. Why don't you handle translations in TARGET_LEGITIMIZE_ADDRESS (or maybe also LEGITIMIZE_RELOAD_ADDRESS) ? It is because ix86_decompose_address is also called from: predicates.md: ok = ix86_decompose_address (op, parts); predicates.md: ok = ix86_decompose_address (op, parts); predicates.md: ok = ix86_decompose_address (XEXP (op, 0), parts); predicates.md: ok = ix86_decompose_address (XEXP (op, 0), parts); predicates.md: ok = ix86_decompose_address (XEXP (op, 0), parts); Yes, but you should legitimize the address created by reload before it enters into predicates. So, the questions are: + (set (reg:SI 40 r11) +(plus:SI (plus:SI (mult:SI (reg:SI 1 dx) + (const_int 8)) + (subreg:SI (plus:DI (reg/f:DI 7 sp) + (const_int CONST1)) 0)) +(const_int CONST2))) + + We translate it into + + (set (reg:SI 40 r11) +(plus:SI (plus:SI (mult:SI (reg:SI 1 dx) + (const_int 8)) + (reg/f:SI 7 sp)) +(const_int [CONST1 + CONST2]))) If the first form of the address is not OK (it does not represent the hardware operation), then it should not enter into the insn stream. This means, that it should be fixed (legitimized) to second form by appropriate function (it looks that LEGITIMIZE_RELOAD_ADDRESS should fix it, since the incorrect address is generated by IRA/reload). 
After this operation, the various predicates based on ix86_decompose_address will start to work, since they will then decompose valid memory addresses.

Uros.
Re: PATCH [5/n] X32: Supprot 32bit address
On Fri, Jul 15, 2011 at 5:44 PM, H.J. Lu hjl.to...@gmail.com wrote: TARGET_MEM_REF only works on ptr_mode. That means base and index parts of x86 address operand in x32 mode may be in ptr_mode. This patch supports 32bit base and index parts in x32 mode. OK for trunk? Thanks. H.J. --- 2011-07-09 H.J. Lu hongjiu...@intel.com * config/i386/i386.c (ix86_simplify_base_index_disp): New. (ix86_decompose_address): Support 32bit address in x32 mode. (ix86_legitimate_address_p): Likewise. (ix86_fixup_binary_operands): Likewise. Why don't you handle translations in TARGET_LEGITIMIZE_ADDRESS (or maybe also LEGITIMIZE_RELOAD_ADDRESS) ? It is because ix86_decompose_address is also called from: predicates.md: ok = ix86_decompose_address (op, parts); predicates.md: ok = ix86_decompose_address (op, parts); predicates.md: ok = ix86_decompose_address (XEXP (op, 0), parts); predicates.md: ok = ix86_decompose_address (XEXP (op, 0), parts); predicates.md: ok = ix86_decompose_address (XEXP (op, 0), parts); Yes, but you should legitimize the address created by reload before it enters into predicates. So, the questions are: + (set (reg:SI 40 r11) + (plus:SI (plus:SI (mult:SI (reg:SI 1 dx) + (const_int 8)) + (subreg:SI (plus:DI (reg/f:DI 7 sp) + (const_int CONST1)) 0)) + (const_int CONST2))) + + We translate it into + + (set (reg:SI 40 r11) + (plus:SI (plus:SI (mult:SI (reg:SI 1 dx) + (const_int 8)) + (reg/f:SI 7 sp)) + (const_int [CONST1 + CONST2]))) If the first form of the address is not OK (it does not represent the hardware operation), then it should not enter into the insn stream. This means, that it should be fixed (legitimized) to second form by appropriate function (it looks that LEGITIMIZE_RELOAD_ADDRESS should fix it, since the incorrect address is generated by IRA/reload). After this operation, various predicates, based on ix86_decompose_address will start to work, since they will decompose valid memory addresses. IRA/.RELOAD isn't prepared to deal with it and it just ICEs. 
I opened a few GCC bugs on this. http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47744 is one of them. That is why I went this route.

Hm, but it crashed in postreload pass since the address was not in the legitimate form. This is exactly what LEGITIMIZE_RELOAD_ADDRESS fixes. Did you try to go this route?

Uros.
Re: PATCH [5/n] X32: Supprot 32bit address
On Fri, Jul 15, 2011 at 6:07 PM, H.J. Lu hjl.to...@gmail.com wrote: If the first form of the address is not OK (it does not represent the hardware operation), then it should not enter into the insn stream. This means, that it should be fixed (legitimized) to second form by appropriate function (it looks that LEGITIMIZE_RELOAD_ADDRESS should fix it, since the incorrect address is generated by IRA/reload). After this operation, various predicates, based on ix86_decompose_address will start to work, since they will decompose valid memory addresses. IRA/.RELOAD isn't prepared to deal with it and it just ICEs. I opened a few GCC bugs on this. http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47744 is one of them. That is why I went this route. Hm, but it crashed in postreload pass since the address was not in the legitimate form. This is exactly what LEGITIMIZE_RELOAD_ADDRESS fixes. Did you try to go this route? It ran into various ICEs like: /export/build/gnu/gcc-x32/build-x86_64-linux/gcc/xgcc -B/export/build/gnu/gcc-x32/build-x86_64-linux/gcc/ -S -o m.s -mx32 -std=gnu99 -O2 -fPIC m.i m.i: In function \u2018__kernel_rem_pio2\u2019: m.i:18:1: error: insn does not satisfy its constraints: (insn 108 106 186 3 (set (reg:SI 40 r11 [207]) (plus:SI (plus:SI (mult:SI (reg:SI 1 dx [205]) (const_int 8 [0x8])) (subreg:SI (plus:DI (reg/f:DI 7 sp) (const_int 208 [0xd0])) 0)) (const_int -160 [0xff60]))) m.i:3 251 {*lea_1_x32} (nil)) m.i:18:1: internal compiler error: in reload_cse_simplify_operands, at postreload.c:403 Please submit a full bug report, with preprocessed source if appropriate. See http://gcc.gnu.org/bugs.html for instructions. make: *** [m.s] Error 1 Yes, this is an example from PR I am referring to. Did you try to define LEGITIMIZE_RELOAD_ADDRESS? It is supposed to fix this. Uros.
Re: PATCH [5/n] X32: Supprot 32bit address
On Sat, Jul 16, 2011 at 6:47 PM, H.J. Lu hjl.to...@gmail.com wrote:

Yes, this is an example from PR I am referring to. Did you try to define LEGITIMIZE_RELOAD_ADDRESS? It is supposed to fix this.

They make things even more complex. ix86_simplify_base_index_disp is called after reload is done since we can do this translation safely only on hard registers, not on pseudo registers.

Hi Uros, The current implementation has been tested extensively. I'd like to keep it as is so that we can have working x32 support. We will revisit it later, in http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49765, after we have a working x32 GCC.

This cannot be my decision alone; I have CC'd the other x86 maintainers and RMs for their opinion on this question.

Uros.
[PATCH, i386]: FixPR47744; [x32] ICE: in reload_cse_simplify_operands, at postreload.c:403 [was: Re: PATCH [5/n] X32: Supprot 32bit address]
Hello!

This alternative patch fixes the problem in ix86_decompose_address, uncovered by the x32 branch. Since the x32 branch generates lots of SImode subregs of DImode values to handle Pmode vs. ptr_mode restrictions, a latent bug in ix86_decompose_address allowed addresses in the following invalid form (note the SImode subreg of a DImode operation):

(insn 108 106 186 3 (set (reg:SI 40 r11 [207])
        (plus:SI (plus:SI (mult:SI (reg:SI 1 dx [205])
                    (const_int 8 [0x8]))
                (subreg:SI (plus:DI (reg/f:DI 7 sp)
                        (const_int 208 [0xd0])) 0))
            (const_int -160 [0xff60]))) m.i:3 251 {*lea_1_x32}
     (nil))

This form later made reload ICE with "error: insn does not satisfy its constraints" in reload_cse_simplify_operands, at postreload.c:403. The invalid RTX in this example was created by reload trying to eliminate the frame pointer register to RSP+offset.

The solution is to prevent subregs of DImode operations in PLUS address sequences. We can still allow hard registers, since we are sure that combine won't touch them and reload won't try to eliminate them to some reg+offset. Effectively, instead of the above RTX, gcc generates a more correct sequence that correctly handles SImode and DImode:

(insn 185 87 89 3 (set (reg:DI 0 ax)
        (plus:DI (reg/f:DI 7 sp)
            (const_int 200 [0xc8]))) pr47744.c:5 248 {*lea_1}
     (nil))

(insn 89 185 90 3 (set (reg:SI 40 r11 [177])
        (plus:SI (plus:SI (mult:SI (reg:SI 40 r11 [175])
                    (const_int 8 [0x8]))
                (reg:SI 0 ax))
            (const_int -160 [0xff60]))) pr47744.c:5 286 {*lea_general_3}
     (nil))

So there is no need for special lea_* patterns. In addition, this simple patch removes a huge amount of problematic kludges from the current x32 branch.

Also, the patch prevents invalid address RTXes for current x86 targets (32 and 64 bit). There is in fact no protection against, e.g., an SImode subreg of an HImode operation being combined into an invalid address RTX.

On a related note, an SImode subreg of a DImode hard register is also OK for 32bit targets; reload will choose the lower SImode register of a DImode pair.
Oh, and BTW: the patched gcc bootstrapped faster for me on x86_64 SNB with default configure and make -j 8:

(unpatched)
real 28m40.314s
user 154m2.612s
sys 8m16.934s

vs. (patched)
real 27m8.057s
user 142m42.522s
sys 7m41.875s

(see the PR for details).

2011-07-18 Uros Bizjak ubiz...@gmail.com

PR target/47744
* config/i386/i386.c (ix86_decompose_address): Allow only subregs of DImode hard registers in PLUS address chains.

Patch was bootstrapped on x86_64-pc-linux-gnu {,-m32}. H.J. tested it on x32 target, where the patch fixed all reported failures. Patch was committed to mainline SVN.

Uros.

Index: config/i386/i386.c
===
--- config/i386/i386.c (revision 176386)
+++ config/i386/i386.c (working copy)
@@ -11149,8 +11149,13 @@ ix86_decompose_address (rtx addr, struct
   return 0;
   break;

- case REG:
  case SUBREG:
+ /* Allow only subregs of DImode hard regs in PLUS chains. */
+ if (!register_no_elim_operand (SUBREG_REG (op), DImode))
+   return 0;
+ /* FALLTHRU */
+
+ case REG:
  if (!base)
    base = op;
  else if (!index)
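The rule this patch adds can be sketched outside GCC as a small C model. This is only an illustration under stated assumptions: toy_rtx, toy_code, toy_mode, TOY_FIRST_PSEUDO, toy_di_hard_reg_p and toy_addend_ok are all hypothetical names, and the helper only models "DImode hard register" as described in the message, not the actual behaviour of register_no_elim_operand.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, minimal stand-ins for GCC's rtx codes and machine modes. */
enum toy_code { TOY_REG, TOY_SUBREG, TOY_PLUS };
enum toy_mode { TOY_SImode, TOY_DImode };

struct toy_rtx {
  enum toy_code code;
  enum toy_mode mode;
  int regno;                    /* meaningful for TOY_REG only */
  const struct toy_rtx *inner;  /* meaningful for TOY_SUBREG only */
};

/* Assumed boundary between hard and pseudo register numbers in this model. */
#define TOY_FIRST_PSEUDO 64

/* True if X is a DImode hard register.  A simplification of what the
   message describes; the real GCC predicate is not modelled here. */
static bool toy_di_hard_reg_p(const struct toy_rtx *x)
{
  return x != NULL && x->code == TOY_REG && x->mode == TOY_DImode
         && x->regno < TOY_FIRST_PSEUDO;
}

/* True if OP is acceptable as a base or index addend of a PLUS chain. */
static bool toy_addend_ok(const struct toy_rtx *op)
{
  assert(op != NULL);
  switch (op->code) {
  case TOY_SUBREG:
    /* Allow only subregs of DImode hard regs. */
    if (!toy_di_hard_reg_p(op->inner))
      return false;
    /* FALLTHRU */
  case TOY_REG:
    return true;
  default:
    return false;
  }
}
```

In this model a subreg wrapped around a PLUS, as in the failing insn 108, is rejected, while a plain register or an SImode subreg of a DImode hard register is accepted.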
Re: PATCH [6/n] X32: Supprot 32bit address
On Mon, Jul 18, 2011 at 8:39 PM, H.J. Lu hongjiu...@intel.com wrote: TARGET_MEM_REF only works on ptr_mode. This patch allows 32bit address in x32 mode. OK for trunk? Do you perhaps have a testcase to help in analyzing the problem? Uros.
Re: PATCH [6/n] X32: Supprot 32bit address
On Mon, Jul 18, 2011 at 8:48 PM, H.J. Lu hjl.to...@gmail.com wrote: TARGET_MEM_REF only works on ptr_mode. This patch allows 32bit address in x32 mode. OK for trunk? Do you perhaps have a testcase to help in analyzing the problem? See: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49780 I don't think that tree-ssa-address/addr_for_mem_ref is correct when REALLY_EXPAND is false. It constructs RTX template in pointer_mode, which is not necessary valid and is rejected from ix86_validate_address_p. When really expanding the expression, we have a conversion at the end: gen_addr_rtx (pointer_mode, sym, bse, idx, st, off, address, NULL, NULL); if (pointer_mode != address_mode) address = convert_memory_address (address_mode, address); return address; This is in fact your r175912 change in the fix for PR47383 - you need to do something with template as well... Uros.
Re: PATCH [6/n] X32: Supprot 32bit address
On Mon, Jul 18, 2011 at 10:25 PM, H.J. Lu hjl.to...@gmail.com wrote: TARGET_MEM_REF only works on ptr_mode. This patch allows 32bit address in x32 mode. OK for trunk? Do you perhaps have a testcase to help in analyzing the problem? See: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49780 I don't think that tree-ssa-address/addr_for_mem_ref is correct when REALLY_EXPAND is false. It constructs RTX template in pointer_mode, which is not necessary valid and is rejected from ix86_validate_address_p. When really expanding the expression, we have a conversion at the end: gen_addr_rtx (pointer_mode, sym, bse, idx, st, off, address, NULL, NULL); if (pointer_mode != address_mode) address = convert_memory_address (address_mode, address); return address; This is in fact your r175912 change in the fix for PR47383 - you need to do something with template as well... Since TARGET_MEM_REF only works on ptr_mode, I don't think we can change template. We just need to accept TARGET_MEM_REF in ptr_mode and fix it up later. No, a template is used to get some insight into the supported address structure. If there is a mismatch, this approach fails, we can as well give the compiler whatever fake template we want. Uros.
Re: PATCH [5/n] X32: Supprot 32bit address
On Tue, Jul 19, 2011 at 1:25 PM, Richard Sandiford richard.sandif...@linaro.org wrote: On Sat, Jul 16, 2011 at 6:47 PM, H.J. Lu hjl.to...@gmail.com wrote: Yes, this is an example from PR I am referring to. Did you try to define LEGITIMIZE_RELOAD_ADDRESS? It is supposed to fix this. They make things even more complex. ix86_simplify_base_index_disp is called after reload is done since we can do this translation safely only on hard registers, not on pseudo registers. Hi Uros, The current implementation has been tested extensively. I'd like to keep it ASIS so that we can have a working x32 support. We will revisit it later: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49765 after we have a working x32 GCC. This can not be only my decision, I have CCd other x86 maintainers and RMs for their opinion on this question. FWIW, I agree with you that things like: (set (reg:SI 40 r11) (plus:SI (plus:SI (mult:SI (reg:SI 1 dx) (const_int 8)) (subreg:SI (plus:DI (reg/f:DI 7 sp) (const_int CONST1)) 0)) (const_int CONST2))) do not look like things that should ever enter the insn stream. They're liable to confuse other code besides the x86 predicates. The target of the conversion: (set (reg:SI 40 r11) (plus:SI (plus:SI (mult:SI (reg:SI 1 dx) (const_int 8)) (reg/f:SI 7 sp)) (const_int [CONST1 + CONST2]))) looks like the generally preferred form. It isn't an x32-ism. LEGITIMIZE_RELOAD_ADDRESS is supposed to be for optimisation only, not correctness. Why doesn't reload have enough information to generate the correct form itself? Please see the solution at [1]. The problem was that x86 target allowed SImode subregs of DImode operations (i.e. PLUS). When these are rejected, everything works as expected. IMO, LEGITIMIZE_RELOAD_ADDRESS can not optimize resulting RTX, as shown in [1]. [1] http://gcc.gnu.org/ml/gcc-patches/2011-07/msg01427.html Uros.
Re: PATCH [6/n] X32: Supprot 32bit address
On Tue, Jul 19, 2011 at 3:47 PM, H.J. Lu hjl.to...@gmail.com wrote: Attached patch simply removes these two checks, as it seems they are not needed. This also follows how other Pmode != ptr_mode targets. 2011-07-19 Uros Bizjak ubiz...@gmail.com PR target/49780 * config/i386/i386.c (ix86_legitimate_address_p): Remove checks that base and index registers are in Pmode. Patch was bootstrapped and regression tested on x86_64-pc-linux-gnu {,-m32}. Can you please re-test it on x32? Comparing with my patch, which only allows DImode and SImode, it caused the following regressions: FAIL: libgomp.fortran/omp_atomic1.f90 -O1 execution test FAIL: libgomp.fortran/omp_atomic1.f90 -O2 execution test FAIL: libgomp.fortran/omp_atomic1.f90 -O3 -fomit-frame-pointer execution test FAIL: libgomp.fortran/omp_atomic1.f90 -O3 -fomit-frame-pointer -funroll-all-loops -finline-functions execution test FAIL: libgomp.fortran/omp_atomic1.f90 -O3 -fomit-frame-pointer -funroll-loops execution test FAIL: libgomp.fortran/omp_atomic1.f90 -O3 -g execution test FAIL: libgomp.fortran/omp_atomic1.f90 -Os execution test BTW: I still think that template should return the same address structure as expansion, but this won't crash the compiler anymore. There is no non-DImode addresses in insn stream, so I doubt the bug is due to my change. Uros.
Re: PATCH [6/n] X32: Supprot 32bit address
On Tue, Jul 19, 2011 at 4:42 PM, H.J. Lu hjl.to...@gmail.com wrote: On Tue, Jul 19, 2011 at 7:04 AM, Uros Bizjak ubiz...@gmail.com wrote: On Tue, Jul 19, 2011 at 3:47 PM, H.J. Lu hjl.to...@gmail.com wrote: Attached patch simply removes these two checks, as it seems they are not needed. This also follows how other Pmode != ptr_mode targets. 2011-07-19 Uros Bizjak ubiz...@gmail.com PR target/49780 * config/i386/i386.c (ix86_legitimate_address_p): Remove checks that base and index registers are in Pmode. Patch was bootstrapped and regression tested on x86_64-pc-linux-gnu {,-m32}. Can you please re-test it on x32? Comparing with my patch, which only allows DImode and SImode, it caused the following regressions: FAIL: libgomp.fortran/omp_atomic1.f90 -O1 execution test FAIL: libgomp.fortran/omp_atomic1.f90 -O2 execution test FAIL: libgomp.fortran/omp_atomic1.f90 -O3 -fomit-frame-pointer execution test FAIL: libgomp.fortran/omp_atomic1.f90 -O3 -fomit-frame-pointer -funroll-all-loops -finline-functions execution test FAIL: libgomp.fortran/omp_atomic1.f90 -O3 -fomit-frame-pointer -funroll-loops execution test FAIL: libgomp.fortran/omp_atomic1.f90 -O3 -g execution test FAIL: libgomp.fortran/omp_atomic1.f90 -Os execution test BTW: I still think that template should return the same address structure as expansion, but this won't crash the compiler anymore. There is no non-DImode addresses in insn stream, so I doubt the bug is due to my change. I saw the same failures on x86-64: http://gcc.gnu.org/ml/gcc-testresults/2011-07/msg02224.html Can you take a look? Sometimes, the compiler is really creative in inventing instructions: (insn 47 46 49 7 (set (reg:SI 68 [ D.1686 ]) (subreg:SI (plus:SF (reg:SF 159 [ D.1685 ]) (reg:SF 159 [ D.1685 ])) 0)) omp_atomic1.f90:17 247 {*lea_2} (expr_list:REG_DEAD (reg:SF 159 [ D.1685 ]) (nil))) Really funny. Uros.
Re: PATCH [6/n] X32: Supprot 32bit address
On Tue, Jul 19, 2011 at 6:30 PM, Jakub Jelinek ja...@redhat.com wrote: On Tue, Jul 19, 2011 at 06:26:33PM +0200, Uros Bizjak wrote: Sometimes, the compiler is really creative in inventing instructions: (insn 47 46 49 7 (set (reg:SI 68 [ D.1686 ]) (subreg:SI (plus:SF (reg:SF 159 [ D.1685 ]) (reg:SF 159 [ D.1685 ])) 0)) omp_atomic1.f90:17 247 {*lea_2} (expr_list:REG_DEAD (reg:SF 159 [ D.1685 ]) (nil))) Really funny. That's the job of combiner to try all kinds of stuff and it is the responsibility of the backend to reject those. I think it would be better to get back to testing Pmode in the legitimate address hook, perhaps allowing ptr_mode too in addition to Pmode (which for -m32/-m64 won't mean any change, just for -mx32). Actually, there is a bypass in ix86_decompose_address, and this RTX squeezed through. IMO constructs like this should be rejected in i_d_a, which effectively only moves Pmode/ptr_mode check here. I'm looking into it. Uros.
Re: PATCH [6/n] X32: Supprot 32bit address
On Tue, Jul 19, 2011 at 6:37 PM, Uros Bizjak ubiz...@gmail.com wrote: Sometimes, the compiler is really creative in inventing instructions: (insn 47 46 49 7 (set (reg:SI 68 [ D.1686 ]) (subreg:SI (plus:SF (reg:SF 159 [ D.1685 ]) (reg:SF 159 [ D.1685 ])) 0)) omp_atomic1.f90:17 247 {*lea_2} (expr_list:REG_DEAD (reg:SF 159 [ D.1685 ]) (nil))) Really funny. That's the job of combiner to try all kinds of stuff and it is the responsibility of the backend to reject those. I think it would be better to get back to testing Pmode in the legitimate address hook, perhaps allowing ptr_mode too in addition to Pmode (which for -m32/-m64 won't mean any change, just for -mx32). Actually, there is a bypass in ix86_decompose_address, and this RTX squeezed through. IMO constructs like this should be rejected in i_d_a, which effectively only moves Pmode/ptr_mode check here. I'm looking into it. The problem was in fact the declaration of no_seg_address_operand predicate that was defined as special predicate and this way ignoring the mode of the operand. Attached patch also includes check for DImode SUBREGS for base register, to eventually save x32 some trouble in future. I'm currently regression testing the patch added to the patch that removed Pmode checks. H.J., can you please test it on x32? Uros. Index: predicates.md === --- predicates.md (revision 176462) +++ predicates.md (working copy) @@ -796,7 +796,7 @@ ;; Return true if op if a valid address, and does not contain ;; a segment override. 
-(define_special_predicate no_seg_address_operand +(define_predicate no_seg_address_operand (match_operand 0 address_operand) { struct ix86_address parts; Index: i386.c === --- i386.c (revision 176462) +++ i386.c (working copy) @@ -11085,8 +11085,16 @@ ix86_decompose_address (rtx addr, struct int retval = 1; enum ix86_address_seg seg = SEG_DEFAULT; - if (REG_P (addr) || GET_CODE (addr) == SUBREG) + if (REG_P (addr)) base = addr; + else if (GET_CODE (addr) == SUBREG) +{ + /* Allow only subregs of DImode hard regs. */ + if (register_no_elim_operand (SUBREG_REG (addr), DImode)) + base = addr; + else + return 0; +} else if (GET_CODE (addr) == PLUS) { rtx addends[4], op;
Re: PATCH [6/n] X32: Supprot 32bit address
On Tue, Jul 19, 2011 at 7:33 PM, Uros Bizjak ubiz...@gmail.com wrote: Sometimes, the compiler is really creative in inventing instructions: (insn 47 46 49 7 (set (reg:SI 68 [ D.1686 ]) (subreg:SI (plus:SF (reg:SF 159 [ D.1685 ]) (reg:SF 159 [ D.1685 ])) 0)) omp_atomic1.f90:17 247 {*lea_2} (expr_list:REG_DEAD (reg:SF 159 [ D.1685 ]) (nil))) Really funny. That's the job of combiner to try all kinds of stuff and it is the responsibility of the backend to reject those. I think it would be better to get back to testing Pmode in the legitimate address hook, perhaps allowing ptr_mode too in addition to Pmode (which for -m32/-m64 won't mean any change, just for -mx32). Actually, there is a bypass in ix86_decompose_address, and this RTX squeezed through. IMO constructs like this should be rejected in i_d_a, which effectively only moves Pmode/ptr_mode check here. I'm looking into it. The problem was in fact the declaration of no_seg_address_operand predicate that was defined as special predicate and this way ignoring the mode of the operand. This change should be backported to 4.6 and 4.5. Uros.
Re: PATCH [8/n] X32: Convert to Pmode if needed
On Tue, Jul 19, 2011 at 6:47 AM, H.J. Lu hongjiu...@intel.com wrote:

This patch adds the missing Pmode check and conversion. OK for trunk?

2011-07-18 H.J. Lu hongjiu...@intel.com

* config/i386/i386.c (ix86_legitimize_address): Convert to Pmode if needed.
(ix86_expand_move): Likewise.
(ix86_expand_call): Likewise.
(ix86_expand_special_args_builtin): Likewise.
(ix86_expand_builtin): Likewise.

copy_addr_to_reg?

Uros.
Re: PATCH [6/n] X32: Supprot 32bit address
On Tue, Jul 19, 2011 at 6:30 PM, Jakub Jelinek ja...@redhat.com wrote: Sometimes, the compiler is really creative in inventing instructions: (insn 47 46 49 7 (set (reg:SI 68 [ D.1686 ]) (subreg:SI (plus:SF (reg:SF 159 [ D.1685 ]) (reg:SF 159 [ D.1685 ])) 0)) omp_atomic1.f90:17 247 {*lea_2} (expr_list:REG_DEAD (reg:SF 159 [ D.1685 ]) (nil))) Really funny. That's the job of combiner to try all kinds of stuff and it is the responsibility of the backend to reject those. I think it would be better to get back to testing Pmode in the legitimate address hook, perhaps allowing ptr_mode too in addition to Pmode (which for -m32/-m64 won't mean any change, just for -mx32). I agree that we still need to check naked registers. However, for 64bit targets it is OK to pass both, SImode and DImode registers. We are sure that SImode values in DImode regs have top 32bits equal to 0 in address calculations. This is not true for QImode regs (assignment to lowpart only). We also have to prevent non-integer registers. Attached is my final version of the patch. Uros. Index: predicates.md === --- predicates.md (revision 176462) +++ predicates.md (working copy) @@ -796,7 +796,7 @@ ;; Return true if op if a valid address, and does not contain ;; a segment override. -(define_special_predicate no_seg_address_operand +(define_predicate no_seg_address_operand (match_operand 0 address_operand) { struct ix86_address parts; Index: i386.c === --- i386.c (revision 176462) +++ i386.c (working copy) @@ -11085,8 +11085,16 @@ ix86_decompose_address (rtx addr, struct int retval = 1; enum ix86_address_seg seg = SEG_DEFAULT; - if (REG_P (addr) || GET_CODE (addr) == SUBREG) + if (REG_P (addr)) base = addr; + else if (GET_CODE (addr) == SUBREG) +{ + /* Allow only subregs of DImode hard regs. 
*/
+ if (register_no_elim_operand (SUBREG_REG (addr), DImode))
+   base = addr;
+ else
+   return 0;
+}
 else if (GET_CODE (addr) == PLUS)
 {
   rtx addends[4], op;
@@ -11643,8 +11651,7 @@ ix86_legitimate_address_p (enum machine_
   /* Base is not a register. */
   return false;

- if (GET_MODE (base) != Pmode)
-   /* Base is not in Pmode. */
+ if (GET_MODE (base) != SImode && GET_MODE (base) != DImode)
   return false;

 if ((strict && ! REG_OK_FOR_BASE_STRICT_P (reg))
@@ -11672,8 +11679,7 @@ ix86_legitimate_address_p (enum machine_
   /* Index is not a register. */
   return false;

- if (GET_MODE (index) != Pmode)
-   /* Index is not in Pmode. */
+ if (GET_MODE (index) != SImode && GET_MODE (index) != DImode)
   return false;

 if ((strict && ! REG_OK_FOR_INDEX_STRICT_P (reg))
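The reasoning behind accepting SImode and DImode but not QImode registers can be illustrated with a small stand-alone C model. It is a sketch, not GCC code: toy_mode, toy_addr_reg_mode_ok, toy_write32, toy_write8 and toy_self_check are hypothetical names. The register model relies only on the x86-64 rule that a 32-bit register write clears the upper 32 bits, while an 8-bit write leaves the remaining bits untouched.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for a few machine modes. */
enum toy_mode { TOY_QImode, TOY_HImode, TOY_SImode, TOY_DImode, TOY_SFmode };

/* Mirror of the patched check: base and index registers must be
   SImode or DImode. */
static bool toy_addr_reg_mode_ok(enum toy_mode mode)
{
  return mode == TOY_SImode || mode == TOY_DImode;
}

/* Model of a 32-bit write to a 64-bit register: the old upper half
   is discarded and the result is zero-extended. */
static uint64_t toy_write32(uint64_t old, uint32_t value)
{
  (void) old;
  return (uint64_t) value;
}

/* Model of an 8-bit write to the low byte: the other 56 bits keep
   whatever stale value they had. */
static uint64_t toy_write8(uint64_t old, uint8_t value)
{
  return (old & ~UINT64_C(0xff)) | value;
}

/* Small self-check showing why only the 32-bit case is safe to use
   directly in a 64-bit address calculation. */
static void toy_self_check(void)
{
  const uint64_t stale = UINT64_C(0xdeadbeef00000000);

  assert(toy_write32(stale, 0x1000) == UINT64_C(0x1000));
  assert(toy_write8(stale, 0x10) == UINT64_C(0xdeadbeef00000010));
  assert(toy_addr_reg_mode_ok(TOY_SImode));
  assert(!toy_addr_reg_mode_ok(TOY_QImode));
  assert(!toy_addr_reg_mode_ok(TOY_SFmode));
}
```

The SFmode case corresponds to the omp_atomic1.f90 insn quoted earlier in the thread, where a floating-point PLUS was wrapped in an SImode subreg and matched a lea pattern.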
Re: [PATCH, testsuite] Fix for PR47440 - Use LCM for vzeroupper insertion
Hello!

* a/gcc/gcse.c (alloc_gcse_mem): Added code to run in PRE2.

And this is necessary because...??? Why not just make it a separate pass in ix86-reorg that uses LCM? Look at mode switching for an example.

I was also expecting that vzeroupper would be inserted in the same way as the i387 mode switching instructions are inserted. To expand on Steven's suggestion, please see i386.h for OPTIMIZE_MODE_SWITCHING and the macros that follow it.

At the moment, there are four separate entities that handle four independent mode switching insertions for x87, one for each mode of the fistp or frndint instruction. Mode insertion will actually insert calculations of the x87 control word (CW) at optimal points and push this new CW (together with the old CW) to a known stack slot, to be consumed by the fistp/frndint insn.

You can add a new entity to enum ix86_entity (say, AVX_VZEROUPPER) and update OPTIMIZE_MODE_SWITCHING to perform mode insertion for the AVX_VZEROUPPER entity when needed. The various modes for AVX_VZEROUPPER are defined in NUM_MODES_FOR_MODE_SWITCHING, mode transitions in MODE_NEEDED and insn insertions in EMIT_MODE_SET. Please note that LCM handles all entities in parallel, so there is no need for extra passes.

The real worker for mode switching is ix86_mode_needed, but don't forget that you can disable the mode switching pass per function, when it is not needed, through the OPTIMIZE_MODE_SWITCHING macro.

FYI: The existing x87 CW initialization insertion works this way:

- fistp/frndint is inserted into the insn stream and the corresponding OPTIMIZE_MODE_SWITCHING flag is set.
- The inserted insn has an i386_cw attribute that defines the requested mode in which the insn operates. Based on this attribute, MODE_NEEDED handles mode transitions for each entity (please note that there are four independent entities).
- EMIT_MODE_SET emits CW initializations. These are further optimized by follow-up optimization passes, so two consecutive initializations at the same place are CSEd, etc.

Uros.
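The basic idea of "each insn states the mode it needs, and a mode set is emitted only where the mode changes" can be sketched in a few lines of C. This is a deliberately simplified, straight-line model and not GCC's LCM-based pass: it handles one hypothetical entity with several modes instead of four independent ones, ignores control flow, and every name in it (toy_cw_mode, toy_count_mode_sets) is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical modes of a single mode-switching entity.  TOY_CW_ANY
   means "this insn does not care which mode is active". */
enum toy_cw_mode { TOY_CW_ANY = -1, TOY_CW_TRUNC, TOY_CW_FLOOR, TOY_CW_CEIL };

/* Walk a straight-line sequence of insns, where NEEDED[i] is the mode
   insn i requires, and count how many mode sets would have to be
   emitted.  The mode on entry is treated as unknown. */
static size_t toy_count_mode_sets(const enum toy_cw_mode *needed, size_t n)
{
  enum toy_cw_mode current = TOY_CW_ANY;
  size_t sets = 0;

  assert(needed != NULL || n == 0);
  for (size_t i = 0; i < n; i++) {
    if (needed[i] == TOY_CW_ANY)
      continue;                 /* insn works in any mode */
    if (needed[i] != current) {
      sets++;                   /* a mode set would be emitted here */
      current = needed[i];
    }
  }
  return sets;
}
```

Two consecutive insns that need the same mode share one mode set in this model, which is the property that makes the approach attractive for vzeroupper insertion as well.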
Re: PATCH [7/n] X32: Handle address output and calls patterns
On Wed, Jul 20, 2011 at 4:51 AM, H.J. Lu hjl.to...@gmail.com wrote:

I had it in my x32 tree. But I reverted: http://gcc.gnu.org/ml/gcc-patches/2011-02/msg00954.html since Pmode is used in non-PIC tablejump, we have to put 64bit value for labels with 0 upper 32bits in tablejump for x32. The mode is completely controlled by CASE_VECTOR_MODE.

Here is the updated patch. OK for trunk?

A small change. It always uses a 64bit register for indirect branch.

- ix86_print_operand (file, x, 0);
+ /* Always use 64bit register for indirect branch. */
+ ix86_print_operand (file, x,
+                     REG_P (x) && TARGET_64BIT ? 'q' : 0);
  return;

/* Always use 64bit register for indirect branch. */
if (REG_P (x) && TARGET_64BIT)
  print_reg (x, 'q', file);
else
  ix86_print_operand (file, x, 0);

(define_insn *indirect_jump
- [(set (pc) (match_operand:P 0 nonimmediate_operand rm))]
+ [(set (pc) (match_operand:P 0 x32_indirect_branch_operand rm))]

Just name it indirect_branch_operand.

(define_insn_and_split *call_vzeroupper
- [(call (mem:QI (match_operand:P 0 call_insn_operand czm))
+ [(call (mem:QI (match_operand:P 0 x32_call_insn_operand czm))

Don't introduce a new predicate; change call_insn_operand instead to conditionally disable memory_operand on x32. You will need to change the czm register constraint to cz on x32, otherwise you will get ICEs. And i386.c also calls call_insn_operand in one place.

Uros.
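The effect of "always use 64bit register for indirect branch" on the emitted operand can be shown with a tiny stand-alone C sketch. It is only a model: toy_names32, toy_names64 and toy_print_branch_reg are hypothetical, only four registers are covered (in the ax, dx, cx, bx numbering seen in the RTL dumps in this thread), and the real ix86_print_operand does considerably more.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical name tables for the first four integer registers. */
static const char *const toy_names32[] = { "eax", "edx", "ecx", "ebx" };
static const char *const toy_names64[] = { "rax", "rdx", "rcx", "rbx" };

/* Write the AT&T-syntax operand of an indirect branch through register
   REGNO into BUF.  On a 64-bit target the full 64-bit register name is
   used even if the value held is only 32 bits wide, as on x32.
   Returns the snprintf result. */
static int toy_print_branch_reg(char *buf, size_t len, unsigned regno,
                                bool target_64bit)
{
  assert(regno < sizeof toy_names32 / sizeof toy_names32[0]);
  return snprintf(buf, len, "*%%%s",
                  (target_64bit ? toy_names64 : toy_names32)[regno]);
}
```

With target_64bit set, register 0 is written as *%rax, so the branch uses the whole register; without it the same register is written as *%eax.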
Re: PATCH [7/n] X32: Handle address output and calls patterns
On Wed, Jul 20, 2011 at 9:53 AM, Uros Bizjak ubiz...@gmail.com wrote: since Pmode is used in non-PIC tablejump, we have to put 64bit value for labels with 0 upper 32bits in tablejump for x32. The mode is completely controled by CASE_VECTOR_MODE. Here is the updated patch. OK for trunk? A small change. It always use 64bit register for indirect branch. - ix86_print_operand (file, x, 0); + /* Always use 64bit register for indirect branch. */ + ix86_print_operand (file, x, + REG_P (x) TARGET_64BIT ? 'q' : 0); return; /* Always use 64bit register for indirect branch. */ if (REG_P (x) TARGET_64BIT) print_reg (x, 'q', file); else ix86_print_operand (file, x, 0); (define_insn *indirect_jump - [(set (pc) (match_operand:P 0 nonimmediate_operand rm))] + [(set (pc) (match_operand:P 0 x32_indirect_branch_operand rm))] Just name it indirect_branch_operand. (define_insn_and_split *call_vzeroupper - [(call (mem:QI (match_operand:P 0 call_insn_operand czm)) + [(call (mem:QI (match_operand:P 0 x32_call_insn_operand czm)) Don't introduce new predicate, change call_insn_operand instead to conditionally disable memory_operand on x32. You will need to change czm register constraint to cz on x32, otherwise you will get ICEs. Use new constraint here, something like (untested): Index: constraints.md === --- constraints.md (revision 176494) +++ constraints.md (working copy) @@ -127,6 +127,11 @@ @internal Constant call address operand. (match_operand 0 constant_call_address_operand)) +(define_constraint w + @internal Call memory operand. + (and (match_test !TARGET_X32) + (match_operand 0 memory_operand)) + ;; Integer constant constraints. (define_constraint I Integer constant in the range 0 @dots{} 31, for 32-bit shifts. Uros.
Re: PATCH [6/n] X32: Support 32bit address
On Wed, Jul 20, 2011 at 2:54 PM, H.J. Lu hjl.to...@gmail.com wrote:

Sometimes, the compiler is really creative in inventing instructions:

(insn 47 46 49 7 (set (reg:SI 68 [ D.1686 ])
        (subreg:SI (plus:SF (reg:SF 159 [ D.1685 ])
                (reg:SF 159 [ D.1685 ])) 0)) omp_atomic1.f90:17 247 {*lea_2}
     (expr_list:REG_DEAD (reg:SF 159 [ D.1685 ])
        (nil)))

Really funny.

That's the job of combiner to try all kinds of stuff and it is the responsibility of the backend to reject those. I think it would be better to get back to testing Pmode in the legitimate address hook, perhaps allowing ptr_mode too in addition to Pmode (which for -m32/-m64 won't mean any change, just for -mx32).

I agree that we still need to check naked registers. However, for 64bit targets it is OK to pass both SImode and DImode registers. We are sure that SImode values in DImode regs have top 32bits equal to 0 in address calculations. This is not true for QImode regs (assignment to lowpart only). We also have to prevent non-integer registers.

Attached is my final version of the patch. It works fine. Can you check it in?

Tested on x86_64-pc-linux-gnu {,-m32} and committed to mainline SVN with following ChangeLog:

2011-07-20  Uros Bizjak  ubiz...@gmail.com

    PR target/49780
    * config/i386/predicates.md (no_seg_address_operand): No more special.
    * config/i386/i386.c (ix86_decompose_address): Allow only subregs of DImode hard registers in base.
    (ix86_legitimate_address_p): Allow SImode and DImode base and index registers.

Uros.
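The SImode versus QImode distinction made in this message can be sketched in plain C. This is only a model of the x86-64 register-write rules, not GCC code; the helper names write_si and write_qi are made up for illustration.

```c
#include <stdint.h>

/* Model of a 64-bit hardware register.  On x86-64, a write to the
   32-bit part of a register clears bits 63..32, while a write to the
   8-bit low part leaves bits 63..8 untouched.  */

/* Hypothetical helper: model a 32-bit (SImode) register write.  */
static uint64_t
write_si (uint64_t old_reg, uint32_t value)
{
  (void) old_reg;               /* The old contents are discarded.  */
  return (uint64_t) value;      /* Upper 32 bits become zero.  */
}

/* Hypothetical helper: model an 8-bit (QImode) low-part write.  */
static uint64_t
write_qi (uint64_t old_reg, uint8_t value)
{
  /* Only the low byte changes; stale upper bits survive.  */
  return (old_reg & ~(uint64_t) 0xff) | value;
}
```

Under this model a register last written in SImode can be used directly as a 64-bit base or index, whereas one last written in QImode may still carry unrelated upper bits.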
Re: PATCH [7/n] X32: Handle address output and calls patterns
On Wed, Jul 20, 2011 at 3:18 PM, H.J. Lu hjl.to...@gmail.com wrote:
On Wed, Jul 20, 2011 at 1:19 AM, Uros Bizjak ubiz...@gmail.com wrote:
On Wed, Jul 20, 2011 at 9:53 AM, Uros Bizjak ubiz...@gmail.com wrote:

since Pmode is used in non-PIC tablejump, we have to put 64bit value for labels with 0 upper 32bits in tablejump for x32. The mode is completely controlled by CASE_VECTOR_MODE.

Here is the updated patch. OK for trunk?

A small change. It always uses a 64bit register for indirect branch.

-  ix86_print_operand (file, x, 0);
+  /* Always use 64bit register for indirect branch.  */
+  ix86_print_operand (file, x,
+                      REG_P (x) && TARGET_64BIT ? 'q' : 0);
   return;

  /* Always use 64bit register for indirect branch.  */
  if (REG_P (x) && TARGET_64BIT)
    print_reg (x, 'q', file);
  else
    ix86_print_operand (file, x, 0);

 (define_insn "*indirect_jump"
-  [(set (pc) (match_operand:P 0 "nonimmediate_operand" "rm"))]
+  [(set (pc) (match_operand:P 0 "x32_indirect_branch_operand" "rm"))]

Just name it "indirect_branch_operand".

 (define_insn_and_split "*call_vzeroupper"
-  [(call (mem:QI (match_operand:P 0 "call_insn_operand" "czm"))
+  [(call (mem:QI (match_operand:P 0 "x32_call_insn_operand" "czm"))

Don't introduce new predicate, change call_insn_operand instead to conditionally disable memory_operand on x32. You will need to change "czm" register constraint to "cz" on x32, otherwise you will get ICEs.

Use new constraint here, something like (untested):

Index: constraints.md
===================================================================
--- constraints.md (revision 176494)
+++ constraints.md (working copy)
@@ -127,6 +127,11 @@
   "@internal Constant call address operand."
   (match_operand 0 "constant_call_address_operand"))
 
+(define_constraint "w"
+  "@internal Call memory operand."
+  (and (match_test "!TARGET_X32")
+       (match_operand 0 "memory_operand")))
+
 ;; Integer constant constraints.
 (define_constraint "I"
   "Integer constant in the range 0 @dots{} 31, for 32-bit shifts."

Uros.

Here is the updated patch. OK for trunk?

Thanks.

--
H.J.
---
2011-07-20  H.J. Lu  hongjiu...@intel.com
            Uros Bizjak  ubiz...@gmail.com

    * config/i386/constraints.md (w): New.
    * config/i386/i386.c (ix86_print_operand): Always use 64bit register for indirect branch.
    (ix86_output_addr_vec_elt): Check TARGET_LP64 instead of TARGET_64BIT for ASM_QUAD.
    * config/i386/i386.h (CASE_VECTOR_MODE): Check TARGET_LP64 instead of TARGET_64BIT.
    * config/i386/i386.md (*indirect_jump): Replace nonimmediate_operand with indirect_branch_operand.
    (*tablejump_1): Likewise.
    (*call_vzeroupper): Replace constraint m with w.
    (*call): Likewise.
    (*call_rex64_ms_sysv_vzeroupper): Likewise.
    (*call_rex64_ms_sysv): Likewise.
    (*call_value_vzeroupper): Likewise.
    (*call_value): Likewise.
    (*call_value_rex64_ms_sysv_vzeroupper): Likewise.
    (*call_value_rex64_ms_sysv): Likewise.
    (*tablejump_1_x32): New.
    (set_got_offset_rex64): Check TARGET_LP64 instead of TARGET_64BIT.
    * config/i386/predicates.md (indirect_branch_operand): New.
    (call_insn_operand): Support x32.

+
+(define_insn "*tablejump_1_x32"
+  [(set (pc) (match_operand:SI 0 "register_operand" "r"))
+   (use (label_ref (match_operand 1 "" "")))]
+  "TARGET_X32"
+  "jmp\t%A0"
+  [(set_attr "type" "ibr")
+   (set_attr "length_immediate" "0")])

This pattern should include zero_extend from operand 0. Please fix the tablejump expander to generate correct pattern.

Also, indirect jump needs to generate zero_extend from SImode register for x32.

Other than that, the patch looks OK to me. Please also wait for rth's approval.

Thanks,
Uros.
Re: PATCH [7/n] X32: Handle address output and calls patterns
On Wed, Jul 20, 2011 at 4:09 PM, H.J. Lu hjl.to...@gmail.com wrote:

Hello!

+(define_insn "*tablejump_1_x32"
+  [(set (pc) (match_operand:SI 0 "register_operand" "r"))
+   (use (label_ref (match_operand 1 "" "")))]
+  "TARGET_X32"
+  "jmp\t%A0"
+  [(set_attr "type" "ibr")
+   (set_attr "length_immediate" "0")])

This pattern should include zero_extend from operand 0. Please fix the tablejump expander to generate correct pattern.

Also, indirect jump needs to generate zero_extend from SImode register for x32.

I am testing this patch on top of the last one. We don't need to zero-extend indirect jump since it takes operand 0 in Pmode, which is DImode.

Looks good to me, but please wait for rth's approval.

Thanks,
Uros.
[PATCH, i386]: Allow subregs of multi-word values in addresses
On Wed, Jul 20, 2011 at 9:46 PM, Uros Bizjak ubiz...@gmail.com wrote:

Note that SUBREG_PROMOTED_UNSIGNED_P wasn't designed for paradoxical subregs, but for regular subregs (typically of word-sized objects). You should check that the ones created for x32 (because of POINTERS_EXTEND_UNSIGNED I guess) are legitimate.

I have left out paradoxical subreg stuff ATM and committed following patch that allows subregs of multi-word values in addresses.

2011-07-20  Uros Bizjak  ubiz...@gmail.com

    * config/i386/i386.c (ix86_decompose_address): Allow only subregs of DImode hard registers in index.
    (ix86_legitimate_address_p): Allow subregs of base and index to span more than a word. Assert that subregs of base and index satisfy register_no_elim_operand predicates. Reject addresses where base and index have different modes.

Patch was bootstrapped and regression tested on x86_64-pc-linux-gnu {,-m32} and committed to mainline SVN.

(I will prepare a followup [RFC] patch that also allows paradoxical (?) subregs for experimenting and testing on x32 target).

Uros.

Index: i386.c
===================================================================
--- i386.c (revision 176533)
+++ i386.c (working copy)
@@ -11197,6 +11197,16 @@ ix86_decompose_address (rtx addr, struct
   else
     disp = addr;                       /* displacement */
 
+  if (index)
+    {
+      if (REG_P (index))
+        ;
+      /* Allow only subregs of DImode hard regs.  */
+      else if (GET_CODE (index) == SUBREG
+               && !register_no_elim_operand (SUBREG_REG (index), DImode))
+        return 0;
+    }
+
   /* Extract the integral value of scale.  */
   if (scale_rtx)
     {
@@ -11630,23 +11640,18 @@ ix86_legitimate_address_p (enum machine_
   disp = parts.disp;
   scale = parts.scale;
 
-  /* Validate base register.
-
-     Don't allow SUBREG's that span more than a word here.  It can lead to spill
-     failures when the base is one word out of a two word structure, which is
-     represented internally as a DImode int.  */
-
+  /* Validate base register.  */
   if (base)
     {
       rtx reg;
 
       if (REG_P (base))
         reg = base;
-      else if (GET_CODE (base) == SUBREG
-               && REG_P (SUBREG_REG (base))
-               && GET_MODE_SIZE (GET_MODE (SUBREG_REG (base)))
-                  <= UNITS_PER_WORD)
-        reg = SUBREG_REG (base);
+      else if (GET_CODE (base) == SUBREG && REG_P (SUBREG_REG (base)))
+        {
+          reg = SUBREG_REG (base);
+          gcc_assert (register_no_elim_operand (reg, DImode));
+        }
       else
         /* Base is not a register.  */
         return false;
@@ -11660,21 +11665,18 @@ ix86_legitimate_address_p (enum machine_
         return false;
     }
 
-  /* Validate index register.
-
-     Don't allow SUBREG's that span more than a word here -- same as above.  */
-
+  /* Validate index register.  */
   if (index)
     {
       rtx reg;
 
       if (REG_P (index))
         reg = index;
-      else if (GET_CODE (index) == SUBREG
-               && REG_P (SUBREG_REG (index))
-               && GET_MODE_SIZE (GET_MODE (SUBREG_REG (index)))
-                  <= UNITS_PER_WORD)
-        reg = SUBREG_REG (index);
+      else if (GET_CODE (index) == SUBREG && REG_P (SUBREG_REG (index)))
+        {
+          reg = SUBREG_REG (index);
+          gcc_assert (register_no_elim_operand (reg, DImode));
+        }
       else
         /* Index is not a register.  */
         return false;
@@ -11688,6 +11690,11 @@ ix86_legitimate_address_p (enum machine_
         return false;
     }
 
+  /* Index and base should have the same mode.  */
+  if (base && index
+      && GET_MODE (base) != GET_MODE (index))
+    return false;
+
   /* Validate scale factor.  */
   if (scale != 1)
     {
Re: PATCH [8/n] X32: Convert to Pmode if needed
On Tue, Jul 19, 2011 at 6:47 AM, H.J. Lu hongjiu...@intel.com wrote:

So, since copy_to_reg & co. expects x in Pmode or VOIDmode constant (due to force_reg that won't do mode conversion), we have to implement them with a mode conversion...

This patch adds the missing Pmode check and conversion. OK for trunk?

2011-07-18  H.J. Lu  hongjiu...@intel.com

    * config/i386/i386.c (ix86_legitimize_address): Convert to Pmode if needed.
    (ix86_expand_move): Likewise.
    (ix86_expand_call): Likewise.
    (ix86_expand_special_args_builtin): Likewise.
    (ix86_expand_builtin): Likewise.

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index c268899..1ed451b 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -12618,7 +12667,11 @@ ix86_legitimize_address (rtx x, rtx oldx ATTRIBUTE_UNUSED,
       rtx temp = gen_reg_rtx (Pmode);
       rtx val = force_operand (XEXP (x, 1), temp);
       if (val != temp)
-        emit_move_insn (temp, val);
+        {
+          if (GET_MODE (val) != Pmode)
+            val = convert_to_mode (Pmode, val, 1);
+          emit_move_insn (temp, val);
+        }
 
       XEXP (x, 1) = temp;
       return x;

OK.

@@ -12629,7 +12682,11 @@ ix86_legitimize_address (rtx x, rtx oldx ATTRIBUTE_UNUSED,
       rtx temp = gen_reg_rtx (Pmode);
       rtx val = force_operand (XEXP (x, 0), temp);
       if (val != temp)
-        emit_move_insn (temp, val);
+        {
+          if (GET_MODE (val) != Pmode)
+            val = convert_to_mode (Pmode, val, 1);
+          emit_move_insn (temp, val);
+        }

OK.

@@ -14956,6 +15023,8 @@ ix86_expand_move (enum machine_mode mode, rtx operands[])
       if (model)
         {
           op1 = legitimize_tls_address (op1, model, true);
+          if (GET_MODE (op1) != mode)
+            op1 = convert_to_mode (mode, op1, 1);
           op1 = force_operand (op1, op0);
           if (op1 == op0)
             return;

Please write this part in the same form as above two changes. This way, force_operand will emit instructions in narrower mode (i.e. SImode, not in DImode).

@@ -21475,7 +21554,10 @@ ix86_expand_call (rtx retval, rtx fnaddr, rtx callarg1,
       ? !sibcall_insn_operand (XEXP (fnaddr, 0), Pmode)
       : !call_insn_operand (XEXP (fnaddr, 0), Pmode))
     {
-      fnaddr = copy_to_mode_reg (Pmode, XEXP (fnaddr, 0));
+      fnaddr = XEXP (fnaddr, 0);
+      if (GET_MODE (fnaddr) != Pmode)
+        fnaddr = convert_to_mode (Pmode, fnaddr, 1);
+      fnaddr = copy_to_mode_reg (Pmode, fnaddr);
       fnaddr = gen_rtx_MEM (QImode, fnaddr);
     }

Use force_reg (Pmode, ...) instead of copy_to_mode_reg (Pmode, ...). We know we have Pmode here. No need to copy a register if convert_to_mode returned a register. Better yet:

  fnaddr = gen_rtx_MEM (QImode, force_reg (Pmode, fnaddr));

@@ -26700,7 +26782,11 @@ ix86_expand_special_args_builtin (const struct builtin_description *d,
       op = expand_normal (arg);
       gcc_assert (target == 0);
       if (memory)
-        target = gen_rtx_MEM (tmode, copy_to_mode_reg (Pmode, op));
+        {
+          if (GET_MODE (op) != Pmode)
+            op = convert_to_mode (Pmode, op, 1);
+          target = gen_rtx_MEM (tmode, copy_to_mode_reg (Pmode, op));
+        }
       else
         target = force_reg (tmode, op);
       arg_adjust = 1;

Use force_reg.

@@ -26743,6 +26829,8 @@ ix86_expand_special_args_builtin (const struct builtin_description *d,
       if (i == memory)
         {
           /* This must be the memory operand.  */
+          if (GET_MODE (op) != Pmode)
+            op = convert_to_mode (Pmode, op, 1);
           op = gen_rtx_MEM (mode, copy_to_mode_reg (Pmode, op));
           gcc_assert (GET_MODE (op) == mode
                       || GET_MODE (op) == VOIDmode);

Same here.

@@ -26969,6 +27057,8 @@ ix86_expand_builtin (tree exp, rtx target, rtx subtarget ATTRIBUTE_UNUSED,
       mode1 = insn_data[icode].operand[1].mode;
       mode2 = insn_data[icode].operand[2].mode;
 
+      if (GET_MODE (op0) != Pmode)
+        op0 = convert_to_mode (Pmode, op0, 1);
       op0 = force_reg (Pmode, op0);
       op0 = gen_rtx_MEM (mode1, op0);

  op0 = gen_rtx_MEM (mode1, force_reg (Pmode, op0));

@@ -27001,7 +27091,11 @@ ix86_expand_builtin (tree exp, rtx target, rtx subtarget ATTRIBUTE_UNUSED,
       op0 = expand_normal (arg0);
       icode = CODE_FOR_sse2_clflush;
       if (!insn_data[icode].operand[0].predicate (op0, Pmode))
+        {
+          if (GET_MODE (op0) != Pmode)
+            op0 = convert_to_mode (Pmode, op0, 1);
           op0 = copy_to_mode_reg (Pmode, op0);
+        }
       emit_insn (gen_sse2_clflush (op0));
       return 0;

Use force_reg.

@@ -27014,7 +27108,11 @@ ix86_expand_builtin (tree exp, rtx target, rtx subtarget ATTRIBUTE_UNUSED,
       op1 = expand_normal (arg1);
[PATCH, i386]: Reject wrong RTXes from index early
Hello!

Just a small optimization, we can reject non-register RTXes and wrong subregs from index early. No functional change - these RTXes were rejected in ix86_legitimate_address_p anyway.

2011-07-21  Uros Bizjak  ubiz...@gmail.com

    * config/i386/i386.c (ix86_decompose_address): Reject all but register operands and DImode hard registers in index.

Tested on x86_64-pc-linux-gnu {,-m32}, committed to mainline SVN.

Uros.

Index: i386.c
===================================================================
--- i386.c (revision 176550)
+++ i386.c (working copy)
@@ -11203,7 +11203,9 @@
         ;
       /* Allow only subregs of DImode hard regs.  */
       else if (GET_CODE (index) == SUBREG
-               && !register_no_elim_operand (SUBREG_REG (index), DImode))
+               && register_no_elim_operand (SUBREG_REG (index), DImode))
+        ;
+      else
         return 0;
     }
[PATCH, testsuite]: Introduce check_avx_os_support_available
Hello!

This is the same functionality as recently added to glibc [1].

2011-07-21  Uros Bizjak  ubiz...@gmail.com

    * lib/target-supports.exp (check_avx_os_support_available): New.
    (check_effective_target_avx_runtime): Use it.

Tested on x86_64-pc-linux-gnu {,-m32} AVX and non-AVX targets, committed to mainline SVN. The patch will be backported to release branches.

[1] http://sourceware.org/git/?p=glibc.git;a=commitdiff;h=5644ef5461b5d3ff266206d8ee70d4b575ea6658

Uros.

Index: lib/target-supports.exp
===================================================================
--- lib/target-supports.exp (revision 176571)
+++ lib/target-supports.exp (working copy)
@@ -1070,8 +1070,8 @@
             check_runtime_nocache sse_os_support_available {
                 int main ()
                 {
-                    __asm__ volatile ("movaps %xmm0,%xmm0");
-                    return 0;
+                    asm volatile ("movaps %xmm0,%xmm0");
+                    return 0;
                 }
             } "-msse"
         } else {
@@ -1080,6 +1080,29 @@
     }]
 }
 
+# Return 1 if the target OS supports running AVX executables, 0
+# otherwise.  Cache the result.
+
+proc check_avx_os_support_available { } {
+    return [check_cached_effective_target avx_os_support_available {
+        # If this is not the right target then we can skip the test.
+        if { !([istarget x86_64-*-*] || [istarget i?86-*-*]) } {
+            expr 0
+        } else {
+            # Check that OS has AVX and SSE saving enabled.
+            check_runtime_nocache avx_os_support_available {
+                int main ()
+                {
+                    unsigned int eax, edx;
+
+                    asm ("xgetbv" : "=a" (eax), "=d" (edx) : "c" (0));
+                    return (eax & 6) != 6;
+                }
+            }
+        }
+    }]
+}
+
 # Return 1 if the target supports executing SSE instructions, 0
 # otherwise.  Cache the result.
 
@@ -1176,7 +1199,8 @@
 
 proc check_effective_target_avx_runtime { } {
     if { [check_effective_target_avx]
-         && [check_avx_hw_available] } {
+         && [check_avx_hw_available]
+         && [check_avx_os_support_available] } {
         return 1
     }
     return 0
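The xgetbv test in this patch comes down to checking two bits of XCR0: bit 1 (SSE state) and bit 2 (AVX state), which together give the mask 6. A minimal sketch of just that bit test as a pure C function follows. The name avx_os_state_enabled and the XSTATE_* macros are invented for illustration, and the value is passed in as a parameter so the sketch does not itself execute xgetbv (which requires OSXSAVE support).

```c
#include <stdint.h>

/* Bits of XCR0, as returned in EAX by xgetbv with ECX = 0.
   These macro names are illustrative, not taken from the patch.  */
#define XSTATE_SSE  (1u << 1)   /* XMM state is saved by the OS.  */
#define XSTATE_AVX  (1u << 2)   /* Upper YMM state is saved by the OS.  */

/* Return 1 if both SSE and AVX state saving are enabled, 0 otherwise.  */
static int
avx_os_state_enabled (uint32_t xcr0_low)
{
  const uint32_t need = XSTATE_SSE | XSTATE_AVX;   /* == 6 */
  return (xcr0_low & need) == need;
}
```

Note that the testsuite program returns the opposite sense from main (non-zero means the check failed), since a zero exit status is what signals success to check_runtime_nocache.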
Re: PATCH [9/n] X32: PR target/49798: Zero-extend symbol address to 64bit if needed
On Thu, Jul 21, 2011 at 6:28 PM, H.J. Lu hjl.to...@gmail.com wrote:

.quad symbol isn't really valid for 32bit.

Why not? We certainly know what value to put there.

x32 doesn't support 64bit relocation, like R_X86_64_64. In many cases, generating

    .long symbol
    .long 0

for

    .quad symbol

is wrong. Please see:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47446

for some examples.

Please also see:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49798#c12

on why I think this is middle-end/tree-optimization issue.

Uros.
Re: PATCH [9/n] X32: PR target/49798: Zero-extend symbol address to 64bit if needed
On Thu, Jul 21, 2011 at 7:24 PM, H.J. Lu hjl.to...@gmail.com wrote:
On Thu, Jul 21, 2011 at 10:04 AM, Uros Bizjak ubiz...@gmail.com wrote:
On Thu, Jul 21, 2011 at 6:28 PM, H.J. Lu hjl.to...@gmail.com wrote:

.quad symbol isn't really valid for 32bit.

Why not? We certainly know what value to put there.

x32 doesn't support 64bit relocation, like R_X86_64_64. In many cases, generating

    .long symbol
    .long 0

for

    .quad symbol

is wrong. Please see:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47446

for some examples.

Please also see:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49798#c12

on why I think this is middle-end/tree-optimization issue.

I still think it is a backend issue.

/* Represents viewing something of one type as being of a second type.
   This corresponds to an "Unchecked Conversion" in Ada and roughly to
   the idiom *(type2 *)&X in C.  The only operand is the value to be
   viewed as being of another type.  **It is undefined if the type of the
   input and of the expression have different sizes.** ...
DEFTREECODE (VIEW_CONVERT_EXPR, "view_convert_expr", tcc_reference, 1)

We have:

<bb 2>:
  D.2709_8 = VIEW_CONVERT_EXPR<double>();
  D.2702_1 = u.d;
  D.2704_3 = D.2702_1 == D.2709_8;
  D.2701_4 = (int) D.2704_3;
  return D.2701_4;

Where sizeof (double) = 64 and sizeof (ptr_type) = 32.

Uros.
Re: PATCH [9/n] X32: PR target/49798: Zero-extend symbol address to 64bit if needed
On Thu, Jul 21, 2011 at 6:42 PM, Richard Henderson r...@redhat.com wrote:
On 07/21/2011 09:28 AM, H.J. Lu wrote:
On Thu, Jul 21, 2011 at 9:23 AM, Richard Henderson r...@redhat.com wrote:
On 07/21/2011 09:20 AM, H.J. Lu wrote:

.quad symbol isn't really valid for 32bit.

Why not? We certainly know what value to put there.

x32 doesn't support 64bit relocation, like R_X86_64_64.

This being a self-fulfilling assertion, because you decided to disable that relocation. It *could* be supported. Easily.

IMO, it is OK to disable 64bit relocations, and the compiler is at fault here. Consider that something gets written to the "d" field (see example of PR49798). Reading a pointer from the "*m" field in DImode, we will get non-zero bits in the high 32 bits of the pointer. We have to access the pointer in SImode.

Uros.
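The argument about the union in PR49798 can be illustrated with a small integer model. This is a sketch under stated assumptions, not the PR's testcase: the 8-byte union slot is modeled as one uint64_t, the x32 pointer is assumed to overlay the low 32 bits (little-endian layout), and the helper names are hypothetical.

```c
#include <stdint.h>

/* Model of the 8-byte union slot, viewed as one 64-bit integer.
   An x32 pointer occupies only the low 32 bits.  */

/* SImode store of a pointer: the high half keeps whatever a previous
   write (e.g. of a double) left there.  */
static uint64_t
store_ptr32 (uint64_t slot, uint32_t ptr)
{
  return (slot & 0xffffffff00000000ULL) | ptr;
}

/* SImode load: yields only the pointer bits.  */
static uint32_t
load_ptr_si (uint64_t slot)
{
  return (uint32_t) slot;
}

/* DImode load: also picks up the stale high half.  */
static uint64_t
load_ptr_di (uint64_t slot)
{
  return slot;
}
```

For example, if the slot first held the double 1.0 (bit pattern 0x3ff0000000000000) and a pointer 0x08049000 is then stored, the SImode load returns 0x08049000 while the DImode load returns 0x3ff0000008049000, which is not a valid zero-extended address.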
Re: PATCH [9/n] X32: PR target/49798: Zero-extend symbol address to 64bit if needed
On Thu, Jul 21, 2011 at 10:00 PM, H.J. Lu hjl.to...@gmail.com wrote:

/* Represents viewing something of one type as being of a second type.
   This corresponds to an "Unchecked Conversion" in Ada and roughly to
   the idiom *(type2 *)&X in C.  The only operand is the value to be
   viewed as being of another type.  **It is undefined if the type of the
   input and of the expression have different sizes.** ...
DEFTREECODE (VIEW_CONVERT_EXPR, "view_convert_expr", tcc_reference, 1)

We have:

<bb 2>:
  D.2709_8 = VIEW_CONVERT_EXPR<double>();
  D.2702_1 = u.d;
  D.2704_3 = D.2702_1 == D.2709_8;
  D.2701_4 = (int) D.2704_3;
  return D.2701_4;

Where sizeof (double) = 64 and sizeof (ptr_type) = 32.

Are you sure that you used -mx32? I couldn't reproduce it. It looks like an x86 backend bug to me.

Hm, can't reproduce it anymore... x32 -O2 looks OK:

<bb 2>:
  v = {};
  v.m = ;
  D.2702_1 = u.d;
  D.2703_2 = v.d;
  D.2704_3 = D.2702_1 == D.2703_2;
  D.2701_4 = (int) D.2704_3;
  return D.2701_4;

}

Expand generates:

(insn 8 6 9 (set (reg:SI 68)
        (symbol_ref:SI () [flags 0x40] var_decl 0x7fccc360b140 )) pr49798.c:12 -1
     (nil))

(insn 9 8 10 (set (reg:DI 67)
        (zero_extend:DI (reg:SI 68))) pr49798.c:12 -1
     (nil))

I don't know if this is OK to be transformed to DImode load.

Uros.
Re: PATCH [9/n] X32: PR target/49798: Zero-extend symbol address to 64bit if needed
On Thu, Jul 21, 2011 at 10:22 PM, H.J. Lu hjl.to...@gmail.com wrote:

Expand generates:

(insn 8 6 9 (set (reg:SI 68)
        (symbol_ref:SI () [flags 0x40] var_decl 0x7fccc360b140 )) pr49798.c:12 -1
     (nil))

(insn 9 8 10 (set (reg:DI 67)
        (zero_extend:DI (reg:SI 68))) pr49798.c:12 -1
     (nil))

I don't know if this is OK to be transformed to DImode load.

I believe it is valid.

How is this situation handled in other targets? I don't see that any of other ptr_mode != Pmode targets define TARGET_ASM_INTEGER in the way you propose.

Uros.
[PATCH, testsuite]: Fix detection of ifunc support
Hello!

Revision 164725 [1] broke detection of ifunc support in the testsuite [2] due to an extra #endif without a matching #if in the test function. Attached patch fixes this up.

2011-07-21  Uros Bizjak  ubiz...@gmail.com

    * lib/target-supports.exp (check_ifunc_available): Fix test function.

The patch is tested on x86_64-pc-linux-gnu, but my toolchain does not support ifunc attribute. Can somebody please test it with ifunc support?

OTOH, the patch is kind of obvious, so OK for mainline?

[1] http://gcc.gnu.org/viewcvs?view=revision&revision=164725
[2] http://gcc.gnu.org/viewcvs/trunk/gcc/testsuite/lib/target-supports.exp?r1=164725&r2=164724&pathrev=164725

Uros.

Index: lib/target-supports.exp
===================================================================
--- lib/target-supports.exp (revision 176584)
+++ lib/target-supports.exp (working copy)
@@ -381,10 +381,8 @@
         set obj ifunc[pid].o
         verbose "check_ifunc_available compiling testfile $src" 2
         set f [open $src "w"]
-        puts $f "#endif"
         puts $f "#ifdef __cplusplus\nextern \"C\"\n#endif"
-        puts $f "void g() {}"
-        puts $f "void f() __attribute__((ifunc(\"g\")));"
+        puts $f "void g() {} f() __attribute__((ifunc(\"g\")));"
         close $f
         set lines [${tool}_target_compile $src $obj object ""]
         file delete $src
Re: [PATCH, testsuite]: Fix detection of ifunc support
On Thu, Jul 21, 2011 at 11:56 PM, Uros Bizjak ubiz...@gmail.com wrote:

Revision 164725 [1] broke detection of ifunc support in the testsuite [2] due to an extra #endif without a matching #if in the test function. Attached patch fixes this up.

Actually, we can use existing testsuite infrastructure to simplify the function substantially.

2011-07-21  Uros Bizjak  ubiz...@gmail.com

    * lib/target-supports.exp (check_ifunc_available): Rewrite.

The patch is tested on x86_64-pc-linux-gnu, but my toolchain does not support ifunc attribute. Can somebody please test it with ifunc support?

OK for mainline and 4.6?

Uros.

Index: lib/target-supports.exp
===================================================================
--- lib/target-supports.exp (revision 176584)
+++ lib/target-supports.exp (working copy)
@@ -361,45 +361,16 @@
     return $alias_available_saved
 }
 
-###
-# proc check_ifunc_available { }
-###
+# Returns 1 if the target supports ifunc, 0 otherwise.
 
-# Determine if the target toolchain supports the ifunc attribute.
-
-# Returns 1 if the target supports ifunc.  Returns 0 if the target
-# does not support ifunc.
-
 proc check_ifunc_available { } {
-    global ifunc_available_saved
-    global tool
-
-    if [info exists ifunc_available_saved] {
-        verbose "check_ifunc_available returning saved $ifunc_available_saved" 2
-    } else {
-        set src ifunc[pid].c
-        set obj ifunc[pid].o
-        verbose "check_ifunc_available compiling testfile $src" 2
-        set f [open $src "w"]
-        puts $f "#endif"
-        puts $f "#ifdef __cplusplus\nextern \"C\"\n#endif"
-        puts $f "void g() {}"
-        puts $f "void f() __attribute__((ifunc(\"g\")));"
-        close $f
-        set lines [${tool}_target_compile $src $obj object ""]
-        file delete $src
-        remote_file build delete $obj
-
-        if [string match "" $lines] then {
-            set ifunc_available_saved 1
-        } else {
-            set ifunc_available_saved 0
-        }
-
-        verbose "check_ifunc_available returning $ifunc_available_saved" 2
-    }
-
-    return $ifunc_available_saved
+    return [check_no_compiler_messages ifunc_available object {
+        #ifdef __cplusplus
+        extern "C"
+        #endif
+        void g() {}
+        f() __attribute__((ifunc("g")));
+    }]
 }
 
 # Returns true if --gc-sections is supported on the target.
[PATCH, build]: Enable default_gnu_indirect_function on x86_64-*-linux*
Hello!

Fixing ifunc test function in the testsuite uncovered a nasty screwup in config.gcc that prohibited usage of GNU indirect functions on x86_64-*-linux*. Fixed by mirroring i[34567]86-*-linux* setting.

2011-07-22  Uros Bizjak  ubiz...@gmail.com

    * config.gcc (i[34567]86-*-linux*): Set default_gnu_indirect_function to yes.

Bootstrapped and regression tested on x86_64-pc-linux-gnu {,-m32}. Committed to mainline, will commit to 4.6 after regression tests finish there.

Uros.

Index: config.gcc
===================================================================
--- config.gcc (revision 176624)
+++ config.gcc (working copy)
@@ -1327,8 +1327,10 @@
                  i386/x86-64.h i386/gnu-user64.h"
         case ${target} in
         x86_64-*-linux*)
-          tm_file="${tm_file} linux.h i386/linux64.h"
-          default_gnu_indirect_function=glibc-2011 ;;
+          tm_file="${tm_file} linux.h i386/linux64.h"
+          # Assume modern glibc
+          default_gnu_indirect_function=yes
+          ;;
         x86_64-*-kfreebsd*-gnu) tm_file="${tm_file} kfreebsd-gnu.h i386/kfreebsd-gnu64.h" ;;
         x86_64-*-knetbsd*-gnu) tm_file="${tm_file} knetbsd-gnu.h" ;;
         esac
Re: [PATCH, build]: Enable default_gnu_indirect_function on x86_64-*-linux*
On Fri, Jul 22, 2011 at 5:27 PM, Uros Bizjak ubiz...@gmail.com wrote:

Fixing ifunc test function in the testsuite uncovered a nasty screwup in config.gcc that prohibited usage of GNU indirect functions on x86_64-*-linux*. Fixed by mirroring i[34567]86-*-linux* setting.

2011-07-22  Uros Bizjak  ubiz...@gmail.com

    * config.gcc (i[34567]86-*-linux*): Set default_gnu_indirect_function to yes.

(x86_64-*-linux*) in fact. Fixed typo in ChangeLog.

Uros.
Re: [PATCH, build]: Enable default_gnu_indirect_function on x86_64-*-linux*
On Fri, Jul 22, 2011 at 5:38 PM, H.J. Lu hjl.to...@gmail.com wrote:

Fixing ifunc test function in the testsuite uncovered a nasty screwup in config.gcc that prohibited usage of GNU indirect functions on x86_64-*-linux*. Fixed by mirroring i[34567]86-*-linux* setting.

2011-07-22  Uros Bizjak  ubiz...@gmail.com

    * config.gcc (i[34567]86-*-linux*): Set default_gnu_indirect_function to yes.

(x86_64-*-linux*) in fact. Fixed typo in ChangeLog.

Can we also enable it for Linux/iX86?

It is already enabled for this target.

Uros.
[PATCH, libstdc++]: Backport PR libstdc++/49293 fix to 4.6 branch
Hello!

This patch backports the fix to the testcase for newer glibcs to 4.6 branch.

2011-07-22  Uros Bizjak  ubiz...@gmail.com

    Backport from mainline
    2011-06-07  Paolo Carlini  paolo.carl...@oracle.com

    PR libstdc++/49293
    * testsuite/22_locale/time_get/get_weekday/char/38081-1.cc: Tweak for glibc 2.14.
    * testsuite/22_locale/time_get/get_weekday/char/38081-2.cc: Likewise.

Tested on x86_64-pc-linux-gnu on Fedora 15.

OK for 4.6 branch?

Uros.

Index: testsuite/22_locale/time_get/get_weekday/char/38081-1.cc
===================================================================
--- testsuite/22_locale/time_get/get_weekday/char/38081-1.cc (revision 176630)
+++ testsuite/22_locale/time_get/get_weekday/char/38081-1.cc (working copy)
@@ -1,6 +1,6 @@
 // { dg-require-namedlocale "ru_RU.ISO-8859-5" }
 
-// Copyright (C) 2010 Free Software Foundation
+// Copyright (C) 2010, 2011 Free Software Foundation
 //
 // This file is part of the GNU ISO C++ Library.  This library is free
 // software; you can redistribute it and/or modify it under the
@@ -49,7 +49,11 @@
   // get_weekday(iter_type, iter_type, ios_base&,
   //             ios_base::iostate&, tm*) const
 
+#if __GLIBC__ > 2 || (__GLIBC__ == 2 && __GLIBC_MINOR__ >= 14)
+  iss.str("\xbf\xdd\x2e");
+#else
   iss.str("\xbf\xdd\xd4");
+#endif
   iterator_type is_it01(iss);
   tm time01;
   memset(&time01, -1, sizeof(tm));
@@ -67,7 +71,11 @@
   VERIFY( time02.tm_wday == 1 );
   VERIFY( errorstate == ios_base::eofbit );
 
+#if __GLIBC__ > 2 || (__GLIBC__ == 2 && __GLIBC_MINOR__ >= 14)
+  iss.str("\xbf\xdd\x2e\xd5\xd4\xd5\xdb\xec\xdd\xd8\xda");
+#else
   iss.str("\xbf\xdd\xd4\xd5\xd4\xd5\xdb\xec\xdd\xd8\xda");
+#endif
   iterator_type is_it03(iss);
   tm time03;
   memset(&time03, -1, sizeof(tm));
Index: testsuite/22_locale/time_get/get_weekday/char/38081-2.cc
===================================================================
--- testsuite/22_locale/time_get/get_weekday/char/38081-2.cc (revision 176630)
+++ testsuite/22_locale/time_get/get_weekday/char/38081-2.cc (working copy)
@@ -2,7 +2,7 @@
 
 // 2010-01-05  Paolo Carlini  paolo.carl...@oracle.com
 
-// Copyright (C) 2010 Free Software Foundation
+// Copyright (C) 2010, 2011 Free Software Foundation
 //
 // This file is part of the GNU ISO C++ Library.  This library is free
 // software; you can redistribute it and/or modify it under the
@@ -50,6 +50,15 @@
   // get_weekday(iter_type, iter_type, ios_base&,
   //             ios_base::iostate&, tm*) const
 
+#if __GLIBC__ > 2 || (__GLIBC__ == 2 && __GLIBC_MINOR__ >= 14)
+  const char* awdays[7] = { "\u0412\u0441\u002E",
+                            "\u041F\u043D\u002E",
+                            "\u0412\u0442\u002E",
+                            "\u0421\u0440\u002E",
+                            "\u0427\u0442\u002E",
+                            "\u041F\u0442\u002E",
+                            "\u0421\u0431\u002E" };
+#else
   const char* awdays[7] = { "\u0412\u0441\u043A",
                             "\u041F\u043D\u0434",
                             "\u0412\u0442\u0440",
@@ -57,6 +66,7 @@
                             "\u0427\u0442\u0432",
                             "\u041F\u0442\u043D",
                             "\u0421\u0431\u0442" };
+#endif
 
   for (int i = 0; i < 7; ++i)
     {
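The glibc guard used in this backport is an "at least version 2.14" test on the __GLIBC__ and __GLIBC_MINOR__ macros. A minimal sketch of the same comparison as an ordinary C function follows; the name version_at_least is hypothetical, and the function only mirrors the shape of the preprocessor condition rather than reading the real macros.

```c
/* Return 1 if version MAJOR.MINOR is at least REQ_MAJOR.REQ_MINOR.
   Same shape as the preprocessor test in the patch:
   major > 2 || (major == 2 && minor >= 14).  */
static int
version_at_least (int major, int minor, int req_major, int req_minor)
{
  return major > req_major
         || (major == req_major && minor >= req_minor);
}
```

The major-version comparison has to come first so that a future 3.0 still counts as newer than 2.14 even though its minor number is smaller.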
[PATCH, i386]: Rewrite LEA handling (was:Re: PATCH [10/n] X32: Support x32 LEA insns)
On Sat, Jul 23, 2011 at 3:57 PM, H.J. Lu hjl.to...@gmail.com wrote:

This patch adds x32 LEA insn support. The main issue is gen_lowpart (Pmode, operands[1]); doesn't work on symbol. This patch avoids it. Also we shouldn't generate 32bit store with x32 PIC source. Any comments?

You are not fixing the core of the problem... this is why you need so many hacks and kludges at various places (some w.r.t. -fPIC already existed, see the patch). Above, you correctly identified the problem, so let's avoid gen_lowpart on SImode operands by not calling it anymore.

Attached patch effectively rewrites LEA handling. The trick is that instead of using Pmode operations in addresses, we use either SImode or DImode operations to calculate the address on 64bit targets. Up to now, address calculations strictly used Pmode, so SImode on 32bit targets and DImode on 64bit targets. Recent patches to ix86_decompose_address and ix86_legitimate_address_p relaxed this requirement.

Attached patch changes LEA patterns and LEA splitters to accept addresses calculated with either SImode or DImode operations. This means that on x64 targets, we don't use gen_lowpart on SImode operands anymore. Since symbol references on x32 are in SImode, this solves the problem. The patch also avoids generating SImode subregs of DImode addresses and DImode zero_extends of SImode addresses, since the LEA insn does this for us automatically.

Please also note the change to ix86_print_operand_address. To avoid addr32 prefixes, we can force registers in DImode on 64bit targets without any problems. On x32, we can investigate if this change avoids unnecessary LEAs (for PR 49781, patched gcc generates 6 vs. 8). Also, we can investigate the effect of addr32 on benchmarks.

Patched gcc also fixes all testcases from PR 47381.

2011-07-24  Uros Bizjak  ubiz...@gmail.com

        PR target/47381
        * config/i386/i386.md (*lea_1): Use SWI48 mode iterator.
        (*lea_1_zext): New insn pattern.
        (add-lea splitter): Check operand modes in insn constraint.
        Extend operands less than SImode wide to SImode.
        (add-lea zext splitter): Do not extend operands to DImode.
        (*lea_general_1): Handle only QImode and HImode operands.
        (*lea_general_2): Ditto.
        (*lea_general_3): Ditto.
        (*lea_general_1_zext): Remove.
        (*lea_general_2_zext): Ditto.
        (*lea_general_3_zext): Ditto.
        (*lea_general_4): Check operand modes in insn constraint.
        Extend operands less than SImode wide to SImode.
        (ashift-lea splitter): Ditto.
        * config/i386/i386.c (ix86_print_operand_address): Print address
        registers with 'q' modifier on 64bit targets.

Patch was bootstrapped and regression tested on x86_64-pc-linux-gnu {,-m32} with no regressions.

H.J., can you please test it on x32?

BTW: -fPIC is not yet implemented on trunk and still fails there with an (unrelated) error, I didn't check x32 branch.

Uros.

Index: i386.md
===
--- i386.md (revision 176713)
+++ i386.md (working copy)
@@ -5425,13 +5425,22 @@
    (set_attr "mode" "QI")])
 
 (define_insn "*lea_1"
-  [(set (match_operand:P 0 "register_operand" "=r")
-        (match_operand:P 1 "no_seg_address_operand" "p"))]
+  [(set (match_operand:SWI48 0 "register_operand" "=r")
+        (match_operand:SWI48 1 "no_seg_address_operand" "p"))]
   ""
   "lea{<imodesuffix>}\t{%a1, %0|%0, %a1}"
   [(set_attr "type" "lea")
    (set_attr "mode" "<MODE>")])
 
+(define_insn "*lea_1_zext"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+        (zero_extend:DI
+          (match_operand:SI 1 "no_seg_address_operand" "p")))]
+  "TARGET_64BIT"
+  "lea{l}\t{%a1, %k0|%k0, %a1}"
+  [(set_attr "type" "lea")
+   (set_attr "mode" "SI")])
+
 (define_insn "*lea_2"
   [(set (match_operand:SI 0 "register_operand" "=r")
         (subreg:SI (match_operand:DI 1 "no_seg_address_operand" "p") 0))]
@@ -5794,39 +5803,36 @@
            (const_string "none")))
    (set_attr "mode" "QI")])
 
-;; Convert lea to the lea pattern to avoid flags dependency.
+;; Convert add to the lea pattern to avoid flags dependency.
 (define_split
   [(set (match_operand 0 "register_operand" "")
         (plus (match_operand 1 "register_operand" "")
               (match_operand 2 "nonmemory_operand" "")))
    (clobber (reg:CC FLAGS_REG))]
-  "reload_completed && ix86_lea_for_add_ok (insn, operands)"
+  "GET_MODE (operands[0]) == GET_MODE (operands[1])
+   && (GET_MODE (operands[0]) == GET_MODE (operands[2])
+       || GET_MODE (operands[2]) == VOIDmode)
+   && reload_completed && ix86_lea_for_add_ok (insn, operands)"
   [(const_int 0)]
 {
-  rtx pat;
   enum machine_mode mode = GET_MODE (operands[0]);
-
-  /* In -fPIC mode the constructs like (const (unspec [symbol_ref]))
-     may confuse gen_lowpart.  */
-  if (mode != Pmode)
-    {
-      operands[1] = gen_lowpart (Pmode, operands[1]);
-      operands[2] = gen_lowpart (Pmode, operands[2]);
-    }
-
-  pat = gen_rtx_PLUS (Pmode, operands[1], operands[2]);
+  rtx pat
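The statement in the message above that the LEA insn truncates or zero-extends the address by itself can be illustrated with a small C model. This is only a sketch under stated assumptions: the function names (lea_di, lea_si_zext, addr_si, lea_self_check) are hypothetical and are not taken from GCC, and the model covers just the arithmetic, not the RTL patterns. It shows that taking the low 32 bits of a 64-bit address computation gives the same value as doing the whole computation in 32 bits, which is why no explicit subreg or zero_extend is needed around the address.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of "leaq": the address base + index * scale + disp
   is computed with 64-bit (DImode) arithmetic. */
uint64_t lea_di(uint64_t base, uint64_t index, uint64_t scale, int32_t disp)
{
    return base + index * scale + (uint64_t)(int64_t)disp;
}

/* Hypothetical model of "leal" on a 64bit target: only the low 32 bits of
   the address are kept, and the upper half of the 64-bit result is zero. */
uint64_t lea_si_zext(uint64_t base, uint64_t index, uint64_t scale, int32_t disp)
{
    return (uint64_t)(uint32_t)lea_di(base, index, scale, disp);
}

/* The same address computed only with 32-bit (SImode) operations. */
uint32_t addr_si(uint32_t base, uint32_t index, uint32_t scale, int32_t disp)
{
    return base + index * scale + (uint32_t)disp;
}

/* Truncation commutes with addition and multiplication modulo 2^32, so
   both routes agree even when the upper register halves hold garbage. */
void lea_self_check(void)
{
    uint64_t base = 0x12345678fffffff0ULL;   /* garbage in the upper half */
    uint64_t index = 0xabcdef0000000009ULL;  /* garbage in the upper half */

    assert(lea_si_zext(base, index, 4, -8)
           == (uint64_t)addr_si((uint32_t)base, (uint32_t)index, 4, -8));
    assert(lea_si_zext(0xffffffffULL, 1, 1, 0) == 0);
}
```

Calling lea_self_check() exercises both cases; the second one is the wrap-around at the 4 GB boundary, where the 32-bit result is 0 while the 64-bit one is 0x100000000.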
Re: [PATCH, i386]: Rewrite LEA handling (was:Re: PATCH [10/n] X32: Support x32 LEA insns)
On Mon, Jul 25, 2011 at 3:58 AM, H.J. Lu hjl.to...@gmail.com wrote:

You are not fixing the core of the problem... this is why you need so many hacks and kludges at various places (some w.r.t. -fPIC already existed, see the patch). Above, you correctly identified the problem, so let's avoid gen_lowpart on SImode operands by not calling it anymore.

Attached patch effectively rewrites LEA handling. The trick is that instead of using Pmode operations in addresses, we use either SImode or DImode operations to calculate the address on 64bit targets. Up to now, address calculations strictly used Pmode, so SImode on 32bit targets and DImode on 64bit targets. Recent patches to ix86_decompose_address and ix86_legitimate_address_p relaxed this requirement.

Attached patch changes LEA patterns and LEA splitters to accept addresses calculated with either SImode or DImode operations. This means that on x64 targets, we don't use gen_lowpart on SImode operands anymore. Since symbol references on x32 are in SImode, this solves the problem. The patch also avoids generating SImode subregs of DImode addresses and DImode zero_extends of SImode addresses, since the LEA insn does this for us automatically.

Please also note the change to ix86_print_operand_address. To avoid addr32 prefixes, we can force registers in DImode on 64bit targets without any problems. On x32, we can investigate if this change avoids unnecessary LEAs (for PR 49781, patched gcc generates 6 vs. 8).

The testcase won't compile since PIC doesn't work:

Well, I did say that -fPIC did not work.

Patch was bootstrapped and regression tested on x86_64-pc-linux-gnu {,-m32} with no regressions. H.J., can you please test it on x32?

On x32, it failed: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49832

BTW: -fPIC is not yet implemented on trunk and still fails there with an (unrelated) error, I didn't check x32 branch.

This could be: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49833

Attached patch implements -fpic handling for x32. In x32 mode, we now use x86_64_general_operand and corresponding e constraints for adds in SImode, since it looks like invalid addresses can only be generated through adds. This avoids a whole bunch of new predicates and constraints.

2011-07-25  Uros Bizjak  ubiz...@gmail.com

        PR target/47381
        PR target/49832
        PR target/49833
        * config/i386/i386.md (add_operand): New mode attribute.
        (*movdi_internal_rex64): Remove mode from pic_32bit_operand check.
        (*movsi_internal): Ditto. Use e constraint in alternative 2.
        (*lea_1): Use SWI48 mode iterator.
        (*lea_1_zext): New insn pattern.
        (add<mode>3): Use add_operand predicate for operand 2.
        (*add<mode>_1): Use add_operand predicate for operand 2. Use le
        constraint for alternative 2.
        (addsi_1_zext): Use addsi_operand predicate for operand 2. Use le
        constraint for alternative 2.
        (add-lea splitter): Check operand modes in insn constraint.
        Extend operands less than SImode wide to SImode.
        (add-lea zext splitter): Do not extend operands to DImode.
        (*lea_general_1): Handle only QImode and HImode operands.
        (*lea_general_2): Ditto.
        (*lea_general_3): Ditto.
        (*lea_general_1_zext): Remove.
        (*lea_general_2_zext): Ditto.
        (*lea_general_3_zext): Ditto.
        (*lea_general_4): Check operand modes in insn constraint.
        Extend operands less than SImode wide to SImode.
        (ashift-lea splitter): Ditto.
        * config/i386/i386.c (ix86_print_operand_address): Print address
        registers with 'q' modifier on 64bit targets.
        * config/i386/predicates.md (pic_32bit_operand): Define as special
        predicate. Reject non-SI and non-DI modes.
        (addsi_operand): New predicate.

Uros.

Index: i386.md
===
--- i386.md (revision 176733)
+++ i386.md (working copy)
@@ -901,6 +901,14 @@
                              (SI "nonmemory_operand")
                              (DI "x86_64_nonmemory_operand")])
 
+;; Operand predicate for adds.
+(define_mode_attr add_operand
+  [(QI "general_operand")
+   (HI "general_operand")
+   (SI "addsi_operand")
+   (DI "x86_64_general_operand")
+   (TI "x86_64_general_operand")])
+
 ;; Operand predicate for shifts.
 (define_mode_attr shift_operand
   [(QI "nonimmediate_operand")
@@ -2039,7 +2047,7 @@
               (const_string "ssemov")
             (eq_attr "alternative" "16,17")
               (const_string "ssecvt")
-            (match_operand:DI 1 "pic_32bit_operand" "")
+            (match_operand 1 "pic_32bit_operand" "")
               (const_string "lea")
            ]
            (const_string "imov")))
@@ -2184,7 +2192,7 @@
   [(set (match_operand:SI 0 "nonimmediate_operand"
                         "=r,m ,*y,*y,?rm,?*y,*x,*x,?r ,m ,?*Yi,*x")
         (match_operand:SI 1 "general_operand"
-                        "g ,ri,C ,*y,*y ,rm ,C ,*x
Re: [PATCH, i386]: Rewrite LEA handling (was:Re: PATCH [10/n] X32: Support x32 LEA insns)
On Mon, Jul 25, 2011 at 3:30 PM, H.J. Lu hjl.to...@gmail.com wrote: Attached patch implements -fpic handling for x32. In x32 mode, we now use x86_64_general_operand and corresponding e constraints for adds in SImode, since it looks that invalid addresses can only be generated through adds. This avoids a whole bunch of new predicates and constraints. X32 glibc is miscompiled: CPP='/export/build/gnu/gcc-x32/release/usr/gcc-4.7.0-x32/bin/gcc -mx32 -E -x c-header' /export/build/gnu/glibc-x32/build-x86_64-linux/elf/ld-linux-x32.so.2 --library-path /export/build/gnu/glibc-x32/build-x86_64-linux:/export/build/gnu/glibc-x32/build-x86_64-linux/math:/export/build/gnu/glibc-x32/build-x86_64-linux/elf:/export/build/gnu/glibc-x32/build-x86_64-linux/dlfcn:/export/build/gnu/glibc-x32/build-x86_64-linux/nss:/export/build/gnu/glibc-x32/build-x86_64-linux/nis:/export/build/gnu/glibc-x32/build-x86_64-linux/rt:/export/build/gnu/glibc-x32/build-x86_64-linux/resolv:/export/build/gnu/glibc-x32/build-x86_64-linux/crypt:/export/build/gnu/glibc-x32/build-x86_64-linux/nptl /export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/rpcgen -Y ../scripts -h rpcsvc/yppasswd.x -o /export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/rpcsvc/yppasswd.T make[5]: *** [/export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/rpcsvc/yppasswd.stmp] Segmentation fault (core dumped) Some LEA patterns are wrong for x32. I will investigate. What about x32 GCC testsuite? Uros.
Re: [PATCH, i386, take 2]: Rewrite LEA handling (was:Re: PATCH [10/n] X32: Support x32 LEA insns)
On Mon, Jul 25, 2011 at 11:05 PM, H.J. Lu hjl.to...@gmail.com wrote:

Attached patch implements -fpic handling for x32. In x32 mode, we now use x86_64_general_operand and corresponding e constraints for adds in SImode, since it looks like invalid addresses can only be generated through adds. This avoids a whole bunch of new predicates and constraints.

X32 glibc is miscompiled:

CPP='/export/build/gnu/gcc-x32/release/usr/gcc-4.7.0-x32/bin/gcc -mx32 -E -x c-header' /export/build/gnu/glibc-x32/build-x86_64-linux/elf/ld-linux-x32.so.2 --library-path /export/build/gnu/glibc-x32/build-x86_64-linux:/export/build/gnu/glibc-x32/build-x86_64-linux/math:/export/build/gnu/glibc-x32/build-x86_64-linux/elf:/export/build/gnu/glibc-x32/build-x86_64-linux/dlfcn:/export/build/gnu/glibc-x32/build-x86_64-linux/nss:/export/build/gnu/glibc-x32/build-x86_64-linux/nis:/export/build/gnu/glibc-x32/build-x86_64-linux/rt:/export/build/gnu/glibc-x32/build-x86_64-linux/resolv:/export/build/gnu/glibc-x32/build-x86_64-linux/crypt:/export/build/gnu/glibc-x32/build-x86_64-linux/nptl /export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/rpcgen -Y ../scripts -h rpcsvc/yppasswd.x -o /export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/rpcsvc/yppasswd.T
make[5]: *** [/export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/rpcsvc/yppasswd.stmp] Segmentation fault (core dumped)

Some LEA patterns are wrong for x32. I will investigate.

We have to prevent symbols from entering general_operand predicated SImode operands. Fortunately, x86_64_general_operand works OK for x32, while both i686 and x86_64 are unaffected due to early bypass (i686) and due to the fact that all symbols are DImode (x86_64).

2011-07-25  Uros Bizjak  ubiz...@gmail.com
            H.J. Lu  hongjiu...@intel.com

        PR target/47381
        PR target/49832
        PR target/49833
        * config/i386/i386.md (i): Change SImode attribute to e.
        (g): Change SImode attribute to rme.
        (di): Change SImode attribute to nF.
        (general_operand): Change SImode attribute to x86_64_general_operand.
        (general_szext_operand): Change SImode attribute to
        x86_64_szext_general_operand.
        (immediate_operand): Change SImode attribute to
        x86_64_immediate_operand.
        (*movdi_internal_rex64): Remove mode from pic_32bit_operand check.
        (*movsi_internal): Ditto. Use e constraint in alternative 2.
        (*lea_1): Use SWI48 mode iterator.
        (*lea_1_zext): New insn pattern.
        (*addmode1): Use x86_64_general_operand predicate for operand 2.
        Update operand constraints.
        (addsi_1_zext): Ditto.
        (*addmode2): Ditto.
        (*addsi_3_zext): Ditto.
        (*subsi_1_zext): Ditto.
        (*subsi_2_zext): Ditto.
        (*subsi_3_zext): Ditto.
        (*addsi3_carry_zext): Ditto.
        (*plusminus_insnsi3_zext_cc_overflow): Ditto.
        (*mulsi3_1_zext): Ditto.
        (*andsi_1): Ditto.
        (*andsi_1_zext): Ditto.
        (*andsi_2_zext): Ditto.
        (*any_or:codesi_1_zext): Ditto.
        (*any_or:codesi_2_zext): Ditto.
        (*testmode_1): Use general_operand predicate for operand 1.
        (*andmode_2): Ditto.
        (add-lea splitter): Check operand modes in insn constraint.
        Extend operands less than SImode wide to SImode.
        (add-lea zext splitter): Do not extend input operands to DImode.
        (*lea_general_1): Handle only QImode and HImode operands.
        (*lea_general_2): Ditto.
        (*lea_general_3): Ditto.
        (*lea_general_1_zext): Remove.
        (*lea_general_2_zext): Ditto.
        (*lea_general_3_zext): Ditto.
        (*lea_general_4): Check operand modes in insn constraint.
        Extend operands less than SImode wide to SImode.
        (ashift-lea splitter): Ditto.
        * config/i386/i386.c (ix86_print_operand_address): Print address
        registers with 'q' modifier on 64bit targets.
        * config/i386/predicates.md (pic_32bit_operand): Define as special
        predicate. Reject non-SI and non-DI modes.

Bootstrapped and regression tested on x86_64-pc-linux-gnu {,-m32}.

GCC and glibc testsuites are clean on x32. Can you check it in?

I will do this tomorrow, if anybody has some comment on the patch.

Thanks,
Uros.
Re: PATCH: PR target/49853: [x32] PIC doesn't work with external symbol
On Tue, Jul 26, 2011 at 4:59 PM, H.J. Lu hongjiu...@intel.com wrote:

This patch fixes PIC with external symbol and updates x86_64_immediate_operand/x86_64_zext_immediate_operand/x86_64_movabs_operand for x32.

2011-07-26  H.J. Lu  hongjiu...@intel.com

        PR target/49853
        * config/i386/i386.c (ix86_expand_move): Call convert_to_mode
        on legitimize_tls_address return if needed. Allow ptr_mode for
        symbolic operand with PIC.

Eh... half of your patch is just an unnecessary rename of a temporary variable. See attached patch for a cleaned-up version. Also, please use explicit DImode and SImode checks to match what ix86_legitimate_address_p does.

        * config/i386/predicates.md (x86_64_immediate_operand): Always
        allow the offsetted memory references for TARGET_X32.
        (x86_64_zext_immediate_operand): Likewise.
        (x86_64_movabs_operand): Don't allow nonmemory_operand for
        TARGET_X32.

Why? It is certainly not needed for -fPIC. Please provide a separate patch and testcase for predicates.md change.

Uros.

Index: i386.c
===
--- i386.c (revision 176794)
+++ i386.c (working copy)
@@ -15028,11 +15028,14 @@ ix86_expand_move (enum machine_mode mode
                                     op0, 1, OPTAB_DIRECT);
           if (tmp == op0)
             return;
+          if (GET_MODE (tmp) != mode)
+            op1 = convert_to_mode (mode, tmp, 1);
         }
     }
 
   if ((flag_pic || MACHOPIC_INDIRECT)
-       && mode == Pmode && symbolic_operand (op1, Pmode))
+      && (mode == SImode || mode == DImode)
+      && symbolic_operand (op1, mode))
     {
       if (TARGET_MACHO && !TARGET_64BIT)
         {
@@ -15073,13 +15076,15 @@ ix86_expand_move (enum machine_mode mode
       else
         {
           if (MEM_P (op0))
-            op1 = force_reg (Pmode, op1);
-          else if (!TARGET_64BIT || !x86_64_movabs_operand (op1, Pmode))
+            op1 = force_reg (mode, op1);
+          else if (!TARGET_64BIT || !x86_64_movabs_operand (op1, mode))
             {
               rtx reg = can_create_pseudo_p () ? NULL_RTX : op0;
               op1 = legitimize_pic_address (op1, reg);
               if (op0 == op1)
                 return;
+              if (GET_MODE (op1) != mode)
+                op1 = convert_to_mode (mode, op1, 1);
             }
         }
     }
Re: PATCH: PR target/49853: [x32] PIC doesn't work with external symbol
On Tue, Jul 26, 2011 at 7:31 PM, H.J. Lu hjl.to...@gmail.com wrote: This patch fixes PIC with external symbol and updates x86_64_immediate_operand/x86_64_zext_immediate_operand/x86_64_movabs_operand for x32. 2011-07-26 H.J. Lu hongjiu...@intel.com PR target/49853 * config/i386/i386.c (ix86_expand_move): Call convert_to_mode on legitimize_tls_address return if needed. Allow ptr_mode for symbolic operand with PIC. Eh... half of your patch is just an unnecessary rename of a temporary variable. See attached patch for a cleaned-up version. It looks good to me. Can you check it in? Please, can you test it on x32 first? I will commit it after bootstrap/regtest finish. Thanks, Uros.
Re: PATCH: PR target/49853: [x32] PIC doesn't work with external symbol
On Tue, Jul 26, 2011 at 7:50 PM, H.J. Lu hjl.to...@gmail.com wrote: This patch fixes PIC with external symbol and updates x86_64_immediate_operand/x86_64_zext_immediate_operand/x86_64_movabs_operand for x32. 2011-07-26 H.J. Lu hongjiu...@intel.com PR target/49853 * config/i386/i386.c (ix86_expand_move): Call convert_to_mode on legitimize_tls_address return if needed. Allow ptr_mode for symbolic operand with PIC. Eh... half of your patch is just an unnecessary rename of a temporary variable. See attached patch for a cleaned-up version. It looks good to me. Can you check it in? Please, can you test it on x32 first? I will commit it after bootstrap/regtest finish. It may need other changes for TLS support. I can update it after your change is checked in. Committed with following ChangeLog: 2011-07-26 Uros Bizjak ubiz...@gmail.com H.J. Lu hongjiu...@intel.com PR target/47369 PR target/49853 * config/i386/i386.c (ix86_expand_move): Call convert_to_mode if legitimize_tls_address returned operand in wrong mode. Allow SImode and DImode symbolic operand for PIC. Call convert_to_mode if legitimize_pic_address returned operand in wrong mode. Tested on x86_64-pc-linux-gnu {,-m32}. Uros.
Re: PATCH: PR target/47372: [x32] internal compiler error: in simplify_subreg, at simplify-rtx.c:5222
On Tue, Jul 26, 2011 at 10:12 PM, Jakub Jelinek ja...@redhat.com wrote:

On Tue, Jul 26, 2011 at 10:05:06PM +0200, Uros Bizjak wrote:

2011-07-26  H.J. Lu  hongjiu...@intel.com

        PR target/47372
        * config/i386/i386.c (ix86_delegitimize_address): Call
        simplify_gen_subreg for PIC with ptr_mode only if modes of
        x and orig_x are different.

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 429cd62..9c52aa3 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -12967,9 +12982,10 @@ ix86_delegitimize_address (rtx x)
           || !MEM_P (orig_x))
         return ix86_delegitimize_tls_address (orig_x);
       x = XVECEXP (XEXP (x, 0), 0, 0);

When x is no longer known to be Pmode

-      if (GET_MODE (orig_x) != Pmode)
+      if (GET_MODE (orig_x) != GET_MODE (x)
+          && GET_MODE (orig_x) != ptr_mode)

why not simply just

       if (GET_MODE (orig_x) != GET_MODE (x))

         {
-          x = simplify_gen_subreg (GET_MODE (orig_x), x, Pmode, 0);
+          x = simplify_gen_subreg (GET_MODE (orig_x), x, ptr_mode, 0);

and using GET_MODE (x) instead of Pmode/ptr_mode here? I mean, x is certainly not VOIDmode here, should be either SImode or DImode and thus simplify_gen_subreg ought to work for it.

This also works, we look at orig_x that looks like:

(mem/u/c:SI (const:DI (unspec:DI [
                (symbol_ref:SI ("__sflush") [flags 0x41] <function_decl 0x7f6f2eaad000 __sflush>)
            ] UNSPEC_GOTPCREL)) [2 S4 A8])

So, we look at SImode load, and compare it with SImode (actually ptr_mode) symbol. Will your suggestion work with this RTX?

Thanks,
Uros.
Re: PATCH: PR target/47372: [x32] internal compiler error: in simplify_subreg, at simplify-rtx.c:5222
On Tue, Jul 26, 2011 at 10:33 PM, H.J. Lu hjl.to...@gmail.com wrote:

On Tue, Jul 26, 2011 at 1:29 PM, Jakub Jelinek ja...@redhat.com wrote:

On Tue, Jul 26, 2011 at 10:21:11PM +0200, Uros Bizjak wrote:

This also works, we look at orig_x that looks like:

(mem/u/c:SI (const:DI (unspec:DI [
                (symbol_ref:SI ("__sflush") [flags 0x41] <function_decl 0x7f6f2eaad000 __sflush>)
            ] UNSPEC_GOTPCREL)) [2 S4 A8])

So, we look at SImode load, and compare it with SImode (actually ptr_mode) symbol. Will your suggestion work with this RTX?

Then

      if (GET_MODE (orig_x) != GET_MODE (x))
        {
          x = simplify_gen_subreg (GET_MODE (orig_x), x, GET_MODE (x), 0);
          if (x == NULL_RTX)
            return orig_x;
        }

will work, orig_x is the above SImode MEM, x is

(symbol_ref:SI ("__sflush") [flags 0x41] <function_decl 0x7f6f2eaad000 __sflush>)

thus the modes are the same and no simplify_gen_subreg needs to be done, the mode is already right.

This works for my testcase. I will do a full test.

Also OK for mainline, with suitable ChangeLog and bootstrap/regression test.

BTW: I'm thinking of removing this check from ix86_expand_move:

@@ -15034,7 +15034,6 @@ ix86_expand_move (enum machine_mode mode
     }
 
   if ((flag_pic || MACHOPIC_INDIRECT)
-      && (mode == SImode || mode == DImode)
       && symbolic_operand (op1, mode))
     {
       if (TARGET_MACHO && !TARGET_64BIT)

There is no way symbolic_operand would be in different mode than SImode/DImode.

Uros.
Re: [Patch, i386, testsuite] Fix for PR49547, new testcases for lzcnt instruction
On Wed, Jul 27, 2011 at 9:05 AM, Kirill Yukhin kirill.yuk...@gmail.com wrote:

Thanks for inputs! I'll do it today. Just one point. How is AVX connected to LZCNT features? AVX requires OS support since it has wider registers etc. LZCNT needs no support from the OS side, so from my point of view it is redundant to check in lzcnt-check.h for the presence of AVX support from the OS side. Or did I get you wrong?

Ah, I see. I got distracted by the wrong comment in your patch:

+# Return 1 if the target supports running AVX executables, 0 otherwise.
+
+proc check_effective_target_lzcnt_runtime { } {
+    if { [check_effective_target_lzcnt]
+         && [check_lzcnt_hw_available] } {
+        return 1
+    }
+    return 0
+}

(I will add avx-os-support.h myself later today).

Uros.
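The point made above, that LZCNT only needs a CPU feature check while AVX also needs the OS to save the wider registers, could be sketched as a runtime probe in C. This is a minimal sketch and not the lzcnt-check.h from the patch: the function name lzcnt_hw_available is borrowed from the Tcl proc quoted above purely for illustration, and the code assumes GCC's (or a compatible compiler's) <cpuid.h>. On x86, LZCNT is reported in CPUID leaf 0x80000001, ECX bit 5; on other targets the sketch simply reports 0.

```c
#include <assert.h>

#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
#include <cpuid.h>
#define HAVE_X86_CPUID 1
#else
#define HAVE_X86_CPUID 0
#endif

/* Return 1 if the CPU reports LZCNT (ABM) support, 0 otherwise.
   No OS-level check is made, since LZCNT uses only the general
   purpose registers. */
int lzcnt_hw_available(void)
{
#if HAVE_X86_CPUID
    unsigned int eax, ebx, ecx, edx;

    /* __get_cpuid returns 0 if the requested leaf is not supported. */
    if (!__get_cpuid(0x80000001, &eax, &ebx, &ecx, &edx))
        return 0;

    return (ecx >> 5) & 1;   /* ECX bit 5: LZCNT/ABM */
#else
    return 0;
#endif
}

/* The probe is a pure query, so repeated calls must agree. */
void lzcnt_self_check(void)
{
    int first = lzcnt_hw_available();

    assert(first == 0 || first == 1);
    assert(lzcnt_hw_available() == first);
}
```

An AVX runtime check would additionally have to confirm OS support for the extended register state, which is the part that is redundant for LZCNT.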
Re: PATCH: PR target/49860: [x32] Error: cannot represent relocation type BFD_RELOC_64 in x32 mode
On Wed, Jul 27, 2011 at 6:31 AM, H.J. Lu hongjiu...@intel.com wrote:

The offsetted memory references always work for x32. OK for trunk?

No, this is the same issue as in [1]. Please fix the assembler to zero-extend this relocation.

[1] http://gcc.gnu.org/ml/gcc-patches/2011-07/msg01825.html

Uros.
[PATCH, i386]: Do not explicitly check symbol_operands in ix86_expand_move
Hello!

There is no way symbolic_operand uses non-DI or non-SI modes on x86.

2011-07-27  Uros Bizjak  ubiz...@gmail.com

        * config/i386/i386.c (ix86_expand_move): Do not explicitly check
        the mode of symbolic_operand RTXes.

Tested on x86_64-pc-linux-gnu {,-m32}. Committed to mainline SVN.

Uros.

Index: config/i386/i386.c
===
--- config/i386/i386.c (revision 176833)
+++ config/i386/i386.c (working copy)
@@ -15032,7 +15032,6 @@
     }
 
   if ((flag_pic || MACHOPIC_INDIRECT)
-      && (mode == SImode || mode == DImode)
       && symbolic_operand (op1, mode))
     {
       if (TARGET_MACHO && !TARGET_64BIT)
Re: [PATCH, i386, testsuite] New BMI testcases
On Wed, Jul 27, 2011 at 11:29 PM, Jakub Jelinek ja...@redhat.com wrote: Guys, with write approval, could you please commit that? I checked it in for you. Unfortunately many of the new tests fail with old assembler, because the builtin in check_effective_target_bmi is optimized away (ignored, as well as using constant arguments, two reasons to get rid of it). Fixed thusly, tested on i686-linux and x86_64-linux, both with old and new binutils. Ok for trunk? 2011-07-27 Jakub Jelinek ja...@redhat.com * gcc.target/i386/i386.exp (check_effective_target_bmi): Make sure the builtin isn't optimized away. OK. Thanks, Uros.
Re: PATCH: PR target/47364: [x32] internal compiler error: in emit_move_insn, at expr.c:3355
On Thu, Jul 28, 2011 at 5:48 AM, H.J. Lu hongjiu...@intel.com wrote:

We should only expand strlen to Pmode. Otherwise, we got

[hjl@gnu-6 ilp32-38]$ cat x.i
char one[50] = "ijk";
int main (void)
{
  return __builtin_strlen (one) != 3;
}
[hjl@gnu-6 ilp32-38]$ /export/build/gnu/gcc/build-x86_64-linux/gcc/xgcc -B/export/build/gnu/gcc/build-x86_64-linux/gcc/ -S -o x.s -mx32 -O2 x.i
x.i: In function ‘main’:
x.i:5:27: internal compiler error: in emit_move_insn, at expr.c:
Please submit a full bug report, with preprocessed source if appropriate.
See http://gcc.gnu.org/bugs.html for instructions.

OK for trunk?

2011-07-27  H.J. Lu  hongjiu...@intel.com

        PR target/47364
        * config/i386/i386.md (strlen<mode>): Replace SWI48x with P.

OK.

Thanks,
Uros.
Re: PATCH: PR target/47715: [x32] TLS doesn't work
On Thu, Jul 28, 2011 at 4:55 AM, H.J. Lu hongjiu...@intel.com wrote:

TLS on X32 is almost identical to TLS on x86-64. The only difference is x32 address space is 32bit. That means TLS symbols can be in either SImode or DImode with upper 32bit zero. This patch updates tls_global_dynamic_64 to support x32. OK for trunk?

2011-07-27  H.J. Lu  hongjiu...@intel.com

        PR target/47715
        * config/i386/i386.md (PTR64): New.
        (*tls_global_dynamic_64): Rename to ...
        (*tls_global_dynamic_64_mode): This. Put PTR64 on operand 1.
        (tls_global_dynamic_64): Rename to ...
        (tls_global_dynamic_64_mode): This. Put PTR64 on operand 1.
        * config/i386/i386.c (legitimize_tls_address): Updated.

Just remove mode check, so:

(unspec:DI [(match_operand 1 "tls_symbolic_operand" "")]

at both sites.

-  fputs (ASM_BYTE "0x66\n", asm_out_file);
+  if (!TARGET_X32)
+    fputs (ASM_BYTE "0x66\n", asm_out_file);

Are you sure? There are some scary comments in binutils that these sequences have to be written _exactly_ as shown to enable certain linker relaxations w.r.t. TLS relocs.

Uros.
[PATCH, i386]: Fix i386.md:5807: warning: source missing a mode?
Hello!

2011-07-28  Uros Bizjak  ubiz...@gmail.com

        * config/i386/i386.md (add-lea splitter): Add SWI mode to PLUS RTX.

Tested on x86_64-pc-linux-gnu, committed to mainline.

Uros.

Index: i386.md
===
--- i386.md (revision 176858)
+++ i386.md (working copy)
@@ -5806,8 +5806,8 @@
 ;; Convert add to the lea pattern to avoid flags dependency.
 (define_split
   [(set (match_operand:SWI 0 "register_operand" "")
-        (plus (match_operand:SWI 1 "register_operand" "")
-              (match_operand:SWI 2 "nonmemory_operand" "")))
+        (plus:SWI (match_operand:SWI 1 "register_operand" "")
+                  (match_operand:SWI 2 "nonmemory_operand" "")))
    (clobber (reg:CC FLAGS_REG))]
   "reload_completed && ix86_lea_for_add_ok (insn, operands)"
   [(const_int 0)]
Re: PATCH: PR target/47715: [x32] TLS doesn't work
On Thu, Jul 28, 2011 at 8:52 AM, Uros Bizjak ubiz...@gmail.com wrote:

TLS on X32 is almost identical to TLS on x86-64. The only difference is x32 address space is 32bit. That means TLS symbols can be in either SImode or DImode with upper 32bit zero. This patch updates tls_global_dynamic_64 to support x32. OK for trunk?

Please also change 64bit GNU2_TLS patterns, so -mtls-dialect=gnu2 will also work. Please see attached patch.

Uros.

Index: i386.md
===
--- i386.md (revision 176860)
+++ i386.md (working copy)
@@ -12327,7 +12327,7 @@
         (call:DI
          (mem:QI (match_operand:DI 2 "constant_call_address_operand" "z"))
          (match_operand:DI 3 "" "")))
-   (unspec:DI [(match_operand:DI 1 "tls_symbolic_operand" "")]
+   (unspec:DI [(match_operand 1 "tls_symbolic_operand" "")]
               UNSPEC_TLS_GD)]
   "TARGET_64BIT"
 {
@@ -12349,7 +12349,7 @@
            (call:DI
             (mem:QI (match_operand:DI 2 "constant_call_address_operand" ""))
             (const_int 0)))
-      (unspec:DI [(match_operand:DI 1 "tls_symbolic_operand" "")]
+      (unspec:DI [(match_operand 1 "tls_symbolic_operand" "")]
                  UNSPEC_TLS_GD)])])
 
 (define_insn "*tls_local_dynamic_base_32_gnu"
@@ -12553,7 +12553,7 @@
 
 (define_expand "tls_dynamic_gnu2_64"
   [(set (match_dup 2)
-        (unspec:DI [(match_operand:DI 1 "tls_symbolic_operand" "")]
+        (unspec:DI [(match_operand 1 "tls_symbolic_operand" "")]
                    UNSPEC_TLSDESC))
    (parallel
     [(set (match_operand:DI 0 "register_operand" "")
@@ -12568,7 +12568,7 @@
 
 (define_insn "*tls_dynamic_lea_64"
   [(set (match_operand:DI 0 "register_operand" "=r")
-        (unspec:DI [(match_operand:DI 1 "tls_symbolic_operand" "")]
+        (unspec:DI [(match_operand 1 "tls_symbolic_operand" "")]
                    UNSPEC_TLSDESC))]
   "TARGET_64BIT && TARGET_GNU2_TLS"
   "lea{q}\t{%a1@TLSDESC(%%rip), %0|%0, %a1@TLSDESC[rip]}"
@@ -12579,7 +12579,7 @@
 
 (define_insn "*tls_dynamic_call_64"
   [(set (match_operand:DI 0 "register_operand" "=a")
-        (unspec:DI [(match_operand:DI 1 "tls_symbolic_operand" "")
+        (unspec:DI [(match_operand 1 "tls_symbolic_operand" "")
                     (match_operand:DI 2 "register_operand" "0")
                     (reg:DI SP_REG)]
                    UNSPEC_TLSDESC))
@@ -12598,7 +12598,7 @@
                       (reg:DI SP_REG)]
                      UNSPEC_TLSDESC)
           (const:DI (unspec:DI
-                     [(match_operand:DI 1 "tls_symbolic_operand" "")]
+                     [(match_operand 1 "tls_symbolic_operand" "")]
                      UNSPEC_DTPOFF))))
    (clobber (reg:CC FLAGS_REG))]
   "TARGET_64BIT && TARGET_GNU2_TLS"
Re: PATCH: PR target/47715: [x32] Use SImode for thread pointer
On Thu, Jul 28, 2011 at 5:11 AM, H.J. Lu hongjiu...@intel.com wrote: In x32, thread pointer is 32bit and choice of segment register for the thread base ptr load should be based on TARGET_64BIT. This patch implements it. OK for trunk? -ENOTESTCASE. Uros.
Re: PING: PATCH [4/n]: Prepare x32: Permute the conversion and addition if one operand is a constant
Hello!

convert_memory_address_addr_space has a special PLUS/MULT case for POINTERS_EXTEND_UNSIGNED < 0. It turns out that it is also needed for all Pmode != ptr_mode cases. OK for trunk?

2011-06-11  H.J. Lu  hongjiu...@intel.com

        PR middle-end/47727
        * explow.c (convert_memory_address_addr_space): Permute the
        conversion and addition if one operand is a constant.

Do we still need this patch? With recent target changes the testcase from PR can be compiled without problems with a gcc from an unpatched trunk.

Uros.
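Why permuting the conversion and the addition is not automatically safe can be shown with a small C model of a 32-bit ptr_mode value zero-extended to a 64-bit Pmode. This is a sketch under stated assumptions, not the code in explow.c: the function names are hypothetical, and only the unsigned (zero-extending) case is modelled. Extending the sum and summing the extended operands agree as long as the 32-bit addition does not wrap, and differ when it does.

```c
#include <assert.h>
#include <stdint.h>

/* Add in 32 bits first, then zero-extend the (possibly wrapped) sum. */
uint64_t extend_sum(uint32_t ptr, uint32_t offset)
{
    return (uint64_t)(uint32_t)(ptr + offset);
}

/* Zero-extend both operands first, then add in 64 bits. */
uint64_t sum_of_extended(uint32_t ptr, uint32_t offset)
{
    return (uint64_t)ptr + (uint64_t)offset;
}

/* The two orders are interchangeable only if they give the same value. */
int permutation_is_safe(uint32_t ptr, uint32_t offset)
{
    return extend_sum(ptr, offset) == sum_of_extended(ptr, offset);
}

void permute_self_check(void)
{
    /* No wrap-around: both orders give 0x1020. */
    assert(permutation_is_safe(0x1000u, 0x20u));

    /* Wrap-around in 32 bits: 0x10 versus 0x100000010. */
    assert(!permutation_is_safe(0xfffffff0u, 0x20u));
    assert(extend_sum(0xfffffff0u, 0x20u) == 0x10ULL);
    assert(sum_of_extended(0xfffffff0u, 0x20u) == 0x100000010ULL);
}
```

The second case is the kind of input a testcase for the proposed change would presumably need to exercise.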
Re: PATCH: PR target/47364: [x32] internal compiler error: in emit_move_insn, at expr.c:3355
On Thu, Jul 28, 2011 at 8:30 AM, Uros Bizjak ubiz...@gmail.com wrote:

We should only expand strlen to Pmode. Otherwise, we got

[hjl@gnu-6 ilp32-38]$ cat x.i
char one[50] = "ijk";
int main (void)
{
  return __builtin_strlen (one) != 3;
}
[hjl@gnu-6 ilp32-38]$ /export/build/gnu/gcc/build-x86_64-linux/gcc/xgcc -B/export/build/gnu/gcc/build-x86_64-linux/gcc/ -S -o x.s -mx32 -O2 x.i
x.i: In function ‘main’:
x.i:5:27: internal compiler error: in emit_move_insn, at expr.c:
Please submit a full bug report, with preprocessed source if appropriate.
See http://gcc.gnu.org/bugs.html for instructions.

OK for trunk?

2011-07-27  H.J. Lu  hongjiu...@intel.com

        PR target/47364
        * config/i386/i386.md (strlen<mode>): Replace SWI48x with P.

OK.

Please also backport this fix to release branches.

Thanks,
Uros.
Re: PING: PATCH [4/n]: Prepare x32: Permute the conversion and addition if one operand is a constant
On Thu, Jul 28, 2011 at 7:59 PM, H.J. Lu hjl.to...@gmail.com wrote:

convert_memory_address_addr_space has a special PLUS/MULT case for POINTERS_EXTEND_UNSIGNED < 0. It turns out that it is also needed for all Pmode != ptr_mode cases. OK for trunk?

2011-06-11  H.J. Lu  hongjiu...@intel.com

        PR middle-end/47727
        * explow.c (convert_memory_address_addr_space): Permute the
        conversion and addition if one operand is a constant.

Do we still need this patch? With recent target changes the testcase from PR can be compiled without problems with a gcc from an unpatched trunk.

Given the communication difficulties, I hope not...

Paolo

Here is the updated patch. OK for trunk?

Did you see the question two levels up the thread you are replying to?

Uros.
Re: PATCH: PR target/47715: [x32] Use SImode for thread pointer
On Thu, Jul 28, 2011 at 8:03 PM, H.J. Lu hjl.to...@gmail.com wrote:

So, instead of huge complications with new mode iterator, just introduce two new patterns that will shadow existing ones for TARGET_X32. Like in attached (untested) patch.

I tried the following patch with typos fixed. It almost worked, except for this failure in glibc testsuite:

gen-locale.sh: line 27: 14755 Aborted (core dumped) I18NPATH=. GCONV_PATH=${common_objpfx}iconvdata ${localedef} --quiet -c -f $charmap -i $input ${common_objpfx}localedata/$out
Charmap: ISO-8859-1 Inputfile: nb_NO Outputdir: nb_NO.ISO-8859-1 failed
make[4]: *** [/export/build/gnu/glibc-x32/build-x86_64-linux/localedata/nb_NO.ISO-8859-1/LC_CTYPE] Error 1

I will add:

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 8723dc5..d32d64d 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -12120,7 +12120,9 @@ get_thread_pointer (bool to_reg)
 {
   rtx tp, reg, insn;
 
-  tp = gen_rtx_UNSPEC (Pmode, gen_rtvec (1, const0_rtx), UNSPEC_TP);
+  tp = gen_rtx_UNSPEC (ptr_mode, gen_rtvec (1, const0_rtx), UNSPEC_TP);
+  if (ptr_mode != Pmode)
+    tp = convert_to_mode (Pmode, tp, 1);
   if (!to_reg)
     return tp;
 
since TP must be 32bit.

No, this won't have the desired effect. It will change the UNSPEC, so it won't match patterns in i386.md. Can you debug the failure a bit more? With my patterns, add{l} and mov{l} should clear top 32bits.

TP is 32bit in x32. For load_tp_x32, we load SImode value and zero-extend to DImode. For add_tp_x32, we are adding SImode value. We can't pretend TP is 64bit. load_tp_x32 and add_tp_x32 must take SImode TP. I will see what I can do.

Here is the updated patch to use 32bit TP for x32.

Why?? This part makes no sense:

-  tp = gen_rtx_UNSPEC (Pmode, gen_rtvec (1, const0_rtx), UNSPEC_TP);
+  tp = gen_rtx_UNSPEC (ptr_mode, gen_rtvec (1, const0_rtx), UNSPEC_TP);
+  if (ptr_mode != Pmode)
+    tp = convert_to_mode (Pmode, tp, 1);

You will create zero_extend (unspec ...), that won't be matched by any pattern.

Can you please explain how this pattern is different from the DImode pattern proposed in my patch?

+(define_insn "*load_tp_x32"
+  [(set (match_operand:SI 0 "register_operand" "=r")
+        (unspec:SI [(const_int 0)] UNSPEC_TP))]
+  "TARGET_X32"
+  "mov{l}\t{%%fs:0, %0|%0, DWORD PTR fs:0}"
+  [(set_attr "type" "imov")
+   (set_attr "modrm" "0")
+   (set_attr "length" "7")
+   (set_attr "memory" "load")
+   (set_attr "imm_disp" "false")])

vs:

+(define_insn "*load_tp_x32"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+        (unspec:DI [(const_int 0)] UNSPEC_TP))]
+  "TARGET_X32"
+  "mov{l}\t{%%fs:0, %k0|%k0, DWORD PTR fs:0}"
+  [(set_attr "type" "imov")
+   (set_attr "modrm" "0")
+   (set_attr "length" "7")
+   (set_attr "memory" "load")
+   (set_attr "imm_disp" "false")])

Uros.
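The remark above that add{l} and mov{l} should clear the top 32 bits rests on an x86-64 rule: a write to a 32-bit register zero-extends into the full 64-bit register, while 16-bit writes leave the upper bits alone. The following C model is only a sketch of that rule with hypothetical function names; it does not model GCC's patterns or the %fs segment access itself.

```c
#include <assert.h>
#include <stdint.h>

/* Model of a 32-bit register write (e.g. movl) on x86-64: the previous
   register contents are discarded and the upper 32 bits become zero. */
uint64_t write_reg32(uint64_t old_reg, uint32_t value)
{
    (void)old_reg;              /* old contents do not survive */
    return (uint64_t)value;
}

/* Model of a 16-bit register write (e.g. movw): only the low 16 bits
   change, the upper 48 bits keep their previous contents. */
uint64_t write_reg16(uint64_t old_reg, uint16_t value)
{
    return (old_reg & ~(uint64_t)0xffff) | (uint64_t)value;
}

void reg_write_self_check(void)
{
    uint64_t stale = 0xdeadbeefcafef00dULL;

    /* A 32-bit thread pointer value loaded with a 32-bit move yields a
       clean, zero-extended 64-bit register value. */
    assert(write_reg32(stale, 0x12345678u) == 0x12345678ULL);

    /* A 16-bit write would keep the stale upper bits instead. */
    assert(write_reg16(stale, 0x1234u) == 0xdeadbeefcafe1234ULL);
}
```

Under this model, a DImode destination written through its 32-bit part already holds the zero-extended value, which appears to be the reasoning behind the DImode variant of the pattern with the %k0 operand modifier.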
Re: PATCH: PR middle-end/49721: convert_memory_address_addr_space may generate invalid new insns
On Thu, Jul 28, 2011 at 8:09 PM, H.J. Lu hjl.to...@gmail.com wrote:

convert_memory_address_addr_space has a special PLUS/MULT case for POINTERS_EXTEND_UNSIGNED < 0. It turns out that it is also needed for all Pmode != ptr_mode cases. OK for trunk?

2011-06-11  H.J. Lu  hongjiu...@intel.com

        PR middle-end/47727
        * explow.c (convert_memory_address_addr_space): Permute the
        conversion and addition if one operand is a constant.

Do we still need this patch? With recent target changes the testcase from PR can be compiled without problems with a gcc from an unpatched trunk.

Given the communication difficulties, I hope not...

Paolo

Here is the updated patch. OK for trunk?

Did you see the question two levels up the thread you are replying to?

The patch is for http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49721

I changed the thread subject.

Please add testcase to see the patch in action.

Uros.
Re: PATCH: PR target/47715: [x32] Use SImode for thread pointer
On Thu, Jul 28, 2011 at 8:30 PM, H.J. Lu hjl.to...@gmail.com wrote:

TP is 32bit in x32. For load_tp_x32, we load SImode value and zero-extend to DImode. For add_tp_x32, we are adding SImode value. We can't pretend TP is 64bit. load_tp_x32 and add_tp_x32 must take SImode TP. I will see what I can do.

Here is the updated patch to use 32bit TP for x32.

Why?? This part makes no sense:

-  tp = gen_rtx_UNSPEC (Pmode, gen_rtvec (1, const0_rtx), UNSPEC_TP);
+  tp = gen_rtx_UNSPEC (ptr_mode, gen_rtvec (1, const0_rtx), UNSPEC_TP);
+  if (ptr_mode != Pmode)
+    tp = convert_to_mode (Pmode, tp, 1);

You will create zero_extend (unspec ...), that won't be matched by any pattern.

No. I created zero_extend from (reg:SI) to (reg:DI).

Can you please explain how this pattern is different from the DImode pattern proposed in my patch?

+(define_insn "*load_tp_x32"
+  [(set (match_operand:SI 0 "register_operand" "=r")
+        (unspec:SI [(const_int 0)] UNSPEC_TP))]
+  "TARGET_X32"
+  "mov{l}\t{%%fs:0, %0|%0, DWORD PTR fs:0}"
+  [(set_attr "type" "imov")
+   (set_attr "modrm" "0")
+   (set_attr "length" "7")
+   (set_attr "memory" "load")
+   (set_attr "imm_disp" "false")])

vs:

+(define_insn "*load_tp_x32"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+        (unspec:DI [(const_int 0)] UNSPEC_TP))]

That is wrong since source (TP) is 32bit. This pattern tells compiler source is 64bit.

Where?

Uros.
Re: PATCH: PR middle-end/49721: convert_memory_address_addr_space may generate invalid new insns
On Thu, Jul 28, 2011 at 8:32 PM, H.J. Lu hjl.to...@gmail.com wrote:

convert_memory_address_addr_space has a special PLUS/MULT case for POINTERS_EXTEND_UNSIGNED < 0. It turns out that it is also needed for all Pmode != ptr_mode cases. OK for trunk?

2011-06-11  H.J. Lu  hongjiu...@intel.com

        PR middle-end/47727
        * explow.c (convert_memory_address_addr_space): Permute the
        conversion and addition if one operand is a constant.

Do we still need this patch? With recent target changes the testcase from PR can be compiled without problems with a gcc from an unpatched trunk.

Given the communication difficulties, I hope not...

Paolo

Here is the updated patch. OK for trunk?

Did you see the question two levels up the thread you are replying to?

The patch is for http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49721

I changed the thread subject.

Please add testcase to see the patch in action.

I haven't found a testcase yet. The problem was discovered in this thread: http://gcc.gnu.org/ml/gcc-patches/2011-07/msg01065.html

This was before x32 could handle SImode addresses. With recent x86 target work, this is no longer true, and SImode and DImode addresses are first-class citizens as far as the x32 backend is concerned. Please note that the original testcase (that this whole patch is all about) now compiles without problems. Also, the middle end is shared with at least two ptr_mode != Pmode targets, and they all work well.

So, to see what makes x32 special, we need a testcase that breaks _WITHOUT_ your proposed patch. Without a testcase, nobody can analyze your approach and tell if the approach is the right one, if this is in fact a target problem, or indeed a middle-end problem. And there is no point in flooding the mailing list with patches.

Uros.
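As background for the "permute the conversion and addition" wording in the ChangeLog entry above, the following standalone C sketch shows why the order matters when addresses are computed in a 32-bit mode and then widened to 64 bits. It is not GCC code; the function names are made up for illustration, and plain unsigned integers stand in for ptr_mode and Pmode values.

```c
#include <stdint.h>

/* Add in the narrow, 32-bit type first, then zero-extend the sum.
   The addition wraps modulo 2^32 before the value is widened.  */
uint64_t add_then_extend(uint32_t base, uint32_t offset)
{
    uint32_t narrow_sum = base + offset;
    return (uint64_t)narrow_sum;
}

/* Zero-extend both operands first, then add in the wide, 64-bit type.
   A carry out of bit 31 is kept in the result.  */
uint64_t extend_then_add(uint32_t base, uint32_t offset)
{
    return (uint64_t)base + (uint64_t)offset;
}
```

The two orders agree as long as the 32-bit addition does not wrap, and differ once it does, which is why swapping conversion and addition is only safe under extra conditions.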
Re: PATCH: PR target/47766: [x32] -fstack-protector doesn't work
On Thu, Jul 28, 2011 at 8:13 PM, H.J. Lu hongjiu...@intel.com wrote: This patch adds x32 support to UNSPEC_SP_XXX patterns. OK for trunk? http://gcc.gnu.org/contribute.html#patches Uros.
Re: PATCH: PR target/47766: [x32] -fstack-protector doesn't work
On Thu, Jul 28, 2011 at 9:03 PM, H.J. Lu hjl.to...@gmail.com wrote:

This patch adds x32 support to UNSPEC_SP_XXX patterns. OK for trunk?

http://gcc.gnu.org/contribute.html#patches

Sorry. I should have mentioned testcase in: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47766

Actually, they are in gcc testsuite. I noticed them when I ran gcc testsuite on x32.

This looks like a middle-end problem to me. According to the documentation:

--quote--
`stack_protect_set'
This pattern, if defined, moves a `Pmode' value from the memory in operand 1 to the memory in operand 0 without leaving the value in a register afterward. This is to avoid leaking the value some place that an attacker might use to rewrite the stack guard slot after having clobbered it. If this pattern is not defined, then a plain move pattern is generated.

`stack_protect_test'
This pattern, if defined, compares a `Pmode' value from the memory in operand 1 with the memory in operand 0 without leaving the value in a register afterward and branches to operand 2 if the values weren't equal. If this pattern is not defined, then a plain compare pattern and conditional branch pattern is used.
--quote--

According to the documentation, x86 patterns are correct. However, middle-end fails to extend ptr_mode value to Pmode, and in function.c, stack_protect_prologue/stack_protect_epilogue, we already have ptr_mode (SImode) operand:

(mem/v/f/c/i:SI (plus:DI (reg/f:DI 54 virtual-stack-vars)
        (const_int -4 [0xfffc])) [2 D.2704+0 S4 A32])

(mem/v/f/c/i:SI (symbol_ref:DI ("__stack_chk_guard") [flags 0x40] <var_decl 0x7ffc35aa0be0 __stack_chk_guard>) [2 __stack_chk_guard+0 S4 A32])

An opinion of a RTL maintainer (CC'd) is needed here. Target definition is OK in its current form.

Uros.
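The documented behaviour of `stack_protect_set' and `stack_protect_test' can be pictured with a toy C model. This is only a sketch under stated assumptions: `guard_t`, `toy_stack_protect_set` and `toy_stack_protect_test` are hypothetical names, the guard is assumed to be 4 bytes wide as in the SImode operands shown above, and a real implementation would be a machine pattern that also avoids leaving the guard in a register, which portable C cannot express.

```c
#include <stdint.h>
#include <string.h>

/* Assumption for illustration: a 4-byte guard, matching the SImode
   (ptr_mode on x32) memory operands quoted in the message above.  */
typedef uint32_t guard_t;

/* Model of stack_protect_set: copy the guard value from its global
   location (operand 1) into the stack slot (operand 0).  */
void toy_stack_protect_set(void *slot, const void *guard)
{
    memcpy(slot, guard, sizeof(guard_t));
}

/* Model of stack_protect_test: compare the stack slot against the
   guard.  Returns nonzero when they differ, which is the case where
   the real pattern would branch to operand 2.  */
int toy_stack_protect_test(const void *slot, const void *guard)
{
    return memcmp(slot, guard, sizeof(guard_t)) != 0;
}
```

The width of the copy and the comparison is the point of contention in the thread: the documentation says Pmode, while the operands handed over by the middle end are ptr_mode.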
Re: PATCH: PR target/47715: [x32] Use SImode for thread pointer
On Thu, Jul 28, 2011 at 10:15 PM, H.J. Lu hjl.to...@gmail.com wrote:

TP is 32bit in x32. For load_tp_x32, we load SImode value and zero-extend to DImode. For add_tp_x32, we are adding SImode value. We can't pretend TP is 64bit. load_tp_x32 and add_tp_x32 must take SImode TP.

Here is the revised patch. The difference is I changed *add_tp_x32 to SImode. For

---
extern __thread int __libc_errno __attribute__ ((tls_model ("initial-exec")));

int *
__errno_location (void)
{
  return &__libc_errno;
}
---

compiled with -mx32 -O2 -fPIC, DImode *add_tp_x32 generates:

        movq    __libc_errno@gottpoff(%rip), %rax
        addl    %fs:0, %eax
        mov     %eax, %eax
        ret

SImode *add_tp_x32 generates:

        movl    %fs:0, %eax
        addl    __libc_errno@gottpoff(%rip), %eax
        ret

This happens because combine can't combine DImode load and SImode plus RTXes.

These RTXes have to be in Pmode, see the intention in legitimize_tls_address, also for TARGET_GNU2_TLS. Can you please debug what goes wrong with add_tp_x32 in DImode?

Uros.
Re: PATCH: PR target/47715: [x32] Use SImode for thread pointer
On Fri, Jul 29, 2011 at 12:28 AM, H.J. Lu hjl.to...@gmail.com wrote:

TP is 32bit in x32. For load_tp_x32, we load SImode value and zero-extend to DImode. For add_tp_x32, we are adding SImode value. We can't pretend TP is 64bit. load_tp_x32 and add_tp_x32 must take SImode TP.

Here is the revised patch. The difference is I changed *add_tp_x32 to SImode. For

---
extern __thread int __libc_errno __attribute__ ((tls_model ("initial-exec")));

int *
__errno_location (void)
{
  return &__libc_errno;
}
---

compiled with -mx32 -O2 -fPIC, DImode *add_tp_x32 generates:

        movq    __libc_errno@gottpoff(%rip), %rax
        addl    %fs:0, %eax
        mov     %eax, %eax
        ret

SImode *add_tp_x32 generates:

        movl    %fs:0, %eax
        addl    __libc_errno@gottpoff(%rip), %eax
        ret

This happens because combine can't combine DImode load and SImode plus RTXes.

These RTXes have to be in Pmode, see the intention in legitimize_tls_address, also for TARGET_GNU2_TLS. Can you please debug what goes wrong with add_tp_x32 in DImode?

We start with

Uh, we didn't understand each other... can you please debug what goes wrong with glibc runtime test?

Thanks,
Uros.
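For readers who want to reproduce the shape of the `__errno_location' example outside glibc, here is a minimal self-contained variant. The names `toy_errno` and `toy_errno_location` are hypothetical, the `tls_model` attribute is left out so the compiler picks a model itself, and `__thread` is assumed to be available as a GCC or Clang extension.

```c
/* A thread-local integer standing in for __libc_errno.  Each thread
   gets its own copy, addressed relative to the thread pointer.  */
static __thread int toy_errno;

/* Return the address of the calling thread's copy.  On x86-64 and
   x32 the generated code adds a TLS offset to the thread pointer
   read through %fs, which is the sequence discussed in the thread.  */
int *toy_errno_location(void)
{
    return &toy_errno;
}
```

Compiling a file like this with `-O2 -S` (plus `-mx32` on a toolchain that supports it) is one way to inspect which load and add instructions are emitted for the thread pointer.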
Re: PATCH: PR target/47715: [x32] Use SImode for thread pointer
On Thu, Jul 28, 2011 at 3:24 PM, H.J. Lu hjl.to...@gmail.com wrote:

In x32, thread pointer is 32bit and choice of segment register for the thread base ptr load should be based on TARGET_64BIT. This patch implements it. OK for trunk?

-ENOTESTCASE.

There is no standalone testcase. The symptom is in glibc build, I got

CPP='/export/build/gnu/gcc-x32/release/usr/gcc-4.7.0-x32/bin/gcc -mx32 -E -x c-header' /export/build/gnu/glibc-x32/build-x86_64-linux/elf/ld-linux-x32.so.2 --library-path /export/build/gnu/glibc-x32/build-x86_64-linux:/export/build/gnu/glibc-x32/build-x86_64-linux/math:/export/build/gnu/glibc-x32/build-x86_64-linux/elf:/export/build/gnu/glibc-x32/build-x86_64-linux/dlfcn:/export/build/gnu/glibc-x32/build-x86_64-linux/nss:/export/build/gnu/glibc-x32/build-x86_64-linux/nis:/export/build/gnu/glibc-x32/build-x86_64-linux/rt:/export/build/gnu/glibc-x32/build-x86_64-linux/resolv:/export/build/gnu/glibc-x32/build-x86_64-linux/crypt:/export/build/gnu/glibc-x32/build-x86_64-linux/nptl /export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/rpcgen -Y ../scripts -h rpcsvc/yppasswd.x -o /export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/rpcsvc/yppasswd.T
make[5]: *** [/export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/xbootparam_prot.stmp] Segmentation fault
make[5]: *** Waiting for unfinished jobs
make[5]: *** [/export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/xrstat.stmp] Segmentation fault
make[5]: *** [/export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/xyppasswd.stmp] Segmentation fault

since thread pointer is 32bit in x32. If we load thread pointer (fs segment register) in x32 with 64bit load, the upper 32bits are garbage. We must load 32bit

So, instead of huge complications with new mode iterator, just introduce two new patterns that will shadow existing ones for TARGET_X32. Like in attached (untested) patch.

Uros.
Index: i386.md
===================================================================
--- i386.md     (revision 176860)
+++ i386.md     (working copy)
@@ -12442,6 +12442,17 @@
 (define_mode_attr tp_seg [(SI "gs") (DI "fs")])
 
 ;; Load and add the thread base pointer from %<tp_seg>:0.
+(define_insn "*load_tp_x32"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+        (unspec:DI [(const_int 0)] UNSPEC_TP))]
+  "TARGET_X32"
+  "mov{l}\t{%%<tp_seg>:0, %k0|%k0, DWORD PTR <tp_seg>:0}"
+  [(set_attr "type" "imov")
+   (set_attr "modrm" "0")
+   (set_attr "length" "7")
+   (set_attr "memory" "load")
+   (set_attr "imm_disp" "false")])
+
 (define_insn "*load_tp_<mode>"
   [(set (match_operand:P 0 "register_operand" "=r")
         (unspec:P [(const_int 0)] UNSPEC_TP))]
@@ -12453,6 +12464,19 @@
    (set_attr "memory" "load")
    (set_attr "imm_disp" "false")])
 
+(define_insn "*add_tp_x32"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+        (plus:DI (unspec:DI [(const_int 0)] UNSPEC_TP)
+                 (match_operand:DI 1 "register_operand" "0")))
+   (clobber (reg:CC FLAGS_REG))]
+  "TARGET_X32"
+  "add{l}\t{%%<tp_seg>:0, %k0|%k0, DWORD PTR <tp_seg>:0}"
+  [(set_attr "type" "alu")
+   (set_attr "modrm" "0")
+   (set_attr "length" "7")
+   (set_attr "memory" "load")
+   (set_attr "imm_disp" "false")])
+
 (define_insn "*add_tp_<mode>"
   [(set (match_operand:P 0 "register_operand" "=r")
         (plus:P (unspec:P [(const_int 0)] UNSPEC_TP)
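The "upper 32bits are garbage" symptom described above can be imitated in portable C. The sketch below is not the compiler's code: the helper names are made up, and a byte buffer stands in for the memory at %fs:0, with the 32-bit thread pointer followed by four unrelated bytes.

```c
#include <stdint.h>
#include <string.h>

/* Models a 64-bit load (mov{q}) from a location that only holds a
   32-bit value: the four bytes that follow are read as well.  */
uint64_t toy_load64(const unsigned char *p)
{
    uint64_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Models a 32-bit load (mov{l}) into a 64-bit register: only four
   bytes are read and the upper half of the result is zero.  */
uint64_t toy_load32_zext(const unsigned char *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return (uint64_t)v;
}
```

With a buffer whose first four bytes hold the pointer and whose next four bytes hold anything nonzero, `toy_load64` returns a value that is not a valid 32-bit address, while `toy_load32_zext` returns the pointer itself.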
PATCH, v2: PR target/47715: [x32] Use SImode for thread pointer
Hello!

ABI specifies that TP is loaded in ptr_mode. Attached patch implements this requirement.

2011-07-29  Uros Bizjak  ubiz...@gmail.com

        * config/i386/i386.md (*load_tp_x32): New.
        (*load_tp_x32_zext): Ditto.
        (*add_tp_x32): Ditto.
        (*add_tp_x32_zext): Ditto.
        (*load_tp_<mode>): Disable for !TARGET_X32 targets.
        (*add_tp_<mode>): Ditto.
        * config/i386/i386.c (get_thread_pointer): Load thread pointer in
        ptr_mode and convert to Pmode if needed.

Testing on x86_64-pc-linux-gnu in progress. H.J., please test this version on x32.

Uros.

Index: i386.md
===================================================================
--- i386.md     (revision 176915)
+++ i386.md     (working copy)
@@ -12444,10 +12444,32 @@
 (define_mode_attr tp_seg [(SI "gs") (DI "fs")])
 
 ;; Load and add the thread base pointer from %<tp_seg>:0.
+(define_insn "*load_tp_x32"
+  [(set (match_operand:SI 0 "register_operand" "=r")
+        (unspec:SI [(const_int 0)] UNSPEC_TP))]
+  "TARGET_X32"
+  "mov{l}\t{%%fs:0, %0|%0, DWORD PTR fs:0}"
+  [(set_attr "type" "imov")
+   (set_attr "modrm" "0")
+   (set_attr "length" "7")
+   (set_attr "memory" "load")
+   (set_attr "imm_disp" "false")])
+
+(define_insn "*load_tp_x32_zext"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+        (zero_extend:DI (unspec:SI [(const_int 0)] UNSPEC_TP)))]
+  "TARGET_X32"
+  "mov{l}\t{%%fs:0, %k0|%k0, DWORD PTR fs:0}"
+  [(set_attr "type" "imov")
+   (set_attr "modrm" "0")
+   (set_attr "length" "7")
+   (set_attr "memory" "load")
+   (set_attr "imm_disp" "false")])
+
 (define_insn "*load_tp_<mode>"
   [(set (match_operand:P 0 "register_operand" "=r")
         (unspec:P [(const_int 0)] UNSPEC_TP))]
-  ""
+  "!TARGET_X32"
   "mov{<imodesuffix>}\t{%%<tp_seg>:0, %0|%0, <iptrsize> PTR <tp_seg>:0}"
   [(set_attr "type" "imov")
    (set_attr "modrm" "0")
@@ -12455,12 +12477,39 @@
    (set_attr "memory" "load")
    (set_attr "imm_disp" "false")])
 
+(define_insn "*add_tp_x32"
+  [(set (match_operand:SI 0 "register_operand" "=r")
+        (plus:SI (unspec:SI [(const_int 0)] UNSPEC_TP)
+                 (match_operand:SI 1 "register_operand" "0")))
+   (clobber (reg:CC FLAGS_REG))]
+  "TARGET_X32"
+  "add{l}\t{%%fs:0, %0|%0, DWORD PTR fs:0}"
+  [(set_attr "type" "alu")
+   (set_attr "modrm" "0")
+   (set_attr "length" "7")
+   (set_attr "memory" "load")
+   (set_attr "imm_disp" "false")])
+
+(define_insn "*add_tp_x32_zext"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+        (zero_extend:DI
+          (plus:SI (unspec:SI [(const_int 0)] UNSPEC_TP)
+                   (match_operand:SI 1 "register_operand" "0"))))
+   (clobber (reg:CC FLAGS_REG))]
+  "TARGET_X32"
+  "add{l}\t{%%fs:0, %k0|%k0, DWORD PTR fs:0}"
+  [(set_attr "type" "alu")
+   (set_attr "modrm" "0")
+   (set_attr "length" "7")
+   (set_attr "memory" "load")
+   (set_attr "imm_disp" "false")])
+
 (define_insn "*add_tp_<mode>"
   [(set (match_operand:P 0 "register_operand" "=r")
         (plus:P (unspec:P [(const_int 0)] UNSPEC_TP)
                 (match_operand:P 1 "register_operand" "0")))
    (clobber (reg:CC FLAGS_REG))]
-  ""
+  "!TARGET_X32"
   "add{<imodesuffix>}\t{%%<tp_seg>:0, %0|%0, <iptrsize> PTR <tp_seg>:0}"
   [(set_attr "type" "alu")
    (set_attr "modrm" "0")
Index: i386.c
===================================================================
--- i386.c      (revision 176915)
+++ i386.c      (working copy)
@@ -12118,17 +12118,15 @@ legitimize_pic_address (rtx orig, rtx re
 static rtx
 get_thread_pointer (bool to_reg)
 {
-  rtx tp, reg, insn;
+  rtx tp = gen_rtx_UNSPEC (ptr_mode, gen_rtvec (1, const0_rtx), UNSPEC_TP);
 
-  tp = gen_rtx_UNSPEC (Pmode, gen_rtvec (1, const0_rtx), UNSPEC_TP);
-  if (!to_reg)
-    return tp;
+  if (GET_MODE (tp) != Pmode)
+    tp = convert_to_mode (Pmode, tp, 1);
 
-  reg = gen_reg_rtx (Pmode);
-  insn = gen_rtx_SET (VOIDmode, reg, tp);
-  insn = emit_insn (insn);
+  if (to_reg)
+    tp = copy_addr_to_reg (tp);
 
-  return reg;
+  return tp;
 }
 
 /* Construct the SYMBOL_REF for the tls_get_addr function.  */
Re: PATCH: PR target/47715: [x32] TLS doesn't work
On Thu, Jul 28, 2011 at 3:47 PM, H.J. Lu hjl.to...@gmail.com wrote:

TLS on X32 is almost identical to TLS on x86-64. The only difference is x32 address space is 32bit. That means TLS symbols can be in either SImode or DImode with upper 32bit zero. This patch updates tls_global_dynamic_64 to support x32. OK for trunk?

Please also change 64bit GNU2_TLS patterns, so -mtls-dialect=gnu2 will also work. Please see attached patch.

Yes, it works. Can you apply it?

This is what I have committed:

2011-07-28  Uros Bizjak  ubiz...@gmail.com

        PR target/47715
        * config/i386/i386.md (*tls_global_dynamic_64): Remove mode from
        tls_symbolic_operand check.  Update code sequence for TARGET_X32.
        (tls_global_dynamic_64): Remove mode from tls_symbolic_operand check.
        (tls_dynamic_gnu2_64): Ditto.
        (*tls_dynamic_gnu2_lea_64): Ditto.
        (*tls_dynamic_gnu2_call_64): Ditto.
        (*tls_dynamic_gnu2_combine_64): Ditto.

Uros.

Index: i386.md
===================================================================
--- i386.md     (revision 176870)
+++ i386.md     (working copy)
@@ -12327,11 +12327,12 @@
         (call:DI
          (mem:QI (match_operand:DI 2 "constant_call_address_operand" "z"))
          (match_operand:DI 3 "" "")))
-   (unspec:DI [(match_operand:DI 1 "tls_symbolic_operand" "")]
+   (unspec:DI [(match_operand 1 "tls_symbolic_operand" "")]
               UNSPEC_TLS_GD)]
   "TARGET_64BIT"
 {
-  fputs (ASM_BYTE "0x66\n", asm_out_file);
+  if (!TARGET_X32)
+    fputs (ASM_BYTE "0x66\n", asm_out_file);
   output_asm_insn
     ("lea{q}\t{%a1@tlsgd(%%rip), %%rdi|rdi, %a1@tlsgd[rip]}", operands);
   fputs (ASM_SHORT "0x\n", asm_out_file);
@@ -12349,7 +12350,7 @@
            (call:DI
              (mem:QI (match_operand:DI 2 "constant_call_address_operand" ""))
              (const_int 0)))
-      (unspec:DI [(match_operand:DI 1 "tls_symbolic_operand" "")]
+      (unspec:DI [(match_operand 1 "tls_symbolic_operand" "")]
                  UNSPEC_TLS_GD)])])
 
 (define_insn "*tls_local_dynamic_base_32_gnu"
@@ -12553,7 +12554,7 @@
 
 (define_expand "tls_dynamic_gnu2_64"
   [(set (match_dup 2)
-        (unspec:DI [(match_operand:DI 1 "tls_symbolic_operand" "")]
+        (unspec:DI [(match_operand 1 "tls_symbolic_operand" "")]
                    UNSPEC_TLSDESC))
    (parallel
     [(set (match_operand:DI 0 "register_operand" "")
@@ -12568,7 +12569,7 @@
 
 (define_insn "*tls_dynamic_lea_64"
   [(set (match_operand:DI 0 "register_operand" "=r")
-        (unspec:DI [(match_operand:DI 1 "tls_symbolic_operand" "")]
+        (unspec:DI [(match_operand 1 "tls_symbolic_operand" "")]
                    UNSPEC_TLSDESC))]
   "TARGET_64BIT && TARGET_GNU2_TLS"
   "lea{q}\t{%a1@TLSDESC(%%rip), %0|%0, %a1@TLSDESC[rip]}"
@@ -12579,7 +12580,7 @@
 
 (define_insn "*tls_dynamic_call_64"
   [(set (match_operand:DI 0 "register_operand" "=a")
-        (unspec:DI [(match_operand:DI 1 "tls_symbolic_operand" "")
+        (unspec:DI [(match_operand 1 "tls_symbolic_operand" "")
                     (match_operand:DI 2 "register_operand" "0")
                     (reg:DI SP_REG)]
                    UNSPEC_TLSDESC))
@@ -12598,7 +12599,7 @@
                      (reg:DI SP_REG)]
                     UNSPEC_TLSDESC)
          (const:DI (unspec:DI
-                    [(match_operand:DI 1 "tls_symbolic_operand" "")]
+                    [(match_operand 1 "tls_symbolic_operand" "")]
                     UNSPEC_DTPOFF))))
    (clobber (reg:CC FLAGS_REG))]
   "TARGET_64BIT && TARGET_GNU2_TLS"
[PATCH, i386]: Re-define pic_32bit_operand back to define_predicate
Hello!

With recent developments, there is no longer any need for pic_32bit_operand to be defined as a special predicate with explicit mode checks. Implicit mode checks (including VOIDmode bypass) of normal predicates work OK now.

2011-07-28  Uros Bizjak  ubiz...@gmail.com

        * config/i386/predicates.md (pic_32bit_operand): Do not define as
        special predicate.  Remove explicit mode checks.

Tested on x86_64-pc-linux-gnu {,-m32}. There is a remote chance this patch breaks x32, so let's alert H.J.

Committed to mainline SVN.

Uros.

Index: predicates.md
===================================================================
--- predicates.md       (revision 176870)
+++ predicates.md       (working copy)
@@ -366,15 +366,12 @@
 
 ;; Return true when operand is PIC expression that can be computed by lea
 ;; operation.
-(define_special_predicate "pic_32bit_operand"
+(define_predicate "pic_32bit_operand"
   (match_code "const,symbol_ref,label_ref")
 {
-  if (GET_MODE (op) != SImode
-      && GET_MODE (op) != DImode)
-    return false;
-
   if (!flag_pic)
     return false;
+
   /* Rule out relocations that translate into 64bit constants.  */
   if (TARGET_64BIT && GET_CODE (op) == CONST)
     {
@@ -386,6 +383,7 @@
               || XINT (op, 1) == UNSPEC_GOT))
         return false;
     }
+
   return symbolic_operand (op, mode);
 })
 
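As a rough picture of the "implicit mode checks (including VOIDmode bypass)" mentioned above, the following toy C model mirrors what the GCC internals manual says a `define_predicate' adds automatically: the requested mode is VOIDmode, or the operand has that mode, or the operand is a constant integer. This is a simplified sketch, not GCC source; all `toy_` names and the reduced enums are invented for illustration, and CONST_DOUBLE is left out.

```c
#include <stdbool.h>

/* Reduced stand-ins for machine modes and RTX codes.  */
enum toy_mode { TOY_VOIDmode, TOY_SImode, TOY_DImode };
enum toy_code { TOY_SYMBOL_REF, TOY_CONST_INT };

struct toy_rtx {
    enum toy_code code;
    enum toy_mode mode;
};

/* Mode test a normal predicate gets for free.  A special predicate
   gets no such test and has to check modes by hand, as the removed
   GET_MODE (op) != SImode && GET_MODE (op) != DImode code did.  */
bool toy_implicit_mode_ok(const struct toy_rtx *op, enum toy_mode mode)
{
    return mode == TOY_VOIDmode
           || op->mode == mode
           || op->code == TOY_CONST_INT;
}
```

In this model a `match_operand' written without a mode passes TOY_VOIDmode and therefore accepts both SImode and DImode symbols, which is the behaviour the x32 work relies on.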
[PATCH, i386]: Remove tp_or_register_operand predicate
Hello!

tp_or_register_operand predicate is not used.

2011-07-29  Uros Bizjak  ubiz...@gmail.com

        * config/i386/predicates.md (tp_or_register_operand): Remove predicate.

Tested on x86_64-pc-linux-gnu, committed to mainline SVN.

Uros.

Index: predicates.md
===================================================================
--- predicates.md       (revision 176924)
+++ predicates.md       (working copy)
@@ -490,11 +490,6 @@
   (and (match_code "symbol_ref")
        (match_test "op == ix86_tls_module_base ()")))
 
-(define_predicate "tp_or_register_operand"
-  (ior (match_operand 0 "register_operand")
-       (and (match_code "unspec")
-            (match_test "XINT (op, 1) == UNSPEC_TP"))))
-
 ;; Test for a pc-relative call operand
 (define_predicate "constant_call_address_operand"
   (match_code "symbol_ref")
Re: PATCH: [x32]: Check TARGET_LP64 for SIZE_TYPE/PTRDIFF_TYPE
On Sat, Jul 30, 2011 at 12:41 AM, H.J. Lu hongjiu...@intel.com wrote: X32 is 32bit. This patch checks TARGET_LP64 for SIZE_TYPE/PTRDIFF_TYPE. OK for trunk? OK, if tested on x32. You didn't say how the patch was tested. Thanks, Uros.
[PATCH, testsuite]: Remove .exe.???t.* and .exe.ltrans0.???t.* files from testsuite dir
Hello!

2011-07-31  Uros Bizjak  ubiz...@gmail.com

        * lib/gcc-dg.exp (cleanup-dump): Also remove .exe. and
        .exe.ltrans0. dump files.

Tested on x86_64-pc-linux-gnu. OK for mainline?

Uros.

Index: lib/gcc-dg.exp
===================================================================
--- lib/gcc-dg.exp      (revision 176960)
+++ lib/gcc-dg.exp      (working copy)
@@ -487,6 +487,8 @@
     # The name might include a list of options; extract the file name.
     set src [file tail [lindex $testcase 0]]
     remove-build-file "[file tail $src].$suffix"
+    remove-build-file "[file rootname [file tail $src]].exe.$suffix"
+    remove-build-file "[file rootname [file tail $src]].exe.ltrans0.$suffix"
     # -fcompare-debug dumps
     remove-build-file "[file tail $src].gk.$suffix"
 
@@ -494,6 +496,8 @@
     if [info exists additional_sources] {
         foreach srcfile $additional_sources {
             remove-build-file "[file tail $srcfile].$suffix"
+            remove-build-file "[file rootname [file tail $srcfile]].exe.$suffix"
+            remove-build-file "[file rootname [file tail $srcfile]].exe.ltrans0.$suffix"
             # -fcompare-debug dumps
             remove-build-file "[file tail $srcfile].gk.$suffix"
         }
Re: [PATCH, testsuite]: Remove .exe.???t.* and .exe.ltrans0.???t.* files from testsuite dir
On Sun, Jul 31, 2011 at 11:39 AM, Richard Guenther richard.guent...@gmail.com wrote:

2011-07-31  Uros Bizjak  ubiz...@gmail.com

        * lib/gcc-dg.exp (cleanup-dump): Also remove .exe. and
        .exe.ltrans0. dump files.

Tested on x86_64-pc-linux-gnu. OK for mainline?

I think you need to remove all .exe.ltrans[0-9]*. files instead.

Thanks, attached is what I have committed.

2011-07-31  Uros Bizjak  ubiz...@gmail.com

        * lib/gcc-dg.exp (cleanup-dump): Also remove .exe. and
        .exe.ltrans[0-9]*. dump files.

Uros.

Index: lib/gcc-dg.exp
===================================================================
--- lib/gcc-dg.exp      (revision 176960)
+++ lib/gcc-dg.exp      (working copy)
@@ -487,6 +487,8 @@
     # The name might include a list of options; extract the file name.
     set src [file tail [lindex $testcase 0]]
     remove-build-file "[file tail $src].$suffix"
+    remove-build-file "[file rootname [file tail $src]].exe.$suffix"
+    remove-build-file "[file rootname [file tail $src]].exe.ltrans\[0-9\]*.$suffix"
     # -fcompare-debug dumps
     remove-build-file "[file tail $src].gk.$suffix"
 
@@ -494,6 +496,8 @@
     if [info exists additional_sources] {
         foreach srcfile $additional_sources {
             remove-build-file "[file tail $srcfile].$suffix"
+            remove-build-file "[file rootname [file tail $srcfile]].exe.$suffix"
+            remove-build-file "[file rootname [file tail $srcfile]].exe.ltrans\[0-9\]*.$suffix"
             # -fcompare-debug dumps
             remove-build-file "[file tail $srcfile].gk.$suffix"
         }
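To check what a pattern such as `.exe.ltrans[0-9]*.` covers, the small C helper below uses POSIX `fnmatch`. It assumes the testsuite pattern is interpreted with ordinary shell-style glob rules; the wrapper name `toy_glob_match` and the sample file names in the note that follows are made up for illustration.

```c
#include <fnmatch.h>
#include <stdbool.h>

/* Return true when NAME matches the shell-style glob PATTERN.
   In a glob, [0-9] matches exactly one digit and * matches any
   run of characters, including an empty one.  */
bool toy_glob_match(const char *pattern, const char *name)
{
    return fnmatch(pattern, name, 0) == 0;
}
```

With the pattern `t.exe.ltrans[0-9]*.123t.lim1`, both `t.exe.ltrans0.123t.lim1` and `t.exe.ltrans12.123t.lim1` match, whereas the literal `ltrans0` form from the first version of the patch would only cover the first partition. Note that `[0-9]*` is a glob, not a regular expression, so it means "one digit followed by anything" rather than "zero or more digits".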
[PATCH, testsuite]: Prevent stale dump files in testsuite directory
Hello!

2011-07-31  Uros Bizjak  ubiz...@gmail.com

        * gcc.dg/tree-ssa/20050314-1.c: Dump and cleanup lim1 pass only.
        * gcc.dg/tree-ssa/pr23109.c: Ditto.
        * gcc.dg/tree-ssa/loop-7.c: Ditto.
        * gcc.dg/tree-ssa/loop-32.c: Ditto.
        * gcc.dg/tree-ssa/loop-33.c: Ditto.
        * gcc.dg/tree-ssa/loop-34.c: Ditto.
        * gcc.dg/tree-ssa/loop-35.c: Ditto.
        * gcc.dg/tree-ssa/restrict-3.c: Ditto.
        * gcc.dg/tree-ssa/ssa-lim-2.c: Ditto.
        * gcc.dg/tree-ssa/ssa-lim-1.c: Ditto.
        * gcc.dg/tree-ssa/ssa-lim-3.c: Ditto.
        * gcc.dg/tree-ssa/ssa-lim-6.c: Ditto.
        * gcc.dg/tree-ssa/structopt-1.c: Ditto.
        * g++.dg/tree-ssa/pr33615.C: Ditto.
        * g++.dg/tree-ssa/restrict1.C: Ditto.
        * c-c++-common/restrict-2.c: Ditto.
        * gfortran.dg/pr32921.f: Ditto.
        * gcc.dg/tree-ssa/ssa-dse-10.c: Dump and cleanup dse1 pass only.
        * gcc.dg/fold-compare-2.c: Dump and cleanup vrp1 pass only.
        * gcc.dg/tree-ssa/vrp47.c: Ditto.
        * gcc.dg/tree-ssa/pr25501.c: Dump and cleanup mergephi1 pass only.
        * gcc.dg/tree-ssa/pr15349.c: Dump and cleanup mergephi2 pass only.
        * gcc.dg/tree-ssa/tailrecursion-1.c: Dump and cleanup tailr1 pass only.
        * gcc.dg/tree-ssa/tailrecursion-2.c: Ditto.
        * gcc.dg/tree-ssa/tailrecursion-3.c: Ditto.
        * gcc.dg/tree-ssa/tailrecursion-4.c: Ditto.
        * gcc.dg/tree-ssa/tailrecursion-6.c: Ditto.

Tested on x86_64-pc-linux-gnu, committed to mainline SVN.

Uros.

Index: gfortran.dg/pr32921.f
===================================================================
--- gfortran.dg/pr32921.f       (revision 176960)
+++ gfortran.dg/pr32921.f       (working copy)
@@ -1,5 +1,5 @@
 ! { dg-do compile }
-! { dg-options "-O2 -fdump-tree-lim" }
+! { dg-options "-O2 -fdump-tree-lim1" }
 ! gfortran -c -m32 -O2 -S junk.f
 !
       MODULE LES3D_DATA
@@ -46,5 +46,5 @@
       RETURN
       END
 ! { dg-final { scan-tree-dump-times "stride" 4 "lim1" } }
-! { dg-final { cleanup-tree-dump "lim\[1-2\]" } }
+! { dg-final { cleanup-tree-dump "lim1" } }
 ! { dg-final { cleanup-modules "LES3D_DATA" } }
Index: gcc.dg/fold-compare-2.c
===================================================================
--- gcc.dg/fold-compare-2.c     (revision 176960)
+++ gcc.dg/fold-compare-2.c     (working copy)
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-vrp" } */
+/* { dg-options "-O2 -fdump-tree-vrp1" } */
 
 extern void abort (void);
 
@@ -16,5 +16,5 @@
 }
 
 /* { dg-final { scan-tree-dump-times "Removing basic block" 2 "vrp1" } } */
-/* { dg-final { cleanup-tree-dump "vrp\[1-2\]" } } */
+/* { dg-final { cleanup-tree-dump "vrp1" } } */
 
Index: gcc.dg/tree-ssa/vrp47.c
===================================================================
--- gcc.dg/tree-ssa/vrp47.c     (revision 176960)
+++ gcc.dg/tree-ssa/vrp47.c     (working copy)
@@ -4,8 +4,8 @@
    jumps when evaluating an && condition.  VRP is not able to optimize
    this.  */
 /* { dg-do compile { target { ! "mips*-*-* s390*-*-* avr-*-* mn10300-*-*" } } } */
-/* { dg-options "-O2 -fdump-tree-vrp -fdump-tree-dom" } */
-/* { dg-options "-O2 -fdump-tree-vrp -fdump-tree-dom -march=i586" { target { i?86-*-* && ilp32 } } } */
+/* { dg-options "-O2 -fdump-tree-vrp1 -fdump-tree-dom1" } */
+/* { dg-options "-O2 -fdump-tree-vrp1 -fdump-tree-dom1 -march=i586" { target { i?86-*-* && ilp32 } } } */
 
 int h(int x, int y)
 {
@@ -44,5 +44,5 @@
 
 /* { dg-final { scan-tree-dump-times "x\[^ \]* \[|\] y" 1 "vrp1" } } */
 /* { dg-final { scan-tree-dump-times "x\[^ \]* \\^ 1" 1 "vrp1" } } */
-/* { dg-final { cleanup-tree-dump "vrp\[0-9\]" } } */
-/* { dg-final { cleanup-tree-dump "dom\[0-9\]" } } */
+/* { dg-final { cleanup-tree-dump "vrp1" } } */
+/* { dg-final { cleanup-tree-dump "dom1" } } */
Index: gcc.dg/tree-ssa/pr15349.c
===================================================================
--- gcc.dg/tree-ssa/pr15349.c   (revision 176960)
+++ gcc.dg/tree-ssa/pr15349.c   (working copy)
@@ -1,6 +1,6 @@
 /* PR 15349.  Merge two PHI nodes.  */
 /* { dg-do compile } */
-/* { dg-options "-O1 -fdump-tree-mergephi" } */
+/* { dg-options "-O1 -fdump-tree-mergephi2" } */
 
 int
 foo (int a, int b)
@@ -23,4 +23,4 @@
 }
 
 /* { dg-final { scan-tree-dump-times "PHI" 1 "mergephi2"} } */
-/* { dg-final { cleanup-tree-dump "mergephi\[1-2\]" } } */
+/* { dg-final { cleanup-tree-dump "mergephi2" } } */
Index: gcc.dg/tree-ssa/loop-32.c
===================================================================
--- gcc.dg/tree-ssa/loop-32.c   (revision 176960)
+++ gcc.dg/tree-ssa/loop-32.c   (working copy)
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-lim-details" } */
+/* { dg-options "-O2 -fdump-tree-lim1-details" } */
 
 int x;
 int a[100];
@@ -43,4 +43,4 @@
 }
 
 /* { dg-final { scan-tree-dump-times "Executing store motion of" 3 "lim1" } } */
-/* { dg-final { cleanup-tree-dump "lim\[1-2\]" } } */
+/* { dg-final { cleanup-tree-dump "lim1" } } */
Index: gcc.dg/tree-ssa/ssa-lim-1.c
[PATCH, i386]: Fix PR49920, unable to find a register to spill in class ‘DIREG’
Hello!

The problem is similar to PR11001, where we should not expand to a special x86 stringop insn when one of the necessary registers is marked fixed. In this particular PR, the problem was that combine synthesized an instruction that exactly matched a stringop insn. However, the special registers were also marked fixed, so reload (obviously) didn't manage to get one.

Attached patch disables stringop patterns when one of the needed registers is marked fixed and this way prevents combine from synthesizing a stringop insn. Since nothing prevents combine from synthesizing other stringop patterns, the patch conditionally disables these as well.

2011-07-31  Uros Bizjak  ubiz...@gmail.com

        PR target/49920
        * config/i386/i386.md (strset): Do not expand strset_singleop
        when %eax or %edi are fixed.
        (*strsetdi_rex_1): Disable when %eax or %edi are fixed.
        (*strsetsi_1): Ditto.
        (*strsethi_1): Ditto.
        (*strsetqi_1): Ditto.
        (*rep_stosdi_rex64): Disable when %eax, %ecx or %edi are fixed.
        (*rep_stossi): Ditto.
        (*rep_stosqi): Ditto.
        (cmpstrnsi): Also fail when %ecx is fixed.
        (*cmpstrnqi_nz_1): Disable when %ecx, %esi or %edi are fixed.
        (*cmpstrnqi_1): Ditto.
        (*strlenqi_1): Ditto.
        (*strmovdi_rex_1): Disable when %esi or %edi are fixed.
        (*strmovsi_1): Ditto.
        (*strmovhi_1): Ditto.
        (*strmovqi_1): Ditto.
        (*rep_movdi_rex64): Disable when %ecx, %esi or %edi are fixed.
        (*rep_movsi): Ditto.
        (*rep_movqi): Ditto.

testsuite/ChangeLog:

2011-07-31  Uros Bizjak  ubiz...@gmail.com

        PR target/49920
        * gcc.target/i386/pr49920.c: New test.

Patch was bootstrapped and regression tested on x86_64-pc-linux-gnu {,-m32}. Patch was committed to mainline SVN and will be backported to release branches.

Uros.
Index: config/i386/i386.md
===================================================================
--- config/i386/i386.md (revision 176960)
+++ config/i386/i386.md (working copy)
@@ -15421,7 +15421,8 @@
    (set (match_operand:DI 1 "register_operand" "=S")
         (plus:DI (match_dup 3)
                  (const_int 8)))]
-  "TARGET_64BIT"
+  "TARGET_64BIT
+   && !(fixed_regs[SI_REG] || fixed_regs[DI_REG])"
   "movsq"
   [(set_attr "type" "str")
    (set_attr "memory" "both")
@@ -15436,7 +15437,7 @@
    (set (match_operand:P 1 "register_operand" "=S")
         (plus:P (match_dup 3)
                 (const_int 4)))]
-  ""
+  "!(fixed_regs[SI_REG] || fixed_regs[DI_REG])"
   "movs{l|d}"
   [(set_attr "type" "str")
    (set_attr "memory" "both")
@@ -15451,7 +15452,7 @@
    (set (match_operand:P 1 "register_operand" "=S")
         (plus:P (match_dup 3)
                 (const_int 2)))]
-  ""
+  "!(fixed_regs[SI_REG] || fixed_regs[DI_REG])"
   "movsw"
   [(set_attr "type" "str")
    (set_attr "memory" "both")
@@ -15466,7 +15467,7 @@
    (set (match_operand:P 1 "register_operand" "=S")
         (plus:P (match_dup 3)
                 (const_int 1)))]
-  ""
+  "!(fixed_regs[SI_REG] || fixed_regs[DI_REG])"
   "movsb"
   [(set_attr "type" "str")
    (set_attr "memory" "both")
@@ -15501,7 +15502,8 @@
    (set (mem:BLK (match_dup 3))
         (mem:BLK (match_dup 4)))
    (use (match_dup 5))]
-  "TARGET_64BIT"
+  "TARGET_64BIT
+   && !(fixed_regs[CX_REG] || fixed_regs[SI_REG] || fixed_regs[DI_REG])"
   "rep{%;} movsq"
   [(set_attr "type" "str")
    (set_attr "prefix_rep" "1")
@@ -15520,7 +15522,7 @@
    (set (mem:BLK (match_dup 3))
         (mem:BLK (match_dup 4)))
    (use (match_dup 5))]
-  ""
+  "!(fixed_regs[CX_REG] || fixed_regs[SI_REG] || fixed_regs[DI_REG])"
   "rep{%;} movs{l|d}"
   [(set_attr "type" "str")
    (set_attr "prefix_rep" "1")
@@ -15537,7 +15539,7 @@
    (set (mem:BLK (match_dup 3))
         (mem:BLK (match_dup 4)))
    (use (match_dup 5))]
-  ""
+  "!(fixed_regs[CX_REG] || fixed_regs[SI_REG] || fixed_regs[DI_REG])"
   "rep{%;} movsb"
   [(set_attr "type" "str")
    (set_attr "prefix_rep" "1")
@@ -15580,7 +15582,9 @@
   operands[3] = gen_rtx_PLUS (Pmode, operands[0],
                               GEN_INT (GET_MODE_SIZE (GET_MODE
                                                       (operands[2]))));
-  if (TARGET_SINGLE_STRINGOP || optimize_insn_for_size_p ())
+  /* Can't use this if the user has appropriated eax or edi.  */
+  if ((TARGET_SINGLE_STRINGOP || optimize_insn_for_size_p ())
+      && !(fixed_regs[AX_REG] || fixed_regs[DI_REG]))
     {
       emit_insn (gen_strset_singleop (operands[0], operands[1], operands[2],
                                       operands[3]));
@@ -15602,7 +15606,8 @@
    (set (match_operand:DI 0 "register_operand" "=D")
         (plus:DI (match_dup 1)
                  (const_int 8)))]
-  "TARGET_64BIT"
+  "TARGET_64BIT
+   && !(fixed_regs[AX_REG] || fixed_regs[DI_REG])"
   "stosq"
   [(set_attr "type" "str")
    (set_attr "memory" "store")
@@ -15614,7 +15619,7 @@
    (set (match_operand:P 0 "register_operand" "=D")
         (plus:P (match_dup 1)
                 (const_int 4)))]
-  ""
+  "!(fixed_regs[AX_REG] || fixed_regs[DI_REG])"
   "stos{l|d}"
   [(set_attr "type" "str")
    (set_attr "memory" "store
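The guards added by the PR49920 patch all follow one rule: a stringop pattern is usable only if none of the hard registers the instruction implicitly uses is marked fixed. Below is a minimal, self-contained C sketch of that rule. It is not GCC code: the `SK_*` register indices, the `fixed` array parameter and the `sketch_*` helpers are hypothetical stand-ins for GCC's `AX_REG`/`CX_REG`/`SI_REG`/`DI_REG` constants and its `fixed_regs[]` array, and the register sets per instruction are taken from the ChangeLog above.

```c
#include <stdbool.h>

/* Hypothetical register indices, for this sketch only.  */
enum sketch_reg { SK_AX, SK_CX, SK_SI, SK_DI, SK_NUM_REGS };

/* movs implicitly uses %esi (source) and %edi (destination).  */
static bool sketch_movs_ok(const char *fixed)
{
    return !(fixed[SK_SI] || fixed[SK_DI]);
}

/* stos implicitly uses %eax (value) and %edi (destination).  */
static bool sketch_stos_ok(const char *fixed)
{
    return !(fixed[SK_AX] || fixed[SK_DI]);
}

/* rep movs additionally needs %ecx as the count register.  */
static bool sketch_rep_movs_ok(const char *fixed)
{
    return !(fixed[SK_CX] || fixed[SK_SI] || fixed[SK_DI]);
}

/* rep stos additionally needs %ecx as the count register.  */
static bool sketch_rep_stos_ok(const char *fixed)
{
    return !(fixed[SK_AX] || fixed[SK_CX] || fixed[SK_DI]);
}
```

With only the count register fixed, the single-instruction forms remain available while the `rep` forms are rejected, which mirrors how the patch uses a different condition for each group of patterns.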
Re: [Patch, i386, testsuite] Fix for PR49547, new testcases for lzcnt instruction
On Mon, Aug 1, 2011 at 10:21 AM, Kirill Yukhin <kirill.yuk...@gmail.com> wrote:

> Okay, then here is an updated patch & updated ChangeLog entry:
>
> 2011-07-26  Kirill Yukhin  <kirill.yuk...@intel.com>
>
>         PR target/49547
>         * config.gcc (i[34567]86-*-*): Replace abmintrin.h with
>         lzcntintrin.h.
>         (x86_64-*-*): Likewise.
>         * config/i386/i386.opt (mlzcnt): New.
>         * config/i386/abmintrin.h: File removed.
>         (__lzcnt_u16, __lzcnt, __lzcnt_u64): Moved to ...
>         * config/i386/lzcntintrin.h: ... here. New file.
>         (__lzcnt): Rename to ...
>         (__lzcnt32): ... this.
>         * config/i386/bmiintrin.h (head): Update copyright year.
>         (__lzcnt_u16): Removed.
>         (__lzcnt_u32): Likewise.
>         (__lzcnt_u64): Likewise.
>         * config/i386/x86intrin.h: Include lzcntintrin.h when __LZCNT__
>         is defined, remove abmintrin.h.
>         * config/i386/cpuid.h: New define.
>         * config/i386/driver-i386.c (host_detect_local_cpu): Detect LZCNT
>         feature.
>         * config/i386/i386-c.c (ix86_target_macros_internal): Define
>         __LZCNT__ if needed.
>         * config/i386/i386.c (ix86_target_string): New option -mlzcnt.
>         (ix86_option_override_internal): Handle LZCNT option.
>         (ix86_valid_target_attribute_inner_p): Likewise.
>         (struct builtin_description bdesc_args) IX86_BUILTIN_CLZS: Update.
>         * config/i386/i386.h (TARGET_LZCNT): New.
>         (CLZ_DEFINED_VALUE_AT_ZERO): Update.
>         * config/i386/i386.md (clz<mode>2): Update insn constraint.
>         (clz<mode>2_lzcnt): Likewise.
>         * doc/invoke.texi: Mention -mlzcnt option.
>         * doc/extend.texi: Likewise.
>
> Bootstrapped successfully.

OK for mainline.

Uros.
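As a rough illustration of how user code might consume the pieces named in this ChangeLog, here is a minimal C sketch. It assumes a GCC-compatible compiler: when built with `-mlzcnt` the `__LZCNT__` macro is defined and `__lzcnt32` is available through `x86intrin.h`; otherwise the sketch falls back to `__builtin_clz`. The wrapper name `sketch_clz32` is hypothetical. The explicit zero check in the fallback is there because `__builtin_clz (0)` is undefined, whereas the LZCNT instruction returns the operand size for a zero input.

```c
#ifdef __LZCNT__
#include <x86intrin.h>   /* pulls in lzcntintrin.h when __LZCNT__ is defined */
#endif

/* Hypothetical wrapper: count leading zero bits of a 32-bit value,
   returning 32 for an input of zero on every path.  */
static unsigned int sketch_clz32(unsigned int x)
{
#ifdef __LZCNT__
    /* LZCNT is defined at zero and yields the operand size (32).  */
    return __lzcnt32(x);
#else
    /* __builtin_clz is undefined for 0, so handle that case by hand.  */
    return x ? (unsigned int) __builtin_clz(x) : 32u;
#endif
}
```

Note that a binary built with `-mlzcnt` should only be run on a CPU that actually implements LZCNT; on older processors the same encoding executes as BSR and gives different results.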
[PATCH, i386]: Fix PR49927, ice in spill_failure, at reload1.c:2120
Hello!

On a register starved i686, the relaxation that we allow DImode values in addresses can lead to register shortages and spill failures. Attached patch puts back the requirement that we allow subregs up to and including WORD_MODE width, nicely packed in a new function.

2011-08-01  Uros Bizjak  <ubiz...@gmail.com>

        PR target/49927
        * config/i386/i386.c (ix86_address_subreg_operand): New.
        (ix86_decompose_address): Use ix86_address_subreg_operand.
        (ix86_legitimate_address_p): Do not assert that subregs satisfy
        register_no_elim_operand in DImode.

testsuite/ChangeLog:

2011-08-01  Uros Bizjak  <ubiz...@gmail.com>

        PR target/49927
        * gcc.target/i386/pr49927.c: New test.

Tested on x86_64-pc-linux-gnu {,-m32}, committed to mainline SVN.

Uros.

Index: config/i386/i386.c
===================================================================
--- config/i386/i386.c (revision 177036)
+++ config/i386/i386.c (working copy)
@@ -11096,6 +11096,30 @@ ix86_live_on_entry (bitmap regs)
     }
 }
 
+/* Determine if op is suitable SUBREG RTX for address.  */
+
+static bool
+ix86_address_subreg_operand (rtx op)
+{
+  enum machine_mode mode;
+
+  if (!REG_P (op))
+    return false;
+
+  mode = GET_MODE (op);
+
+  if (GET_MODE_CLASS (mode) != MODE_INT)
+    return false;
+
+  /* Don't allow SUBREGs that span more than a word.  It can lead to spill
+     failures when the register is one word out of a two word structure.  */
+  if (GET_MODE_SIZE (mode) > UNITS_PER_WORD)
+    return false;
+
+  /* Allow only SUBREGs of non-eliminable hard registers.  */
+  return register_no_elim_operand (op, mode);
+}
+
 /* Extract the parts of an RTL expression that is a valid memory address
    for an instruction.  Return 0 if the structure of the address is
    grossly off.  Return -1 if the address contains ASHIFT, so it is not
@@ -11116,8 +11140,7 @@ ix86_decompose_address (rtx addr, struct
     base = addr;
   else if (GET_CODE (addr) == SUBREG)
     {
-      /* Allow only subregs of DImode hard regs.  */
-      if (register_no_elim_operand (SUBREG_REG (addr), DImode))
+      if (ix86_address_subreg_operand (SUBREG_REG (addr)))
         base = addr;
       else
         return 0;
@@ -11175,8 +11198,7 @@ ix86_decompose_address (rtx addr, struct
               break;
 
             case SUBREG:
-              /* Allow only subregs of DImode hard regs in PLUS chains.  */
-              if (!register_no_elim_operand (SUBREG_REG (op), DImode))
+              if (!ix86_address_subreg_operand (SUBREG_REG (op)))
                 return 0;
               /* FALLTHRU */
 
@@ -11228,9 +11250,8 @@ ix86_decompose_address (rtx addr, struct
     {
       if (REG_P (index))
         ;
-      /* Allow only subregs of DImode hard regs.  */
       else if (GET_CODE (index) == SUBREG
-               && register_no_elim_operand (SUBREG_REG (index), DImode))
+               && ix86_address_subreg_operand (SUBREG_REG (index)))
         ;
       else
         return 0;
@@ -11677,10 +11698,7 @@ ix86_legitimate_address_p (enum machine_
       if (REG_P (base))
         reg = base;
       else if (GET_CODE (base) == SUBREG && REG_P (SUBREG_REG (base)))
-        {
-          reg = SUBREG_REG (base);
-          gcc_assert (register_no_elim_operand (reg, DImode));
-        }
+        reg = SUBREG_REG (base);
       else
         /* Base is not a register.  */
         return false;
@@ -11702,10 +11720,7 @@ ix86_legitimate_address_p (enum machine_
       if (REG_P (index))
         reg = index;
       else if (GET_CODE (index) == SUBREG && REG_P (SUBREG_REG (index)))
-        {
-          reg = SUBREG_REG (index);
-          gcc_assert (register_no_elim_operand (reg, DImode));
-        }
+        reg = SUBREG_REG (index);
       else
         /* Index is not a register.  */
         return false;
Index: testsuite/gcc.target/i386/pr49927.c
===================================================================
--- testsuite/gcc.target/i386/pr49927.c (revision 0)
+++ testsuite/gcc.target/i386/pr49927.c (revision 0)
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-O0" } */
+
+char a[1][1];
+long long b;
+
+void
+foo (void)
+{
+  --a[b][b];
+}
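To see why the new predicate behaves differently for -m32 and x86_64, the decision in ix86_address_subreg_operand can be modelled outside the compiler. The following is a simplified sketch, not GCC code: `struct sketch_reg`, its fields and `sketch_address_subreg_ok` are hypothetical, the `eliminable` flag is a crude stand-in for what `register_no_elim_operand` checks, and `units_per_word` plays the role of `UNITS_PER_WORD` (4 for -m32, 8 for x86_64).

```c
#include <stdbool.h>

/* Hypothetical mode classes; only the integer class is accepted.  */
enum sketch_mode_class { SK_MODE_INT, SK_MODE_FLOAT };

/* Hypothetical description of the register inside a SUBREG.  */
struct sketch_reg {
    enum sketch_mode_class mode_class;
    unsigned int size_bytes;   /* models GET_MODE_SIZE (mode) */
    bool eliminable;           /* models failing register_no_elim_operand */
};

/* Mirrors the order of checks in the patch: integer mode, no wider
   than one word, and not an eliminable register.  */
static bool sketch_address_subreg_ok(const struct sketch_reg *reg,
                                     unsigned int units_per_word)
{
    if (reg->mode_class != SK_MODE_INT)
        return false;

    /* A register wider than a word is rejected.  */
    if (reg->size_bytes > units_per_word)
        return false;

    return !reg->eliminable;
}
```

Under this model a DImode (8-byte) register is rejected when `units_per_word` is 4 and accepted when it is 8, which matches the intent stated in the email: DImode subregs in addresses stay available on x86_64 but no longer cause spill failures on i686.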