", CC1_ENDIAN_BIG_SPEC }, \
- { "cc1_endian_little", CC1_ENDIAN_LITTLE_SPEC }, \
- { "cc1_endian_default", CC1_ENDIAN_DEFAULT_SPEC }, \
{ "cc1_secure_plt_default", CC1_SECURE_PLT_DEFAULT_SPEC }, \
{ "cpp_os_ads", CPP_OS_ADS_SPEC }, \
{ "cpp_os_yellowknife", CPP_OS_YELLOWKNIFE_SPEC }, \
--
Dr. Ulrich Weigand
GNU/Linux compilers and toolchain
ulrich.weig...@de.ibm.com
Jakub Jelinek wrote:
> On Wed, Feb 05, 2014 at 10:26:16PM +0100, Ulrich Weigand wrote:
> > Actually, now I think the problem originally described there is still
> > valid: on s390 the CFA is *not* equal to the value at function entry,
> > but biased by 96/160 bytes. So sett
SP to the CFA is wrong ...
Bye,
Ulrich
--
Dr. Ulrich Weigand
GNU/Linux compilers and toolchain
ulrich.weig...@de.ibm.com
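To make the bias mentioned above concrete, here is a minimal sketch. It assumes the convention CFA = (stack pointer at function entry) + bias, with the bias being the 96 bytes (31-bit) or 160 bytes (64-bit) quoted in the message. The helper name and the enum constants are hypothetical and are not taken from GCC.

```c
#include <stdint.h>

/* Bias between the CFA and the stack pointer at function entry, as
   quoted in the thread: 96 bytes (31-bit) or 160 bytes (64-bit).  */
enum { S390_CFA_BIAS_31 = 96, S390_CFA_BIAS_64 = 160 };

/* Hypothetical helper: recover the entry-time stack pointer from a CFA
   value, ASSUMING the convention CFA = SP-at-entry + bias.  */
uint64_t s390_entry_sp_from_cfa(uint64_t cfa, int is_64bit)
{
    uint64_t bias = is_64bit ? S390_CFA_BIAS_64 : S390_CFA_BIAS_31;
    return cfa - bias;
}
```

Under that assumption, copying the CFA straight into SP would leave the stack pointer 96 or 160 bytes too high, which is the problem the message describes.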
on s390 otherwise:
http://gcc.gnu.org/ml/gcc-patches/2003-05/msg00904.html
And the background of the bug here:
http://gcc.gnu.org/ml/gcc/2003-05/msg00536.html
Actually, now I think the problem originally described there is still
valid: on s390 the CFA is *not* equal to the value at function entry
David Edelsohn wrote:
> On Mon, Nov 18, 2013 at 3:07 PM, Ulrich Weigand wrote:
> > Also note that this patch does not change how TDmode values are loaded
> > into GPRs: on little-endian, this means we do get the usual LE subreg
> > order there (least significant word in low
longlong.c (working copy)
@@ -11,7 +11,11 @@
int i[2];
} ud;
ud.ll = in;
+#ifdef __LITTLE_ENDIAN__
+ return ud.i[1];
+#else
return ud.i[0];
+#endif
}
int main()
--
Dr. Ulrich Weigand
GNU/Linux compilers and toolchain
ulrich.weig...@de.ibm.com
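The test-case fix above picks a different union member index depending on byte order. A minimal standalone sketch of the same idea follows; it assumes the intent is to read the most significant 32-bit word of a 64-bit value. The function names are hypothetical, and a runtime probe stands in for the `__LITTLE_ENDIAN__` macro used in the patch.

```c
#include <stdint.h>
#include <string.h>

/* Returns 1 on a little-endian host, 0 on a big-endian one.  */
int host_is_little_endian(void)
{
    uint16_t probe = 1;
    unsigned char first;
    memcpy(&first, &probe, 1);
    return first == 1;
}

/* Return the most significant 32-bit word of IN by reading it through
   a union.  On big-endian hosts that word is stored first (index 0);
   on little-endian hosts it is stored second (index 1).  */
uint32_t high_word(uint64_t in)
{
    union { uint64_t ll; uint32_t i[2]; } ud;
    ud.ll = in;
    return ud.i[host_is_little_endian() ? 1 : 0];
}
```

A test that hardcodes index 0 passes only on big-endian targets, which is why the patch adds the conditional.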
if (sp->slot[4].l != MAKE_SLOT (1, 2)
+ || sp->slot[6].l != MAKE_SLOT (5, 6))
abort();
}
--
Dr. Ulrich Weigand
GNU/Linux compilers and toolchain
ulrich.weig...@de.ibm.com
IG_ENDIAN
&& VECTOR_MEM_VSX_P (mode)
&& mode != TImode
+ && !gpr_or_gpr_p (operands[0], operands[1])
&& (memory_operand (operands[0], mode)
^ memory_operand (operands[1], mode)))
{
--
Dr. Ulrich Weigand
GNU/Linux compilers and toolchain
ulrich.weig...@de.ibm.com
r57363.c: New test.
> +/* Check if adding a sNAN and a normal long double does not generate a
> + inexact exception. */
qNaN again :-)
Thanks,
Ulrich
--
Dr. Ulrich Weigand
GNU/Linux compilers and toolchain
ulrich.weig...@de.ibm.com
turn;
+}
+
if (REG_P (src) && REG_P (dst) && (REGNO (src) < REGNO (dst)))
{
/* Move register range backwards, if we might have destructive
--
Dr. Ulrich Weigand
GNU/Linux compilers and toolchain
ulrich.weig...@de.ibm.com
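The condition in the fragment above (source register number below destination register number) is the classic case where an overlapping block move has to run backwards. The following sketch illustrates the idea on a plain array standing in for a register file; it is not the GCC code, and the function name is hypothetical.

```c
#include <stddef.h>

/* Copy COUNT consecutive "registers" from index SRC to index DST within
   REGS.  When SRC < DST and the ranges overlap, a forward copy would
   overwrite source registers before they are read, so copy backwards.  */
void move_reg_range(long *regs, size_t src, size_t dst, size_t count)
{
    if (src < dst) {
        for (size_t i = count; i-- > 0; )
            regs[dst + i] = regs[src + i];
    } else {
        for (size_t i = 0; i < count; i++)
            regs[dst + i] = regs[src + i];
    }
}
```

Moving three values from index 0 to index 1 with a forward loop would replicate the first value into every slot; the backward loop preserves all three.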
);
emit_insn (gen_movsd_hardfloat (operands[0], mem));
}
--
Dr. Ulrich Weigand
GNU/Linux compilers and toolchain
ulrich.weig...@de.ibm.com
ave a test case, I think it would be good to add it
to the GCC test suite ...
Otherwise, this looks reasonable to me (but I cannot approve the patch):
Bye,
Ulrich
--
Dr. Ulrich Weigand
GNU/Linux compilers and toolchain
ulrich.weig...@de.ibm.com
David Edelsohn wrote:
> On Thu, Nov 14, 2013 at 5:07 PM, Ulrich Weigand wrote:
>
> > Here's a patch to add documentation along the lines of what we have
> > for the longdouble switches.
> >
> > Doc build tested on powerpc64-linux.
> >
> > David,
Joseph Myers wrote:
> On Tue, 12 Nov 2013, Ulrich Weigand wrote:
> > > > Therefore, it is introduced via a new pair of options
> > > >-mabi=elfv1 / -mabi=elfv2
> > > > where -mabi=elfv1 select the current Linux ABI, and -mabi=elfv2
> > > >
n the same test fail with the 4.6.0
baseline compiler installed on the system. My assumption
would be that this is simply a pre-existing bug (either
the alignment is computed incorrectly, or it is not being
respected properly throughout the toolchain), and you were
seeing successful runs in the
Jeff Law wrote:
> On 11/11/13 07:32, Ulrich Weigand wrote:
> > 2013-11-11 Ulrich Weigand
> >
> > * calls.c (store_unaligned_arguments_into_pseudos): Skip PARALLEL
> > arguments.
> OK, so after a lot of worrying, I think this is OK. I kept thinking
>
): Similarly.
> (insert_trap_and_remove_trailing_statements): Remove statements
> in reverse order.
This does indeed fix the Python build problem for me.
Thanks,
Ulrich
--
Dr. Ulrich Weigand
GNU/Linux compilers and toolchain
ulrich.weig...@de.ibm.com
Vladimir Makarov wrote:
> On 11/13/2013, 8:34 PM, Ulrich Weigand wrote:
> >> Unfortunately, this patch causes cc1 for powerpc64-linux to crash for me
> >> even when compiling "int main () { return 0; }" with -O due to a memory
> >> corruption somewhere:
>
y 0x1033713B: compile() (cgraphunit.c:2195)
==15063==
Bye,
Ulrich
--
Dr. Ulrich Weigand
GNU/Linux compilers and toolchain
ulrich.weig...@de.ibm.com
/home/uweigand/src/gcc/gcc/df-scan.c:265
0x1037480f df_scan_alloc(bitmap_head_def*)
/home/uweigand/src/gcc/gcc/df-scan.c:324
0x106189c7 do_reload
/home/uweigand/src/gcc/gcc/ira.c:5470
0x106189c7 rest_of_handle_reload
/home/uweigand/src/gcc/gcc/ira.c:5536
0x106189c7 execute
.5.
See http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59119
for a reduced test case ...
Bye,
Ulrich
--
Dr. Ulrich Weigand
GNU/Linux compilers and toolchain
ulrich.weig...@de.ibm.com
Jakub Jelinek wrote:
> On Mon, Nov 11, 2013 at 03:40:48PM +0100, Ulrich Weigand wrote:
> > @@ -355,7 +364,11 @@ extern int dot_symbols;
> > #define LINK_OS_DEFAULT_SPEC "%(link_os_linux)"
> >
> > #define GLIBC_DYNAMIC_LINKER32 "/lib/ld.so.1"
>
Joseph Myers wrote:
> On Mon, 11 Nov 2013, Ulrich Weigand wrote:
> > The ELFv2 ABI is the intended ABI for the new powerpc64le-linux port.
> > However, it is not inherently tied to the byte order; it is possible
> > in principle to use the ELFv2 ABI in big-endian mode too.
>
Jeff Law wrote:
> On 11/11/13 07:32, Ulrich Weigand wrote:
> > However, looking more closely, it seems
> > store_unaligned_arguments_into_pseudos
> > is not really useful for PARALLEL arguments in the first place. What this
> > routine does is load arguments into args[
Hello,
this patch finally throws the switch and enables the ELFv2 ABI
by default on powerpc64le-linux.
Bye,
Ulrich
ChangeLog:
2013-11-11 Ulrich Weigand
* config/rs6000/sysv4le.h (LINUX64_DEFAULT_ABI_ELFv2): Define.
Index: gcc/gcc/config/rs6000/sysv4le.h
ich
ChangeLog:
2013-11-11 Ulrich Weigand
Alan Modra
* config/rs6000/rs6000-protos.h (rs6000_reg_parm_stack_space):
Add prototype.
* config/rs6000/rs6000.h (RS6000_REG_SAVE): Remove.
(REG_PARM_STACK_SPACE): Call rs6000_reg_parm_stack_space.
value is returned formatted exactly as if
it were being passed as first argument. In order to get this correct
in the big-endian ELFv2 case, we need to provide a non-default
implementation of TARGET_RETURN_IN_MSB.
Bye,
Ulrich
ChangeLog:
2013-11-11 Ulrich Weigand
Michael Gschwind
ded in a number of
places ...
The patch also updates a number of test cases that hardcoded the
stack layout.
Bye,
Ulrich
gcc/ChangeLog:
2013-11-11 Ulrich Weigand
Alan Modra
* config/rs6000/rs6000.h (RS6000_SAVE_AREA): Handle ABI_ELFv2.
an now get called for BLKmode arguments,
and has to handle them using a PARALLEL.
Bye,
Ulrich
ChangeLog:
2013-11-11 Ulrich Weigand
Michael Gschwind
* config/rs6000/rs6000.h (AGGR_ARG_NUM_REG): Define.
* config/rs6000/rs6000.c (rs6000_aggregate_candidate): Ne
ch
gcc/ChangeLog:
2013-11-11 Ulrich Weigand
* config.gcc [powerpc*-*-* | rs6000-*-*]: Support --with-abi=elfv1
and --with-abi=elfv2.
* config/rs6000/option-defaults.h (OPTION_DEFAULT_SPECS): Add "abi".
* config/rs6000/rs6000.opt (mabi=elfv1): New option.
eds to go into the Go repository first ...)
Bye,
Ulrich
gcc/ChangeLog:
2013-11-11 Ulrich Weigand
* config/rs6000/rs6000.c (machine_function): New member
r2_setup_needed.
(rs6000_emit_prologue): Set r2_setup_needed if necessary.
(rs6000_output_mi_thunk): Set
uw_install_context to install values for multiple fields.
(Note that there is already precedent for unwinder routines being
treated specially in the rs6000.c prologue/epilogue code ...)
Bye,
Ulrich
gcc/ChangeLog:
2013-11-11 Ulrich Weigand
* config/rs6000/rs6000.c (struct rs6000_stack
logic by making
explicit the fact that rs6000_arg_partial_bytes does not actually need to
handle the cases described above.
No change in generated code expected.
Tested on powerpc64-linux and powerpc64le-linux.
OK for mainline?
Bye,
Ulrich
ChangeLog:
2013-11-11 Ulrich Weigand
generated code intended.
Tested on powerpc64-linux and powerpc64le-linux.
OK for mainline?
Bye,
Ulrich
ChangeLog:
2013-11-11 Ulrich Weigand
* config/rs6000/rs6000.c (USE_FP_FOR_ARG_P): Remove TYPE argument.
(USE_ALTIVEC_FOR_ARG_P): Likewise
.
OK for mainline?
Bye,
Ulrich
ChangeLog:
2013-11-11 Ulrich Weigand
* config/rs6000/rs6000.c (rs6000_option_override_internal): Replace
"DEFAULT_ABI != ABI_AIX" test by testing for ABI_V4 or ABI_DARWIN.
(rs6000_savres_strategy): Likewise.
(rs6000_r
them for both floating-point and vector
arguments. For the current ABI, the result should be exactly
the same.
No change in generated code intended.
Tested on powerpc64-linux and powerpc64le-linux.
OK for mainline?
Bye,
Ulrich
ChangeLog:
2013-11-11 Ulrich Weigand
* config/rs6000
ye,
Ulrich
ChangeLog:
2013-11-11 Ulrich Weigand
* config/rs6000/rs6000.c (rs6000_call_indirect_aix): Rename to ...
(rs6000_call_aix): ... this. Handle both direct and indirect calls.
Create call insn directly instead of via various gen_... routines.
Mention special regi
powerpc64-linux and powerpc64le-linux.
OK for mainline?
Bye,
Ulrich
ChangeLog:
2013-11-11 Ulrich Weigand
* config/rs6000/rs6000.c (rs6000_emit_prologue): Do not place a
RTX_FRAME_RELATED_P marker on the UNSPEC_MOVESI_FROM_CR insn.
Instead, add USEs of all modified call
field
is different in little-endian.
Tested on powerpc64-linux and powerpc64le-linux.
OK for mainline?
Bye,
Ulrich
libgcc/ChangeLog:
2013-11-11 Ulrich Weigand
Alan Modra
* config/rs6000/linux-unwind.h (ppc_fallback_frame_state): Correct
location of CR save area
2013-11-11 Ulrich Weigand
Alan Modra
ChangeLog:
* function.c (assign_parms): Use all.reg_parm_stack_space instead
of re-evaluating REG_PARM_STACK_SPACE target macro.
(locate_and_pad_parm): New parameter REG_PARM_STACK_SPACE. Use it
instead of
ALLEL arguments.
Tested on powerpc64-linux and powerpc64le-linux.
OK for mainline?
Bye,
Ulrich
ChangeLog:
2013-11-11 Ulrich Weigand
* calls.c (store_unaligned_arguments_into_pseudos): Skip PARALLEL
arguments.
Index:
Joern Rennecke wrote:
> 2013-05-02 Joern Rennecke
>
> * reload.c (find_valid_class): Allow classes that do not include
> FIRST_PSEUDO_REGISTER - 1.
This is OK.
Thanks,
Ulrich
--
Dr. Ulrich Weigand
GNU/Linux compilers and toolchain
ulrich.weig...@de.ibm.com
de : SImode);
@@ -11311,7 +11315,8 @@
(define_insn "*tls_got_tprel_low"
[(set (match_operand:TLSmode 0 "gpc_reg_operand" "=r")
(lo_sum:TLSmode (match_operand:TLSmode 1 "gpc_reg_operand" "b")
-(unspec:TLSmode [(match_operand:TLSmode 2 "rs6000_tls_symbol_ref" "")]
+(unspec:TLSmode [(match_operand:TLSmode 3 "gpc_reg_operand" "b")
+ (match_operand:TLSmode 2 "rs6000_tls_symbol_ref" "")]
UNSPEC_TLSGOTTPREL)))]
"HAVE_AS_TLS && TARGET_CMODEL != CMODEL_SMALL"
"l %0,%2@got@tprel@l(%1)"
--
Dr. Ulrich Weigand
GNU Toolchain for Linux on System z and Cell BE
ulrich.weig...@de.ibm.com
ent for every use, too.)
Bye,
Ulrich
--
Dr. Ulrich Weigand
GNU Toolchain for Linux on System z and Cell BE
ulrich.weig...@de.ibm.com
d (new_insn);
+ INSN_LOCATION (new_insn) = UNKNOWN_LOCATION;
+ return;
+ }
+
p = get_pipe (insn);
if ((CALL_P (insn) || JUMP_P (insn)) && SCHED_ON_EVEN_P (insn))
new_insn = emit_insn_after (gen_lnop (), insn);
--
Dr. Ulrich Weigand
GNU Toolchain for
Matthew Gretton-Dann wrote:
> 2013-02-02 Matthew Gretton-Dann
>
> * gcc/reload.c (subst_reloads): Fix DEBUG_RELOAD build issue.
This is OK.
Thanks,
Ulrich
--
Dr. Ulrich Weigand
GNU Toolchain for Linux on System z and Cell BE
ulrich.weig...@de.ibm.com
e
> code at all.
>
> Thus, ok from a RM perspective if a reload-affine person approves it.
The patch was originally by Bernd, but FWIW it looks good to me as well.
Bye,
Ulrich
--
Dr. Ulrich Weigand
GNU Toolchain for Linux on System z and Cell BE
ulrich.weig...@de.ibm.com
for "X"
are those that imply side-effects like pre-increment. ]
> Fixes the ICE gcc.dg/torture/asm-subreg-1.c on aarch64.
Clearly the compiler shouldn't crash either, but I guess it
really ought to be possible to fix this problem elsewhere.
Bye,
Ulrich
--
Dr. Ulrich Weigand
GNU Toolchain for Linux on System z and Cell BE
ulrich.weig...@de.ibm.com
> http://gcc.gnu.org/ml/gcc-patches/2012-11/msg00984.html
> Ping.
--
Dr. Ulrich Weigand
GNU Toolchain for Linux on System z and Cell BE
ulrich.weig...@de.ibm.com
David Edelsohn wrote:
> On Mon, Nov 26, 2012 at 2:10 PM, Ulrich Weigand wrote:
>
> > So I'm wondering where to go from here. I guess we could:
> >
> > 1. Bring GCC (and gas) behaviour in compliance with the documented ABI
> >by removing the #undef DBX
eh_frame and .dwarf_frame. The only
other such platforms I'm aware of are Darwin on 32-bit i386, and
some other operating systems on ppc (AIX, Darwin, BSD).
Bye,
Ulrich
--
Dr. Ulrich Weigand
GNU Toolchain for Linux on System z and Cell BE
ulrich.weig...@de.ibm.com
Mark Kettenis wrote:
> > Date: Mon, 26 Nov 2012 20:10:06 +0100 (CET)
> > From: "Ulrich Weigand"
> >
> > Hello,
> >
> > I noticed what appears to be a long-standing bug in generating .dwarf_frame
> > sections with GCC on Linux on PowerPC.
>
t with GCC 4.0 and 4.1 unless we
want to add a special hack for that.
3. Like 2., but remove the condition code hack: simply use identical
numbers in .eh_frame and .dwarf_frame. This would make PowerPC
like other Linux platforms in that respect.
Thoughts?
Bye,
Ulrich
--
Dr. Ulrich Weigand
GNU Toolchain for Linux on System z and Cell BE
ulrich.weig...@de.ibm.com
http://gcc.gnu.org/ml/gcc-patches/2012-11/msg00984.html
Ping.
Thanks,
Ulrich
--
Dr. Ulrich Weigand
GNU Toolchain for Linux on System z and Cell BE
ulrich.weig...@de.ibm.com
Matthias Klose wrote:
> 2012-11-14 Matthias Klose
>
> * config/s390/t-linux64: Add multiarch names in MULTILIB_OSDIRNAMES.
This is OK.
Thanks,
Ulrich
--
Dr. Ulrich Weigand
GNU Toolchain for Linux on System z and Cell BE
ulrich.weig...@de.ibm.com
ld public.
> > Rename field vec_ to vec_PRIVATE_.
> > Update all users.
> > (va_heap::release): Do nothing if V is NULL.
> > (va_stack::release): Likewise.
>
> Committed as rev 193667.
This fixed the spu-elf build failure. Thanks!
Bye,
Ulrich
--
Dr. Ulrich Weigand
GNU Toolchain for Linux on System z and Cell BE
ulrich.weig...@de.ibm.com
enabled).
"affine_fn" is defined in tree-data-ref.h as:
typedef vec affine_fn;
which apparently is no longer a POD type?
Bye,
Ulrich
--
Dr. Ulrich Weigand
GNU Toolchain for Linux on System z and Cell BE
ulrich.weig...@de.ibm.com
this patch is ready to merge
as well at this point. ]
Tested on arm-linux-gnueabi.
OK for mainline?
Bye,
Ulrich
2012-11-13 Andrew Stubbs
Ulrich Weigand
gcc/
* config/arm/arm.md (zero_extenddi2): Add extra alternatives
for NEON registers.
Add a
http://gcc.gnu.org/ml/gcc-patches/2012-10/msg01521.html
Ping.
Bye,
Ulrich
--
Dr. Ulrich Weigand
GNU Toolchain for Linux on System z and Cell BE
ulrich.weig...@de.ibm.com
Tejas Belagod wrote:
> > Ulrich Weigand wrote:
> >> The following patch implements this idea; it passes a basic regression
> >> test on arm-linux-gnueabi. (Obviously this would need a lot more
> >> testing on various platforms before getting into mainline .
Richard Earnshaw wrote:
> On 20/09/12 16:49, Ulrich Weigand wrote:
> > Richard Earnshaw wrote:
> >
> >> Hmm, this is going to cause bottlenecks on Cortex-A15: writing a Neon
> >> single-precision register and then reading it back as a double-precision
> >
Uros Bizjak wrote:
> On Fri, Oct 12, 2012 at 7:57 PM, Ulrich Weigand wrote:
> > I was wondering if the i386 port maintainers could have a look at this
> > pattern. Shouldn't we really have two patterns, one to *load* an unaligned
> > value and one to *store* and unali
unspec
in the first place, but only plain moves, and check MEM_ALIGN in the
move insn emitter to see which variant of the instruction is required ...]
Bye,
Ulrich
--
Dr. Ulrich Weigand
GNU Toolchain for Linux on System z and Cell BE
ulrich.weig...@de.ibm.com
> * gcc.dg/lower-subreg-1.c: Disable on arm-*-* targets.
I just noticed that the triple is incomplete; we're supposed to use
arm*-*-* instead of just arm-*-*.
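The difference between the two patterns can be checked with ordinary glob matching. This sketch uses POSIX `fnmatch()` as a stand-in for the test harness's own matcher, which is an assumption about equivalent `*` semantics; the helper name and the sample triples are illustrative only.

```c
#include <fnmatch.h>

/* Return 1 if TRIPLE matches the glob PATTERN, 0 otherwise.  fnmatch()
   stands in here for the test harness's own pattern matcher.  */
int triple_matches(const char *pattern, const char *triple)
{
    return fnmatch(pattern, triple, 0) == 0;
}
```

With this helper, `arm-*-*` accepts a triple such as `arm-unknown-eabi` but rejects `armv7l-unknown-linux-gnueabihf`, because the literal `-` must follow `arm` immediately; `arm*-*-*` accepts both.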
Checked in the following fix as obvious.
Bye,
Ulrich
2012-10-01 Ulrich Weigand
* gcc.dg/lower-su
**
static unsigned int
rest_of_handle_lower_subreg2 (void)
{
! decompose_multiword_subregs ();
return 0;
}
--- 1656,1662 ----
static unsigned int
rest_of_handle_lower_subreg2 (void)
{
! decompose_multiword_subregs (true);
return 0;
}
--
Dr. Ulrich Weigand
GNU Toolchain for Linux on System z and Cell BE
ulrich.weig...@de.ibm.com
ing
> other overheads.
What instruction are you referring to here? Loads from memory?
Bye,
Ulrich
--
Dr. Ulrich Weigand
GNU Toolchain for Linux on System z and Cell BE
ulrich.weig...@de.ibm.com
good to me. OK if testing passes.
Thanks,
Ulrich
--
Dr. Ulrich Weigand
GNU Toolchain for Linux on System z and Cell BE
ulrich.weig...@de.ibm.com
g:
2012-09-17 Andrew Stubbs
Ulrich Weigand
* config/arm/arm.c (arm_print_operand): Add new 'E' format code.
(arm_emit_coreregs_64bit_shift): Fix comment.
* config/arm/arm.h (enum reg_class): Add VFP_LO_REGS_EVEN.
(REG_CLASS_NAMES, REG_CLASS_CONTENTS, IS
ubreg" routine on rld[r].in, check
whether the result is a hard register and use its REGNO_REG_CLASS.
Bernd, given that you worked on this recently, any other thoughts?
Bye,
Ulrich
--
Dr. Ulrich Weigand
GNU Toolchain for Linux on System z and Cell BE
ulrich.weig...@de.ibm.com
is used ...
All other code in neon.md *either* transforms the NEON lane numbers
into RTL subpart numbers and puts those in vec_select etc. *or* uses
a NEON lane number unchanged as argument of an UNSPEC. It was only
this one pattern that broke this rule.
> This is OK.
Checked in.
Thanks,
Ulr
Richard Earnshaw wrote:
> On 14/09/12 19:02, Ulrich Weigand wrote:
> > * config/arm/arm.c (output_move_neon): Update comment.
> > Use vld1.64/vst1.64 instead of vldm/vstm where possible.
> > (neon_vector_mem_operand): Support double-word modes.
> > * co
n arm-linux-gnueabi.
OK for mainline?
Bye,
Ulrich
2012-09-14 Ulrich Weigand
* config/arm/arm.c (arm_rtx_costs_1): Handle vec_extract and vec_set
patterns.
* config/arm/arm.md ("vec_set_internal"): Support memory source
operands, implemented via v
Ulrich Weigand
* config/arm/arm.c (output_move_neon): Update comment.
Use vld1.64/vst1.64 instead of vldm/vstm where possible.
(neon_vector_mem_operand): Support double-word modes.
* config/arm/neon.md (*neon_mov VD): Call output_move_neon
instead
can-assembler "lsrs\tr\[0-9\]" { target arm_thumb2_ok } }} */
/* { dg-final { scan-assembler "movs\tr\[0-9\]" { target { ! arm_thumb2_ok} } } } */
--- 9,13
r[i] = 0;
}
! /* { dg-final { scan-assembler "lsrs\tr\[0-9\]" { target arm_thumb2_ok } } } */
/* { dg-final { scan-assembler "movs\tr\[0-9\]" { target { ! arm_thumb2_ok} } } } */
--
Dr. Ulrich Weigand
GNU Toolchain for Linux on System z and Cell BE
ulrich.weig...@de.ibm.com
that earlier passes
> have deemed to be safe, and using them as a temporary is not one of those.
This sounds reasonable, agreed.
> * reload.c (find_dummy_reload): Don't use OUT as a reload reg
> for IN if it overlaps a fixed register.
OK.
Thanks,
Ulrich
--
Dr. Ulri
Richard Guenther wrote:
> On Wed, 15 Aug 2012, Ulrich Weigand wrote:
> > It seems flow_loops_find by itself is not quite enough, but everything
> > necessary to use the loop structures seems to be encapsulated in
> > loop_optimizer_init / loop_optimizer_finalize, which are al
Richard Guenther wrote:
> On Tue, 14 Aug 2012, Ulrich Weigand wrote:
> > Looks like this broke SPU build, since spu_machine_dependent_reorg
> > accesses ->loop_depth. According to comments in the code, this
> > was done because of concerns that loop_father may no longer be
single_pred_p (bb)
&& prev->loop_depth == bb->loop_depth)
prop = prev;
Any suggestions?
Thanks,
Ulrich
--
Dr. Ulrich Weigand
GNU Toolchain for Linux on System z and Cell BE
ulrich.weig...@de.ibm.com
rl"
> + [(set (zero_extract:DI
> + (match_operand:DI 0 "nonimmediate_operand" "+d")
> + (match_operand 1 "const_int_operand" "")
> + (match_operand 2 "const_int_operand" ""))
> + (and:DI
> + (lshiftrt:DI
> + (match_dup 0)
> + (match_operand 3 "const_int_operand" ""))
> + (match_operand:DI 4 "nonimmediate_operand" "d")))
> + (clobber (reg:CC CC_REGNUM))]
> + "TARGET_Z10
> + && INTVAL (operands[3]) == 64 - INTVAL (operands[1]) - INTVAL (operands[2])"
> + "rnsbg\t%0,%4,%2,%2+%1-1,64-%2,%1"
I guess the last "," is supposed to be a "-". (Then we might
as well use %3 instead of 64-%2-%1.)
Bye,
Ulrich
--
Dr. Ulrich Weigand
GNU Toolchain for Linux on System z and Cell BE
ulrich.weig...@de.ibm.com
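The review comment above rests on the insn condition: operand 3 must equal 64 minus operand 1 minus operand 2, so the expression 64-%2-%1 in the output template could be written as %3. The arithmetic can be sketched as follows, assuming operand 1 is the field size and operand 2 its start bit as in the quoted zero_extract; both helper names are hypothetical.

```c
/* Hypothetical helper mirroring the insn condition quoted above:
   SIZE is operands[1], POS is operands[2], and SHIFT is operands[3],
   which must equal 64 - SIZE - POS for the pattern to match.  */
int rnsbg_shift_matches(int size, int pos, int shift)
{
    return shift == 64 - size - pos;
}

/* Last bit of the selected field, as written in the output template
   (%2+%1-1).  */
int rnsbg_end_bit(int size, int pos)
{
    return pos + size - 1;
}
```

For an 8-bit field starting at bit 16, the field ends at bit 23 and the only shift count the condition accepts is 40.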
akes
+ explicit use of vector types may be incompatible with binary objects
+ built with older versions of GCC. Auto-vectorized code is not affected
+ by this change.
+
General Optimizer Improvements
--
Dr. Ulrich Weigand
GNU Toolchain for Linux on System z and Cell BE
ulrich.weig...@de.ibm.com
Richard Earnshaw wrote:
> On 10/08/12 14:44, Ulrich Weigand wrote:
> > Would the following htdocs patch be OK with you? Feel free to suggest
> > a more appropriate wording ...
>
> I think we need to make it clear that this also fixes a bug in the
> compiler that could
es larger than 8 bytes in size), to comply with the AAPCS.
+This is an ABI change that affects e.g. layout of structures having a member
+of vector type. Code using such types may be incompatible with binary objects
+built with older versions of GCC.
+
General Optimizer Improvements
--
Dr
Richard Guenther wrote:
> On Tue, Aug 7, 2012 at 4:56 PM, Ulrich Weigand wrote:
> > Would it be OK to backport this to 4.7 and possibly 4.6?
> I'll defer the decision to the target maintainers. But please double-check
> for any changes in the vectorizer parts when backporti
nse.
> Can you please test with --with-arch=z10?
Tested with no regressions.
> * config/s390/s390.c (s390_expand_cs_hqi): Copy val to a temp before
> performing the compare for the restart loop.
OK.
Thanks,
Ulrich
--
Dr. Ulrich Weigand
GNU Toolchain for Linux on System z and Cell BE
ulrich.weig...@de.ibm.com
Richard Henderson wrote:
> On 08/07/2012 10:02 AM, Ulrich Weigand wrote:
> > The following patch changes the builtin expander to pass a MEM oldval
> > as-is to the back-end expander, so that the back-end can move the
> > store to before the CC operation. With that patch I'
Richard Henderson wrote:
> On 08/06/2012 11:34 AM, Ulrich Weigand wrote:
> > There is one particular inefficiency I have noticed. This code:
> >
> > if (!__atomic_compare_exchange_n (&v, &expected, max, 0 , 0, 0))
> > abort ();
> >
> > from
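For readers unfamiliar with the builtin in the quoted code, a minimal usage sketch follows. `__atomic_compare_exchange_n` is a GCC builtin taking the object pointer, a pointer to the expected value, the desired value, a weak flag, and the success and failure memory orders. The wrapper name is hypothetical, and the sketch spells out `__ATOMIC_SEQ_CST` where the quoted test passes literal zeros.

```c
#include <stdbool.h>

/* Replace *V with DESIRED only if it currently holds EXPECTED.
   Arguments after DESIRED: weak = 0 (strong compare-and-swap), then
   the success and failure memory orders.  Returns true if the swap
   happened; on failure the builtin writes the observed value into the
   local copy of EXPECTED, which is discarded here.  */
bool cas_int(int *v, int expected, int desired)
{
    return __atomic_compare_exchange_n(v, &expected, desired, 0,
                                       __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
}
```

Because the builtin stores the observed value through its second argument on failure, that argument may have to live in memory, which is relevant to the store the thread is discussing.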
pand_insv to aid that. Try RISBG last, after other
> mechanisms have failed; don't require operands in registers for
> it but force them there instead. Try a limited form of ICM.
This looks good to me. Retested with no regressions.
Thanks,
Ulrich
--
Dr. Ulrich Weigand
GNU Toolchai
Richard Guenther wrote:
> On Fri, Jul 27, 2012 at 5:24 PM, Ulrich Weigand wrote:
> > ChangeLog:
> >
> > * target.def (vector_alignment): New target hook.
> > * doc/tm.texi.in (TARGET_VECTOR_ALIGNMENT): Document new hook.
> >
register is always
equivalent to the memory so the substitution is valid even if there
are intervening stores. Also, don't move a volatile asm or
UNSPEC_VOLATILE across any other insns. */
|| (! all_adjacent
&& (((!MEM_P (src)
|| ! find_reg_note (insn, REG_EQUIV, src))
&& use_crosses_set_p (src, DF_INSN_LUID (insn)))
|| (GET_CODE (src) == ASM_OPERANDS && MEM_VOLATILE_P (src))
|| GET_CODE (src) == UNSPEC_VOLATILE))
Note that we have exactly the case mentioned, where a series of instructions
to be combined all set the same (CC) register. If they're all adjacent,
this still gets optimized away -- but they no longer are due to the store.
Is there a way of structuring the atomic optabs/expander so that the store
gets done either before or after all the CC operations?
Bye,
Ulrich
--
Dr. Ulrich Weigand
GNU Toolchain for Linux on System z and Cell BE
ulrich.weig...@de.ibm.com
nd expand_insv needs to be
called with bitpos 0 (due to bits-big-endian).
When reverting this part of your patch (and together with the EQ/NE fix
pointed out here: http://gcc.gnu.org/ml/gcc-patches/2012-08/msg00170.html),
I can complete a bootstrap/testing cycle without regressions.
(There's still code being generated that looks a bit inefficient, but that's
a different story.)
Bye,
Ulrich
--
Dr. Ulrich Weigand
GNU Toolchain for Linux on System z and Cell BE
ulrich.weig...@de.ibm.com
> - cmpv = force_reg (SImode, val);
> - store_bit_field (cmpv, GET_MODE_BITSIZE (mode), 0,
> -0, 0, SImode, cmp);
> + cc = s390_emit_compare_and_swap (NE, res, ac.memsi, cmpv, newv);
> + emit_insn (gen_cstorecc4 (btarget, cc, XEXP (cc, 0), XEXP (cc, 1)));
> }
... and here.
This fixes the main atomic test failures I was seeing. I've restarted
the full bootstrap / regression test now ...
Bye,
Ulrich
--
Dr. Ulrich Weigand
GNU Toolchain for Linux on System z and Cell BE
ulrich.weig...@de.ibm.com
'll have a look what's going on here.
Bye,
Ulrich
--
Dr. Ulrich Weigand
GNU Toolchain for Linux on System z and Cell BE
ulrich.weig...@de.ibm.com
o a RISBG. I guess the point of the
RISBG is that you can avoid the extra shift ... Now, if that shift
can be moved ahead of the loop, that may not be all that big of a
win. On the other hand, these loops hopefully don't loop very often
if we don't have a lot of contention ...
Bye,
Ulrich
--
Dr. Ulrich Weigand
GNU Toolchain for Linux on System z and Cell BE
ulrich.weig...@de.ibm.com
Richard Henderson wrote:
> On 2012-07-30 07:09, Ulrich Weigand wrote:
> > This seems to disable use of ICM / STCM to perform byte or
> > aligned halfword access. Why is this necessary? Those operations
> > are supposed to provide the required operand consistency ...
>
Richard Guenther wrote:
> On Fri, Jul 27, 2012 at 5:24 PM, Ulrich Weigand wrote:
> > OK for mainline?
>
> Ok. Please add to the documentation that the default vector alignment
> has to be a power-of-two multiple of the default vector element alignment.
Committed, thanks. The
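The documentation requirement requested above (vector alignment must be a power-of-two multiple of the element alignment) can be expressed as a small predicate. This is only an illustrative sketch; the function name is hypothetical and nothing like it is claimed to exist in GCC.

```c
#include <stdbool.h>

/* True if VEC_ALIGN equals ELEM_ALIGN times a power of two
   (1, 2, 4, ...).  Both values use the same unit, e.g. bits.  */
bool vector_alignment_ok(unsigned vec_align, unsigned elem_align)
{
    if (elem_align == 0 || vec_align == 0 || vec_align % elem_align != 0)
        return false;
    unsigned ratio = vec_align / elem_align;
    return (ratio & (ratio - 1)) == 0;
}
```

For 32-bit elements, alignments of 32, 64 and 128 bits pass, while 96 bits (three times the element alignment) does not.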
ROL
(LCTLG), STORE CHARACTERS UNDER MASK,
and STORE CONTROL (STCTG) access their storage
operands in a left-to-right direction, and all bytes
accessed within each doubleword appear to be
accessed concurrently as observed by other CPUs. ]
Otherwise the patch looks good to me.
Thanks,
Ulrich
--
Dr. Ulrich Weigand
GNU Toolchain for Linux on System z and Cell BE
ulrich.weig...@de.ibm.com
lete_output_reload can see it. */
! if (replace_reloads && recog_data.operand[opnum] != x)
! /* We mark the USE with QImode so that we recognize it as one that
!can be safely deleted at the end of reload. */
! PUT_MODE (emit_insn_before (gen_rtx_USE (VOIDmode, SUBREG_REG (x)), insn),
! QImode);
if (address_reloaded)
*address_reloaded = reloaded;
! return tem;
}
/* Substitute into the current INSN the registers into which we have reloaded
--
Dr. Ulrich Weigand
GNU Toolchain for Linux on System z and Cell BE
ulrich.weig...@de.ibm.com
" } } */
--- 56,60 ----
/* { dg-final { scan-tree-dump-times "vectorized 2 loops" 1 "vect" } } */
/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
! /* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" { xfail { vect_no_align || { ! vect_natural_alignment } } } } } */
/* { dg-final { cleanup-tree-dump "vect" } } */
--
Dr. Ulrich Weigand
GNU Toolchain for Linux on System z and Cell BE
ulrich.weig...@de.ibm.com
address will exactly correspond to
> misalignment when masking off and applying to a vector. And,
> you may have to conditionalize some vectorizer tests with the
> effective-target unaligned_stack.
I *think* this ought to work out OK, since the realignment scheme
consults the type alignment:
new_st