On 2/27/24 8:25 AM, Jeff Law wrote:
On 2/25/24 21:53, Greg McGary wrote:
Add option -m(no-)autovec-segment to enable/disable autovectorizer
from emitting vector segment load/store instructions. This is useful for
performance experiments.
gcc/ChangeLog:
* config/riscv/autovec.md
On 2/26/24 5:17 PM, Greg McGary wrote:
diff --git a/gcc/testsuite/gcc.c-torture/execute/pr113010.c
b/gcc/testsuite/gcc.c-torture/execute/pr113010.c
new file mode 100644
index 000..a95c613c1df
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/execute/pr113010.c
@@ -0,0 +1,9 @@
+int
The sign-bit-copies of a sign-extending load cannot be known until runtime on
WORD_REGISTER_OPERATIONS targets, except in the case of a zero-extending MEM
load. See the fix for PR112758.
2024-02-22 Greg McGary
PR rtl-optimization/113010
* combine.cc (simplify_comparison
Add option -m(no-)autovec-segment to enable/disable autovectorizer
from emitting vector segment load/store instructions. This is useful for
performance experiments.
gcc/ChangeLog:
* config/riscv/autovec.md (vec_mask_len_load_lanes,
vec_mask_len_store_lanes):
Predicate with
On 2/22/24 2:08 PM, Jakub Jelinek wrote:
On Thu, Feb 22, 2024 at 12:59:18PM -0800, Greg McGary wrote:
The sign bit of a sign-extending load cannot be known until runtime,
so don't attempt to simplify it in the combiner.
2024-02-22 Greg McGary
PR rtl-optimization/113010
The sign bit of a sign-extending load cannot be known until runtime,
so don't attempt to simplify it in the combiner.
2024-02-22 Greg McGary
PR rtl-optimization/113010
* combine.cc (simplify_comparison): Don't simplify high part
of paradoxical-SUBREG-of-MEM on machines
On 2/4/24 9:58 PM, Jeff Law wrote:
On 2/2/24 15:48, Greg McGary wrote:
input: (sign_extend:DI (mem/c:SI (symbol_ref:DI ("minus_1") [flags
0x86] ) [1 minus_1+0 S4 A32]))
result: (subreg:DI (mem/c:SI (symbol_ref:DI ("minus_1") [flags 0x86]
) [1 minus_1+0 S4 A32]) 0)
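To illustrate why the high part of such a load is only known at runtime, here is a minimal C sketch. It is not taken from the patch or from combine.cc; the helper name is hypothetical, and it only models a 32-bit load that is sign-extended into a 64-bit register.

```c
#include <stdint.h>

/* Hypothetical helper (not a GCC function): model a sign-extending
   SImode load into a DImode register and return the upper 32 bits
   of the result.  */
static uint32_t
high_part_after_sign_extending_load (const int32_t *p)
{
  /* The conversion to int64_t sign-extends the loaded 32-bit value.  */
  int64_t wide = (int64_t) *p;

  /* Shift as unsigned so the result is exactly the upper 32 bits.  */
  return (uint32_t) ((uint64_t) wide >> 32);
}
```

Called on a variable holding -1 (like minus_1 above) the helper returns all ones; called on a variable holding 1 it returns zero. The upper bits therefore depend on the value in memory, which the compiler cannot see at compile time.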
On 2/1/24 10:24 PM, Jeff Law wrote:
On 2/1/24 18:24, Greg McGary wrote:
However, for a machine where (WORD_REGISTER_OPERATIONS &&
load_extend_op (inner_mode) == SIGN_EXTEND), the high part of a PSoM
is only known at runtime as 0s or 1s. That's the downstream bug. The
fix for such
On 1/18/24 9:24 AM, Jeff Law wrote:
On 1/17/24 20:53, Greg McGary wrote:
While the code comment is true, perhaps it obscures the primary intent,
which is recognition that the pattern (SIGN_EXTEND (mem ...) ) is
destined
to expand into a single memory-load instruction and no simplification
On Tue, Jan 16, 2024 at 11:44 PM Richard Biener
wrote:
> > On Tue, Jan 16, 2024 at 11:20 PM Greg McGary wrote:
> > >
> > > The sign bit of a sign-extending load cannot be known until runtime,
> > > so don't attempt to simplify it in the combiner.
>
The sign bit of a sign-extending load cannot be known until runtime,
so don't attempt to simplify it in the combiner.
2024-01-11 Greg McGary
PR rtl-optimization/113010
* combine.cc (expand_compound_operation): Don't simplify
SIGN_EXTEND of a MEM
On 11/25/12 23:33, Maxim Kuvyrkov wrote:
You essentially need a fix-up pass just before the end of compilation
(machine-dependent reorg, if memory serves me right) to space instructions
consuming values from CPRs from the CALL_INSNS that set those CPRs. I.e.,
for the 99% of compilation you
On 11/26/12 12:46, Maxim Kuvyrkov wrote:
I wonder if kludgy fixups refers to the dummy-instruction solution I
mentioned above. The complete dependence graph is a myth. You cannot have a
complete dependence graph for a function -- scheduler works on DAG regions
(and I doubt it will ever
I'm working on a port to a VLIW DSP with an exposed pipeline (i.e., no
interlocks). Some operations OP have as much as 2-cycle latency on values
of the call-preserved regs CPR. E.g., if the callee's epilogue restores a
CPR in the delay slot of the return instruction, then any OP with that CPR
as
When the timing requirements are not met upon queueing an insn with
INSN_EXACT_TICK, the scheduler backtracks. This seems wasteful.
Why not prioritize INSN_EXACT_TICK insns so that we queue them
first on the cycle they need?
I'm working on a DSP port whose unit reservations are very sensitive to
operand signature. E.g., for an assembler mnemonic, there can be 35-50
different combinations of operand register classes, each having different
impacts on latencies and function units. For assembler code generation, very
On 05/11/12 16:00, Greg McGary wrote:
My question is this: does it make sense to double MAX_RECOG_ALTERNATIVES so
that I can use insn attributes to identify operand signatures, or should I use
another approach?
After some exploration, I don't see that another approach is even possible
I'm working on a port that does loads and stores in two phases.
Every load/store is funneled through the intermediate registers ld and st
standing between memory and the rest of the register file.
Example:
ld=4(rB)
...
...
rC=ld
st=rD
8(rB)=st
rB is
On 04/27/12 14:31, Greg McGary wrote:
I'm working on a port that does loads and stores in two phases.
Every load/store is funneled through the intermediate registers ld and st
standing between memory and the rest of the register file.
Example:
ld=4(rB
On 05/05/10 21:27, Jeff Law wrote:
On 05/05/10 21:34, Greg McGary wrote:
On 05/05/10 20:21, Jeff Law wrote:
I'm not sure they are ever legitimized -- IIRC caller-save tries to only
generate addressing modes which are safe for precisely this reason.
Apparently not so: caller
reload() setup_save_areas() assign_stack_local_1() creates a mem
address whose offset is too large to fit into the machine insn's offset
operand. Later, reload() save_call_clobbered_regs() insert_save()
adjust_address_1() change_address_1() asserts because the address is
not legitimate.
On 05/05/10 20:21, Jeff Law wrote:
On 05/05/10 17:45, Greg McGary wrote:
reload() setup_save_areas() assign_stack_local_1() creates a mem
address whose offset is too large to fit into the machine insn's offset
operand. Later, reload() save_call_clobbered_regs() insert_save
On 04/28/10 05:58, Michael Matz wrote:
On Tue, 27 Apr 2010, Greg McGary wrote:
(define_insn "*udivmodsi4_libcall"
[(set (reg:SI 4)
(udiv:SI (reg:SI 1)
(reg:SI 2)))
(set (reg:SI 1)
(umod:SI (reg:SI 1)
(reg:SI 2)))
(clobber (reg:SI 2))
(clobber
On 04/26/10 22:09, Ian Lance Taylor wrote:
Greg McGary g...@mcgary.org writes:
I have a port without div or mod machine instructions. I wrote
divmodsi4 patterns that do the libcall directly, hoping that GCC would
recognize the opportunity to use a single divmodsi4 to compute both
quotient
I have a port without div or mod machine instructions. I wrote
divmodsi4 patterns that do the libcall directly, hoping that GCC would
recognize the opportunity to use a single divmodsi4 to compute both
quotient and remainder. Alas, GCC calls divmodsi4 twice with the same
divisor and dividend
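For reference, the idea of getting quotient and remainder from a single call can be sketched in plain C. This is only an illustration: the struct and function names are hypothetical, it is not the port's libcall or libgcc's routine, and it assumes a nonzero divisor.

```c
#include <stdint.h>

/* Hypothetical result type: both outputs of one division.  */
struct udivmod_result
{
  uint32_t quot;
  uint32_t rem;
};

/* Hypothetical combined divide/modulo routine using shift-and-subtract
   long division.  The divisor DEN must be nonzero.  */
static struct udivmod_result
udivmod32 (uint32_t num, uint32_t den)
{
  struct udivmod_result r = { 0, 0 };
  uint64_t rem = 0;	/* 64 bits wide so the shift below cannot overflow.  */

  for (int bit = 31; bit >= 0; bit--)
    {
      /* Bring down the next dividend bit, most significant first.  */
      rem = (rem << 1) | ((num >> bit) & 1u);
      if (rem >= den)
	{
	  rem -= den;
	  r.quot |= (uint32_t) 1 << bit;
	}
    }

  r.rem = (uint32_t) rem;	/* Always less than DEN, so it fits.  */
  return r;
}
```

A caller that needs both x / y and x % y can call udivmod32 (x, y) once and read both fields, which is the behavior the divmodsi4 pattern was meant to get out of the compiler.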
I'm doing a port for an unusual new machine which is 32-bit RISCy in
every way, except that it has 48-bit pointers. Pointers have a
high-order 16-bit segID and low-order 32-bit seg offset. Most ALU
instructions only work on 32 bits, zeroing the upper 16-bit seg ID in
the result. A few ALU
I extracted the MFPGPR hunks from Peter Bergner's [PATCH] Add POWER6
machine description, posted on 2006-11-01 and dropped them into
gcc-4.0.3, but the result fails with error: insn does not satisfy its
constraints:
.../src/gcc-4.0.3/gcc/config/rs6000/darwin-ldouble.c: In function
I'm working on a port that has instructions to move bits between
64-bit floating-point and 64-bit general-purpose regs. I say bits
because there's no conversion between float and int: the bit pattern
is unaltered. Therefore, it's possible to use scratch FPRs for
spilling GPRs and vice-versa, and
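The difference between moving bits and converting values can be sketched in C. The helper names below are hypothetical and say nothing about the port's actual instructions; the sketch assumes double is 64 bits wide.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy the 64-bit pattern of a double into an
   integer without any float-to-int conversion.  */
static uint64_t
fpr_bits_to_gpr (double d)
{
  uint64_t bits;
  memcpy (&bits, &d, sizeof bits);
  return bits;
}

/* Hypothetical helper: the reverse move, integer bits into a double.  */
static double
gpr_bits_to_fpr (uint64_t bits)
{
  double d;
  memcpy (&d, &bits, sizeof d);
  return d;
}
```

Because the pattern is copied unaltered, a round trip through either helper gives back the original bits, which is what makes one register file usable as spill space for the other. A value conversion such as (uint64_t) 1.0 would produce a different bit pattern.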
I'm working with a machine that has a memory-increment insn. It's a
network-processor performance hack that allows no-latency accumulation
of statistical counters. The insn sends the increment and address to
the memory controller which does the add, avoiding the usual
long-latency
Paul Brook [EMAIL PROTECTED] writes:
It should just work if you have the appropriate movsi pattern/alternative.
m68k has a memory-increment instruction (aka add :-).
Touche. I've had my head in RISC-land too long... 8^)
G
Daniel Jacobowitz [EMAIL PROTECTED] writes:
... Or you could try telling the entire compiler to treat them as
registers, instead of just reload. That's likely to work as well or
better.
So, I define these as a separate register class, and only the movM
insn patterns get constraints that
James E Wilson [EMAIL PROTECTED] writes:
Greg McGary wrote:
I found that
emit_no_conflict_block() reordered insns gen'd by
expand_doubleword_shift() in a way that violated dependency between
compares and associated conditional-move insns that had the target
register as destination
My port failed the DImode part of the rotate regression-tests
(gcc.c-torture/execute/20020508-[123].c). I found that
emit_no_conflict_block() reordered insns gen'd by
expand_doubleword_shift() in a way that violated dependency between
compares and associated conditional-move insns that had the