On 02/26/2016 07:54 AM, Renlin Li wrote:
Hi all,
I admit that, the title looks a little bit confusing.
The situation is like this,
To make insn_1 strict, lra generates a new insn_1_reload insn.
In insn_1_reload, there is a scratch operand with this form
clobber (match_scratch:MODE x "=&r")
When lra tries to reload insn_1_reload in later iteration, a new pseudo
register (let say RXX) is created to replace this scratch operand
in-place.
Additionally, a new insn will be generated and inserted after
insn_1_reload to
finish the reload. It's in this form:
(set scratch, RXX)
And this instruction is illegal. no target implements this kind of
pattern.
LRA will ICE because of this.
"internal compiler error: in lra_set_insn_recog_data, at lra.c:964"
And indeed, this pattern has no side-effect. The scratch operand should
stay inside the pattern.
Normally, at the very beginning of LRA reload, all scratch operands
will be
replaced by newly created pseudo register. However, this is a problem
when
generated reload insn has output scratch operand.
I have checked, x86, arm, aarch64, mips, arc all have such patterns.
But it's
not triggered. In my case, it's triggered by compiling glibc with
local change.
So a simple change is made in this patch. The output operand is
reloaded only
when it's not a scratch operand and it's not unused since then.
aarch64-none-linux-gnu bootstrap and regression test OK.
x86_64-linux bootstrap and regression test OK.
OK for trunk?
Thanks for working on this and providing a good description of the
problem. Could you fill a PR and provide a test even if you can not
reduce it.
It is always hard to describe a bound of responsibilities for correct
behaviour between LRA/reload and machine-dependent code. For example,
reload can work when you describe a register class constraint in an insn
definition and actually a subclass of the class rejects the operand
mode. I could provide the same behaviour for LRA but it is wrong as
global register allocator will generate inefficient code.
As for the scratch. As I understand the scratch was introduced for
operands which will not require any resources (memory or a new register)
for some insn alternatives. If we use pseudo for this, it will always
need memory or a register. The typical constraint for scratch is "r,X"
or "0r". So I guess using just "&r" for scratch is a bad practice.
Still for compatibility I think we should implement the same reload
behaviour for this case too.
I believe we should use the same technique -- changing scratches to
pseudo and back at the end of LRA if they don't need a register. It
will solve also a possible problem for correct scratch generation during
LRA.
I am going to work on this problem on the next week. A test case would
be a help for me.
gcc/ChangeLog:
2016-02-26 Renlin Li<renlin...@arm.com>
* lra-constraints.c (curr_insn_transform): Don't generate reload for
output scratch operand.
Sorry, I can not accept the patch as I'd like to provide a better
solution I described above. The patch is also wrong for unused
non-scratch operands. They still should be reloaded if they do not
satisfy their constraints even if they are not used later.