On 02/26/2016 07:54 AM, Renlin Li wrote:
Hi all,

I admit that, the title looks a little bit confusing.

The situation is like this,
To make insn_1 strict, lra generates a new insn_1_reload insn.
In insn_1_reload, there is a scratch operand with this form
clobber (match_scratch:MODE x "=&r")

When lra tries to reload insn_1_reload in later iteration, a new pseudo
register (let say RXX) is created to replace this scratch operand in-place. Additionally, a new insn will be generated and inserted after insn_1_reload to
finish the reload. It's in this form:
(set scratch, RXX)

And this instruction is illegal. no target implements this kind of pattern.
LRA will ICE because of this.
"internal compiler error: in lra_set_insn_recog_data, at lra.c:964"

And indeed, this pattern has no side-effect. The scratch operand should
stay inside the pattern.

Normally, at the very beginning of LRA reload, all scratch operands will be replaced by newly created pseudo register. However, this is a problem when
generated reload insn has output scratch operand.

I have checked, x86, arm, aarch64, mips, arc all have such patterns. But it's not triggered. In my case, it's triggered by compiling glibc with local change.

So a simple change is made in this patch. The output operand is reloaded only
when it's not a scratch operand and it's not unused since then.


aarch64-none-linux-gnu bootstrap and regression test OK.
x86_64-linux bootstrap and regression test OK.
OK for trunk?


Thanks for working on this and providing a good description of the problem. Could you fill a PR and provide a test even if you can not reduce it.

It is always hard to describe a bound of responsibilities for correct behaviour between LRA/reload and machine-dependent code. For example, reload can work when you describe a register class constraint in an insn definition and actually a subclass of the class rejects the operand mode. I could provide the same behaviour for LRA but it is wrong as global register allocator will generate inefficient code.

As for the scratch. As I understand the scratch was introduced for operands which will not require any resources (memory or a new register) for some insn alternatives. If we use pseudo for this, it will always need memory or a register. The typical constraint for scratch is "r,X" or "0r". So I guess using just "&r" for scratch is a bad practice. Still for compatibility I think we should implement the same reload behaviour for this case too.

I believe we should use the same technique -- changing scratches to pseudo and back at the end of LRA if they don't need a register. It will solve also a possible problem for correct scratch generation during LRA.

I am going to work on this problem on the next week. A test case would be a help for me.

gcc/ChangeLog:

2016-02-26  Renlin Li<renlin...@arm.com>

    * lra-constraints.c (curr_insn_transform): Don't generate reload for
    output scratch operand.

Sorry, I can not accept the patch as I'd like to provide a better solution I described above. The patch is also wrong for unused non-scratch operands. They still should be reloaded if they do not satisfy their constraints even if they are not used later.

Reply via email to