On 1/22/24 00:45, Richard Biener wrote:
On Fri, Jan 19, 2024 at 5:06 PM Georg-Johann Lay <a...@gjlay.de> wrote:



On 18.01.24 at 20:54, Roger Sayle wrote:

This patch tweaks RTL expansion of multi-word shifts and rotates to use
PLUS rather than IOR for disjunctive operations.  During expansion of
these operations, the middle-end creates RTL like (X<<C1) | (Y>>C2)
where the constants C1 and C2 guarantee that bits don't overlap.
Hence the IOR can be performed by any any_or_plus operation, such as
IOR, XOR or PLUS; for word-size operations where carry chains aren't
an issue these should all be equally fast (single-cycle) instructions.
The advantage of this change is that targets with shift-and-add insns,
like x86's lea, can use the LSHIFT-ADD form.
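
As a rough illustration of the disjointness argument above (not taken from
the patch itself), the sketch below computes the high word of a two-word
left shift three ways. The function names and the 32-bit word size are
assumptions made for the example only. Because hi << n has its low n bits
clear and lo >> (32 - n) fits entirely in those n bits, no bit position is
set in both operands, so IOR, PLUS and XOR should give the same result.

```c
#include <assert.h>
#include <stdint.h>

/* High word of a two-word left shift by n, for 0 < n < 32.
   (hi << n) has zeros in its low n bits; (lo >> (32 - n)) only
   occupies those low n bits, so the two operands never overlap. */
uint32_t shl_hi_ior(uint32_t hi, uint32_t lo, unsigned n)
{
    assert(n > 0 && n < 32);   /* keeps both shift counts in range */
    return (hi << n) | (lo >> (32 - n));
}

/* Same value using PLUS: with disjoint bits no carry is generated. */
uint32_t shl_hi_plus(uint32_t hi, uint32_t lo, unsigned n)
{
    assert(n > 0 && n < 32);
    return (hi << n) + (lo >> (32 - n));
}

/* Same value using XOR: with disjoint bits nothing cancels. */
uint32_t shl_hi_xor(uint32_t hi, uint32_t lo, unsigned n)
{
    assert(n > 0 && n < 32);
    return (hi << n) ^ (lo >> (32 - n));
}
```

The PLUS variant has the shape shifted-operand + other-operand, which is
the form a shift-and-add instruction can match when the shift count is one
it supports.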

An example of a backend that benefits is ARC, which is demonstrated
by these two simple functions:

But there are also back-ends where this is bad.

The reason is that with IOR, the back-end needs only to operate on
those sub-words where the sub-mask is non-zero.  But for PLUS this
is not the case because the back-end does not know that the intermediate
carry will be zero.  Hence, with PLUS, more instructions are needed.
An example is AVR, but maybe many more targets with multi-word operations
are affected in a bad way.

Take for example the case with 2 words and a value of 1.

LO |= 1
HI |= 0

can be optimized to

LO |= 1

but for addition this is not the case:

LO += 1
HI +=c 0 ;; Does not know that always carry = 0.
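
A minimal C sketch of the two-word case above, assuming 8-bit words as on
AVR. The word2 type and the function names are hypothetical and exist only
for this illustration. The IOR version can skip the high word because
hi | 0 == hi, while the PLUS version has to propagate a carry that is only
known to be zero if bit 0 of the low word is known to be clear.

```c
#include <stdint.h>

/* Hypothetical two-word value built from 8-bit words. */
typedef struct {
    uint8_t lo;
    uint8_t hi;
} word2;

/* IOR with the constant 1: the high sub-mask is 0, so only the low
   word needs an instruction. */
word2 ior_const1(word2 v)
{
    v.lo |= 1;
    return v;
}

/* PLUS with the constant 1: the low-word add may carry, so the high
   word needs an add-with-carry unless the carry is provably zero. */
word2 plus_const1(word2 v)
{
    uint8_t lo = (uint8_t)(v.lo + 1);
    uint8_t carry = (uint8_t)(lo < v.lo);  /* 1 if the low add wrapped */
    v.lo = lo;
    v.hi = (uint8_t)(v.hi + carry);
    return v;
}
```

The two functions agree whenever bit 0 of the low word is clear, which is
what the expander guarantees in the disjoint case, but that guarantee is
not visible in the PLUS form once it reaches the back-end.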

I wonder if the PLUS can be done on the lowpart only to make this
detail obvious?
In theory, yes. This class of problems has often been punted to the target expanders (far from ideal).

I still suspect the way forward here is to have the exp* code query one or more target properties to guide IOR vs PLUS selection.

Jeff
