On 03/30/2012 10:39 AM, Georg-Johann Lay wrote:
No, this pass only splits operations that are wider than word mode into
word mode sized chunks.
On a machine where word mode is SI, it will split DI shifts and zero
extends and any moves wider than SI mode into a series of SI operations.
It does nothing for things in QI or HI mode. The pass was written
before there were machines with fast vector move operations.
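
To make the splitting concrete, here is a minimal C sketch of what "a
series of SI operations" means for a DI move, a zero extension and a
shift. This is not code from the pass: the struct and function names
are made up for illustration, word mode is assumed to be 32 bits, and
the shift case assumes a constant count of at least one word.

```c
#include <assert.h>
#include <stdint.h>

/* A DI (64-bit) value held as two SI (32-bit) words.  This stands in
   for one wide pseudo being replaced by two word-sized ones.  */
struct di_pair { uint32_t lo, hi; };

/* DI move: becomes two independent SI moves.  */
struct di_pair split_move(struct di_pair src)
{
    struct di_pair dst;
    dst.lo = src.lo;
    dst.hi = src.hi;
    return dst;
}

/* Zero extension SI -> DI: one SI move plus clearing the high word.  */
struct di_pair split_zero_extend(uint32_t x)
{
    struct di_pair dst = { x, 0 };
    return dst;
}

/* DI left shift by a constant n, assuming 32 <= n < 64: the low word
   becomes zero and the high word is the old low word shifted by n - 32.  */
struct di_pair split_shl(struct di_pair src, unsigned n)
{
    assert(n >= 32 && n < 64);
    struct di_pair dst;
    dst.lo = 0;
    dst.hi = src.lo << (n - 32);
    return dst;
}
```

Each helper touches only word-sized values, which is the property the
pass relies on when word-sized operations are cheap on the target.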
This patch takes a different approach to fixing PR52543 than does the
This patch changes the lower-subreg pass(es) so that, instead of
unconditionally splitting wide moves, zero extensions, and shifts, they
now take the target specific costs into account and only do the
transformations if it is profitable.
As far as I understand the pass, it's not only about splitting these
but also about the additional benefits of the split, e.g. AND 0xfffffffe
will be one QI operation instead of 1 SI operation that costs 4 QI.
And in fact, the positive benefit of subreg-lowering occurs with bit-wise
operations like AND, IOR, EOR etc.
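
The AND 0xfffffffe case can be worked through in a small C sketch. It
assumes a target with 8-bit (QI) words and simply models the
arithmetic; the function names are hypothetical and nothing here is
taken from GCC. A byte of the mask that is 0xff leaves the
corresponding byte of the value unchanged, so after the split no
instruction is needed for it.

```c
#include <stdint.h>

/* Number of QI AND operations left after splitting a 32-bit AND with
   a constant mask into four byte-wide ANDs.  A mask byte of 0xff is
   an identity and needs no operation.  */
unsigned and_split_cost(uint32_t mask)
{
    unsigned ops = 0;
    for (unsigned i = 0; i < 4; i++) {
        uint8_t m = (uint8_t)(mask >> (8 * i));
        if (m != 0xff)
            ops++;
    }
    return ops;
}

/* Compute x & mask one byte at a time, skipping identity bytes, to
   show that the split form gives the same result as the SI AND.  */
uint32_t and_split(uint32_t x, uint32_t mask)
{
    uint32_t result = 0;
    for (unsigned i = 0; i < 4; i++) {
        uint8_t b = (uint8_t)(x >> (8 * i));
        uint8_t m = (uint8_t)(mask >> (8 * i));
        if (m != 0xff)
            b &= m;
        result |= (uint32_t)b << (8 * i);
    }
    return result;
}
```

For the mask 0xfffffffe only the lowest byte differs from 0xff, so
and_split_cost reports a single QI operation, matching the example
above.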
And one problem is that the pass is not sensitive to address spaces.
For example, HI splits for generic space are profitable, for non-generic
they are not.
Thus, a patch should also address address-space sensitivity.
There may be cases where address space needs to be taken into
account. Someone who has a port with memory operations like this
may want to consider making that
enhancement. I think that once this patch is in place, that kind of
change will be easier to incorporate.
Unconditional splitting is a problem that not only occurs on the AVR but
is also a problem on the ARM NEON and my private port. Furthermore, it
is a problem that is likely to occur on most modern larger machines
since these machines are more likely to have fast instructions for
moving things that are larger than word mode.
At compiler initialization time, each mode that is larger than word
mode is examined to determine if the cost of moving a value of that mode
is less expensive than inserting the proper number of word sized
moves. If it is cheaper to split it up, a bit is set to allow moves of
that mode to be lowered.
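
The initialization-time check described above might look roughly like
the following C sketch. It is not the patch itself: the word size,
the table, the function names and both cost models are assumptions
invented for illustration, and modes are identified only by their
size in bytes. A move is marked for lowering only when the word-sized
moves are strictly cheaper; how ties are handled is also an assumption.

```c
#include <stdbool.h>

#define WORD_BYTES     4    /* assumed word mode size (SI) */
#define MAX_MODE_BYTES 32   /* assumed widest mode considered */

/* Hypothetical target hook: cost of moving a value of the given size.  */
typedef int (*move_cost_fn)(unsigned mode_bytes);

/* One flag per mode size: true if moves of that size should be split.  */
static bool lower_move[MAX_MODE_BYTES + 1];

/* Run once at start-up: compare one wide move against the equivalent
   number of word-sized moves for every mode wider than a word.  */
void init_lower_move(move_cost_fn cost)
{
    int word_cost = cost(WORD_BYTES);
    for (unsigned size = 2 * WORD_BYTES; size <= MAX_MODE_BYTES;
         size += WORD_BYTES) {
        int n_words = (int)(size / WORD_BYTES);
        lower_move[size] = n_words * word_cost < cost(size);
    }
}

/* Query used later by the pass for each wide move it encounters.  */
bool should_lower_move(unsigned size)
{
    return size <= MAX_MODE_BYTES && lower_move[size];
}

/* Example cost model: wide moves are three times as expensive per word.  */
int cost_slow_wide(unsigned bytes)
{
    return bytes <= WORD_BYTES ? 1 : 3 * (int)(bytes / WORD_BYTES);
}

/* Example cost model: 16-byte vector moves cost the same as a word move.  */
int cost_fast_vector(unsigned bytes)
{
    return (int)((bytes + 15) / 16);
}
```

With cost_slow_wide every wide mode ends up marked for lowering, while
with cost_fast_vector none does, which is the behaviour wanted on
machines with fast wide moves.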
As written above, the mode is *not* enough. For MEM there is also an
address space involved.