[ARM/AArch64][0/3] Handle bitwise/bytewise reverse operations more effectively

Kyrill Tkachov Wed, 19 Mar 2014 02:56:26 -0700

Hi all,

This patch series attempts to improve code generation on arm and aarch64 for various bitwise operations that can be expressed with rev16 instructions in those architectures. In particular, expressions of the form:

((x & 0x00ff00ff) << 8) | ((x & 0xff00ff00) >> 8)

This can appear in places like the Linux kernel and can be directly mapped to a single rev16 instruction.
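
To illustrate why the mapping is valid, here is a minimal sketch (not part of the patches) that checks the expression against a straightforward "swap the bytes within each 16-bit halfword" reference, which is what rev16 does on a 32-bit register. The names rev16_expr, swap_halfword, rev16_reference and rev16_self_check are made up for this example, and x is assumed to be an unsigned 32-bit value.

```c
#include <assert.h>
#include <stdint.h>

/* The expression from above, applied to an unsigned 32-bit value. */
uint32_t rev16_expr(uint32_t x)
{
    return ((x & 0x00ff00ffu) << 8) | ((x & 0xff00ff00u) >> 8);
}

/* Hypothetical helper: swap the two bytes of one 16-bit halfword. */
uint16_t swap_halfword(uint16_t h)
{
    return (uint16_t)((h << 8) | (h >> 8));
}

/* Hypothetical reference: byte-swap each halfword independently,
   which is the behaviour of rev16 on a 32-bit register. */
uint32_t rev16_reference(uint32_t x)
{
    uint32_t lo = swap_halfword((uint16_t)(x & 0xffffu));
    uint32_t hi = swap_halfword((uint16_t)(x >> 16));
    return (hi << 16) | lo;
}

/* Compare the two forms on a handful of sample values. */
void rev16_self_check(void)
{
    const uint32_t samples[] = {
        0x00000000u, 0xffffffffu, 0x11223344u, 0xdeadbeefu, 0x00ff00ffu
    };
    for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++)
        assert(rev16_expr(samples[i]) == rev16_reference(samples[i]));
}
```

For example, rev16_expr(0x11223344) gives 0x22114433. Whether the compiler actually emits a single rev16 for rev16_expr depends on the target, the optimisation level and on patterns like the ones proposed in this series being present.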

This series has 3 parts:

[1/3] Add a new field to the rtx costs tables to represent the latency of the rev* group of instructions that will be used to accurately model the cost of these operations. Use it to properly cost existing patterns that generate rev16 (for bswap operations).

[2/3] Add aarch64 combine patterns to recognise the above bitwise operations and map them to rev16. Model the cost appropriately and add helper functions that can be reused by the arm backend.

[3/3] Define similar combine patterns for arm and reuse the helper functions introduced in patch 2/3 to properly cost them.


I'm proposing these for the next stage-1, of course.

Thanks,
Kyrill

[ARM/AArch64][0/3] Handle bitwise/bytewise reverse operations more effectively
