On Sat, 11 Oct 2025 16:59:34 +0530 Shreesh Adiga <[email protected]> wrote:
> Replace the clearing of lower 32 bits of XMM register with blend of
> zero register.
> Remove the clearing of upper 64 bits of tmp1 as it is redundant.
> tmp1 after clearing upper bits was being xor with tmp2 before the
> bits 96:65 from tmp2 were returned. The xor operation of bits 96:65
> remains unchanged due to tmp1 having bits 96:64 cleared to 0.
> After removing the xor operation, the clearing of upper 64 bits of tmp1
> becomes redundant and hence can be removed.
> Clang is able to optimize away the AND + memory operand with the
> above sequence, however GCC is still emitting the code for AND with
> memory operands which is being explicitly eliminated here.
>
> Additionally replace the 48 byte crc_xmm_shift_tab with the contents of
> shf_table which is 32 bytes, achieving the same functionality.
>

Applied to net-next
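
For readers following the first point of the quoted commit message, here is a minimal scalar sketch of why a blend with a zero register can stand in for an AND with a constant mask when clearing the lower 32 bits. It is not the patch's code: the vec128 type and the function names are hypothetical, and the 128-bit register is modelled as four 32-bit lanes in plain C rather than with SSE intrinsics.

```c
#include <stdint.h>

/* Hypothetical scalar model of a 128-bit XMM register.
 * lane[0] holds bits 31:0, lane[3] holds bits 127:96. */
typedef struct {
    uint32_t lane[4];
} vec128;

/* Modelled "before": AND with a constant mask. In real SIMD code the
 * mask is a 16-byte constant that the compiler may load from memory. */
static vec128 clear_low32_and(vec128 v)
{
    static const uint32_t mask[4] = {
        0x00000000u, 0xffffffffu, 0xffffffffu, 0xffffffffu
    };
    for (int i = 0; i < 4; i++)
        v.lane[i] &= mask[i];
    return v;
}

/* Modelled "after": blend with an all-zero register. Each set bit in
 * `select` takes that lane from `zero`; the selector is an immediate,
 * so no mask constant is needed. */
static vec128 clear_low32_blend(vec128 v)
{
    const vec128 zero = { { 0, 0, 0, 0 } };
    const unsigned select = 0x1u; /* replace lane 0 only */
    for (int i = 0; i < 4; i++) {
        if (select & (1u << i))
            v.lane[i] = zero.lane[i];
    }
    return v;
}

/* Returns 1 when all four lanes match, 0 otherwise. */
static int vec128_equal(vec128 a, vec128 b)
{
    for (int i = 0; i < 4; i++) {
        if (a.lane[i] != b.lane[i])
            return 0;
    }
    return 1;
}
```

Both helpers return the same value for any input, so the swap does not change the result. The difference the commit message points at is in code generation only: the blend form has no memory-resident mask for the compiler to load.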

