A couple more general instructions require quadword-aligned storage operands and 128-bit values held in even-odd pairs of 64-bit GPRs:
Compare Double and Swap (CDSG)
Compare and Swap and Store (CSST)

Having the ability to assemble quadword-aligned 128-bit items for use with these instructions would be helpful.
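
To make the alignment requirement concrete, here is a rough C sketch rather than assembler source. It only illustrates what a quadword-aligned 128-bit item looks like and how a CDSG-style compare-and-swap uses it. The names `quadword_t`, `is_quadword_aligned`, and `cdsg_model` are made up for this example, and the model is deliberately non-atomic: it mirrors the compare, store, and reload behaviour of the instruction but provides none of its interlocking.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical type: one 128-bit item as two 64-bit halves,
   forced onto a 16-byte (quadword) boundary. */
typedef struct {
    alignas(16) uint64_t hi; /* half that would sit in the even GPR */
    uint64_t lo;             /* half that would sit in the odd GPR  */
} quadword_t;

static_assert(sizeof(quadword_t) == 16, "item must be 128 bits");
static_assert(alignof(quadword_t) == 16, "item must be quadword aligned");

/* True when the address is a multiple of 16. */
static bool is_quadword_aligned(const void *p)
{
    return ((uintptr_t)p & 0xF) == 0;
}

/* Non-atomic model of the CDSG result (assumption: illustration only).
   Equal:   desired is stored at target, returns true  (like CC 0).
   Unequal: expected is reloaded from target, returns false (like CC 1). */
static bool cdsg_model(quadword_t *target, quadword_t *expected,
                       quadword_t desired)
{
    if (target->hi == expected->hi && target->lo == expected->lo) {
        *target = desired;
        return true;
    }
    *expected = *target;
    return false;
}
```

The two `static_assert` lines are the part that corresponds to the request above: the item is 16 bytes long and the compiler is told to place it on a 16-byte boundary, which is what an assembler would need to guarantee for the storage operand.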