Hello,

I'm trying to fix the scheduling of some floating-point instructions on c6x. In particular, the mpydp (double * double) and mpyspdp (float * double) instructions both have a multi-cycle functional unit latency, but that latency varies depending on which instruction comes next on the same functional unit.
Specifically, mpydp has a functional unit latency of 4 cycles, but when the next instruction on the unit is mpyspdp, that latency is 7 cycles. See SPRUFE8B [1], section 4.3.2 ".M-Unit Constraints", for the details.

I managed to produce correct code by changing the description of the mpydp reservation so that the latency is always 7 cycles, but that hurts performance, and I can't find a way to make the latency conditional on the following instruction.

Before you remind me, I am aware that this target is scheduled for removal soon.

Thanks for any help with this.

-- 
Richard Braun

[1] https://www.ti.com/lit/ug/sprufe8b/sprufe8b.pdf
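
For illustration, the constraint described in this message can be modeled with a toy Python sketch. This is not GCC code and not a machine description. It reads "latency" as the minimum issue distance between two consecutive instructions on the same .M unit, which is an assumption. Only the 4-cycle and 7-cycle figures come from the message. The names issue_gap, issue_cycles and DEFAULT_GAP are hypothetical, and the 1-cycle default used for every other instruction pair is a placeholder, not a value taken from SPRUFE8B.

```python
# Toy model of the .M unit constraint (illustration only, not GCC code).

MPYDP_GAP = 4             # from the message: mpydp followed by anything else
MPYDP_TO_MPYSPDP_GAP = 7  # from the message: mpydp followed by mpyspdp
DEFAULT_GAP = 1           # ASSUMPTION: placeholder for all other pairs


def issue_gap(prev, nxt, conservative=False):
    """Minimum number of cycles between issuing prev and nxt on one .M unit.

    With conservative=True, mpydp always reserves the unit for 7 cycles,
    which mirrors the workaround of bumping the latency unconditionally.
    """
    if prev == "mpydp":
        if conservative or nxt == "mpyspdp":
            return MPYDP_TO_MPYSPDP_GAP
        return MPYDP_GAP
    return DEFAULT_GAP


def issue_cycles(insns, conservative=False):
    """Earliest issue cycle of each instruction, in order, on a single unit."""
    cycles = []
    for i, insn in enumerate(insns):
        if i == 0:
            cycles.append(0)
        else:
            gap = issue_gap(insns[i - 1], insn, conservative)
            cycles.append(cycles[-1] + gap)
    return cycles
```

Comparing issue_cycles(seq) with issue_cycles(seq, conservative=True) on a sequence containing back-to-back mpydp instructions shows the cycles lost to the unconditional 7-cycle workaround. The conditional behavior of issue_gap is what the scheduler description would ideally express.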
