> Can atom execute two IMUL in parallel? Or what exactly is the pipeline
As I understand from Intel's optimization reference manual, the behavior is as
follows: if the instruction immediately following IMUL has a shorter latency,
execution stalls for 4 cycles (IMUL's latency); otherwise, an instruction with
a latency of 4 or more cycles can be issued after IMUL without a stall.
In other words, IMUL is pipelined with respect to other long-latency
instructions, but not with respect to short-latency instructions.
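
To make the rule concrete, here is a toy model of it in Python. This is only
a sketch of my reading of the manual, not a description of the real Atom
pipeline: the function name issue_schedule, the single-issue in-order
assumption, and the interpretation of "stalled for 4 cycles" as "the next
instruction issues 4 cycles after IMUL issued" are all my own assumptions.

```python
IMUL_LATENCY = 4  # cycles, as stated above

def issue_schedule(program):
    """Return (name, issue_cycle) pairs for a list of (name, latency) pairs.

    Toy rule (assumption): instructions issue in order, one per cycle,
    except that a shorter-latency instruction directly after an IMUL
    waits until IMUL_LATENCY cycles after that IMUL issued.
    """
    schedule = []
    cycle = 0
    prev_name = None
    for name, latency in program:
        if prev_name is not None:
            if prev_name == "imul" and latency < IMUL_LATENCY:
                cycle += IMUL_LATENCY  # stall behind the IMUL
            else:
                cycle += 1             # back-to-back issue, no stall
        schedule.append((name, cycle))
        prev_name = name
    return schedule

# IMUL followed by a 1-cycle ADD: the ADD is held back.
print(issue_schedule([("imul", 4), ("add", 1)]))
# IMUL followed by another IMUL: issued on the next cycle.
print(issue_schedule([("imul", 4), ("imul", 4)]))
```

Under this model, two back-to-back IMULs overlap in the pipeline, while an
IMUL followed by a short instruction does not.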
From reading the patch, I could not understand the link between pipeline
behavior and what the patch appears to do.
Hope that helps.