On Fri, 6 Nov 2009, Gabe Black wrote:
> I debated adding an hadd microop or adding a flag that changed the
> behavior of maddf, but in the end I didn't do either since I didn't have
> a ready way to test any implementation of hadd. Between those two I'd
> probably go with the hadd microop since maddf might end up overly
> complicated and hard to use. I think it would be reasonable (but not
> necessarily the right thing to do) to have an hadd microop since that
> might be something the SSE pipeline knew how to do directly.
I was thinking about it, and it's true that a "hadds" microop could make
haddps much shorter. In that case haddps could be implemented something
like the following:

    hadds ufp1, xmmh, dest=top
    hadds ufp2, xmml, dest=top
    hadds ufp1, xmmhm, dest=bottom
    hadds ufp2, xmmlm, dest=bottom
    movfp xmml, ufp2
    movfp xmmh, ufp1

That is 6 uops, which is the same number haddps takes on Core 2.
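
For testing any hadds-based version, a small reference model of the
architectural result of haddps might be handy to compare against. The
following is only a Python sketch, not m5 code: the names f32 and haddps
are made up for illustration, lane 0 is assumed to be the least
significant element, and it ignores MXCSR rounding modes, exceptions and
overflow.

```python
import struct


def f32(x):
    """Round a Python float to IEEE-754 single precision (round to nearest)."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def haddps(dest, src):
    """Reference result of haddps for two 4-lane single-precision vectors.

    Each vector is a list of 4 floats, lane 0 first. The low half of the
    result holds the pairwise sums of dest, the high half those of src.
    """
    return [
        f32(dest[0] + dest[1]),  # result lane 0
        f32(dest[2] + dest[3]),  # result lane 1
        f32(src[0] + src[1]),    # result lane 2
        f32(src[2] + src[3]),    # result lane 3
    ]


if __name__ == "__main__":
    print(haddps([1.0, 2.0, 3.0, 4.0], [10.0, 20.0, 30.0, 40.0]))
```

The same inputs could then be run through the microcoded instruction in
the simulator and the two results compared lane by lane.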
Vince