On Fri, 6 Nov 2009, Gabe Black wrote:

> I debated adding an hadd microop or adding a flag that changed the 
> behavior of maddf, but in the end I didn't do either since I didn't have 
> a ready way to test any implementation of hadd. Between those two I'd 
> probably go with the hadd microop since maddf might end up overly 
> complicated and hard to use. I think it would be reasonable (but not 
> necessarily the right thing to do) to have an hadd microop since that 
> might be something the SSE pipeline knew how to do directly. 

I was thinking about it, and it's true that a "hadds" microop could make 
haddps much shorter.  In that case haddps could be implemented something 
like the following:

     hadds ufp1, xmmh, dest=top
     hadds ufp2, xmml, dest=top
     hadds ufp1, xmmhm, dest=bottom
     hadds ufp2, xmmlm, dest=bottom
     movfp xmml, ufp2
     movfp xmmh, ufp1

That is 6 uops, the same number haddps takes on Core 2.
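
For reference, here is a minimal Python sketch of the result haddps has to produce, written in terms of one pairwise add per 64-bit half. The function names `hadds` and `haddps` are only illustrative models, not existing simulator code. The assumption is that the proposed microop sums the two single-precision values in one half, and Python floats (doubles) stand in for 32-bit floats. It says nothing about how the top/bottom destination selection in the listing above would be encoded.

```python
def hadds(half):
    """Hypothetical model of the proposed microop: add the two
    floats packed in one 64-bit half of an XMM register."""
    lo, hi = half
    return lo + hi


def haddps(dest, src):
    """Architectural result of HADDPS on two 4-float vectors.

    Element 0 is the least significant lane. The low half of the
    result holds the pairwise sums of dest, the high half holds the
    pairwise sums of src.
    """
    dest_lo, dest_hi = dest[:2], dest[2:]
    src_lo, src_hi = src[:2], src[2:]
    return [hadds(dest_lo), hadds(dest_hi), hadds(src_lo), hadds(src_hi)]


if __name__ == "__main__":
    print(haddps([1.0, 2.0, 3.0, 4.0], [10.0, 20.0, 30.0, 40.0]))
```

The four calls to `hadds` correspond to the four adds in the sequence, and the remaining two uops are the moves back into the destination halves.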

Vince
_______________________________________________
m5-dev mailing list
http://m5sim.org/mailman/listinfo/m5-dev