> On May 13, 2012, 7:25 p.m., Gabe Black wrote:
> > Thanks for stepping up and taking a shot at this. I saw some possible 
> > improvements to your implementation and, after playing with it a bit, came up 
> > with this:
> > 
> > 
> >     shuffle ufp1, xmml, xmmh, ext=((0 << 0) | (2 << 2)), size=4
> >     shuffle ufp2, xmml, xmmh, ext=((1 << 0) | (3 << 2)), size=4
> >     shuffle ufp3, xmmlm, xmmhm, ext=((0 << 0) | (2 << 2)), size=4
> >     shuffle ufp4, xmmlm, xmmhm, ext=((1 << 0) | (3 << 2)), size=4
> > 
> >     maddf xmml, ufp1, ufp2, size=4
> >     maddf xmmh, ufp3, ufp4, size=4
> > 
> > 
> > The memory versions follow naturally. It works/should work by moving the 
> > input values to the position they'll be in the answer with the "shuffle" 
> > microop, and then adding them together in parallel. I've verified that this 
> > compiles but haven't functionally tested it. Could you please use your test 
> > program to do that?
> > 
> > Also, the HADDPS_XMM_P version is basically the same as HADDPS_XMM_M; it 
> > just uses RIP-relative addressing for the memory operand. The microcode for 
> > those versions typically reads the RIP into microcode register t7 and then 
> > uses the riprel address computation shorthand, but is otherwise the same as 
> > the normal memory version. That addressing mode is only available in 64-bit 
> > mode, and to make sure you're using the version you want (RIP-relative 
> > versus regular) you may have to encode the instruction manually.
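
For the manual encoding mentioned above, here is a minimal Python sketch of 
the two byte sequences. It assumes the standard HADDPS encoding 
(F2 0F 7C /r), no REX prefix (so only xmm0 through xmm7), and 64-bit mode. 
The helper names and the displacement value are made up for illustration and 
are not part of gem5.

```python
import struct

# HADDPS xmm, xmm/m128 is encoded as F2 0F 7C /r.
HADDPS_OPCODE = bytes([0xF2, 0x0F, 0x7C])


def modrm(mod, reg, rm):
    """Pack the three ModRM fields into one byte."""
    return bytes([(mod << 6) | (reg << 3) | rm])


def haddps_rip_relative(xmm, disp32):
    """haddps xmmN, [rip + disp32] (hypothetical helper).

    In 64-bit mode, mod=00 with rm=101 selects RIP-relative addressing.
    """
    return HADDPS_OPCODE + modrm(0, xmm, 5) + struct.pack("<i", disp32)


def haddps_absolute(xmm, disp32):
    """haddps xmmN, [disp32] (hypothetical helper).

    In 64-bit mode a plain disp32 needs a SIB byte: rm=100 announces the
    SIB, and SIB 0x25 means no index, no base, disp32 only.
    """
    return (HADDPS_OPCODE + modrm(0, xmm, 4) + bytes([0x25])
            + struct.pack("<i", disp32))


if __name__ == "__main__":
    print(haddps_rip_relative(0, 0x10).hex())
    print(haddps_absolute(0, 0x10).hex())
```

The resulting bytes could be emitted with an assembler's .byte directive in a 
test program, which sidesteps whatever form the assembler would otherwise 
pick.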

And actually, if you wanted to write up little test programs for other 
instructions we're missing or even go as far as putting them into a new 
regression, that would be great. It would be a great tool for you (or anyone 
else) to get and keep those instructions working, and it would be nice to fill 
in some of the gaps in our x86 instruction support.
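
As a starting point for checking a test program's output, here is a rough 
Python model of the shuffle-then-add scheme quoted above, compared against 
the architectural definition of haddps. The shuffle() function is an 
assumption about the selector semantics (each 2-bit field of ext picks one of 
the four 32-bit lanes across the two source registers); it is not gem5's 
implementation, and it uses Python floats rather than true single precision.

```python
def shuffle(src1, src2, ext):
    """Assumed model of the "shuffle" microop with size=4.

    src1 and src2 each hold two 32-bit lanes. Result lane i is chosen from
    the four-lane pool src1 + src2 by bits [2*i+1 : 2*i] of ext.
    """
    pool = list(src1) + list(src2)
    return [pool[(ext >> (2 * i)) & 0x3] for i in range(2)]


def haddps_via_shuffle(dest, src):
    """Follow the quoted microcode: shuffle into place, then add in parallel."""
    xmml, xmmh = dest[0:2], dest[2:4]      # low/high halves of the destination
    xmmlm, xmmhm = src[0:2], src[2:4]      # low/high halves of the source
    even = (0 << 0) | (2 << 2)             # lanes 0 and 2 of the pool
    odd = (1 << 0) | (3 << 2)              # lanes 1 and 3 of the pool

    ufp1 = shuffle(xmml, xmmh, even)
    ufp2 = shuffle(xmml, xmmh, odd)
    ufp3 = shuffle(xmmlm, xmmhm, even)
    ufp4 = shuffle(xmmlm, xmmhm, odd)

    new_l = [a + b for a, b in zip(ufp1, ufp2)]   # maddf xmml, ufp1, ufp2
    new_h = [a + b for a, b in zip(ufp3, ufp4)]   # maddf xmmh, ufp3, ufp4
    return new_l + new_h


def haddps_reference(dest, src):
    """Architectural definition: pairwise sums of dest, then of src."""
    return [dest[0] + dest[1], dest[2] + dest[3],
            src[0] + src[1], src[2] + src[3]]


if __name__ == "__main__":
    d = [1.0, 2.0, 3.0, 4.0]
    s = [10.0, 20.0, 30.0, 40.0]
    print(haddps_via_shuffle(d, s), haddps_reference(d, s))
```

The same pairing of a microcode-shaped model with a direct reference 
function should carry over to the other horizontal add/subtract 
instructions.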


- Gabe


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/1189/#review2685
-----------------------------------------------------------


On May 11, 2012, 5:19 p.m., Marc Orr wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> http://reviews.gem5.org/r/1189/
> -----------------------------------------------------------
> 
> (Updated May 11, 2012, 5:19 p.m.)
> 
> 
> Review request for Default.
> 
> 
> Description
> -------
> 
> Changeset 8981:bd580154c720
> ---------------------------
> x86 ISA: Implement the sse3 haddps instruction.
> 
> This patch is a revised version of Vince Weaver's patch from 592.
> 
> 
> Diffs
> -----
> 
>   src/arch/x86/isa/decoder/two_byte_opcodes.isa 4388495beb44ba859d20177371caf9e14902ef91 
>   src/arch/x86/isa/insts/simd128/floating_point/arithmetic/horizontal_addition.py 4388495beb44ba859d20177371caf9e14902ef91 
> 
> Diff: http://reviews.gem5.org/r/1189/diff/
> 
> 
> Testing
> -------
> 
> Wrote a little program that uses haddps. I was able to test both the XMM_XMM 
> version and the XMM_M version. I don't understand what the XMM_P version is 
> so I was not able to test it.
> 
> 
> Thanks,
> 
> Marc Orr
> 
>

_______________________________________________
gem5-dev mailing list
[email protected]
http://m5sim.org/mailman/listinfo/gem5-dev