> On May 13, 2012, 7:25 p.m., Gabe Black wrote:
> > Thanks for stepping up and taking a shot at this. I saw some possible
> > improvements to your implementation and, after playing with it a bit,
> > came up with this:
> >
> >
> > shuffle ufp1, xmml, xmmh, ext=((0 << 0) | (2 << 2)), size=4
> > shuffle ufp2, xmml, xmmh, ext=((1 << 0) | (3 << 2)), size=4
> > shuffle ufp3, xmmlm, xmmhm, ext=((0 << 0) | (2 << 2)), size=4
> > shuffle ufp4, xmmlm, xmmhm, ext=((1 << 0) | (3 << 2)), size=4
> >
> > maddf xmml, ufp1, ufp2, size=4
> > maddf xmmh, ufp3, ufp4, size=4
> >
> >
> > The memory versions follow naturally. It works/should work by using the
> > "shuffle" microop to move the input values to the positions they'll
> > occupy in the answer, and then adding them together in parallel. I've
> > verified that this compiles but haven't functionally tested it. Could you
> > please use your test program to do that?
> >
> > Also, the HADDPS_XMM_P version is basically the same as HADDPS_XMM_M; it
> > just uses RIP-relative addressing for the memory operand. The microcode
> > for those versions typically reads the RIP into microcode register t7 and
> > then uses the riprel address computation shorthand, but is otherwise the
> > same as the normal memory version. That addressing mode is only available
> > in 64-bit mode, and to make sure you're using the version you want
> > (RIP-relative versus regular) you may have to encode the instruction
> > manually.
>
> Gabe Black wrote:
>     And actually, if you wanted to write up little test programs for other
>     instructions we're missing, or even go as far as putting them into a new
>     regression, that would be great. It would be a great tool for you (or
>     anyone else) to get and keep those instructions working, and it would be
>     nice to fill in some of the gaps in our x86 instruction support.
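
To make the shuffle-then-add idea quoted above concrete, here is a small Python model of it. This is only a sketch under stated assumptions, not gem5 code: it assumes each XMM register is held as two 64-bit halves (xmml/xmmh for the destination, xmmlm/xmmhm for the source), that each half holds two 32-bit lanes, and that the shuffle selector uses two bits per output lane, with values 0-1 picking from the first source and 2-3 from the second. The function names are hypothetical helpers, and the reference result follows the architectural definition of haddps.

```python
def shuffle(src1, src2, ext):
    """Assumed model of a two-lane shuffle with size=4.

    Each output lane has a 2-bit selector in ext: 0-1 pick a lane of
    src1, 2-3 pick a lane of src2. This mirrors my reading of the
    microop, not its actual implementation.
    """
    pool = list(src1) + list(src2)
    return [pool[(ext >> (2 * lane)) & 0x3] for lane in range(2)]


def maddf(a, b):
    """Lane-wise floating point add of two 2-lane values."""
    return [x + y for x, y in zip(a, b)]


def haddps_reference(dest, src):
    """Architectural result of haddps on two 4-lane vectors."""
    return [dest[0] + dest[1], dest[2] + dest[3],
            src[0] + src[1], src[2] + src[3]]


def haddps_microcode(dest, src):
    """Follow the quoted microcode sequence step by step."""
    xmml, xmmh = dest[:2], dest[2:]    # destination halves
    xmmlm, xmmhm = src[:2], src[2:]    # source halves

    ufp1 = shuffle(xmml, xmmh, (0 << 0) | (2 << 2))    # even dest lanes
    ufp2 = shuffle(xmml, xmmh, (1 << 0) | (3 << 2))    # odd dest lanes
    ufp3 = shuffle(xmmlm, xmmhm, (0 << 0) | (2 << 2))  # even src lanes
    ufp4 = shuffle(xmmlm, xmmhm, (1 << 0) | (3 << 2))  # odd src lanes

    new_xmml = maddf(ufp1, ufp2)
    new_xmmh = maddf(ufp3, ufp4)
    return new_xmml + new_xmmh


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0, 4.0]
    y = [10.0, 20.0, 30.0, 40.0]
    print(haddps_microcode(x, y))  # [3.0, 7.0, 30.0, 70.0]
```

If the selector assumption holds, the model agrees with the reference result, which is the same check a functional test of the real instruction would need to make.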

In the near term, I could certainly put the test program that I have for
haddps somewhere in the repo or make it into a regression if that would be
useful.


- Marc


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/1189/#review2685
-----------------------------------------------------------


On May 14, 2012, 6:27 p.m., Marc Orr wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> http://reviews.gem5.org/r/1189/
> -----------------------------------------------------------
>
> (Updated May 14, 2012, 6:27 p.m.)
>
>
> Review request for Default.
>
>
> Description
> -------
>
> Changeset 8981:463aba906774
> ---------------------------
> x86 ISA: Implement the sse3 haddps instruction.
>
> This patch is a revised version of Vince Weaver's patch from 592.
>
>
> Diffs
> -----
>
>   src/arch/x86/isa/decoder/two_byte_opcodes.isa 4388495beb44ba859d20177371caf9e14902ef91
>   src/arch/x86/isa/insts/simd128/floating_point/arithmetic/horizontal_addition.py 4388495beb44ba859d20177371caf9e14902ef91
>
> Diff: http://reviews.gem5.org/r/1189/diff/
>
>
> Testing
> -------
>
> Wrote a little program that uses haddps. All 3 haddps versions were tested
> (XMM_XMM, XMM_M, and XMM_P).
>
>
> Thanks,
>
> Marc Orr
>
>
