There's a subtle bug in this implementation that crops up because the operation is split into two halves. In HADDPD_XMM_XMM, if xmml and xmmlm refer to the same register (that is, the source and destination are the same xmm register), the first uop overwrites xmml before the second uop reads it as xmmlm, so the second uop adds the wrong value. To cover that possibility, you'll probably want the first uop to write to a temporary such as ufp1, and then add a move from that temporary into xmml at the end. I'm sure real hardware would do something smarter, like not forwarding the first result, but unfortunately we can't impose assumptions like that on the CPU models.
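To make the aliasing problem concrete, here is a small Python sketch that models the two uops as plain assignments on a dictionary of register halves. It is only a toy model under my own assumptions: the function names, the dictionary layout and the "_low"/"_high" key suffixes are made up for illustration and are not part of the simulator or the patch.

```python
# Toy model of the two-uop HADDPD_XMM_XMM sequence. Each xmm register is
# represented by two dictionary entries, "<name>_low" and "<name>_high".
# All names here are hypothetical and exist only for this illustration.

def haddpd_as_posted(regs, dest, src):
    """Mimic the two maddf uops in the order used by the patch."""
    lo, hi = dest + "_low", dest + "_high"
    src_lo, src_hi = src + "_low", src + "_high"
    # maddf xmml, xmmh, xmml
    regs[lo] = regs[hi] + regs[lo]
    # maddf xmmh, xmmlm, xmmhm
    # If src == dest, regs[src_lo] was already overwritten above.
    regs[hi] = regs[src_lo] + regs[src_hi]


def haddpd_with_temp(regs, dest, src):
    """Same operation, but the first result is parked in a temporary."""
    lo, hi = dest + "_low", dest + "_high"
    src_lo, src_hi = src + "_low", src + "_high"
    # First uop writes a temporary (standing in for ufp1), not xmml.
    ufp1 = regs[hi] + regs[lo]
    # Second uop still sees the original source halves.
    regs[hi] = regs[src_lo] + regs[src_hi]
    # Final move copies the temporary into the low half of the destination.
    regs[lo] = ufp1


def run(op, dest, src):
    """Run op on a fresh register file and return the destination halves."""
    regs = {
        "xmm0_low": 1.0, "xmm0_high": 2.0,
        "xmm1_low": 10.0, "xmm1_high": 20.0,
    }
    op(regs, dest, src)
    return regs[dest + "_low"], regs[dest + "_high"]


if __name__ == "__main__":
    # Same register as source and destination: the high half should be 3.0.
    print("as posted:", run(haddpd_as_posted, "xmm0", "xmm0"))
    print("with temp:", run(haddpd_with_temp, "xmm0", "xmm0"))
```

With distinct source and destination registers both versions agree; they only diverge when the two operands name the same register, which is the case described above.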
Gabe

Vince Weaver wrote:
> On Tue, 27 Oct 2009, Gabriel Michael Black wrote:
>
>> Thanks. I have a half completed change that replaces cryptic ext constants
>> with more meaningful names, and hopefully that makes those microops easier to
>> work with in the future.
>
> you're right, I missed that I had contrasting values for ext=
> within this patch. I wasted a lot of time fiddling with options when I
> couldn't get the maddf instruction to work, only to realize later it was
> working fine, it was the movhpd that was breaking things. I forgot to
> reset everything to sane values in the final version of the patch.
>
> Any work that clarifies the uop situation a bit would be welcome.
>
> Vince
>
>> Gabe
>>
>> Quoting Vince Weaver <[email protected]>:
>>
>>> This patch adds support for the haddpd sse instruction.
>>>
>>> Attached is a simple test case (which depends on the movhpd patch I sent
>>> earlier today).
>>>
>>> # HG changeset patch
>>> # User Vince Weaver <[email protected]>
>>> # Date 1256679237 14400
>>> # Node ID 71078e842148765616feba710e31616ee39d382c
>>> # Parent a3c85a29b838e0e15a459f64b2d83b821aacf520
>>> Add support for haddpd instruction to x86
>>>
>>> diff -r a3c85a29b838 -r 71078e842148 src/arch/x86/isa/decoder/two_byte_opcodes.isa
>>> --- a/src/arch/x86/isa/decoder/two_byte_opcodes.isa Tue Oct 27 09:24:40 2009 -0700
>>> +++ b/src/arch/x86/isa/decoder/two_byte_opcodes.isa Tue Oct 27 17:33:57 2009 -0400
>>> @@ -707,7 +707,7 @@
>>>      }
>>>      // operand size (0x66)
>>>      0x1: decode OPCODE_OP_BOTTOM3 {
>>> -        0x4: WarnUnimpl::haddpd_Vo_Wo();
>>> +        0x4: HADDPD(Vo,Wo);
>>>          0x5: WarnUnimpl::hsubpd_Vo_Wo();
>>>          0x6: WarnUnimpl::movd_Ed_Vd();
>>>          0x7: WarnUnimpl::movdqa_Wo_Vo();
>>> diff -r a3c85a29b838 -r 71078e842148 src/arch/x86/isa/insts/simd128/floating_point/arithmetic/horizontal_addition.py
>>> --- a/src/arch/x86/isa/insts/simd128/floating_point/arithmetic/horizontal_addition.py Tue Oct 27 09:24:40 2009 -0700
>>> +++ b/src/arch/x86/isa/insts/simd128/floating_point/arithmetic/horizontal_addition.py Tue Oct 27 17:33:57 2009 -0400
>>> @@ -55,5 +55,24 @@
>>>
>>>  microcode = '''
>>>  # HADDPS
>>> -# HADDPD
>>> +
>>> +def macroop HADDPD_XMM_XMM {
>>> +    maddf xmml, xmmh , xmml, size=8, ext=0
>>> +    maddf xmmh, xmmlm, xmmhm, size=8, ext=0
>>> +};
>>> +
>>> +def macroop HADDPD_XMM_M {
>>> +    ldfp ufp1, seg, sib, disp, dataSize=8
>>> +    ldfp ufp2, seg, sib, "DISPLACEMENT+8", dataSize=8
>>> +    maddf xmml, xmmh, xmml, size=8, ext=1
>>> +    maddf xmmh, ufp1, ufp2, size=8, ext=1
>>> +};
>>> +
>>> +def macroop HADDPD_XMM_P {
>>> +    rdip t7
>>> +    ldfp ufp1, seg, riprel, disp, dataSize=8
>>> +    ldfp ufp2, seg, riprel, "DISPLACEMENT+8", dataSize=8
>>> +    maddf xmml, xmmh, xmml, size=8, ext=1
>>> +    maddf xmmh, ufp1, ufp2, size=8, ext=1
>>> +};
>>>  '''

_______________________________________________
m5-dev mailing list
[email protected]
http://m5sim.org/mailman/listinfo/m5-dev
