On Mon, 26 Oct 2009, Gabe Black wrote:
> In actual implementations, I believe real hardware does things in two
> halves like we do which is why I did things that way. The shifts you
> mention below are an obvious exception, and I haven't figured out what
> to do with them yet.

I managed to put together code that could handle shifts of 0-7 bytes, but
couldn't figure out a way to handle 8-15 properly without some sort of
conditional branch. It might be doable with a conditional swap instruction
or something like the C "?:" operator, but I don't think the uop code has
support for anything like that.

In case you are interested, below is the list of FP/SSE instructions
reported as missing when running spec2k compiled with -msse3 for x86_64 on
X86_SE (this assumes movdqu/movdqa is implemented, and that prefetch is
turned into a no-op).

crafty, gcc and gzip complain about the missing psrldq_VRo_Ib instruction
but seem to run more or less correctly without it.

I'm not sure what the movd_Vo_Ed instruction pertains to; in digging
through the various manuals I couldn't figure out exactly which argument
types map to it.

missing insn    benchmarks
------------    ----------------
fadd            eon,equake,lucas
fldcw_Mw        all
fldenv          vpr.place,facerec
fldpi           eon,equake,lucas
fnstcw_Mw       all
fnstenv         vpr.place,facerec
fnstsw          eon,equake,fma3d,lucas,mesa
fprem1          eon,equake,lucas
fsincos         eon,equake,fma3d,mesa
fwait           applu,apsi,facerec,fma3d,galgel,lucas,
                mgrid,sixtrack,swim
fxch            eon,equake,lucas
haddpd_Vo_Wo    eon,galgel,twolf
haddps_Vo_Wo    vpr
movd_Vo_Ed      bzip,eon,fma3d,gcc
prefetch_nta    bzip,gzip,vpr
psrldq_VRo_Ib   crafty,gcc,parser,vpr
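
For reference, here is a rough sketch in plain C (not uop code) of the idea behind the psrldq byte shift discussed above: the 128-bit register is held as two 64-bit halves, and the 8-15 case is handled with a mask-based select instead of a branch. The names u128_halves and psrldq_sketch are made up for illustration, and this assumes the microcode has and/or/shift/negate-style operations to build the masks from, which I have not checked.

```c
#include <stdint.h>

/* Hypothetical container: a 128-bit value held as two 64-bit halves. */
typedef struct {
    uint64_t lo;
    uint64_t hi;
} u128_halves;

/* Sketch of a logical right shift by n bytes (psrldq-style) without
 * branching on n. Counts above 15 produce zero. */
static u128_halves psrldq_sketch(u128_halves v, unsigned n)
{
    /* All ones when bit 3 of n is set (n in 8..15), otherwise zero. */
    uint64_t big = -(uint64_t)((n >> 3) & 1);
    /* All ones when the count is out of range (n > 15), otherwise zero. */
    uint64_t over = -(uint64_t)(n > 15);

    /* Mask-based "conditional swap": for n >= 8 the high half moves
     * into the low half and the high half becomes zero. */
    uint64_t src_lo = (v.lo & ~big) | (v.hi & big);
    uint64_t src_hi = v.hi & ~big;

    /* Remaining shift of 0..7 bytes, expressed in bits (0..56). */
    unsigned s = (n & 7) * 8;

    /* Bits that cross from the high half into the low half. Splitting
     * the left shift in two avoids an undefined 64-bit shift when s == 0. */
    uint64_t carry = (src_hi << 1) << (63 - s);

    u128_halves r;
    r.lo = ((src_lo >> s) | carry) & ~over;
    r.hi = (src_hi >> s) & ~over;
    return r;
}
```

With lo = 0x0706050403020100 and hi = 0x0f0e0d0c0b0a0908 (each byte holding its own index), a shift of 3 gives lo = 0x0a09080706050403 and hi = 0x0000000f0e0d0c0b, and a shift of 11 gives lo = 0x0000000f0e0d0c0b and hi = 0. Whether the select step maps onto the existing uops is the open question; the sketch only shows that no data-dependent branch is needed in principle.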
