> On April 3, 2011, 5:27 p.m., Gabe Black wrote:
> > src/arch/x86/isa/insts/simd128/floating_point/arithmetic/horizontal_addition.py, line 41
> > <http://reviews.gem5.org/r/592/diff/1/?file=11018#file11018line41>
> >
> > ext is no longer set to a raw bitvector that selects per instruction
> > features like this since, as you can see, it's pretty opaque just looking
> > at it. The maddf ext=1 becomes ext=Scalar. For msrli and mslli, ext=0 is
> > the default and can be dropped. It would leave the ops as SIMD. Since
> > they're already operating at the full width of the fp register type (a
> > double) the value is especially redundant.

I've addressed this in a newly submitted patch (since I can't update this one).


> On April 3, 2011, 5:27 p.m., Gabe Black wrote:
> > src/arch/x86/isa/insts/simd128/floating_point/arithmetic/horizontal_addition.py, line 55
> > <http://reviews.gem5.org/r/592/diff/1/?file=11018#file11018line55>
> >
> > This implementation is a bit inefficient, although not terribly so. You
> > have to be careful since the two operands may be the same registers and you
> > don't want to overwrite something you still need, but, for instance, the
> > maddf one line above, this shift of ufp4 and the maddf on line 60 could all
> > update xmmh since all "high" halves of xmm registers have been read and no
> > faults can happen. The moves that read out xmmlm could be moved higher, and
> > xmml could also be updated directly.
> >
> > I think it -may- also be possible to do something clever and cut down
> > the number of microops shifting things around to pack and unpack the
> > results. I may have also suspected this was true when I wrote the much
> > simpler 64 bit wide version of this instruction below this one where the
> > components are whole registers and can be indexed directly, but then didn't
> > come up with anything and punted for later.

In a new patch (since I can't update this one), I removed a few redundant loads from each macroop. I didn't quite achieve the optimization you are suggesting, though.


> On April 3, 2011, 5:27 p.m., Gabe Black wrote:
> > src/arch/x86/isa/insts/simd128/floating_point/arithmetic/horizontal_addition.py, line 78
> > <http://reviews.gem5.org/r/592/diff/1/?file=11018#file11018line78>
> >
> > This microop is changing architecturally visible state and effectively
> > committing to completing the op before all the possibly faulting ops have
> > executed, specifically the following loads. There are 8 microcode fp
> > registers so you can just use the others and leave ufp3 around until the
> > end.
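
The ordering constraint in the comment above, together with the earlier note that the two operands may be the same register, can be sketched outside of microcode. What follows is a minimal Python model, not gem5 code: the names `haddps`, `MemFault`, and `load_src`, and the dictionary used as a register file, are all assumptions made for illustration. It only shows the general shape: perform every access that can fault first, compute into temporaries, and write the destination register last.

```python
from typing import Callable, Dict, List

Vec4 = List[float]  # four single-precision lanes, lane 0 is the lowest


class MemFault(Exception):
    """Hypothetical stand-in for a faulting memory access."""


def haddps(regs: Dict[str, Vec4], dest: str, load_src: Callable[[], Vec4]) -> None:
    """Model of haddps dest, src where src may come from memory.

    load_src is a hypothetical callback that returns the source operand
    and may raise MemFault, playing the role of the load microops.
    """
    # 1. Do everything that can fault before touching visible state.
    src = list(load_src())

    # 2. Copy the old destination value, so dest == src aliasing is harmless.
    old = list(regs[dest])

    # 3. Compute all four lanes in temporaries (the role of the ufp registers).
    result = [
        old[0] + old[1],  # low half comes from the destination operand
        old[2] + old[3],
        src[0] + src[1],  # high half comes from the source operand
        src[2] + src[3],
    ]

    # 4. Only now update the architecturally visible register.
    regs[dest] = result


if __name__ == "__main__":
    regs = {"xmm0": [1.0, 2.0, 3.0, 4.0]}
    haddps(regs, "xmm0", lambda: [10.0, 20.0, 30.0, 40.0])
    print(regs["xmm0"])  # [3.0, 7.0, 30.0, 70.0]
```

If `load_src` raises, the function exits before step 4 and `regs[dest]` still holds its old value, which is the property the review comment asks the microcode to preserve.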

I've addressed this in a newly submitted patch (since I can't update this one).


> On April 3, 2011, 5:27 p.m., Gabe Black wrote:
> > src/arch/x86/isa/insts/simd128/floating_point/arithmetic/horizontal_addition.py, line 108
> > <http://reviews.gem5.org/r/592/diff/1/?file=11018#file11018line108>
> >
> > Like above, this can't happen before the loads.

I've addressed this in a newly submitted patch (since I can't update this one).


- Marc


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/592/#review1096
-----------------------------------------------------------


On March 17, 2011, 4:07 p.m., Lisa Hsu wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> http://reviews.gem5.org/r/592/
> -----------------------------------------------------------
>
> (Updated March 17, 2011, 4:07 p.m.)
>
>
> Review request for Default, Ali Saidi, Gabe Black, Steve Reinhardt, and Nathan Binkert.
>
>
> Description
> -------
>
> X86: haddps: Another patch from Vince Weaver
>
>
> Diffs
> -----
>
>   src/arch/x86/isa/decoder/two_byte_opcodes.isa 2e269d6fb3e6
>   src/arch/x86/isa/insts/simd128/floating_point/arithmetic/horizontal_addition.py 2e269d6fb3e6
>
> Diff: http://reviews.gem5.org/r/592/diff/
>
>
> Testing
> -------
>
>
> Thanks,
>
> Lisa Hsu
>
>

_______________________________________________
gem5-dev mailing list
[email protected]
http://m5sim.org/mailman/listinfo/gem5-dev
