-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/2218/#review4996
-----------------------------------------------------------

configs/ruby/MESI_Three_Level.py
<http://reviews.gem5.org/r/2218/#comment4593>

I realise this is orthogonal to this patch, but is there any chance we could factor out the generic bits here so that they do not have to be repeated for every single coherency protocol? Just a thought...

- Andreas Hansson


On March 31, 2014, 10:53 a.m., Emilio Castillo wrote:
>
> (Updated March 31, 2014, 10:53 a.m.)
>
> Review request for Default.
>
> Repository: gem5
>
> Description
> -------
>
> This patch fixes the se.py script by adding the ruby clock domain.
>
> On its own, this would cause the ruby clock domain to be set at 1GHz by default.
> Simulations with the CPU clock set at 2GHz or 1GHz would then output the
> same time results, which could distort power measurements.
>
> The patch therefore also sets the clock domain for each coherence protocol: the L1s and
> the Sequencer share the CPU clock domain,
> while the rest of the components use the ruby clock domain.
>
> Thanks to Mr. Nilay Vaish for his help while figuring out what was happening.
>
> Diffs
> -----
>
>   configs/example/se.py 46ccaf2cdef3
>   configs/ruby/MESI_Three_Level.py 46ccaf2cdef3
>   configs/ruby/MESI_Two_Level.py 46ccaf2cdef3
>   configs/ruby/MOESI_CMP_directory.py 46ccaf2cdef3
>   configs/ruby/MOESI_CMP_token.py 46ccaf2cdef3
>   configs/ruby/MOESI_hammer.py 46ccaf2cdef3
>
> Diff: http://reviews.gem5.org/r/2218/diff/
>
> Testing
> -------
>
> This was tested using a timing CPU with code that is not memory- or I/O-bound,
> such as:
>
> #include <math.h>
> #include <stdio.h>
>
> int main(void)
> {
>     float var = 0.0f;
>     int i;
>     /* CPU-bound loop: no memory or I/O pressure */
>     for (i = 0; i < 800000; i++)
>         var = exp(var);
>     printf("var %f\n", var);
>     return 0;
> }
>
> If CPU freq.
> is halved from 2GHz to 1GHz, execution time is expected to increase (roughly double).
> (This has been verified with the classic memory model.)
>
> build/X86_MESI_Two_Level/gem5.fast configs/example/se.py -n 2 -c ./a.out --cpu-type=timing --caches --cpu-clock=2GHz
>
> sim_seconds             0.076027    # Number of seconds simulated
> system.cpu0.numCycles   152054288
>
> build/X86_MESI_Two_Level/gem5.fast configs/example/se.py -n 2 -c ./a.out --cpu-type=timing --caches --cpu-clock=1GHz
>
> sim_seconds             0.152036
> system.cpu0.numCycles   152035861
>
> However, if ruby is used (with se.py fixed by adding the ruby clock),
> execution time will be the same at 2GHz and 1GHz for the CPU.
>
> build/X86_MESI_Two_Level/gem5.fast configs/example/se.py -n 2 --ruby --num-l2cache=2 --num-dirs=2 -c ./a.out --cpu-clock=2GHz
>
> sim_seconds             0.304070
> system.cpu0.numCycles   608140702
>
> build/X86_MESI_Two_Level/gem5.fast configs/example/se.py -n 2 --ruby --num-l2cache=2 --num-dirs=2 -c ./a.out --cpu-clock=1GHz
>
> sim_seconds             0.304070    # Number of seconds simulated
> system.cpu0.numCycles   304070351
>
> Suppose the cache access latency is set to 2 cycles at 2GHz. If the CPU freq.
> is also set to 2GHz, a memory access will take
> 2 cycles, even for instruction fetch in the simple CPUs.
>
> If the CPU freq. is now lowered to 1GHz, each memory access to the L1s will
> take 1 cycle as seen from the CPU side.
> Instruction fetch will now take 1 cycle, so the 2GHz CPU executes
> exactly twice as many cycles as the 1GHz CPU:
>
> 608140702/304070351=2.0
>
> This patch fixes it with an approach similar to the one taken in the classic memory
> model, where the L1 controllers are set to the CPU clock domain.
>
>
> Thanks,
>
> Emilio Castillo
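
For anyone who wants to re-check the arithmetic in the testing notes, here is a minimal Python sketch. It only re-uses the sim_seconds values and the Ruby numCycles values quoted in the message; the helper names (slowdown, cycles_from_time) are made up for illustration and are not part of gem5. It assumes that numCycles should be roughly sim_seconds times the CPU frequency.

```python
GHZ = 10**9

# sim_seconds reported in the quoted runs, keyed by CPU clock in Hz
SIM_SECONDS = {
    "classic": {2 * GHZ: 0.076027, 1 * GHZ: 0.152036},
    "ruby": {2 * GHZ: 0.304070, 1 * GHZ: 0.304070},
}

# system.cpu0.numCycles reported for the two Ruby runs
RUBY_CYCLES = {2 * GHZ: 608140702, 1 * GHZ: 304070351}


def slowdown(times):
    """Simulated time at 1GHz divided by simulated time at 2GHz."""
    return times[1 * GHZ] / times[2 * GHZ]


def cycles_from_time(sim_seconds, freq_hz):
    """Cycles a CPU clocked at freq_hz ticks through in sim_seconds."""
    return sim_seconds * freq_hz


if __name__ == "__main__":
    # Classic memory model: halving the clock should about double the time.
    # Ruby (before the L1s move to the CPU clock domain): time is unchanged.
    for model, times in SIM_SECONDS.items():
        print(f"{model}: 1GHz/2GHz time ratio = {slowdown(times):.3f}")

    # With identical simulated time, the 2GHz CPU simply counts twice the cycles.
    print("ruby cycle ratio =", RUBY_CYCLES[2 * GHZ] / RUBY_CYCLES[1 * GHZ])

    # Sanity check: reported cycles against sim_seconds * frequency.
    for freq, cycles in RUBY_CYCLES.items():
        est = cycles_from_time(SIM_SECONDS["ruby"][freq], freq)
        print(f"{freq // GHZ}GHz: reported {cycles}, time x freq = {est:.0f}")
```

A time ratio near 2 for the classic model and exactly 1 for Ruby is the symptom described above: under Ruby the CPU clock changes the cycle count but not the simulated time.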
