Hey Abbas,

Please always reply to the gem5 mailing list, and CC me when
appropriate.

I can understand why you would like to have a fixed number.

I think the stats can vary due to a wide range of complex factors.
Some changes may make the model more accurate, for others no one knows
the effect, and others are just bugs.

This can also be seen from the fact that the stats checks have been
reporting CHANGED for a long time, e.g.:
https://www.mail-archive.com/[email protected]/msg26855.html
Changes happen so often that devs haven't found the time to properly
understand and justify them.

My recommendation is that you re-run your old experiments on the newer
gem5 version, and compare everything there.
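
A minimal sketch of how such a comparison could be scripted, assuming
both runs leave a stats.txt in the "name value # description" layout
shown in the quoted report. The names parse_stats and compare_stats and
the file paths in the usage comment are my own for illustration; they
are not part of gem5.

```python
import re
import sys

# One scalar stat per line: "<name> <number> ... # description".
STAT_LINE = re.compile(r"^(\S+)\s+(-?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?)(?=\s|$)")


def parse_stats(text):
    """Return {stat_name: value} for every scalar line in a stats.txt dump."""
    stats = {}
    for line in text.splitlines():
        match = STAT_LINE.match(line.strip())
        if match:
            # If the file holds several dumps, later values overwrite earlier ones.
            stats[match.group(1)] = float(match.group(2))
    return stats


def compare_stats(base, other):
    """Relative change (other vs. base) for stats present in both runs."""
    changes = {}
    for name in sorted(base.keys() & other.keys()):
        if base[name] != 0:
            changes[name] = (other[name] - base[name]) / base[name]
    return changes


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage (hypothetical paths):
    #   python compare_stats.py m5out/base/stats.txt m5out/hbm/stats.txt
    with open(sys.argv[1]) as f_base, open(sys.argv[2]) as f_other:
        changes = compare_stats(parse_stats(f_base.read()),
                                parse_stats(f_other.read()))
    # Print the largest relative changes first.
    for name, change in sorted(changes.items(), key=lambda kv: -abs(kv[1])):
        print(f"{name:50s} {change:+.2%}")
```

Running both the baseline and the new-memory experiment on the same
gem5 version and diffing them this way keeps simulator changes out of
the comparison.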

gem5 is not a cycle-accurate system simulator, so absolute values and
small variations are not meaningful in general.

This also teaches us that results obtained with small margins are
generally not meaningful for publication since the noise is too great.

What that error margin is, I don't know.
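
To put the reported difference in perspective, here is a small worked
calculation using only the numCycles and committedInsts values from the
two stats.txt excerpts quoted below. The helper rel_change is my own
naming, and this only measures the gap between the two runs; it says
nothing about what margin would be acceptable.

```python
# Values copied from the old and new stats.txt excerpts quoted below.
old = {"numCycles": 4273712, "committedInsts": 1954222}
new = {"numCycles": 4233161, "committedInsts": 1956251}


def rel_change(a, b):
    """Relative change from a to b, as a fraction of a."""
    return (b - a) / a


cycles_change = rel_change(old["numCycles"], new["numCycles"])
insts_change = rel_change(old["committedInsts"], new["committedInsts"])

# Cycles per committed instruction in each run.
cpi_old = old["numCycles"] / old["committedInsts"]
cpi_new = new["numCycles"] / new["committedInsts"]

print(f"numCycles change:      {cycles_change:+.2%}")
print(f"committedInsts change: {insts_change:+.2%}")
print(f"CPI old / new:         {cpi_old:.3f} / {cpi_new:.3f}")
```

The new run has roughly 0.95% fewer cycles and about 0.10% more
committed instructions, so the two versions differ by around one
percent on this workload.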

On Tue, Sep 11, 2018 at 3:51 PM Abbas Fairouz <[email protected]> wrote:
>
> Hi Ciro,
>
> Thanks for your reply.
>
> The reason I was asking about the differences between these two versions of
> GEM5 is that I published a paper two years ago using the old GEM5
> version. Now I need to do more experiments on GEM5 using new memory
> technologies (such as HBM). I'm getting different results in the new GEM5
> version for the same benchmarks I used in the old GEM5 version.
>
> Is it because the new GEM5 has more accurate:
> 1) Memory modeling?
> 2) Cache modeling?
> 3) CPU modeling?
>
>
>
>
> Best regards,
> Abbas Fairouz
>
>
> -------------------------------------------------
> Abbas Fairouz, PhD candidate
> Dept. of ECE, Texas A&M University
> College Station, TX 77843, USA
> -------------------------------------------------
>
>
> On Wed, Sep 5, 2018 at 4:52 AM, Ciro Santilli <[email protected]> wrote:
>>
>> Thanks for the detailed report,
>>
>> I recommend that if you really care about this difference, then do a 
>> bisection of gem5 and pinpoint which commit introduced it, and then tell us 
>> which one it was, possibly also pinging the author for clarification.
>>
>> If you are not familiar with bisection, here is a detailed example that you 
>> should be able to adapt easily for this use case: 
>> https://github.com/cirosantilli/linux-kernel-module-cheat/tree/83b36867cf06ffdca3ce04296a8568d4f37ea13b#bisection
>>
>>
>> On Tue, Sep 4, 2018 at 8:50 PM Abbas Fairouz <[email protected]> wrote:
>>>
>>> Hi guys,
>>>
>>> I have simulated a simple "hello world" example on two different versions
>>> of GEM5 and got different "system.cpu.numCycles" results in the two
>>> simulations. In both simulations, I used the same configuration
>>> (linux image, vm, caches, etc.).
>>>
>>> I will list the parts of the configuration files and "stats.txt" files for 
>>> both simulations.
>>>
>>> They have the same path to ~/gem5/system files.
>>> I ran them on the same configuration: FS mode, O3 CPU, CPU speed is 2GHz, 
>>> DDR3_1600, l2 cache.
>>>
>>>
>>> The run script is "test.rcS":
>>>
>>> /sbin/m5 resetstats
>>> echo "Start"
>>> echo `ls`
>>> cd test
>>> ./a.out
>>> echo "Bye"
>>> /sbin/m5 exit
>>>
>>>
>>>
>>> "a.out" is the compiled binary of "hello.c":
>>>
>>> #include <stdio.h>
>>>
>>> int main()
>>> {
>>>     //printf() displays the string inside quotation
>>>     printf("Hello, World!\n");
>>>
>>>     int x = 100 + 5 * 23 - 16 + 6 * 44 - 289 / 4;
>>>     printf("X = %d\n", x);
>>>
>>>     return 0;
>>> }
>>>
>>>
>>>
>>>
>>>
>>> ==========================================================
>>>
>>> Old GEM5 (gem5-stable-0e86fac7254c)
>>>
>>> In "configs/common/FSConfig.py":
>>>
>>>     # Command line
>>>
>>>     self.boot_osflags = 'earlyprintk=ttyS0 console=ttyS0 lpj=7999923 ' + \
>>>
>>>                         'root=/dev/hda1'
>>>
>>>     # abbas
>>>
>>>     #self.kernel = binary('x86_64-vmlinux-2.6.22.9')
>>>
>>>     self.kernel = binary('x86_64-vmlinux-2.6.22.9.smp')
>>>
>>>     #self.kernel = binary('x86_64-vmlinux-2.6.28.4-smp')
>>>
>>>     return self
>>>
>>>
>>>
>>> In "configs/common/Benchmarks.py":
>>>
>>> elif buildEnv['TARGET_ISA'] == 'x86':
>>>
>>>             # abbas
>>>
>>>             #return env.get('LINUX_IMAGE', disk('x86root.img'))
>>>
>>>             return env.get('LINUX_IMAGE', disk('x86root-taco.img'))
>>>
>>>
>>>
>>> In "configs/common/Simulation.py":
>>>
>>> elif options.fast_forward:
>>>
>>>         CPUClass = TmpClass
>>>
>>>         # Abbas
>>>
>>>         #TmpClass = AtomicSimpleCPU
>>>
>>>         #test_mem_mode = 'atomic'
>>>
>>>         TmpClass = TimingSimpleCPU
>>>
>>>         test_mem_mode = 'timing'
>>>
>>>
>>>
>>> Running GEM5 command:
>>>
>>> ./build/X86/gem5.opt -d m5out/test ./configs/example/fs.py --caches 
>>> --l2cache --l1d_size=128kB --script=myscripts/test.rcS 
>>> --mem-type=DDR3_1600_x64 --restore-with-cpu=detailed
>>>
>>>
>>> GEM5 terminal (tail):
>>>
>>> TCP cubic registered
>>>
>>> NET: Registered protocol family 1
>>>
>>> NET: Registered protocol family 10
>>>
>>> IPv6 over IPv4 tunneling driver
>>>
>>> NET: Registered protocol family 17
>>>
>>> EXT2-fs warning: maximal mount count reached, running e2fsck is recommended
>>>
>>> VFS: Mounted root (ext2 filesystem).
>>>
>>> Freeing unused kernel memory: 248k freed
>>>
>>> mounting filesystems...
>>>
>>> loading script...
>>>
>>> Start
>>>
>>> benches bin boot dev etc home lib lib32 lib64 linuxrc lost+found mnt normal 
>>> opt parsec proc real root sbin sys test tmp usr  var
>>>
>>> Hello
>>>
>>> X = 391
>>>
>>> Bye
>>>
>>>
>>>
>>>
>>> In "stats.txt" file:
>>>
>>> system.cpu.apic_clk_domain.clock                 8000                       
>>> # Clock period in ticks
>>>
>>> system.cpu.numCycles                          4273712                       
>>> # number of cpu cycles simulated
>>>
>>> system.cpu.numWorkItemsStarted                      0                       
>>> # number of work items this cpu started
>>>
>>> system.cpu.numWorkItemsCompleted                    0                       
>>> # number of work items this cpu completed
>>>
>>> system.cpu.committedInsts                     1954222                       
>>> # Number of instructions committed
>>>
>>> system.cpu.committedOps                       3584009                       
>>> # Number of ops (including micro ops) committed
>>>
>>> system.cpu.num_int_alu_accesses               3508387                       
>>> # Number of integer alu accesses
>>>
>>> system.cpu.num_fp_alu_accesses                  21132                       
>>> # Number of float alu accesses
>>>
>>> system.cpu.num_func_calls                       85033                       
>>> # number of times a function call or return      occured
>>>
>>> system.cpu.num_conditional_control_insts       254623                       
>>> # number of instructions that are conditional    controls
>>>
>>> system.cpu.num_int_insts                      3508387                       
>>> # number of integer instructions
>>>
>>> system.cpu.num_fp_insts                         21132                       
>>> # number of float instructions
>>>
>>> system.cpu.num_int_register_reads             7285240                       
>>> # number of times the integer registers were read
>>>
>>> system.cpu.num_int_register_writes            2775300                       
>>> # number of times the integer registers were     written
>>>
>>> system.cpu.num_fp_register_reads                35511                       
>>> # number of times the floating registers were    read
>>>
>>> system.cpu.num_fp_register_writes               16891                       
>>> # number of times the floating registers were    written
>>>
>>> system.cpu.num_cc_register_reads              1862494                       
>>> # number of times the CC registers were read
>>>
>>> system.cpu.num_cc_register_writes             1160708                       
>>> # number of times the CC registers were written
>>>
>>> system.cpu.num_mem_refs                        885650                       
>>> # number of memory refs
>>>
>>> system.cpu.num_load_insts                      499134                       
>>> # Number of load instructions
>>>
>>> system.cpu.num_store_insts                     386516                       
>>> # Number of store instructions
>>>
>>> system.cpu.num_idle_cycles               109958.492414                      
>>>  # Number of idle cycles
>>>
>>> system.cpu.num_busy_cycles               4163753.507586                     
>>>   # Number of busy cycles
>>>
>>> system.cpu.not_idle_fraction                 0.974271                       
>>> # Percentage of non-idle cycles
>>>
>>> system.cpu.idle_fraction                     0.025729                       
>>> # Percentage of idle cycles
>>>
>>> system.cpu.Branches                            374315                       
>>> # Number of branches fetched
>>>
>>> system.cpu.op_class::No_OpClass                 22624      0.63%      0.63% 
>>> # Class of executed instruction
>>>
>>> system.cpu.op_class::IntAlu                   2647876     73.88%     74.51% 
>>> # Class of executed instruction
>>>
>>> system.cpu.op_class::IntMult                     6228      0.17%     74.68% 
>>> # Class of executed instruction
>>>
>>> system.cpu.op_class::IntDiv                      3691      0.10%     74.78% 
>>> # Class of executed instruction
>>>
>>> system.cpu.op_class::FloatAdd                   18119      0.51%     75.29% 
>>> # Class of executed instruction
>>>
>>> system.cpu.op_class::FloatCmp                       0      0.00%     75.29% 
>>> # Class of executed instruction
>>>
>>> system.cpu.op_class::FloatCvt                       0      0.00%     75.29% 
>>> # Class of executed instruction
>>>
>>> system.cpu.op_class::FloatMult                      0      0.00%     75.29% 
>>> # Class of executed instruction
>>>
>>> system.cpu.op_class::FloatDiv                       0      0.00%     75.29% 
>>> # Class of executed instruction
>>>
>>> system.cpu.op_class::FloatSqrt                      0      0.00%     75.29% 
>>> # Class of executed instruction
>>>
>>> system.cpu.op_class::SimdAdd                        0      0.00%     75.29% 
>>> # Class of executed instruction
>>>
>>> system.cpu.op_class::SimdAddAcc                     0      0.00%     75.29% 
>>> # Class of executed instruction
>>>
>>> system.cpu.op_class::SimdAlu                        0      0.00%     75.29% 
>>> # Class of executed instruction
>>>
>>> system.cpu.op_class::SimdCmp                        0      0.00%     75.29% 
>>> # Class of executed instruction
>>>
>>> system.cpu.op_class::SimdCvt                        0      0.00%     75.29% 
>>> # Class of executed instruction
>>>
>>> system.cpu.op_class::SimdMisc                       0      0.00%     75.29% 
>>> # Class of executed instruction
>>>
>>> system.cpu.op_class::SimdMult                       0      0.00%     75.29% 
>>> # Class of executed instruction
>>>
>>>
>>>
>>>
>>>
>>>
>>> New GEM5
>>>
>>> In "configs/common/FSConfig.py":
>>>
>>> # Command line
>>>
>>>     if not cmdline:
>>>
>>>         cmdline = 'earlyprintk=ttyS0 console=ttyS0 lpj=7999923 root=/dev/hda1'
>>>
>>>     self.boot_osflags = fillInCmdline(mdesc, cmdline)
>>>
>>>     # abbas
>>>
>>>     #self.kernel = binary('x86_64-vmlinux-2.6.22.9')
>>>
>>>     self.kernel = binary('x86_64-vmlinux-2.6.22.9.smp')
>>>
>>>     #self.kernel = binary('x86_64-vmlinux-2.6.28.4-smp')
>>>
>>>     return self
>>>
>>>
>>>
>>> In "configs/common/Benchmarks.py":
>>>
>>> elif buildEnv['TARGET_ISA'] == 'x86':
>>>
>>>             # abbas
>>>
>>>             #return env.get('LINUX_IMAGE', disk('x86root.img'))
>>>
>>>             #return env.get('LINUX_IMAGE', disk('linux-x86.img'))
>>>
>>>             return env.get('LINUX_IMAGE', disk('x86root-taco.img'))
>>>
>>>
>>>
>>> In "configs/common/Simulation.py":
>>>
>>> elif options.fast_forward:
>>>
>>>         CPUClass = TmpClass
>>>
>>>         # Abbas
>>>
>>>         #TmpClass = AtomicSimpleCPU
>>>
>>>         #test_mem_mode = 'atomic'
>>>
>>>         TmpClass = TimingSimpleCPU
>>>
>>>         test_mem_mode = 'timing'
>>>
>>>
>>>
>>> Running GEM5 command:
>>>
>>> ./build/X86/gem5.opt -d m5out/test ./configs/example/fs.py --caches 
>>> --l2cache --l1d_size=128kB --script=myscripts/test.rcS 
>>> --mem-type=DDR3_1600_8x8 --restore-with-cpu=DerivO3CPU
>>>
>>>
>>>
>>>
>>> GEM5 terminal (tail):
>>>
>>> TCP cubic registered
>>>
>>> NET: Registered protocol family 1
>>>
>>> NET: Registered protocol family 10
>>>
>>> IPv6 over IPv4 tunneling driver
>>>
>>> NET: Registered protocol family 17
>>>
>>> EXT2-fs warning: maximal mount count reached, running e2fsck is recommended
>>>
>>> VFS: Mounted root (ext2 filesystem).
>>>
>>> Freeing unused kernel memory: 248k freed
>>>
>>> mounting filesystems...
>>>
>>> loading script...
>>>
>>> Start
>>>
>>> benches bin boot dev etc home lib lib32 lib64 linuxrc lost+found mnt normal 
>>> opt parsec proc real root sbin sys test tmp usr  var
>>>
>>> Hello
>>>
>>> X = 391
>>>
>>> Bye
>>>
>>>
>>>
>>>
>>> In "stats.txt" file:
>>>
>>> system.cpu_voltage_domain.voltage                   1                       
>>> # Voltage in Volts
>>>
>>> system.cpu_clk_domain.clock                       500                       
>>> # Clock period in ticks
>>>
>>> system.cpu.dtb.walker.pwrStateResidencyTicks::UNDEFINED 5141035093500       
>>>                 # Cumulative time (in ticks) in  various power states
>>>
>>> system.cpu.dtb.rdAccesses                      497427                       
>>> # TLB accesses on read requests
>>>
>>> system.cpu.dtb.wrAccesses                      384596                       
>>> # TLB accesses on write requests
>>>
>>> system.cpu.dtb.rdMisses                           434                       
>>> # TLB misses on read requests
>>>
>>> system.cpu.dtb.wrMisses                           163                       
>>> # TLB misses on write requests
>>>
>>> system.cpu.apic_clk_domain.clock                 8000                       
>>> # Clock period in ticks
>>>
>>> system.cpu.interrupts.pwrStateResidencyTicks::UNDEFINED 5141035093500       
>>>                 # Cumulative time (in ticks) in  various power states
>>>
>>> system.cpu.itb.walker.pwrStateResidencyTicks::UNDEFINED 5141035093500       
>>>                 # Cumulative time (in ticks) in  various power states
>>>
>>> system.cpu.itb.rdAccesses                           0                       
>>> # TLB accesses on read requests
>>>
>>> system.cpu.itb.wrAccesses                     2532817                       
>>> # TLB accesses on write requests
>>>
>>> system.cpu.itb.rdMisses                             0                       
>>> # TLB misses on read requests
>>>
>>> system.cpu.itb.wrMisses                           640                       
>>> # TLB misses on write requests
>>>
>>> system.cpu.numPwrStateTransitions                  64                       
>>> # Number of power state transitions
>>>
>>> system.cpu.pwrStateClkGateDist::samples            32                       
>>> # Distribution of time spent in the clock gated  state
>>>
>>> system.cpu.pwrStateClkGateDist::mean     1344463.875000                     
>>>   # Distribution of time spent in the clock      gated state
>>>
>>> system.cpu.pwrStateClkGateDist::stdev    1757712.048093                     
>>>   # Distribution of time spent in the clock      gated state
>>>
>>> system.cpu.pwrStateClkGateDist::1000-5e+10           32    100.00%    
>>> 100.00% # Distribution of time spent in the clock      gated state
>>>
>>> system.cpu.pwrStateClkGateDist::min_value       219525                      
>>>  # Distribution of time spent in the clock gated state
>>>
>>> system.cpu.pwrStateClkGateDist::max_value      4847757                      
>>>  # Distribution of time spent in the clock gated state
>>>
>>> system.cpu.pwrStateClkGateDist::total              32                       
>>> # Distribution of time spent in the clock gated  state
>>>
>>> system.cpu.pwrStateResidencyTicks::ON      2768793027                       
>>> # Cumulative time (in ticks) in various power    states
>>>
>>> system.cpu.pwrStateResidencyTicks::CLK_GATED     43022844                   
>>>     # Cumulative time (in ticks) in various      power states
>>>
>>> system.cpu.numCycles                          4233161                       
>>> # number of cpu cycles simulated
>>>
>>> system.cpu.numWorkItemsStarted                      0                       
>>> # number of work items this cpu started
>>>
>>> system.cpu.numWorkItemsCompleted                    0                       
>>> # number of work items this cpu completed
>>>
>>> system.cpu.kern.inst.arm                            0                       
>>> # number of arm instructions executed
>>>
>>> system.cpu.kern.inst.quiesce                        0                       
>>> # number of quiesce instructions executed
>>>
>>> system.cpu.committedInsts                     1956251                       
>>> # Number of instructions committed
>>>
>>> system.cpu.committedOps                       3569940                       
>>> # Number of ops (including micro ops) committed
>>>
>>> system.cpu.num_int_alu_accesses               3492413                       
>>> # Number of integer alu accesses
>>>
>>> system.cpu.num_fp_alu_accesses                  21132                       
>>> # Number of float alu accesses
>>>
>>> system.cpu.num_vec_alu_accesses                     0                       
>>> # Number of vector alu accesses
>>>
>>> system.cpu.num_func_calls                       84965                       
>>> # number of times a function call or return
>>>
>>>
>>>
>>>
>>> ==========================================================
>>>
>>>
>>>
>>> Can anyone explain to me why the two simulations do not have the same
>>> number of cycles?
>>>
>>> Old GEM5:    system.cpu.numCycles     4273712
>>> New GEM5:  system.cpu.numCycles     4233161
>>>
>>>
>>> Best regards,
>>> Abbas Fairouz
>>>
>>>
>>> -------------------------------------------------
>>> Abbas Fairouz, PhD candidate
>>> Dept. of ECE, Texas A&M University
>>> College Station, TX 77843, USA
>>> -------------------------------------------------
>>>
>>> _______________________________________________
>>> gem5-users mailing list
>>> [email protected]
>>> http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users
>
>