I suspect you're not running exactly the same binary in both cases. __libc_start_main is one of the functions provided by glibc (if I remember correctly) which run before main() and get some basic things set up. If one trace says __libc_start_main, the other one should say it too. Your new trace shows a raw address (0x470960) at that point instead, so either the binary is different or whatever looks up the symbol name was broken somehow.
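To make those two checks concrete, here is a rough Python sketch. It is not anything gem5 provides, and the helper names (sha256_of, same_binary, resolve) are made up for illustration. The first part compares two binaries by content hash. The second shows the kind of address-to-symbol lookup that turns 0x400364 into "_start+36". The symbol addresses in it are hypothetical: I only inferred them by lining up your two traces, which is valid only if both runs really use the same binary.

```python
import bisect
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_binary(path_a, path_b):
    """True if both files have byte-identical contents."""
    return sha256_of(path_a) == sha256_of(path_b)


def resolve(addr, symbols):
    """Map an address to 'name' or 'name+offset'.

    `symbols` is a list of (start_address, name) pairs sorted by
    address. Falls back to the hex address when no symbol starts at
    or below `addr`, which is what a trace looks like when the
    symbol table is missing.
    """
    starts = [start for start, _ in symbols]
    i = bisect.bisect_right(starts, addr) - 1
    if i < 0:
        return hex(addr)
    start, name = symbols[i]
    offset = addr - start
    return name if offset == 0 else f"{name}+{offset}"


# Hypothetical symbol table, inferred from the two traces (assumption).
SYMBOLS = [(0x400340, "_start"), (0x470960, "__libc_start_main")]

print(resolve(0x400364, SYMBOLS))  # _start+36
print(resolve(0x470962, SYMBOLS))  # __libc_start_main+2
print(resolve(0x470962, []))       # 0x470962 (no symbols loaded)
```

Calling same_binary() on the two h264 executables that the old and new trees actually load would settle the first question. If it returns True, the missing names point at the symbol lookup rather than at the binary.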
Gabe

On 04/14/12 22:50, Mahmood Naderan wrote:
> I reduced the number of fast forward to 20 instructions and maxinst to
> 10 and turn on the ExecAll flag.
>
> The old one looks like:
> 23000: system.cpu + A0 T0 : @_start+36.3 : CALL_NEAR_I : subi rsp, rsp, 0x8 : IntAlu : D=0x00007fffffffed38
> 24000: system.cpu + A0 T0 : @_start+36.4 : CALL_NEAR_I : wrip , t7, t1 : IntAlu :
> 25000: system.cpu + A0 T0 : @__libc_start_main : push r15
> 25000: system.cpu + A0 T0 : @__libc_start_main.0 : PUSH_R : st r15, SS:[rsp + 0xfffffffffffffff8] : MemWrite : D=0x0000000000000000 A=0x7fffffffed30
> hack: be nice to actually delete the event here
> Switched CPUS @ tick 25000
> Changing memory mode to timing
> switching cpus
> **** REAL SIMULATION ****
> info: Entering event queue @ 25000. Starting simulation...
> 67000: system.switch_cpus + A0 T0 : @__libc_start_main.1 : PUSH_R : subi rsp, rsp, 0x8 : IntAlu : D=0x00007fffffffed30 FetchSeq=1 CPSeq=0
> 67000: system.switch_cpus + A0 T0 : @__libc_start_main+2 : mov eax, 0
> 67000: system.switch_cpus + A0 T0 : @__libc_start_main+2.0 : MOV_R_I : limm eax, 0 : IntAlu : D=0x0000000000000000 FetchSeq=2 CPSeq=1
> 67000: system.switch_cpus + A0 T0 : @__libc_start_main+7 : push r14
>
> But the new one is:
> 23000: system.cpu + A0 T0 : 0x400364.3 : CALL_NEAR_I : subi rsp, rsp, 0x8 : IntAlu : D=0x00007fffffffed38
> 24000: system.cpu + A0 T0 : 0x400364.4 : CALL_NEAR_I : wrip , t7, t1 : IntAlu :
> 25000: system.cpu + A0 T0 : 0x470960 : push r15
> 25000: system.cpu + A0 T0 : 0x470960.0 : PUSH_R : st r15, SS:[rsp + 0xfffffffffffffff8] : MemWrite : D=0x0000000000000000 A=0x7fffffffed30
> 26000: system.cpu + A0 T0 : 0x470960.1 : PUSH_R : subi rsp, rsp, 0x8 : IntAlu : D=0x00007fffffffed30
> 27000: system.cpu + A0 T0 : 0x470962 : mov eax, 0
>
> As you can see, in the old version switch at tick 25000 but the new
> version switch at 41000. The gap is large though.
>
> Do you know what does " @__libc_start_main" mean in the old version?
>
> On 4/15/12, Mahmood Naderan <[email protected]> wrote:
>> I am trying what you said, but can you clarify this:
>>
>> Although the -F option is 20M instruction in both versions, I noticed that
>> the old version enters real simulation at tick 22,407,755,000 but the new
>> version enters at tick 90,443,309,000
>>
>> I made the config files as closely as possible (same system bus freq, O3
>> parameters, ...)
>>
>> Why they switch at different tick numbers?
>> --
>> // Naderan *Mahmood;
>>
>>
>> On Sun, Apr 15, 2012 at 9:35 AM, Korey Sewell <[email protected]> wrote:
>>
>>> - make every O3CPU parameter that is different in the new version, the
>>> same as the old version
>>>
>>> - check the stats file for major differences.
>>> For example: Are the L1/L2 miss rates higher or lower? Are your caches the
>>> same size and associativity? This is h.264, so is there a lot of floating
>>> point insts being committed? If so, maybe the change is in the latencies of
>>> the FP-Unit in the Function Unit Pool.
>>>
>>> - run gem5 for a small # of instructions (e.g. maxinsts=10) and see if
>>> there is a difference in the number of ticks it takes to complete (this is
>>> *after* all the O3 parameters are the same). If there is a difference, then
>>> turn on some O3 flags or check the stats and see what's going on there. If
>>> there is no difference increase the maxinsts and try again until you see
>>> the simulations diverging.
>>>
>>>
>>>
>>> On Sun, Apr 15, 2012 at 12:46 AM, Mahmood Naderan <[email protected]> wrote:
>>>
>>>> I did that.
>>>> There are some differences and I attached them.
>>>> In short, I see this:
>>>>
>>>> old:
>>>> children=dcache dtb icache itb tracer workload
>>>>
>>>> new:
>>>> children=dcache dtb icache interrupts itb tracer workload
>>>>
>>>> Also the commitwidth, fetchwidth and some other parameters are 8 in the
>>>> new version, but they are 4 in the old version. So I really wonder why it
>>>> has a very low IPC.
>>>>
>>>> I will be greatly thankful if someone else try that.
>>>> Also, I emailed another problem at
>>>> http://permalink.gmane.org/gmane.comp.emulators.m5.devel/14987 about
>>>> "Unable to find destination for addr" which I encountered in the new
>>>> version.
>>>>
>>>> Appreciate any idea.
>>>>
>>>>
>>>>
>>>>> I believe the 'dotencode' message just means you should upgrade to a newer version of mercurial.
>>>> ok I will try that.
>>>> --
>>>> // Naderan *Mahmood;
>>>>
>>>>
>>>>
>>>> On Sun, Apr 15, 2012 at 3:45 AM, Steve Reinhardt <[email protected]> wrote:
>>>>
>>>>> I believe the 'dotencode' message just means you should upgrade to a
>>>>> newer version of mercurial.
>>>>>
>>>>>
>>>>> On Sat, Apr 14, 2012 at 10:36 AM, Mahmood Naderan <[email protected]> wrote:
>>>>>
>>>>>> I forgot to say that I removed the 'dotencode' feature and the "hg
>>>>>> heads" says:
>>>>>>
>>>>>> mahmood@tiger:gem5$ hg heads
>>>>>> changeset: 8920:99083b5b7ed4
>>>>>> abort: data/.hgtags.i@b151ff1fd9df: no match found!
>>>>>>
>>>>>>
>>>>>> On 4/14/12, Mahmood Naderan <[email protected]> wrote:
>>>>>>> For the old one, I use:
>>>>>>> build/X86_SE/m5.fast configs/example/cmp.py -F 20000000 --maxtick 10000000000 -d --caches --l2cache -b h264_sss --prog-interval=1000000
>>>>>>>
>>>>>>> for the new one I use:
>>>>>>> build/X86/m5.fast configs/example/cmp.py --cpu-type=detailed -F 20000000 --maxtick 10000000000 --caches --l2cache -b h264_sss --prog-interval=1000000
>>>>>>>
>>>>>>> I attached the configs and stats.
>>>>>>> Thanks
>>>>>>>
>>>>>>> On 4/14/12, Nilay Vaish <[email protected]> wrote:
>>>>>>>> So, with 8613:712d8bf07020 you got and IPC of 1.54, and with some version
>>>>>>>> near 8944:d062cc7a8bdf, you get an ipc of 0.093. Which CPU type are you
>>>>>>>> using?
>>>>>>>>
>>>>>>>> --
>>>>>>>> Nilay
>>>>>>>>
>>>>>>>> On Sat, 14 Apr 2012, Mahmood Naderan wrote:
>>>>>>>>
>>>>>>>>> The previous release is:
>>>>>>>>> changeset: 8613:712d8bf07020
>>>>>>>>> tag: tip
>>>>>>>>> user: Nilay Vaish<[email protected]>
>>>>>>>>> date: Sat Nov 05 15:32:23 2011 -0500
>>>>>>>>> summary: Tests: Update stats due to addition of fence microop
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> And the IPC is 1.541534
>>>>>>>>>
>>>>>>>>> However for the new release, I am not able to find the head:
>>>>>>>>> mahmood@tiger:gem5$ hg head
>>>>>>>>> abort: requirement 'dotencode' not supported!
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 4/14/12, Nilay Vaish <[email protected]> wrote:
>>>>>>>>>> How much is the difference and which versions of gem5 are you talking
>>>>>>>>>> about?
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Nilay
>>>>>>>>>>
>>>>>>>>>> On Sat, 14 Apr 2012, Mahmood Naderan wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi,
>>>>>>>>>>> In the new version, I see that the IPC of h264 (with sss input) is
>>>>>>>>>>> very very low. However with the previous releases, this value is fine
>>>>>>>>>>> and acceptable.
>>>>>>>>>>>
>>>>>>>>>>> Do you know how can I find the bottleneck? Which stat value shows the
>>>>>>>>>>> weired behaviour?
>>>>>>>>>>>
>>>>>>>>>>> ISA = x86
>>>>>>>>>>> -F = 50,000,000
>>>>>>>>>>> --maxtick = 10,000,000,000
>>>>>>>>>>> L1 = 32kB, 4
>>>>>>>>>>> L2 = 2MB, 16
>>>>>>>>>>>
>>>>>>>>>>> the IPC obtained is 0.093432
>>>>>>>>>>>
>>>>>>>>>>> Have you faced such result?
>>>>>>>>>>> Please let me know
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> // Naderan *Mahmood;
>>>>>>>>>>> _______________________________________________
>>>>>>>>>>> gem5-users mailing list
>>>>>>>>>>> [email protected]
>>>>>>>>>>> http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users
>>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> --
>>>>>>>>> // Naderan *Mahmood;
>>>>>>>
>>>>>>> --
>>>>>>> --
>>>>>>> // Naderan *Mahmood;
>>>>>>
>>>>>> --
>>>>>> --
>>>>>> // Naderan *Mahmood;
>>>
>>> --
>>> - Korey
