Thanks for helping track this down. To expand on Nilay's suggestion, you can use the 'hg bisect' command to do the binary search for you (see 'hg help bisect' or search online for more info).
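For what it's worth, here is a rough sketch in Python of the search that 'hg bisect' automates. It is only an illustration under some assumptions: it treats the history as a straight line of revision numbers (real Mercurial history is a graph, which 'hg bisect' handles for you), it assumes every revision after the first bad one is also bad, and `ipc_is_low` with its cutoff of 8800 is a made-up stand-in for "update to that revision, rebuild, run h264_sss and look at the IPC". The endpoints 8613 and 8954 are the known-good and known-bad revisions mentioned in this thread. With hg itself the usual flow is 'hg bisect --reset', then 'hg bisect --good 8613' and 'hg bisect --bad 8954', then marking each revision hg checks out with 'hg bisect --good' or 'hg bisect --bad' after testing it.

```python
from typing import Callable, List


def first_bad_revision(good: int, bad: int, is_bad: Callable[[int], bool]) -> int:
    """Return the earliest revision in (good, bad] for which is_bad() is true.

    Assumes `good` is known good, `bad` is known bad, and that every
    revision after the first bad one is also bad (linear history).
    """
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(mid):
            bad = mid   # the regression is at mid or earlier
        else:
            good = mid  # the regression is after mid
    return bad


# Hypothetical stand-in for the real test (rebuild gem5 at `rev`, run the
# benchmark, compare the IPC against a threshold). The cutoff 8800 is
# invented purely so the sketch can run.
HYPOTHETICAL_FIRST_BAD = 8800
tested: List[int] = []


def ipc_is_low(rev: int) -> bool:
    tested.append(rev)
    return rev >= HYPOTHETICAL_FIRST_BAD


culprit = first_bad_revision(8613, 8954, ipc_is_low)
print("first bad revision:", culprit)
print("revisions tested:", len(tested))
```

The point is that the 341 revisions between the two endpoints need only a handful of build-and-run cycles rather than one per revision.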
If you can pinpoint the changeset where the IPC drops, that will probably identify the issue directly.

Steve

On Sun, Apr 15, 2012 at 8:26 AM, Nilay Vaish <[email protected]> wrote:

> You can binary search through the versions and figure out the earliest
> version which shows a low ipc.
>
> --
> Nilay
>
>
> On Sun, 15 Apr 2012, Mahmood Naderan wrote:
>
>> With an untouched latest revision 8954:3c7232fec7fd
>> the problem still exists. No matter what is the previous version, an
>> IPC of 0.077 or 0.03 are not normal
>>
>> On 4/15/12, Mahmood Naderan <[email protected]> wrote:
>>
>>> I haven't change the new version yet. There maybe something wrong with
>>> the loader. But I am not sure. Who can check that?
>>>
>>>
>>> P.S: Dear Gabe, I think there is something wrong with the address
>>> translator. Greatly appreciate if you check
>>> http://permalink.gmane.org/gmane.comp.emulators.m5.users/9944
>>>
>>> On 4/15/12, Gabe Black <[email protected]> wrote:
>>>
>>>> It's worth looking into why it doesn't find the __libc_start_main symbol
>>>> in the new version. If it's a bug we should fix it, even if it doesn't
>>>> directly have anything to do with your problem. You can also try
>>>> versions between your new and old one and see where things start
>>>> behaving poorly. This is of course assuming you haven't changed the
>>>> simulator in some way. If you have, all bets are off since that might be
>>>> what's changing the behavior.
>>>>
>>>> Gabe
>>>>
>>>> On 04/14/12 23:31, Mahmood Naderan wrote:
>>>>
>>>>> Well, in MyBench.py there is only one entry for h264_sss
>>>>> h264_dir = spec_dir + '464.h264ref/exe/'
>>>>> h264_bin = h264_dir + 'h264ref_base.amd64-m64-gcc44-nn'
>>>>> h264_sss_data = h264_dir + 'sss_encoder_main.cfg'
>>>>>
>>>>> h264_sss = LiveProcess()
>>>>> h264_sss.executable = h264_bin
>>>>> h264_sss.cmd = [h264_sss.executable] + ['-d', h264_sss_data]
>>>>> h264_sss.cwd = h264_dir
>>>>>
>>>>>
>>>>>
>>>>> On 4/15/12, Gabe Black <[email protected]> wrote:
>>>>>
>>>>>> I suspect you're not running exactly the same binary in both cases.
>>>>>> __libc_start_main is one of the functions provided by glibc (if I
>>>>>> remember correctly) which run before main() and get some basic things
>>>>>> set up. If it says __libc_start_main in one, it should say it in the
>>>>>> other one too, unless the thing that finds the symbol name was broken
>>>>>> somehow.
>>>>>>
>>>>>> Gabe
>>>>>>
>>>>>> On 04/14/12 22:50, Mahmood Naderan wrote:
>>>>>>
>>>>>>> I reduced the number of fast forward to 20 instructions and maxinst to
>>>>>>> 10 and turn on the ExecAll flag.
>>>>>>>
>>>>>>> The old one looks like:
>>>>>>> 23000: system.cpu + A0 T0 : @_start+36.3 : CALL_NEAR_I : subi rsp, rsp, 0x8 : IntAlu : D=0x00007fffffffed38
>>>>>>> 24000: system.cpu + A0 T0 : @_start+36.4 : CALL_NEAR_I : wrip , t7, t1 : IntAlu :
>>>>>>> 25000: system.cpu + A0 T0 : @__libc_start_main : push r15
>>>>>>> 25000: system.cpu + A0 T0 : @__libc_start_main.0 : PUSH_R : st r15, SS:[rsp + 0xfffffffffffffff8] : MemWrite : D=0x0000000000000000 A=0x7fffffffed30
>>>>>>> hack: be nice to actually delete the event here
>>>>>>> Switched CPUS @ tick 25000
>>>>>>> Changing memory mode to timing
>>>>>>> switching cpus
>>>>>>> **** REAL SIMULATION ****
>>>>>>> info: Entering event queue @ 25000. Starting simulation...
>>>>>>> 67000: system.switch_cpus + A0 T0 : @__libc_start_main.1 : PUSH_R : subi rsp, rsp, 0x8 : IntAlu : D=0x00007fffffffed30 FetchSeq=1 CPSeq=0
>>>>>>> 67000: system.switch_cpus + A0 T0 : @__libc_start_main+2 : mov eax, 0
>>>>>>> 67000: system.switch_cpus + A0 T0 : @__libc_start_main+2.0 : MOV_R_I : limm eax, 0 : IntAlu : D=0x0000000000000000 FetchSeq=2 CPSeq=1
>>>>>>> 67000: system.switch_cpus + A0 T0 : @__libc_start_main+7 : push r14
>>>>>>>
>>>>>>>
>>>>>>> But the new one is:
>>>>>>> 23000: system.cpu + A0 T0 : 0x400364.3 : CALL_NEAR_I : subi rsp, rsp, 0x8 : IntAlu : D=0x00007fffffffed38
>>>>>>> 24000: system.cpu + A0 T0 : 0x400364.4 : CALL_NEAR_I : wrip , t7, t1 : IntAlu :
>>>>>>> 25000: system.cpu + A0 T0 : 0x470960 : push r15
>>>>>>> 25000: system.cpu + A0 T0 : 0x470960.0 : PUSH_R : st r15, SS:[rsp + 0xfffffffffffffff8] : MemWrite : D=0x0000000000000000 A=0x7fffffffed30
>>>>>>> 26000: system.cpu + A0 T0 : 0x470960.1 : PUSH_R : subi rsp, rsp, 0x8 : IntAlu : D=0x00007fffffffed30
>>>>>>> 27000: system.cpu + A0 T0 : 0x470962 : mov eax, 0
>>>>>>>
>>>>>>>
>>>>>>> As you can see, in the old version switch at tick 25000 but the new
>>>>>>> version switch at 41000. The gap is large though.
>>>>>>>
>>>>>>> Do you know what does " @__libc_start_main" mean in the old version?
>>>>>>>
>>>>>>> On 4/15/12, Mahmood Naderan <[email protected]> wrote:
>>>>>>>
>>>>>>>> I am trying what you said, but can you clarify this:
>>>>>>>>
>>>>>>>> Although the -F option is 20M instruction in both versions, I noticed that
>>>>>>>> the old version enters real simulation at tick 22,407,755,000 but the new
>>>>>>>> version enters at tick 90,443,309,000
>>>>>>>>
>>>>>>>> I made the config files as closely as possible (same system bus freq, O3
>>>>>>>> parameters, ...)
>>>>>>>>
>>>>>>>> Why they switch at different tick numbers?
>>>>>>>> --
>>>>>>>> // Naderan *Mahmood;
>>>>>>>>
>>>>>>>>
>>>>>>>> On Sun, Apr 15, 2012 at 9:35 AM, Korey Sewell <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> - make every O3CPU parameter that is different in the new version, the
>>>>>>>>> same as the old version
>>>>>>>>>
>>>>>>>>> - check the stats file for major differences.
>>>>>>>>> For example: Are the L1/L2 miss rates higher or lower? Are your caches the
>>>>>>>>> same size and associativity? This is h.264, so is there a lot of floating
>>>>>>>>> point insts being committed? If so, maybe the change is in the latencies of
>>>>>>>>> the FP-Unit in the Function Unit Pool.
>>>>>>>>>
>>>>>>>>> - run gem5 for a small # of instructions (e.g. maxinsts=10) and see if
>>>>>>>>> there is a difference in the number of ticks it takes to complete (this is
>>>>>>>>> *after* all the O3 parameters are the same). If there is a difference, then
>>>>>>>>> turn on some O3 flags or check the stats and see what's going on there. If
>>>>>>>>> there is no difference increase the maxinsts and try again until you see
>>>>>>>>> the simulations diverging.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Sun, Apr 15, 2012 at 12:46 AM, Mahmood Naderan <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> I did that.
>>>>>>>>>> There are some differences and I attached them. In short, I see this:
>>>>>>>>>>
>>>>>>>>>> old:
>>>>>>>>>> children=dcache dtb icache itb tracer workload
>>>>>>>>>>
>>>>>>>>>> new:
>>>>>>>>>> children=dcache dtb icache interrupts itb tracer workload
>>>>>>>>>>
>>>>>>>>>> Also the commitwidth, fetchwidth and some other parameters are 8 in the
>>>>>>>>>> new version, but they are 4 in the old version.
>>>>>>>>>> So I really wonder why it
>>>>>>>>>> has a very low IPC.
>>>>>>>>>>
>>>>>>>>>> I will be greatly thankful if someone else try that.
>>>>>>>>>> Also, I emailed another problem at
>>>>>>>>>> http://permalink.gmane.org/gmane.comp.emulators.m5.devel/14987
>>>>>>>>>> about "Unable to find destination for addr" which I encountered in the
>>>>>>>>>> new version.
>>>>>>>>>>
>>>>>>>>>> Appreciate any idea.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> I believe the 'dotencode' message just means you should upgrade to a
>>>>>>>>>>> newer version of mercurial.
>>>>>>>>>> ok I will try that.
>>>>>>>>>> --
>>>>>>>>>> // Naderan *Mahmood;
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Sun, Apr 15, 2012 at 3:45 AM, Steve Reinhardt <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> I believe the 'dotencode' message just means you should upgrade to a
>>>>>>>>>>> newer version of mercurial.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Sat, Apr 14, 2012 at 10:36 AM, Mahmood Naderan <[email protected]> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> I forgot to say that I removed the 'dotencode' feature and the "hg
>>>>>>>>>>>> heads" says:
>>>>>>>>>>>>
>>>>>>>>>>>> mahmood@tiger:gem5$ hg heads
>>>>>>>>>>>> changeset: 8920:99083b5b7ed4
>>>>>>>>>>>> abort: data/.hgtags.i@b151ff1fd9df: no match found!
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On 4/14/12, Mahmood Naderan <[email protected]> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> For the old one, I use:
>>>>>>>>>>>>> build/X86_SE/m5.fast configs/example/cmp.py -F 20000000 --maxtick 10000000000 -d --caches --l2cache -b h264_sss --prog-interval=1000000
>>>>>>>>>>>>>
>>>>>>>>>>>>> for the new one I use:
>>>>>>>>>>>>> build/X86/m5.fast configs/example/cmp.py --cpu-type=detailed -F 20000000 --maxtick 10000000000 --caches --l2cache -b h264_sss --prog-interval=1000000
>>>>>>>>>>>>>
>>>>>>>>>>>>> I attached the configs and stats. Thanks
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 4/14/12, Nilay Vaish <[email protected]> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> So, with 8613:712d8bf07020 you got and IPC of 1.54, and with some version
>>>>>>>>>>>>>> near 8944:d062cc7a8bdf, you get an ipc of 0.093. Which CPU type are you
>>>>>>>>>>>>>> using?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> --
>>>>>>>>>>>>>> Nilay
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Sat, 14 Apr 2012, Mahmood Naderan wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The previous release is:
>>>>>>>>>>>>>>> changeset: 8613:712d8bf07020
>>>>>>>>>>>>>>> tag: tip
>>>>>>>>>>>>>>> user: Nilay Vaish<[email protected]>
>>>>>>>>>>>>>>> date: Sat Nov 05 15:32:23 2011 -0500
>>>>>>>>>>>>>>> summary: Tests: Update stats due to addition of fence microop
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> And the IPC is 1.541534
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> However for the new release, I am not able to find the head:
>>>>>>>>>>>>>>> mahmood@tiger:gem5$ hg head
>>>>>>>>>>>>>>> abort: requirement 'dotencode' not supported!
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 4/14/12, Nilay Vaish <[email protected]> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> How much is the difference and which versions of gem5 are you talking
>>>>>>>>>>>>>>>> about?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>> Nilay
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Sat, 14 Apr 2012, Mahmood Naderan wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>> In the new version, I see that the IPC of h264 (with sss input) is
>>>>>>>>>>>>>>>>> very very low. However with the previous releases, this value is fine
>>>>>>>>>>>>>>>>> and acceptable.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Do you know how can I find the bottleneck? Which stat value shows the
>>>>>>>>>>>>>>>>> weired behaviour?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> ISA = x86
>>>>>>>>>>>>>>>>> -F = 50,000,000
>>>>>>>>>>>>>>>>> --maxtick = 10,000,000,000
>>>>>>>>>>>>>>>>> L1 = 32kB, 4
>>>>>>>>>>>>>>>>> L2 = 2MB, 16
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> the IPC obtained is 0.093432
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Have you faced such result?
>>>>>>>>>>>>>>>>> Please let me know
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>> // Naderan *Mahmood;
>>>>>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>>>>>> gem5-users mailing list
>>>>>>>>>>>>>>>>> [email protected]
>>>>>>>>>>>>>>>>> http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users
_______________________________________________
gem5-users mailing list
[email protected]
http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users
