With an untouched copy of the latest revision, 8954:3c7232fec7fd, the problem still exists. Regardless of what the previous version did, an IPC of 0.077 or 0.03 is not normal.
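To narrow down which stat diverges between the two runs (along the lines of Korey's suggestion to check the stats file for major differences), something like the following could be used. It is only a rough sketch: it assumes each stats.txt line has the usual "name value # description" layout, the function names are made up for illustration, and it only compares stats whose names are identical in both files, so stats that were renamed between revisions will be missed.

```python
import math


def parse_stats(text):
    """Parse 'name value # description' lines into a {name: float} dict.

    Assumes the plain-text stats.txt layout; lines whose second field is
    not a finite number (banners, nan, inf) are skipped.
    """
    stats = {}
    for line in text.splitlines():
        fields = line.split("#", 1)[0].split()
        if len(fields) < 2:
            continue
        try:
            value = float(fields[1])
        except ValueError:
            continue
        if math.isnan(value) or math.isinf(value):
            continue
        stats[fields[0]] = value
    return stats


def biggest_changes(old, new, top=20):
    """Return up to `top` (relative_change, name, old_value, new_value)
    tuples for stats present in both runs, largest relative change first."""
    rows = []
    for name in old.keys() & new.keys():
        a, b = old[name], new[name]
        if a == b:
            continue
        # a != b here, so at least one is non-zero and the scale is > 0.
        scale = max(abs(a), abs(b))
        rows.append((abs(b - a) / scale, name, a, b))
    rows.sort(reverse=True)
    return rows[:top]
```

Reading each run's stats.txt into a string, passing both through parse_stats, and printing the rows from biggest_changes should give a short list of candidates (miss rates, committed FP instructions, and so on) to look at first.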
On 4/15/12, Mahmood Naderan <[email protected]> wrote:
> I haven't changed the new version yet. There may be something wrong with
> the loader. But I am not sure. Who can check that?
>
> P.S: Dear Gabe, I think there is something wrong with the address
> translator. Greatly appreciate if you check
> http://permalink.gmane.org/gmane.comp.emulators.m5.users/9944
>
> On 4/15/12, Gabe Black <[email protected]> wrote:
>> It's worth looking into why it doesn't find the __libc_start_main symbol
>> in the new version. If it's a bug we should fix it, even if it doesn't
>> directly have anything to do with your problem. You can also try
>> versions between your new and old one and see where things start
>> behaving poorly. This is of course assuming you haven't changed the
>> simulator in some way. If you have, all bets are off since that might be
>> what's changing the behavior.
>>
>> Gabe
>>
>> On 04/14/12 23:31, Mahmood Naderan wrote:
>>> Well, in MyBench.py there is only one entry for h264_sss
>>> h264_dir = spec_dir + '464.h264ref/exe/'
>>> h264_bin = h264_dir + 'h264ref_base.amd64-m64-gcc44-nn'
>>> h264_sss_data = h264_dir + 'sss_encoder_main.cfg'
>>>
>>> h264_sss = LiveProcess()
>>> h264_sss.executable = h264_bin
>>> h264_sss.cmd = [h264_sss.executable] + ['-d', h264_sss_data]
>>> h264_sss.cwd = h264_dir
>>>
>>> On 4/15/12, Gabe Black <[email protected]> wrote:
>>>> I suspect you're not running exactly the same binary in both cases.
>>>> __libc_start_main is one of the functions provided by glibc (if I
>>>> remember correctly) which run before main() and get some basic things
>>>> set up. If it says __libc_start_main in one, it should say it in the
>>>> other one too, unless the thing that finds the symbol name was broken
>>>> somehow.
>>>>
>>>> Gabe
>>>>
>>>> On 04/14/12 22:50, Mahmood Naderan wrote:
>>>>> I reduced the fast-forward count to 20 instructions and maxinst to
>>>>> 10, and turned on the ExecAll flag.
>>>>>
>>>>> The old one looks like:
>>>>> 23000: system.cpu + A0 T0 : @_start+36.3 : CALL_NEAR_I : subi rsp, rsp, 0x8 : IntAlu : D=0x00007fffffffed38
>>>>> 24000: system.cpu + A0 T0 : @_start+36.4 : CALL_NEAR_I : wrip , t7, t1 : IntAlu :
>>>>> 25000: system.cpu + A0 T0 : @__libc_start_main : push r15
>>>>> 25000: system.cpu + A0 T0 : @__libc_start_main.0 : PUSH_R : st r15, SS:[rsp + 0xfffffffffffffff8] : MemWrite : D=0x0000000000000000 A=0x7fffffffed30
>>>>> hack: be nice to actually delete the event here
>>>>> Switched CPUS @ tick 25000
>>>>> Changing memory mode to timing
>>>>> switching cpus
>>>>> **** REAL SIMULATION ****
>>>>> info: Entering event queue @ 25000. Starting simulation...
>>>>> 67000: system.switch_cpus + A0 T0 : @__libc_start_main.1 : PUSH_R : subi rsp, rsp, 0x8 : IntAlu : D=0x00007fffffffed30 FetchSeq=1 CPSeq=0
>>>>> 67000: system.switch_cpus + A0 T0 : @__libc_start_main+2 : mov eax, 0
>>>>> 67000: system.switch_cpus + A0 T0 : @__libc_start_main+2.0 : MOV_R_I : limm eax, 0 : IntAlu : D=0x0000000000000000 FetchSeq=2 CPSeq=1
>>>>> 67000: system.switch_cpus + A0 T0 : @__libc_start_main+7 : push r14
>>>>>
>>>>> But the new one is:
>>>>> 23000: system.cpu + A0 T0 : 0x400364.3 : CALL_NEAR_I : subi rsp, rsp, 0x8 : IntAlu : D=0x00007fffffffed38
>>>>> 24000: system.cpu + A0 T0 : 0x400364.4 : CALL_NEAR_I : wrip , t7, t1 : IntAlu :
>>>>> 25000: system.cpu + A0 T0 : 0x470960 : push r15
>>>>> 25000: system.cpu + A0 T0 : 0x470960.0 : PUSH_R : st r15, SS:[rsp + 0xfffffffffffffff8] : MemWrite : D=0x0000000000000000 A=0x7fffffffed30
>>>>> 26000: system.cpu + A0 T0 : 0x470960.1 : PUSH_R : subi rsp, rsp, 0x8 : IntAlu : D=0x00007fffffffed30
>>>>> 27000: system.cpu + A0 T0 : 0x470962 : mov eax, 0
>>>>>
>>>>> As you can see, the old version switches at tick 25000 but the new
>>>>> version switches at 41000. The gap is large though.
>>>>>
>>>>> Do you know what "@__libc_start_main" means in the old version?
>>>>>
>>>>> On 4/15/12, Mahmood Naderan <[email protected]> wrote:
>>>>>> I am trying what you said, but can you clarify this:
>>>>>>
>>>>>> Although the -F option is 20M instructions in both versions, I noticed
>>>>>> that the old version enters real simulation at tick 22,407,755,000 but
>>>>>> the new version enters at tick 90,443,309,000
>>>>>>
>>>>>> I made the config files as close as possible (same system bus freq,
>>>>>> O3 parameters, ...)
>>>>>>
>>>>>> Why do they switch at different tick numbers?
>>>>>> --
>>>>>> // Naderan *Mahmood;
>>>>>>
>>>>>> On Sun, Apr 15, 2012 at 9:35 AM, Korey Sewell <[email protected]> wrote:
>>>>>>
>>>>>>> - make every O3CPU parameter that is different in the new version,
>>>>>>> the same as the old version
>>>>>>>
>>>>>>> - check the stats file for major differences.
>>>>>>> For example: Are the L1/L2 miss rates higher or lower? Are your
>>>>>>> caches the same size and associativity? This is h.264, so is there a
>>>>>>> lot of floating point insts being committed? If so, maybe the change
>>>>>>> is in the latencies of the FP-Unit in the Function Unit Pool.
>>>>>>>
>>>>>>> - run gem5 for a small # of instructions (e.g. maxinsts=10) and see
>>>>>>> if there is a difference in the number of ticks it takes to complete
>>>>>>> (this is *after* all the O3 parameters are the same). If there is a
>>>>>>> difference, then turn on some O3 flags or check the stats and see
>>>>>>> what's going on there. If there is no difference increase the
>>>>>>> maxinsts and try again until you see the simulations diverging.
>>>>>>>
>>>>>>> On Sun, Apr 15, 2012 at 12:46 AM, Mahmood Naderan <[email protected]> wrote:
>>>>>>>
>>>>>>>> I did that.
>>>>>>>> There are some differences and I attached them.
>>>>>>>> In short, I see this:
>>>>>>>>
>>>>>>>> old:
>>>>>>>> children=dcache dtb icache itb tracer workload
>>>>>>>>
>>>>>>>> new:
>>>>>>>> children=dcache dtb icache interrupts itb tracer workload
>>>>>>>>
>>>>>>>> Also the commitwidth, fetchwidth and some other parameters are 8 in
>>>>>>>> the new version, but they are 4 in the old version. So I really
>>>>>>>> wonder why it has a very low IPC.
>>>>>>>>
>>>>>>>> I will be greatly thankful if someone else tries that.
>>>>>>>> Also, I emailed another problem at
>>>>>>>> http://permalink.gmane.org/gmane.comp.emulators.m5.devel/14987
>>>>>>>> about "Unable to find destination for addr" which I encountered in
>>>>>>>> the new version.
>>>>>>>>
>>>>>>>> Appreciate any idea.
>>>>>>>>
>>>>>>>>> I believe the 'dotencode' message just means you should upgrade to
>>>>>>>>> a newer version of mercurial.
>>>>>>>> ok I will try that.
>>>>>>>> --
>>>>>>>> // Naderan *Mahmood;
>>>>>>>>
>>>>>>>> On Sun, Apr 15, 2012 at 3:45 AM, Steve Reinhardt <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> I believe the 'dotencode' message just means you should upgrade to
>>>>>>>>> a newer version of mercurial.
>>>>>>>>>
>>>>>>>>> On Sat, Apr 14, 2012 at 10:36 AM, Mahmood Naderan <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> I forgot to say that I removed the 'dotencode' feature and the
>>>>>>>>>> "hg heads" says:
>>>>>>>>>>
>>>>>>>>>> mahmood@tiger:gem5$ hg heads
>>>>>>>>>> changeset: 8920:99083b5b7ed4
>>>>>>>>>> abort: data/.hgtags.i@b151ff1fd9df: no match found!
>>>>>>>>>>
>>>>>>>>>> On 4/14/12, Mahmood Naderan <[email protected]> wrote:
>>>>>>>>>>> For the old one, I use:
>>>>>>>>>>> build/X86_SE/m5.fast configs/example/cmp.py -F 20000000 --maxtick 10000000000 -d --caches --l2cache -b h264_sss --prog-interval=1000000
>>>>>>>>>>>
>>>>>>>>>>> for the new one I use:
>>>>>>>>>>> build/X86/m5.fast configs/example/cmp.py --cpu-type=detailed -F 20000000 --maxtick 10000000000 --caches --l2cache -b h264_sss --prog-interval=1000000
>>>>>>>>>>>
>>>>>>>>>>> I attached the configs and stats. Thanks
>>>>>>>>>>>
>>>>>>>>>>> On 4/14/12, Nilay Vaish <[email protected]> wrote:
>>>>>>>>>>>> So, with 8613:712d8bf07020 you got an IPC of 1.54, and with some
>>>>>>>>>>>> version near 8944:d062cc7a8bdf, you get an IPC of 0.093. Which CPU
>>>>>>>>>>>> type are you using?
>>>>>>>>>>>>
>>>>>>>>>>>> --
>>>>>>>>>>>> Nilay
>>>>>>>>>>>>
>>>>>>>>>>>> On Sat, 14 Apr 2012, Mahmood Naderan wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> The previous release is:
>>>>>>>>>>>>> changeset: 8613:712d8bf07020
>>>>>>>>>>>>> tag: tip
>>>>>>>>>>>>> user: Nilay Vaish<[email protected]>
>>>>>>>>>>>>> date: Sat Nov 05 15:32:23 2011 -0500
>>>>>>>>>>>>> summary: Tests: Update stats due to addition of fence microop
>>>>>>>>>>>>>
>>>>>>>>>>>>> And the IPC is 1.541534
>>>>>>>>>>>>>
>>>>>>>>>>>>> However for the new release, I am not able to find the head:
>>>>>>>>>>>>> mahmood@tiger:gem5$ hg head
>>>>>>>>>>>>> abort: requirement 'dotencode' not supported!
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 4/14/12, Nilay Vaish <[email protected]> wrote:
>>>>>>>>>>>>>> How much is the difference and which versions of gem5 are you
>>>>>>>>>>>>>> talking about?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> --
>>>>>>>>>>>>>> Nilay
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Sat, 14 Apr 2012, Mahmood Naderan wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>> In the new version, I see that the IPC of h264 (with sss input) is
>>>>>>>>>>>>>>> very very low. However with the previous releases, this value is
>>>>>>>>>>>>>>> fine and acceptable.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Do you know how I can find the bottleneck? Which stat value shows
>>>>>>>>>>>>>>> the weird behaviour?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> ISA = x86
>>>>>>>>>>>>>>> -F = 50,000,000
>>>>>>>>>>>>>>> --maxtick = 10,000,000,000
>>>>>>>>>>>>>>> L1 = 32kB, 4
>>>>>>>>>>>>>>> L2 = 2MB, 16
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> the IPC obtained is 0.093432
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Have you faced such a result? Please let me know
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>> // Naderan *Mahmood;
>>>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>>>> gem5-users mailing list
>>>>>>>>>>>>>>> [email protected]
>>>>>>>>>>>>>>> http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users
--
// Naderan *Mahmood;
