You can binary search through the versions to find the earliest
version that shows a low IPC.
--
Nilay
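The binary search suggested above can be sketched as follows. This is only an illustration: `first_bad_revision` and the `is_bad` callback are hypothetical names, and in practice `is_bad` would have to check out a revision, rebuild gem5, run the benchmark and compare the IPC against a threshold. It assumes the oldest revision is good, the newest is bad, and the regression persists once it appears. Mercurial also ships `hg bisect`, which automates the same search.

```python
from typing import Callable, Sequence


def first_bad_revision(revisions: Sequence[int],
                       is_bad: Callable[[int], bool]) -> int:
    """Return the earliest revision for which is_bad() is true.

    Assumes `revisions` is sorted oldest to newest, the first entry is
    good, the last entry is bad, and the regression never disappears
    once it has been introduced.
    """
    lo, hi = 0, len(revisions) - 1  # invariant: revisions[lo] good, revisions[hi] bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(revisions[mid]):
            hi = mid   # regression is at mid or earlier
        else:
            lo = mid   # regression is after mid
    return revisions[hi]


# Stand-in for "build and run this revision, then check the IPC".
# The cut-over revision 8800 is made up for the demonstration.
def fake_is_bad(rev: int) -> bool:
    return rev >= 8800


revs = list(range(8613, 8955))  # the two revision numbers named in the thread
print(first_bad_revision(revs, fake_is_bad))
```

With roughly 340 revisions between the two versions, this needs about nine build-and-run steps instead of hundreds.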
On Sun, 15 Apr 2012, Mahmood Naderan wrote:
With an untouched copy of the latest revision, 8954:3c7232fec7fd,
the problem still exists. No matter what the previous version was, an
IPC of 0.077 or 0.03 is not normal.
On 4/15/12, Mahmood Naderan <[email protected]> wrote:
I haven't changed the new version yet. There may be something wrong with
the loader, but I am not sure. Who can check that?
P.S.: Dear Gabe, I think there is something wrong with the address
translator. I would greatly appreciate it if you could check
http://permalink.gmane.org/gmane.comp.emulators.m5.users/9944
On 4/15/12, Gabe Black <[email protected]> wrote:
It's worth looking into why it doesn't find the __libc_start_main symbol
in the new version. If it's a bug we should fix it, even if it doesn't
directly have anything to do with your problem. You can also try
versions between your new and old one and see where things start
behaving poorly. This is of course assuming you haven't changed the
simulator in some way. If you have, all bets are off since that might be
what's changing the behavior.
Gabe
On 04/14/12 23:31, Mahmood Naderan wrote:
Well, in MyBench.py there is only one entry for h264_sss
h264_dir = spec_dir + '464.h264ref/exe/'
h264_bin = h264_dir + 'h264ref_base.amd64-m64-gcc44-nn'
h264_sss_data = h264_dir + 'sss_encoder_main.cfg'
h264_sss = LiveProcess()
h264_sss.executable = h264_bin
h264_sss.cmd = [h264_sss.executable] + ['-d', h264_sss_data]
h264_sss.cwd = h264_dir
On 4/15/12, Gabe Black <[email protected]> wrote:
I suspect you're not running exactly the same binary in both cases.
__libc_start_main is one of the functions provided by glibc (if I
remember correctly) which run before main() and get some basic things
set up. If it says __libc_start_main in one, it should say it in the
other one too, unless the thing that finds the symbol name was broken
somehow.
Gabe
On 04/14/12 22:50, Mahmood Naderan wrote:
I reduced the fast-forward count to 20 instructions and maxinst to
10, and turned on the ExecAll flag.
The old one looks like:
23000: system.cpu + A0 T0 : @_start+36.3 : CALL_NEAR_I : subi rsp, rsp, 0x8 : IntAlu : D=0x00007fffffffed38
24000: system.cpu + A0 T0 : @_start+36.4 : CALL_NEAR_I : wrip , t7, t1 : IntAlu :
25000: system.cpu + A0 T0 : @__libc_start_main : push r15
25000: system.cpu + A0 T0 : @__libc_start_main.0 : PUSH_R : st r15, SS:[rsp + 0xfffffffffffffff8] : MemWrite : D=0x0000000000000000 A=0x7fffffffed30
hack: be nice to actually delete the event here
Switched CPUS @ tick 25000
Changing memory mode to timing
switching cpus
**** REAL SIMULATION ****
info: Entering event queue @ 25000. Starting simulation...
67000: system.switch_cpus + A0 T0 : @__libc_start_main.1 : PUSH_R : subi rsp, rsp, 0x8 : IntAlu : D=0x00007fffffffed30 FetchSeq=1 CPSeq=0
67000: system.switch_cpus + A0 T0 : @__libc_start_main+2 : mov eax, 0
67000: system.switch_cpus + A0 T0 : @__libc_start_main+2.0 : MOV_R_I : limm eax, 0 : IntAlu : D=0x0000000000000000 FetchSeq=2 CPSeq=1
67000: system.switch_cpus + A0 T0 : @__libc_start_main+7 : push r14
But the new one is:
23000: system.cpu + A0 T0 : 0x400364.3 : CALL_NEAR_I : subi rsp, rsp, 0x8 : IntAlu : D=0x00007fffffffed38
24000: system.cpu + A0 T0 : 0x400364.4 : CALL_NEAR_I : wrip , t7, t1 : IntAlu :
25000: system.cpu + A0 T0 : 0x470960 : push r15
25000: system.cpu + A0 T0 : 0x470960.0 : PUSH_R : st r15, SS:[rsp + 0xfffffffffffffff8] : MemWrite : D=0x0000000000000000 A=0x7fffffffed30
26000: system.cpu + A0 T0 : 0x470960.1 : PUSH_R : subi rsp, rsp, 0x8 : IntAlu : D=0x00007fffffffed30
27000: system.cpu + A0 T0 : 0x470962 : mov eax, 0
As you can see, the old version switches at tick 25000 but the new
version switches at 41000. That is a large gap.
Do you know what "@__libc_start_main" means in the old version?
On 4/15/12, Mahmood Naderan <[email protected]> wrote:
I am trying what you said, but can you clarify this:
Although the -F option is 20M instructions in both versions, I noticed that
the old version enters real simulation at tick 22,407,755,000 but the new
version enters at tick 90,443,309,000.
I made the config files as close as possible (same system bus frequency, O3
parameters, ...).
Why do they switch at different tick numbers?
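Since both runs fast-forward the same number of instructions, one way to read the two switch ticks is as ticks per fast-forwarded instruction. The sketch below only does that arithmetic with the numbers quoted in this message; it does not explain the cause. A different CPU clock or memory latency during fast-forwarding is one possibility, but that is an assumption the thread does not confirm.

```python
FAST_FORWARD_INSTS = 20_000_000  # -F value quoted in this message

# Ticks at which each version reports entering real simulation.
switch_tick = {
    "old": 22_407_755_000,
    "new": 90_443_309_000,
}


def ticks_per_inst(ticks: int, insts: int = FAST_FORWARD_INSTS) -> float:
    """Average number of ticks spent per fast-forwarded instruction."""
    return ticks / insts


for name, ticks in switch_tick.items():
    print(f"{name}: {ticks_per_inst(ticks):.1f} ticks per instruction")

ratio = switch_tick["new"] / switch_tick["old"]
print(f"new/old: {ratio:.2f}x")
```

A ratio this close to a whole number suggests comparing the clock and latency settings in the two config.ini files before looking at the O3 model itself.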
--
// Naderan *Mahmood;
On Sun, Apr 15, 2012 at 9:35 AM, Korey Sewell <[email protected]> wrote:
- Make every O3CPU parameter that differs in the new version the same as in
  the old version.
- Check the stats file for major differences. For example: are the L1/L2
  miss rates higher or lower? Are your caches the same size and
  associativity? This is h.264, so are a lot of floating-point instructions
  being committed? If so, maybe the change is in the latencies of the FP
  unit in the Function Unit Pool.
- Run gem5 for a small number of instructions (e.g. maxinsts=10) and see if
  there is a difference in the number of ticks it takes to complete (this is
  *after* all the O3 parameters are the same). If there is a difference,
  turn on some O3 flags or check the stats and see what's going on there. If
  there is no difference, increase maxinsts and try again until you see the
  simulations diverging.
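The "check the stats file for major differences" step can be sketched as a small script. It assumes the usual stats.txt layout of one `name value # description` entry per line. The helper names and the stat name `system.switch_cpus.ipc` in the sample text are illustrative assumptions. Only the two IPC values come from this thread.

```python
import math


def parse_stats(text: str) -> dict:
    """Parse 'name value # description' lines into {name: float}.

    Lines that do not start with a name followed by a finite number
    (banners, blank lines, nan entries) are skipped. If the file holds
    several dumps, later values overwrite earlier ones.
    """
    stats = {}
    for line in text.splitlines():
        parts = line.split("#", 1)[0].split()
        if len(parts) < 2:
            continue
        try:
            value = float(parts[1])
        except ValueError:
            continue
        if math.isfinite(value):
            stats[parts[0]] = value
    return stats


def biggest_changes(old: dict, new: dict, top: int = 10) -> list:
    """Return up to `top` (rel_change, name, old, new) rows, largest change first."""
    rows = []
    for name in old.keys() & new.keys():
        a, b = old[name], new[name]
        if a == b:
            continue
        rel = abs(b - a) / max(abs(a), abs(b))  # in (0, 1] for same-sign values
        rows.append((rel, name, a, b))
    rows.sort(reverse=True)
    return rows[:top]


# Inline sample text; real use would read the two stats.txt files instead.
OLD_STATS = """---------- Begin Simulation Statistics ----------
sim_insts 10 # Number of instructions simulated
system.switch_cpus.ipc 1.541534 # IPC: Instructions Per Cycle
"""
NEW_STATS = """---------- Begin Simulation Statistics ----------
sim_insts 10 # Number of instructions simulated
system.switch_cpus.ipc 0.093432 # IPC: Instructions Per Cycle
"""

for rel, name, a, b in biggest_changes(parse_stats(OLD_STATS), parse_stats(NEW_STATS)):
    print(f"{name}: {a} -> {b} ({rel:.0%} change)")
```

Stats that exist in only one of the two files are ignored here; listing them separately may also be informative when the two versions name their components differently.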
On Sun, Apr 15, 2012 at 12:46 AM, Mahmood Naderan <[email protected]> wrote:
I did that.
There are some differences and I attached them. In short, I see this:
old:
children=dcache dtb icache itb tracer workload
new:
children=dcache dtb icache interrupts itb tracer workload
Also, the commitwidth, fetchwidth and some other parameters are 8 in the
new version, but they are 4 in the old version. So I really wonder why it
has a very low IPC.
I would be very grateful if someone else could try that.
Also, I emailed about another problem, "Unable to find destination for
addr", which I encountered in the new version:
http://permalink.gmane.org/gmane.comp.emulators.m5.devel/14987
I would appreciate any ideas.
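Differences like the children list and the width parameters mentioned above can be listed mechanically rather than by eye. The sketch below assumes the two config.ini files are plain INI text that Python's `configparser` can read. The section name `system.switch_cpus`, the exact parameter spellings, and the helper names are assumptions made for the example, not taken from the attached files.

```python
import configparser


def load_config(text: str) -> dict:
    """Parse INI text into {section: {key: value}} with case-sensitive keys."""
    cp = configparser.ConfigParser(interpolation=None, strict=False)
    cp.optionxform = str  # keep parameter names such as commitWidth as written
    cp.read_string(text)
    return {section: dict(cp[section]) for section in cp.sections()}


def diff_configs(old: dict, new: dict) -> list:
    """List (section, key, old_value, new_value) for sections present in both."""
    changes = []
    for section in sorted(old.keys() & new.keys()):
        keys = old[section].keys() | new[section].keys()
        for key in sorted(keys):
            a = old[section].get(key)  # None if the key is missing
            b = new[section].get(key)
            if a != b:
                changes.append((section, key, a, b))
    return changes


# Sample fragments built from the values quoted in the message above.
OLD_INI = """[system.switch_cpus]
children=dcache dtb icache itb tracer workload
commitWidth=4
fetchWidth=4
"""
NEW_INI = """[system.switch_cpus]
children=dcache dtb icache interrupts itb tracer workload
commitWidth=8
fetchWidth=8
"""

for section, key, a, b in diff_configs(load_config(OLD_INI), load_config(NEW_INI)):
    print(f"[{section}] {key}: {a!r} -> {b!r}")
```

Sections that exist in only one file are skipped here, so a renamed or newly added component (such as `interrupts`) shows up only through its parent's children list.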
I believe the 'dotencode' message just means you should upgrade to a
newer version of mercurial.
OK, I will try that.
--
// Naderan *Mahmood;
On Sun, Apr 15, 2012 at 3:45 AM, Steve Reinhardt <[email protected]> wrote:
I believe the 'dotencode' message just means you should upgrade to a
newer version of mercurial.
On Sat, Apr 14, 2012 at 10:36 AM, Mahmood Naderan <[email protected]> wrote:
I forgot to say that I removed the 'dotencode' feature, and "hg heads" says:
mahmood@tiger:gem5$ hg heads
changeset: 8920:99083b5b7ed4
abort: data/.hgtags.i@b151ff1fd9df: no match found!
On 4/14/12, Mahmood Naderan <[email protected]> wrote:
For the old one, I use:
build/X86_SE/m5.fast configs/example/cmp.py -F 20000000 --maxtick 10000000000 -d --caches --l2cache -b h264_sss --prog-interval=1000000
For the new one, I use:
build/X86/m5.fast configs/example/cmp.py --cpu-type=detailed -F 20000000 --maxtick 10000000000 --caches --l2cache -b h264_sss --prog-interval=1000000
I attached the configs and stats. Thanks
On 4/14/12, Nilay Vaish <[email protected]> wrote:
So, with 8613:712d8bf07020 you got an IPC of 1.54, and with some version
near 8944:d062cc7a8bdf, you get an IPC of 0.093. Which CPU type are you
using?
--
Nilay
On Sat, 14 Apr 2012, Mahmood Naderan wrote:
The previous release is:
changeset: 8613:712d8bf07020
tag: tip
user: Nilay Vaish<[email protected]>
date: Sat Nov 05 15:32:23 2011 -0500
summary: Tests: Update stats due to addition of fence microop
And the IPC is 1.541534
However for the new release, I am not able to find the head:
mahmood@tiger:gem5$ hg head
abort: requirement 'dotencode' not supported!
On 4/14/12, Nilay Vaish <[email protected]> wrote:
How large is the difference, and which versions of gem5 are you
talking about?
--
Nilay
On Sat, 14 Apr 2012, Mahmood Naderan wrote:
Hi,
In the new version, I see that the IPC of h264 (with the sss input) is
very low. However, with the previous releases, this value is fine
and acceptable.
Do you know how I can find the bottleneck? Which stat value shows the
weird behaviour?
ISA = x86
-F = 50,000,000
--maxtick = 10,000,000,000
L1 = 32kB, 4
L2 = 2MB, 16
the IPC obtained is 0.093432
Have you faced such a result? Please let me know.
--
// Naderan *Mahmood;
_______________________________________________
gem5-users mailing list
[email protected]
http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users