Thank you, Gabe and Korey.

I am curious why almost 16% of the loads are not committed [proportion of
uncommitted loads: (279749292-235487280)/279749292 = 15.8%]. That suggested
the branch prediction accuracy is not very high, so I calculated the branch
misprediction rate: 10975518/111349558 = 9.9% (see below). I am still not
sure whether the whole difference between committed loads and L1 cache read
requests comes from branch misprediction. Is there any other way the CPU can
issue a load but not commit it? L1 prefetching is not enabled in this
simulation.
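
As a quick check of the arithmetic above, the two ratios can be recomputed directly from the counters quoted in this thread. This is only a sketch: the variable names are made up for illustration, and only the four counter values come from the stats output.

```python
# Counter values copied from the stats quoted in this thread.
dcache_read_accesses = 279749292   # system.cpu.dcache.ReadReq_accesses
committed_loads = 235487280        # system.cpu.commit.COM:loads
committed_branches = 111349558     # system.cpu.commit.COM:branches
branch_mispredicts = 10975518      # system.cpu.commit.branchMispredicts

# Fraction of dcache read accesses with no matching committed load.
uncommitted_load_fraction = (
    (dcache_read_accesses - committed_loads) / dcache_read_accesses
)

# Mispredicts per committed branch.
mispredict_rate = branch_mispredicts / committed_branches

print(f"uncommitted load fraction: {uncommitted_load_fraction:.1%}")
print(f"branch mispredict rate:    {mispredict_rate:.1%}")
```

Note that the first ratio treats every dcache read access as one issued load, which is itself an assumption about how the cache counts accesses.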

Many thanks again

Dawei

PS: Some branch statistics:
system.cpu.commit.COM:branches        111349558  # Number of branches committed
system.cpu.commit.branchMispredicts    10975518  # The number of times a branch was mispredicted
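
To compute ratios like this for other counters, lines in this "name value # description" layout can be read into a dictionary. A minimal sketch, assuming each stat sits on one line with a single numeric value; `parse_stats` is a hypothetical helper written for this example, not part of M5.

```python
import re

# Matches "<name> <number> [# description]"; assumed layout, one stat per line.
STAT_LINE = re.compile(r"^(\S+)\s+(-?\d+(?:\.\d+)?)\s*(?:#.*)?$")

def parse_stats(text):
    """Return {stat_name: numeric_value} for every line that looks like a stat."""
    stats = {}
    for line in text.splitlines():
        match = STAT_LINE.match(line.strip())
        if match:  # wrapped description lines and blank lines are skipped
            name, value = match.groups()
            stats[name] = float(value) if "." in value else int(value)
    return stats

sample = """\
system.cpu.commit.COM:branches 111349558  # Number of branches committed
system.cpu.commit.branchMispredicts 10975518 # The number of times a branch
was mispredicted
"""

stats = parse_stats(sample)
rate = (stats["system.cpu.commit.branchMispredicts"]
        / stats["system.cpu.commit.COM:branches"])
print(f"branch mispredict rate: {rate:.1%}")
```

Lines that do not fit the pattern, such as a description wrapped by the mail client, are ignored rather than reported.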




> Message: 4
> Date: Tue, 22 Feb 2011 22:32:07 -0500
> From: Gabriel Michael Black <[email protected]>
> To: [email protected]
> Subject: Re: [m5-users] Some confusing about O3 CPU statistic data
> Message-ID: <[email protected]>
> Content-Type: text/plain;       charset=ISO-8859-1;     DelSp="Yes";
>        format="flowed"
>
> Part of it might be from instructions that fault like loads that miss
> in the TLB. Those might count towards instructions the commit stage
> committed (since they finished) but not towards the count the CPU is
> tracking because they didn't commit without a fault. I haven't
> verified that this is actually what's happening. If it is, the
> difference is a little confusing, but might be there to keep EIO
> traces happy. Those expect particular numbers of instructions counted
> in a particular way to have occurred between important events like
> system calls.
>
> Gabe
>
> Quoting Dawei Wang <[email protected]>:
>
> > Hello, everyone,
> >
> > I am using M5 to simulate SPEC CPU2006 and have got some inconsistent
> > statistics. I hope someone can give a reasonable explanation.
> >
> > I simulated benchmark 401.bzip2 with 1 billion instructions, and only one
> > cpu core in SE mode.
> >
> > The simulation command is:
> > build/ALPHA_SE/m5.opt /configs/example/se_cpu2006.py -n1 --detailed --caches --l2cache --maxinsts 1000000000 -b bzip2
> >
> > [1] The first question: there are two commit statistics, shown below.
> > Judging by their names, I guess both mean the number of instructions
> > committed, but their values differ. Is there a reason? I want to
> > simulate 1,000,000,000 instructions.
> > system.cpu.commit.COM:count        1001844707   # Number of instructions committed
> > system.cpu.committedInsts          1000000000   # Number of Instructions Simulated
> > system.cpu.committedInsts_total    1000000000   # Number of Instructions Simulated
> >
> > Put simply: why is "system.cpu.commit.COM:count
> > != system.cpu.committedInsts"? Where does the difference come from?
> >
> > [2] The second question: why is the number of load instructions committed
> > not equal to the L1 dcache ReadReq demand accesses, and why is the number
> > of memory reference instructions committed not equal to the L1 dcache
> > overall demand accesses?
> >
> > system.cpu.commit.COM:loads            235487280   # Number of loads committed
> > system.cpu.commit.COM:membars                 70   # Number of memory barriers committed
> > system.cpu.commit.COM:refs             328322098   # Number of memory references committed
> >
> > system.cpu.dcache.ReadReq_accesses     279749292   # number of ReadReq accesses(hits+misses)
> > system.cpu.dcache.WriteReq_accesses     92768715   # number of WriteReq accesses(hits+misses)
> >
> > Put simply: why is "system.cpu.commit.COM:loads
> > != system.cpu.dcache.ReadReq_accesses"? Where do the extra dcache ReadReqs
> > come from? My guess is that they come from speculative load instructions
> > that never commit, but I am not sure whether the whole difference comes
> > from uncommitted load instructions.
> >
> > [3] The third question: how is idleCycles defined? Does it mean that no
> > functional unit is doing any work, because of data dependencies or
> > waiting for data to load from memory?
> > system.cpu.idleCycles 7409255  # Total number of cycles that the CPU has spent unscheduled due to idling
> >
> > Many thanks in advance
> >
> > Dawei
> >
>
>
>
>
> ------------------------------
>
> Message: 5
> Date: Tue, 22 Feb 2011 22:52:02 -0500
> From: Korey Sewell <[email protected]>
> To: M5 users mailing list <[email protected]>
> Subject: Re: [m5-users] Some confusing about O3 CPU statistic data
> Message-ID:
>        <[email protected]>
> Content-Type: text/plain; charset="iso-8859-1"
>
> Hi Dawei,
> If you would like, you can add code (i.e. a stat) to count the number of
> speculative loads that are made. It is an out-of-order model, so you have to
> expect that some loads will go to the memory system but never actually
> commit. What's the branch mispredict rate? How many mispredicts required a
> flush of the pipeline? Things like that are going to cancel loads.
>
> For the commit count, please look around line 976:
> 974:                // To match the old model, don't count nops and instruction
> 975:                // prefetches towards the total commit count.
> 976:                if (!head_inst->isNop() && !head_inst->isInstPrefetch()) {
> ...
>
> If you would like, you could add a stat counting the nops to be sure. Or you
> can run your workload on the simple timing model (which counts nops) and
> see if the commit count there matches the commit stage commit count.
>
> For #3,
> Yes, the CPU will sleep if there is no activity due to some long latency
> stall (memory access, TLB translation, etc.) that it must wait for and no
> progress can be made. Those are idle cycles.
>
>
>
> --
> - Korey
>
> ------------------------------
>
> _______________________________________________
> m5-users mailing list
> [email protected]
> http://m5sim.org/cgi-bin/mailman/listinfo/m5-users
>
> End of m5-users Digest, Vol 55, Issue 44
> ****************************************
>