Hello,

I'm not an expert in caches, but the page on the classic memory system
(http://www.gem5.org/Classic_Memory_System) explains the limitations of the
classic memory model. Regarding the coherence protocol, it says "Our
intent is that this protocol is adequate for researchers studying aspects
of system behavior other than coherence mechanisms". So your comparison
between gem5 and real ARM platforms may not be realistic with the classic
memory model, and arm_detailed uses the classic memory model.
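
For reference, the metric compared in the quoted thread below (Icache misses per 1k instructions) can be sketched in Python as follows. This is only an illustration: the function names and the raw counts are hypothetical, chosen to reproduce the approximate 30 vs 3 figures quoted below. They are not real measurements and not gem5 stat names.

```python
def mpki(misses, instructions):
    """Return cache misses per 1000 instructions."""
    if instructions <= 0:
        raise ValueError("instructions must be positive")
    return 1000.0 * misses / instructions


def ratio(hw_mpki, sim_mpki):
    """Return how many times larger the hardware figure is than the simulated one."""
    return hw_mpki / sim_mpki


# Hypothetical raw counts, picked only to match the figures in the thread.
hw = mpki(misses=30_000, instructions=1_000_000)   # hardware counters
sim = mpki(misses=3_000, instructions=1_000_000)   # simulator stats

print(f"hardware: {hw} MPKI, simulator: {sim} MPKI, ratio: {ratio(hw, sim)}x")
```

Normalizing by instruction count matters here because the hardware and simulated runs will not necessarily execute the same number of instructions.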

Regards,

Fernando

--
Fernando A. Endo, PhD student and researcher

Université de Grenoble, UJF
France



2013/6/5 Xiangyang Guo <[email protected]>

> Yes, I would also like some information about this. In my
> experience, the data I collect looks weird. For example, the number of L1
> misses is much higher.
>
>
> On Mon, Jun 3, 2013 at 11:18 PM, huangyongbing
> <[email protected]> wrote:
>
>> Hi all,
>>
>>        I have to bring up the accuracy problem of gem5 again. When running
>> BBench on the gem5 platform, I would like to know whether anybody has
>> compared microarchitectural metrics, such as L1 instruction cache misses,
>> measured in gem5 and on a real hardware board, and which parameters I should
>> change in order to obtain similar results with the arm_detailed CPU model.
>> I have already tried adjusting several important gem5 configuration
>> parameters, but still failed to get the desired results.
>>
>>        If the simulator results differ greatly from the real
>> platform, all CPU architecture optimizations based on the simulator
>> would be useless. I found that many users are using the gem5 simulator to
>> simulate the ARM platform. Has anybody else run into the same problem?
>>
>>        Thanks.
>>
>> Best regards,
>>
>> Yongbing Huang
>>
>> *From:* [email protected] [mailto:[email protected]]
>> *On Behalf Of *huangyongbing
>> *Sent:* Monday, January 28, 2013 10:11 AM
>> *To:* 'gem5 users mailing list'
>> *Subject:* [SPAM] Re: [gem5-users] Mismatched stats between gem5 and
>> performance counters when running BBench on ARM platform
>>
>> Hi Orangeade,
>>
>>          Thanks for your reply. I have indeed done some work to localize
>> the problem.
>>
>> 1) I use the arm_detailed model in gem5. I also disabled the
>> prefetcher in gem5, the same as on the real ARM platform.
>>
>> 2) I have already changed the default 64B cache line to a 32B cache
>> line.
>>
>> 3) I noticed this, so I ran a CPU-only micro-benchmark on both the
>> ARM platform and gem5. The results seem the same as with BBench.
>> I will look into this further.
>>
>> 4) The real ARM platform uses a round-robin cache replacement
>> policy, but I use the LRU replacement policy in gem5. I don't know how
>> much effect the replacement policy has. As a next step, I will implement
>> round robin in gem5 and test again.
>>
>> Thanks!
>>
>> Best regards,
>>
>> Yongbing Huang
>>
>> *From:* [email protected] [mailto:[email protected]]
>> *On Behalf Of *Mr. Orangeade
>> *Sent:* Monday, January 28, 2013 3:07 AM
>> *To:* [email protected]
>> *Subject:* Re: [gem5-users] Mismatched stats between gem5 and
>> performance counters when running BBench on ARM platform
>>
>>
>> Hi Yongbing,
>>
>> I don't have a 100% solution for you, but I have a few questions that may
>> help you localize the problem:
>>
>> (1) Which type of model (functional or arm_detailed) do you run to collect
>> the stats?
>>     Theoretically you should run 'arm_detailed' to take speculative
>> misses into account.
>>
>> (2) 'arm_detailed' uses a 64B cache line while the Cortex-A9 has a 32B
>> cache line.
>>     Did you take this into account (i.e. change the 'arm_detailed' cache
>> line size to 32B)?
>>
>> (3) Not sure about Chromium, but browsers in general may use the GPU for
>> compositing on real HW, so the execution path will differ from
>> SW-only BBench in gem5.
>>
>> Orangeade
>>
>> Yongbing wrote:
>>
>> Hi all,
>>
>>          I recently compared micro-architectural metrics, such as L1
>> cache misses, collected by gem5 with those collected by performance
>> counters on a real ARM platform, and found a large difference. For example,
>> the Icache misses per 1k instructions for BBench were about 30 as collected
>> by hardware performance counters (referring to the paper published by
>> Anthony Gutierrez in IISWC'2011), but only about 3 for gem5. That is about
>> a 10x difference.
>>
>>
>> _______________________________________________
>> gem5-users mailing list
>> [email protected]
>> http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users
>>
>
>
>
