Re: [gem5-users] Monitor and dump the cache state during full system simulation

Shuai Wang Thu, 05 Jan 2017 14:20:05 -0800

Some debugging shows that this can actually be traced back to the random
number generator utility of the test case I use. I will see what I can do
here.


On Thu, Jan 5, 2017 at 4:57 PM, Shuai Wang <[email protected]> wrote:

> Hey Majid and Jason,
>
>
> Thank you so much for this detailed information. I learned a lot from
> it :) Now I am able to boot the full-system simulation and record the
> cache status. I really appreciate your help!
>
> While the current system works well on all the program binaries compiled
> from C programs, I am stuck executing one C++ test case in
> full-system simulation mode...
>
> My current configuration is: Kernel 3.2.0 + Ubuntu 12.04.4 64-bit. I
> compiled the kernel with the gem5-provided configuration
> file: linux-2.6.28.4.
>
> When running the binary code, it throws an exception:
>
> pure virtual method called
> terminate called without an active exception
> Aborted
>
> Of course, this looks more like a source-level bug, such as calling a
> virtual function inside a constructor. However, I tried the same
> code on various physical machines (including an Ubuntu 12.04 64-bit with
> kernel version 3.2.0 and an Ubuntu 12.04 64-bit with kernel version
> 3.8.0-44), and it works well on all of them.
>
> Besides, although debugging on the simulated platform is too slow, I used
> strace to dump the system call sequence when executing the statically
> linked C++ binary and checked the constructor/destructor functions around
> the call sites; I haven't seen any suspicious code so far.
>
> I am starting to think that some configuration of the simulated
> platform may lead to this issue. Is there any chance that this
> problem looks familiar to you? If so, could you shed some light on
> it? Thank you in advance!
>
>
> Sincerely,
> Shuai
>
>
>
> On Thu, Jan 5, 2017 at 8:40 AM, Jason Lowe-Power <[email protected]>
> wrote:
>
>> Hi Shuai,
>>
>> By default, se.py and fs.py use the atomic CPU, which performs atomic memory
>> accesses. This CPU/memory mode is used to fast-forward the simulation, and
>> does not accurately perform timing operations. All of the memory
>> requests/responses flow through the "atomic" functions (recvAtomic). You
>> should specify the CPU type on the command line to use timing mode (e.g.,
>> --cpu-type=timing or --cpu-type=detailed).
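
The difference between the two paths can be illustrated with a small toy model. This is only a sketch: `ToyCache`, `ToyCPU`, and `run` are hypothetical names that loosely mirror the functions mentioned in this thread, not gem5's real classes or API. It shows why instrumentation placed on the timing path never fires when the CPU runs in atomic mode.

```python
# Toy model only: names loosely mirror the thread, not gem5's real API.
class ToyCache:
    def __init__(self):
        # Count how often each entry point is used.
        self.calls = {"recv_atomic": 0, "recv_timing_req": 0}

    def recv_atomic(self, addr):
        # Atomic mode: the whole access completes within this single call.
        self.calls["recv_atomic"] += 1
        return 2  # hypothetical fixed latency, in cycles

    def recv_timing_req(self, addr):
        # Timing mode: the request is accepted here; in a real simulator the
        # response would be scheduled for a later tick.
        self.calls["recv_timing_req"] += 1
        return True


class ToyCPU:
    def __init__(self, cache, mode):
        if mode not in ("atomic", "timing"):
            raise ValueError("mode must be 'atomic' or 'timing'")
        self.cache = cache
        self.mode = mode

    def load(self, addr):
        # The CPU model decides which cache entry point is exercised.
        if self.mode == "atomic":
            return self.cache.recv_atomic(addr)
        return self.cache.recv_timing_req(addr)


def run(mode, addresses):
    """Issue one load per address and report which entry points were used."""
    cache = ToyCache()
    cpu = ToyCPU(cache, mode)
    for addr in addresses:
        cpu.load(addr)
    return cache.calls


print(run("atomic", [0x0, 0x40, 0x80]))
print(run("timing", [0x0, 0x40, 0x80]))
```

In the atomic run the timing counter stays at zero, which matches the symptom described in this thread: code added to the timing functions is never reached until a timing CPU is selected.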
>>
>> Cheers,
>> Jason
>>
>> On Wed, Jan 4, 2017 at 9:52 AM Shuai Wang <[email protected]> wrote:
>>
>>> Sorry, what I meant is the function "recvTimingReq", not the function
>>> "schedTimingResp"...
>>>
>>> On Wed, Jan 4, 2017 at 10:23 AM, Shuai Wang <[email protected]>
>>> wrote:
>>>
>>> Besides, while my instrumented code in the "schedTimingResp" function
>>> works well in system-call emulation mode, I find that the
>>> "schedTimingResp" function is never executed in full-system simulation
>>> mode. Am I missing anything here?
>>>
>>> This is the command I use:
>>>
>>> ./build/X86/gem5.opt --debug-flags=CacheDebug
>>> --debug-file=cacheDebug.out.gz configs/example/fs.py
>>> --disk-image=/home/test/work/x86_full_system/disks/linux-x86.img
>>> --kernel=/home/test/work/x86_full_system/binaries/x86_64-vmlinux-3.2.1
>>> --caches
>>>
>>> By inserting some printf calls at the beginning of the function
>>> "schedTimingResp", I am pretty sure this function is never invoked...
>>>
>>> On Tue, Jan 3, 2017 at 11:13 PM, Shuai Wang <[email protected]>
>>> wrote:
>>>
>>> Hi Majid,
>>>
>>> Thank you so much for your detailed information. I strongly appreciate
>>> that.
>>>
>>> I tried to update the code as you said, and it works fine to dump the
>>> information in the C++ code. However, I am still confused about how to
>>> interpret the "cache state" information at this step. Could you please
>>> take a look at the following questions and shed some light on them?
>>> Thank you!
>>>
>>> The first question is that I can never observe a "cache miss":
>>>
>>> So what I am basically doing right now, as you suggested, is to check
>>> the conditions in the context of each "cpuSidePort->schedTimingResp" call
>>> to decide whether the current memory access leads to a hit or a miss.
>>> However, after running multiple test cases (including some small binaries
>>> and medium-sized GNU Coreutils binaries), all I can find is the "hit"
>>> (schedTimingResp at line 742 of cache.cc) and the schedTimingResp at
>>> line 1454 of cache.cc. Basically, I cannot find any "miss"
>>> (schedTimingResp at line 801 of cache.cc). Am I missing anything here?
>>>
>>> The second question is still about the interpretation of the cache state:
>>>
>>> If I understand correctly, an N-bit memory address is
>>> dissected into the following three parts in a memory access:
>>>
>>> [image: Inline image 2]
>>>
>>> The "set index" is used to locate the cache set in which the data may be
>>> stored, and the tag is used to confirm that the data is indeed currently
>>> present in one of the cache lines in that cache set. In other words, I
>>> understand that the "cache state" (hit, miss, etc.) should be associated
>>> with a cache set for every memory access.
>>>
>>> Given the above context, I would like to confirm that the captured
>>> "hit/miss" really represents the cache state of the accessed cache set. Or
>>> is it actually associated with individual cache lines?
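
The address split and the per-access hit/miss decision described above can be sketched with a toy set-associative cache. This is an illustration under stated assumptions, not gem5's implementation: `split_address` and `ToySetAssocCache` are hypothetical names, the block size and set count are assumed to be powers of two, and LRU replacement is assumed. The set index selects one set, and a hit means that the tag matches one of the lines within that set.

```python
def split_address(addr, block_size=64, num_sets=64):
    """Split an address into (tag, set_index, block_offset).

    Assumes block_size and num_sets are powers of two.
    """
    offset_bits = block_size.bit_length() - 1
    index_bits = num_sets.bit_length() - 1
    offset = addr & (block_size - 1)
    set_index = (addr >> offset_bits) & (num_sets - 1)
    tag = addr >> (offset_bits + index_bits)
    return tag, set_index, offset


class ToySetAssocCache:
    """Toy cache that tracks tags only, with LRU replacement (an assumption)."""

    def __init__(self, block_size=64, num_sets=64, ways=2):
        self.block_size = block_size
        self.num_sets = num_sets
        self.ways = ways
        # One list of tags per set, least recently used first.
        self.sets = [[] for _ in range(num_sets)]

    def access(self, addr):
        tag, set_index, _ = split_address(addr, self.block_size, self.num_sets)
        lines = self.sets[set_index]
        if tag in lines:
            # Tag matched a line in the selected set: hit. Refresh LRU order.
            lines.remove(tag)
            lines.append(tag)
            return "hit"
        # No line in the set holds this tag: miss. Evict the LRU line if full.
        if len(lines) == self.ways:
            lines.pop(0)
        lines.append(tag)
        return "miss"


cache = ToySetAssocCache()
for addr in (0x1000, 0x1004, 0x2000, 0x1000, 0x3000, 0x2000):
    print(hex(addr), split_address(addr), cache.access(addr))
```

In this sketch, 0x1000, 0x2000, and 0x3000 all map to the same set but carry different tags, so the outcome of each access depends on which lines that set currently holds.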
>>>
>>> Am I clear on this? Any suggestions and advice would be appreciated!
>>> Thank you!
>>> Sincerely,
>>> Shuai
>>>
>>>
>>>
>>> On Tue, Jan 3, 2017 at 3:12 PM, Majid Namaki Shoushtari <
>>> [email protected]> wrote:
>>>
>>> Hi Shuai,
>>>
>>> I don't think Jason meant that you need to add a function to Caches.py.
>>> You will need to add something to the C++ class (src/mem/cache/cache.hh/cc).
>>>
>>> I'm not sure what kind of information you need to dump, but basically
>>> all of the incoming requests from the CPU are received here:
>>> "Cache::CpuSidePort::recvTimingReq(PacketPtr pkt)"
>>> and all of the responses to the CPU happen wherever there is a call
>>> to "cpuSidePort->schedTimingResp". There are currently four places where
>>> responses to the CPU are scheduled. If you read the code, it's relatively
>>> easy to figure out which call site covers what condition (hit, miss,
>>> uncacheable access, etc).
>>>
>>> If you need to dump this information for one (some) specific cache(s)
>>> only, one way of doing it is to pass a boolean variable and make it
>>> conditional based on the value of that variable. For that you will need to
>>> add the variable to Caches.py and possibly CacheConfig.py.
>>>
>>> Cheers,
>>> Majid
>>>
>>>
>>> On Tue, Jan 3, 2017 at 8:21 AM, Shuai Wang <[email protected]>
>>> wrote:
>>>
>>> Dear Jason,
>>>
>>> Thank you so much for your reply. Could you please elaborate more on how
>>> to "implement a function in Caches.py to dump the data"? As far as I can
>>> see, there are only some cache parameters defined in this script. I
>>> really have no idea how I should bridge the code there with the runtime
>>> cache state (my focus is the L1 D Cache)...
>>>
>>> I am not a system person and I am sincerely sorry if it is actually
>>> quite obvious... Thank you so much in advance!
>>>
>>> Sincerely,
>>> Shuai
>>>
>>> On Mon, Jan 2, 2017 at 11:01 AM, Jason Lowe-Power <[email protected]>
>>> wrote:
>>>
>>> Hi Shuai,
>>>
>>> There is currently nothing built into gem5 to dump the cache state
>>> (unless you're using Ruby in which case you can look at the code to take a
>>> checkpoint in the RubySystem class and the CacheTrace class). However, it
>>> should be pretty simple to dump the data in the classic caches. You would
>>> need to get a pointer to all of the caches, then add a function to the
>>> Cache class that dumps the data. You may be able to leverage the DDUMP
>>> macro which formats data in a reasonable way. Or, if you're only going to
>>> be using code to consume the output, you can look into the protobuf support
>>> in gem5 for dumping/consuming data.
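
As a rough illustration of what such a dump function might emit, the sketch below walks a toy tag store and serializes one record per valid line. Everything here is an assumption for illustration: `dump_cache_state` is a hypothetical helper, the tag store is modeled as a list of sets holding `(tag, dirty)` tuples, sizes are assumed to be powers of two, and JSON stands in for the machine-readable output format (the message above suggests protobuf for that purpose).

```python
import json


def dump_cache_state(sets, block_size=64):
    """Serialize a toy tag store to JSON.

    sets: list (one entry per cache set) of lists of (tag, dirty) tuples.
    Assumes len(sets) and block_size are powers of two.
    """
    offset_bits = block_size.bit_length() - 1
    index_bits = len(sets).bit_length() - 1
    records = []
    for set_index, lines in enumerate(sets):
        for way, (tag, dirty) in enumerate(lines):
            # Rebuild the block address from the tag and the set index.
            block_addr = (tag << (offset_bits + index_bits)) | (
                set_index << offset_bits
            )
            records.append(
                {
                    "set": set_index,
                    "way": way,
                    "tag": tag,
                    "block_addr": hex(block_addr),
                    "dirty": dirty,
                }
            )
    return json.dumps(records)


# Two sets, one valid line each.
print(dump_cache_state([[(1, False)], [(2, True)]]))
```

A real implementation would live in the C++ Cache class and iterate over its actual tag store; the record layout here only suggests which fields a consumer script might want.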
>>>
>>> Cheers,
>>> Jason
>>>
>>> On Thu, Dec 29, 2016 at 10:38 PM Shuai Wang <[email protected]>
>>> wrote:
>>>
>>> Dear list,
>>>
>>>
>>> I am using the full-system simulation of gem5 to analyze the cache
>>> accesses of some x86 binary code. I have been able to add a monitor between
>>> the CPU and the L1 data cache to track all the cache accesses when executing
>>> the binary code on the simulated OS.
>>>
>>> Currently, I am thinking of going one step further and dumping the cache
>>> state during the execution of the binary code. After a quick search
>>> online, I was unable to find any useful information, and I am wondering
>>> whether it is actually possible to do so.
>>>
>>> Could anyone provide some pointers regarding this task? Thank you in
>>> advance!
>>>
>>> Sincerely,
>>> Shuai
>>> _______________________________________________
>>> gem5-users mailing list
>>> [email protected]
>>> http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users
>>>
>>> --
>>>
>>> Jason
>>>
>>> --
>>> Majid Namaki Shoushtari
>>> PhD Candidate
>>> Department of Computer Science
>>> University of California, Irvine
>>> Irvine, CA 92697
>>> [email protected]
>>> http://www.ics.uci.edu/~anamakis
>>>
>>> --
>>
>> Jason
>>
>
>

