Re: [gem5-dev] Architecture of gem5

Potter, Brandon Fri, 09 Sep 2016 11:26:04 -0700

Yeah, I remember the "Linux process trackers"; they were my motivation for 
exploring the memory reads on the kernel structures. I wanted to extend them, 
but the source was closed so I started hacking around to implement my own.

Unfortunately, I don't get to use FS here at AMD much; we rely almost 
exclusively on SE for now. So, I don't know if it's possible to make Python 
scripts interact with a running kernel in FS. I suspect that it might be 
possible to build the mechanism easily enough though. You could create Python 
wrappers for C++ methods that read physical memory; the memory should be 
accessible through the "System" class. To make the equivalent of HAPs, you 
could create pseudo instructions and instrument a Linux kernel with them. 
Alternatively, you could add hooks into gem5 on register state changes to do 
some magical thing that you want to do. Hopefully someone else has something 
more constructive to say, but that's what comes to mind for me.
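
As a rough sketch of the first idea (not an existing gem5 interface): assume some C++ method that reads physical memory has been exposed to Python as a callable taking (paddr, size) and returning bytes. The names PhysMemReader, HookRegistry, and the "pseudo_inst" event are hypothetical, and a bytearray stands in for simulated memory so the sketch runs on its own.

```python
import struct
from collections import defaultdict


class PhysMemReader:
    """Hypothetical wrapper; read_bytes stands in for a C++ method exposed to Python."""

    def __init__(self, read_bytes):
        self._read_bytes = read_bytes  # callable(paddr, size) -> bytes

    def read_u64(self, paddr):
        # Little-endian 64-bit read, as on x86
        return struct.unpack("<Q", self._read_bytes(paddr, 8))[0]

    def read_u32(self, paddr):
        return struct.unpack("<I", self._read_bytes(paddr, 4))[0]


class HookRegistry:
    """HAP-like callbacks keyed by event name (event names are illustrative)."""

    def __init__(self):
        self._callbacks = defaultdict(list)

    def register(self, event, callback):
        self._callbacks[event].append(callback)

    def fire(self, event, **info):
        # Call every callback registered for this event and collect the results
        return [cb(**info) for cb in self._callbacks[event]]


# Stand-in for simulated physical memory so the sketch is self-contained.
fake_memory = bytearray(64)
struct.pack_into("<Q", fake_memory, 16, 0xFFFF880012345000)


def fake_read(paddr, size):
    return bytes(fake_memory[paddr:paddr + size])


reader = PhysMemReader(fake_read)
hooks = HookRegistry()
hooks.register("pseudo_inst", lambda paddr: reader.read_u64(paddr))
results = hooks.fire("pseudo_inst", paddr=16)
```

In a real setup the simulator would call fire() when an instrumented pseudo instruction executes; here it is called by hand.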

The caveat with this kind of thing is that we would have to maintain it as the 
Linux kernel moves forward to keep it working. The kernel developers have a 
tendency to change the symbols that are made available through 
kallsyms/System.map. Currently, I think that the symbols are not even visible 
by default in a stock kernel; they were turned off for security reasons. We'd 
have to compile a new kernel with the switch flipped on. Anyway, it might not 
be a big deal, because we already package our FS kernels and post them on the gem5 
website.
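
For reference, a System.map file is plain text with one "address type name" triple per line, so loading the symbols is the easy part. A minimal sketch follows; the sample addresses are made up, and parse_system_map is just an illustrative helper name.

```python
def parse_system_map(text):
    """Return {symbol_name: address} from System.map-style text."""
    symbols = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 3:
            continue  # skip blank or malformed lines
        addr, _sym_type, name = parts[0], parts[1], parts[2]
        symbols[name] = int(addr, 16)
    return symbols


# Made-up sample content in the usual three-column layout
sample = """\
ffffffff81000000 T _text
ffffffff81c10480 D init_task
ffffffff81e0b000 B __bss_start
"""
symbols = parse_system_map(sample)
```

Whether a given symbol (init_task, for example) is present at all depends on the kernel version and build options, which is exactly the maintenance burden described above.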

This is a project that I've wanted to work on for the past few years, but I 
haven't had the time to devote to it. However, I do agree that this would be an 
excellent feature to add for research purposes.

-Brandon

-----Original Message-----
From: gem5-dev [mailto:[email protected]] On Behalf Of Boris 
Shingarov/Employee/LW-US
Sent: Friday, September 9, 2016 8:03 AM
To: gem5 Developer List <[email protected]>
Subject: Re: [gem5-dev] Architecture of gem5

I remember back when I was using Simics, there were these things called "Linux 
process trackers" that did exactly that. It was all written in Python, and 
there was a point when various processor-specific trackers got simplified / 
pulled together into one generic tracker in which only the truly ISA-dependent 
knowledge was pushed down to subclasses like x86_ops/ppc_ops etc., which would 
override methods like "syscall_num()" according to whether that number is 
passed in eax vs r0, or "current_task()" according to whether the struct is 
kept on the kernel stack vs pointed to by sprg3, etc.
It would be extremely interesting to have similar functionality in gem5, and I 
have even thought about writing something like that myself, but I don't know 
much about the infrastructure that would allow one to hook up such Python 
scripts to haps like "memory access breakpoint" or "interrupt" or "user/kernel 
mode change" easily and even on-the-fly in the middle of debugging the kernel. 
If someone could provide an initial explanation of what mechanisms are there 
beyond using Python for initial configuration, it would serve as a push 
forward at least for me.
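
To make the shape of that design concrete, here is a minimal Python sketch. It is not the Simics code: the class names, the register dictionary standing in for CPU state, and the 8 KiB kernel stack size are all assumptions for illustration.

```python
from abc import ABC, abstractmethod


class LinuxTracker(ABC):
    """Generic tracker; only ISA-specific details live in subclasses."""

    def __init__(self, regs):
        self.regs = regs  # dict of register name -> value (stand-in for CPU state)

    @abstractmethod
    def syscall_num(self):
        ...

    @abstractmethod
    def current_task(self):
        ...

    def describe_syscall(self):
        # ISA-independent logic built only on the two overridable methods
        return f"syscall {self.syscall_num()} from task at {self.current_task():#x}"


class X86Ops(LinuxTracker):
    THREAD_SIZE = 8192  # assumed kernel stack size

    def syscall_num(self):
        return self.regs["eax"]

    def current_task(self):
        # Struct kept at the base of the kernel stack: mask the stack pointer
        return self.regs["esp"] & ~(self.THREAD_SIZE - 1)


class PpcOps(LinuxTracker):
    def syscall_num(self):
        return self.regs["r0"]

    def current_task(self):
        # Struct address held in a special-purpose register
        return self.regs["sprg3"]


x86 = X86Ops({"eax": 4, "esp": 0xC1235F40})
ppc = PpcOps({"r0": 4, "sprg3": 0xC0500000})
```

The generic describe_syscall() never needs to know which ISA it is running on, which is the point of pushing the register conventions down into the subclasses.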

-----"gem5-dev" <[email protected]> wrote: -----
To: gem5 Developer List <[email protected]>
From: "Potter, Brandon" 
Sent by: "gem5-dev" 
Date: 09/08/2016 07:26PM
Subject: Re: [gem5-dev] Architecture of gem5

With full-system mode, you can do kernel introspection. The kernel resides in 
the simulated memory which you have complete control over; you can do whatever 
you want to do with it. The kernel is what is responsible for managing the 
processes and their memory so if you want interesting kernel-level information, 
you might look into reading the kernel's physical memory directly.

With complete access to the full memory layout, you can play tricks with the 
physical memory and OS symbols to access structures, namely task_struct and 
mm_struct. It's difficult to do this and requires a bit of trial and error to 
figure out which symbols the kernel exposes to allow you to do this, but it is 
possible. The trick is to read the hex values from physical memory and compare 
them with the Linux kernel source; you can reason about which fields are 
pointers and which have values that are defined in the source. It just takes 
work to puzzle out the rest. I did this once with Simics so it is possible, but 
it's time consuming.
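
A small helper can take some of the tedium out of the "read the hex values and compare" step. The sketch below splits a raw dump into 64-bit little-endian words and flags the ones that fall in the upper half of the x86-64 address space, which are likely kernel pointers. The threshold is only a heuristic (a field holding -1 would also be flagged), and the dump contents are made up.

```python
import struct

KERNEL_HALF_START = 0xFFFF800000000000  # start of the x86-64 upper canonical half


def classify_words(raw, base_addr=0):
    """Split raw little-endian bytes into 64-bit words and flag likely kernel pointers."""
    rows = []
    usable = len(raw) - len(raw) % 8  # ignore a trailing partial word
    for offset in range(0, usable, 8):
        (value,) = struct.unpack_from("<Q", raw, offset)
        kind = "pointer?" if value >= KERNEL_HALF_START else "value"
        rows.append((base_addr + offset, value, kind))
    return rows


# Made-up dump: a pointer-like word, a small integer, another pointer-like word
dump = struct.pack("<QQQ", 0xFFFF88001234ABC0, 0x1, 0xFFFFFFFF81C10480)
rows = classify_words(dump, base_addr=0x1000)
for addr, value, kind in rows:
    print(f"{addr:#010x}: {value:#018x}  {kind}")
```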

If you obtain access to the task_struct list, you can get access to the 
mm_struct, which gives you the full memory layout for each process. With the 
full memory layout, you can discern which areas of each process' virtual 
address space are mapped and which physical frames (in the simulated memory) 
correspond to those virtual addresses. In this way, you can get both the 
virtual and physical addresses for any process.
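
The walk itself can be sketched as follows. The kernel links every task_struct into a circular list through an embedded list_head named tasks, starting at init_task. Everything else here is an assumption: the field offsets are invented (real ones depend on kernel version and config), and the "memory" is a bytearray indexed directly, whereas real kernel pointers are virtual addresses that would need translating first.

```python
import struct

# Hypothetical field offsets; real ones depend on kernel version and config.
OFF_TASKS = 0x00  # struct list_head tasks (next pointer comes first)
OFF_PID = 0x10    # pid_t pid
OFF_MM = 0x18     # struct mm_struct *mm


def read_u64(mem, addr):
    return struct.unpack_from("<Q", mem, addr)[0]


def read_u32(mem, addr):
    return struct.unpack_from("<I", mem, addr)[0]


def walk_tasks(mem, init_task):
    """Follow the circular tasks list starting at init_task; return (pid, mm) pairs."""
    tasks = []
    task = init_task
    while True:
        tasks.append((read_u32(mem, task + OFF_PID), read_u64(mem, task + OFF_MM)))
        next_node = read_u64(mem, task + OFF_TASKS)  # address of the next list_head
        task = next_node - OFF_TASKS                 # same idea as container_of()
        if task == init_task:
            return tasks


# Build a tiny fake memory image with three tasks linked in a ring.
mem = bytearray(0x400)
layout = [(0x100, 0, 0), (0x200, 1, 0x380), (0x300, 42, 0x3C0)]  # (addr, pid, mm)
for i, (addr, pid, mm) in enumerate(layout):
    next_addr = layout[(i + 1) % len(layout)][0]
    struct.pack_into("<Q", mem, addr + OFF_TASKS, next_addr + OFF_TASKS)
    struct.pack_into("<I", mem, addr + OFF_PID, pid)
    struct.pack_into("<Q", mem, addr + OFF_MM, mm)

tasks = walk_tasks(mem, init_task=0x100)
```

A NULL mm pointer, as on the first entry here, is what a kernel thread would show, since kernel threads have no user address space of their own.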

As Jason mentioned, you can use /proc/{some_pid}/maps to figure out the virtual 
address ranges that are mapped (at a specific point in the execution of that 
process). Furthermore, you can use pagemap to figure out what the physical 
address is. The catch with using these methods is that the simulator has to be 
running and the process that you're interested in examining has to be running 
to read the files (because maps and pagemap are pseudo-files under /proc). If 
you have a short-running process, you typically have to add a `while(1)` loop 
into the application to figure out the mappings.
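
For anyone who has not used these files, the sketch below shows the arithmetic involved. A maps line starts with "start-end perms offset dev inode [path]", and pagemap holds one 64-bit entry per virtual page, with bit 63 meaning "present" and bits 0-54 holding the page frame number. The 4 KiB page size, the little-endian layout, the sample line, and the sample entry are assumptions; inside the guest you would read the real entry from /proc/<pid>/pagemap at the computed offset.

```python
import struct

PAGE_SIZE = 4096  # assumed 4 KiB pages


def parse_maps_line(line):
    """Parse one /proc/<pid>/maps line into (start, end, perms, path)."""
    fields = line.split(None, 5)
    start, end = (int(x, 16) for x in fields[0].split("-"))
    path = fields[5].strip() if len(fields) > 5 else ""
    return start, end, fields[1], path


def pagemap_offset(vaddr):
    """Byte offset of the 8-byte pagemap entry that covers vaddr."""
    return (vaddr // PAGE_SIZE) * 8


def decode_pagemap_entry(raw8):
    """Return (present, pfn) from one 64-bit pagemap entry."""
    (entry,) = struct.unpack("<Q", raw8)
    present = bool((entry >> 63) & 1)
    pfn = entry & ((1 << 55) - 1)  # bits 0-54
    return present, pfn


def physical_address(pfn, vaddr):
    """Combine the frame number with the offset inside the page."""
    return pfn * PAGE_SIZE + vaddr % PAGE_SIZE


# Made-up sample data
line = "00400000-0040b000 r-xp 00000000 08:01 1234 /bin/cat"
start, end, perms, path = parse_maps_line(line)
entry = struct.pack("<Q", (1 << 63) | 0x1A2B3)
present, pfn = decode_pagemap_entry(entry)
paddr = physical_address(pfn, 0x00400123)
```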

Also, as Jason mentioned, on X86 you can examine the ring level that the CPU is 
in to figure out if the kernel is executing or if the process is running in 
userspace; if you don't understand what we're talking about, read 
http://duartes.org/gustavo/blog/post/cpu-rings-privilege-and-protection/. You 
can examine the segmentation registers and figure out at any given time what's 
going on. (If you don't know much about memory, I'd read the rest of his blog 
posts on the topic as well. It's a nice summary of Linux memory for the 
uninitiated.) Also, userspace processes should not have access to kernel space 
data. Linux draws a line in the virtual memory sand for different architectures 
to specify what is "kernel space" and what is "user space". With X86-64, 48 
bits [47...0] are used for virtual addresses. (Bit 47 gets sign-extended 
through bit 63, so the upper 16 bits carry no additional information.) If 
the sign-extended bits are 1s, then the accesses are going into kernel space 
and you can be reasonably sure that the kernel is executing. So, the foolproof 
method is the CPL in the code segment register, but you can sometimes 
tell by the accesses if the kernel is running or not. If you want to learn 
about memory in Linux, the canonical document, at least in my mind is: 
https://www.kernel.org/doc/gorman/. (Although, the book is a bit dated.)
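
Both checks come down to a few bit operations. A minimal sketch, assuming 48-bit virtual addresses as described above and treating addresses as unsigned 64-bit integers; the function names are illustrative, and the selector values in the usage lines are the ones Linux conventionally uses on x86-64.

```python
def is_canonical(vaddr):
    """True if bits 63..48 all equal bit 47 (48-bit virtual addressing)."""
    top = vaddr >> 47  # bits 63..47, seventeen bits in total
    return top == 0 or top == 0x1FFFF


def is_kernel_address(vaddr):
    """Upper canonical half: the sign-extension bits are all 1s."""
    return (vaddr >> 47) == 0x1FFFF


def cpl_from_cs(cs_selector):
    """The low two bits of the CS selector give the current privilege level."""
    return cs_selector & 0x3


kernel_hit = is_kernel_address(0xFFFFFFFF81000000)  # typical kernel text address
user_hit = is_kernel_address(0x00007FFFF7DD1000)    # typical user-space address
kernel_cpl = cpl_from_cs(0x10)                      # kernel code segment selector
user_cpl = cpl_from_cs(0x33)                        # user code segment selector
```

The address test is only a hint, since the kernel also touches user addresses (when copying syscall arguments, for instance); the CPL check is the one to rely on.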

If you're looking to track a specific process in X86, you should be able to 
monitor the CR3 register to verify that the correct process is running; the CR3 
holds the base frame for the page tables so it's a decent way to figure out if 
a 'special' process is being run.
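
A sketch of that check, assuming some hook calls on_cr3_write() with the new register value whenever CR3 changes (that hook, and the Cr3Watcher class, are hypothetical). The low 12 bits of CR3 carry flags or a PCID rather than address bits, so they are masked off before comparing.

```python
CR3_BASE_MASK = 0x000FFFFFFFFFF000  # bits 51..12: base of the top-level page table


def page_table_base(cr3):
    return cr3 & CR3_BASE_MASK


class Cr3Watcher:
    """Tracks whether the target process's page tables are currently installed."""

    def __init__(self, target_cr3):
        self.target_base = page_table_base(target_cr3)
        self.active = False
        self.switches = 0  # number of times the target was switched in or out

    def on_cr3_write(self, new_cr3):
        now_active = page_table_base(new_cr3) == self.target_base
        if now_active != self.active:
            self.switches += 1
        self.active = now_active
        return now_active


watcher = Cr3Watcher(0x1A2B3000)
# Made-up CR3 values: target, some other process, target again with low bits set
states = [watcher.on_cr3_write(v) for v in (0x1A2B3000, 0x0F00D000, 0x1A2B3005)]
```

One caveat: threads of the same process share an address space and therefore the same CR3 value, so this identifies the process, not an individual thread.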

-Brandon

-----Original Message-----
From: gem5-dev [mailto:[email protected]] On Behalf Of Jason Lowe-Power
Sent: Wednesday, August 31, 2016 8:59 AM
To: gem5 Developer List <[email protected]>
Subject: Re: [gem5-dev] Architecture of gem5

Hi Jasmin,

In full-system mode, gem5 runs a full operating system and all software. It is 
analogous to a pure virtualization platform. In full-system mode, gem5 must 
model all devices, etc. that the OS expects to interact with.
This is in comparison to syscall-emulation mode. In SE mode, gem5 only executes 
user-mode code. All OS system calls are routed into the simulator and are 
*emulated*.

To answer your questions:
   - Is it possible to reason whether an instruction is a user instruction or 
kernel instruction?
Yes and no. No, there is no simple function to call to see if you are currently 
running in kernel or user mode. However, depending on your kernel / OS certain 
PC addresses represent kernel vs user-mode code. Additionally, you could watch 
what mode the CPU is in (ring-0 vs ring-3, etc), depending on the architecture.

   - Can we know which process an instruction belongs to inside the OS?
This is a little more tricky, but it may be possible based on the physical 
address of the PC and using OS interfaces (e.g., /proc on Linux).

   - How is memory mapped to OS processes?
Again, this is tricky, but may be possible with introspection into /proc or 
something similar.

Overall, I believe something may exist that does what you're trying to do.
There was a presentation a few years ago at the gem5 users workshop that did 
some of these things. See the PDF here:
http://gem5.org/wiki/images/9/9f/2012_12_01_gem5_workshop_Streamline.pdf. I 
don't know what the current state of that project is. You may want to contact 
the author directly.

Hope this helps!

Jason

On Wed, Aug 31, 2016 at 6:28 AM Jasmin Jahic <[email protected]> wrote:

> Hello,
>
> I will try to refine my last question a bit. In the gem5 full system mode:
>
>    - Is it possible to reason whether an instruction is a user instruction
>    or kernel instruction?
>    - Can we know which process an instruction belongs to inside the OS?
>    - How is memory mapped to OS processes?
>
> I hope that someone has some knowledge about the questions above. If 
> yes, they would help a lot.
>
> Best regards,
> Jasmin
>
>
> On Tue, Aug 30, 2016 at 6:22 PM, Stine, James <[email protected]> wrote:
>
> > Many apologies - my email got corrupted.  Please ignore last Email.
> >
> > James
> >
> > > On Aug 30, 2016, at 11:21 AM, Stine, James <[email protected]> wrote:
> > >
> > > I can make smaller if you want..  Let me know if not what you need or
> > > want.  Thanks for letting me know!  Take care.
> > >
> > > J
> > >
> > > <Memo_to_OSU_Faculty_SHPE.pdf>
> > >
> > >
> > >> On Aug 30, 2016, at 11:18 AM, Jasmin Jahic <[email protected]> wrote:
> > >>
> > >> Hello,
> > >>
> > >> I have one question regarding the architecture of gem5 and I hope that
> > >> you can help me. I am interested where gem5 in Full system mode ends
> > >> and where the OS is completely taking over?
> > >>
> > >> For example, can I influence scheduling of the tasks by modifying the
> > >> gem5 code directly, or is the gem5 simply running the OS as any other
> > >> program?
> > >>
> > >> Another example, from the OS's console I can start a simple binary.
> > >> Can I modify the code to load a binary or is that handled completely
> > >> through the OS, and gem5 cannot distinguish between instructions coming
> > >> from the OS or other process and the regular binary I would run from
> > >> the console?
> > >>
> > >> Best regards,
> > >> Jasmin
> > >> _______________________________________________
> > >> gem5-dev mailing list
> > >> [email protected]
> > >> http://m5sim.org/mailman/listinfo/gem5-dev
> > >
> >
> > _______________________________________________
> > gem5-dev mailing list
> > [email protected]
> > http://m5sim.org/mailman/listinfo/gem5-dev
> >
> _______________________________________________
> gem5-dev mailing list
> [email protected]
> http://m5sim.org/mailman/listinfo/gem5-dev
>
_______________________________________________
gem5-dev mailing list
[email protected]
http://m5sim.org/mailman/listinfo/gem5-dev
