On 4/7/25 13:36, Yannis Bolliger wrote:


Hi Pierrick,

Thanks for your reply!

On Monday, April 7th, 2025 at 7:04 PM, Pierrick Bouvier 
<pierrick.bouv...@linaro.org> wrote:



Hi Yannis,

Is it possible to correlate these addresses? What do I need to look out
for?


It should be possible to correlate these addresses.

- Did you use qemu_plugin_get_hwaddr to obtain the physical address?
- Are you seeing the right address on the kernel side? (Should you add
virt_to_page? Do you really need to offset that with the RAM address in the
memory map?)
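
In case it's useful, the lookup I have in mind is roughly this (a minimal
sketch against the public plugin API in qemu-plugin.h; the actual recording of
the result is left out):

#include <qemu-plugin.h>

static void mem_cb(unsigned int vcpu_index, qemu_plugin_meminfo_t info,
                   uint64_t vaddr, void *udata)
{
    struct qemu_plugin_hwaddr *hw = qemu_plugin_get_hwaddr(info, vaddr);

    /* hw can be NULL (e.g. when no hwaddr info is available) */
    if (hw && !qemu_plugin_hwaddr_is_io(hw)) {
        /* Physical address in the guest memory map; this is the value that
         * should match what the kernel reports, without any extra offset. */
        uint64_t paddr = qemu_plugin_hwaddr_phys_addr(hw);
        bool store = qemu_plugin_mem_is_store(info);
        /* ... record (vcpu_index, store, paddr) somewhere per-vcpu ... */
        (void)paddr;
        (void)store;
    }
}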

I use qemu_plugin_get_hwaddr and I indeed do not need to add the offset.
I confirmed with gva2gpa and both kernel and qemu logs always return a 
consistent physical address.

That's a good starting point.

Although gva2gpa usually only works on either the user virtual address or the 
kernel one, perhaps depending on which context the CPU is running in?


I'm not familiar with our monitor command gva2gpa, but I guess it relies on the current page table set. Kernel and user space may have different ones (and additional ones depending on privilege level and architecture). To be able to write to user space, the kernel must still map that address in its own page table, so there should still be a correct virtual -> physical mapping from the kernel context.

I'm not sure how those details are handled in the Linux kernel, but I know that kernel space is mapped to a specific partition of the address space, and user space to the rest. So user address space is accessible to the kernel (but not the opposite, for obvious reasons). Before the Meltdown exploit, both used the same page table (with the U bit set to 0 on kernel entries). To mitigate it, they now have distinct page tables (KPTI) [1][2], but I don't think it's enabled by default because of the cost it adds to syscalls (flushing the TLB completely).

[1] https://www.kernel.org/doc/html/next/x86/pti.html
[2] https://en.wikipedia.org/wiki/Kernel_page-table_isolation

There is no single, absolute answer to "Which physical address matches this virtual one?" that holds for the whole execution. It will vary based on the current context.

The problem I have is that I can now find some accesses to the buffer addresses 
I log, but usually only to one of the two, not both.
As a short example for a read, i.e. a kernel-to-user copy (I take one kernel 
log line and grep the memory trace for any accesses to the same pages, not the 
exact address):
KernelRecord {command: system_server, cpu: 5, size: 4096, op: r, 
kernel_address: 0x000000010443b000, user_address: 0x000000014c7ad015 }
LogRecord {insn_count: 11137708523, store: 0, address: 0x000000010443b000 }
LogRecord {insn_count: 11137708739, store: 0, address: 0x000000010443b010 }
LogRecord {insn_count: 11137708750, store: 0, address: 0x000000010443b020 }
... (goes on)

I can see the whole page being read from the kernel buffer, but I cannot find 
any stores to the corresponding user buffer physical address.

Do you see any store at all (even with a different address)?

What I would expect are interleaved loads and stores. Depending on the specific 
kernel log line I can find only writes to the user address, but never an 
interleaving.

Do you have any idea what the issue could be? The behavior of gva2gpa sort of 
suggests that perhaps during the callback the wrong context is used for either 
one, but that is just an uninformed guess from my side.


If you don't see any store, the most obvious thing I can think of is to check whether you use QEMU_PLUGIN_MEM_RW, and not only QEMU_PLUGIN_MEM_R, when registering the plugin memory callback.

From QEMU's perspective, the fact that those things happen from kernel or user space does not really matter. All it sees are loads/stores, and it instruments them.
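
To illustrate, the registration in the tb translation callback would look
roughly like this (a sketch, with mem_cb being your memory callback; the
important part is the QEMU_PLUGIN_MEM_RW flag):

static void vcpu_tb_trans(qemu_plugin_id_t id, struct qemu_plugin_tb *tb)
{
    size_t n = qemu_plugin_tb_n_insns(tb);

    for (size_t i = 0; i < n; i++) {
        struct qemu_plugin_insn *insn = qemu_plugin_tb_get_insn(tb, i);

        /* QEMU_PLUGIN_MEM_RW: fire for both loads and stores.
         * With QEMU_PLUGIN_MEM_R only, every store would be silently
         * missing from the trace. */
        qemu_plugin_register_vcpu_mem_cb(insn, mem_cb,
                                         QEMU_PLUGIN_CB_NO_REGS,
                                         QEMU_PLUGIN_MEM_RW, NULL);
    }
}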

--------
Some more background info on what I'm doing so you don't waste any time looking 
for an issue on your side in case I'm just using it wrongly:
- I use multithreaded TCG (SMP=8)
- kernel 5.10.234 (android cuttlefish)
- qemu built from source (state from 4 weeks ago 5136598e26)
- I did adjust the execlog plugin somewhat

For the last point, I basically stripped it down to the minimum I needed and 
added some optimizations for my specific requirements. I did that because the 
original plugin caused my kernel to lock up completely due to the work done in 
the callbacks.

More details:
- I only need a count of instructions in between memory accesses, so in the 
insn_exec_cb I just increment a global counter atomically with 
__atomic_fetch_add (sequential consistency) and store it in the per-vcpu 
LogRecord struct (see the sketch after this list)
- In the memory callback I just do qemu_plugin_mem_is_store, 
qemu_plugin_get_hwaddr and qemu_plugin_hwaddr_phys_addr to fill the LogRecord 
struct and write it to the per-vcpu logfile
- I added a function to the plugin API to get the log enabled state (log_mask) 
so I can avoid doing anything in the callbacks (not just avoid printing, as 
done internally by qemu_plugin_outs)
- I do not use any locks since I have allocated everything per vcpu. I only use 
the atomic add in the insn_exec callback, sort of as a logical time, and to 
potentially serialize my trace later
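
To be concrete, the counting part boils down to something like this 
(simplified; the names and the fixed-size array are illustrative, not my 
actual code):

static uint64_t insn_count;          /* global logical clock */
static uint64_t vcpu_clock[64];      /* per-vcpu snapshot of the clock */

static void insn_exec_cb(unsigned int vcpu_index, void *udata)
{
    /* sequentially consistent increment, shared by all vcpus */
    vcpu_clock[vcpu_index] =
        __atomic_fetch_add(&insn_count, 1, __ATOMIC_SEQ_CST);
}

The memory callback then copies vcpu_clock[vcpu_index] into the LogRecord it 
writes to that vcpu's logfile.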


If you can, please share your plugin code, or at least the memory callback setup, to make sure everything is ok.

It's possible that what you're trying to observe is split amongst several vcpus. I'm not sure how the Linux kernel deals with those copies, but if several kernel threads are involved, you won't see all the side effects by observing a single vcpu only.

I would suggest debugging with -smp 1 first.
