On 4/7/25 13:36, Yannis Bolliger wrote:
Hi Pierrick,
Thanks for your reply!
On Monday, April 7th, 2025 at 7:04 PM, Pierrick Bouvier
<pierrick.bouv...@linaro.org> wrote:
Hi Yannis,
Is it possible to correlate these addresses? What do I need to look out
for?
It should be possible to correlate these addresses.
- Did you use qemu_plugin_get_hwaddr to obtain the physical address?
- Are you seeing the right address on the kernel side? (Should you add
virt_to_page? Do you really need to offset that with the RAM address in the
memory map?)
I use qemu_plugin_get_hwaddr, and indeed I do not need to add the offset.
I confirmed with gva2gpa, and both the kernel and QEMU logs always return a
consistent physical address.
That's a good starting point.
Although gva2gpa usually only works on either the user virtual address or the
kernel one, perhaps depending on which context the CPU is running in?
I'm not familiar with our monitor command gva2gpa, but I guess it relies on
the currently active page table set. Kernel and user space may have different
ones (and additional ones depending on privilege level and architecture). To
be able to write to user space, the kernel must still map this address in its
own page table, so there should still be a correct virtual -> physical mapping
from the kernel context.
I'm not sure how those details are handled in the Linux kernel, but I know
that kernel space is mapped to a specific partition of the address space, and
user space to the rest. So user address space is accessible to the kernel (but
not the other way around, for obvious reasons).
Before the Meltdown exploit, it used to be the same page table (but with the U
bit cleared on kernel entries). To mitigate it, there are now distinct page
tables (KPTI) [1][2], but I don't think it's enabled by default because of the
cost it adds to syscalls (a complete TLB flush).
[1] https://www.kernel.org/doc/html/next/x86/pti.html
[2] https://en.wikipedia.org/wiki/Kernel_page-table_isolation
There is not a single, absolute answer to "which physical address matches
this virtual one?" throughout the execution. It will vary based on the
current context.
The problem I have is that I can now find some accesses to the buffer addresses
I log, but usually only to one of them.
As a short example for a read, i.e. a kernel-to-user copy (I take one kernel
log line and grep the memory trace for any accesses to the same pages, not the
exact address):
KernelRecord {command: system_server, cpu: 5, size: 4096, op: r,
kernel_address: 0x000000010443b000, user_address: 0x000000014c7ad015 }
LogRecord {insn_count: 11137708523, store: 0, address: 0x000000010443b000 }
LogRecord {insn_count: 11137708739, store: 0, address: 0x000000010443b010 }
LogRecord {insn_count: 11137708750, store: 0, address: 0x000000010443b020 }
... (goes on)
I can see the whole page being read from the kernel buffer, but I cannot find
any stores to the corresponding user buffer physical address.
Do you see any store at all (even with a different address)?
What I would expect are interleaved loads and stores. Depending on the specific
kernel log line, I can find only writes to the user address, but never an
interleaving.
Do you have any idea what the issue could be? The behavior of gva2gpa sort of
suggests that perhaps during the callback the wrong context is used for one or
the other, but that is just an uninformed guess from my side.
If you don't see any store, the most obvious idea I have is to check if
you use QEMU_PLUGIN_MEM_RW, and not only QEMU_PLUGIN_MEM_R, when
registering the plugin memory callback.
From QEMU's perspective, the fact that those accesses happen from kernel or
user space does not really matter. All it sees are loads/stores, and it
instruments them.
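For reference, the registration I have in mind looks roughly like this (a
minimal sketch, not your actual code; vcpu_mem_cb stands for whatever memory
callback you register):

#include <qemu-plugin.h>

/* vcpu_mem_cb is your memory callback, declared elsewhere in the plugin */

static void vcpu_tb_trans(qemu_plugin_id_t id, struct qemu_plugin_tb *tb)
{
    size_t n = qemu_plugin_tb_n_insns(tb);
    for (size_t i = 0; i < n; i++) {
        struct qemu_plugin_insn *insn = qemu_plugin_tb_get_insn(tb, i);
        /* QEMU_PLUGIN_MEM_RW instruments both loads and stores;
         * QEMU_PLUGIN_MEM_R would only report the loads. */
        qemu_plugin_register_vcpu_mem_cb(insn, vcpu_mem_cb,
                                         QEMU_PLUGIN_CB_NO_REGS,
                                         QEMU_PLUGIN_MEM_RW, NULL);
    }
}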
--------
Some more background info on what I'm doing, so you don't waste any time looking
for an issue on your side in case I'm just using it incorrectly:
- I use multithreaded TCG (SMP=8)
- kernel 5.10.234 (Android Cuttlefish)
- QEMU built from source (state from 4 weeks ago, commit 5136598e26)
- I did adjust the execlog plugin somewhat
For the last point, I basically stripped it down to the minimum I needed and
added some optimizations for my specific requirements. I did that because the
original plugin caused my kernel to lock up completely due to the work done in
the callbacks.
More details:
- I only need a count of instructions between memory accesses, so in the
insn_exec_cb I just increment a global counter atomically with
__atomic_fetch_add (sequential consistency) and store it in the per-vcpu
LogRecord struct (a simplified sketch of both callbacks follows after this list)
- In the memory callback I just do qemu_plugin_mem_is_store,
qemu_plugin_get_hwaddr and qemu_plugin_hwaddr_phys_addr to fill the LogRecord
struct and write it to the per-vcpu log file
- I added a function to the plugin API to get the log-enabled state (log_mask),
so I can avoid doing anything in the callbacks (not just avoid printing, as
done internally by qemu_plugin_outs)
- I do not use any locks since I have allocated everything per vCPU. I only use
the atomic add in the insn_exec callback as a sort of logical time, and to
potentially serialize my trace later
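Roughly, the two callbacks boil down to this (heavily simplified; MAX_VCPUS,
the exact LogRecord fields and the file writing are illustrative, not my real
code):

#include <qemu-plugin.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_VCPUS 8  /* illustrative; matches the 8 vcpus I run with */

typedef struct {
    uint64_t insn_count;  /* global counter value, used as logical time */
    bool store;
    uint64_t address;     /* physical address of the access */
} LogRecord;

static uint64_t insn_counter;         /* shared, only touched atomically */
static LogRecord records[MAX_VCPUS];  /* one record per vcpu, no locking */

static void vcpu_insn_exec(unsigned int vcpu_index, void *udata)
{
    /* bump the global counter and remember its value for this vcpu */
    records[vcpu_index].insn_count =
        __atomic_fetch_add(&insn_counter, 1, __ATOMIC_SEQ_CST);
}

static void vcpu_mem(unsigned int vcpu_index, qemu_plugin_meminfo_t info,
                     uint64_t vaddr, void *udata)
{
    struct qemu_plugin_hwaddr *hw = qemu_plugin_get_hwaddr(info, vaddr);
    if (!hw) {
        return;
    }
    LogRecord *rec = &records[vcpu_index];
    rec->store = qemu_plugin_mem_is_store(info);
    rec->address = qemu_plugin_hwaddr_phys_addr(hw);
    /* ... append *rec to this vcpu's log file ... */
}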
If you can, please share your plugin code, or at least the memory
callback setup to make sure everything is ok.
It's possible that what you are trying to observe is split amongst several
vcpus. I'm not sure how the Linux kernel deals with those copies, but if
several kernel threads are involved, you won't see all the side effects by
observing a single vcpu.
I would suggest debugging with -smp 1 first.
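Something along these lines, adapted from your existing command line (the
binary name and plugin path below are only placeholders):

qemu-system-aarch64 [your existing options] \
    -smp 1 \
    -plugin /path/to/your-trace-plugin.so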