On Monday, March 9th, 2026 at 6:54 PM, Roy Fru <[email protected]> wrote:
> Good day Mentors,
> I hope this email finds you well
> I am writing to submit my report for Experiment-0: VM Oversubscription and
> Host-Guest Tracing. The report includes:

Hi Roy,

Thanks for the detailed update. I have a couple of comments.

> - VM setup and vCPU oversubscription details

While your setup correctly achieves an oversubscription ratio of 1:2,
it also adds one-to-one pinning for the vCPUs on the host. This would
have been fine if our goal was to dictate the vCPU placement decisions
for the host scheduler. But we don't intend to do that. Phantom tracker
simply observes the host scheduler's decisions. So please refrain from
one-to-one pinning.

> - Simultaneous execution of NPB UA benchmark and stress-ng workload

Good choice to run stress-ng as the parallel workload. Your script
waits until both workloads finish inside their respective VMs. At the
moment, we treat the workload inside VM-2 (stress-ng) as noise, so it
is okay to terminate the script as soon as the UA benchmark finishes
inside VM-1. On the other hand, to create a more hostile situation for
UA, it is a good idea to ensure that the noise has already started
running inside VM-2 before UA begins its execution inside VM-1.

> - Host and guest sched_switch tracing using trace-cmd

Your host-guest tracing setup looks good. Instead of manually launching
and killing trace-cmd, you can also consider including trace-cmd in
your experiment script. For example, my script does the following:

    ssh vng-vm1 trace-cmd agent -D
    ssh vng-vm2 trace-cmd agent -D
    trace-cmd record -e sched -v -e sched_stat_runtime \
        -A @4:823 --name vm1 -e printk:console \
            -e sys_enter_sched_setaffinity -e sys_enter_futex \
            -e sched -v -e sched_stat_runtime \
        -A @5:823 --name vm2 -e sched -v -e sched_stat_runtime \
        ./exp.sh

And exp.sh simply launches the workloads inside both VMs over ssh and
waits for completion, similar to your script.
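A minimal sketch of what such an exp.sh could look like, with the noise
started in VM-2 before UA and the script ending as soon as UA finishes.
This is only an illustration under assumptions: the UA binary path, the
stress-ng arguments, the 5-second warm-up and the DRY_RUN switch are all
hypothetical and not taken from the actual script.

```shell
#!/usr/bin/env bash
# Sketch of exp.sh: noise starts first in VM-2, UA runs in VM-1, and the
# script ends as soon as UA finishes. All defaults below are assumptions.
set -u

VM1=${VM1:-vng-vm1}                  # ssh host running the UA benchmark
VM2=${VM2:-vng-vm2}                  # ssh host running the noise
UA_CMD=${UA_CMD:-./ua.A.x}           # hypothetical path to the NPB UA binary
NOISE_CMD=${NOISE_CMD:-stress-ng --cpu 2 --timeout 600s}  # hypothetical noise
WARMUP=${WARMUP:-5}                  # seconds to let the noise ramp up
DRY_RUN=${DRY_RUN:-1}                # 1: only print commands, 0: run them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

main() {
    run ssh "$VM2" "$NOISE_CMD" &            # start the noise in the background
    local noise_pid=$!
    run sleep "$WARMUP"                      # give the noise a head start
    run ssh "$VM1" "$UA_CMD"                 # blocks until UA finishes
    run ssh "$VM2" pkill stress-ng || true   # stop the noise inside VM-2
    run kill "$noise_pid" 2>/dev/null || true  # fallback: stop the local ssh
    wait "$noise_pid" 2>/dev/null || true
}

main
```

By default the sketch only prints the commands it would run; setting
DRY_RUN=0 makes it execute them over ssh. The pkill step assumes the
noise is stress-ng and would need adjusting for a different workload.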
> - KernelShark analysis of CPU scheduling and phantom vCPU behavior

I hope you are having fun playing with KernelShark. Good observation
about the context switches between the two vCPUs on the same pCPUs.
However, observing phantom vCPUs also requires information about which
task was running on a vCPU when it got preempted on the host. In the
example of vCPU "CPU 1/KVM-463092" (the first vCPU), you should check
that

1. the vCPU belongs to the target VM-1, and
2. the vCPU was running a UA worker thread when the context switch
   happened on the host.

> - Problems encountered and their solutions

It is great that you upgraded to the latest version of trace-cmd and
your host-guest timestamps are now synchronized. trace-cmd agent makes
things much easier these days, but under the hood these things are a
bit complex. Feel free to check out the trace-cmd attach man page if
you're curious :).

> The report is attached as a PDF, along with the relevant screenshots used in
> the analysis.
> Kindly let me know if you would like me to include additional materials such
> as script logs or raw trace files in future submissions. I will provide
> updates on the next part of Experiment-0 soon.

The report is already quite detailed. I am looking forward to the next
part. Additionally, since you are now familiar with KernelShark and
scheduler traces, you might also want to check out schedgraph-tools
and perfetto.

Himadri
