Re: Submission of Experiment-0 Report – VM Oversubscription & Tracing

Himadri Chhaya-Shailesh via Gcc Wed, 11 Mar 2026 16:01:31 -0700

On Monday, March 9th, 2026 at 6:54 PM, Roy Fru <[email protected]> wrote:


> Good day Mentors,
> I hope this email finds you well

> I am writing to submit my report for Experiment-0: VM Oversubscription and 
> Host-Guest Tracing. The report includes:

Hi Roy,
Thanks for the detailed update. I have a couple of comments.

> - VM setup and vCPU oversubscription details

While your set-up correctly achieves an over-subscription ratio of 1:2, it also 
adds one-to-one pinning for the vCPUs on the host. This would have been fine if 
our goal were to dictate the vCPU placement decisions of the host scheduler. 
But we don't intend to do that. The phantom tracker simply observes the host 
scheduler's decisions. So please refrain from one-to-one pinning.
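
If you want a quick way to confirm that the vCPU threads are no longer pinned, 
you could look at their allowed-CPU lists on the host. The following is only a 
minimal sketch: show_vcpu_affinity and PROC_ROOT are names I made up for 
illustration, and it assumes the vCPU threads are named "CPU <n>/KVM", as they 
appear in your trace.

```shell
#!/bin/sh
# Sketch: print the allowed-CPU list of every vCPU thread of one QEMU process.
# show_vcpu_affinity and PROC_ROOT are hypothetical names used for illustration.
# Assumption: vCPU threads are named "CPU <n>/KVM".

# usage: show_vcpu_affinity QEMU_PID
show_vcpu_affinity() {
    proc_root=${PROC_ROOT:-/proc}
    for task in "$proc_root/$1"/task/*; do
        [ -r "$task/comm" ] || continue
        name=$(cat "$task/comm")
        case "$name" in
            CPU*KVM)
                # Cpus_allowed_list is the human-readable affinity mask.
                cpus=$(awk '/^Cpus_allowed_list:/ { print $2 }' "$task/status")
                printf '%s\t%s\t%s\n' "${task##*/}" "$name" "$cpus"
                ;;
        esac
    done
}
```

A pinned vCPU thread would typically show a single CPU in the last column, 
while an unpinned one should show the full set of CPUs available to the QEMU 
process.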

> - Simultaneous execution of NPB UA benchmark and stress-ng workload

Good choice to run stress-ng as the parallel workload.

Your script waits until both workloads finish inside their respective VMs. 
At the moment, we treat the workload inside VM-2 (stress-ng) as noise, so it 
is okay to terminate the script as soon as the UA benchmark finishes inside 
VM-1. On the other hand, to create a more hostile situation for UA, it is a 
good idea to ensure that the noise has already started running inside VM-2 
before UA begins its execution inside VM-1.
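
The ordering described above could be sketched roughly as follows. This is 
not my actual script, just an illustration: run_experiment, REMOTE, NOISE_CMD 
and UA_CMD are hypothetical names, the default commands are placeholders for 
whatever you really run, and the host names vng-vm1 and vng-vm2 are assumed to 
resolve over ssh.

```shell
#!/bin/sh
# Sketch: start the noise first, run UA once the noise is up, and stop as
# soon as UA finishes. All names and default commands below are assumptions.
REMOTE=${REMOTE:-ssh}
NOISE_CMD=${NOISE_CMD:-"stress-ng --cpu 0"}   # placeholder noise command
UA_CMD=${UA_CMD:-"./ua.A.x"}                  # placeholder path to the UA binary

run_experiment() {
    # Launch the noise inside VM-2 in the background (its output is discarded).
    $REMOTE vng-vm2 "$NOISE_CMD" </dev/null >/dev/null 2>&1 &
    noise_pid=$!

    # Wait until a stress-ng process is visible inside VM-2.
    until $REMOTE vng-vm2 "pgrep stress-ng" >/dev/null 2>&1; do
        sleep 1
    done

    # Run UA inside VM-1 in the foreground and remember its exit status.
    $REMOTE vng-vm1 "$UA_CMD"
    ua_status=$?

    # UA is done: stop the noise in the guest, then reap the local client.
    $REMOTE vng-vm2 "pkill stress-ng"
    kill "$noise_pid" 2>/dev/null
    wait "$noise_pid" 2>/dev/null
    return "$ua_status"
}
```

Calling run_experiment at the end of the script would then return UA's exit 
status, without waiting for the noise to finish on its own.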

> - Host and guest sched_switch tracing using trace-cmd

Your host-guest tracing set-up looks good. Instead of manually launching and 
killing trace-cmd, you can also consider including trace-cmd in your experiment 
script. For example, my script does the following:

ssh vng-vm1 trace-cmd agent -D
ssh vng-vm2 trace-cmd agent -D
trace-cmd record -e sched -v -e sched_stat_runtime \
    -A @4:823 --name vm1 -e printk:console -e sys_enter_sched_setaffinity \
        -e sys_enter_futex -e sched -v -e sched_stat_runtime \
    -A @5:823 --name vm2 -e sched -v -e sched_stat_runtime \
    ./exp.sh

And exp.sh simply launches the workloads inside both VMs over ssh and waits for 
completion, similar to your script.

> - KernelShark analysis of CPU scheduling and phantom vCPU behavior

I hope you are having fun playing with KernelShark.

Good observation about the context switches between the two vCPUs on the same 
pCPU. However, observing phantom vCPUs also requires information about which 
task was running on a vCPU when it got preempted on the host. In the example 
of the vCPU thread CPU 1/KVM-463092 (the first vCPU), you should check that 
1. the vCPU belongs to the target VM-1, and 2. the vCPU was running a UA worker 
thread when the context switch happened on the host.
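
As a starting point for the second check, you could first collect the host 
timestamps at which a given vCPU thread was switched out while still runnable, 
and then look up what the guest trace shows on that vCPU at those times. The 
sketch below is only an illustration: list_preemptions is a name I made up, 
and it assumes the trace text uses the kernel's own sched_switch format 
(prev_pid=... prev_state=...). The output of your trace-cmd version may be 
formatted differently, in which case the patterns need adjusting.

```shell
#!/bin/sh
# Sketch: read trace text on stdin and print the timestamps at which thread
# PID was switched out with prev_state=R (still runnable, i.e. preempted).
# list_preemptions is a hypothetical helper. Assumption: each line uses the
# kernel's sched_switch text format, with the timestamp immediately before
# the "sched_switch:" token.

# usage: list_preemptions PID < trace.txt
list_preemptions() {
    awk -v pid="$1" '
        / sched_switch: / && $0 ~ ("prev_pid=" pid " ") && / prev_state=R[+]? / {
            for (i = 2; i <= NF; i++) {
                if ($i == "sched_switch:") {
                    ts = $(i - 1)        # e.g. "1234.567890:"
                    sub(/:$/, "", ts)    # drop the trailing colon
                    print ts
                    break
                }
            }
        }'
}
```

For the vCPU above, the PID argument would be 463092. Switches where the 
thread blocked voluntarily (for example prev_state=S) are skipped on purpose, 
since only preemptions on the host matter here.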

> - Problems encountered and their solutions

It is great that you upgraded to the latest version of trace-cmd and your 
host-guest timestamps are now synchronized. trace-cmd agent makes things much 
easier these days, but under the hood these things are a bit complex. Feel 
free to check out the trace-cmd attach man page if you're curious :).

> The report is attached as a PDF, along with the relevant screenshots used in 
> the analysis.
> Kindly let me know if you would like me to include additional materials such 
> as script logs or raw trace files in future submissions. I will provide 
> updates on the next part of Experiment-0 soon.

The report is already quite detailed. I am looking forward to the next part.

Additionally, since you are already familiar with KernelShark and scheduler 
traces, you might also want to check out schedgraph-tools and perfetto.

Himadri
