On Tue, Jun 16, 2026 at 02:33:43PM +0200, David Hildenbrand (Arm) wrote:
> On 5/13/26 18:50, Gregory Price wrote:
> > 
> > The pull model remains available and is the default.
> 
> I don't quite see the big benefit here, really: either it's a timer in the
> hypervisor or a timer in the VM. A slow VM will, in either model, delay the
> update of stats.
> 
> If you need some "liveness detection", is virtio-balloon stats updates really
> the right mechanism?
> 
> I don't quite understand the "Latency-sensitive consumers" problem. If the VM is
> slow, it is slow and will mess with latency-sensitive consumers in either way?
>

"Latency-sensitive" here should probably be defined as "does not like
blocking operations".  This was prototyped in the context of
cloud-hypervisor [1], with an orchestrator trying to poll 1000 VMs on a
single machine for stats.

The poller couldn't determine the difference between "guest is slow" and
"guest is hung" and so had to block on the operation (I didn't see how
to solve this async).

Similarly, having a single thread just round-robin poll the VMs is,
bluntly, inefficient and provides poor guarantees about the freshness
of the stats (a couple of slow guests can cause other guests' stats to
become stale for tens of seconds).
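
As a rough back-of-the-envelope sketch of that staleness (the per-guest
latencies below are made-up illustrative numbers, not measurements, and
the helper name is hypothetical): with a blocking round-robin poller,
one full pass takes the sum of every guest's response time, so two slow
guests set the refresh interval for everyone.

```python
def round_robin_cycle_seconds(latencies):
    """Duration of one blocking pass: the poller waits on each guest in turn."""
    return sum(latencies)


# Assumed numbers for illustration only:
# 1000 guests, 998 answer in 1 ms, 2 slow guests take 10 s each.
latencies = [0.001] * 998 + [10.0] * 2

cycle = round_robin_cycle_seconds(latencies)
print(f"one full pass takes {cycle:.3f} s")
```

Under these assumed numbers every guest's stats, including the 998
fast ones, are refreshed only about once every 21 seconds.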

Definitely an RFC here because I'm not sure if I was missing something
that might help me solve the problem.

~Gregory

[1] https://github.com/cloud-hypervisor/cloud-hypervisor
