Hello Peter,

On Thursday, May 08, 2025 22:23 CEST, Peter Xu <pet...@redhat.com> wrote:

> > The scenarios where zeroing is not required (incoming migration and
> > -loadvm) share a common characteristic: the VM has not yet run in the
> > current QEMU process.
> > To avoid splitting read_ramblock_mapped_ram(), could we implement
> > a check to determine if the VM has ever run and decide whether to zero
> > the memory based on that? Maybe using RunState?
> > 
> > Then we can add something like this to read_ramblock_mapped_ram()
> > ...
> > clear_bit_idx = 0;
> > for (...) {
> >     // Zero pages
> >     if (guest_has_ever_run()) {
> >         unread = TARGET_PAGE_SIZE * (set_bit_idx - clear_bit_idx);
> >         offset = clear_bit_idx << TARGET_PAGE_BITS;
> >         host = host_from_ram_block_offset(block, offset);
> >         if (!host) {...}
> >         ram_handle_zero(host, unread);
> >     }
> >     // Non-zero pages
> >     clear_bit_idx = find_next_zero_bit(bitmap, num_pages, set_bit_idx + 1);
> > ...
> > (Plus trailing zero pages handling)
> 
> [...]
> 
> > > >> > In a nutshell, I'm using dirty page tracking to load from the 
> > > >> > snapshot
> > > >> > only the pages that have been dirtied between two loadvm;
> > > >> > mapped-ram is required to seek and read only the dirtied pages.
> 
> I may not have the full picture here, please bear with me if so.
> 
> It looks to me the major feature here you're proposing is like a group of
> snapshots in sequence, while only the 1st snapshot contains full RAM data,
> the rest only contains what were dirtied?
>
> From what I can tell, the interface will be completely different from
> snapshots then - firstly the VM will be running when taking (at least the
> 2nd+) snapshots, meanwhile there will be an extra phase after normal save
> snapshot, which is "keep saving snapshots", during the window the user is
> open to snapshot at any time based on the 1st snapshot.  I'm curious what's
> the interfacing for the feature.  It seems we need a separate command
> saying that "finished storing the current group of snapshots", which should
> stop the dirty tracking.

My goal is to speed up recurrent snapshot restores of short-lived VMs.
In my use case I create one snapshot and then restore it thousands of
times, letting the VM run for just a few function executions, for
example.
Still, you are right that this is a two-step process.
What I added (not in this patch, but in a downstream fork at the moment)
is a couple of HMP commands:
- loadvm_for_hotreload: in a nutshell, a loadvm that also starts dirty
tracking
- hotreload: again a loadvm, but one that takes advantage of the dirty
log to selectively restore only the dirtied pages
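
Roughly, the intended flow at the monitor looks like this (assuming both
commands take a snapshot tag like loadvm does; the tag name is just an
example):

(qemu) savevm base
(qemu) loadvm_for_hotreload base    <- loadvm + start dirty tracking
    ... let the guest run a short workload ...
(qemu) hotreload base               <- restore only the pages dirtied since
    ... run again, hotreload again, thousands of times ...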

> I'm also curious what is the use case, and I also wonder if "whether we
> could avoid memset(0) on a zero page" is anything important here - maybe
> you could start with simple (which is to always memset() for a zero page
> when a page is dirtied)?

My use case is, you guessed it, fuzz testing, aka fuzzing.
About the zeroing, you are right: optimizing it is not a huge concern
for my use case, and doing what you suggest is perfectly fine.
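
For the hotreload path, that simple approach boils down to something like
the following sketch (plain C with made-up names, not the actual patch;
error handling omitted): for every page flagged by the dirty log, either
re-read it from the mapped-ram file or, if the snapshot marks it as zero,
just memset() it.

#include <stdbool.h>
#include <string.h>
#include <unistd.h>

/* Restore one page flagged by the dirty log since the last (hot)reload.
 * present_in_file is this page's bit from the mapped-ram bitmap: if it
 * is clear, the snapshot says the page is zero, so we memset() it
 * unconditionally instead of checking whether it already happens to be
 * zero. */
static void restore_dirtied_page(int fd, void *host_page, size_t page_size,
                                 off_t file_offset, bool present_in_file)
{
    if (present_in_file) {
        /* mapped-ram keeps a fixed file offset per page, so we can
         * seek to and read just this page. */
        (void)pread(fd, host_page, page_size, file_offset);
    } else {
        memset(host_page, 0, page_size);
    }
}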

Just to be clear, what I describe above is not the content of this patch.
This patch only aims to take a first step towards supporting the
mapped-ram feature for savevm/loadvm snapshots, which is a prerequisite
for my hotreload feature.
mapped-ram is currently supported only in (file) migration.
What's missing from this patch for it to work completely is the handling
of zero pages. Unlike with migration, with snapshots the pages are not
all zero prior to the restore, so they must be handled explicitly.
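
For reference, the missing part amounts to something like this standalone
sketch (plain C, not QEMU code; the page size, bitmap layout and names are
only illustrative): after reading the non-zero pages, walk the per-page
bitmap and memset() every page whose bit is clear, since the destination
RAM may still hold data from the previous run.

#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE     4096
#define BITS_PER_LONG (sizeof(unsigned long) * 8)

/* Zero every page whose bit is clear in the mapped-ram style bitmap.
 * ram points at the start of the block being restored, bitmap has one
 * bit per page, num_pages is the number of pages in the block. */
static void zero_absent_pages(uint8_t *ram, const unsigned long *bitmap,
                              size_t num_pages)
{
    for (size_t i = 0; i < num_pages; i++) {
        bool set = (bitmap[i / BITS_PER_LONG] >> (i % BITS_PER_LONG)) & 1UL;
        if (!set) {
            /* The page is not present in the snapshot file, so it must
             * be zero after the restore; clear whatever was there. */
            memset(ram + i * PAGE_SIZE, 0, PAGE_SIZE);
        }
    }
}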

I hope I summarized this in an understandable way; if not, I'll be happy
to clarify further :)
Thanks for the feedback!
Best
Marco

