On 07/25/14 17:48, Igor Mammedov wrote: > Add API to mark memory region as extend-able on migration, > to allow migration code to load smaller RAMBlock into > a bigger one on destination QEMU instance. > > This will allow to fix broken migration from QEMU 1.7/2.0 to > QEMU 2.1 due to ACPI tables size changes across 1.7/2.0/2.1 > versions by marking ACPI tables ROM blob as extend-able. > So that smaller tables from previos version could be always
("previous") > migrated to a bigger rom blob on new version. > > Credits-for-idea: Michael S. Tsirkin <m...@redhat.com> > Signed-off-by: Igor Mammedov <imamm...@redhat.com> > --- > arch_init.c | 19 ++++++++++++++----- > exec.c | 15 +++++++++------ > include/exec/cpu-all.h | 15 ++++++++++++++- > include/exec/memory.h | 10 ++++++++++ > include/exec/ram_addr.h | 1 + > memory.c | 5 +++++ > 6 files changed, 53 insertions(+), 12 deletions(-) (Please pass the -O flag to git-format-patch, with an order file that moves header files (*.h) to the front. Header hunks should lead in a patch. Thanks.) > diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h > index 6593be1..248c075 100644 > --- a/include/exec/ram_addr.h > +++ b/include/exec/ram_addr.h > @@ -33,6 +33,7 @@ void *qemu_get_ram_block_host_ptr(ram_addr_t addr); > void *qemu_get_ram_ptr(ram_addr_t addr); > void qemu_ram_free(ram_addr_t addr); > void qemu_ram_free_from_ptr(ram_addr_t addr); > +void qemu_ram_set_extendable_on_migration(ram_addr_t addr); > > static inline bool cpu_physical_memory_get_dirty(ram_addr_t start, > ram_addr_t length, (Ugh. The declarations (prototypes) of qemu_ram_*() functions are scattered between "ram_addr.h" and "cpu-common.h" (both under include/exec). I can't see why that is a good thing (the function definitions all seem to be in exec.c).) > diff --git a/exec.c b/exec.c > index 765bd94..755b1d3 100644 > --- a/exec.c > +++ b/exec.c > @@ -69,12 +69,6 @@ AddressSpace address_space_memory; > MemoryRegion io_mem_rom, io_mem_notdirty; > static MemoryRegion io_mem_unassigned; > > -/* RAM is pre-allocated and passed into qemu_ram_alloc_from_ptr */ > -#define RAM_PREALLOC (1 << 0) > - > -/* RAM is mmap-ed with MAP_SHARED */ > -#define RAM_SHARED (1 << 1) > - > #endif I'm not sure these macros should be replaced with an enum; the new flag could be introduced just following the existent pattern. > > struct CPUTailQ cpus = QTAILQ_HEAD_INITIALIZER(cpus); > @@ -1214,6 +1208,15 @@ void qemu_ram_unset_idstr(ram_addr_t addr) > } > } > > +void qemu_ram_set_extendable_on_migration(ram_addr_t addr) > +{ > + RAMBlock *block = find_ram_block(addr); > + > + if (block) { > + block->flags |= RAM_EXTENDABLE_ON_MIGRATE; > + } > +} > + Let's see how some oher qemu_ram_*() functions search for their blocks. We can form two classes; the first class takes a ram_addr_t into some RAMBlock, the second class only accepts/finds a ram_addr_t only at the beginning of some RAMBlock. (1) containment: qemu_get_ram_fd(): calls qemu_get_ram_block() qemu_get_ram_block_host_ptr: calls qemu_get_ram_block() qemu_get_ram_ptr(): calls qemu_get_ram_block() qemu_ram_remap(): needs containment (2) exact block start match: qemu_ram_free(): needs (addr == block->offset) qemu_ram_free_from_ptr(): needs (addr == block->offset) qemu_ram_set_idstr(): calls find_ram_block() qemu_ram_unset_idstr(): calls find_ram_block() Your function will belong to the 2nd class. The definition is close to that of qemu_ram_unset_idstr(), another class member, OK. The declaration is close to qemu_ram_free_from_ptr(), which is another class member. OK. I'd throw in an assert(), rather than an "if", just like qemu_ram_set_idstr() does. > static int memory_try_enable_merging(void *addr, size_t len) > { > if (!qemu_opt_get_bool(qemu_get_machine_opts(), "mem-merge", true)) { Let's see the caller: > diff --git a/memory.c b/memory.c > index 64d7176..744c746 100644 > --- a/memory.c > +++ b/memory.c > @@ -1791,6 +1791,11 @@ bool memory_region_is_mapped(MemoryRegion *mr) > return mr->container ? true : false; > } > > +void memory_region_permit_extendable_migration(MemoryRegion *mr) > +{ > + qemu_ram_set_extendable_on_migration(mr->ram_addr); > +} > + > MemoryRegionSection memory_region_find(MemoryRegion *mr, > hwaddr addr, uint64_t size) > { > Looks matching, and consistent with other callers of the 2nd family functions. OK. > diff --git a/include/exec/memory.h b/include/exec/memory.h > index e2c8e3e..6c03f70 100644 > --- a/include/exec/memory.h > +++ b/include/exec/memory.h > @@ -894,6 +894,16 @@ bool memory_region_present(MemoryRegion *container, > hwaddr addr); > bool memory_region_is_mapped(MemoryRegion *mr); > > /** > + * memory_region_permit_extendable_migration: allows to mark Please refer to commit 9d85d557326df69fe0570e7de84b2f57e133c7e7 Author: Michael Tokarev <m...@tls.msk.ru> Date: Mon Apr 7 13:34:58 2014 +0400 doc: grammify "allows to" > + * #MemoryRegion as extendable on migrartion, which permits to "migrartion": garbled spelling "permits to": I guess it's similar to "allows to" > + * load source memory block of smaller size than destination memory block > + * at migration time > + * > + * @mr: a #MemoryRegion which should be marked as extendable on migration > + */ ("whose #RAMBlock should be marked as"...) > +void memory_region_permit_extendable_migration(MemoryRegion *mr); > + > +/** > * memory_region_find: translate an address/size relative to a > * MemoryRegion into a #MemoryRegionSection. > * Then, as I said, > diff --git a/include/exec/cpu-all.h b/include/exec/cpu-all.h > index f91581f..2b9aea3 100644 > --- a/include/exec/cpu-all.h > +++ b/include/exec/cpu-all.h > @@ -296,13 +296,26 @@ CPUArchState *cpu_copy(CPUArchState *env); > #if !defined(CONFIG_USER_ONLY) > > /* memory API */ > +typedef enum { > + /* RAM is pre-allocated and passed into qemu_ram_alloc_from_ptr */ > + RAM_PREALLOC = 1 << 0, > + /* RAM is mmap-ed with MAP_SHARED */ > + RAM_SHARED = 1 << 1, > + /* > + * Allow at migration time to load RAMBlock of smaller size than > + * destination RAMBlock is > + */ > + RAM_EXTENDABLE_ON_MIGRATE = 1 << 2, > + RAM_FLAGS_END = 1 << 31 > +} RAMBlockFlags; > + ... I kind of disagree with the enumification of these flags. There's another use of "allow [] to". RAM_FLAGS_END is wrong. It shifts a signed 32-bit integer with value 1 to the left by 31 bits, shifting the set 0th bit into the sign bit. Undefined behavior, Peter had a number of patches exterminating such shifts (IIRC). You need to say 1U << 31. However, that won't work either, because enumeration constants have type "int" (and 1U << 31 is not representable as "int"). The C99 spec says this (6.4.4.3p2), and then separately it also says "The expression that defines the value of an enumeration constant shall be an integer constant expression that has a value representable as an int." (6.7.2.2p2). So, I'd say just drop the enumification, and if you still want a RAM_FLAGS_END macro, define it as (1U << 31). > > typedef struct RAMBlock { > struct MemoryRegion *mr; > uint8_t *host; > ram_addr_t offset; > ram_addr_t length; > - uint32_t flags; > + RAMBlockFlags flags; > char idstr[256]; > /* Reads can take either the iothread or the ramlist lock. > * Writes must take both locks. uint32_t is just fine here. IMHO. > > diff --git a/arch_init.c b/arch_init.c > index 8ddaf35..812f8b5 100644 > --- a/arch_init.c > +++ b/arch_init.c > @@ -1071,11 +1071,20 @@ static int ram_load(QEMUFile *f, void *opaque, int > version_id) > > QTAILQ_FOREACH(block, &ram_list.blocks, next) { > if (!strncmp(id, block->idstr, sizeof(id))) { > - if (block->length != length) { > - error_report("Length mismatch: %s: " RAM_ADDR_FMT > - " in != " RAM_ADDR_FMT, id, length, > - block->length); > - ret = -EINVAL; > + if (block->flags & RAM_EXTENDABLE_ON_MIGRATE) { > + if (block->length < length) { > + error_report("Length too big: %s: > "RAM_ADDR_FMT missing space between '"' and RAM_ADDR_FMT (for readability) > + " > " RAM_ADDR_FMT, id, length, you dropped the important word "in" here (cf. "in !=" above). That word explains it to the user that the incoming size (stored in "length") is too big. > + block->length); > + ret = -EINVAL; > + } > + } else { > + if (block->length != length) { > + error_report("Length mismatch: %s: > "RAM_ADDR_FMT missing space between '"' and RAM_ADDR_FMT (for readability) > + " in != " RAM_ADDR_FMT, id, > length, > + block->length); > + ret = -EINVAL; > + } > } > break; > } Can you please explain where it is ensured that the non-overwritten part of a greater RAMBlock is zero-filled? As far as I can see: main() [vl.c] qemu_run_machine_init_done_notifiers() [vl.c] notifier_list_notify() [util/notify.c] pc_guest_info_machine_done() [hw/i386/pc.c] acpi_setup() [hw/i386/acpi-build.c] acpi_add_rom_blob() [hw/i386/acpi-build.c] rom_add_blob() [hw/core/loader.c] rom_set_mr() [hw/core/loader.c] memory_region_init_ram() [memory.c] qemu_ram_alloc() [exec.c] memcpy() <-------------------------- populates it fw_cfg_add_file_callback() [hw/nvram/fw_cfg.c] fw_cfg_add_bytes_read_callback() [hw/nvram/fw_cfg.c] <----- stores "len" qemu_start_incoming_migration() [migration.c] I currently cannot see where the trailing portion of the bigger RAMBlock would be overwritten with zeroes. I also don't see why that portion of the RAMBlock / memory region would not be exposed to the guest over fw_cfg. The size of the fw_cfg entry is stored while the target (incoming) qemu process is setting up, in fw_cfg_add_bytes_read_callback(), and that "len" value will reflect the greater length when reading too. See fw_cfg_read(). If that portion contains nonzero garbage (leftovers from the original ACPI stuff, generated on the incoming host), then OVMF's ACPI payload parser will choke on it (if you migrate such a VM before OVMF does its ACPI parsing). It can handle trailing zeros, but not trailing garbage. Summary, roughly in order of growing importance: - various typos and grammar errors, not too important to fix, - please consider assert()ing the success of find_ram_block() in qemu_ram_set_extendable_on_migration(), - please fix whitespace errors near RAM_ADDR_FMT, - please keep the "in" word in the new error message there, - please do something about RAM_FLAGS_END; preferably, keep the macros and drop the enum, - please prove (or enforce) that the trailing portion of oversized RAMBlocks are filled with zeroes (a memset() might suffice near the new (block->length < length) check, in ram_load()). Thanks, Laszlo