On Thu, Jun 12, 2014 at 10:55:40AM +0200, Alexander Graf wrote: > > On 12.06.14 09:54, Michael S. Tsirkin wrote: > >On Thu, Jun 12, 2014 at 09:43:51AM +0200, Greg Kurz wrote: > >>On Thu, 29 May 2014 12:16:26 +0200 > >>Paolo Bonzini <pbonz...@redhat.com> wrote: > >>>Il 29/05/2014 11:12, Greg Kurz ha scritto: > >>>>int virtio_load(VirtIODevice *vdev, QEMUFile *f) > >>>>{ > >>>>[...] > >>>> nheads = vring_avail_idx(&vdev->vq[i]) - > >>>> vdev->vq[i].last_avail_idx; > >>>> ^^^^^^^^^^^ > >>>> /* Check it isn't doing very strange things with descriptor > >>>> numbers. */ > >>>> if (nheads > vdev->vq[i].vring.num) { > >>>>[...] > >>>>} > >>>> > >>>>and > >>>> > >>>>static int virtio_serial_load(QEMUFile *f, void *opaque, int version_id) > >>>>{ > >>>>[...] > >>>> /* The config space */ > >>>> qemu_get_be16s(f, &s->config.cols); > >>>> qemu_get_be16s(f, &s->config.rows); > >>>> > >>>> qemu_get_be32s(f, &max_nr_ports); > >>>> tswap32s(&max_nr_ports); > >>>> ^^^^^^ > >>>> if (max_nr_ports > tswap32(s->config.max_nr_ports)) { > >>>>[...] > >>>>} > >>>> > >>>>If we stream subsections after the device descriptor as it is done > >>>>in VMState, these two will break because the device endian is stale. > >>>> > >>>>The first one can be easily dealt with: just defer the sanity check > >>>>to a post_load function. > >>>Good, we're lucky here. > >>> > >>>>The second is a bit more tricky: the > >>>>virtio serial migrates its config space (target endian) and the > >>>>active ports bitmap. The load code assumes max_nr_ports from the > >>>>config space tells the size of the ports bitmap... that means the > >>>>virtio migration protocol is also contaminated by target endianness. :-\ > >>>Ouch. > >>> > >>>I guess we could break migration in the case of host endianness != > >>>target endianness, like this: > >>> > >>> /* These three used to be fetched in target endianness and then > >>> * stored as big endian. It ended up as little endian if host and > >>> * target endianness doesn't match. > >>> * > >>> * Starting with qemu 2.1, we always store as big endian. The > >>> * version wasn't bumped to avoid breaking backwards compatibility. > >>> * We check the validity of max_nr_ports, and the incorrect- > >>> * endianness max_nr_ports will be huge, which will abort migration > >>> * anyway. > >>> */ > >>> uint16_t cols = tswap16(s->config.cols); > >>> uint16_t rows = tswap16(s->config.rows); > >>> uint32_t max_nr_ports = tswap32(s->config.max_nr_ports); > >>> > >>> qemu_put_be16s(f, &cols); > >>> qemu_put_be16s(f, &rows); > >>> qemu_put_be32s(f, &max_nr_ports); > >>> > >>>... > >>> > >>> uint16_t cols, rows; > >>> > >>> qemu_get_be16s(f, &cols); > >>> qemu_get_be16s(f, &rows); > >>> qemu_get_be32s(f, &max_nr_ports); > >>> > >>> /* Convert back to target endianness when storing into the config > >>> * space. > >>> */ > >>Paolo, > >> > >>The patch set to support endian changing targets adds a device_endian > >>field to the VirtIODevice structure to be used instead of the default > >>target endianness as it happens with tswap() macros. It also introduces > >>virtio_tswap() helpers for this purpose, but they can only be used when > >>the device_endian field has been restored... in a subsection after the > >>device descriptor... :-\ > >Store it earlier then, using plain put/get. > >You can still add a section conditionally to cause > >a cleaner failure in broken cross-version scenarios. > > > >>If the scenario is ppc64le-on-ppc64: tswap() macros don't do anything > >>and we cannot convert back to LE... > >> > >>> s->config.cols = tswap16(cols); > >>> s->config.rows = tswap16(rows); > >>Since cols and rows are not involved in the protocol, we can safely > >>defer the conversion to post load. > >> > >>> if (max_nr_ports > tswap32(s->config.max_nr_ports) { > >>> ... > >>> } > >>> > >>Since we know that 0 < max_nr_ports < 32, is it acceptable to guess > >>the correct endianness with a heuristic ? > >> > >>if (max_nr_ports > tswap32(s->config.max_nr_ports)) { > >> max_nr_ports = bswap32(max_nr_ports); > >>} > >> > >>if (max_nr_ports > tswap32(s->config.max_nr_ports)) { > >> return -EINVAL; > >>} > >> > >>>>In the case the answer for above is "legacy virtio really sucks" then, > >>>>is it acceptable to not honor bug-compatibility with older versions and > >>>>fix the code ? :) > >>>As long as the common cases don't break, yes. The question is what are > >>>the common cases. Here I think the only non-obscure case that could > >>>break is x86-on-PPC, and it's not that common. > >>> > >>>Paolo > >>> > >>Thanks. > >One starts doubting whether all this hackery is worth it. virtio 1.0 > >should be out real soon now, it makes everything LE so the problem goes > >away. It's not like PPC LE is so popular that we must support old drivers > >at all costs. Won't time be better spent backporting virtio 1.0 drivers? > > There are already released and working Linux distributions (Ubuntu, > openSUSE, maybe others) that don't have virtio 1.0 drivers. Putting our > heads into the sand is not an option ;). > > > Alex
I don't get it. Does virtio work there at the moment? -- MST