On 07/26/2011 03:13 PM, Juan Quintela wrote:
Anthony Liguori <anth...@codemonkey.ws> wrote:
On 07/26/2011 07:07 AM, Juan Quintela wrote:
- Be able to describe those different features/versions. This is not the
difficult part; it can be subsections, optional fields, whatever.
The difficult part is _knowing_ which fields need to be in
each version. That again depends on the device, not migration.
- Be able to do forward/backward compatibility (which, without
communication between both sides, is basically impossible).
Hrm, I'm not sure I agree with these conclusions.
Management tools should do their best job to create two compatible
device models.
How? The only part that can have enough information is the "new" part
(either source or destination). And we are being very careful about not
allowing any communication/setting of what is on the other side.
I'll explain below.
- Send things on the wire (really this is the easy part; we can play
with it touching only migration functions).
We also need a way to future proof ourselves.
We have been very bad at this. Automatic checking is the only way that
I can think of.
I don't know what you mean by automatic checking.
We should have unit tests to check that (at least) the obvious migrations work.
Oh, 100% agree. In fact, I've posted patches :) But I wasn't happy
with the level of completeness of those tests and want to write better
tests which is part of my motivation in visiting this topic.
We have two things here: device level & protocol level.
Device level: very late to set anything.
Protocol level: we can set things here, but notice that only a few
things can be set here.
Once we have a protocol level feature bit, we can add device level
feature bits as a new feature.
This doesn't help; migration time is far too late to configure a device. We
need to configure it at creation time. It makes no sense to try to
migrate device foo with 4 bars and at migration time try to "push" it
into only 2 bars. Having it created with 2 bars in the first place is the
only sane solution.
I misunderstood what you were suggesting. For guest visible device
features, they must be configured at creation time. I'm in full agreement.
It's 100% mechanical and makes absolutely no logic change. It works
equally well with legacy and VMstate migration handlers.
3) Add a Visitor class that operates on QEMUFile.
At this stage, we can migrate to data structures. That means we can
migrate to QEMUFile, QObjects, or JSON. We could change the protocol
at this stage to something that was still binary but had section sizes
and things of that nature.
That was the whole point of vmstate.
The problem with vmstate is that it's an all or nothing thing and the
conversion isn't programmatic.
This is the whole point. We are being declarative, and we create a
mechanism for how to visit all the nodes. What we do in each node is not
VMState business. VMState only defines the nodes, and which ones belong
to each version.
Right. Thinking more after the call, I think this may be a better way
to explain what I'm proposing.
With VMState, we provide a declarative description of each device's
state. Because it's declarative, some things end up being tough to
describe like variable sized arrays and complex data structures. You've
worked through a lot of these, but this is fundamentally what makes this
approach difficult to complete.
At the end of VMState conversion, we have a declaration of how to read
the current state of the device tree. We can write a function that
takes all of the VMState descriptions and builds something from those
descriptions.
But right now, what we actually have is a routine that takes a VMState
data description, and then calls a marshalling function. In essence,
the data description gets interpreted to an imperative serialization
mechanism.
I'm suggesting that instead of trying to eliminate the imperativeness
(which will be hard since we have a lot of hooks in various places), we
should embrace the imperativeness. Instead of marshalling to a
QEMUFile, we marshal to a Visitor, Visitor being an abstraction that can
marshal to arbitrary formats/objects.
So we never actually walk the VMState tables to do anything. The
unconverted, purely imperative routines we simply change to marshal to
a Visitor instead of a QEMUFile.
What this gives us is a way to achieve the same level of abstraction
that VMState would give us for almost no work at all. That
fundamentally lets us take the next step in "fixing" migration.
device with some features -> migration -> device with other features,
and it works. This would mean that "migration" does magic, and that is
never going to work.
Until now, this kind of worked because we only supported migration from
old -> new, or the same version. Migration from old -> new can never
have new features. But for new -> old to work, we need a way to
disable the new features. That is completely independent of migration.
At startup time, not dynamically. And we have this; that's what -M pc-X
is about.
It doesn't work.
Here's how I think we can fix this.
We have two concepts today, the machine and devices. Not all devices
can be created by an end user as some are implied by the machine (this
is qdev.no_user). Since not everything is directly created by the user,
there is no easy way to basically do a dump of the device model, then
feed that back into QEMU for recreation.
We do compatibility by using global properties for the different
machines but this is a tough proposition to get right as the granularity
is pretty poor. I can't change a property of a particular device
created by the machine without changing it universally.
With an improved qdev (which I think is QOM, but for now, just ignore
that), we would be able to do the following:
1) create a device, that creates other devices as children of itself
*without* those children being under a bus hierarchy.
2) eliminate the notion of machines altogether, and instead replace
machines with a chipset, SoC device, or whatever logical device
that basically equates to what the machine logic does today.
The pc machine code is basically the i440fx. You could take everything
that it does, call it an i440fx object, and make "machine" properties
properties of the i440fx. That makes what we think of as machine
creation identical to device creation.
3) eliminate anonymous devices, implicit bus assignment, and all of the
other features of qdev that prevent the device model from being
described in a stable fashion.
'-device-add e1000,id=foo' is ambiguous, as is '-net nic,model=virtio
-net nic,model=virtio'.
The rules about how we find the bus and location of e1000 in the device
model today are arbitrary and difficult to introspect. The result is
that what you use to create a device model becomes wildly different than
what you would use to recreate a device model. There's really no way to
programmatically discover this today either. qdev doesn't return the
property's value at construction time but rather the current value.
That's not necessarily the value you want to use to recreate the device.
That's not to say that we shouldn't have friendly interfaces that do
automatic PCI bus assignment. But that has to live a level
higher up than where it lives today in order to create stable device trees.
The rules in QOM are meant to solve these problems. They basically are:
a) All devices must have a unique name at the time of creation making
stable device addressing guaranteed.
b) All relationships between devices are expressed as connections
between plugs and sockets. There are no exceptions here. The
implication is that you never need to use code to recreate a device
model, you can always dump the device model and recreate it via QMP
commands.
c) All device properties are settable after creation time. This might
not seem like a big deal, but in order to support composition, the act
of instantiating a device such as the PIIX, which creates more devices
like a UART, requires that you set the UART's construction properties
after creation of the PIIX. Without an explicit "realize" state
where construction properties have been set, this problem is incredibly
difficult to solve.
qdev cannot satisfy these requirements as it sits today. Maybe there's
a way to incrementally evolve qdev into QOM, I haven't really thought it
through yet.
But the end goal is pretty clear. We should be able to do (qemu)
dump_device_model > foo.cfg in the source and then (qemu)
import_device_model foo.cfg in the destination right before the final
stage of migration. And this should be part of the migration protocol.
This would make migration with hotplug work, along with scores of other
things. This is another reason why a unified model makes sense, just as
you want to dump the device tree, you want to be able to enumerate the
backends to make sure that identically named backends exist on the
destination. Doing that in a single operation is a lot easier than
doing it 10 different ways.
Regards,
Anthony Liguori