Re: device compatibility interface for live migration with assigned devices

2020-09-10 Thread Sean Mooney
On Thu, 2020-09-10 at 14:38 +0200, Cornelia Huck wrote:
> On Wed, 9 Sep 2020 10:13:09 +0800
> Yan Zhao  wrote:
> 
> > > > still, I'd like to put it more explicitly to make sure it's not 
> > > > missed:
> > > > the reason we want to specify compatible_type as a trait and check
> > > > whether target compatible_type is the superset of source
> > > > compatible_type is for the consideration of backward compatibility.
> > > > e.g.
> > > > an old generation device may have a mdev type xxx-v4-yyy, while a newer
> > > > generation  device may be of mdev type xxx-v5-yyy.
> > > > with the compatible_type traits, the old generation device is still
> > > > able to be regarded as compatible to newer generation device even their
> > > > mdev types are not equal.  
> > > 
> > > If you want to support migration from v4 to v5, can't the (presumably
> > > newer) driver that supports v5 simply register the v4 type as well, so
> > > that the mdev can be created as v4? (Just like QEMU versioned machine
> > > types work.)  
> > 
> > yes, it should work in some conditions.
> > but it may not be that good in some cases when v5 and v4 in the name string
> > of mdev type identify hardware generation (e.g. v4 for gen8, and v5 for
> > gen9)
> > 
> > e.g.
> > (1). when src mdev type is v4 and target mdev type is v5 as
> > software does not support it initially, and v4 and v5 identify hardware
> > differences.
> 
> My first hunch here is: Don't introduce types that may be compatible
> later. Either make them compatible, or make them distinct by design,
> and possibly add a different, compatible type later.
> 
> > then after software upgrade, v5 is now compatible to v4, should the
> > software now downgrade mdev type from v5 to v4?
> > not sure if moving hardware generation info into a separate attribute
> > from mdev type name is better. e.g. remove v4, v5 in mdev type, while use
> > compatible_pci_ids to identify compatibility.
> 
> If the generations are compatible, don't mention it in the mdev type.
> If they aren't, use distinct types, so that management software doesn't
> have to guess. At least that would be my naive approach here.
Yep, that is what I would prefer to see too.
> 
> > 
> > (2) name string of mdev type is composed by "driver_name + type_name".
> > in some devices, e.g. qat, different generations of devices are binding to
> > drivers of different names, e.g. "qat-v4", "qat-v5".
> > then though type_name is equal, mdev type is not equal. e.g.
> > "qat-v4-type1", "qat-v5-type1".
> 
> I guess that shows a shortcoming of that "driver_name + type_name"
> approach? Or maybe I'm just confused.
Yes, I really don't like having the version in the mdev type name.
I would strongly prefer just qat-type-1, where qat is just there as a way of
namespacing.
That said, symmetric-crypto, asymmetric-crypto and compression would be better
names than type-1, type-2 and type-3 if that is what they would end up mapping
to, e.g. qat-compression or qat-aes is a much better name than type-1.
Higher layers of software are unlikely to parse the mdev names, but as a human
looking at them it's much easier to understand if the names are meaningful.
I think the qat prefix is important, however, to make sure that your mdev types
don't collide with other vendors' mdev types. So I would encourage all vendors
to prefix their mdev types with either the device name or the vendor.
> 



Re: device compatibility interface for live migration with assigned devices

2020-08-28 Thread Sean Mooney
On Fri, 2020-08-28 at 15:47 +0200, Cornelia Huck wrote:
> On Wed, 26 Aug 2020 14:41:17 +0800
> Yan Zhao  wrote:
> 
> > previously, we want to regard the two mdevs created with dsa-1dwq x 30 and
> > dsa-2dwq x 15 as compatible, because the two mdevs consist equal resources.
> > 
> > But, as it's a burden to upper layer, we agree that if this condition
> > happens, we still treat the two as incompatible.
> > 
> > To fix it, either the driver should expose dsa-1dwq only, or the target
> > dsa-2dwq needs to be destroyed and reallocated via dsa-1dwq x 30.
> 
> AFAIU, these are mdev types, aren't they? So, basically, any management
> software needs to take care to use the matching mdev type on the target
> system for device creation?

Or just do the simple thing and use the same mdev type on the source and dest.
Matching mdev types is not necessarily trivial. We could do that, but we would
have to do it in Python rather than SQL, so it would be slower, at least today.

We don't currently have the ability to say the resource provider must have one
of a set of traits, just that it must have a specific trait. This is a feature
we have discussed a couple of times and delayed until we really, really need
it, but it's not out of the question that we could add it for this use case.
I suspect, however, we would do exact match first and explore this later, after
the initial mdev migration works.
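
To make the difference concrete, here is a rough sketch of the two kinds of
check, written as plain Python set operations. It is not how placement
implements anything, and the trait names are made up for illustration only.

```python
def has_required_traits(provider_traits, required):
    """Exact-match style: every required trait must be present."""
    return set(required) <= set(provider_traits)


def has_any_of_traits(provider_traits, any_of):
    """Any-of style: at least one trait from the set must be present."""
    return bool(set(any_of) & set(provider_traits))


# Hypothetical trait names, not real placement traits.
target = {"CUSTOM_MDEV_DSA_1DWQ"}
print(has_required_traits(target, {"CUSTOM_MDEV_DSA_1DWQ"}))
print(has_any_of_traits(target, {"CUSTOM_MDEV_DSA_1DWQ", "CUSTOM_MDEV_DSA_2DWQ"}))
```

The first check is what we can express today; the second is the "one of this
set of traits" query we have delayed.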

By the way, I was looking at some vdpa-related material today and noticed vdpa
devices are no longer using mdevs and now use a vhost chardev, so I guess we
will need a completely separate mechanism for vdpa vs mdev migration as a
result. That is rather unfortunate, but I guess that is life.
> 



Re: device compatibility interface for live migration with assigned devices

2020-08-20 Thread Sean Mooney
On Thu, 2020-08-20 at 14:27 +0800, Yan Zhao wrote:
> On Thu, Aug 20, 2020 at 06:16:28AM +0100, Sean Mooney wrote:
> > On Thu, 2020-08-20 at 12:01 +0800, Yan Zhao wrote:
> > > On Thu, Aug 20, 2020 at 02:29:07AM +0100, Sean Mooney wrote:
> > > > On Thu, 2020-08-20 at 08:39 +0800, Yan Zhao wrote:
> > > > > On Tue, Aug 18, 2020 at 11:36:52AM +0200, Cornelia Huck wrote:
> > > > > > On Tue, 18 Aug 2020 10:16:28 +0100
> > > > > > Daniel P. Berrangé  wrote:
> > > > > > 
> > > > > > > On Tue, Aug 18, 2020 at 05:01:51PM +0800, Jason Wang wrote:
> > > > > > > >On 2020/8/18 下午4:55, Daniel P. Berrangé wrote:
> > > > > > > > 
> > > > > > > >  On Tue, Aug 18, 2020 at 11:24:30AM +0800, Jason Wang wrote:
> > > > > > > > 
> > > > > > > >  On 2020/8/14 下午1:16, Yan Zhao wrote:
> > > > > > > > 
> > > > > > > >  On Thu, Aug 13, 2020 at 12:24:50PM +0800, Jason Wang wrote:
> > > > > > > > 
> > > > > > > >  On 2020/8/10 下午3:46, Yan Zhao wrote:  
> > > > > > > >  we actually can also retrieve the same information through 
> > > > > > > > sysfs, .e.g
> > > > > > > > 
> > > > > > > > >  |- [path to device]
> > > > > > > > >     |--- migration
> > > > > > > > >     |     |--- self
> > > > > > > > >     |     |   |---device_api
> > > > > > > > >     |     |   |---mdev_type
> > > > > > > > >     |     |   |---software_version
> > > > > > > > >     |     |   |---device_id
> > > > > > > > >     |     |   |---aggregator
> > > > > > > > >     |     |--- compatible
> > > > > > > > >     |     |   |---device_api
> > > > > > > > >     |     |   |---mdev_type
> > > > > > > > >     |     |   |---software_version
> > > > > > > > >     |     |   |---device_id
> > > > > > > > >     |     |   |---aggregator
> > > > > > > > 
> > > > > > > > 
> > > > > > > >  Yes but:
> > > > > > > > 
> > > > > > > >  - You need one file per attribute (one syscall for one 
> > > > > > > > attribute)
> > > > > > > >  - Attribute is coupled with kobject
> > > > > > 
> > > > > > Is that really that bad? You have the device with an embedded 
> > > > > > kobject
> > > > > > anyway, and you can just put things into an attribute group?
> > > > > > 
> > > > > > [Also, I think that self/compatible split in the example makes 
> > > > > > things
> > > > > > needlessly complex. Shouldn't semantic versioning and matching 
> > > > > > already
> > > > > > cover nearly everything? I would expect very few cases that are more
> > > > > > complex than that. Maybe the aggregation stuff, but I don't think we
> > > > > > need that self/compatible split for that, either.]
> > > > > 
> > > > > Hi Cornelia,
> > > > > 
> > > > > The reason I want to declare compatible list of attributes is that
> > > > > sometimes it's not a simple 1:1 matching of source attributes and 
> > > > > target attributes
> > > > > as I demonstrated below,
> > > > > source mdev of (mdev_type i915-GVTg_V5_2 + aggregator 1) is 
> > > > > compatible to
> > > > > target mdev of (mdev_type i915-GVTg_V5_4 + aggregator 2),
> > > > >(mdev_type i915-GVTg_V5_8 + aggregator 4)
> > > > 
> > > > The way you are doing the naming is still really confusing, by the way.
> > > > If this has not already been merged in the kernel, can you change the
> > > > mdev types so that mdev_type i915-GVTg_V5_2 is 2 of mdev_type
> > > > i915-GVTg_V5_1 instead of half the device?
> > > > 
> > > > Currently you need to divide the aggregator by the number at the end of
> > > > the mdev type to figure out how much of the physical device is being
> > > > used, which is a very unfriendly API convention.
> > > > 
> > > > the way aggrator ar

Re: device compatibility interface for live migration with assigned devices

2020-08-19 Thread Sean Mooney
On Thu, 2020-08-20 at 12:01 +0800, Yan Zhao wrote:
> On Thu, Aug 20, 2020 at 02:29:07AM +0100, Sean Mooney wrote:
> > On Thu, 2020-08-20 at 08:39 +0800, Yan Zhao wrote:
> > > On Tue, Aug 18, 2020 at 11:36:52AM +0200, Cornelia Huck wrote:
> > > > On Tue, 18 Aug 2020 10:16:28 +0100
> > > > Daniel P. Berrangé  wrote:
> > > > 
> > > > > On Tue, Aug 18, 2020 at 05:01:51PM +0800, Jason Wang wrote:
> > > > > >On 2020/8/18 下午4:55, Daniel P. Berrangé wrote:
> > > > > > 
> > > > > >  On Tue, Aug 18, 2020 at 11:24:30AM +0800, Jason Wang wrote:
> > > > > > 
> > > > > >  On 2020/8/14 下午1:16, Yan Zhao wrote:
> > > > > > 
> > > > > >  On Thu, Aug 13, 2020 at 12:24:50PM +0800, Jason Wang wrote:
> > > > > > 
> > > > > >  On 2020/8/10 下午3:46, Yan Zhao wrote:  
> > > > > >  we actually can also retrieve the same information through sysfs, 
> > > > > > .e.g
> > > > > > 
> > > > > > >  |- [path to device]
> > > > > > >     |--- migration
> > > > > > >     |     |--- self
> > > > > > >     |     |   |---device_api
> > > > > > >     |     |   |---mdev_type
> > > > > > >     |     |   |---software_version
> > > > > > >     |     |   |---device_id
> > > > > > >     |     |   |---aggregator
> > > > > > >     |     |--- compatible
> > > > > > >     |     |   |---device_api
> > > > > > >     |     |   |---mdev_type
> > > > > > >     |     |   |---software_version
> > > > > > >     |     |   |---device_id
> > > > > > >     |     |   |---aggregator
> > > > > > 
> > > > > > 
> > > > > >  Yes but:
> > > > > > 
> > > > > >  - You need one file per attribute (one syscall for one attribute)
> > > > > >  - Attribute is coupled with kobject
> > > > 
> > > > Is that really that bad? You have the device with an embedded kobject
> > > > anyway, and you can just put things into an attribute group?
> > > > 
> > > > [Also, I think that self/compatible split in the example makes things
> > > > needlessly complex. Shouldn't semantic versioning and matching already
> > > > cover nearly everything? I would expect very few cases that are more
> > > > complex than that. Maybe the aggregation stuff, but I don't think we
> > > > need that self/compatible split for that, either.]
> > > 
> > > Hi Cornelia,
> > > 
> > > The reason I want to declare compatible list of attributes is that
> > > sometimes it's not a simple 1:1 matching of source attributes and target 
> > > attributes
> > > as I demonstrated below,
> > > source mdev of (mdev_type i915-GVTg_V5_2 + aggregator 1) is compatible to
> > > target mdev of (mdev_type i915-GVTg_V5_4 + aggregator 2),
> > >(mdev_type i915-GVTg_V5_8 + aggregator 4)
> > 
> > The way you are doing the naming is still really confusing, by the way.
> > If this has not already been merged in the kernel, can you change the mdev
> > types so that mdev_type i915-GVTg_V5_2 is 2 of mdev_type i915-GVTg_V5_1
> > instead of half the device?
> > 
> > Currently you need to divide the aggregator by the number at the end of the
> > mdev type to figure out how much of the physical device is being used,
> > which is a very unfriendly API convention.
> > 
> > The way aggregators are being proposed in general is not really something I
> > like, but I think this at least is something that we should be able to
> > correct.
> > 
> > With the complexity in the mdev type name + aggregator, I suspect that this
> > will never be supported in OpenStack Nova directly, requiring integration
> > via Cyborg, unless we can pre-partition the device into mdevs statically
> > and just ignore this.
> > 
> > This is way too vendor-specific to integrate into something like OpenStack
> > in Nova unless we can guarantee that how aggregators work will be portable
> > across vendors generically.
> > 
> > > 
> > > and aggragator may be just one of such examples that 1:1 matching does not
> > > fit.
> > 
> > For OpenStack Nova I don't see us supporting anything beyond the 1:1 case
> > where the mdev type does not change.
> > 
> 
> hi Sean,
> I understand it's hard for openstack. 

Re: device compatibility interface for live migration with assigned devices

2020-08-19 Thread Sean Mooney
On Thu, 2020-08-20 at 08:39 +0800, Yan Zhao wrote:
> On Tue, Aug 18, 2020 at 11:36:52AM +0200, Cornelia Huck wrote:
> > On Tue, 18 Aug 2020 10:16:28 +0100
> > Daniel P. Berrangé  wrote:
> > 
> > > On Tue, Aug 18, 2020 at 05:01:51PM +0800, Jason Wang wrote:
> > > >On 2020/8/18 下午4:55, Daniel P. Berrangé wrote:
> > > > 
> > > >  On Tue, Aug 18, 2020 at 11:24:30AM +0800, Jason Wang wrote:
> > > > 
> > > >  On 2020/8/14 下午1:16, Yan Zhao wrote:
> > > > 
> > > >  On Thu, Aug 13, 2020 at 12:24:50PM +0800, Jason Wang wrote:
> > > > 
> > > >  On 2020/8/10 下午3:46, Yan Zhao wrote:  
> > > >  we actually can also retrieve the same information through sysfs, .e.g
> > > > 
> > > > >  |- [path to device]
> > > > >     |--- migration
> > > > >     |     |--- self
> > > > >     |     |   |---device_api
> > > > >     |     |   |---mdev_type
> > > > >     |     |   |---software_version
> > > > >     |     |   |---device_id
> > > > >     |     |   |---aggregator
> > > > >     |     |--- compatible
> > > > >     |     |   |---device_api
> > > > >     |     |   |---mdev_type
> > > > >     |     |   |---software_version
> > > > >     |     |   |---device_id
> > > > >     |     |   |---aggregator
> > > > 
> > > > 
> > > >  Yes but:
> > > > 
> > > >  - You need one file per attribute (one syscall for one attribute)
> > > >  - Attribute is coupled with kobject
> > 
> > Is that really that bad? You have the device with an embedded kobject
> > anyway, and you can just put things into an attribute group?
> > 
> > [Also, I think that self/compatible split in the example makes things
> > needlessly complex. Shouldn't semantic versioning and matching already
> > cover nearly everything? I would expect very few cases that are more
> > complex than that. Maybe the aggregation stuff, but I don't think we
> > need that self/compatible split for that, either.]
> 
> Hi Cornelia,
> 
> The reason I want to declare compatible list of attributes is that
> sometimes it's not a simple 1:1 matching of source attributes and target 
> attributes
> as I demonstrated below,
> source mdev of (mdev_type i915-GVTg_V5_2 + aggregator 1) is compatible to
> target mdev of (mdev_type i915-GVTg_V5_4 + aggregator 2),
>(mdev_type i915-GVTg_V5_8 + aggregator 4)
The way you are doing the naming is still really confusing, by the way.
If this has not already been merged in the kernel, can you change the mdev
types so that mdev_type i915-GVTg_V5_2 is 2 of mdev_type i915-GVTg_V5_1 instead
of half the device?

Currently you need to divide the aggregator by the number at the end of the
mdev type to figure out how much of the physical device is being used, which is
a very unfriendly API convention.
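
As a rough illustration of the arithmetic a consumer has to do under the
current convention, here is a minimal sketch. It assumes the trailing "_N" in
the type name means 1/N of the device and that the aggregator simply multiplies
that share; the helper name is made up.

```python
from fractions import Fraction


def device_fraction(mdev_type, aggregator=1):
    # Assumption: the trailing "_N" in the type name means 1/N of the device.
    denominator = int(mdev_type.rsplit("_", 1)[1])
    return Fraction(aggregator, denominator)


# The three combinations quoted above all work out to the same share.
for mdev_type, aggregator in [("i915-GVTg_V5_2", 1),
                              ("i915-GVTg_V5_4", 2),
                              ("i915-GVTg_V5_8", 4)]:
    print(mdev_type, aggregator, device_fraction(mdev_type, aggregator))
```

Having to parse a number out of the type name to do this is exactly what makes
the convention awkward for management software.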

The way aggregators are being proposed in general is not really something I
like, but I think this at least is something that we should be able to correct.

With the complexity in the mdev type name + aggregator, I suspect that this
will never be supported in OpenStack Nova directly, requiring integration via
Cyborg, unless we can pre-partition the device into mdevs statically and just
ignore this.

This is way too vendor-specific to integrate into something like OpenStack in
Nova unless we can guarantee that how aggregators work will be portable across
vendors generically.

> 
> and aggragator may be just one of such examples that 1:1 matching does not
> fit.
For OpenStack Nova I don't see us supporting anything beyond the 1:1 case where
the mdev type does not change.

I would really prefer if there was just one mdev type that represented the
minimal allocatable unit and the aggregators were used to create compositions
of that, i.e. instead of i915-GVTg_V5_2 being half the device, have one mdev
type i915-GVTg, and if the device supports 8 of them then we can aggregate 4 of
i915-GVTg.

If you want to have multiple mdev types to model the different amounts of the
resource, e.g. i915-GVTg_small and i915-GVTg_large, that is totally fine too,
or even i915-GVTg_4 indicating it is 4 of i915-GVTg.

Failing that, I would just expose an mdev type per composable resource and
allow us to compose them at the user level with some other construct modelling
an attachment to the device, e.g. create a composed mdev or something that is
an aggregation of multiple sub-resources, each of which is an mdev. So kind of
like how bond ports work: we would create an mdev for each of the
sub-resources, then create a bond or aggregated mdev by referencing the other
mdevs by UUID, then attach only the aggregated mdev to the instance.
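
A minimal sketch of what I mean by the bond-like model, purely as a data
structure. Nothing like this exists in the kernel or in Nova today; the class
and function names are hypothetical.

```python
from dataclasses import dataclass, field
from uuid import uuid4


@dataclass
class Mdev:
    """One sub-resource, created from a single base mdev type."""
    mdev_type: str
    uuid: str = field(default_factory=lambda: str(uuid4()))


@dataclass
class ComposedMdev:
    """Hypothetical aggregate that references its members by UUID."""
    members: list
    uuid: str = field(default_factory=lambda: str(uuid4()))


def compose(mdevs):
    # Only the composed mdev would be attached to the instance.
    return ComposedMdev(members=[m.uuid for m in mdevs])


parts = [Mdev("i915-GVTg") for _ in range(4)]
bond = compose(parts)
print(len(bond.members))
```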

The current aggregator syntax and semantics, however, make me rather
uncomfortable when I think about orchestrating VMs on top of it, even to boot
them, let alone migrate them.
> 
> So, we explicitly list out self/compatible attributes, and management
> tools only need to check if self attributes is contained compatible
> attributes.
> 
> or do you mean only compatible list is enough, and the management tools
> need to find out self list by themselves?
> But I think provide a self list is easier for management tools.
> 
> Thanks
> Yan
> 
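
For reference, the check described here could look roughly like the sketch
below from the management side. It assumes the proposed migration/self and
migration/compatible directories exist with one file per attribute, and that
each "compatible" file holds a comma-separated list of accepted values. Neither
the layout nor the file format is settled in this thread, so treat both as
assumptions.

```python
from pathlib import Path

ATTRS = ("device_api", "mdev_type", "software_version", "device_id",
         "aggregator")


def read_attrs(migration_dir, group):
    """Read one file per attribute from <migration_dir>/<group>/."""
    base = Path(migration_dir) / group
    return {name: (base / name).read_text().strip() for name in ATTRS}


def is_compatible(src_migration_dir, dst_migration_dir):
    src = read_attrs(src_migration_dir, "self")
    dst = read_attrs(dst_migration_dir, "compatible")
    # Assumption: "compatible" files are comma-separated value lists.
    return all(
        src[name] in [v.strip() for v in dst[name].split(",")]
        for name in ATTRS
    )
```

Note that this is one read per attribute per device, which is the syscall cost
raised earlier in the thread.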



Re: device compatibility interface for live migration with assigned devices

2020-08-14 Thread Sean Mooney
On Fri, 2020-08-14 at 13:16 +0800, Yan Zhao wrote:
> On Thu, Aug 13, 2020 at 12:24:50PM +0800, Jason Wang wrote:
> > 
> > On 2020/8/10 下午3:46, Yan Zhao wrote:
> > > > driver is it handled by?
> > > 
> > > It looks that the devlink is for network device specific, and in
> > > devlink.h, it says
> > > include/uapi/linux/devlink.h - Network physical device Netlink
> > > interface,
> > 
> > 
> > Actually not, I think there used to have some discussion last year and the
> > conclusion is to remove this comment.
> > 
> > It supports IB and probably vDPA in the future.
> > 
> 
> hmm... sorry, I didn't find the referred discussion. only below discussion
> regarding to why to add devlink.
> 
> https://www.mail-archive.com/netdev@vger.kernel.org/msg95801.html
>   >This doesn't seem to be too much related to networking? Why can't 
> something
>   >like this be in sysfs?
>   
>   It is related to networking quite bit. There has been couple of
>   iteration of this, including sysfs and configfs implementations. There
>   has been a consensus reached that this should be done by netlink. I
>   believe netlink is really the best for this purpose. Sysfs is not a good
>   idea
> 
> https://www.mail-archive.com/netdev@vger.kernel.org/msg96102.html
>   >there is already a way to change eth/ib via
>   >echo 'eth' > /sys/bus/pci/drivers/mlx4_core/:02:00.0/mlx4_port1
>   >
>   >sounds like this is another way to achieve the same?
>   
>   It is. However the current way is driver-specific, not correct.
>   For mlx5, we need the same, it cannot be done in this way. Do devlink is
>   the correct way to go.
I'm not sure I agree with that.
Standardising a filesystem-based API that is used across all vendors is also a
valid option. That said, if devlink is the right choice from a kernel
perspective, by all means use it, but I have not heard a convincing argument
for why it is actually better.
With that said, we have been using tools like ethtool to manage aspects of NICs
for decades, so it's not that strange an idea to use a tool and a binary
protocol rather than a text-based interface for this, but there are advantages
to both approaches.
> 
> https://lwn.net/Articles/674867/
>   There a is need for some userspace API that would allow to expose things
>   that are not directly related to any device class like net_device of
>   ib_device, but rather chip-wide/switch-ASIC-wide stuff.
> 
>   Use cases:
>   1) get/set of port type (Ethernet/InfiniBand)
>   2) monitoring of hardware messages to and from chip
>   3) setting up port splitters - split port into multiple ones and squash 
> again,
>  enables usage of splitter cable
>   4) setting up shared buffers - shared among multiple ports within one 
> chip
> 
> 
> 
> we actually can also retrieve the same information through sysfs, .e.g
> 
> |- [path to device]
>    |--- migration
>    |     |--- self
>    |     |   |---device_api
>    |     |   |---mdev_type
>    |     |   |---software_version
>    |     |   |---device_id
>    |     |   |---aggregator
>    |     |--- compatible
>    |     |   |---device_api
>    |     |   |---mdev_type
>    |     |   |---software_version
>    |     |   |---device_id
>    |     |   |---aggregator
> 
> 
> 
> > 
> > >   I feel like it's not very appropriate for a GPU driver to use
> > > this interface. Is that right?
> > 
> > 
> > I think not though most of the users are switch or ethernet devices. It
> > doesn't prevent you from inventing new abstractions.
> 
> so need to patch devlink core and the userspace devlink tool?
> e.g. devlink migration
And the devlink Python libs, if OpenStack was to use it directly.
We do have cases where we just fork a process and execute a command in a shell,
with or without elevated privileges, but we really don't like doing that due to
the performance impact and security implications, so where we can use Python
bindings over C APIs we do. pyroute2 is the only Python lib I know of off the
top of my head that supports devlink, so we would need to enhance it to support
this new devlink API.
There may be others; I have not really looked in the past since we don't need
to use devlink at all today.
> 
> > Note that devlink is based on netlink, netlink has been widely used by
> > various subsystems other than networking.
> 
> the advantage of netlink I see is that it can monitor device status and
> notify upper layer that migration database needs to get updated.
> But not sure whether openstack would like to use this capability.
> As Sean said, it's heavy for openstack. it's heavy for vendor driver
> as well :)
> 
> And devlink monitor now listens the notification and dumps the state
> changes. If we want to use it, need to let it forward the notification
> and dumped info to openstack, right?
I don't think we would use direct devlink monitoring in Nova even if it was
available.
We could, but we already poll libvirt and the system for other resources
periodically.
we 

Re: device compatibility interface for live migration with assigned devices

2020-08-05 Thread Sean Mooney
On Wed, 2020-08-05 at 12:53 +0200, Jiri Pirko wrote:
> Wed, Aug 05, 2020 at 11:33:38AM CEST, yan.y.z...@intel.com wrote:
> > On Wed, Aug 05, 2020 at 04:02:48PM +0800, Jason Wang wrote:
> > > 
> > > On 2020/8/5 下午3:56, Jiri Pirko wrote:
> > > > Wed, Aug 05, 2020 at 04:41:54AM CEST, jasow...@redhat.com wrote:
> > > > > On 2020/8/5 上午10:16, Yan Zhao wrote:
> > > > > > On Wed, Aug 05, 2020 at 10:22:15AM +0800, Jason Wang wrote:
> > > > > > > On 2020/8/5 上午12:35, Cornelia Huck wrote:
> > > > > > > > [sorry about not chiming in earlier]
> > > > > > > > 
> > > > > > > > On Wed, 29 Jul 2020 16:05:03 +0800
> > > > > > > > Yan Zhao  wrote:
> > > > > > > > 
> > > > > > > > > On Mon, Jul 27, 2020 at 04:23:21PM -0600, Alex Williamson 
> > > > > > > > > wrote:
> > > > > > > > 
> > > > > > > > (...)
> > > > > > > > 
> > > > > > > > > > Based on the feedback we've received, the previously 
> > > > > > > > > > proposed interface
> > > > > > > > > > is not viable.  I think there's agreement that the user 
> > > > > > > > > > needs to be
> > > > > > > > > > able to parse and interpret the version information.  Using 
> > > > > > > > > > json seems
> > > > > > > > > > viable, but I don't know if it's the best option.  Is there 
> > > > > > > > > > any
> > > > > > > > > > precedent of markup strings returned via sysfs we could 
> > > > > > > > > > follow?
> > > > > > > > 
> > > > > > > > I don't think encoding complex information in a sysfs file is a 
> > > > > > > > viable
> > > > > > > > approach. Quoting Documentation/filesystems/sysfs.rst:
> > > > > > > > 
> > > > > > > > "Attributes should be ASCII text files, preferably with only 
> > > > > > > > one value
> > > > > > > > per file. It is noted that it may not be efficient to contain 
> > > > > > > > only one
> > > > > > > > value per file, so it is socially acceptable to express an 
> > > > > > > > array of
> > > > > > > > values of the same type.
> > > > > > > > Mixing types, expressing multiple lines of data, and doing fancy
> > > > > > > > formatting of data is heavily frowned upon."
> > > > > > > > 
> > > > > > > > Even though this is an older file, I think these restrictions 
> > > > > > > > still
> > > > > > > > apply.
> > > > > > > 
> > > > > > > +1, that's another reason why devlink(netlink) is better.
> > > > > > > 
> > > > > > 
> > > > > > hi Jason,
> > > > > > do you have any materials or sample code about devlink, so we can 
> > > > > > have a good
> > > > > > study of it?
> > > > > > I found some kernel docs about it but my preliminary study didn't 
> > > > > > show me the
> > > > > > advantage of devlink.
> > > > > 
> > > > > CC Jiri and Parav for a better answer for this.
> > > > > 
> > > > > My understanding is that the following advantages are obvious (as I 
> > > > > replied
> > > > > in another thread):
> > > > > 
> > > > > - existing users (NIC, crypto, SCSI, ib), mature and stable
> > > > > - much better error reporting (ext_ack other than string or errno)
> > > > > - namespace aware
> > > > > - do not couple with kobject
> > > > 
> > > > Jason, what is your use case?
> > > 
> > > 
> > > I think the use case is to report device compatibility for live migration.
> > > Yan proposed a simple sysfs based migration version first, but it looks 
> > > not
> > > sufficient and something based on JSON is discussed.
> > > 
> > > Yan, can you help to summarize the discussion so far for Jiri as a
> > > reference?
> > > 
> > 
> > yes.
> > we are currently defining an device live migration compatibility
> > interface in order to let user space like openstack and libvirt knows
> > which two devices are live migration compatible.
> > currently the devices include mdev (a kernel emulated virtual device)
> > and physical devices (e.g.  a VF of a PCI SRIOV device).
> > 
> > the attributes we want user space to compare including
> > common attribues:
> >device_api: vfio-pci, vfio-ccw...
> >mdev_type: mdev type of mdev or similar signature for physical device
> >   It specifies a device's hardware capability. e.g.
> >i915-GVTg_V5_4 means it's of 1/4 of a gen9 Intel graphics
> >device.
By the way, this naming scheme works the opposite of how I would have expected.
I would have expected i915-GVTg_V5 to be the same as i915-GVTg_V5_1, and
i915-GVTg_V5_4 to use 4 times the amount of resource as i915-GVTg_V5_1, not one
quarter.

I would much rather see i915-GVTg_V5_4 expressed as aggregator:i915-GVTg_V5=4,
i.e. that it is 4 of the basic i915-GVTg_V5 type.
The inversion of the relationship makes this much harder to reason about, IMO.

If i915-GVTg_V5_8 and i915-GVTg_V5_4 are both actually claiming the same
resource and both can be used at the same time, then with your suggested naming
scheme I have to find the mdev type with the largest value and store that, then
do math by dividing it by the suffix of the requested type every time I want to
claim the resource in our placement inventories.
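
Roughly, the bookkeeping that forces on us looks like this. It is a sketch
only, assuming the "_N means 1/N of the device" convention and that every
suffix divides the largest one; the function names are made up and this is not
placement code.

```python
def suffix(mdev_type):
    # Assumption: the type name ends in "_N", meaning 1/N of the device.
    return int(mdev_type.rsplit("_", 1)[1])


def inventory_total(mdev_types):
    """Inventory is counted in units of the smallest slice (largest N)."""
    return max(suffix(t) for t in mdev_types)


def units_to_claim(requested_type, mdev_types):
    """Smallest-slice units consumed by one mdev of the requested type."""
    return inventory_total(mdev_types) // suffix(requested_type)


types = ["i915-GVTg_V5_8", "i915-GVTg_V5_4"]
print(inventory_total(types))
print(units_to_claim("i915-GVTg_V5_4", types))
```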

If we represent it the way I suggest, we don't.
if it 

Re: device compatibility interface for live migration with assigned devices

2020-07-30 Thread Sean Mooney
On Thu, 2020-07-30 at 11:41 +0800, Yan Zhao wrote:
> > > >interface_version=3
> > 
> > Not much granularity here, I prefer Sean's previous
> > <major>.<minor>[.bugfix] scheme.
> > 
> 
> yes, <major>.<minor>[.bugfix] scheme may be better, but I'm not sure if
> it works for a complicated scenario.
> e.g for pv_mode,
> (1) initially,  pv_mode is not supported, so it's pv_mode=none, it's 0.0.0,
> (2) then, pv_mode=ppgtt is supported, pv_mode="none+ppgtt", it's 0.1.0,
> indicating pv_mode=none can migrate to pv_mode="none+ppgtt", but not vice 
> versa.
> (3) later, pv_mode=context is also supported,
> pv_mode="none+ppgtt+context", so it's 0.2.0.
> 
> But if later, pv_mode=ppgtt is removed. pv_mode="none+context", how to
> name its version?
It would become 1.0.0.
Addition of a feature is a minor version bump, as it's backwards compatible.
If you don't request the new feature you don't need to use it, and the device
can continue to behave like a 0.0.0 device even if it's capable of acting as a
0.1.0 device.
When you remove a feature, that is backward incompatible, as any instance that
was previously using it would no longer work, so you have to bump the major
version.
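
As a sketch of how a management tool could consume such a version, assuming the
rule is "same major, and the target's minor is at least the source's" (my
reading of semver applied to this case, not something the thread has agreed
on):

```python
def parse(version):
    # Accepts "major.minor" or "major.minor.bugfix"; bugfix is ignored here.
    major, minor, *_rest = (int(part) for part in version.split("."))
    return major, minor


def can_migrate(src_version, dst_version):
    """True if a device at src_version may migrate to one at dst_version."""
    src_major, src_minor = parse(src_version)
    dst_major, dst_minor = parse(dst_version)
    return src_major == dst_major and dst_minor >= src_minor


print(can_migrate("0.0.0", "0.1.0"))
print(can_migrate("0.1.0", "0.0.0"))
print(can_migrate("0.2.0", "1.0.0"))
```

With the pv_mode example, that gives none -> none+ppgtt allowed, the reverse
refused, and anything crossing the 1.0.0 bump refused.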
>  "none+ppgtt" (0.1.0) is not compatible to
> "none+context", but "none+ppgtt+context" (0.2.0) is compatible to
> "none+context".
> 
> Maintain such scheme is painful to vendor driver.
Not really, it's how most software libs are versioned today. Some use other
schemes, but semantic versioning, if done right, is a concise and
easy-to-consume set of rules (https://semver.org/). However, you are right that
it forces vendors to think about backwards and forwards compatibility with each
change, which for the most part is a good thing.
It goes hand in hand with having stable ABI and API definitions to ensure
firmware updates and driver changes don't break userspace that depends on the
kernel interfaces they expose.




Re: device compatibility interface for live migration with assigned devices

2020-07-30 Thread Sean Mooney
On Thu, 2020-07-30 at 09:56 +0800, Yan Zhao wrote:
> On Wed, Jul 29, 2020 at 12:28:46PM +0100, Sean Mooney wrote:
> > On Wed, 2020-07-29 at 16:05 +0800, Yan Zhao wrote:
> > > On Mon, Jul 27, 2020 at 04:23:21PM -0600, Alex Williamson wrote:
> > > > On Mon, 27 Jul 2020 15:24:40 +0800
> > > > Yan Zhao  wrote:
> > > > 
> > > > > > > As you indicate, the vendor driver is responsible for checking 
> > > > > > > version
> > > > > > > information embedded within the migration stream.  Therefore a
> > > > > > > migration should fail early if the devices are incompatible.  Is 
> > > > > > > it  
> > > > > > 
> > > > > > but as I know, currently in VFIO migration protocol, we have no way 
> > > > > > to
> > > > > > get vendor specific compatibility checking string in migration 
> > > > > > setup stage
> > > > > > (i.e. .save_setup stage) before the device is set to _SAVING state.
> > > > > > In this way, for devices who does not save device data in precopy 
> > > > > > stage,
> > > > > > the migration compatibility checking is as late as in stop-and-copy
> > > > > > stage, which is too late.
> > > > > > do you think we need to add the getting/checking of vendor specific
> > > > > > compatibility string early in save_setup stage?
> > > > > >  
> > > > > 
> > > > > hi Alex,
> > > > > after an offline discussion with Kevin, I realized that it may not be 
> > > > > a
> > > > > problem if migration compatibility check in vendor driver occurs late 
> > > > > in
> > > > > stop-and-copy phase for some devices, because if we report device
> > > > > compatibility attributes clearly in an interface, the chances for
> > > > > libvirt/openstack to make a wrong decision is little.
> > > > 
> > > > I think it would be wise for a vendor driver to implement a pre-copy
> > > > phase, even if only to send version information and verify it at the
> > > > target.  Deciding you have no device state to send during pre-copy does
> > > > not mean your vendor driver needs to opt-out of the pre-copy phase
> > > > entirely.  Please also note that pre-copy is at the user's discretion,
> > > > we've defined that we can enter stop-and-copy at any point, including
> > > > without a pre-copy phase, so I would recommend that vendor drivers
> > > > validate compatibility at the start of both the pre-copy and the
> > > > stop-and-copy phases.
> > > > 
> > > 
> > > ok. got it!
> > > 
> > > > > so, do you think we are now arriving at an agreement that we'll give 
> > > > > up
> > > > > the read-and-test scheme and start to defining one interface (perhaps 
> > > > > in
> > > > > json format), from which libvirt/openstack is able to parse and find 
> > > > > out
> > > > > compatibility list of a source mdev/physical device?
> > > > 
> > > > Based on the feedback we've received, the previously proposed interface
> > > > is not viable.  I think there's agreement that the user needs to be
> > > > able to parse and interpret the version information.  Using json seems
> > > > viable, but I don't know if it's the best option.  Is there any
> > > > precedent of markup strings returned via sysfs we could follow?
> > > 
> > > I found some examples of using formatted string under /sys, mostly under
> > > tracing. maybe we can do a similar implementation.
> > > 
> > > #cat /sys/kernel/debug/tracing/events/kvm/kvm_mmio/format
> > > 
> > > name: kvm_mmio
> > > ID: 32
> > > format:
> > > field:unsigned short common_type;   offset:0;   size:2; signed:0;
> > > field:unsigned char common_flags;   offset:2;   size:1; signed:0;
> > > field:unsigned char common_preempt_count;   offset:3;   size:1; signed:0;
> > > field:int common_pid;   offset:4;   size:4; signed:1;
> > > 
> > > field:u32 type; offset:8;   size:4; signed:0;
> > > field:u32 len;  offset:12;  size:4; signed:0;
> > > field:u64 gpa;  offset:16;  size:8; signed:0;

Re: device compatibility interface for live migration with assigned devices

2020-07-29 Thread Sean Mooney
On Wed, 2020-07-29 at 16:05 +0800, Yan Zhao wrote:
> On Mon, Jul 27, 2020 at 04:23:21PM -0600, Alex Williamson wrote:
> > On Mon, 27 Jul 2020 15:24:40 +0800
> > Yan Zhao  wrote:
> > 
> > > > > As you indicate, the vendor driver is responsible for checking version
> > > > > information embedded within the migration stream.  Therefore a
> > > > > migration should fail early if the devices are incompatible.  Is it  
> > > > 
> > > > but as I know, currently in VFIO migration protocol, we have no way to
> > > > get vendor specific compatibility checking string in migration setup 
> > > > stage
> > > > (i.e. .save_setup stage) before the device is set to _SAVING state.
> > > > In this way, for devices who does not save device data in precopy stage,
> > > > the migration compatibility checking is as late as in stop-and-copy
> > > > stage, which is too late.
> > > > do you think we need to add the getting/checking of vendor specific
> > > > compatibility string early in save_setup stage?
> > > >  
> > > 
> > > hi Alex,
> > > after an offline discussion with Kevin, I realized that it may not be a
> > > problem if migration compatibility check in vendor driver occurs late in
> > > stop-and-copy phase for some devices, because if we report device
> > > compatibility attributes clearly in an interface, the chances for
> > > libvirt/openstack to make a wrong decision is little.
> > 
> > I think it would be wise for a vendor driver to implement a pre-copy
> > phase, even if only to send version information and verify it at the
> > target.  Deciding you have no device state to send during pre-copy does
> > not mean your vendor driver needs to opt-out of the pre-copy phase
> > entirely.  Please also note that pre-copy is at the user's discretion,
> > we've defined that we can enter stop-and-copy at any point, including
> > without a pre-copy phase, so I would recommend that vendor drivers
> > validate compatibility at the start of both the pre-copy and the
> > stop-and-copy phases.
> > 
> 
> ok. got it!
> 
> > > so, do you think we are now arriving at an agreement that we'll give up
> > > the read-and-test scheme and start to defining one interface (perhaps in
> > > json format), from which libvirt/openstack is able to parse and find out
> > > compatibility list of a source mdev/physical device?
> > 
> > Based on the feedback we've received, the previously proposed interface
> > is not viable.  I think there's agreement that the user needs to be
> > able to parse and interpret the version information.  Using json seems
> > viable, but I don't know if it's the best option.  Is there any
> > precedent of markup strings returned via sysfs we could follow?
> 
> I found some examples of using formatted string under /sys, mostly under
> tracing. maybe we can do a similar implementation.
> 
> #cat /sys/kernel/debug/tracing/events/kvm/kvm_mmio/format
> 
> name: kvm_mmio
> ID: 32
> format:
> field:unsigned short common_type;   offset:0;   size:2; signed:0;
> field:unsigned char common_flags;   offset:2;   size:1; signed:0;
> field:unsigned char common_preempt_count;   offset:3;   size:1; signed:0;
> field:int common_pid;   offset:4;   size:4; signed:1;
> 
> field:u32 type; offset:8;   size:4; signed:0;
> field:u32 len;  offset:12;  size:4; signed:0;
> field:u64 gpa;  offset:16;  size:8; signed:0;
> field:u64 val;  offset:24;  size:8; signed:0;
> 
> print fmt: "mmio %s len %u gpa 0x%llx val 0x%llx", 
> __print_symbolic(REC->type, { 0, "unsatisfied-read" }, { 1, "read"
> }, { 2, "write" }), REC->len, REC->gpa, REC->val
> 
this is not json format and it's not super friendly to parse.
> 
> #cat /sys/devices/pci:00/:00:02.0/uevent
> DRIVER=vfio-pci
> PCI_CLASS=3
> PCI_ID=8086:591D
> PCI_SUBSYS_ID=8086:2212
> PCI_SLOT_NAME=:00:02.0
> MODALIAS=pci:v8086d591Dsv8086sd2212bc03sc00i00
> 
this is ini format or conf format.
this is pretty simple to parse, which would be fine.
that said, you could also have a version or capability directory with a file
for each key and a single value.

personally i would prefer to only have to do one read rather than list the
files in a directory and then read them all to build the data structure myself,
but that is doable. the simple ini format used for uevent seems the best of the
3 options provided above.
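
as a rough illustration of the two options, here is a minimal python sketch. it
assumes a uevent-style KEY=VALUE text and, alternatively, a directory holding
one file per key with a single value each. the function names are made up for
this example and no such compatibility attribute exists today.

```python
from pathlib import Path


def parse_key_value(text):
    """Parse uevent-style KEY=VALUE lines into a dict (one read, one parse)."""
    result = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue  # skip blank or malformed lines
        key, _, value = line.partition("=")
        result[key] = value
    return result


def read_attribute_dir(directory):
    """Read a directory with one file per key, each holding a single value."""
    return {
        entry.name: entry.read_text().strip()
        for entry in sorted(Path(directory).iterdir())
        if entry.is_file()
    }


# Sample text copied from the uevent output quoted above.
sample = """DRIVER=vfio-pci
PCI_CLASS=3
PCI_ID=8086:591D
PCI_SUBSYS_ID=8086:2212
"""
print(parse_key_value(sample)["PCI_ID"])  # prints 8086:591D
```

the first form needs a single read per device, the second needs a directory
listing plus one read per key.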
> > 
> > Your idea of having both a "self" object and an array of "compatible"
> > objects is perhaps something we can build on, but we must not assume
> > PCI devices at the root level of the object.  Providing both the
> > mdev-type and the driver is a bit redundant, since the former includes
> > the latter.  We can't have vendor specific versioning schemes though,
> > ie. gvt-version. We need to agree on a common scheme and decide which
> > fields the version is relative to, ex. just the mdev type?
> 
> what about making all comparing fields vendor specific?
> 

Re: device compatibility interface for live migration with assigned devices

2020-07-20 Thread Sean Mooney
On Mon, 2020-07-20 at 11:41 +0800, Jason Wang wrote:
> On 2020/7/18 上午12:12, Alex Williamson wrote:
> > On Thu, 16 Jul 2020 16:32:30 +0800
> > Yan Zhao  wrote:
> > 
> > > On Thu, Jul 16, 2020 at 12:16:26PM +0800, Jason Wang wrote:
> > > > On 2020/7/14 上午7:29, Yan Zhao wrote:
> > > > > hi folks,
> > > > > we are defining a device migration compatibility interface that helps 
> > > > > upper
> > > > > layer stack like openstack/ovirt/libvirt to check if two devices are
> > > > > live migration compatible.
> > > > > The "devices" here could be MDEVs, physical devices, or hybrid of the 
> > > > > two.
> > > > > e.g. we could use it to check whether
> > > > > - a src MDEV can migrate to a target MDEV,
> > > > > - a src VF in SRIOV can migrate to a target VF in SRIOV,
> > > > > - a src MDEV can migration to a target VF in SRIOV.
> > > > > (e.g. SIOV/SRIOV backward compatibility case)
> > > > > 
> > > > > The upper layer stack could use this interface as the last step to 
> > > > > check
> > > > > if one device is able to migrate to another device before triggering 
> > > > > a real
> > > > > live migration procedure.
> > > > > we are not sure if this interface is of value or help to you. please 
> > > > > don't
> > > > > hesitate to drop your valuable comments.
> > > > > 
> > > > > 
> > > > > (1) interface definition
> > > > > The interface is defined in below way:
> > > > > 
> > > > >                        __    userspace
> > > > >                         /\              \
> > > > >                        /                 \write
> > > > >                       / read              \
> > > > >              ________/__________       ___\|/_____________
> > > > >             | migration_version |     | migration_version |-->check migration
> > > > >             ---------------------     ---------------------   compatibility
> > > > >                device A                    device B
> > > > > 
> > > > > 
> > > > > a device attribute named migration_version is defined under each 
> > > > > device's
> > > > > sysfs node. e.g. 
> > > > > (/sys/bus/pci/devices/\:00\:02.0/$mdev_UUID/migration_version).
> > > > 
> > > > Are you aware of the devlink based device management interface that is
> > > > proposed upstream? I think it has many advantages over sysfs, do you
> > > > consider to switch to that?
> > 
> > Advantages, such as?
> 
> 
> My understanding for devlink(netlink) over sysfs (some are mentioned at 
> the time of vDPA sysfs mgmt API discussion) are:
i thought netlink was used more as a configuration protocol to query and
configure nics and, i guess, other devices in its devlink form, requiring a
tool to be written that can speak the protocol to interact with it.
the primary advantage of sysfs is that everything is just a file. there are no
additional dependencies needed and, unlike netlink, there are no
interoperability issues in a containerised env. if you are using a different
version of libc and gcc in the container vs the host, my understanding is that
tools like ethtool from ubuntu deployed in a container on a centos host can
have issues communicating with the host kernel. if it's just a file, then
unless the format the data is returned in changes or the layout of sysfs
changes, it's compatible regardless of what you use to read it.
> 
> - existing users (NIC, crypto, SCSI, ib), mature and stable
> - much better error reporting (ext_ack other than string or errno)
> - namespace aware
> - do not couple with kobject
> 
> Thanks
> 



Re: device compatibility interface for live migration with assigned devices

2020-07-14 Thread Sean Mooney
resending with the full cc list since i had this typed up.
i would blame my email provider but my email client does not seem to like long
cc lists.
we probably want to continue on alex's thread to not split the discussion,
but i have responded inline with some examples of how openstack schedules and
what i meant by different mdev_types.


On Tue, 2020-07-14 at 20:29 +0100, Sean Mooney wrote:
> On Tue, 2020-07-14 at 11:01 -0600, Alex Williamson wrote:
> > On Tue, 14 Jul 2020 13:33:24 +0100
> > Sean Mooney  wrote:
> > 
> > > On Tue, 2020-07-14 at 11:21 +0100, Daniel P. Berrangé wrote:
> > > > On Tue, Jul 14, 2020 at 07:29:57AM +0800, Yan Zhao wrote:  
> > > > > hi folks,
> > > > > we are defining a device migration compatibility interface that helps 
> > > > > upper
> > > > > layer stack like openstack/ovirt/libvirt to check if two devices are
> > > > > live migration compatible.
> > > > > The "devices" here could be MDEVs, physical devices, or hybrid of the 
> > > > > two.
> > > > > e.g. we could use it to check whether
> > > > > - a src MDEV can migrate to a target MDEV,  
> > > 
> > > > mdev live migration is completely possible to do but i agree with Dan
> > > > Berrangé's comments.
> > > > from the point of view of openstack integration i don't see calling out to
> > > > a vendor-specific
> > > > tool to be an acceptable
> > 
> > As I replied to Dan, I'm hoping Yan was referring more to vendor
> > specific knowledge rather than actual tools.
> > 
> > > > solution for device compatibility checking. the sys filesystem
> > > > that describes the mdevs that can be created should also
> > > > contain the relevant information such
> > > > that nova could integrate it via the libvirt xml representation or
> > > > directly retrieve the info from sysfs.
> > > > > - a src VF in SRIOV can migrate to a target VF in SRIOV,  
> > > 
> > > > so vf to vf migration is not possible in the general case as there is no
> > > > standardised way to transfer the device state as part of the sriov specs
> > > > produced by the pci-sig. as such there is no vendor-neutral way to support
> > > > sriov live migration.
> > 
> > We're not talking about a general case, we're talking about physical
> > devices which have vfio wrappers or hooks with device specific
> > knowledge in order to support the vfio migration interface.  The point
> > is that a discussion around vfio device migration cannot be limited to
> > mdev devices.
> 
> > ok. upstream in openstack, at least, we do not plan to support generic
> > live migration for passthrough devices. we cheat with network interfaces,
> > since in general operating systems handle hotplug of a nic somewhat safely,
> > so where no abstraction layer like an mdev or a macvtap device is present
> > we hot unplug the nic before the migration and attach a new one after.
> > for gpus or crypto cards this likely would not be viable since you can't
> > bond generic hardware devices to hide the removal and re-addition of a
> > generic pci device. we were hoping that there would be a convergence around
> > MDEVs as a way to provide that abstraction going forward for generic
> > devices, or some other new mechanism in the future.
> > 
> > > > > - a src MDEV can migration to a target VF in SRIOV.  
> > > 
> > > that also makes this unviable
> > > > >   (e.g. SIOV/SRIOV backward compatibility case)
> > > > > 
> > > > > The upper layer stack could use this interface as the last step to 
> > > > > check
> > > > > if one device is able to migrate to another device before triggering 
> > > > > a real
> > > > > live migration procedure.  
> > > 
> > > > well, actually that is already too late really. ideally we would want to
> > > > do this compatibility check much sooner to avoid the migration failing.
> > > > in an openstack environment, at least, by the time we invoke libvirt
> > > > (assuming you're using the libvirt driver) to do the migration we have
> > > > already finished scheduling the instance to the new host. if we do the
> > > > compatibility check at this point and it fails then the live migration is
> > > > aborted and will not be retried. these types of late checks lead to a
> > > > poor user experience, as unless you check the migration detail it
> > > > basically looks like the

Re: device compatibility interface for live migration with assigned devices

2020-07-14 Thread Sean Mooney
On Tue, 2020-07-14 at 11:21 +0100, Daniel P. Berrangé wrote:
> On Tue, Jul 14, 2020 at 07:29:57AM +0800, Yan Zhao wrote:
> > hi folks,
> > we are defining a device migration compatibility interface that helps upper
> > layer stack like openstack/ovirt/libvirt to check if two devices are
> > live migration compatible.
> > The "devices" here could be MDEVs, physical devices, or hybrid of the two.
> > e.g. we could use it to check whether
> > - a src MDEV can migrate to a target MDEV,
mdev live migration is completely possible to do but i agree with Dan
Berrangé's comments.
from the point of view of openstack integration i don't see calling out to a
vendor-specific tool to be an acceptable solution for device compatibility
checking. the sys filesystem that describes the mdevs that can be created
should also contain the relevant information such that nova could integrate it
via the libvirt xml representation or directly retrieve the info from sysfs.
> > - a src VF in SRIOV can migrate to a target VF in SRIOV,
so vf to vf migration is not possible in the general case as there is no
standardised way to transfer the device state as part of the sriov specs
produced by the pci-sig. as such there is no vendor-neutral way to support
sriov live migration.
> > - a src MDEV can migration to a target VF in SRIOV.
that also makes this unviable
> >   (e.g. SIOV/SRIOV backward compatibility case)
> > 
> > The upper layer stack could use this interface as the last step to check
> > if one device is able to migrate to another device before triggering a real
> > live migration procedure.
well, actually that is already too late really. ideally we would want to do this
compatibility check much sooner to avoid the migration failing. in an openstack
environment, at least, by the time we invoke libvirt (assuming you're using the
libvirt driver) to do the migration we have already finished scheduling the
instance to the new host. if we do the compatibility check at this point and it
fails then the live migration is aborted and will not be retried. these types
of late checks lead to a poor user experience, as unless you check the
migration detail it basically looks like the migration was ignored: it starts
to migrate and then continues running on the original host.

when using generic pci passthrough with openstack, the pci alias is intended to
reference a single vendor id/product id, so you will have 1+ aliases for each
type of device. that allows openstack to schedule based on the availability of
a compatible device because we track inventories of pci devices and can query
that when selecting a host.

if we were to support mdev live migration in the future we would want to take
the same declarative approach:
1. introspect the capabilities of the devices we manage
2. create inventories of the allocatable devices and their capabilities
3. schedule the instance to a host based on the device-type/capabilities and
   claim it atomically to prevent races
4. have the lower level hypervisors do additional validation if needed before
   live migration.

this proposal seems to be targeting extending step 4, whereas ideally we should
focus on providing the info that would be relevant in step 1, preferably in a
vendor neutral way via a kernel interface like /sys.
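
to illustrate the steps above, here is a minimal python sketch. it is not nova
code: the Host and Scheduler classes and the device-type strings are
hypothetical, and a single in-process lock stands in for whatever atomic claim
mechanism a real scheduler would use.

```python
import threading
from dataclasses import dataclass, field


@dataclass
class Host:
    name: str
    # Steps 1 and 2: device type -> count of allocatable devices, as reported
    # by the compute host after introspecting the devices it manages.
    inventory: dict = field(default_factory=dict)


class Scheduler:
    def __init__(self, hosts):
        self._hosts = list(hosts)
        self._lock = threading.Lock()  # makes check-and-claim atomic

    def schedule(self, device_type, exclude=()):
        """Step 3: pick a host with a free device of this type and claim it."""
        with self._lock:
            for host in self._hosts:
                if host.name in exclude:
                    continue
                if host.inventory.get(device_type, 0) > 0:
                    host.inventory[device_type] -= 1
                    return host.name
        return None  # no candidate: fail before any migration is started


hosts = [
    Host("src", {"hypothetical-type-a": 0}),
    Host("dst1", {"hypothetical-type-b": 2}),
    Host("dst2", {"hypothetical-type-a": 1}),
]
scheduler = Scheduler(hosts)
print(scheduler.schedule("hypothetical-type-a", exclude={"src"}))  # dst2
print(scheduler.schedule("hypothetical-type-a", exclude={"src"}))  # None
```

step 4 is left to the hypervisor. the point is that an incompatible target is
rejected in step 3, long before libvirt is asked to migrate anything.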
 
> > we are not sure if this interface is of value or help to you. please don't
> > hesitate to drop your valuable comments.
> > 
> > 
> > (1) interface definition
> > The interface is defined in below way:
> > 
> >                        __    userspace
> >                         /\              \
> >                        /                 \write
> >                       / read              \
> >              ________/__________       ___\|/_____________
> >             | migration_version |     | migration_version |-->check migration
> >             ---------------------     ---------------------   compatibility
> >                device A                    device B
> > 
> > 
> > a device attribute named migration_version is defined under each device's
> > sysfs node. e.g. 
> > (/sys/bus/pci/devices/\:00\:02.0/$mdev_UUID/migration_version).
this might be useful as we could tag the inventory with the migration version
and only migrate to devices with the same version
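
a minimal python sketch of that idea, assuming each compute host has already
reported the migration_version string of its devices into a central inventory.
the inventory layout, host names and version strings are invented for the
example.

```python
def compatible_targets(inventory, source_host, source_device):
    """Return (host, device) pairs whose recorded version equals the source's."""
    wanted = inventory[source_host][source_device]
    return [
        (host, device)
        for host, devices in sorted(inventory.items())
        if host != source_host
        for device, version in sorted(devices.items())
        if version == wanted
    ]


# Hypothetical inventory: host -> device -> reported migration_version string.
inventory = {
    "host-a": {"dev0": "version-1"},
    "host-b": {"dev0": "version-2", "dev1": "version-1"},
    "host-c": {"dev0": "version-1"},
}
print(compatible_targets(inventory, "host-a", "dev0"))
# [('host-b', 'dev1'), ('host-c', 'dev0')]
```

this only does an exact string match against data the scheduler already holds,
so nothing has to be read from or written to a remote host at scheduling time.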
> > userspace tools read the migration_version as a string from the source
> > device, and write it to the migration_version sysfs attribute in the target
> > device.
this would not be useful as the scheduler cannot directly connect to the
compute host, and even if it could it would be extremely slow to do this for
1000s of hosts and potentially multiple devices per host.
> > 
> > The userspace should treat ANY of below conditions as two devices not
> > compatible:
> > - any one of the two devices does not have a migration_version attribute
> > - error when reading from migration_version attribute of one device
> > - error when writing migration_version string of one device to
> >   migration_version attribute of the other device
> > 
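for reference, the read-and-test rules listed above could be sketched in python
roughly as below. this assumes the two attribute paths are passed in by the
caller and are reachable from the same process, which is exactly the part that
does not hold for a scheduler. the function name is made up.

```python
import os


def is_migration_compatible(src_attr, dst_attr):
    """Apply the rules above: any missing attribute or I/O error means 'no'."""
    try:
        with open(src_attr, "rb") as src:
            version = src.read()
    except OSError:
        return False  # source attribute missing, or error while reading it
    try:
        # No O_CREAT: a missing target attribute must fail, not be created.
        fd = os.open(dst_attr, os.O_WRONLY)
    except OSError:
        return False  # target attribute missing or not writable
    try:
        os.write(fd, version)  # on sysfs the driver may reject this write
    except OSError:
        return False
    finally:
        os.close(fd)
    return True
```

on ordinary files the write always succeeds, so this only models the error
handling. the actual accept/reject decision would be made by the vendor driver
behind the target's attribute.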
> > The string read from migration_version attribute is defined by