Re: [Xen-devel] QEMU XenServer/XenProject Working group meeting 29th September 2016

2016-10-20 Thread Stefano Stabellini
On Thu, 20 Oct 2016, Lars Kurth wrote:
> > On 18 Oct 2016, at 20:54, Stefano Stabellini  wrote:
> > 
> > I think these kinds of calls should be announced on xen-devel before they
> > happen, to give a chance to other people to participate (I cannot
> > promise I would have participated but it is the principle that counts).
> > 
> > If I missed the announcement, I apologize.
> 
> Stefano, the meeting started off as an internal meeting to brainstorm and
> share experiences and challenges we have with QEMU amongst different Citrix
> teams, with a view to getting a wider dialog started. Maybe we are at the
> stage where it makes sense to open it up.

No worries, I didn't mean to pick on you guys. As I wrote, I am not sure
I would actually have participated, but I think the meeting would work
better for you and the Xen community if it were open. In fact, I think
that we don't have enough open meetings in the Xen community in general.
For your information, Julien and I are going to start organizing one for
Xen on ARM soon, with the intention of making it a regular monthly
meeting.


> > On Fri, 14 Oct 2016, Jennifer Herbert wrote:
> >> XenStore
> >> 
> >> 
> >> For the non-pv part of QEMU, XenStore is only used in two places.
> >> There is the DM state, and the physmap mechanism.  Although there is a
> >> vague plan for replacing the physmap mechanism, it is some way off.
> >> 
> >> The DM state key is used for knowing when the qemu process is running
> >> etcetera. QMP would seem to be an option to replace it; however, there
> >> is no (nice) way to wait on a socket until it has been opened.  One
> >> solution might be to use Xenstore to let you know the QMP sockets
> >> were available, before QEMU drops privileges, and then QMP could be
> >> used to know QEMU is in the running state.
> >> 
> >> To avoid the need to use xs-restrict, you would need to both replace
> >> physmap and rework the QEMU startup procedure. The use of xs-restrict
> >> would be more expedient, and does not look to need that much work.
> >> 
> >> Discussion was had over how secure it would be to allow a guest access
> >> to these Xenstore keys - it was concluded that a guest could mostly
> >> only mess itself up.  If a guest attempted to prevent itself from being
> >> migrated, the toolstack would time it out, and could kill it.
> >> 
> >> There followed a discussion on the Xenbus protocol, and additions
> >> needed.  The aim is to merely restrict the permission for the command
> >> to that of the guest whose domID you provide.  It was proposed that
> >> it uses the header as is, with its 16 bytes, with the command
> >> 'one-time-restrict', and then the payload would have two additional
> >> fields at the start.  These two fields would correspond to the domid to
> >> restrict as, and the real command. Transaction ID and tags would be
> >> taken from the real header.
> >> 
> >> Although inter domain xs-restrict is not specifically needed for this
> >> project, it is thought it might be a blocking item for upstream
> >> acceptance.  It is thought these changes would not require that much
> >> work to implement, and may be useful in other use cases. Only a few
> >> changes to QEMU would be needed, and libxl should be able to track
> >> QEMU versions.  Ian Jackson volunteered to look at this, with David
> >> helping with the kernel bits.  Ian won't have time to look at this
> >> until after Xen 4.8 is released.
> >> 
> >> There was discussion about what may fail once privileges are taken
> >> away, which would include CDs and PCI passthrough.  It is thought the
> >> full list can only be known by trying.  Not everything needs to work
> >> for acceptance upstream, such as PCI passthrough.  If such an
> >> incompatible feature is needed, restrictions can be turned off.  These
> >> problems can be fixed in a later phase, with CDs likely being at the
> >> top of the list.
> > 
> > One thing to note is that xs-restrict is unimplemented in cxenstored.
> > 
> > 
> >> disaggregation
> >> ==============
> >> 
> >> A disaggregation proposal which had previously been posted to a QEMU
> >> forum was discussed.  It was not previously accepted by all. The big
> >> question was how to separate the device models from the machine, with
> >> a particular point of contention being around PIIX and the idea of
> >> starting a QEMU instance without one.
> > 
> > Right. In particular I tend to agree with the other QEMU maintainers
> > when they say: why ask for a PIIX3 compatible machine, when actually you
> > don't want to be PIIX3 compatible?
> > 
> > 
> >> The general desire from us is
> >> we want to have a specific device emulated and nothing else.
> > 
> > This is really not possible with QEMU, because QEMU is a machine
> > emulator, not a device emulator. BTW who wants this? I mean, why is this
> > part of the QEMU depriv discussion? It is not necessary. I think what we
> > want for QEMU depriv is to be able to build a QEMU PV machine with just
> > the PV backends in

Re: [Xen-devel] QEMU XenServer/XenProject Working group meeting 29th September 2016

2016-10-20 Thread Lars Kurth

> On 18 Oct 2016, at 20:54, Stefano Stabellini  wrote:
> 
> I think these kinds of calls should be announced on xen-devel before they
> happen, to give a chance to other people to participate (I cannot
> promise I would have participated but it is the principle that counts).
> 
> If I missed the announcement, I apologize.

Stefano, the meeting started off as an internal meeting to brainstorm and share
experiences and challenges we have with QEMU amongst different Citrix teams,
with a view to getting a wider dialog started. Maybe we are at the stage where
it makes sense to open it up.

> On Fri, 14 Oct 2016, Jennifer Herbert wrote:
>> XenStore
>> 
>> 
>> For the non-pv part of QEMU, XenStore is only used in two places.
>> There is the DM state, and the physmap mechanism.  Although there is a
>> vague plan for replacing the physmap mechanism, it is some way off.
>> 
>> The DM state key is used for knowing when the qemu process is running
>> etcetera. QMP would seem to be an option to replace it; however, there
>> is no (nice) way to wait on a socket until it has been opened.  One
>> solution might be to use Xenstore to let you know the QMP sockets
>> were available, before QEMU drops privileges, and then QMP could be
>> used to know QEMU is in the running state.
>> 
>> To avoid the need to use xs-restrict, you would need to both replace
>> physmap and rework the QEMU startup procedure. The use of xs-restrict
>> would be more expedient, and does not look to need that much work.
>> 
>> Discussion was had over how secure it would be to allow a guest access
>> to these Xenstore keys - it was concluded that a guest could mostly
>> only mess itself up.  If a guest attempted to prevent itself from being
>> migrated, the toolstack would time it out, and could kill it.
>> 
>> There followed a discussion on the Xenbus protocol, and additions
>> needed.  The aim is to merely restrict the permission for the command
>> to that of the guest whose domID you provide.  It was proposed that
>> it uses the header as is, with its 16 bytes, with the command
>> 'one-time-restrict', and then the payload would have two additional
>> fields at the start.  These two fields would correspond to the domid to
>> restrict as, and the real command. Transaction ID and tags would be
>> taken from the real header.
>> 
>> Although inter domain xs-restrict is not specifically needed for this
>> project, it is thought it might be a blocking item for upstream
>> acceptance.  It is thought these changes would not require that much
>> work to implement, and may be useful in other use cases. Only a few
>> changes to QEMU would be needed, and libxl should be able to track
>> QEMU versions.  Ian Jackson volunteered to look at this, with David
>> helping with the kernel bits.  Ian won't have time to look at this
>> until after Xen 4.8 is released.
>> 
>> There was discussion about what may fail once privileges are taken
>> away, which would include CDs and PCI passthrough.  It is thought the
>> full list can only be known by trying.  Not everything needs to work
>> for acceptance upstream, such as PCI passthrough.  If such an
>> incompatible feature is needed, restrictions can be turned off.  These
>> problems can be fixed in a later phase, with CDs likely being at the
>> top of the list.
> 
> One thing to note is that xs-restrict is unimplemented in cxenstored.
> 
> 
>> disaggregation
>> ==============
>> 
>> A disaggregation proposal which had previously been posted to a QEMU
>> forum was discussed.  It was not previously accepted by all. The big
>> question was how to separate the device models from the machine, with
>> a particular point of contention being around PIIX and the idea of
>> starting a QEMU instance without one.
> 
> Right. In particular I tend to agree with the other QEMU maintainers
> when they say: why ask for a PIIX3 compatible machine, when actually you
> don't want to be PIIX3 compatible?
> 
> 
>> The general desire from us is
>> we want to have a specific device emulated and nothing else.
> 
> This is really not possible with QEMU, because QEMU is a machine
> emulator, not a device emulator. BTW who wants this? I mean, why is this
> part of the QEMU depriv discussion? It is not necessary. I think what we
> want for QEMU depriv is to be able to build a QEMU PV machine with just
> the PV backends in it, which is attainable with the current
> architecture. I know there are use cases for having an emulator of just
> one device, but I don't think they should be confused with the more
> important underlying issue here, which is QEMU running with full
> privileges.
> 
> 
>> It is
>> suggested you would have a software interface between each device that
>> looked like a software version of PCI.  The PIIX device could be
>> attached to the CPU via this pseudo PCI interface.  This would fit in
>> well with how IOREQ server and IOMMU work.  Although this sounds like
>> a large architectural change is wanted, it's suggested that actually
>> it's just
>> that w

Re: [Xen-devel] QEMU XenServer/XenProject Working group meeting 29th September 2016

2016-10-18 Thread Stefano Stabellini
I think these kinds of calls should be announced on xen-devel before they
happen, to give a chance to other people to participate (I cannot
promise I would have participated but it is the principle that counts).

If I missed the announcement, I apologize.


On Fri, 14 Oct 2016, Jennifer Herbert wrote:
> XenStore
> 
> 
> For the non-pv part of QEMU, XenStore is only used in two places.
> There is the DM state, and the physmap mechanism.  Although there is a
> vague plan for replacing the physmap mechanism, it is some way off.
> 
> The DM state key is used for knowing when the qemu process is running
> etcetera. QMP would seem to be an option to replace it; however, there
> is no (nice) way to wait on a socket until it has been opened.  One
> solution might be to use Xenstore to let you know the QMP sockets
> were available, before QEMU drops privileges, and then QMP could be
> used to know QEMU is in the running state.
> 
> To avoid the need to use xs-restrict, you would need to both replace
> physmap and rework the QEMU startup procedure. The use of xs-restrict
> would be more expedient, and does not look to need that much work.
> 
> Discussion was had over how secure it would be to allow a guest access
> to these Xenstore keys - it was concluded that a guest could mostly
> only mess itself up.  If a guest attempted to prevent itself from being
> migrated, the toolstack would time it out, and could kill it.
> 
> There followed a discussion on the Xenbus protocol, and additions
> needed.  The aim is to merely restrict the permission for the command
> to that of the guest whose domID you provide.  It was proposed that
> it uses the header as is, with its 16 bytes, with the command
> 'one-time-restrict', and then the payload would have two additional
> fields at the start.  These two fields would correspond to the domid to
> restrict as, and the real command. Transaction ID and tags would be
> taken from the real header.
> 
> Although inter domain xs-restrict is not specifically needed for this
> project, it is thought it might be a blocking item for upstream
> acceptance.  It is thought these changes would not require that much
> work to implement, and may be useful in other use cases. Only a few
> changes to QEMU would be needed, and libxl should be able to track
> QEMU versions.  Ian Jackson volunteered to look at this, with David
> helping with the kernel bits.  Ian won't have time to look at this
> until after Xen 4.8 is released.
> 
> There was discussion about what may fail once privileges are taken
> away, which would include CDs and PCI passthrough.  It is thought the
> full list can only be known by trying.  Not everything needs to work
> for acceptance upstream, such as PCI passthrough.  If such an
> incompatible feature is needed, restrictions can be turned off.  These
> problems can be fixed in a later phase, with CDs likely being at the
> top of the list.

One thing to note is that xs-restrict is unimplemented in cxenstored.
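
For readers trying to picture the 'one-time-restrict' framing described in
the notes above, here is a rough sketch of how such a message might be packed
and unpacked. It is only an illustration under several assumptions: the
16-byte header is taken to be the usual xenstore wire header of four 32-bit
fields (type, req_id, tx_id, len); the two new payload fields are assumed to
be 32-bit values, which the notes do not specify; the command number
XS_ONE_TIME_RESTRICT is made up, since no value was proposed; and the helper
names wrap_restricted/unwrap_restricted are hypothetical.

```python
import struct

# Xenstore wire header: type, req_id, tx_id, len (4 x uint32 = 16 bytes).
HEADER = struct.Struct("=IIII")

# ASSUMPTION: the two extra payload fields (domid to restrict as, real
# command) are packed as 32-bit integers. The meeting notes give no widths.
PREFIX = struct.Struct("=II")

# HYPOTHETICAL command number for 'one-time-restrict'; none was proposed.
XS_ONE_TIME_RESTRICT = 0x7FFF

# Illustrative value for an ordinary read request; check xs_wire.h before
# relying on it.
XS_READ = 2


def wrap_restricted(domid, real_type, req_id, tx_id, body):
    """Wrap an ordinary request so it runs with the permissions of domid."""
    payload = PREFIX.pack(domid, real_type) + body
    # req_id and tx_id stay in the outer ("real") header, as in the notes.
    header = HEADER.pack(XS_ONE_TIME_RESTRICT, req_id, tx_id, len(payload))
    return header + payload


def unwrap_restricted(msg):
    """Return (domid, real_type, req_id, tx_id, body) from a wrapped message."""
    msg_type, req_id, tx_id, length = HEADER.unpack_from(msg)
    if msg_type != XS_ONE_TIME_RESTRICT:
        raise ValueError("not a one-time-restrict message")
    payload = msg[HEADER.size:HEADER.size + length]
    if len(payload) != length or length < PREFIX.size:
        raise ValueError("truncated message")
    domid, real_type = PREFIX.unpack_from(payload)
    return domid, real_type, req_id, tx_id, payload[PREFIX.size:]
```

Under this layout the daemon would check the outer command, peel off the two
prefix fields, and then process the remaining payload as the real command
with the permissions of the given domid, leaving the header untouched.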

 
> disaggregation
> ==============
> 
> A disaggregation proposal which had previously been posted to a QEMU
> forum was discussed.  It was not previously accepted by all. The big
> question was how to separate the device models from the machine, with
> a particular point of contention being around PIIX and the idea of
> starting a QEMU instance without one.

Right. In particular I tend to agree with the other QEMU maintainers
when they say: why ask for a PIIX3 compatible machine, when actually you
don't want to be PIIX3 compatible?


> The general desire from us is
> we want to have a specific device emulated and nothing else.

This is really not possible with QEMU, because QEMU is a machine
emulator, not a device emulator. BTW who wants this? I mean, why is this
part of the QEMU depriv discussion? It is not necessary. I think what we
want for QEMU depriv is to be able to build a QEMU PV machine with just
the PV backends in it, which is attainable with the current
architecture. I know there are use cases for having an emulator of just
one device, but I don't think they should be confused with the more
important underlying issue here, which is QEMU running with full
privileges.


> It is
> suggested you would have a software interface between each device that
> looked like a software version of PCI.  The PIIX device could be
> attached to the CPU via this pseudo PCI interface.  This would fit in
> well with how IOREQ server and IOMMU work.  Although this sounds like a
> large architectural change is wanted, it's suggested that actually it's
> just that we're asking them to take a different stability and
> plug-ability posture on the interfaces they already have.
> 
> This architectural issue is the cause behind lots of little
> annoyances, which have been going on for years. Xen is having to make
> up lots of strange stuff to keep QEMU happy, and there is confusion
> over memory ownership.  Fixing the architecture  should make our lives
> much easier.  These architectural issues are also m