Re: [Xen-devel] [PATCH v5 2/2] docs/designs: Add a design document for migration of xenstore data

2020-03-05 Thread Durrant, Paul
> -Original Message-
> From: Julien Grall 
> Sent: 04 March 2020 18:32
> To: Durrant, Paul ; xen-devel@lists.xenproject.org
> Cc: Andrew Cooper ; George Dunlap 
> ; Ian
> Jackson ; Jan Beulich ; Konrad 
> Rzeszutek Wilk
> ; Stefano Stabellini ; Wei 
> Liu 
> Subject: RE: [EXTERNAL][PATCH v5 2/2] docs/designs: Add a design document for 
> migration of xenstore
> data
> 
> CAUTION: This email originated from outside of the organization. Do not click 
> links or open
> attachments unless you can confirm the sender and know the content is safe.
> 
> 
> 
> Hi Paul,
> 
> On 13/02/2020 10:53, Paul Durrant wrote:
> > This patch details proposes extra migration data and xenstore protocol
> > extensions to support non-cooperative live migration of guests.
> >
> > Signed-off-by: Paul Durrant 
> > ---
> > Cc: Andrew Cooper 
> > Cc: George Dunlap 
> > Cc: Ian Jackson 
> > Cc: Jan Beulich 
> > Cc: Julien Grall 
> > Cc: Konrad Rzeszutek Wilk 
> > Cc: Stefano Stabellini 
> > Cc: Wei Liu 
> >
> > v5:
> >   - Add QUIESCE
> >   - Make semantics of  in GET_DOMAIN_WATCHES more clear
> >
> > v4:
> >   - Drop the restrictions on special paths
> >
> > v3:
> >   - New in v3
> > ---
> >   docs/designs/xenstore-migration.md | 136 +
> >   1 file changed, 136 insertions(+)
> >   create mode 100644 docs/designs/xenstore-migration.md
> >
> > diff --git a/docs/designs/xenstore-migration.md 
> > b/docs/designs/xenstore-migration.md
> > new file mode 100644
> > index 00..5cfe2d9a7d
> > --- /dev/null
> > +++ b/docs/designs/xenstore-migration.md
> > @@ -0,0 +1,136 @@
> > +# Xenstore Migration
> > +
> > +## Background
> > +
> > +The design for *Non-Cooperative Migration of Guests*[1] explains that extra
> > +save records are required in the migrations stream to allow a guest running
> > +PV drivers to be migrated without its co-operation. Moreover the save
> > +records must include details of registered xenstore watches as well as
> > +content; information that cannot currently be recovered from `xenstored`,
> > +and hence some extension to the xenstore protocol[2] will also be required.
> > +
> > +The *libxenlight Domain Image Format* specification[3] already defines a
> > +record type `EMULATOR_XENSTORE_DATA` but this is not suitable for
> > +transferring xenstore data pertaining to the domain directly as it is
> > +specified such that keys are relative to the path
> > +`/local/domain/$dm_domid/device-model/$domid`. Thus it is necessary to
> > +define at least one new save record type.
> > +
> > +## Proposal
> > +
> > +### New Save Record
> > +
> > +A new mandatory record type should be defined within the libxenlight Domain
> > +Image Format:
> > +
> > +`0x0007: DOMAIN_XENSTORE_DATA`
> > +
> > +The format of each of these new records should be as follows:
> > +
> > +
> > +```
> > +0 1 2 3 4 5 6 7 octet
> > ++++
> > +| type   | record specific data   |
> > +++|
> > +...
> > ++-+
> > +```
> > +
> > +
> > +| Field | Description |
> > +|---|---|
> 
> Did you indend to add more - so | is on the same column as the onter lines?
> 

Yep, cut'n'paste error.

> > +| `type` | 0x: invalid |
> > +|| 0x0001: node data |
> > +|| 0x0002: watch data |
> 
> Should not the last | be some of the columns on all the lines?
> 
> > +|| 0x0003 - 0x: reserved for future use |
> 
> Looking at the spec, the command TRANSACTION_END *must* be used with an
> existing transaction. As a guest would be migrate to a new domain, the
> transaction ID would now be invalid.
> 
> I understand that xenstored is able to cope with it, but such behavior
> is not described in the spec. So I am not sure we can expect a guest to
> cope with an error value other than the ones described for the command.
> 

And (as we discussed offline) there would be an issue if the migrated guest 
started a new transaction before completing one that was started pre-migration, 
as the ids may clash. So, we are going to need a record to transfer open 
transaction ids so that we can reserve them in the receiving xenstored.

> > +
> > +
> > +where data is always in the form of a NUL separated and termin

Re: [Xen-devel] [PATCH v3 5/6] mm: add 'is_special_page' macro...

2020-03-05 Thread Durrant, Paul
> -Original Message-
> From: Tamas K Lengyel 
> Sent: 05 March 2020 15:10
> To: pdurr...@amzn.com
> Cc: Xen-devel ; Durrant, Paul 
> ; Jan Beulich
> ; Andrew Cooper ; Wei Liu 
> ; Roger Pau Monné
> ; George Dunlap ; Ian Jackson
> ; Julien Grall ; Konrad Rzeszutek 
> Wilk
> ; Stefano Stabellini ; Tim 
> Deegan 
> Subject: RE: [EXTERNAL][PATCH v3 5/6] mm: add 'is_special_page' macro...
> 
> CAUTION: This email originated from outside of the organization. Do not click 
> links or open
> attachments unless you can confirm the sender and know the content is safe.
> 
> 
> 
> On Thu, Mar 5, 2020 at 5:45 AM  wrote:
> >
> > From: Paul Durrant 
> >
> > ... to cover xenheap and PGC_extra pages.
> >
> > PGC_extra pages are intended to hold data structures that are associated
> > with a domain and my be mapped by that domain. They should not be treated
> > as 'normal' guest pages (i.e. RAM or page tables). Hence, in many cases
> > where code currently tests is_xen_heap_page() it should also check for
> > the PGC_extra bit in 'count_info'.
> >
> > This patch therefore defines is_special_page() to cover both cases and
> > converts tests if is_xen_heap_page() to is_special_page() where
> > appropriate.
> 
> In context of VM forking, are these pages only used by some type of PV
> mechanism? If not, would we need to get them copied somehow or are
> these setup during the regular createdomain routine? Can they be
> copied on-demand, ie. do these pages pass a p2m_is_ram() check?

PGC_extra domheap pages are intended as direct replacements for shared xenheap 
pages and should be treated the same way. Thus they do not form part of the 
migration stream. Their p2m type depends entirely on how they are added to the 
p2m, as it is for any other page.

  Paul

> 
> Thanks,
> Tamas
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v3] x86: irq: Do not BUG_ON multiple unbind calls for shared pirqs

2020-03-05 Thread Durrant, Paul
> -Original Message-
> From: Xen-devel  On Behalf Of Jan 
> Beulich
> Sent: 05 March 2020 09:37
> To: Gautam, Varad 
> Cc: xen-devel@lists.xenproject.org; Andrew Cooper 
> ; Julien Grall
> ; Roger Pau Monné 
> Subject: Re: [Xen-devel] [PATCH v3] x86: irq: Do not BUG_ON multiple unbind 
> calls for shared pirqs
> 
> On 29.01.2020 12:47, Jan Beulich wrote:
> > On 29.01.2020 11:30, Roger Pau Monné wrote:
> >> Hello,
> >>
> >> Thanks for the patch! Next time could you please try to reply to the
> >> previous questions before sending a new version:
> >>
> >> https://lists.xenproject.org/archives/html/xen-devel/2019-07/msg00257.html
> >>
> >> On Wed, Jan 29, 2020 at 10:28:07AM +0100, Varad Gautam wrote:
> >>> XEN_DOMCTL_destroydomain creates a continuation if domain_kill -ERESTARTS.
> >>> In that scenario, it is possible to receive multiple _pirq_guest_unbind
> >>> calls for the same pirq from domain_kill, if the pirq has not yet been
> >>> removed from the domain's pirq_tree, as:
> >>>   domain_kill()
> >>> -> domain_relinquish_resources()
> >>>   -> pci_release_devices()
> >>> -> pci_clean_dpci_irq()
> >>>   -> pirq_guest_unbind()
> >>> -> __pirq_guest_unbind()
> >>>
> >>> For a shared pirq (nr_guests > 1), the first call would zap the current
> >>> domain from the pirq's guests[] list, but the action handler is never 
> >>> freed
> >>> as there are other guests using this pirq. As a result, on the second 
> >>> call,
> >>> __pirq_guest_unbind searches for the current domain which has been removed
> >>> from the guests[] list, and hits a BUG_ON.
> >>>
> >>> Make __pirq_guest_unbind safe to be called multiple times by letting xen
> >>> continue if a shared pirq has already been unbound from this guest. The
> >>> PIRQ will be cleaned up from the domain's pirq_tree during the destruction
> >>> in complete_domain_destroy anyways.
> >>
> >> So AFAICT this is because pt_pirq_softirq_active() returns true in
> >> pci_clean_dpci_irq() and hence the iteration is stopped and
> >> hvm_domain_irq(d)->dpci is not set to NULL.
> >>
> >> Would it be possible to clean the already processed IRQs from the
> >> domain pirq_tree?
> >
> > This might work, perhaps by way of invoking unmap_domain_pirq()
> > right after pirq_guest_unbind(), as long as hvm_dirq_assist() (as
> > called from dpci_softirq()) can be made skip all actual work it
> > means to do in such a case. Unfortunately the two ->masked fields
> > acted upon are different between __pirq_guest_unbind() and
> > hvm_dirq_assist().
> 
> Ping? Unless I hear back soon, I'm afraid I'm going to drop this
> patch from my "pending" mail folder, as not being agreed whether
> to stick to the current version or whether to go this alternative
> route. A more "natural" approach to fixing the issue would be
> quite nice, after all.

I'll try to pick this up tomorrow as I helped diagnose the problem being fixed.

  Paul

> 
> Jan
> 
> ___
> Xen-devel mailing list
> Xen-devel@lists.xenproject.org
> https://lists.xenproject.org/mailman/listinfo/xen-devel
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v3 5/6] mm: add 'is_special_page' macro...

2020-03-06 Thread Durrant, Paul
> -Original Message-
> From: detected-as-s...@amazon.com  On Behalf Of 
> Alan Robinson
> Sent: 06 March 2020 07:02
> To: pdurr...@amzn.com
> Cc: xen-devel@lists.xenproject.org; Stefano Stabellini 
> ; Julien Grall
> ; Wei Liu ; Konrad Rzeszutek Wilk 
> ; Andrew Cooper
> ; Durrant, Paul ; Ian 
> Jackson
> ; George Dunlap ; Tim 
> Deegan ; Tamas
> K Lengyel ; Jan Beulich ; Roger Pau 
> Monné
> 
> Subject: RE: [EXTERNAL][Xen-devel] [PATCH v3 5/6] mm: add 'is_special_page' 
> macro...
> 
> CAUTION: This email originated from outside of the organization. Do not click 
> links or open
> attachments unless you can confirm the sender and know the content is safe.
> 
> 
> 
> A typo...
> 
> On Thu, Mar 05, 2020 at 01:45:03PM +0100, pdurr...@amzn.com wrote:
> >
> > PGC_extra pages are intended to hold data structures that are associated
> > with a domain and my be mapped by that domain. They should not be treated
> 
> s/my/may/

Good spot. Will fix. Thanks,

  Paul

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 2/2] domain: use PGC_extra domheap page for shared_info

2020-03-06 Thread Durrant, Paul
> -Original Message-
> From: David Woodhouse 
> Sent: 06 March 2020 11:53
> To: Jan Beulich ; Durrant, Paul 
> Cc: jul...@xen.org; andrew.coop...@citrix.com; sstabell...@kernel.org; 
> konrad.w...@oracle.com;
> volodymyr_babc...@epam.com; ian.jack...@eu.citrix.com; w...@xen.org; 
> george.dun...@citrix.com; xen-
> de...@lists.xenproject.org
> Subject: RE: [EXTERNAL][PATCH 2/2] domain: use PGC_extra domheap page for 
> shared_info
> 
> On Fri, 2020-03-06 at 12:37 +0100, Jan Beulich wrote:
> > I've started looking at the latest version of Paul's series, but I'm
> > still struggling to see the picture: There's no true distinction
> > between Xen heap and domain heap on x86-64 (except on very large
> > systems). Therefore it is unclear to me what "those pages" is actually
> > referring to above. Surely new Xen can't be given any pages in use
> > _in any way_ by old Xen, no matter whether it's ones assigned to
> > domains, or ones used internally to (old) Xen.
> 
> Old and new Xen do not coexist. There is a kexec (via kexec_reloc.S and
> purgatory) from old to new.
> 
> There are some pages which new Xen MUST NOT scribble on, because they
> actually belong to the domains being preserved. That includes the EPT
> (or at least IOMMU) page tables.
> 
> I suppose new Xen also mustn't scribble on the pages in which old Xen
> has placed the migration information for those domains either. At
> least, not until it's consumed the data.
> 
> Anything else, however, is fine for new Xen to scribble on. Fairly much
> anything that the old Xen had allocated from its xenheap (and not
> subsequently shared to a guest, qv) is no longer required and can be
> treated as free memory by the new Xen, which now owns the machine.
> 

... so getting rid of shared xenheap pages altogether just makes life easier.

  Paul

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v3 3/6] x86 / pv: do not treat PGC_extra pages as RAM when constructing dom0

2020-03-06 Thread Durrant, Paul
> -Original Message-
> From: Xen-devel  On Behalf Of Jan 
> Beulich
> Sent: 06 March 2020 11:56
> To: pdurr...@amzn.com
> Cc: xen-devel@lists.xenproject.org; Durrant, Paul ; 
> Roger Pau Monné
> ; Wei Liu ; Andrew Cooper 
> 
> Subject: Re: [Xen-devel] [PATCH v3 3/6] x86 / pv: do not treat PGC_extra 
> pages as RAM when
> constructing dom0
> 
> On 05.03.2020 13:45, pdurr...@amzn.com wrote:
> > --- a/xen/arch/x86/pv/dom0_build.c
> > +++ b/xen/arch/x86/pv/dom0_build.c
> > @@ -792,6 +792,10 @@ int __init dom0_construct_pv(struct domain *d,
> >  {
> >  mfn = mfn_x(page_to_mfn(page));
> >  BUG_ON(SHARED_M2P(get_gpfn_from_mfn(mfn)));
> > +
> > +if ( page->count_info & PGC_extra )
> > +continue;
> 
> This surely is a pattern, i.e. there are more similar changes to
> make: tboot_gen_domain_integrity() e.g. ignores d->xenpage_list,
> and hence with the goal of converting the shared info page would
> also want adjustment. For dump_numa() it may be less important,
> but it would still look more correct if it too got changed.
> audit_p2m() might apparently complain about such pages (and
> hence might be a problem with the one PGC_extra page VMX domains
> now have). And this is only from me looking at
> page_list_for_each(..., &d->page_list) constructs; who knows
> what else there is.
> 

Those are dealt with by the is_special_page() patch later on I think. It didn't 
seem appropriate to use that macro here though since we know pages on the page 
list cannot be xenheap pages.

  Paul
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v3 2/6] x86 / p2m: remove page_list check in p2m_alloc_table

2020-03-06 Thread Durrant, Paul
> -Original Message-
> From: Jan Beulich 
> Sent: 06 March 2020 11:46
> To: pdurr...@amzn.com
> Cc: xen-devel@lists.xenproject.org; Durrant, Paul ; 
> Andrew Cooper
> ; George Dunlap ; Wei 
> Liu ; Roger Pau
> Monné 
> Subject: Re: [PATCH v3 2/6] x86 / p2m: remove page_list check in 
> p2m_alloc_table
> 
> On 05.03.2020 13:45, pdurr...@amzn.com wrote:
> > From: Paul Durrant 
> >
> > There does not seem to be any justification for refusing to create the
> > domain's p2m table simply because it may have assigned pages.
> 
> I think there is: If any such allocation had happened before, how
> would it be represented in the domain's p2m?

Insertion into the p2m is a separate action from page allocation. Why should 
they be linked?

> 
> > Particularly
> > it prevents the prior allocation of PGC_extra pages.
> 
> That's unfortunate, but will need taking care of differently then:
> 
> > --- a/xen/arch/x86/mm/p2m.c
> > +++ b/xen/arch/x86/mm/p2m.c
> > @@ -695,14 +695,6 @@ int p2m_alloc_table(struct p2m_domain *p2m)
> >
> >  p2m_lock(p2m);
> >
> > -if ( p2m_is_hostp2m(p2m)
> > - && !page_list_empty(&d->page_list) )
> > -{
> > -P2M_ERROR("dom %d already has memory allocated\n", d->domain_id);
> > -p2m_unlock(p2m);
> > -return -EINVAL;
> > -}
> 
> Instead of checking the list to be empty, how about checking
> domain_tot_pages() to return zero?

I could do that, and in fact my original code did, but with more consideration 
the whole test just didn't make sense to me. Yes, clearly the p2m has to be 
there before pages can be added into it, but I can't see any reason why you 
couldn't even allocate the entire guest RAM, then create the p2m and then add 
the pages into it.

  Paul

> 
> Jan
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v3 2/6] x86 / p2m: remove page_list check in p2m_alloc_table

2020-03-06 Thread Durrant, Paul
> -Original Message-
> From: Jan Beulich 
> Sent: 06 March 2020 12:47
> To: Durrant, Paul 
> Cc: pdurr...@amzn.com; xen-devel@lists.xenproject.org; Andrew Cooper 
> ;
> George Dunlap ; Wei Liu ; Roger Pau 
> Monné 
> Subject: Re: [PATCH v3 2/6] x86 / p2m: remove page_list check in 
> p2m_alloc_table
> 
> On 06.03.2020 13:07, Durrant, Paul wrote:
> >> -Original Message-
> >> From: Jan Beulich 
> >> Sent: 06 March 2020 11:46
> >> To: pdurr...@amzn.com
> >> Cc: xen-devel@lists.xenproject.org; Durrant, Paul ; 
> >> Andrew Cooper
> >> ; George Dunlap ; Wei 
> >> Liu ; Roger
> Pau
> >> Monné 
> >> Subject: Re: [PATCH v3 2/6] x86 / p2m: remove page_list check in 
> >> p2m_alloc_table
> >>
> >> On 05.03.2020 13:45, pdurr...@amzn.com wrote:
> >>> From: Paul Durrant 
> >>>
> >>> There does not seem to be any justification for refusing to create the
> >>> domain's p2m table simply because it may have assigned pages.
> >>
> >> I think there is: If any such allocation had happened before, how
> >> would it be represented in the domain's p2m?
> >
> > Insertion into the p2m is a separate action from page allocation. Why 
> > should they be linked?
> 
> They are, because of how XENMEM_populate_physmap works. Yes,
> they _could_ be separate steps, but that's only a theoretical
> consideration.

Then surely the check should be in the XENMEM_populate_physmap code?

> 
> >>> Particularly
> >>> it prevents the prior allocation of PGC_extra pages.
> >>
> >> That's unfortunate, but will need taking care of differently then:
> >>
> >>> --- a/xen/arch/x86/mm/p2m.c
> >>> +++ b/xen/arch/x86/mm/p2m.c
> >>> @@ -695,14 +695,6 @@ int p2m_alloc_table(struct p2m_domain *p2m)
> >>>
> >>>  p2m_lock(p2m);
> >>>
> >>> -if ( p2m_is_hostp2m(p2m)
> >>> - && !page_list_empty(&d->page_list) )
> >>> -{
> >>> -P2M_ERROR("dom %d already has memory allocated\n", d->domain_id);
> >>> -p2m_unlock(p2m);
> >>> -return -EINVAL;
> >>> -}
> >>
> >> Instead of checking the list to be empty, how about checking
> >> domain_tot_pages() to return zero?
> >
> > I could do that, and in fact my original code did, but with more
> > consideration the whole test just didn't make sense to me. Yes,
> > clearly the p2m has to be there before pages can be added into it,
> > but I can't see any reason why you couldn't even allocate the
> > entire guest RAM, then create the p2m and then add the pages into
> > it.
> 
> Right - more hypercalls (XENMEM_increase_reservation + operations
> like XENMAPSPACE_gmfn), and hence slower overall domain creation.
> Plus - XENMEM_increase_reservation is not very useful for
> translated domains, as it won't return the MFNs allocated, and
> there's no way to specify where they should appear in GFN space.
> Hence in practice I don't see how this whole operation could
> work without XENMEM_populate_physmap.
> 

Oh, it would mean a big change in the tools etc. so I'm not saying it's a good 
idea or even possible at the moment. I was just pointing out that, as far as 
the lower layers of code in Xen go, page allocation and p2m insertion are 
distinct actions.

  Paul

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v3] x86: irq: Do not BUG_ON multiple unbind calls for shared pirqs

2020-03-06 Thread Durrant, Paul
> -Original Message-
> From: Xen-devel  On Behalf Of 
> Durrant, Paul
> Sent: 05 March 2020 17:37
> To: Jan Beulich ; Gautam, Varad 
> Cc: xen-devel@lists.xenproject.org; Roger Pau Monné ; 
> Julien Grall
> ; Andrew Cooper 
> Subject: Re: [Xen-devel] [PATCH v3] x86: irq: Do not BUG_ON multiple unbind 
> calls for shared pirqs
> 
> > -Original Message-
> > From: Xen-devel  On Behalf Of Jan 
> > Beulich
> > Sent: 05 March 2020 09:37
> > To: Gautam, Varad 
> > Cc: xen-devel@lists.xenproject.org; Andrew Cooper 
> > ; Julien Grall
> > ; Roger Pau Monné 
> > Subject: Re: [Xen-devel] [PATCH v3] x86: irq: Do not BUG_ON multiple unbind 
> > calls for shared pirqs
> >
> > On 29.01.2020 12:47, Jan Beulich wrote:
> > > On 29.01.2020 11:30, Roger Pau Monné wrote:
> > >> Hello,
> > >>
> > >> Thanks for the patch! Next time could you please try to reply to the
> > >> previous questions before sending a new version:
> > >>
> > >> https://lists.xenproject.org/archives/html/xen-devel/2019-07/msg00257.html
> > >>
> > >> On Wed, Jan 29, 2020 at 10:28:07AM +0100, Varad Gautam wrote:
> > >>> XEN_DOMCTL_destroydomain creates a continuation if domain_kill 
> > >>> -ERESTARTS.
> > >>> In that scenario, it is possible to receive multiple _pirq_guest_unbind
> > >>> calls for the same pirq from domain_kill, if the pirq has not yet been
> > >>> removed from the domain's pirq_tree, as:
> > >>>   domain_kill()
> > >>> -> domain_relinquish_resources()
> > >>>   -> pci_release_devices()
> > >>> -> pci_clean_dpci_irq()
> > >>>   -> pirq_guest_unbind()
> > >>> -> __pirq_guest_unbind()
> > >>>
> > >>> For a shared pirq (nr_guests > 1), the first call would zap the current
> > >>> domain from the pirq's guests[] list, but the action handler is never 
> > >>> freed
> > >>> as there are other guests using this pirq. As a result, on the second 
> > >>> call,
> > >>> __pirq_guest_unbind searches for the current domain which has been 
> > >>> removed
> > >>> from the guests[] list, and hits a BUG_ON.
> > >>>
> > >>> Make __pirq_guest_unbind safe to be called multiple times by letting xen
> > >>> continue if a shared pirq has already been unbound from this guest. The
> > >>> PIRQ will be cleaned up from the domain's pirq_tree during the 
> > >>> destruction
> > >>> in complete_domain_destroy anyways.
> > >>
> > >> So AFAICT this is because pt_pirq_softirq_active() returns true in
> > >> pci_clean_dpci_irq() and hence the iteration is stopped and
> > >> hvm_domain_irq(d)->dpci is not set to NULL.
> > >>
> > >> Would it be possible to clean the already processed IRQs from the
> > >> domain pirq_tree?
> > >
> > > This might work, perhaps by way of invoking unmap_domain_pirq()
> > > right after pirq_guest_unbind(), as long as hvm_dirq_assist() (as
> > > called from dpci_softirq()) can be made skip all actual work it
> > > means to do in such a case. Unfortunately the two ->masked fields
> > > acted upon are different between __pirq_guest_unbind() and
> > > hvm_dirq_assist().
> >
> > Ping? Unless I hear back soon, I'm afraid I'm going to drop this
> > patch from my "pending" mail folder, as not being agreed whether
> > to stick to the current version or whether to go this alternative
> > route. A more "natural" approach to fixing the issue would be
> > quite nice, after all.
> 
> I'll try to pick this up tomorrow as I helped diagnose the problem being 
> fixed.
> 

I'm looking at this now and I am finding the code very confusing, but I think 
we could achieve the cleanup by passing back the irq index out of 
__pirq_guest_unbind() such that pirq_guest_unbind() can call 
clean_domain_irq_pirq(). I still haven't got much of a clue as to how all the 
data structures hang together though.

  Paul


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v4 6/6] domain: use PGC_extra domheap page for shared_info

2020-03-09 Thread Durrant, Paul
> -Original Message-
> From: p...@xen.org 
> Sent: 09 March 2020 09:35
> To: xen-devel@lists.xenproject.org
> Cc: Durrant, Paul ; Paul Durrant ; 
> Stefano Stabellini
> ; Julien Grall ; Volodymyr Babchuk
> ; Andrew Cooper ; 
> George Dunlap
> ; Ian Jackson ; Jan 
> Beulich ;
> Konrad Rzeszutek Wilk ; Wei Liu 
> Subject: [PATCH v4 6/6] domain: use PGC_extra domheap page for shared_info
> 
> From: Paul Durrant 
> 
> Currently shared_info is a shared xenheap page but shared xenheap pages
> complicate future plans for live-update of Xen so it is desirable to,
> where possible, not use them [1]. This patch therefore converts shared_info
> into a PGC_extra domheap page. This does entail freeing shared_info during
> domain_relinquish_resources() rather than domain_destroy() so care is
> needed to avoid de-referencing a NULL shared_info pointer hence some
> extra checks of 'is_dying' are needed.
> 
> NOTE: For Arm, the call to free_shared_info() in arch_domain_destroy() is
>   left in place since it is idempotent and called in the error path for
>   arch_domain_create().
> 
> [1] See 
> https://lists.xenproject.org/archives/html/xen-devel/2020-02/msg02018.html
> 
> Signed-off-by: Paul Durrant 

I realise I forgot to fold in the new dump function for shared_info (needed 
since it will no longer feature in the dump of xen pages) so I will send a v5 
of this series shortly.

  Paul

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2 2/2] libxl: make creation of xenstore 'suspend event channel' node optional...

2020-03-12 Thread Durrant, Paul
Ping?

> -Original Message-
> From: pdurr...@amzn.com 
> Sent: 05 March 2020 12:14
> To: xen-devel@lists.xenproject.org
> Cc: Durrant, Paul ; Ian Jackson 
> ; Wei Liu
> ; Andrew Cooper ; George Dunlap 
> ; Jan
> Beulich ; Julien Grall ; Stefano Stabellini
> ; Anthony PERARD 
> Subject: [PATCH v2 2/2] libxl: make creation of xenstore 'suspend event 
> channel' node optional...
> 
> From: Paul Durrant 
> 
> ... and make the top level 'device' node in xenstore writable by the
> guest
> 
> The purpose and semantics of the suspend event channel node are explained
> in xenstore-paths.pandoc [1]. It was originally introduced in xend by
> commit 17636f47a474 "Teach xc_save to use event-channel-based domain
> suspend if available.". Note that, because, the top-level frontend
> 'device' node was created writable by the guest in xend, there was no
> need to explicitly create the 'suspend-event-channel' node as writable
> node.
> 
> However, libxl creates the 'device' node as read-only by the guest and so
> explicit creation of the 'suspend-event-channel' node is necessary to make
> it usable. This unfortunately has the side-effect of making some old
> Windows PV drivers [2] cease to function. This is because they scan the top
> level 'device' node, find the 'suspend' node and expect it to contain the
> usual sub-nodes describing a PV frontend. When this is found not to be the
> case, enumeration ceases and (because the 'suspend' node is observed before
> the 'vbd' node) no system disk is enumerated. Windows will then crash with
> bugcheck code 0x7B.
> 
> This patch adds a boolean 'suspend_event_channel' field into
> libxl_create_info to control whether the xenstore node is created and a
> similarly named option in xl.cfg which, for compatibility with previous
> libxl behaviour, defaults to true. It also makes the top level device node
> writable, as xend did, and updates xenstore-paths.pandoc to say that the
> suspend event channel node may not exist and that the guest may create it
> if it does not exist.
> 
> [1] 
> https://xenbits.xen.org/gitweb/?p=xen.git;a=blob;f=docs/misc/xenstore-paths.pandoc;hb=HEAD#l177
> [2] 
> https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/5/html/para-
> virtualized_windows_drivers_guide/sect-para-virtualized_windows_drivers_guide-
> installing_and_configuring_the_para_virtualized_drivers-installing_the_para_virtualized_drivers
> 
> NOTE: While adding the new LIBXL_HAVE_CREATEINFO_SUSPEND_EVENT_CHANNEL
>   definition into libxl.h, this patch corrects the previous stanza
>   which erroneously implies libxl_domain_create_info is a function.
> 
> Signed-off-by: Paul Durrant 
> ---
> Cc: Ian Jackson 
> Cc: Wei Liu 
> Cc: Andrew Cooper 
> Cc: George Dunlap 
> Cc: Jan Beulich 
> Cc: Julien Grall 
> Cc: Stefano Stabellini 
> Cc: Anthony PERARD 
> 
> v2:
>  - Update xenstore-paths.pandoc and squash patch #3
> ---
>  docs/man/xl.cfg.5.pod.in|  7 +++
>  docs/misc/xenstore-paths.pandoc |  7 ---
>  tools/libxl/libxl.h | 13 -
>  tools/libxl/libxl_create.c  | 14 ++
>  tools/libxl/libxl_types.idl |  1 +
>  tools/xl/xl_parse.c |  3 +++
>  6 files changed, 37 insertions(+), 8 deletions(-)
> 
> diff --git a/docs/man/xl.cfg.5.pod.in b/docs/man/xl.cfg.5.pod.in
> index 0cad561375..5f476f1e1d 100644
> --- a/docs/man/xl.cfg.5.pod.in
> +++ b/docs/man/xl.cfg.5.pod.in
> @@ -668,6 +668,13 @@ file.
> 
>  =back
> 
> +=item B
> +
> +Create the xenstore path for the domain's suspend event channel. The
> +existence of this path can cause problems with older PV drivers running
> +in the guest. If this option is not specified then it will default to
> +B.
> +
>  =back
> 
>  =head2 Devices
> diff --git a/docs/misc/xenstore-paths.pandoc b/docs/misc/xenstore-paths.pandoc
> index e2ab5da54e..a8eecdb7ed 100644
> --- a/docs/misc/xenstore-paths.pandoc
> +++ b/docs/misc/xenstore-paths.pandoc
> @@ -176,10 +176,11 @@ The size of the video RAM this domain is configured 
> with.
> 
>   ~/device/suspend/event-channel = ""|EVTCHN [w]
> 
> -The domain's suspend event channel. The toolstack will create this
> -path with an empty value which the guest may choose to overwrite.
> +The domain's suspend event channel. The toolstack may create this
> +path with an empty value which the guest may choose to overwrite. If
> +the path does not exist then the guest may create it.
> 
> -If the guest overwrites this, it will be with the number of an unbound
&g

Re: [Xen-devel] [PATCH v6 5/5] domain: use PGC_extra domheap page for shared_info

2020-03-17 Thread Durrant, Paul
> -Original Message-
> From: Jan Beulich 
> Sent: 17 March 2020 13:14
> To: p...@xen.org
> Cc: xen-devel@lists.xenproject.org; Durrant, Paul ; 
> Stefano Stabellini
> ; Julien Grall ; Volodymyr Babchuk
> ; Andrew Cooper ; 
> George Dunlap
> ; Ian Jackson ; Konrad 
> Rzeszutek Wilk
> ; Wei Liu 
> Subject: RE: [EXTERNAL] [PATCH v6 5/5] domain: use PGC_extra domheap page for 
> shared_info
> 
> CAUTION: This email originated from outside of the organization. Do not click 
> links or open
> attachments unless you can confirm the sender and know the content is safe.
> 
> 
> 
> On 10.03.2020 18:49, p...@xen.org wrote:
> > From: Paul Durrant 
> >
> > Currently shared_info is a shared xenheap page but shared xenheap pages
> > complicate future plans for live-update of Xen so it is desirable to,
> > where possible, not use them [1]. This patch therefore converts shared_info
> > into a PGC_extra domheap page. This does entail freeing shared_info during
> > domain_relinquish_resources() rather than domain_destroy() so care is
> > needed to avoid de-referencing a NULL shared_info pointer hence some
> > extra checks of 'is_dying' are needed.
> 
> If there's going to be agreement to follow this route, the implementation,
> with a really minor cosmetic adjustment - see below -, looks okay to me.
> Nevertheless I continue to dislike the implication from the extra care
> that's now needed. As I think I have said before, I'd like to have at
> least one other REST maintainer's opinion here.
> 

Ok, fair enough.

> > --- a/xen/arch/x86/domain.c
> > +++ b/xen/arch/x86/domain.c
> > @@ -260,9 +260,12 @@ void dump_pageframe_info(struct domain *d)
> >
> >  page_list_for_each ( page, &d->extra_page_list )
> >  {
> > -printk("ExtraPage %p: caf=%08lx, taf=%" PRtype_info "\n",
> > +const char *tag = mfn_eq(page_to_mfn(page), d->shared_info.mfn) ?
> > +"[SHARED INFO]" : "";
> 
> Please can this be " [SHARED INFO]" with ...
> 
> > +printk("ExtraPage %p: caf=%08lx, taf=%" PRtype_info " %s\n",
> 
> ... the blank before the final %s dropped here, such that we won't
> have a trailing blank in the output?

Sure.

  Paul

> 
> Jan
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v6 5/5] domain: use PGC_extra domheap page for shared_info

2020-03-24 Thread Durrant, Paul
> -Original Message-
[snip]
> > Currently shared_info is a shared xenheap page but shared xenheap pages
> > complicate future plans for live-update of Xen so it is desirable to,
> > where possible, not use them [1]. This patch therefore converts shared_info
> > into a PGC_extra domheap page. This does entail freeing shared_info during
> > domain_relinquish_resources() rather than domain_destroy() so care is
> > needed to avoid de-referencing a NULL shared_info pointer hence some
> > extra checks of 'is_dying' are needed.
> >
> > NOTE: For Arm, the call to free_shared_info() in arch_domain_destroy() is
> >left in place since it is idempotent and called in the error path for
> >arch_domain_create().
> 
> The approach looks good to me. I have one comment below.
> 

Thanks.

> [...]
> 
> > diff --git a/xen/common/domain.c b/xen/common/domain.c
> > index 4ef0d3b21e..4f3266454f 100644
> > --- a/xen/common/domain.c
> > +++ b/xen/common/domain.c
> > @@ -1651,24 +1651,44 @@ int continue_hypercall_on_cpu(
> >
> >   int alloc_shared_info(struct domain *d, unsigned int memflags)
> >   {
> > -if ( (d->shared_info.virt = alloc_xenheap_pages(0, memflags)) == NULL )
> > +struct page_info *pg;
> > +
> > +pg = alloc_domheap_page(d, MEMF_no_refcount | memflags);
> > +if ( !pg )
> >   return -ENOMEM;
> >
> > -d->shared_info.mfn = virt_to_mfn(d->shared_info.virt);
> > +if ( !get_page_and_type(pg, d, PGT_writable_page) )
> > +{
> > +/*
> > + * The domain should not be running at this point so there is
> > + * no way we should reach this error path.
> > + */
> > +ASSERT_UNREACHABLE();
> > +return -ENODATA;
> > +}
> > +
> > +d->shared_info.mfn = page_to_mfn(pg);
> > +d->shared_info.virt = __map_domain_page_global(pg);
> 
> __map_domain_page_global() is not guaranteed to succeed. For instance,
> on Arm32 this will be a call to vmap().
> 
> So you want to check the return.
> 

Ok, I'll add a check.

As Jan discovered, I messed up the version numbering so there is already a v7 
series where I dropped this patch (until I can group it with conversion of 
other shared xenheap pages).

  Paul

> Cheers,
> 
> --
> Julien Grall


RE: [PATCH 02/10] viridian: move IPI hypercall implementation into separate function

2020-11-12 Thread Durrant, Paul
> -Original Message-
> From: Jan Beulich 
> Sent: 12 November 2020 08:38
> To: Paul Durrant 
> Cc: Durrant, Paul ; Wei Liu ; Andrew 
> Cooper
> ; Roger Pau Monné ; 
> xen-devel@lists.xenproject.org
> Subject: RE: [EXTERNAL] [PATCH 02/10] viridian: move IPI hypercall 
> implementation into separate
> function
> 
> CAUTION: This email originated from outside of the organization. Do not click 
> links or open
> attachments unless you can confirm the sender and know the content is safe.
> 
> 
> 
> On 11.11.2020 21:07, Paul Durrant wrote:
> > From: Paul Durrant 
> >
> > This patch moves the implementation of HVCALL_SEND_IPI that is currently
> > inline in viridian_hypercall() into a new hvcall_ipi() function.
> >
> > The new function returns Xen errno values similarly to hvcall_flush(). Hence
> > the errno translation code in viridial_hypercall() is generalized, removing
> > the need for the local 'status' variable.
> >
> > NOTE: The formatting of the 'out' label also corrected as per CODING_STYLE
> 
> How about correcting the adjacent switch() at the same time as well?
> 

Sure.

> >   and the code is adjusted to avoid a register copy-back if 'mode' is
> >   neither 8 nor 4.
> 
> While you mention it here, isn't this an unrelated change wanting
> its own justification?
> 

It was such small mod that I folded it but maybe it would be best to break it 
out into a separate patch and also do the format adjustment there.

  Paul

> Jan


RE: [PATCH 03/10] viridian: introduce a per-cpu hypercall_vpmask and accessor functions...

2020-11-19 Thread Durrant, Paul
> -Original Message-
> From: Jan Beulich 
> Sent: 19 November 2020 16:41
> To: p...@xen.org
> Cc: Durrant, Paul ; 'Wei Liu' ; 'Andrew 
> Cooper'
> ; 'Roger Pau Monné' ; 
> xen-devel@lists.xenproject.org
> Subject: RE: [EXTERNAL] [PATCH 03/10] viridian: introduce a per-cpu 
> hypercall_vpmask and accessor
> functions...
> 
> CAUTION: This email originated from outside of the organization. Do not click 
> links or open
> attachments unless you can confirm the sender and know the content is safe.
> 
> 
> 
> On 19.11.2020 17:02, Paul Durrant wrote:
> >> From: Jan Beulich  >> Sent: 12 November 2020 08:46
> >>
> >> On 11.11.2020 21:07, Paul Durrant wrote:
> >>> --- a/xen/arch/x86/hvm/viridian/viridian.c
> >>> +++ b/xen/arch/x86/hvm/viridian/viridian.c
> >>> @@ -507,15 +507,41 @@ void viridian_domain_deinit(struct domain *d)
> >>>  XFREE(d->arch.hvm.viridian);
> >>>  }
> >>>
> >>> +struct hypercall_vpmask {
> >>> +DECLARE_BITMAP(mask, HVM_MAX_VCPUS);
> >>> +};
> >>> +
> >>> +static DEFINE_PER_CPU(struct hypercall_vpmask, hypercall_vpmask);
> >>> +
> >>> +static void vpmask_empty(struct hypercall_vpmask *vpmask)
> >>
> >> const?
> >
> > Yes, I suppose that's ook for all these since the outer struct is
> > not changing... It's a bit misleading though.
> 
> I'd be curious to learn about that "misleading" aspect.
> 

Because the function is modifying (zero-ing) the bitmap... so implying the mask 
is const is measleading.

  Paul

> Jan


RE: [PATCH v2 01/12] viridian: don't blindly write to 32-bit registers is 'mode' is invalid

2020-11-20 Thread Durrant, Paul
> -Original Message-
> From: Jan Beulich 
> Sent: 20 November 2020 14:20
> To: Paul Durrant 
> Cc: Durrant, Paul ; Wei Liu ; Andrew 
> Cooper
> ; Roger Pau Monné ; 
> xen-devel@lists.xenproject.org
> Subject: RE: [EXTERNAL] [PATCH v2 01/12] viridian: don't blindly write to 
> 32-bit registers is 'mode'
> is invalid
> 
> CAUTION: This email originated from outside of the organization. Do not click 
> links or open
> attachments unless you can confirm the sender and know the content is safe.
> 
> 
> 
> On 20.11.2020 10:48, Paul Durrant wrote:
> > From: Paul Durrant 
> >
> > If hvm_guest_x86_mode() returns something other than 8 or 4 then
> > viridian_hypercall() will return immediately but, on the way out, will write
> > back status as if 'mode' was 4. This patch simply makes it leave the 
> > registers
> > alone.
> 
> IOW 16-bit protected mode and real mode aren't allowed to make
> hypercalls (supported also be the earlier switch() in the
> function)?

I don't think running enlightened versions of OS/2 1.3 is really a use case :-)

> But then, to achieve what you want, wouldn't it be
> more direct to simply "return HVM_HCALL_completed;" straight
> from that earlier switch()'s default case? At which point the
> switch() you modify could become if/else? Anyway - you're the
> maintainer, I'm just wondering ...
> 

It could be done that way but I prefer the exit path via goto.

  Paul

> Jan


RE: [PATCH v2 05/12] viridian: use hypercall_vpmask in hvcall_ipi()

2020-11-20 Thread Durrant, Paul
> -Original Message-
> From: Jan Beulich 
> Sent: 20 November 2020 15:09
> To: Paul Durrant 
> Cc: Durrant, Paul ; Wei Liu ; Andrew 
> Cooper
> ; Roger Pau Monné ; 
> xen-devel@lists.xenproject.org
> Subject: RE: [EXTERNAL] [PATCH v2 05/12] viridian: use hypercall_vpmask in 
> hvcall_ipi()
> 
> CAUTION: This email originated from outside of the organization. Do not click 
> links or open
> attachments unless you can confirm the sender and know the content is safe.
> 
> 
> 
> On 20.11.2020 10:48, Paul Durrant wrote:
> > --- a/xen/arch/x86/hvm/viridian/viridian.c
> > +++ b/xen/arch/x86/hvm/viridian/viridian.c
> > @@ -551,6 +551,25 @@ static bool vpmask_test(const struct hypercall_vpmask 
> > *vpmask,
> >  return test_bit(vp, vpmask->mask);
> >  }
> >
> > +static unsigned int vpmask_first(struct hypercall_vpmask *vpmask)
> 
> Now this and ...
> 
> > +{
> > +return find_first_bit(vpmask->mask, HVM_MAX_VCPUS);
> > +}
> > +
> > +static unsigned int vpmask_next(struct hypercall_vpmask *vpmask, unsigned 
> > int vp)
> 
> ... this should really have pointers to const as parameters.
> 
> > @@ -631,13 +650,21 @@ static int hvcall_flush(union hypercall_input *input,
> >  return 0;
> >  }
> >
> > +static void send_ipi(struct hypercall_vpmask *vpmask, uint8_t vector)
> 
> And I guess this one should, too.
> 

True, they can be const.

  Paul

> Jan


RE: [PATCH v3 01/13] viridian: don't blindly write to 32-bit registers is 'mode' is invalid

2020-11-24 Thread Durrant, Paul
> -Original Message-
> From: Jan Beulich 
> Sent: 25 November 2020 07:52
> To: Paul Durrant 
> Cc: Durrant, Paul ; Wei Liu ; Andrew 
> Cooper
> ; Roger Pau Monné ; 
> xen-devel@lists.xenproject.org
> Subject: RE: [EXTERNAL] [PATCH v3 01/13] viridian: don't blindly write to 
> 32-bit registers is 'mode'
> is invalid
> 
> CAUTION: This email originated from outside of the organization. Do not click 
> links or open
> attachments unless you can confirm the sender and know the content is safe.
> 
> 
> 
> On 24.11.2020 20:07, Paul Durrant wrote:
> > From: Paul Durrant 
> >
> > If hvm_guest_x86_mode() returns something other than 8 or 4 then
> > viridian_hypercall() will return immediately but, on the way out, will write
> > back status as if 'mode' was 4. This patch simply makes it leave the 
> > registers
> > alone.
> >
> > NOTE: The formatting of the 'out' label and the switch statement are also
> >   adjusted as per CODING_STYLE.
> 
> Partly only as far as the latter goes:
> 
> > --- a/xen/arch/x86/hvm/viridian/viridian.c
> > +++ b/xen/arch/x86/hvm/viridian/viridian.c
> > @@ -692,13 +692,14 @@ int viridian_hypercall(struct cpu_user_regs *regs)
> >  break;
> >  }
> >
> > -out:
> > + out:
> >  output.result = status;
> >  switch (mode) {
> 
> This would want to be
> 
> switch ( mode )
> {
> 

Oh, yes.

> I guess this could easily be taken care of while committing.

Thanks,

  Paul

> 
> Jan


RE: [PATCH v4 1/3] domctl: introduce a new domain create flag, XEN_DOMCTL_CDF_disable_fifo, ...

2020-11-25 Thread Durrant, Paul
> -Original Message-
> From: Jan Beulich 
> Sent: 25 November 2020 09:36
> To: Andrew Cooper 
> Cc: Durrant, Paul ; Elnikety, Eslam 
> ; Christian Lindig
> ; David Scott ; Ian Jackson 
> ; Wei
> Liu ; George Dunlap ; Julien Grall 
> ; Stefano
> Stabellini ; xen-devel@lists.xenproject.org; Paul 
> Durrant 
> Subject: RE: [EXTERNAL] [PATCH v4 1/3] domctl: introduce a new domain create 
> flag,
> XEN_DOMCTL_CDF_disable_fifo, ...
> 
> CAUTION: This email originated from outside of the organization. Do not click 
> links or open
> attachments unless you can confirm the sender and know the content is safe.
> 
> 
> 
> On 25.11.2020 10:20, Jan Beulich wrote:
> > On 24.11.2020 20:17, Paul Durrant wrote:
> >> --- a/xen/include/public/domctl.h
> >> +++ b/xen/include/public/domctl.h
> >> @@ -70,9 +70,11 @@ struct xen_domctl_createdomain {
> >>  #define XEN_DOMCTL_CDF_iommu  (1U<<_XEN_DOMCTL_CDF_iommu)
> >>  #define _XEN_DOMCTL_CDF_nested_virt   6
> >>  #define XEN_DOMCTL_CDF_nested_virt(1U << _XEN_DOMCTL_CDF_nested_virt)
> >> +#define _XEN_DOMCTL_CDF_disable_fifo  7
> >> +#define XEN_DOMCTL_CDF_disable_fifo   (1U << _XEN_DOMCTL_CDF_disable_fifo)
> >
> > Despite getting longish, I think this needs "evtchn" somewhere in
> > the name. To keep size bounded, maybe XEN_DOMCTL_CDF_no_fifo_evtchn?
> >

I'm ok with that name; I'll send a v5.

> >>  /* Max XEN_DOMCTL_CDF_* constant.  Used for ABI checking. */
> >> -#define XEN_DOMCTL_CDF_MAX XEN_DOMCTL_CDF_nested_virt
> >> +#define XEN_DOMCTL_CDF_MAX XEN_DOMCTL_CDF_disable_fifo
> >
> > While not directly related to this patch, I'm puzzled by the
> > presence of this constant: I've not been able to find any use of
> > it. In particular you did have a need to modify
> > sanitise_domain_config().
> 
> So it was you to introduce this, right away without any user, in
> 7fb0e134f8c6 ("tools/ocaml: abi: Use formal conversion and check
> in more places"). The only reference is from what I regard as a
> comment (I don't speak any ocaml, so I may be wrong). Could you
> clarify why we need to maintain this constant?
> 

I can't remember the exact sequence of events but it became apparent at some 
point that the ocaml bindings were out of sync and they rely on a list of 
domain create flags where the number has to match the bit-shift value in 
domctl.h (among other things). Thus there is an auto-generated header called 
"xenctrl_abi_check.h" which is included by xenctrl_stubs.c. This header is 
generated from xenctrl.ml by the perl script "abi-check" and it relies the 
XEN_DOMCTL_CDF_MAX constant to form part of the checks it generates.

As an example, here is the generated header with this patch applied:

// found ocaml type x86_arch_emulation_flags at xenctrl.ml:38
BUILD_BUG_ON( XEN_X86_EMU_LAPIC  != (1u << 0)  );
BUILD_BUG_ON( XEN_X86_EMU_HPET   != (1u << 1)  );
BUILD_BUG_ON( XEN_X86_EMU_PM != (1u << 2)  );
BUILD_BUG_ON( XEN_X86_EMU_RTC!= (1u << 3)  );
BUILD_BUG_ON( XEN_X86_EMU_IOAPIC != (1u << 4)  );
BUILD_BUG_ON( XEN_X86_EMU_PIC!= (1u << 5)  );
BUILD_BUG_ON( XEN_X86_EMU_VGA!= (1u << 6)  );
BUILD_BUG_ON( XEN_X86_EMU_IOMMU  != (1u << 7)  );
BUILD_BUG_ON( XEN_X86_EMU_PIT!= (1u << 8)  );
BUILD_BUG_ON( XEN_X86_EMU_USE_PIRQ   != (1u << 9)  );
BUILD_BUG_ON( XEN_X86_EMU_VPCI   != (1u << 10) );
BUILD_BUG_ON( XEN_X86_EMU_ALL!= (1u << 11)-1u );
// found ocaml type domain_create_flag at xenctrl.ml:60
BUILD_BUG_ON( XEN_DOMCTL_CDF_hvm != (1u << 0)  );
BUILD_BUG_ON( XEN_DOMCTL_CDF_hap != (1u << 1)  );
BUILD_BUG_ON( XEN_DOMCTL_CDF_s3_integrity!= (1u << 2)  );
BUILD_BUG_ON( XEN_DOMCTL_CDF_oos_off != (1u << 3)  );
BUILD_BUG_ON( XEN_DOMCTL_CDF_xs_domain   != (1u << 4)  );
BUILD_BUG_ON( XEN_DOMCTL_CDF_iommu   != (1u << 5)  );
BUILD_BUG_ON( XEN_DOMCTL_CDF_nested_virt != (1u << 6)  );
BUILD_BUG_ON( XEN_DOMCTL_CDF_disable_fifo!= (1u << 7)  );
BUILD_BUG_ON( XEN_DOMCTL_CDF_MAX != (1u << 7)  );
// found ocaml type domain_create_iommu_opts at xenctrl.ml:70
BUILD_BUG_ON( XEN_DOMCTL_IOMMU_no_sharept!= (1u << 0)  );
BUILD_BUG_ON( XEN_DOMCTL_IOMMU_MAX   != (1u << 0)  );
// found ocaml type physinfo_cap_flag at xenctrl.ml:113
BUILD_BUG_ON( XEN_SYSCTL_PHYSCAP_hvm != (1u << 0)  );
BUILD_BUG_ON( XEN_SYSCTL_PHYSCAP_pv  != (1u << 1)  );
BUILD_BUG_ON( XEN_SYSCTL_PHYSCAP_directio!= (1u << 2)  );
BUILD_BUG_ON( XEN_SYSCTL_PHYSCAP_hap != (1u << 3)  );
BUILD_BUG_ON( XEN_SYSCTL_PHYSCAP_shadow  != (1u << 4)  );
BUILD_BUG_ON( XEN_SYSCTL_PHYSCAP_iommu_hap_pt_share != (1u << 5)  );
BUILD_BUG_ON( XEN_SYSCTL_PHYSCAP_MAX != (1u << 5)  );

  Paul


RE: [PATCH v4 1/3] domctl: introduce a new domain create flag, XEN_DOMCTL_CDF_disable_fifo, ...

2020-11-25 Thread Durrant, Paul
> -Original Message-
> From: Andrew Cooper 
> Sent: 25 November 2020 11:31
> To: Paul Durrant ; xen-devel@lists.xenproject.org
> Cc: Durrant, Paul ; Elnikety, Eslam 
> ; Christian Lindig
> ; David Scott ; Ian Jackson 
> ; Wei
> Liu ; George Dunlap ; Jan Beulich 
> ; Julien
> Grall ; Stefano Stabellini 
> Subject: RE: [EXTERNAL] [PATCH v4 1/3] domctl: introduce a new domain create 
> flag,
> XEN_DOMCTL_CDF_disable_fifo, ...
> 
> CAUTION: This email originated from outside of the organization. Do not click 
> links or open
> attachments unless you can confirm the sender and know the content is safe.
> 
> 
> 
> On 24/11/2020 19:17, Paul Durrant wrote:
> > diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h
> > index 666aeb71bf1b..70701c59d053 100644
> > --- a/xen/include/public/domctl.h
> > +++ b/xen/include/public/domctl.h
> > @@ -70,9 +70,11 @@ struct xen_domctl_createdomain {
> >  #define XEN_DOMCTL_CDF_iommu  (1U<<_XEN_DOMCTL_CDF_iommu)
> >  #define _XEN_DOMCTL_CDF_nested_virt   6
> >  #define XEN_DOMCTL_CDF_nested_virt(1U << _XEN_DOMCTL_CDF_nested_virt)
> > +#define _XEN_DOMCTL_CDF_disable_fifo  7
> > +#define XEN_DOMCTL_CDF_disable_fifo   (1U << _XEN_DOMCTL_CDF_disable_fifo)
> 
> The sense is backwards.  It should be a "permit the use of FIFO"
> control.  If the code had been written this way to begin with, the bug
> you found wouldn't have existed.
> 
> Given that there is not currently a way to disable FIFO, you can
> probably do without an enumeration of whether the hypervisor supports it
> or not.
> 

Ok, I can reverse the sense.

I found another one that we ought to control in a similar way... the per-cpu 
evtchn upcalls. AFAIK only the Windows PV drivers make use of it (and I can 
arrange to squash that with a registry flag) but it really falls into the same 
category as FIFO... so maybe we need a separate bit-field for these sorts of 
thing?

  Paul

> ~Andrew


RE: [PATCH v4 1/3] domctl: introduce a new domain create flag, XEN_DOMCTL_CDF_disable_fifo, ...

2020-11-25 Thread Durrant, Paul
> -Original Message-
> From: Jan Beulich 
> Sent: 25 November 2020 11:51
> To: p...@xen.org
> Cc: Durrant, Paul ; Elnikety, Eslam 
> ; 'Christian Lindig'
> ; 'David Scott' ; 'Ian Jackson' 
> ;
> 'Wei Liu' ; 'Andrew Cooper' ; 
> 'George Dunlap'
> ; 'Julien Grall' ; 'Stefano 
> Stabellini'
> ; xen-devel@lists.xenproject.org
> Subject: RE: [EXTERNAL] [PATCH v4 1/3] domctl: introduce a new domain create 
> flag,
> XEN_DOMCTL_CDF_disable_fifo, ...
> 
> CAUTION: This email originated from outside of the organization. Do not click 
> links or open
> attachments unless you can confirm the sender and know the content is safe.
> 
> 
> 
> On 25.11.2020 12:10, Paul Durrant wrote:
> >> From: Jan Beulich 
> >> Sent: 25 November 2020 09:20
> >>
> >> On 24.11.2020 20:17, Paul Durrant wrote:
> >>> From: Paul Durrant 
> >>>
> >>> ...to control the visibility of the FIFO event channel operations
> >>> (EVTCHNOP_init_control, EVTCHNOP_expand_array, and EVTCHNOP_set_priority) 
> >>> to
> >>> the guest.
> >>>
> >>> These operations were added to the public header in commit d2d50c2f308f
> >>> ("evtchn: add FIFO-based event channel ABI") and the first implementation
> >>> appeared in the two subsequent commits: edc8872aeb4a ("evtchn: implement
> >>> EVTCHNOP_set_priority and add the set_priority hook") and 88910061ec61
> >>> ("evtchn: add FIFO-based event channel hypercalls and port ops"). Prior to
> >>> that, a guest issuing those operations would receive a return value of
> >>> -ENOSYS (not implemented) from Xen. Guests aware of the FIFO operations 
> >>> but
> >>> running on an older (pre-4.4) Xen would fall back to using the 2-level 
> >>> event
> >>> channel interface upon seeing this return value.
> >>>
> >>> Unfortunately the uncontrolable appearance of these new operations in Xen 
> >>> 4.4
> >>> onwards has implications for hibernation of some Linux guests. During 
> >>> resume
> >>> from hibernation, there are two kernels involved: the "boot" kernel and 
> >>> the
> >>> "resume" kernel. The guest boot kernel may default to use FIFO operations 
> >>> and
> >>> instruct Xen via EVTCHNOP_init_control to switch from 2-level to FIFO. On 
> >>> the
> >>> other hand, the resume kernel keeps assuming 2-level, because it was 
> >>> hibernated
> >>> on a version of Xen that did not support the FIFO operations.
> >>
> >> And the alternative of the boot kernel issuing EVTCHNOP_reset has
> >> other unwanted consequences. Maybe worth mentioning here, as
> >> otherwise this would look like the obvious way to return to 2-level
> >> mode?
> >>
> >> Also, why can't the boot kernel be instructed to avoid engaging
> >> FIFO mode?
> >>
> >
> > Both of those are, of course, viable alternatives if the guest can be
> > modified. The problem we need to work around is guest that are already
> > out there and cannot be updated.
> 
> Making use of EVTCHNOP_reset indeed would require a change to the
> kernel. But Linux has a command line option to suppress use of
> FIFO event channels, so I can't see why the boot kernel couldn't
> be passed this flag without any modification to the binary.
> 

I'm sure that was considered and found not to be feasible in our use-case. 
(Likely the command line is as much baked into the guest image as the kernel 
itself).

> >>> To maintain compatibility it is necessary to make Xen behave as it did
> >>> before the new operations were added and hence the code in this patch 
> >>> ensures
> >>> that, if XEN_DOMCTL_CDF_disable_fifo is set, the FIFO event channel 
> >>> operations
> >>> will again result in -ENOSYS being returned to the guest.
> >>
> >> Are there indeed dependencies on the precise return value anywhere?
> >> If so, the generally inappropriate use (do_event_channel_op()'s
> >> default case really would also need switching) would want a brief
> >> comment, so it'll be understood by readers that this isn't code to
> >> derive other code from. If not, -EPERM or -EACCES perhaps?
> >>
> >
> > The patch, as stated, is reverting behaviour and so the -ENOSYS really
> > needs to sta

Re: [Xen-devel] [PATCH 4/6] domctl: set XEN_DOMCTL_createdomain 'rover' if valid domid is specified

2020-01-03 Thread Durrant, Paul
> -Original Message-
> From: Ian Jackson 
> Sent: 02 January 2020 17:25
> To: Durrant, Paul 
> Cc: xen-devel@lists.xenproject.org; Andrew Cooper
> ; George Dunlap ; Jan
> Beulich ; Julien Grall ; Konrad
> Rzeszutek Wilk ; Stefano Stabellini
> ; Wei Liu 
> Subject: Re: [PATCH 4/6] domctl: set XEN_DOMCTL_createdomain 'rover' if
> valid domid is specified
> 
> Paul Durrant writes ("[PATCH 4/6] domctl: set XEN_DOMCTL_createdomain
> 'rover' if valid domid is specified"):
> > The value of 'rover' is the value at which Xen will start searching for
> an
> > unused domid if none is specified. Currently it is only updated when a
> > domid is automatically chosen, rather than specified by the caller,
> which
> > makes it very hard to describe Xen's rationale in choosing domids in an
> > environment where some domain creations have specified domids and some
> > don't.
> > This patch always updates 'rover' after a successful creation, even in
> the
> > case that domid is specified by the caller. This ensures that, if Xen
> > automatically chooses a domid for a subsequent domain creation it will
> > always be the next available value after the domid of the most recently
> > created domain.
> 
> I'm not yet convinced this behaviour is better, but I'm open to
> persuasion.
> 
> The existing allocation system falls down in any case if the domid
> gets near to wrapping round.  If it doesn't, then without this patch
> it is possible to have two ranges of domids: automatically allocated
> ones, and statically allocated high ones.
> 
> With this patch, statically allocating a domid resets rover and causes
> the remaining bits of static space to be polluted.
> 
> What am I missing ?  What are the use cases here ?

The problem I was tackling was trying to document how Xen chooses a domid. E.g. 
it is not correct to say that it chooses the lowest numbered available domid. 
The best I could come up with was 'the next available domid after the last one 
it thought of' which is pretty poor... and also becomes harder to explain if 
some domids are not actually chosen by Xen, because the toolstack specified 
them.

The end game which this series is working towards is transparent migration 
which, because of the 'design' of PV protocols and certain hypercalls in the 
guest ABI, requires that the domid is persisted on migration. This patch is 
simply trying to lay groundwork for the co-existence of domains with specified 
domids and those with automatically allocated domids. 

  Paul

> 
> Thanks,
> Ian.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 5/6] libxl: allow creation of domains with specified or random domid

2020-01-03 Thread Durrant, Paul
> -Original Message-
> From: Ian Jackson 
> Sent: 02 January 2020 17:27
> To: Durrant, Paul 
> Cc: xen-devel@lists.xenproject.org; Wei Liu ; Anthony Perard
> 
> Subject: Re: [PATCH 5/6] libxl: allow creation of domains with specified
> or random domid
> 
> Paul Durrant writes ("[PATCH 5/6] libxl: allow creation of domains with
> specified or random domid"):
> > This patch modifies do_domain_create() to use the value of domid that is
> > passed in. A new 'special value' - RANDOM_DOMID - is added into the API
> > and this, INVALID_DOMID or any valid domid is passed unmodified to
> > libxl__domain_make(). Any other invalid domid value will cause an error.
> >
> > If RANDOM_DOMID is passed in then libxl__domain_make() will use
> > libxl__random_bytes() to choose a domid. If the chosen value is not a
> > valid domid then this step will be repeated. Once a valid value is
> chosen
> > it will be passed to xc_domain_create() but if this fails with an errno
> > value of EEXIST, a new random value will be chosen and the operation
> will
> > be retried.
> 
> What is the use case for this ?
> 

The use-case is trying to make sure that, across a pool of hosts, the 
likelihood of creating two domains with the same domid is small. Because a 
transparent migration requires that domid is persisted, migrations are more 
likely to fail (due to a domid already being in use) if a sequential scheme is 
used.

> I think using this is hazardous if you ever destroy domains, because
> it will lead to reuse of domids in circumstances[1] where right now
> that can't happen.
> 
> [1] Fewer than ~2^16 creations per Xen boot.

Agreed, but is that actually a problem? A domain must be fully destroyed (i.e. 
all page refs dropped) before its id will become available for re-use. What are 
we actually trying to avoid by not immediately recycling? I'd certainly be open 
to adding some sort of retirement list for domids that would prevent re-use 
within a specified time, if that would help.

  Paul

> 
> Ian.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 6/6] xl: allow specified domain id to be used for create, restore and migrate

2020-01-03 Thread Durrant, Paul
> -Original Message-
> From: Ian Jackson 
> Sent: 02 January 2020 17:30
> To: Durrant, Paul 
> Cc: xen-devel@lists.xenproject.org; Wei Liu 
> Subject: Re: [PATCH 6/6] xl: allow specified domain id to be used for
> create, restore and migrate
> 
> Paul Durrant writes ("[PATCH 6/6] xl: allow specified domain id to be used
> for create, restore and migrate"):
> > This patch adds the option to use a specified domain id to be used for
> > the create, restore and migrate lifecycle operations and documentation
> > thereof.
> 
> I approve of the idea that the xl user can specify the domid.  But:
> 
> Why should this be a command line argument to xl, rather than a domain
> config parameter ?  Obviously there needs to be a way to override the
> choice, especially to make localhost migration work, but there is
> already a way to specify extra config on domain create, at least.
> 

I debated which was best and came down on the side of a command line parameter 
becase I thought that chances of an admin wanting to tie a cfg to specified 
domid was small. But I guess the cfg can already specify a name, which needs to 
be overridden on a per-creation basis if an admin wants to use a common 
'template' cfg... so maybe that option is indeed more consistent.

  Paul

> Thanks,
> Ian.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] Community Call: Call for Agenda Items and call details for Jan 9, 16:00 - 17:00 UTC

2020-01-07 Thread Durrant, Paul
> -Original Message-
> From: Andrew Cooper 
> Sent: 07 January 2020 00:26
> To: Lars Kurth ; xen-devel  de...@lists.xenproject.org>
> Cc: Rian Quinn ; Daniel P. Smith
> ; Doug Goldstein ; Brian
> Woods ; Rich Persaud ;
> anastassios.na...@onapp.com; mirela.simono...@aggios.com;
> edgar.igles...@xilinx.com; Ji, John ;
> robin.randh...@arm.com; daniel.ki...@oracle.com; Amit Shah
> ; Matt Spencer ; Robert Townley
> ; Artem Mygaiev ; Varad
> Gautam ; Tamas K Lengyel
> ; Christopher Clark
> ; George Dunlap ;
> Stefano Stabellini ; lambert.oliv...@gmail.com;
> Ian Jackson ; vfac...@de.adit-jv.com; Kevin
> Pearson ; intel-...@intel.com; Jarvis
> Roach ; Juergen Gross ;
> Sergey Dyasli ; Durrant, Paul
> ; Julien Grall ; Jeff
> Kubascik ; Natarajan, Janakarajan
> ; Stewart Hildebrand
> ; Volodymyr Babchuk
> ; Woodhouse, David ; Roger
> Pau Monne 
> Subject: Re: [Xen-devel] Community Call: Call for Agenda Items and call
> details for Jan 9, 16:00 - 17:00 UTC
> 
> On 06/01/2020 19:56, Lars Kurth wrote:
> > Dear community members,
> >
> > I hope you all had a restful holiday period and a Happy New Year!
> >
> > Please send me agenda items for this Thursday's community call (we
> agreed to move it by 1 week) preferably by Wednesday!
> >
> > A draft agenda is
> at https://cryptpad.fr/pad/#/2/pad/edit/ERZtMYD5j6k0sv-NG6Htl-AJ/
> > Please add agenda items to the document or reply to this e-mail
> 
> I think it would be very helpful for the community in general to know
> any specific plans each of us have for the 4.14 timeframe.
> 
> I personally am aware of a fair quantity of work from various people,
> but it is clear that the community as a whole doesn't really have an
> idea of who is working on what.
> 
> My contribution to the discussion starts with
> https://lore.kernel.org/xen-devel/941cf23c-13ed-14a1-fd25-
> 45b001d95...@citrix.com/T/#u
> but I think it would be helpful if others gave at least a brief overview
> of any plans and whether they are intending the work to hit the next
> release, or whether it is more likely to be a future release.

Agreed. I need a baseline list of items to track for 4.14. 

  Paul
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 2/2] x86/hyperv: drop all __packed from hyperv-tlfs.h

2020-01-08 Thread Durrant, Paul
> -Original Message-
> From: Xen-devel  On Behalf Of Wei
> Liu
> Sent: 08 January 2020 12:27
> To: Paul Durrant 
> Cc: Wei Liu ; Wei Liu ; Andrew Cooper
> ; Michael Kelley ; Jan
> Beulich ; Xen Development List  de...@lists.xenproject.org>; Roger Pau Monné 
> Subject: Re: [Xen-devel] [PATCH 2/2] x86/hyperv: drop all __packed from
> hyperv-tlfs.h
> 
> On Wed, Jan 08, 2020 at 09:11:12AM +, Paul Durrant wrote:
> > On Tue, 7 Jan 2020 at 17:39, Wei Liu  wrote:
> > >
> > > All structures are already naturally aligned. Linux added those
> > > attributes out of paranoia.
> > >
> > > In Xen we've had instance we had to drop pointless __packed to placate
> > > gcc 9 (see ca9310b24e
> >
> > I think you should add:
> >
> > "x86/IO-APIC: fix build with gcc9"
> >
> > here.
> 
> I have fixed it up locally. Do you want me to resend these two patches
> just for this update?

As long as it is fixed when committed any way will do.

  Paul

> 
> Wei.
> 
> >
> > > ), it is better drop those attributes.
> > >
> > > Requested-by: Jan Beulich 
> > > Signed-off-by: Wei Liu 
> >
> >   Paul
> 
> ___
> Xen-devel mailing list
> Xen-devel@lists.xenproject.org
> https://lists.xenproject.org/mailman/listinfo/xen-devel
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH] tools/Rules.mk: fix distclean

2020-01-09 Thread Durrant, Paul
> -Original Message-
> From: Wei Liu 
> Sent: 09 January 2020 13:52
> To: Durrant, Paul 
> Cc: xen-devel@lists.xenproject.org; Ian Jackson
> ; Wei Liu ; Anthony PERARD
> 
> Subject: Re: [PATCH] tools/Rules.mk: fix distclean
> 
> On Thu, Jan 09, 2020 at 11:15:05AM +, Paul Durrant wrote:
> > Running 'make distclean' under tools will currently result in:
> >
> > tools/Rules.mk:245: *** You have to run ./configure before building or
> installing the tools.  Stop.
> >
> > This patch adds 'distclean', 'subdir-distclean%' and 'subdir-clean%' to
> > no-configure-targets, which allows 'make distclean' to run to
> completion.
> >
> > Signed-off-by: Paul Durrant 
> 
> Fixes: 00691c6c90b
> 
> Sorry for not noticing the breakage while reviewing that patch.
> 

Ok. I'm sure that could be added at commit if there are no other changes needed.

> Is there a way to pattern match all targets containing "clean"?
> 
> (Would have looked into it myself but -ETIME today)

I couldn't persuade filter to match against patterns with multiple % so this 
was the best I could come up with.

  Paul

> 
> > ---
> > Cc: Ian Jackson 
> > Cc: Wei Liu 
> > ---
> >  tools/Rules.mk | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/tools/Rules.mk b/tools/Rules.mk
> > index 31cf419ef4..52f47be3f8 100644
> > --- a/tools/Rules.mk
> > +++ b/tools/Rules.mk
> > @@ -239,7 +239,7 @@ subdir-all-% subdir-clean-% subdir-install-% subdir-
> uninstall-%: .phony
> >  subdir-distclean-%: .phony
> > $(MAKE) -C $* distclean
> >
> > -no-configure-targets := clean subtree-force-update-all %-dir-force-
> update
> > +no-configure-targets := distclean subdir-distclean% clean subdir-clean%
> subtree-force-update-all %-dir-force-update
> >  ifeq (,$(filter $(no-configure-targets),$(MAKECMDGOALS)))
> >  $(XEN_ROOT)/config/Tools.mk:
> > $(error You have to run ./configure before building or installing
> the tools)
> > --
> > 2.20.1
> >

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH] Introduce CHANGELOG.md

2020-01-10 Thread Durrant, Paul
> -Original Message-
> From: Jan Beulich 
> Sent: 10 January 2020 09:52
> To: Durrant, Paul 
> Cc: xen-devel@lists.xenproject.org; Andrew Cooper
> ; George Dunlap ;
> Ian Jackson ; Julien Grall ;
> Konrad Rzeszutek Wilk ; Stefano Stabellini
> ; Wei Liu ; Lars Kurth
> 
> Subject: Re: [PATCH] Introduce CHANGELOG.md
> 
> On 10.01.2020 10:12, Paul Durrant wrote:
> > --- /dev/null
> > +++ b/CHANGELOG.md
> > @@ -0,0 +1,14 @@
> > +# Changelog
> > +
> > +All notable changes to Xen will be documented in this file.
> 
> How do we qualify what's "notable" and what's not? IOW I wonder
> whether "All" should be dropped, or be replaced by "Some".
> 

Agreed that it's debatable. Perhaps just drop the 'All' and say:

'Notable changes to Xen will be documented in this file.'

?

Patch authors ought to update the file if they consider their contribution(s) 
notable but I'd also hope that maintainers will express an opinion as to 
whether something should be included/not included. It's not going to be 
fool-proof but I think it will be better than nothing.

  Paul

> Jan
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [RFC PATCH 1/3] x86/setup: Don't skip 2MiB underneath relocated Xen image

2020-01-10 Thread Durrant, Paul
> -Original Message-
> From: Xen-devel  On Behalf Of
> David Woodhouse
> Sent: 08 January 2020 17:25
> To: Xen-devel 
> Cc: Stefano Stabellini ; Julien Grall
> ; Wei Liu ; Konrad Rzeszutek Wilk
> ; George Dunlap ;
> Andrew Cooper ; p...@xen.org; Ian Jackson
> ; Jan Beulich ; Roger Pau
> Monné 
> Subject: [Xen-devel] [RFC PATCH 1/3] x86/setup: Don't skip 2MiB underneath
> relocated Xen image
> 
> From: David Woodhouse 
> 
> Set 'e' correctly to reflect the location that Xen is actually relocated
> to from its default 2MiB location. Not 2MiB below that.
> 
> Signed-off-by: David Woodhouse 
> ---
>  xen/arch/x86/setup.c | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
> index 501f3f5e4b..47e065e5fe 100644
> --- a/xen/arch/x86/setup.c
> +++ b/xen/arch/x86/setup.c
> @@ -1077,9 +1077,9 @@ void __init noreturn __start_xen(unsigned long
> mbi_p)
>  unsigned long pte_update_limit;
> 
>  /* Select relocation address. */
> -e = end - reloc_size;
> -xen_phys_start = e;
> -bootsym(trampoline_xen_phys_start) = e;
> +xen_phys_start = end - reloc_size;
> +e = xen_phys_start + XEN_IMG_OFFSET;
> +bootsym(trampoline_xen_phys_start) = xen_phys_start;
> 
>  /*
>   * No PTEs pointing above this address are candidates for
> relocation.

Do you not also need to adjust the setting of pte_update_limit that's just out 
of context below here?

  Paul
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2 4/6] libxl: allow creation of domains with a specified or random domid

2020-01-13 Thread Durrant, Paul
> -Original Message-
> From: jandr...@gmail.com 
> Sent: 13 January 2020 16:16
> To: Durrant, Paul 
> Cc: xen-devel ; Anthony PERARD
> ; Ian Jackson ; Wei
> Liu 
> Subject: Re: [Xen-devel] [PATCH v2 4/6] libxl: allow creation of domains
> with a specified or random domid
> 
> On Thu, Jan 9, 2020 at 6:50 AM Paul Durrant  wrote:
> >
> > This patch adds a 'domid' field to libxl_domain_create_info and then
> > modifies do_domain_create() to use that value if it is valid. Any valid
> > domid will be checked against the retired domid list before being passed
> > to libxl__domain_make().
> > If the domid value is invalid then Xen will choose the domid, as before,
> > unless the value is the new special RANDOM_DOMID value added to the API.
> > This value instructs libxl__domain_make() to select a random domid
> value,
> > check it for validity, verify it does not match a retired domain, and
> then
> > pass it to Xen's XEN_DOMCTL_createdomain operation. If Xen determines
> that
> > it co-incides with an existing domain, a new random value will be
> > selected and the operation will be re-tried.
> >
> > NOTE: libxl__logv() is also modified to only log valid domid values in
> >   messages rather than any domid, valid or otherwise, that is not
> >   INVALID_DOMID.
> >
> > Signed-off-by: Paul Durrant 
> > ---
> > Cc: Ian Jackson 
> > Cc: Wei Liu 
> > Cc: Anthony PERARD 
> >
> > v2:
> >  - Re-worked to use a value from libxl_domain_create_info
> > ---
> >  tools/libxl/libxl.h  |  9 +
> >  tools/libxl/libxl_create.c   | 32 +++-
> >  tools/libxl/libxl_internal.c |  2 +-
> >  tools/libxl/libxl_types.idl  |  1 +
> >  4 files changed, 42 insertions(+), 2 deletions(-)
> >
> 
> 
> 
> > diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
> > index 1835a5502c..ee76dee364 100644
> > --- a/tools/libxl/libxl_create.c
> > +++ b/tools/libxl/libxl_create.c
> > @@ -600,9 +600,39 @@ int libxl__domain_make(libxl__gc *gc,
> libxl_domain_config *d_config,
> >  goto out;
> >  }
> >
> > -ret = xc_domain_create(ctx->xch, domid, &create);
> > +if (libxl_domid_valid_guest(info->domid)) {
> > +*domid = info->domid;
> > +
> > +if (libxl__is_retired_domid(gc, *domid)) {
> > +LOGED(ERROR, *domid, "domain id is retired");
> > +rc = ERROR_FAIL;
> > +goto out;
> > +}
> > +} else if (info->domid == RANDOM_DOMID) {
> > +*domid = 0; /* Zero-out initial value */
> > +}
> > +
> > +for (;;) {
> > +if (info->domid == RANDOM_DOMID) {
> > +/* Randomize lower order bytes */
> > +ret = libxl__random_bytes(gc, (void *)domid,
> > +  sizeof(uint16_t));
> 
> Casting to void * assumes little endian.

I think that's a fairly safe assumption as far as Xen goes...

> Using a temporary uint16_t

...but, yes, that might be neater.

> would avoid that assumption.  Also, masking down to 0x7fff would clear
> the top bit which is never valid.

That seems like a bit of a layering violation and the check in 
libxl_domid_valid_guest() is going to cause a pretty fast turn round the loop 
if the top bit is set so masking is not going to gain that much.

  Paul

> 
> Regards,
> Jason
> 
> > +if (ret < 0)
> > +break;
> > +
> > +if (!libxl_domid_valid_guest(*domid) ||
> > +libxl__is_retired_domid(gc, *domid))
> > +continue;
> > +}
> > +
> > +ret = xc_domain_create(ctx->xch, domid, &create);
> > +if (ret == 0 || errno != EEXIST || info->domid !=
> RANDOM_DOMID)
> > +break;
> > +}
> > +
> >  if (ret < 0) {
> >  LOGED(ERROR, *domid, "domain creation fail");
> > +*domid = INVALID_DOMID;
> >  rc = ERROR_FAIL;
> >  goto out;
> >  }
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH] docs/misc: livepatch: Espace backslash

2020-01-14 Thread Durrant, Paul
> -Original Message-
> From: Xen-devel  On Behalf Of
> Julien Grall
> Sent: 13 January 2020 23:12
> To: xen-devel@lists.xenproject.org
> Cc: Ross Lagerwall ; Wieczorkiewicz, Pawel
> ; Julien Grall ; Konrad Rzeszutek
Wilk
> 
> Subject: [Xen-devel] [PATCH] docs/misc: livepatch: Espace backslash
>

s/Espace/Escape, I assume

  Paul
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2 4/6] libxl: allow creation of domains with a specified or random domid

2020-01-14 Thread Durrant, Paul
> -Original Message-
> From: Julien Grall 
> Sent: 13 January 2020 22:24
> To: Durrant, Paul ; jandr...@gmail.com
> Cc: Anthony PERARD ; xen-devel  de...@lists.xenproject.org>; Ian Jackson ; Wei
> Liu 
> Subject: Re: [Xen-devel] [PATCH v2 4/6] libxl: allow creation of domains
> with a specified or random domid
> 
> Hi Paul,
> 
> On 13/01/2020 16:54, Durrant, Paul wrote:
> >> -Original Message-
> >> From: jandr...@gmail.com 
> >> Sent: 13 January 2020 16:16
> >> To: Durrant, Paul 
> >> Cc: xen-devel ; Anthony PERARD
> >> ; Ian Jackson ;
> Wei
> >> Liu 
> >> Subject: Re: [Xen-devel] [PATCH v2 4/6] libxl: allow creation of
> domains
> >> with a specified or random domid
> >>
> >> On Thu, Jan 9, 2020 at 6:50 AM Paul Durrant 
> wrote:
> >>>
> >>> This patch adds a 'domid' field to libxl_domain_create_info and then
> >>> modifies do_domain_create() to use that value if it is valid. Any
> valid
> >>> domid will be checked against the retired domid list before being
> passed
> >>> to libxl__domain_make().
> >>> If the domid value is invalid then Xen will choose the domid, as
> before,
> >>> unless the value is the new special RANDOM_DOMID value added to the
> API.
> >>> This value instructs libxl__domain_make() to select a random domid
> >> value,
> >>> check it for validity, verify it does not match a retired domain, and
> >> then
> >>> pass it to Xen's XEN_DOMCTL_createdomain operation. If Xen determines
> >> that
> >>> it co-incides with an existing domain, a new random value will be
> >>> selected and the operation will be re-tried.
> >>>
> >>> NOTE: libxl__logv() is also modified to only log valid domid values in
> >>>messages rather than any domid, valid or otherwise, that is not
> >>>INVALID_DOMID.
> >>>
> >>> Signed-off-by: Paul Durrant 
> >>> ---
> >>> Cc: Ian Jackson 
> >>> Cc: Wei Liu 
> >>> Cc: Anthony PERARD 
> >>>
> >>> v2:
> >>>   - Re-worked to use a value from libxl_domain_create_info
> >>> ---
> >>>   tools/libxl/libxl.h  |  9 +
> >>>   tools/libxl/libxl_create.c   | 32 +++-
> >>>   tools/libxl/libxl_internal.c |  2 +-
> >>>   tools/libxl/libxl_types.idl  |  1 +
> >>>   4 files changed, 42 insertions(+), 2 deletions(-)
> >>>
> >>
> >> 
> >>
> >>> diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
> >>> index 1835a5502c..ee76dee364 100644
> >>> --- a/tools/libxl/libxl_create.c
> >>> +++ b/tools/libxl/libxl_create.c
> >>> @@ -600,9 +600,39 @@ int libxl__domain_make(libxl__gc *gc,
> >> libxl_domain_config *d_config,
> >>>   goto out;
> >>>   }
> >>>
> >>> -ret = xc_domain_create(ctx->xch, domid, &create);
> >>> +if (libxl_domid_valid_guest(info->domid)) {
> >>> +*domid = info->domid;
> >>> +
> >>> +if (libxl__is_retired_domid(gc, *domid)) {
> >>> +LOGED(ERROR, *domid, "domain id is retired");
> >>> +rc = ERROR_FAIL;
> >>> +goto out;
> >>> +}
> >>> +} else if (info->domid == RANDOM_DOMID) {
> >>> +*domid = 0; /* Zero-out initial value */
> >>> +}
> >>> +
> >>> +for (;;) {
> >>> +if (info->domid == RANDOM_DOMID) {
> >>> +/* Randomize lower order bytes */
> >>> +ret = libxl__random_bytes(gc, (void *)domid,
> >>> +  sizeof(uint16_t));
> >>
> >> Casting to void * assumes little endian.
> >
> > I think that's a fairly safe assumption as far as Xen goes...
> 
> Not really, there are technically nothing (other than bug fixes)
> preventing us to use a big endian guest on Xen on Arm.
>

Ok.
 
> I actually did play with big endian on Xen in the past and managed to
> get a guest running. The main annoying part is Linux as it is assuming
> to use the same endian as the hypervisor. But other OS may not have this
> issues...
> 
> The hypervisor itself is likely going to stay little endian, so does the
> interface. For the tools, we should aim to not introduce more assumption
> that the software will be little endian.
> 

Fair enough. If there's a realistic possibility of running a BE tools domain 
then I'll code accordingly.

  Paul

> Cheers,
> 
> --
> Julien Grall
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2 4/6] libxl: allow creation of domains with a specified or random domid

2020-01-14 Thread Durrant, Paul
> -Original Message-
> From: jandr...@gmail.com 
> Sent: 13 January 2020 19:35
> To: Durrant, Paul 
> Cc: xen-devel ; Anthony PERARD
> ; Ian Jackson ; Wei
> Liu 
> Subject: Re: [Xen-devel] [PATCH v2 4/6] libxl: allow creation of domains
> with a specified or random domid
> 
> On Mon, Jan 13, 2020 at 11:55 AM Durrant, Paul 
> wrote:
> >
> > > -Original Message-
> > > From: jandr...@gmail.com 
> > > Sent: 13 January 2020 16:16
> > > To: Durrant, Paul 
> > > Cc: xen-devel ; Anthony PERARD
> > > ; Ian Jackson ;
> Wei
> > > Liu 
> > > Subject: Re: [Xen-devel] [PATCH v2 4/6] libxl: allow creation of
> domains
> > > with a specified or random domid
> > >
> > > On Thu, Jan 9, 2020 at 6:50 AM Paul Durrant 
> wrote:
> > > >
> > > > This patch adds a 'domid' field to libxl_domain_create_info and then
> > > > modifies do_domain_create() to use that value if it is valid. Any
> valid
> > > > domid will be checked against the retired domid list before being
> passed
> > > > to libxl__domain_make().
> > > > If the domid value is invalid then Xen will choose the domid, as
> before,
> > > > unless the value is the new special RANDOM_DOMID value added to the
> API.
> > > > This value instructs libxl__domain_make() to select a random domid
> > > value,
> > > > check it for validity, verify it does not match a retired domain,
> and
> > > then
> > > > pass it to Xen's XEN_DOMCTL_createdomain operation. If Xen
> determines
> > > that
> > > > it co-incides with an existing domain, a new random value will be
> > > > selected and the operation will be re-tried.
> > > >
> > > > NOTE: libxl__logv() is also modified to only log valid domid values
> in
> > > >   messages rather than any domid, valid or otherwise, that is
> not
> > > >   INVALID_DOMID.
> > > >
> > > > Signed-off-by: Paul Durrant 
> > > > ---
> > > > Cc: Ian Jackson 
> > > > Cc: Wei Liu 
> > > > Cc: Anthony PERARD 
> > > >
> > > > v2:
> > > >  - Re-worked to use a value from libxl_domain_create_info
> > > > ---
> > > >  tools/libxl/libxl.h  |  9 +
> > > >  tools/libxl/libxl_create.c   | 32 +++-
> > > >  tools/libxl/libxl_internal.c |  2 +-
> > > >  tools/libxl/libxl_types.idl  |  1 +
> > > >  4 files changed, 42 insertions(+), 2 deletions(-)
> > > >
> > >
> > > 
> > >
> > > > diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
> > > > index 1835a5502c..ee76dee364 100644
> > > > --- a/tools/libxl/libxl_create.c
> > > > +++ b/tools/libxl/libxl_create.c
> > > > @@ -600,9 +600,39 @@ int libxl__domain_make(libxl__gc *gc,
> > > libxl_domain_config *d_config,
> > > >  goto out;
> > > >  }
> > > >
> > > > -ret = xc_domain_create(ctx->xch, domid, &create);
> > > > +if (libxl_domid_valid_guest(info->domid)) {
> > > > +*domid = info->domid;
> > > > +
> > > > +if (libxl__is_retired_domid(gc, *domid)) {
> > > > +LOGED(ERROR, *domid, "domain id is retired");
> > > > +rc = ERROR_FAIL;
> > > > +goto out;
> > > > +}
> > > > +} else if (info->domid == RANDOM_DOMID) {
> > > > +*domid = 0; /* Zero-out initial value */
> > > > +}
> > > > +
> > > > +for (;;) {
> > > > +if (info->domid == RANDOM_DOMID) {
> > > > +/* Randomize lower order bytes */
> > > > +ret = libxl__random_bytes(gc, (void *)domid,
> > > > +  sizeof(uint16_t));
> > >
> > > Casting to void * assumes little endian.
> >
> > I think that's a fairly safe assumption as far as Xen goes...
> >
> > > Using a temporary uint16_t
> >
> > ...but, yes, that might be neater.
> >
> > > would avoid that assumption.  Also, masking down to 0x7fff would clear
> > > the top bit which is never valid.
> >
> > That seems like a bit of a layering violation and the check in
> libxl_domid_valid_guest() is going to cause a pretty fast turn round the
> loop if the top bit is set so masking is not going to gain that much.
> 
> Yeah, there isn't a define or constant exposed for 0x7fff, so masking
> is a little dirty.  Since about ~half of random 16bit numbers will
> have the high bit set, we'll have to read a second one.  My natural
> instinct is to avoid those extra reads :)
> 

Perhaps I should try adding a DOMID_MASK definition somewhere.

  Paul

> Regards,
> Jason
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 6/6] xen-pt: Round pci regions sizes to XEN_PAGE_SIZE

2020-01-15 Thread Durrant, Paul
> -Original Message-
> 
> > Linux PCI subsytem has an option resource_alignment that can be
> > applied to either a single device or all devices.  Booting with
> > pci=resource_aligment=4096 will align each device to a page.  Do you
> > think pciback should force resource_alignment=4096 for dom0?
>

That sounds like a good idea.
 
> Ideally Xen should keep track of the BARs position and size and refuse
> to passthrough devices that have BARs sharing a page with other
> devices BARs.
> 
> > Are
> > there other MMIO ranges to be concerned about adjacent to BARs?
> 
> IIRC you can have two BARs of different devices in the same 4K page,
> BARs are only aligned to it's size, so BARs smaller than 4K are not
> required to be page aligned.

If we had a notion of assignment groups for this, as well as devices sharing 
requester id, then Xen would not need to refuse pass-through, it would just 
require that all devices sharing the page were passed through as a unit.

  Paul

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v3 4/6] libxl: allow creation of domains with a specified or random domid

2020-01-16 Thread Durrant, Paul
> -Original Message-
> From: Jan Beulich 
> Sent: 16 January 2020 10:40
> To: Durrant, Paul 
> Cc: xen-devel@lists.xenproject.org; Ian Jackson
> ; Wei Liu ; Anthony PERARD
> ; Andrew Cooper ;
> George Dunlap ; Julien Grall
> ; Konrad Rzeszutek Wilk ; Stefano
> Stabellini ; jandr...@gmail.com
> Subject: Re: [PATCH v3 4/6] libxl: allow creation of domains with a
> specified or random domid
> 
> On 16.01.2020 10:36, Paul Durrant wrote:
> > --- a/xen/include/public/xen.h
> > +++ b/xen/include/public/xen.h
> > @@ -614,6 +614,9 @@ DEFINE_XEN_GUEST_HANDLE(mmuext_op_t);
> >  /* Idle domain. */
> >  #define DOMID_IDLE   xen_mk_uint(0x7FFF)
> >
> > +/* Mask for valid domain id values */
> > +#define DOMID_MASK   0x7FFF
> 
> Seeing it used in context, any reason not to use xen_mk_uint()
> here as well?
> 

I did wonder but I thought it best not to impose a type on a mask.

  Paul
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v3 0/6] xl/libxl: domid allocation/preservation changes

2020-01-17 Thread Durrant, Paul
> -Original Message-
> From: Ian Jackson 
> Sent: 16 January 2020 19:43
> To: Durrant, Paul 
> Cc: xen-devel@lists.xenproject.org; Andrew Cooper
> ; Anthony Perard ;
> George Dunlap ; Jan Beulich ;
> jandr...@gmail.com; Julien Grall ; Konrad Rzeszutek Wilk
> ; Stefano Stabellini ; Wei
> Liu 
> Subject: Re: [PATCH v3 0/6] xl/libxl: domid allocation/preservation
> changes
> 
> Paul Durrant writes ("[PATCH v3 0/6] xl/libxl: domid
> allocation/preservation changes"):
> > This series was previously named "xl/libxl: allow creation of domains
> with
> > a specified domid".
> 
> Thanks.  I think Anthony ought to have been made a maintainer of
> tools/xl at the same time as of tools/libxl.  But that isn't so in
> MAINTAINERS right now, so he wasn't CC'd on all these patches.  If you
> could fix that up manually for future mails, that would be great.
> 

Ok, I'll re-base on top of the patch you posted. That should do the trick.

  Paul

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v3 3/6] libxl: add infrastructure to track and query 'retired' domids

2020-01-17 Thread Durrant, Paul
> -Original Message-
> From: Ian Jackson 
> Sent: 16 January 2020 19:28
> To: Durrant, Paul 
> Cc: xen-devel@lists.xenproject.org; Wei Liu ; Anthony Perard
> 
> Subject: Re: [PATCH v3 3/6] libxl: add infrastructure to track and query
> 'retired' domids
> 
> Thanks.  I think this is the algorithm as we discussed, thanks.
> I have some comments about the implementation...
> 
> Paul Durrant writes ("[PATCH v3 3/6] libxl: add infrastructure to track
> and query 'retired' domids"):
> > A domid is considered retired if the domain it represents was destroyed
> > less than a specified number of seconds ago. The number can be set using
> > the environment variable LIBXL_DOMID_MAX_RETIREMENT. If the variable
> does
> > not exist then a default value of 60s is used.
> ...
> 
> I'm afraid I think your update protocol for this file is wrong.  In
> general it is a bad idea to try to write over a file in-place.  Doing
> so is full of gotchas.  (In this particular case for example I think
> an interrupted attempt at cleaning the file can produce a corrupted
> file containing nonsense.)
> 
> Can we please use the standard write-to-new-file-and-rename ?

Ok, fair enough. I'd not really considered interruption as being too much of a 
risk but I guess I should.

> Ie, to launder
> flock(open("domid-history.lock"))
> fopen("domid-history","r")
> fopen("domid-history.new","w")
> fgets/fputs
> fclose
> rename
> close
> 
> (And no uses of ftell, fopen(,"r+"), etc.)
> 
> Reading can be done without taking the lock, if you so fancy.
> 
> I think there are a lot of missing error checks in this patch, but
> since I'm asking for a different approach I won't point them out
> individually.
> 

Ok.

> > +fd = open(name, O_RDWR|O_CREAT, 0644);
> > +if (fd < 0) {
> > +LOGE(ERROR, "unexpected error while trying open %s, errno=%d",
> > + name, errno);
> > +goto fail;
> > +}
> > +
> > +for (;;) {
> > +ret = flock(fd, LOCK_EX);
> 
> I looked for a utility function to do this but didn't find one.
> I think this is complicated because it needs to be a `carefd' in libxl
> terms because of concurrent forking by other threads in the
> application.
> 
> I suggest generalising libxl__lock_domain_userdata, which has all the
> necessary code (and which also would permit removing the file in the
> future).
> 
> I feel responsible for this inconvenience.  If this is too tiresome
> for you, I could do that part for you...
> 

That's ok; I'll insert a preceding generalization patch, unless it turns into a 
total can of worms... which I doubt it will.

> > +/* Write a domid retirement record */
> > +static void libxl__retire_domid(libxl__gc *gc, uint32_t domid)
> > +{
> ...
> > +while (fgets(line, sizeof(line), f)) {
> > +unsigned long sec;
> > +unsigned int ignored;
> > +
> > +roff = ftell(f);
> > +
> > +if (sscanf(line, "%lu %u", &sec, &ignored) != 2)
> > +continue; /* Purge malformed lines */
> 
> I'm not sure why you bother with fgets into a buffer, when you could
> just use fscanf rather than sscanf.  Your code doesn't need to take
> much care about weird syntax which might occur (and indeed your code
> here doesn't take such care).

Well, I need to pull the line into a buffer if I'm going to write it out again, 
but otherwise I could indeed use fscanf().

> 
> > @@ -1331,6 +1462,7 @@ static void devices_destroy_cb(libxl__egc *egc,
> >  if (!ctx->xch) goto badchild;
> >
> >  if (!dis->soft_reset) {
> > +libxl__retire_domid(gc, domid);
> 
> I wonder if a better term than "retired" would be possible.  I
> initially found this patch a bit confusing because I thought a retired
> domid would be one which had *not* been used recently.  Maybe
> "recent", "mark recent", etc. ?  Ultimately this is a bikeshed issue
> which I will leave this up to you, though.
> 

Ok, 'recent' is probably clearer. I'll s/retired/recent/g.

> 
> I don't much like the environment variable to configure this.  I don't
> object to keeping it but can we have a comment saying this is not
> intended for use in production ?  Personally I would rather it was
> hardcoded, or failing that, written to some config file.
> 

The problem is that libxl has no config file. Env variables seem to be used for 
other things so I followed suit. I'd rather keep the override for debug 
purposes; I'll stick a comment in the header saying that's what it's for 
though, as you suggest.

> 
> Finally, I think this patch needs an addition to xen-init-dom0 to
> remove or empty the record file.  This is because while /run is
> usually a tmpfs, this is not *necessarily* true.
> 

Ok, if we cannot rely on it being tmpfs then I will do that.

  Paul



___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v3 4/6] libxl: allow creation of domains with a specified or random domid

2020-01-17 Thread Durrant, Paul
> -Original Message-
> From: Ian Jackson 
> Sent: 16 January 2020 19:36
> To: Durrant, Paul 
> Cc: xen-devel@lists.xenproject.org; Wei Liu ; Anthony Perard
> ; Andrew Cooper ;
> George Dunlap ; Jan Beulich ;
> Julien Grall ; Konrad Rzeszutek Wilk
> ; Stefano Stabellini ;
> jandr...@gmail.com
> Subject: Re: [PATCH v3 4/6] libxl: allow creation of domains with a
> specified or random domid
> 
> Hi.  This broadly contains what I expected, but:
> 
> Paul Durrant writes ("[PATCH v3 4/6] libxl: allow creation of domains with
> a specified or random domid"):
> 
> > +for (;;) {
> > +if (info->domid == RANDOM_DOMID) {
> > +uint16_t v;
> > +
> > +/* Randomize lower order bytes */
> > +ret = libxl__random_bytes(gc, (void *)&v, sizeof(v));
> > +if (ret < 0)
> > +break;
> > +
> > +v &= DOMID_MASK;
> > +if (!libxl_domid_valid_guest(v) ||
> > +libxl__is_retired_domid(gc, v))
> > +continue;
> > +
> > +*domid = v;
> > +}
> > +
> > +ret = xc_domain_create(ctx->xch, domid, &create);
> > +if (ret == 0 || errno != EEXIST || info->domid !=
> RANDOM_DOMID)
> > +break;
> > +}
> 
> I think this has a race.
> 
>   Thread A, in domain destroy   Thread B, in code above
> 
>  choose domid V
>  check V in recent domid list
> 
>  add V to recent domid list
>  destroy domain V in Xen
> 
>  create domain V in Xen
>  continue constructing V
> 
> Thread B improperly constructs a new guest using V, exposing anyone
> who was talking about V a moment ago to bugs.  Some code might even
> fail to spot the interval where V does not exist and carry on talking
> to the new V as if it were the old one...
> 
> I think there are only two possible solutions:
> 
>   - Check the domain's entry in the recent list *after* creating
> the domain in Xen.  This involves accepting that we will
> reuse the domid but only for a domain we are in the early
> stages of constructing, so hopefully without bad consequence?
> 
>   - Take the recent domid lock.
> 

Or take a global file lock in libxl around domain creation and destruction?

> Also, it seems to me that we should check the recent domid list if we
> let Xen choose the domid.  Maybe that can be in a subsequent patch...
> 

Well, we could solve all this, remove the need for a file and all the 
associated complexity by simply keeping history inside the hypervisor. I don't 
know how the Xen maintainers will feel about that though, as Xen itself 
shouldn't have a problem with eager domid re-use.

  Paul

> Thanks,
> Ian.


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v3 4/6] libxl: allow creation of domains with a specified or random domid

2020-01-17 Thread Durrant, Paul
> -Original Message-
> From: Ian Jackson 
> Sent: 17 January 2020 12:36
> To: Durrant, Paul 
> Cc: xen-devel@lists.xenproject.org; Wei Liu ; Anthony Perard
> ; Andrew Cooper ;
> George Dunlap ; Jan Beulich ;
> Julien Grall ; Konrad Rzeszutek Wilk
> ; Stefano Stabellini ;
> jandr...@gmail.com
> Subject: RE: [PATCH v3 4/6] libxl: allow creation of domains with a
> specified or random domid
> 
> Durrant, Paul writes ("RE: [PATCH v3 4/6] libxl: allow creation of domains
> with a specified or random domid"):
> > [Ian:]
> > > I think there are only two possible solutions:
> > >
> > >   - Check the domain's entry in the recent list *after* creating
> > > the domain in Xen.  This involves accepting that we will
> > > reuse the domid but only for a domain we are in the early
> > > stages of constructing, so hopefully without bad consequence?
> > >
> > >   - Take the recent domid lock.
> > >
> >
> > Or take a global file lock in libxl around domain creation and
> destruction?
> 
> We want domain construction to be concurrent, when it can be.  So I
> think a lock around just xc_domain_create is OK but a lock around the
> whole operation is not.
> 
> > > Also, it seems to me that we should check the recent domid list if we
> > > let Xen choose the domid.  Maybe that can be in a subsequent patch...
> >
> > Well, we could solve all this, remove the need for a file and all the
> associated complexity by simply keeping history inside the hypervisor. I
> don't know how the Xen maintainers will feel about that though, as Xen
> itself shouldn't have a problem with eager domid re-use.
> 
> I think this doesn't need to be done in the hypervisor so I am
> inclined to say it shouldn't be.  Also, there is a lot of policy here...
> 

Ok, to cover all bases then it seems like checking the domid after creation and 
then destroying if it is too recent is the better option.

  Paul

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v3 4/6] libxl: allow creation of domains with a specified or random domid

2020-01-20 Thread Durrant, Paul

> -Original Message-
> From: Ian Jackson 
> Sent: 17 January 2020 15:31
> To: Durrant, Paul 
> Cc: xen-devel@lists.xenproject.org; Wei Liu ; Anthony Perard
> ; Andrew Cooper ;
> George Dunlap ; Jan Beulich ;
> Julien Grall ; Konrad Rzeszutek Wilk
> ; Stefano Stabellini ;
> jandr...@gmail.com
> Subject: RE: [PATCH v3 4/6] libxl: allow creation of domains with a
> specified or random domid
> 
> Durrant, Paul writes ("RE: [PATCH v3 4/6] libxl: allow creation of domains
> with a specified or random domid"):
> > Ok, to cover all bases then it seems like checking the domid after
> creation and then destroying if it is too recent is the better option.
> 
> I think so, yes.  I think the recent timestamp should be updated in
> this case.  (Faff!)
> 

I don't think we need to mess with the time-stamp in this case. The domain will 
be killed very quickly, before any PV backends are built and IIUC that's what 
we care about when it comes to re-using domids too quickly.

  Paul

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v3 1/2] xsm: add config option for denied string

2020-01-20 Thread Durrant, Paul
> -Original Message-
> From: Xen-devel  On Behalf Of Jan
> Beulich
> Sent: 20 January 2020 09:51
> To: Sergey Dyasli 
> Cc: Stefano Stabellini ; Julien Grall
> ; Wei Liu ; Konrad Rzeszutek Wilk
> ; George Dunlap ;
> Andrew Cooper ; Doug Goldstein
> ; xen-de...@lists.xen.org; Daniel De Graaf
> ; Ian Jackson 
> Subject: Re: [Xen-devel] [PATCH v3 1/2] xsm: add config option for denied
> string
> 
> On 17.01.2020 17:44, Sergey Dyasli wrote:
> > Signed-off-by: Sergey Dyasli 
> 
> In principle
> Acked-by: Jan Beulich 
> 
> But I think it would be nice to have a non-empty description, at
> least to reason why the option addition is deemed useful.
> 
> > --- a/xen/common/Kconfig
> > +++ b/xen/common/Kconfig
> > @@ -236,6 +236,14 @@ choice
> > bool "SILO" if XSM_SILO
> >  endchoice
> >
> > +config XSM_DENIED_STRING
> > +   string "xen_version denied string"
> 
> I guess inserting "hypercall" into this prompt would set better
> context without needing to resort to the help text, i.e.
> "xen_version hypercall denied string". Thoughts?
>

"xen_version hypercall denied information replacement string"?

It's not like the hypercall as a whole is being denied, after all.

  Paul

 
> Jan
> 
> ___
> Xen-devel mailing list
> Xen-devel@lists.xenproject.org
> https://lists.xenproject.org/mailman/listinfo/xen-devel
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2] Introduce CHANGELOG.md

2020-01-21 Thread Durrant, Paul
Ping? I have acks from Lars and Wei but this doesn't appear to have been 
committed. Are any more acks required?

  Paul

> -Original Message-
> From: Paul Durrant 
> Sent: 13 January 2020 15:32
> To: xen-devel@lists.xenproject.org
> Cc: Durrant, Paul ; Andrew Cooper
> ; George Dunlap ;
> Ian Jackson ; Jan Beulich ;
> Julien Grall ; Konrad Rzeszutek Wilk
> ; Stefano Stabellini ; Wei
> Liu 
> Subject: [PATCH v2] Introduce CHANGELOG.md
> 
> As agreed during the 2020-01 community call [1] this patch introduces a
> changelog, based on the principles explained at keepachangelog.com [2].
> A new MAINTAINERS entry is also added, with myself as (currently sole)
> maintainer.
> 
> [1] See C.2 at https://cryptpad.fr/pad/#/2/pad/edit/ERZtMYD5j6k0sv-NG6Htl-
> AJ/
> [2] https://keepachangelog.com/en/1.0.0/
> 
> Signed-off-by: Paul Durrant 
> Acked-by: Lars Kurth 
> ---
> Cc: Andrew Cooper 
> Cc: George Dunlap 
> Cc: Ian Jackson 
> Cc: Jan Beulich 
> Cc: Julien Grall 
> Cc: Konrad Rzeszutek Wilk 
> Cc: Stefano Stabellini 
> Cc: Wei Liu 
> 
> v2:
>  - Dropped 'All' from 'All notable changes'
>  - Added Lars as a designated reviewer
> ---
>  CHANGELOG.md | 14 ++
>  MAINTAINERS  |  6 ++
>  2 files changed, 20 insertions(+)
>  create mode 100644 CHANGELOG.md
> 
> diff --git a/CHANGELOG.md b/CHANGELOG.md
> new file mode 100644
> index 00..b11e9bc4e3
> --- /dev/null
> +++ b/CHANGELOG.md
> @@ -0,0 +1,14 @@
> +# Changelog
> +
> +Notable changes to Xen will be documented in this file.
> +
> +The format is based on [Keep a
> Changelog](https://keepachangelog.com/en/1.0.0/)
> +
> +## [Unreleased](https://xenbits.xen.org/gitweb/?p=xen.git;a=shortlog)
> +
> +### Added
> + - This file and MAINTAINERS entry.
> +
> +##
> [4.13.0](https://xenbits.xen.org/gitweb/?p=xen.git;a=shortlog;h=RELEASE-
> 4.13.0) - 2019-12-17
> +
> +> Pointer to release from which CHANGELOG tracking starts
> diff --git a/MAINTAINERS b/MAINTAINERS
> index d5bd83073c..1ffc3dc600 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -198,6 +198,12 @@ F:   xen/include/asm-arm/
>  F:   xen/include/public/arch-arm/
>  F:   xen/include/public/arch-arm.h
> 
> +Change Log
> +M:   Paul Durrant 
> +R:   Lars Kurth 
> +S:   Maintained
> +F:   CHANGELOG.md
> +
>  Continuous Integration (CI)
>  M:   Doug Goldstein 
>  W:   https://gitlab.com/xen-project/xen
> --
> 2.17.1


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 3/3] x86 / vmx: use a 'normal' domheap page for APIC_DEFAULT_PHYS_BASE

2020-01-21 Thread Durrant, Paul
> -Original Message-
> From: Xen-devel  On Behalf Of
> Julien Grall
> Sent: 21 January 2020 12:29
> To: Durrant, Paul ; xen-devel@lists.xenproject.org
> Cc: Kevin Tian ; Stefano Stabellini
> ; Jun Nakajima ; Wei Liu
> ; Konrad Rzeszutek Wilk ; George
> Dunlap ; Andrew Cooper
> ; Ian Jackson ;
> Roger Pau Monné 
> Subject: Re: [Xen-devel] [PATCH 3/3] x86 / vmx: use a 'normal' domheap
> page for APIC_DEFAULT_PHYS_BASE
> 
> Hi,
> 
> On 21/01/2020 12:00, Paul Durrant wrote:
> > diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
> > index 919a270587..ef327072ed 100644
> > --- a/xen/common/page_alloc.c
> > +++ b/xen/common/page_alloc.c
> > @@ -2269,7 +2269,8 @@ int assign_pages(
> >
> >   if ( !(memflags & MEMF_no_refcount) )
> >   {
> > -if ( unlikely((d->tot_pages + (1 << order)) > d->max_pages) )
> > +if ( unlikely((d->tot_pages + (1 << order)) > d->max_pages) &&
> > + d->creation_finished )
> 
> This is not entirely obvious to me how this is safe. What would happen
> if d->creation_finished is set on another CPU at the same time? At least
> on Arm, this may not be seen directly.
> 
> I guess the problem would not only happen in this use case (I am more
> concerned in the physmap code), but it would be good to document how it
> is meant to be safe to use.
> 
> However, AFAIU, the only reason for the check to be here is because
> d->max_pages is set quite late. How about setting max_pages as part of
> the domain_create hypercall?

That would be useful but certainly more invasive. There's no way a guest vcpu 
can see creation_finished set to true as it is still paused. The only concern 
would be a stub domain causing domheap pages to be allocated on behalf of the 
guest, and can we not trust a stub domain until it's guest has been unpaused 
(since there is no scope for the guest to attack it until then)?

  Paul

> 
> Cheers,
> 
> --
> Julien Grall
> 
> ___
> Xen-devel mailing list
> Xen-devel@lists.xenproject.org
> https://lists.xenproject.org/mailman/listinfo/xen-devel
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 3/3] x86 / vmx: use a 'normal' domheap page for APIC_DEFAULT_PHYS_BASE

2020-01-22 Thread Durrant, Paul
> -Original Message-
> From: Tian, Kevin 
> Sent: 22 January 2020 03:19
> To: Durrant, Paul ; xen-devel@lists.xenproject.org
> Cc: Nakajima, Jun ; Jan Beulich
> ; Andrew Cooper ; Wei Liu
> ; Roger Pau Monné ; George Dunlap
> ; Ian Jackson ;
> Julien Grall ; Konrad Rzeszutek Wilk
> ; Stefano Stabellini 
> Subject: RE: [PATCH 3/3] x86 / vmx: use a 'normal' domheap page for
> APIC_DEFAULT_PHYS_BASE
> 
> > From: Paul Durrant 
> > Sent: Tuesday, January 21, 2020 8:00 PM
> >
> > vmx_alloc_vlapic_mapping() currently contains some very odd looking code
> > that allocates a MEMF_no_owner domheap page and then shares with the
> > guest
> > as if it were a xenheap page. This then requires
> vmx_free_vlapic_mapping()
> > to call a special function in the mm code: free_shared_domheap_page().
> >
> > By using a 'normal' domheap page (i.e. by not passing MEMF_no_owner to
> > alloc_domheap_page()), the odd looking code in
> vmx_alloc_vlapic_mapping()
> > can simply use get_page_and_type() to set up a writable mapping before
> > insertion in the P2M and vmx_free_vlapic_mapping() can simply release
> the
> > page using put_page_alloc_ref() followed by put_page_and_type(). This
> > then allows free_shared_domheap_page() to be purged.
> 
> I doubt whether using a normal domheap page is a right thing in concept.
> Note the APIC access page is the backend for virtual LAPIC_BASE which is
> a MMIO range from guest p.o.v.

Well, using a non-owned domheap page as a fake xenheap page just looks weird. 
Also 'special' code like this is historically a source of XSAs. IMO a 
straightforward domheap page here is totally appropriate; AFAICT the code is 
only as it is to work round the max_pages limit and the lack of a 
relinquish_resources() hvmfunc.

> 
> >
> > There is, however, some fall-out from this simplification:
> >
> > - alloc_domheap_page() will now call assign_pages() and run into the
> fact
> >   that 'max_pages' is not set until some time after domain_create(). To
> >   avoid an allocation failure, assign_pages() is modified to ignore the
> >   max_pages limit if 'creation_finished' is false. That value is not set
> >   to true until domain_unpause_by_systemcontroller() is called, and thus
> >   the guest cannot run (and hence cause memory allocation) until
> >   creation_finished is set to true.
> 
> However, doing so opens the window of no guard of allocations in
> the whole VM creation time. I'm not sure whether it's a worthwhile
> just for fixing a bad-looking but conceptually-correct code.
> 

I don’t think the code is conceptually correct. The lack of guard is only for 
pages assigned to that domain, and since the domain cannot possibly be running 
then I don't see it is a problem. That said, an alternative would be to have 
the domain creation code to set max_pages is to a minimal initial value 
sufficient to cover any alloc_domheap_pages() calls done by sub-functions.

  Paul

> >
> > - Because the domheap page is no longer a pseudo-xenheap page, the
> >   reference counting will prevent the domain from being destroyed. Thus
> >   the call to vmx_free_vlapic_mapping() is moved from the
> >   domain_destroy() method into the domain_relinquish_resources() method.
> >
> > Signed-off-by: Paul Durrant 
> > ---
> > Cc: Jun Nakajima 
> > Cc: Kevin Tian 
> > Cc: Jan Beulich 
> > Cc: Andrew Cooper 
> > Cc: Wei Liu 
> > Cc: "Roger Pau Monné" 
> > Cc: George Dunlap 
> > Cc: Ian Jackson 
> > Cc: Julien Grall 
> > Cc: Konrad Rzeszutek Wilk 
> > Cc: Stefano Stabellini 
> > ---
> >  xen/arch/x86/hvm/vmx/vmx.c | 29 ++---
> >  xen/arch/x86/mm.c  | 10 --
> >  xen/common/page_alloc.c|  3 ++-
> >  xen/include/asm-x86/mm.h   |  2 --
> >  4 files changed, 24 insertions(+), 20 deletions(-)
> >
> > diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
> > index 3fd3ac61e1..a2e6081485 100644
> > --- a/xen/arch/x86/hvm/vmx/vmx.c
> > +++ b/xen/arch/x86/hvm/vmx/vmx.c
> > @@ -421,10 +421,6 @@ static int vmx_domain_initialise(struct domain *d)
> >  }
> >
> >  static void vmx_domain_relinquish_resources(struct domain *d)
> > -{
> > -}
> > -
> > -static void vmx_domain_destroy(struct domain *d)
> >  {
> >  if ( !has_vlapic(d) )
> >  return;
> > @@ -432,6 +428,10 @@ static void vmx_domain_destroy(struct domain *d)
> >  vmx_free_vlapic_mapping(d);
> >  }
> >
> > +static void vmx_domain_destroy(

Re: [Xen-devel] [PATCH v3 2/9] xen: split parameter related definitions in own header file

2020-01-22 Thread Durrant, Paul
> -Original Message-
> From: Juergen Gross 
> Sent: 21 January 2020 08:43
> To: xen-devel@lists.xenproject.org
> Cc: Juergen Gross ; Stefano Stabellini
> ; Julien Grall ; Volodymyr Babchuk
> ; Andrew Cooper ;
> George Dunlap ; Ian Jackson
> ; Jan Beulich ; Konrad
> Rzeszutek Wilk ; Wei Liu ; Roger Pau
> Monné ; Durrant, Paul ; Jun
> Nakajima ; Kevin Tian ;
> Lukasz Hawrylko ; Christopher Clark
> ; Dario Faggioli ;
> Daniel De Graaf 
> Subject: [PATCH v3 2/9] xen: split parameter related definitions in own
> header file
> 
> Move the parameter related definitions from init.h into a new header
> file param.h. This will avoid include hell when new dependencies are
> added to parameter definitions.
> 
> Signed-off-by: Juergen Gross 

Acked-by: Paul Durrant 

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 2/3] x86 / hvm: add domain_relinquish_resources() method

2020-01-22 Thread Durrant, Paul
> -Original Message-
> From: Jan Beulich 
> Sent: 22 January 2020 15:51
> To: Durrant, Paul 
> Cc: xen-devel@lists.xenproject.org; Andrew Cooper
> ; Wei Liu ; Roger Pau Monné
> ; Jun Nakajima ; Kevin Tian
> 
> Subject: Re: [PATCH 2/3] x86 / hvm: add domain_relinquish_resources()
> method
> 
> On 21.01.2020 13:00, Paul Durrant wrote:
> > There are two functions in hvm.c to deal with tear-down and a domain:
> > hvm_domain_relinquish_resources() and hvm_domain_destroy(). However,
> only
> > the latter has an associated method in 'hvm_funcs'. This patch adds
> > a method for the former and stub definitions for SVM and VMX.
> 
> Why the stubs? Simply ...
> 
> > --- a/xen/arch/x86/hvm/hvm.c
> > +++ b/xen/arch/x86/hvm/hvm.c
> > @@ -715,6 +715,8 @@ int hvm_domain_initialise(struct domain *d)
> >
> >  void hvm_domain_relinquish_resources(struct domain *d)
> >  {
> > +hvm_funcs.domain_relinquish_resources(d);
> 
> ... stick a NULL check around this one. I also wonder whether, it
> being entirely new, this wouldn't better use alternative call
> patching right from the beginning. It's not the hottest path, but
> avoiding indirect calls seems quite desirable, especially when
> doing so is relatively cheap.
> 

I'd like it to align with the rest of the hvm_funcs so I'll add the NULL check, 
but alternatives patch for all hvm_funcs seems like a good thing I the longer 
term.

  Paul
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 2/3] x86 / hvm: add domain_relinquish_resources() method

2020-01-22 Thread Durrant, Paul
> -Original Message-
> From: Jan Beulich 
> Sent: 22 January 2020 16:01
> To: Durrant, Paul 
> Cc: xen-devel@lists.xenproject.org; Andrew Cooper
> ; Wei Liu ; Roger Pau Monné
> ; Jun Nakajima ; Kevin Tian
> 
> Subject: Re: [PATCH 2/3] x86 / hvm: add domain_relinquish_resources()
> method
> 
> On 22.01.2020 16:56, Durrant, Paul wrote:
> >> -Original Message-
> >> From: Jan Beulich 
> >> Sent: 22 January 2020 15:51
> >> To: Durrant, Paul 
> >> Cc: xen-devel@lists.xenproject.org; Andrew Cooper
> >> ; Wei Liu ; Roger Pau Monné
> >> ; Jun Nakajima ; Kevin
> Tian
> >> 
> >> Subject: Re: [PATCH 2/3] x86 / hvm: add domain_relinquish_resources()
> >> method
> >>
> >> On 21.01.2020 13:00, Paul Durrant wrote:
> >>> There are two functions in hvm.c to deal with tear-down and a domain:
> >>> hvm_domain_relinquish_resources() and hvm_domain_destroy(). However,
> >> only
> >>> the latter has an associated method in 'hvm_funcs'. This patch adds
> >>> a method for the former and stub definitions for SVM and VMX.
> >>
> >> Why the stubs? Simply ...
> >>
> >>> --- a/xen/arch/x86/hvm/hvm.c
> >>> +++ b/xen/arch/x86/hvm/hvm.c
> >>> @@ -715,6 +715,8 @@ int hvm_domain_initialise(struct domain *d)
> >>>
> >>>  void hvm_domain_relinquish_resources(struct domain *d)
> >>>  {
> >>> +hvm_funcs.domain_relinquish_resources(d);
> >>
> >> ... stick a NULL check around this one. I also wonder whether, it
> >> being entirely new, this wouldn't better use alternative call
> >> patching right from the beginning. It's not the hottest path, but
> >> avoiding indirect calls seems quite desirable, especially when
> >> doing so is relatively cheap.
> >>
> >
> > I'd like it to align with the rest of the hvm_funcs so I'll add the
> > NULL check, but alternatives patch for all hvm_funcs seems like a
> > good thing I the longer term.
> 
> The frequently used ones have been converted already. Hence my
> suggestion to make new ones use that model from the beginning.
> 

Oh, ok. I'll go look for some examples.

  Paul
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 3/3] x86 / vmx: use a 'normal' domheap page for APIC_DEFAULT_PHYS_BASE

2020-01-22 Thread Durrant, Paul
> -Original Message-
> From: Jan Beulich 
> Sent: 22 January 2020 16:17
> To: Durrant, Paul 
> Cc: xen-devel@lists.xenproject.org; Jun Nakajima ;
> Kevin Tian ; Andrew Cooper
> ; Wei Liu ; Roger Pau Monné
> ; George Dunlap ; Ian
> Jackson ; Julien Grall ; Konrad
> Rzeszutek Wilk ; Stefano Stabellini
> 
> Subject: Re: [PATCH 3/3] x86 / vmx: use a 'normal' domheap page for
> APIC_DEFAULT_PHYS_BASE
> 
> On 21.01.2020 13:00, Paul Durrant wrote:
> > vmx_alloc_vlapic_mapping() currently contains some very odd looking code
> > that allocates a MEMF_no_owner domheap page and then shares with the
> guest
> > as if it were a xenheap page. This then requires
> vmx_free_vlapic_mapping()
> > to call a special function in the mm code: free_shared_domheap_page().
> >
> > By using a 'normal' domheap page (i.e. by not passing MEMF_no_owner to
> > alloc_domheap_page()), the odd looking code in
> vmx_alloc_vlapic_mapping()
> > can simply use get_page_and_type() to set up a writable mapping before
> > insertion in the P2M and vmx_free_vlapic_mapping() can simply release
> the
> > page using put_page_alloc_ref() followed by put_page_and_type(). This
> > then allows free_shared_domheap_page() to be purged.
> >
> > There is, however, some fall-out from this simplification:
> >
> > - alloc_domheap_page() will now call assign_pages() and run into the
> fact
> >   that 'max_pages' is not set until some time after domain_create(). To
> >   avoid an allocation failure, assign_pages() is modified to ignore the
> >   max_pages limit if 'creation_finished' is false. That value is not set
> >   to true until domain_unpause_by_systemcontroller() is called, and thus
> >   the guest cannot run (and hence cause memory allocation) until
> >   creation_finished is set to true.
> 
> But this check is also to guard against the tool stack (or possibly
> the controlling stubdom) to cause excess allocation. I don't think
> the checking should be undermined like this (and see also below).
>

Ok.
 
> Since certainly you've looked into this while creating the patch,
> could you remind me why it is that this page needs to be owned (as
> in its owner field set accordingly) by the guest? It's a helper
> page only, after all.
> 

Not sure why it was done that way. It's inserted into the guest P2M so having 
it owned by the guest seems like the right thing to do. A malicious guest could 
decrease-reservation it and I guess it avoids special-casing there.

> > @@ -3034,12 +3034,22 @@ static int vmx_alloc_vlapic_mapping(struct
> domain *d)
> >  if ( !cpu_has_vmx_virtualize_apic_accesses )
> >  return 0;
> >
> > -pg = alloc_domheap_page(d, MEMF_no_owner);
> > +pg = alloc_domheap_page(d, 0);
> 
> Did you consider passing MEMF_no_refcount here, to avoid the
> fiddling with assign_pages()? That'll in particular also
> avoid ...
> 

You remember what happened last time we did that (with the ioreq server page), 
right? That's why assign_pages() vetoes non-refcounted pages.

> > --- a/xen/common/page_alloc.c
> > +++ b/xen/common/page_alloc.c
> > @@ -2269,7 +2269,8 @@ int assign_pages(
> >
> >  if ( !(memflags & MEMF_no_refcount) )
> >  {
> > -if ( unlikely((d->tot_pages + (1 << order)) > d->max_pages) )
> > +if ( unlikely((d->tot_pages + (1 << order)) > d->max_pages) &&
> > + d->creation_finished )
> >  {
> >  gprintk(XENLOG_INFO, "Over-allocation for domain %u: "
> >  "%u > %u\n", d->domain_id,
> 
> ... invoking domain_adjust_tot_pages() right below here, which
> is wrong for helper pages like this one (as it reduces the
> amount the domain is actually permitted to allocate).
> 

True, but there is 'slop' to deal with things like the ioreq pages and I think 
this page is logically similar.

  Paul

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [Vote] For Xen Project Code of Conduct (deadline March 31st)

2020-01-23 Thread Durrant, Paul
> -Original Message-
> 
> People allowed to vote on behalf of the Hypervisor project are:
> Julien Grall, Andy Cooper, George Dunlap, Ian Jackson, Jan Beulich, Konrad
> R
> Wilk, Stefano Stabellini, Wei Liu and Paul Durrant (as Release Manager).

+1

> 
> People allowed to vote on behalf of the XAPI project are:
> Chandrika Srinivasan, Christian Lindig, Konstantina Chremmou,
> Rob Hoes, Zhang Li
> 
> People allowed to vote on behalf of the Windows PV Driver Project are:
> Paul Durrant, Ben Chalmers, Owen Smith
> 

+1

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2] cmdline: treat hyphens and underscores the same

2020-01-23 Thread Durrant, Paul
> -Original Message-
> From: Xen-devel  On Behalf Of Jan
> Beulich
> Sent: 23 January 2020 11:43
> To: xen-devel@lists.xenproject.org
> Cc: Stefano Stabellini ; Julien Grall
> ; Wei Liu ; Konrad Wilk
> ; George Dunlap ;
> Andrew Cooper ; Ian Jackson
> 
> Subject: [Xen-devel] [PATCH v2] cmdline: treat hyphens and underscores the
> same
> 
> In order to avoid permanently having to ask that no new command line
> options using underscores be introduced (albeit I'm likely to still make
> remarks), and in order to also allow extending the use of hyphens to
> pre-existing ones, introduce custom comparison functions treating both
> characters as matching.
> 
> Signed-off-by: Jan Beulich 
> ---
> v2: Rename to opt_str{,n}cmp(). Don't use the new function for comapring
> against "no-" in parse_params(). Add comment to cdiff().
> 
> --- a/docs/misc/xen-command-line.pandoc
> +++ b/docs/misc/xen-command-line.pandoc
> @@ -72,6 +72,11 @@ Some options take a comma separated list
>  Some parameters act as combinations of the above, most commonly a mix
>  of Boolean and String.  These are noted in the relevant sections.
> 
> +### Spelling
> +
> +Parameter names may include hyphens or underscores.  These are
> +generally being treated as matching one another by the parsing logic.
> +
>  ## Parameter details
> 
>  ### acpi
> --- a/xen/common/kernel.c
> +++ b/xen/common/kernel.c
> @@ -23,6 +23,53 @@ enum system_state system_state = SYS_STA
>  xen_commandline_t saved_cmdline;
>  static const char __initconst opt_builtin_cmdline[] = CONFIG_CMDLINE;
> 
> +/*
> + * Calculate the difference between two characters for command line
> parsing
> + * purposes, i.e. treating '-' and '_' the same.
> + */
> +static int cdiff(unsigned char c1, unsigned char c2)
> +{
> +int res = c1 - c2;
> +
> +if ( res && (c1 ^ c2) == ('-' ^ '_') &&
> + (c1 == '-' || c1 == '_') )
> +res = 0;
> +

Wow! That makes my head hurt. How about:

static int cdiff(unsigned char c1, unsigned char c2)
{
if ( c1 == '-' )
c1 = '_';

if ( c2 == '-' )
c2 = '_';

return c1 - c2;
}

?

  Paul
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2 1/3] x86 / vmx: make apic_access_mfn type-safe

2020-01-23 Thread Durrant, Paul
> -Original Message-
> From: Xen-devel  On Behalf Of Jan
> Beulich
> Sent: 23 January 2020 12:45
> To: Durrant, Paul 
> Cc: Kevin Tian ; Wei Liu ; Andrew Cooper
> ; Jun Nakajima ; xen-
> de...@lists.xenproject.org; Roger Pau Monné 
> Subject: Re: [Xen-devel] [PATCH v2 1/3] x86 / vmx: make apic_access_mfn
> type-safe
> 
> On 23.01.2020 13:21, Paul Durrant wrote:
> > Use mfn_t rather than unsigned long and change previous tests against 0
> to
> > tests against INVALID_MFN (also introducing initialization to that
> value).
> >
> > Signed-off-by: Paul Durrant 
> > Acked-by: Kevin Tian 
> > Reviewed-by: Jan Beulich 
> 
> No, this isn't what the R-b was given for.

Oh, sorry, I misunderstood; I thought the R-b was good as long as idempotency 
was ensured.

> 
> > v2:
> >  - Set apic_access_mfn to INVALID_MFN in vmx_free_vlapic_mapping() to
> make
> >the function idempotent
> 
> Andrew had suggested to use 0 instead of INVALID_MFN. I don't see
> how you achieved idempotency with this adjustment. Aiui
> vmx_free_vlapic_mapping() is supposed to also run correctly if
> vmx_alloc_vlapic_mapping() was never called.

It will. vmx_domain_initialise() will set apic_access_mfn to INVALID_MFN so 
vmx_free_vlapic_mapping() will do nothing.

  Paul
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2 1/3] x86 / vmx: make apic_access_mfn type-safe

2020-01-23 Thread Durrant, Paul
> -Original Message-
> From: Jan Beulich 
> Sent: 23 January 2020 13:30
> To: Durrant, Paul 
> Cc: Kevin Tian ; Wei Liu ; Andrew Cooper
> ; Jun Nakajima ; xen-
> de...@lists.xenproject.org; Roger Pau Monné 
> Subject: Re: [PATCH v2 1/3] x86 / vmx: make apic_access_mfn type-safe
> 
> On 23.01.2020 14:09, Durrant, Paul wrote:
> >> -Original Message-
> >> From: Xen-devel  On Behalf Of
> Jan
> >> Beulich
> >> Sent: 23 January 2020 12:45
> >> To: Durrant, Paul 
> >> Cc: Kevin Tian ; Wei Liu ; Andrew
> Cooper
> >> ; Jun Nakajima ;
> xen-
> >> de...@lists.xenproject.org; Roger Pau Monné 
> >> Subject: Re: [Xen-devel] [PATCH v2 1/3] x86 / vmx: make apic_access_mfn
> >> type-safe
> >>
> >> On 23.01.2020 13:21, Paul Durrant wrote:
> >>> Use mfn_t rather than unsigned long and change previous tests against
> 0
> >> to
> >>> tests against INVALID_MFN (also introducing initialization to that
> >> value).
> >>>
> >>> Signed-off-by: Paul Durrant 
> >>> Acked-by: Kevin Tian 
> >>> Reviewed-by: Jan Beulich 
> >>
> >> No, this isn't what the R-b was given for.
> >
> > Oh, sorry, I misunderstood; I thought the R-b was good as long as
> idempotency was ensured.
> >
> >>
> >>> v2:
> >>>  - Set apic_access_mfn to INVALID_MFN in vmx_free_vlapic_mapping() to
> >> make
> >>>the function idempotent
> >>
> >> Andrew had suggested to use 0 instead of INVALID_MFN. I don't see
> >> how you achieved idempotency with this adjustment. Aiui
> >> vmx_free_vlapic_mapping() is supposed to also run correctly if
> >> vmx_alloc_vlapic_mapping() was never called.
> >
> > It will. vmx_domain_initialise() will set apic_access_mfn to INVALID_MFN
> > so vmx_free_vlapic_mapping() will do nothing.
> 
> I'm sorry, it was implied that it also needs to work if
> vmx_domain_initialise() was never called. Andrew's goal after
> all is, aiui, to be able to call "destroy" functions on error
> paths irrespective of how far "create" had managed to progress.
> 

Oh, I see. That implication was not at all obvious to me. I thought he was just 
after being able to 'destroy' as many times as it took to finish, in which case 
our choice for the value of INVALID_MFN is indeed unfortunate. If that's the 
goal I'll switch to use _mfn(0) as a comparator.

  Paul

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v3 3/3] x86 / vmx: use a 'normal' domheap page for APIC_DEFAULT_PHYS_BASE

2020-01-23 Thread Durrant, Paul
> -Original Message-
> From: Julien Grall 
> Sent: 23 January 2020 15:26
> To: Durrant, Paul ; xen-devel@lists.xenproject.org
> Cc: Jan Beulich ; Andrew Cooper
> ; Wei Liu ; Roger Pau Monné
> ; George Dunlap ; Ian
> Jackson ; Konrad Rzeszutek Wilk
> ; Stefano Stabellini ; Jun
> Nakajima ; Kevin Tian 
> Subject: Re: [PATCH v3 3/3] x86 / vmx: use a 'normal' domheap page for
> APIC_DEFAULT_PHYS_BASE
> 
> 
> 
> On 23/01/2020 14:03, Paul Durrant wrote:
> > diff --git a/xen/common/domain.c b/xen/common/domain.c
> > index ee3f9ffd3e..30c777acb8 100644
> > --- a/xen/common/domain.c
> > +++ b/xen/common/domain.c
> > @@ -339,6 +339,8 @@ static int sanitise_domain_config(struct
> xen_domctl_createdomain *config)
> >   return arch_sanitise_domain_config(config);
> >   }
> >
> > +#define DOMAIN_INIT_PAGES 1
> 
> Would it make sense to make this a per-arch define? This would allow
> each arch to define a different number of init pages (and catch any
> misuse).
>

I thought about that and decided against it. The arch code may want to increase 
(which may be a bad idea) but I think it should be set early. Ultimately it 
should come in from the toolstack via the domctl anyway.

  Paul
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v3 3/3] x86 / vmx: use a 'normal' domheap page for APIC_DEFAULT_PHYS_BASE

2020-01-23 Thread Durrant, Paul
> -Original Message-
> From: Jan Beulich 
> Sent: 23 January 2020 15:32
> To: Durrant, Paul 
> Cc: xen-devel@lists.xenproject.org; Andrew Cooper
> ; Wei Liu ; Roger Pau Monné
> ; George Dunlap ; Ian
> Jackson ; Julien Grall ; Konrad
> Rzeszutek Wilk ; Stefano Stabellini
> ; Jun Nakajima ; Kevin
> Tian 
> Subject: Re: [PATCH v3 3/3] x86 / vmx: use a 'normal' domheap page for
> APIC_DEFAULT_PHYS_BASE
> 
> On 23.01.2020 15:03, Paul Durrant wrote:
> > vmx_alloc_vlapic_mapping() currently contains some very odd looking code
> > that allocates a MEMF_no_owner domheap page and then shares with the
> guest
> > as if it were a xenheap page. This then requires
> vmx_free_vlapic_mapping()
> > to call a special function in the mm code: free_shared_domheap_page().
> >
> > By using a 'normal' domheap page (i.e. by not passing MEMF_no_owner to
> > alloc_domheap_page()), the odd looking code in
> vmx_alloc_vlapic_mapping()
> > can simply use get_page_and_type() to set up a writable mapping before
> > insertion in the P2M and vmx_free_vlapic_mapping() can simply release
> the
> > page using put_page_alloc_ref() followed by put_page_and_type(). This
> > then allows free_shared_domheap_page() to be purged.
> >
> > There is, however, some fall-out from this simplification:
> >
> > - alloc_domheap_page() will now call assign_pages() and run into the
> fact
> >   that 'max_pages' is not set until some time after domain_create(). To
> >   avoid an allocation failure, domain_create() is modified to set
> >   max_pages to an initial value, sufficient to cover any domheap
> >   allocations required to complete domain creation. The value will be
> >   set to the 'real' max_pages when the tool-stack later performs the
> >   XEN_DOMCTL_max_mem operation, thus allowing the rest of the domain's
> >   memory to be allocated.
> 
> I continue to disagree with this approach, and I don't think I've
> heard back on the alternative suggestion of using MEMF_no_refcount
> instead of MEMF_no_owner.

I responded in v1:

"
> Did you consider passing MEMF_no_refcount here, to avoid the
> fiddling with assign_pages()? That'll in particular also
> avoid ...
> 

You remember what happened last time we did that (with the ioreq server page), 
right? That's why assign_pages() vetoes non-refcounted pages.
"

> As said before, the page (and any other
> ones up to DOMAIN_INIT_PAGES, which I find rather fragile to be
> set to 1) will now get accounted against the amount allowed for
> the domain to allocate.
> 
> It also looks to me as if this will break e.g.
> p2m_pod_set_mem_target(), which at the very top has
> 
> /* P == B: Nothing to do (unless the guest is being created). */
> populated = d->tot_pages - p2m->pod.count;
> if ( populated > 0 && p2m->pod.entry_count == 0 )
> goto out;
> 
> This code assumes that during domain creation all pages recorded
> in ->tot_pages have also been recorded in ->pod.count.
> 
> There may be other uses of ->tot_pages which this change conflicts
> with.
> 
> > - Because the domheap page is no longer a pseudo-xenheap page, the
> >   reference counting will prevent the domain from being destroyed. Thus
> >   the call to vmx_free_vlapic_mapping() is moved from the
> >   domain_destroy() method into the domain_relinquish_resources() method.
> >   Whilst in the area, make the domain_destroy() method an optional
> >   alternative_vcall() (since it will no longer peform any function in
> VMX
> >   and is stubbed in SVM anyway).
> 
> All of this would, afaict, become irrelevant when using MEMF_no_refcount.
> 
> There looks to be one issue (which I think we have been discussing
> before): By not bumping ->tot_pages, its decrementing upon freeing
> the page will be a problem.

Yes, that's exactly the problem with assigning MEMF_no_refcount pages, which is 
way it is outlawed.

> I can see two possible solutions to this:
> - Introduce a flag indicating there should also be no accounting
>   upon freeing of the page.

What sort of flag did you have in mind? Do we have space anywhere in type-info 
or count-info to put it? If we can make assigning non-refcounted pages safe 
then it's certainly an attractive option.

> - Instead of avoiding to increment ->tot_pages, _also_ increment
>   ->max_pages. The interaction with XEN_DOMCTL_max_mem will then of
>   course be, well, interesting.
> 

Indeed, which is why I opted for the simple approach. As I've said in other 
responses, max_pages ought be set by the toolstack when the domain is created 
so I wanted to come up with something that's not too invasive until that change 
is made so if the pages do need to be ref-counted then I guess I'll have to 
figure out how to make the initial allocation co-exist with PoD.

  Paul 

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v3 3/3] x86 / vmx: use a 'normal' domheap page for APIC_DEFAULT_PHYS_BASE

2020-01-24 Thread Durrant, Paul
> -Original Message-
> From: George Dunlap [mailto:george.dun...@citrix.com]
> Sent: 23 January 2020 14:46
> To: Durrant, Paul ; xen-devel@lists.xenproject.org
> Cc: Jan Beulich ; Andrew Cooper
> ; Wei Liu ; Roger Pau Monné
> ; George Dunlap ; Ian
> Jackson ; Julien Grall ; Konrad
> Rzeszutek Wilk ; Stefano Stabellini
> ; Jun Nakajima ; Kevin
> Tian 
> Subject: Re: [PATCH v3 3/3] x86 / vmx: use a 'normal' domheap page for
> APIC_DEFAULT_PHYS_BASE
> 
> On 1/23/20 2:03 PM, Paul Durrant wrote:
> > vmx_alloc_vlapic_mapping() currently contains some very odd looking
> > code that allocates a MEMF_no_owner domheap page and then shares with
> > the guest as if it were a xenheap page. This then requires
> > vmx_free_vlapic_mapping() to call a special function in the mm code:
> free_shared_domheap_page().
> >
> > By using a 'normal' domheap page (i.e. by not passing MEMF_no_owner to
> > alloc_domheap_page()), the odd looking code in
> > vmx_alloc_vlapic_mapping() can simply use get_page_and_type() to set
> > up a writable mapping before insertion in the P2M and
> > vmx_free_vlapic_mapping() can simply release the page using
> > put_page_alloc_ref() followed by put_page_and_type(). This then allows
> free_shared_domheap_page() to be purged.
> >
> > There is, however, some fall-out from this simplification:
> >
> > - alloc_domheap_page() will now call assign_pages() and run into the
> fact
> >   that 'max_pages' is not set until some time after domain_create(). To
> >   avoid an allocation failure, domain_create() is modified to set
> >   max_pages to an initial value, sufficient to cover any domheap
> >   allocations required to complete domain creation. The value will be
> >   set to the 'real' max_pages when the tool-stack later performs the
> >   XEN_DOMCTL_max_mem operation, thus allowing the rest of the domain's
> >   memory to be allocated.
> >
> > - Because the domheap page is no longer a pseudo-xenheap page, the
> >   reference counting will prevent the domain from being destroyed. Thus
> >   the call to vmx_free_vlapic_mapping() is moved from the
> >   domain_destroy() method into the domain_relinquish_resources() method.
> >   Whilst in the area, make the domain_destroy() method an optional
> >   alternative_vcall() (since it will no longer peform any function in
> VMX
> >   and is stubbed in SVM anyway).
> >
> > Signed-off-by: Paul Durrant 
> 
> This is an excellent change, thank you:
> 
> Reviewed-by: George Dunlap 
> 
> My only comment is that this would have been a bit easier to review broken
> down into probably three patches: 1) making domain_destroy optional, 2)
> moving vmx teardown to a relinquish_resources call 3) using "normal"
> pages".  But I don't think it's worth a re-send just for that.

Since I appear to need to do a v4 to re-work the allocation (assuming I can 
steal another PGC bit) I'll split things as you suggest.

  Paul

> 
>  -George
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v4 7/7] mm: remove donate_page()

2020-01-24 Thread Durrant, Paul


Sent from Workspace ONE Boxer
On 24 Jan 2020 17:57, Andrew Cooper  wrote:
>
> On 24/01/2020 15:31, Paul Durrant wrote:
> > This function appears to have no callers.
> >
> > Signed-off-by: Paul Durrant 
>
> Looks like tmem was the sole user.  donate_page() was introduced by
> 6009f4ddb2 and the last caller was dropped by c492e19fdd.
>
> This patch is standalone, and I can tweak the commit message on commit,
> if you're happy with that.
>

Fine with me.

  Paul
> ~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] live migration from 4.12 to 4.13 fails due to qemu-xen bug

2020-01-27 Thread Durrant, Paul
> -Original Message-
> From: Xen-devel [mailto:xen-devel-boun...@lists.xenproject.org] On Behalf
> Of Olaf Hering
> Sent: 27 January 2020 11:30
> To: xen-de...@lists.xen.org
> Subject: Re: [Xen-devel] live migration from 4.12 to 4.13 fails due to
> qemu-xen bug
> 
> Am Mon, 13 Jan 2020 11:36:27 +0100
> schrieb Olaf Hering :
> 
> > This HVM domU fails to live migrate from staging-4.12 to staging-4.13:
> 
> It turned out libxl does not configure qemu correctly at runtime:
> libxl__build_device_model_args_new() uses 'qemu -machine xenfv', perhaps
> with the assumption that 'xenfv' does the right thing. Unfortunately,
> 'xenfv' entirely ignores compatibility of "pc-i440fx" between qemu
> versions, 'xenfv' just maps to 'pc' aka 'the lastest'. Instead of 'qemu -
> machine xenfv', libxl should run 'qemu -machine pc-i440fx-3.0 -device xen-
> platform -accel xen' to make sure the domU can be migrated safely to
> future versions of qemu.

Agreed, I think use xenfv needs to be dropped and xl/libxl ought to specify the 
pc version it wants, as you suggest. For compat though, if the pc version is 
not specified in xl.cfg we'd need a mechanism to scan the versions supported by 
the installed qemu and then pick the latest, such that it then gets baked into 
the json blob for save/restore/migration purposes. 

> 
> Maybe there should also be a way to select a specific variant of "pc-
> i440fx". Currently the only way to do that is to use
> device_model_args_hvm= in xl.cfg. Unfortunately libvirt does not support
> "b_info->extra*".
> 

Yeah, it should be a first class config option.

> Should the string "pc-i440fx-3.0" become a configure option?
> 

I suppose. Could we have "pc-i440fx" as the default, which libxl prefix matches 
against qemu's supported versions to select the latest? I guess that would work.

Functionally your code looks fine, but I don't think fixing on 3.0 is the right 
thing to do. What happens if someone is trying to use an older version of qemu? 
It's going to cause unexpected breakage I think.

  Paul


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] live migration from 4.12 to 4.13 fails due to qemu-xen bug

2020-01-27 Thread Durrant, Paul
> -Original Message-
> From: Olaf Hering [mailto:o...@aepfle.de]
> Sent: 27 January 2020 13:00
> To: Durrant, Paul 
> Cc: xen-de...@lists.xen.org
> Subject: Re: [Xen-devel] live migration from 4.12 to 4.13 fails due to
> qemu-xen bug
> 
> Am Mon, 27 Jan 2020 11:54:37 +0000
> schrieb "Durrant, Paul" :
> 
> > > Should the string "pc-i440fx-3.0" become a configure option?
> > I suppose. Could we have "pc-i440fx" as the default, which libxl prefix
> matches against qemu's supported versions to select the latest?
> 
> I think the qemu machine variant must become a property of the running
> domU, so that it will not get lost during migration. For incoming domUs
> without such property some default must be selected by libxl. libxl at
> runtime has no info what the initial qemu command was. So this fallback
> must become a compile or runtime knob as well. Not sure if it would be too
> cumbersome for host admins to apply the equivalent of
> "device_model_args_hvm=" to a five or six digit number of running domUs
> during or prior their migration.
> 
> There should be a --qemuu-hvm-machine, which may just default to 'pc-1.0'
> if not specified. That string should go to
> domain_build_info.u.hvm.qemuu_machine, so that it becomes part of the domU
> properties.
> 

Could we have an opinion from a toolstack maintainer (cc-ed), please?

  Paul

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH] docs: retrospectively add XS_DIRECTORY_PART to the xenstore protocol...

2020-01-27 Thread Durrant, Paul
> -Original Message-
> From: Ian Jackson [mailto:ian.jack...@citrix.com]
> Sent: 27 January 2020 15:49
> To: Jürgen Groß 
> Cc: Durrant, Paul ; xen-devel@lists.xenproject.org;
> Andrew Cooper ; George Dunlap
> ; Jan Beulich ; Julien Grall
> ; Konrad Rzeszutek Wilk ; Stefano
> Stabellini ; Wei Liu 
> Subject: Re: [PATCH] docs: retrospectively add XS_DIRECTORY_PART to the
> xenstore protocol...
> 
> Jürgen Groß writes ("Re: [PATCH] docs: retrospectively add
> XS_DIRECTORY_PART to the xenstore protocol..."):
> > On 27.01.20 16:33, Ian Jackson wrote:
> > > Paul Durrant writes ("[PATCH] docs: retrospectively add
> XS_DIRECTORY_PART to the xenstore protocol..."):
> > >> ... specification.
> > >>
> > >> This was added by commit 0ca64ed8 "xenstore: add support for
> > >> reading directory with many children" but not added to the
> > >> specification at that point. A version of xenstored supporting the
> > >> command was first released in Xen 4.9.
> > >
> > > Thanks for documenting this.  A docs fix like this should be
> > > backported if it applies, IMO.
> > >
> > > Acked-by: Ian Jackson 
> > > Backport: 4.9+
> > >
> > > I will commit it to staging momentarily.
> > >
> > >> +DIRECTORY_PART  | |*
> > >> +Performs the same function as DIRECTORY, but returns a
> > >> +sub-list of children starting at  in the overall
> > >> +child list and less than or equal to XENSTORE_PAYLOAD_MAX
> > >> +octets in length. If  is beyond the end of the
> > >> +overall child list then the returned sub-list will be
> > >> +empty.
> > >
> > > I wonder if it should be somehow made more explicit that `index' is
> > > a count of directory entries, not bytes.  Maybe this is obvious.
> >
> > But this is wrong. It is bytes, and the generation count returned is
> > missing (see my original patch back in 2017).
> 
> Sorry for being too quick.  I have reverted my commit.
> 

Since I got it wrong, I suggest just taking Juergen's original text (which I 
was unaware of before). It seems ok to me.

  Paul

> Ian.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH] docs: Fix StudlyCaps in libxc-migration-stream.pandoc and xl.1.pod

2020-01-27 Thread Durrant, Paul
> -Original Message-
> From: Ian Jackson 
> Sent: 27 January 2020 16:46
> To: xen-devel@lists.xenproject.org
> Cc: Durrant, Paul ; Ian Jackson
> ; Wei Liu ; Andrew Cooper
> ; George Dunlap ;
> Jan Beulich ; Julien Grall ; Konrad
> Rzeszutek Wilk ; Stefano Stabellini
> 
> Subject: [PATCH] docs: Fix StudlyCaps in libxc-migration-stream.pandoc and
> xl.1.pod
> 
> $ git-grep libxenctrl | wc -l
> 99
> $ git-grep libxc | wc -l
> 206
> $ git-grep libxenlight | wc -l
> 48
> $ git-grep libxl | wc -l
> 13433
> $ git-grep LibXen | wc -l
> 2
> $
> 
> Reported-by: Paul Durrant 
> Signed-off-by: Ian Jackson 
> ---
>  docs/man/xl.1.pod.in | 2 +-
>  docs/specs/libxc-migration-stream.pandoc | 2 +-

What about docs/specs/libxl-migration-stream.pandoc? It could use a similar fix 
while you're at it.

  Paul

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH resend] docs: add DIRECTORY_PART specification do xenstore protocol doc

2020-01-27 Thread Durrant, Paul
> -Original Message-
> From: Xen-devel  On Behalf Of
> Juergen Gross
> Sent: 27 January 2020 16:51
> To: xen-devel@lists.xenproject.org
> Cc: Juergen Gross ; Stefano Stabellini
> ; Julien Grall ; Wei Liu
> ; Konrad Rzeszutek Wilk ; George
> Dunlap ; Andrew Cooper
> ; Ian Jackson 
> Subject: [Xen-devel] [PATCH resend] docs: add DIRECTORY_PART specification
> do xenstore protocol doc
> 
> DIRECTORY_PART was missing in docs/misc/xenstore.txt. Add it.
> 
> Signed-off-by: Juergen Gross 

Reviewed-by: Paul Durrant 

> ---
>  docs/misc/xenstore.txt | 9 +
>  1 file changed, 9 insertions(+)
> 
> diff --git a/docs/misc/xenstore.txt b/docs/misc/xenstore.txt
> index ae1b6a8c6e..65570183b6 100644
> --- a/docs/misc/xenstore.txt
> +++ b/docs/misc/xenstore.txt
> @@ -152,6 +152,15 @@ DIRECTORY|  leaf-name>|*
>   leafnames.  The resulting children are each named
>   /.
> 
> +DIRECTORY_PART   | | name>|*
> + Same as DIRECTORY, but to be used for children lists longer than
> + XENSTORE_PAYLOAD_MAX. Input are  and the byte offset into
> + the list of children to return. Return values are the generation
> + count  of the node (to be used to ensure the node hasn't
> + changed between two reads:  being the same for multiple
> + reads guarantees the node hasn't changed) and the list of children
> + starting at the specified  of the complete list.
> +
>  GET_PERMS| |+
>  SET_PERMS||+?
>is one of the following
> --
> 2.16.4
> 
> 
> ___
> Xen-devel mailing list
> Xen-devel@lists.xenproject.org
> https://lists.xenproject.org/mailman/listinfo/xen-devel
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v4 4/7] x86 / vmx: move teardown from domain_destroy()...

2020-01-28 Thread Durrant, Paul
> -Original Message-
> From: Jan Beulich 
> Sent: 28 January 2020 08:15
> To: Durrant, Paul 
> Cc: xen-devel@lists.xenproject.org; Jun Nakajima ;
> Kevin Tian ; Andrew Cooper
> ; Wei Liu ; Roger Pau Monné
> ; George Dunlap 
> Subject: Re: [PATCH v4 4/7] x86 / vmx: move teardown from
> domain_destroy()...
> 
> On 24.01.2020 16:31, Paul Durrant wrote:
> > ... to domain_relinquish_resources().
> >
> > The teardown code frees the APICv page. This does not need to be done
> late
> > so do it in domain_relinquish_resources() rather than domain_destroy().
> 
> For the normal domain destruction path this is fine. For the error path
> of domain_create(), however, this will leak the page, as in this case
> hvm_domain_relinquish_resources() won't be called.

Well it's really arch_domain_create() that's at fault but, yes that needs 
fixing.

> I'm afraid there
> already is a similar issue with e.g. viridian_domain_deinit(). I guess
> I'll make a patch.
> 

Ok, thanks.

  Paul

> Jan
> 
> > Signed-off-by: Paul Durrant 
> > ---
> > Cc: Jun Nakajima 
> > Cc: Kevin Tian 
> > Cc: Jan Beulich 
> > Cc: Andrew Cooper 
> > Cc: Wei Liu 
> > Cc: "Roger Pau Monné" 
> > Cc: George Dunlap 
> >
> > v4:
> >   - New in v4 (disaggregated from v3 patch #3)
> > ---
> >  xen/arch/x86/hvm/vmx/vmx.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
> > index b262d38a7c..606f3dc2eb 100644
> > --- a/xen/arch/x86/hvm/vmx/vmx.c
> > +++ b/xen/arch/x86/hvm/vmx/vmx.c
> > @@ -419,7 +419,7 @@ static int vmx_domain_initialise(struct domain *d)
> >  return 0;
> >  }
> >
> > -static void vmx_domain_destroy(struct domain *d)
> > +static void vmx_domain_relinquish_resources(struct domain *d)
> >  {
> >  if ( !has_vlapic(d) )
> >  return;
> > @@ -2240,7 +2240,7 @@ static struct hvm_function_table __initdata
> vmx_function_table = {
> >  .cpu_up_prepare   = vmx_cpu_up_prepare,
> >  .cpu_dead = vmx_cpu_dead,
> >  .domain_initialise= vmx_domain_initialise,
> > -.domain_destroy   = vmx_domain_destroy,
> > +.domain_relinquish_resources = vmx_domain_relinquish_resources,
> >  .vcpu_initialise  = vmx_vcpu_initialise,
> >  .vcpu_destroy = vmx_vcpu_destroy,
> >  .save_cpu_ctxt= vmx_save_vmcs_ctxt,
> >

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v3 2/2] docs/designs: Add a design document for migration of xenstore data

2020-01-28 Thread Durrant, Paul
> -Original Message-
> From: Jürgen Groß 
> Sent: 28 January 2020 13:36
> To: Durrant, Paul ; xen-devel@lists.xenproject.org
> Cc: Stefano Stabellini ; Julien Grall
> ; Wei Liu ; Konrad Rzeszutek Wilk
> ; George Dunlap ;
> Andrew Cooper ; Ian Jackson
> 
> Subject: Re: [Xen-devel] [PATCH v3 2/2] docs/designs: Add a design
> document for migration of xenstore data
> 
> On 28.01.20 13:28, Paul Durrant wrote:
> > This patch details proposes extra migration data and xenstore protocol
> > extensions to support non-cooperative live migration of guests.
> >
> > Signed-off-by: Paul Durrant 
> > ---
> > Cc: Andrew Cooper 
> > Cc: George Dunlap 
> > Cc: Ian Jackson 
> > Cc: Jan Beulich 
> > Cc: Julien Grall 
> > Cc: Konrad Rzeszutek Wilk 
> > Cc: Stefano Stabellini 
> > Cc: Wei Liu 
> >
> > v3:
> >   - New in v3
> > ---
> >   docs/designs/xenstore-migration.md | 122 +
> >   1 file changed, 122 insertions(+)
> >   create mode 100644 docs/designs/xenstore-migration.md
> >
> > diff --git a/docs/designs/xenstore-migration.md b/docs/designs/xenstore-
> migration.md
> > new file mode 100644
> > index 00..9020b6ff9a
> > --- /dev/null
> > +++ b/docs/designs/xenstore-migration.md
> > @@ -0,0 +1,122 @@
> > +**watch data**
> > +
> > +
> > +`||`
> > +
> > +`` again is considered relative and, together with ``,
> should
> > +be suitable to formulate an `ADD_DOMAIN_WATCHES` operation (see below).
> > +Note that `` must not be a *special* value (beginning with `@`).
> 
> Why not? This is possible for a guest, so it should be possible to
> migrate such a watch.
> 
> Today it might not make any sense, but we should not block any future
> special values which might make sense for a guest to use.
> 

Ok. I was just limiting things to what is actually needed. If you think there 
is merit in allowing them then I have no objection. I'll remove the exclusions.

  Paul
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH] x86/HVM: relinquish resources also from hvm_domain_destroy()

2020-01-28 Thread Durrant, Paul
> -Original Message-
> From: Xen-devel  On Behalf Of Jan
> Beulich
> Sent: 28 January 2020 13:17
> To: xen-devel@lists.xenproject.org
> Cc: Andrew Cooper ; Paul Durrant
> ; Wei Liu ; Roger Pau Monné
> 
> Subject: [Xen-devel] [PATCH] x86/HVM: relinquish resources also from
> hvm_domain_destroy()
> 
> Domain creation failure paths don't call domain_relinquish_resources(),
> yet allocations and alike done from hvm_domain_initialize() need to be
> undone nevertheless. Call the function also from hvm_domain_destroy(),
> after making sure all descendants are idempotent.
> 
> Note that while viridian_{domain,vcpu}_deinit() were already used in
> ways suggesting they're idempotent, viridian_time_vcpu_deinit() actually
> wasn't: One can't kill a timer that was never initialized.
> 
> For hvm_destroy_all_ioreq_servers()'s purposes make
> relocate_portio_handler() return whether the to be relocated port range
> was actually found. This seems cheaper than introducing a flag into
> struct hvm_domain's ioreq_server sub-structure.
> 
> In hvm_domain_initialise() additionally
> - use XFREE() also to replace adjacent xfree(),
> - use hvm_domain_relinquish_resources() as being idempotent now.
> 
> Fixes: e7a9b5e72f26 ("viridian: separately allocate domain and vcpu
> structures")
> Fixes: 26fba3c85571 ("viridian: add implementation of synthetic timers")
> Signed-off-by: Jan Beulich 
> 
> --- a/xen/arch/x86/hvm/hpet.c
> +++ b/xen/arch/x86/hvm/hpet.c
> @@ -751,7 +751,7 @@ void hpet_deinit(struct domain *d)
>  int i;
>  HPETState *h = domain_vhpet(d);
> 
> -if ( !has_vhpet(d) )
> +if ( !has_vhpet(d) || !d->arch.hvm.pl_time || !h->stime_freq )

Why the extra checks here? Just to avoid taking the write_lock() before init? 
If so, then wouldn't a check of stime_freq alone suffice?

>  return;
> 
>  write_lock(&h->lock);
> @@ -763,6 +763,8 @@ void hpet_deinit(struct domain *d)
>  for ( i = 0; i < HPET_TIMER_NUM; i++ )
>  if ( timer_enabled(h, i) )
>  hpet_stop_timer(h, i, guest_time);
> +
> +h->hpet.config = 0;
>  }
> 
>  write_unlock(&h->lock);
> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -696,24 +696,24 @@ int hvm_domain_initialise(struct domain
>  return 0;
> 
>   fail2:
> -rtc_deinit(d);

I understand that this removal is done because 
hvm_domain_relinquish_resources() will now deal with it, but I notice it is 
also called from hvm_domain_destroy(), which seems superfluous.

>  stdvga_deinit(d);
>  vioapic_deinit(d);
>   fail1:
>  if ( is_hardware_domain(d) )
>  xfree(d->arch.hvm.io_bitmap);
> -xfree(d->arch.hvm.io_handler);
> -xfree(d->arch.hvm.params);
> -xfree(d->arch.hvm.pl_time);
> -xfree(d->arch.hvm.irq);
> +XFREE(d->arch.hvm.io_handler);
> +XFREE(d->arch.hvm.params);
> +XFREE(d->arch.hvm.pl_time);
> +XFREE(d->arch.hvm.irq);

Should these XFREEs not now be removed from hvm_domain_destroy() too?

>   fail0:
>  hvm_destroy_cacheattr_region_list(d);
>  destroy_perdomain_mapping(d, PERDOMAIN_VIRT_START, 0);
>   fail:
> -viridian_domain_deinit(d);
> +hvm_domain_relinquish_resources(d);
>  return rc;
>  }
> 
> +/* This function and all its descendants need to be to be idempotent. */
>  void hvm_domain_relinquish_resources(struct domain *d)
>  {
>  if ( hvm_funcs.domain_relinquish_resources )
> @@ -742,6 +742,13 @@ void hvm_domain_destroy(struct domain *d
>  struct list_head *ioport_list, *tmp;
>  struct g2m_ioport *ioport;
> 
> +/*
> + * This function would not be called when domain initialization fails
> + * (late enough), so do so here. This requires the function and all
> its
> + * descendants to be idempotent.
> + */
> +hvm_domain_relinquish_resources(d);
> +
>  XFREE(d->arch.hvm.io_handler);
>  XFREE(d->arch.hvm.params);
> 
> --- a/xen/arch/x86/hvm/ioreq.c
> +++ b/xen/arch/x86/hvm/ioreq.c
> @@ -1228,6 +1228,9 @@ void hvm_destroy_all_ioreq_servers(struc
>  struct hvm_ioreq_server *s;
>  unsigned int id;
> 
> +if ( !relocate_portio_handler(d, 0xcf8, 0xcf8, 4) )
> +return;
> +

Seems a little bit hacky but agreed that it works and avoids the need for 
another flag.

>  spin_lock_recursive(&d->arch.hvm.ioreq_server.lock);
> 
>  /* No need to domain_pause() as the domain is being torn down */
> --- a/xen/arch/x86/hvm/intercept.c
> +++ b/xen/arch/x86/hvm/intercept.c
> @@ -300,7 +300,7 @@ void register_portio_handler(struct doma
>  handler->portio.action = action;
>  }
> 
> -void relocate_portio_handler(struct domain *d, unsigned int old_port,
> +bool relocate_portio_handler(struct domain *d, unsigned int old_port,
>   unsigned int new_port, unsigned int size)
>  {
>  unsigned int i;
> @@ -317,9 +317,11 @@ void relocate_portio_handler(struct doma
>   (handler->portio.size = size) )
>  {
>  handler->portio.

Re: [Xen-devel] [PATCH v3 2/2] docs/designs: Add a design document for migration of xenstore data

2020-01-28 Thread Durrant, Paul
> -Original Message-
> From: Xen-devel  On Behalf Of Wei
> Liu
> Sent: 28 January 2020 13:46
> To: Durrant, Paul 
> Cc: Stefano Stabellini ; Julien Grall
> ; Wei Liu ; Konrad Rzeszutek Wilk
> ; George Dunlap ;
> Andrew Cooper ; Ian Jackson
> ; xen-devel@lists.xenproject.org
> Subject: Re: [Xen-devel] [PATCH v3 2/2] docs/designs: Add a design
> document for migration of xenstore data
> 
> On Tue, Jan 28, 2020 at 12:28:23PM +, Paul Durrant wrote:
> > +```
> > +0 1 2 3 4 5 6 7 octet
> > ++++
> > +| type   | record specific data   |
> > +++|
> > +...
> > ++-+
> > +```
> > +
> > +
> > +| Field | Description |
> > +|---|---|
> > +| `type` | 0x: invalid |
> > +|| 0x0001: node data |
> > +|| 0x0002: watch data |
> > +|| 0x0003 - 0x: reserved for future use |
> > +
> > +
> > +where data is always in the form of a NUL separated and terminated
> tuple
> > +as follows
> > +
> 
> Is there any padding requirement for a record? I take it there isn't?
> 

No, the padding is at the generic record level (already part of the spec).

  Paul

> Wei.
> 
> ___
> Xen-devel mailing list
> Xen-devel@lists.xenproject.org
> https://lists.xenproject.org/mailman/listinfo/xen-devel
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH] x86/HVM: relinquish resources also from hvm_domain_destroy()

2020-01-28 Thread Durrant, Paul
> -Original Message-
> From: Jan Beulich 
> Sent: 28 January 2020 14:31
> To: Durrant, Paul 
> Cc: xen-devel@lists.xenproject.org; Andrew Cooper
> ; Paul Durrant ; Wei Liu
> ; Roger Pau Monné 
> Subject: Re: [PATCH] x86/HVM: relinquish resources also from
> hvm_domain_destroy()
> 
> On 28.01.2020 15:14, Durrant, Paul wrote:
> >> -Original Message-
> >> From: Xen-devel  On Behalf Of
> Jan
> >> Beulich
> >> Sent: 28 January 2020 13:17
> >>
> >> --- a/xen/arch/x86/hvm/hpet.c
> >> +++ b/xen/arch/x86/hvm/hpet.c
> >> @@ -751,7 +751,7 @@ void hpet_deinit(struct domain *d)
> >>  int i;
> >>  HPETState *h = domain_vhpet(d);
> >>
> >> -if ( !has_vhpet(d) )
> >> +if ( !has_vhpet(d) || !d->arch.hvm.pl_time || !h->stime_freq )
> >
> > Why the extra checks here? Just to avoid taking the write_lock()
> > before init? If so, then wouldn't a check of stime_freq alone suffice?
> 
> This and the other functions may now be called with
> d->arch.hvm.pl_time being NULL. domain_vhpet() would yield a
> pointer slightly offset from NULL in this case. Hence we have
> to do this check before we may deref h.
> 

Ah, I'd missed that domain_vhpet() dereferenced pl_time.

> >> --- a/xen/arch/x86/hvm/hvm.c
> >> +++ b/xen/arch/x86/hvm/hvm.c
> >> @@ -696,24 +696,24 @@ int hvm_domain_initialise(struct domain
> >>  return 0;
> >>
> >>   fail2:
> >> -rtc_deinit(d);
> >
> > I understand that this removal is done because
> > hvm_domain_relinquish_resources() will now deal with it,
> > but I notice it is also called from hvm_domain_destroy(),
> > which seems superfluous.
> 
> Oh, indeed, that one could go away as well now.
> 
> >>  stdvga_deinit(d);
> >>  vioapic_deinit(d);
> >>   fail1:
> >>  if ( is_hardware_domain(d) )
> >>  xfree(d->arch.hvm.io_bitmap);
> >> -xfree(d->arch.hvm.io_handler);
> >> -xfree(d->arch.hvm.params);
> >> -xfree(d->arch.hvm.pl_time);
> >> -xfree(d->arch.hvm.irq);
> >> +XFREE(d->arch.hvm.io_handler);
> >> +XFREE(d->arch.hvm.params);
> >> +XFREE(d->arch.hvm.pl_time);
> >> +XFREE(d->arch.hvm.irq);
> >
> > Should these XFREEs not now be removed from hvm_domain_destroy() too?
> 
> I'm afraid I don't understand: This is in hvm_domain_initialise().
> arch_domain_destroy() (and hence hvm_domain_destroy()) won't get
> called if hvm_domain_initialise() (and hence arch_domain_destroy())
> fails (doing so is, I think, a future plan of Andrew's).
> 

Oh, sorry. For some reason I thought this hunk was in 
hvm_domain_relinquish_resources() so yes the XFREEs in destroy need to stay as 
is.

> >> --- a/xen/arch/x86/hvm/pmtimer.c
> >> +++ b/xen/arch/x86/hvm/pmtimer.c
> >> @@ -373,7 +373,7 @@ void pmtimer_deinit(struct domain *d)
> >>  {
> >>  PMTState *s = &d->arch.hvm.pl_time->vpmt;
> >>
> >> -if ( !has_vpm(d) )
> >> +if ( !has_vpm(d) || !d->arch.hvm.pl_time || !s->vcpu )
> >>  return;
> >
> > Isn't a test of s->vcpu sufficient?
> 
> No, for the same reason a that explained for hpet.c.
> 
> >> --- a/xen/arch/x86/hvm/rtc.c
> >> +++ b/xen/arch/x86/hvm/rtc.c
> >> @@ -844,7 +844,8 @@ void rtc_deinit(struct domain *d)
> >>  {
> >>  RTCState *s = domain_vrtc(d);
> >>
> >> -if ( !has_vrtc(d) )
> >> +if ( !has_vrtc(d) || !d->arch.hvm.pl_time ||
> >> + s->update_timer.status == TIMER_STATUS_invalid )
> >>  return;
> >
> > Testing s->pt.source for a zero value would be sufficient and cheaper, I
> think.
> 
> It's not obvious to me where in rtc_init() s->pt.source would
> get set to a non-zero value. I'd prefer the check here to be
> obviously connected to what rtc_init() does. I'm also unclear
> why you think it would be cheaper.

Ok. I'd assume a non-zero rather than equality test would be cheaper but I 
notice that TIMER_STATUS_invalid is defined to 0 anyway, so it should be 
optimised at compile time.

  Paul

> 
> Jan
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v3 1/2] docs/designs: Add a design document for non-cooperative live migration

2020-01-28 Thread Durrant, Paul
> -Original Message-
> From: Wei Liu 
> Sent: 28 January 2020 13:41
> To: Durrant, Paul 
> Cc: xen-devel@lists.xenproject.org; Andrew Cooper
> ; George Dunlap ;
> Ian Jackson ; Jan Beulich ;
> Julien Grall ; Konrad Rzeszutek Wilk
> ; Stefano Stabellini ; Wei
> Liu 
> Subject: Re: [PATCH v3 1/2] docs/designs: Add a design document for non-
> cooperative live migration
> 
> Thanks for writing this up. I skimmed through it. It looks sensible.
> 
> On Tue, Jan 28, 2020 at 12:28:22PM +, Paul Durrant wrote:
> > It has become apparent to some large cloud providers that the current
> > model of cooperative migration of guests under Xen is not usable as it
> > relies on software running inside the guest, which is likely beyond the
> > provider's control.
> > This patch introduces a proposal for non-cooperative live migration,
> > designed not to rely on any guest-side software.
> >
> > Signed-off-by: Paul Durrant 
> > ---
> > Cc: Andrew Cooper 
> > Cc: George Dunlap 
> > Cc: Ian Jackson 
> > Cc: Jan Beulich 
> > Cc: Julien Grall 
> > Cc: Konrad Rzeszutek Wilk 
> > Cc: Stefano Stabellini 
> > Cc: Wei Liu 
> >
> > v2:
> >  - Use the term 'non-cooperative' instead of 'transparent'
> >  - Replace 'trust in' with 'reliance on' when referring to guest-side
> >software
> > ---
> >  docs/designs/non-cooperative-migration.md | 259 ++
> >  1 file changed, 259 insertions(+)
> >  create mode 100644 docs/designs/non-cooperative-migration.md
> >
> > diff --git a/docs/designs/non-cooperative-migration.md
> b/docs/designs/non-cooperative-migration.md
> > new file mode 100644
> > index 00..f38d664c34
> > --- /dev/null
> > +++ b/docs/designs/non-cooperative-migration.md
> > @@ -0,0 +1,259 @@
> > +# Non-Cooperative Migration of Guests on Xen
> > +
> > +## Background
> > +
> > +The normal model of migration in Xen is driven by the guest because it
> was
> > +originally implemented for PV guests, where the guest must be aware it
> is
> > +running under Xen and is hence expected to co-operate. This model dates
> from
> > +an era when it was assumed that the host administrator had control of
> at least
> > +the privileged software running in the guest (i.e. the guest kernel)
> which may
> > +still be true in an enterprise deployment but is not generally true in
> a cloud
> > +environment. The aim of this design is to provide a model which is
> purely host
> > +driven, requiring no co-operation from the software running in the
> > +guest, and is thus suitable for cloud scenarios.
> > +
> > +PV guests are out of scope for this project because, as is outlined
> above, they
> > +have a symbiotic relationship with the hypervisor and therefore a
> certain level
> > +of co-operation can be assumed.
> 
> Missing newline here?
> 

Yep.

> > +HVM guests can already be migrated on Xen without guest co-operation
> but only
> > +if they don’t have PV drivers installed[1] or are in power state S3.
> The
> > +reason for not expecting co-operation if the guest is in S3 is obvious,
> but the
> > +reason co-operation is expected if PV drivers are installed is due to
> the
> > +nature of PV protocols.
> > +
> > +## Xenstore Nodes and Domain ID
> > +
> > +The PV driver model consists of a *frontend* and a *backend*. The
> frontend runs
> > +inside the guest domain and the backend runs inside a *service domain*
> which
> > +may or may not domain 0. The frontend and backend typically pass data
> via
> 
> "may or may not _be_ domain 0"
> 

Ack.

> > +memory pages which are shared between the two domains, but this channel
> of
> > +communication is generally established using xenstore (the store
> protocol
> > +itself being an exception to this for obvious chicken-and-egg reasons).
> > +
> > +Typical protocol establishment is based on use of two separate xenstore
> > +*areas*. If we consider PV drivers for the *netif* protocol (i.e. class
> vif)
> > +and assume the guest has domid X, the service domain has domid Y, and
> the vif
> > +has index Z then the frontend area will reside under the parent node:
> 
> The term "parent" shows up first time in this document. What does it
> mean in Xen's context?
> 

I'd hope it's well known that xenstore is hierarchical. I can add a short 
explanation if you think it’s needed.

> > +
> > +`/local/domain/Y/device/vif/Z`
> > +
> > +A

Re: [Xen-devel] [PATCH v4 6/7] x86/hyperv: retrieve vp_index from Hyper-V

2020-01-28 Thread Durrant, Paul
> -Original Message-
> From: Xen-devel  On Behalf Of Jan
> Beulich
> Sent: 28 January 2020 16:19
> To: Wei Liu ; Paul Durrant ; Andrew Cooper
> 
> Cc: Xen Development List ; Roger Pau Monné
> ; Wei Liu ; Michael Kelley
> 
> Subject: Re: [Xen-devel] [PATCH v4 6/7] x86/hyperv: retrieve vp_index from
> Hyper-V
> 
> On 28.01.2020 16:55, Wei Liu wrote:
> > On Thu, Jan 23, 2020 at 04:48:38PM +0100, Jan Beulich wrote:
> >> On 22.01.2020 21:23, Wei Liu wrote:
> >>> This will be useful when invoking hypercall that targets specific
> >>> vcpu(s).
> >>>
> >>> Signed-off-by: Wei Liu 
> >>> Reviewed-by: Paul Durrant 
> >>
> >> For formal reasons
> >> Acked-by: Jan Beulich 
> >>
> >> However I wonder whether the Viridian entry in MAINTAINERS shouldn't
> >> be extended by
> >>
> >> F: xen/arch/x86/guest/hyperv/
> >>
> >> (and possibly have its title adjusted). Thoughts?
> >
> > This isn't about emulating Hyper-V inside Xen, so I don't think that's
> > the right approach here.
> 
> Well, it's the code producing the interface in one case, and
> consuming it here. So there is some overlap at least.
> 
> > That said, if Paul wants to take this under his purview, it's fine by me
> > -- that would make me easier to upstream my patch. ;-)  I also wouldn't
> > mind adding myself as maintainer for this path.
> 
> Perhaps best both of you? Paul, Andrew, what do you think?
> 

IMO it's probably best to the put the Hyper-V stuff under 'Viridian' and add 
yourself as a maintainer there. There really is likely to be significant 
overlap and it'd make it easier (for me at least) to keep track of the bigger 
picture (i.e. Xen using enlightenments as well as implementing them).

  Cheers,

Paul
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v4 5/7] mm: make MEMF_no_refcount pages safe to assign

2020-01-28 Thread Durrant, Paul
> -Original Message-
> From: Jan Beulich 
> Sent: 28 January 2020 15:23
> To: Durrant, Paul 
> Cc: xen-devel@lists.xenproject.org; Andrew Cooper
> ; George Dunlap ;
> Ian Jackson ; Julien Grall ;
> Konrad Rzeszutek Wilk ; Stefano Stabellini
> ; Wei Liu ; Volodymyr Babchuk
> ; Roger Pau Monné 
> Subject: Re: [PATCH v4 5/7] mm: make MEMF_no_refcount pages safe to assign
> 
> On 24.01.2020 16:31, Paul Durrant wrote:
> > Currently it is unsafe to assign a domheap page allocated with
> > MEMF_no_refcount to a domain because the domain't 'tot_pages' will not
> > be incremented, but will be decrement when the page is freed (since
> > free_domheap_pages() has no way of telling that the increment was
> skipped).
> >
> > This patch allocates a new 'count_info' bit for a PGC_no_refcount flag
> > which is then used to mark domheap pages allocated with
> MEMF_no_refcount.
> > This then allows free_domheap_pages() to skip decrementing tot_pages
> when
> > appropriate and hence makes the pages safe to assign.
> >
> > NOTE: The patch sets MEMF_no_refcount directly in alloc_domheap_pages()
> >   rather than in assign_pages() because the latter is called with
> >   MEMF_no_refcount by memory_exchange() as an optimization, to avoid
> >   too many calls to domain_adjust_tot_pages() (which acquires and
> >   releases the global 'heap_lock').
> 
> I don't think there were any optimization thoughts with this. The
> MEMF_no_refcount use is because otherwise for a domain with
> tot_pages == max_pages the assignment would fail.
> 

That would not be the case if the calls to steal_page() further up didn't pass 
MEMF_no_refcount (which would be the correct thing to do if not passing it to 
assign_pages(). I had originally considered doing that because I think it 
allows the somewhat complex error path after assign_pages() to be dropped. But 
avoiding thrashing the global lock seemed a good reason to leave 
memory_exchange() the way it is.

> > --- a/xen/common/page_alloc.c
> > +++ b/xen/common/page_alloc.c
> > @@ -460,6 +460,9 @@ unsigned long domain_adjust_tot_pages(struct domain
> *d, long pages)
> >  {
> >  long dom_before, dom_after, dom_claimed, sys_before, sys_after;
> >
> > +if ( !pages )
> > +goto out;
> 
> Unrelated change? Are there, in fact, any callers passing in 0?
> Oh, further down you add one which may do so, but then perhaps
> better to make the caller not call here (as is done e.g. in
> memory_exchange())?

I think it's preferable for domain_adjust_tot_pages() to handle zero gracefully.

> 
> > @@ -2331,11 +2331,20 @@ struct page_info *alloc_domheap_pages(
> >memflags, d)) == NULL)) )
> >   return NULL;
> >
> > -if ( d && !(memflags & MEMF_no_owner) &&
> > - assign_pages(d, pg, order, memflags) )
> > +if ( d && !(memflags & MEMF_no_owner) )
> >  {
> > -free_heap_pages(pg, order, memflags & MEMF_no_scrub);
> > -return NULL;
> > +if ( assign_pages(d, pg, order, memflags) )
> > +{
> > +free_heap_pages(pg, order, memflags & MEMF_no_scrub);
> > +return NULL;
> > +}
> > +if ( memflags & MEMF_no_refcount )
> > +{
> > +unsigned long i;
> > +
> > +for ( i = 0; i < (1 << order); i++ )
> > +pg[i].count_info |= PGC_no_refcount;
> > +}
> 
> I would seem to me that this needs doing the other way around:
> First set PGC_no_refcount, then assign_pages(). After all, the
> moment assign_pages() drops its lock, the domain could also
> decide to get rid of (some of) the pages again.

True. Yes, this needs to be swapped.

> For this (and
> also to slightly simplify things in free_domheap_pages())
> perhaps it would be better not to add that ASSERT() to
> free_heap_pages(). The function shouldn't really be concerned
> of any refcounting, and hence could as well be ignorant to
> PGC_no_refcount being set on a page.
> 

Not sure I understand here. What would you like to see free_heap_pages() assert?

> > @@ -2368,24 +2377,32 @@ void free_domheap_pages(struct page_info *pg,
> unsigned int order)
> >
> >  if ( likely(d) && likely(d != dom_cow) )
> >  {
> > +long pages = 0;
> > +
> >  /* NB. May recursively lock from relinquish_memory(). */
> >  spin_lock_recursive(&d->page_alloc_lock);
&

Re: [Xen-devel] [PATCH v4 6/7] x86/hyperv: retrieve vp_index from Hyper-V

2020-01-28 Thread Durrant, Paul
> -Original Message-
> From: Wei Liu 
> Sent: 28 January 2020 16:54
> To: Durrant, Paul 
> Cc: Jan Beulich ; Wei Liu ; Paul Durrant
> ; Andrew Cooper ; Xen Development
> List ; Roger Pau Monné
> ; Wei Liu ; Michael Kelley
> 
> Subject: Re: [Xen-devel] [PATCH v4 6/7] x86/hyperv: retrieve vp_index from
> Hyper-V
> 
> On Tue, Jan 28, 2020 at 04:33:00PM +, Durrant, Paul wrote:
> > > -Original Message-
> > > From: Xen-devel  On Behalf Of
> Jan
> > > Beulich
> > > Sent: 28 January 2020 16:19
> > > To: Wei Liu ; Paul Durrant ; Andrew Cooper
> > > 
> > > Cc: Xen Development List ; Roger Pau
> Monné
> > > ; Wei Liu ; Michael Kelley
> > > 
> > > Subject: Re: [Xen-devel] [PATCH v4 6/7] x86/hyperv: retrieve vp_index
> from
> > > Hyper-V
> > >
> > > On 28.01.2020 16:55, Wei Liu wrote:
> > > > On Thu, Jan 23, 2020 at 04:48:38PM +0100, Jan Beulich wrote:
> > > >> On 22.01.2020 21:23, Wei Liu wrote:
> > > >>> This will be useful when invoking hypercall that targets specific
> > > >>> vcpu(s).
> > > >>>
> > > >>> Signed-off-by: Wei Liu 
> > > >>> Reviewed-by: Paul Durrant 
> > > >>
> > > >> For formal reasons
> > > >> Acked-by: Jan Beulich 
> > > >>
> > > >> However I wonder whether the Viridian entry in MAINTAINERS
> shouldn't
> > > >> be extended by
> > > >>
> > > >> F: xen/arch/x86/guest/hyperv/
> > > >>
> > > >> (and possibly have its title adjusted). Thoughts?
> > > >
> > > > This isn't about emulating Hyper-V inside Xen, so I don't think
> that's
> > > > the right approach here.
> > >
> > > Well, it's the code producing the interface in one case, and
> > > consuming it here. So there is some overlap at least.
> > >
> > > > That said, if Paul wants to take this under his purview, it's fine
> by me
> > > > -- that would make me easier to upstream my patch. ;-)  I also
> wouldn't
> > > > mind adding myself as maintainer for this path.
> > >
> > > Perhaps best both of you? Paul, Andrew, what do you think?
> > >
> >
> > IMO it's probably best to the put the Hyper-V stuff under 'Viridian'
> > and add yourself as a maintainer there. There really is likely to be
> > significant overlap and it'd make it easier (for me at least) to keep
> > track of the bigger picture (i.e. Xen using enlightenments as well as
> > implementing them).
> 
> When you said "yourself", did you mean me or Jan?
>

You. I thought was replying to your question. Apologies for any confusion.

  Paul
 
> Wei.
> 
> >
> >   Cheers,
> >
> > Paul

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v4 5/7] mm: make MEMF_no_refcount pages safe to assign

2020-01-28 Thread Durrant, Paul
> -Original Message-
[snip]
> > > @@ -2331,11 +2331,20 @@ struct page_info *alloc_domheap_pages(
> > >memflags, d)) == NULL)) )
> > >   return NULL;
> > >
> > > -if ( d && !(memflags & MEMF_no_owner) &&
> > > - assign_pages(d, pg, order, memflags) )
> > > +if ( d && !(memflags & MEMF_no_owner) )
> > >  {
> > > -free_heap_pages(pg, order, memflags & MEMF_no_scrub);
> > > -return NULL;
> > > +if ( assign_pages(d, pg, order, memflags) )
> > > +{
> > > +free_heap_pages(pg, order, memflags & MEMF_no_scrub);
> > > +return NULL;
> > > +}
> > > +if ( memflags & MEMF_no_refcount )
> > > +{
> > > +unsigned long i;
> > > +
> > > +for ( i = 0; i < (1 << order); i++ )
> > > +pg[i].count_info |= PGC_no_refcount;
> > > +}
> >
> > I would seem to me that this needs doing the other way around:
> > First set PGC_no_refcount, then assign_pages(). After all, the
> > moment assign_pages() drops its lock, the domain could also
> > decide to get rid of (some of) the pages again.
> 
> True. Yes, this needs to be swapped.
> 
> > For this (and
> > also to slightly simplify things in free_domheap_pages())
> > perhaps it would be better not to add that ASSERT() to
> > free_heap_pages(). The function shouldn't really be concerned
> > of any refcounting, and hence could as well be ignorant to
> > PGC_no_refcount being set on a page.
> >
> 
> Not sure I understand here. What would you like to see free_heap_pages()
> assert?
> 

Oh, I misread. You want me to remove the ASSERT that I added... that's fine.

  Paul
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v4 5/7] mm: make MEMF_no_refcount pages safe to assign

2020-01-29 Thread Durrant, Paul
> -Original Message-
> From: Jan Beulich 
> Sent: 29 January 2020 08:22
> To: Durrant, Paul 
> Cc: xen-devel@lists.xenproject.org; Andrew Cooper
> ; George Dunlap ;
> Ian Jackson ; Julien Grall ;
> Konrad Rzeszutek Wilk ; Stefano Stabellini
> ; Wei Liu ; Volodymyr Babchuk
> ; Roger Pau Monné 
> Subject: Re: [PATCH v4 5/7] mm: make MEMF_no_refcount pages safe to assign
> 
> On 28.01.2020 18:01, Durrant, Paul wrote:
> >> From: Jan Beulich 
> >> Sent: 28 January 2020 15:23
> >>
> >> On 24.01.2020 16:31, Paul Durrant wrote:
> >>> Currently it is unsafe to assign a domheap page allocated with
> >>> MEMF_no_refcount to a domain because the domain't 'tot_pages' will not
> >>> be incremented, but will be decrement when the page is freed (since
> >>> free_domheap_pages() has no way of telling that the increment was
> >> skipped).
> >>>
> >>> This patch allocates a new 'count_info' bit for a PGC_no_refcount flag
> >>> which is then used to mark domheap pages allocated with
> >> MEMF_no_refcount.
> >>> This then allows free_domheap_pages() to skip decrementing tot_pages
> >> when
> >>> appropriate and hence makes the pages safe to assign.
> >>>
> >>> NOTE: The patch sets MEMF_no_refcount directly in
> alloc_domheap_pages()
> >>>   rather than in assign_pages() because the latter is called with
> >>>   MEMF_no_refcount by memory_exchange() as an optimization, to
> avoid
> >>>   too many calls to domain_adjust_tot_pages() (which acquires and
> >>>   releases the global 'heap_lock').
> >>
> >> I don't think there were any optimization thoughts with this. The
> >> MEMF_no_refcount use is because otherwise for a domain with
> >> tot_pages == max_pages the assignment would fail.
> >>
> >
> > That would not be the case if the calls to steal_page() further up
> didn't
> > pass MEMF_no_refcount (which would be the correct thing to do if not
> > passing it to assign_pages().
> 
> No, that's not an option either: steal_page() would otherwise decrement
> ->tot_pages, allowing the domain to allocate new memory on another vCPU.
> This would again result in the exchange failing for no reason. (And no,
> I don't think a guest should be required to serialize e.g. ballooning
> operations with exchanges.)
> 

Ok, yes it does make it non-atomic but my view would be that the guest should 
not be simultaneously ballooning; however, we clearly differ there.

> >>> --- a/xen/common/page_alloc.c
> >>> +++ b/xen/common/page_alloc.c
> >>> @@ -460,6 +460,9 @@ unsigned long domain_adjust_tot_pages(struct
> domain
> >> *d, long pages)
> >>>  {
> >>>  long dom_before, dom_after, dom_claimed, sys_before, sys_after;
> >>>
> >>> +if ( !pages )
> >>> +goto out;
> >>
> >> Unrelated change? Are there, in fact, any callers passing in 0?
> >> Oh, further down you add one which may do so, but then perhaps
> >> better to make the caller not call here (as is done e.g. in
> >> memory_exchange())?
> >
> > I think it's preferable for domain_adjust_tot_pages() to handle zero
> > gracefully.
> 
> That's an option, but imo would then better be a separate change (to
> also drop present guards of calls to the function).

Ok, I'll split it out into a separate patch.

  Paul

> 
> Jan
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] Xen 4.14 dates

2020-01-29 Thread Durrant, Paul
Hi,

  I'm not aware on any prior discussion w.r.t. dates for Xen 4.14 so as RM I'd 
like to propose the following:

Last posting: May 1st 2020
Hard Freeze: May 22nd 2020
Release: June 26th

  That puts summit between hard freeze and release, but I don't see that as a 
problem and may even be beneficial.

  Thoughts?

Paul

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] Xen 4.14 dates

2020-01-29 Thread Durrant, Paul
> -Original Message-
> From: Jan Beulich 
> Sent: 29 January 2020 11:08
> To: Durrant, Paul ; xen-devel@lists.xenproject.org
> Subject: Re: [Xen-devel] Xen 4.14 dates
> 
> On 29.01.2020 11:55, Durrant, Paul wrote:
> >   I'm not aware on any prior discussion w.r.t. dates for Xen 4.14 so as
> RM I'd like to propose the following:
> >
> > Last posting: May 1st 2020
> > Hard Freeze: May 22nd 2020
> > Release: June 26th
> 
> Was 4.13 really more than 1.5 months late? The above would make
> its originally planned release date Oct 26th (the actual one was
> Dec 18th) with our current 8 month cadence.
> 
> Just wondering,

I found the last 4.13 update email from Juergen stating release on Nov 7th. 8 
months would put release on Jul 7th but I rounded down as I'm conscious that 
any slip is going to move things into holiday season.
Another option would be to slip out to ~1 year after the planned 4.13 dates.

  Paul

> Jan
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v5 2/4] mm: modify domain_adjust_tot_pages() to better handle a zero adjustment

2020-01-29 Thread Durrant, Paul
> -Original Message-
> From: Jan Beulich 
> Sent: 29 January 2020 11:13
> To: Durrant, Paul 
> Cc: xen-devel@lists.xenproject.org; Andrew Cooper
> ; George Dunlap ;
> Ian Jackson ; Julien Grall ;
> Konrad Rzeszutek Wilk ; Stefano Stabellini
> ; Wei Liu 
> Subject: Re: [PATCH v5 2/4] mm: modify domain_adjust_tot_pages() to better
> handle a zero adjustment
> 
> On 29.01.2020 11:16, Paul Durrant wrote:
> > Currently the function will pointlessly acquire and release the global
> > 'heap_lock' in this case.
> >
> > NOTE: No caller yet calls domain_adjust_tot_pages() with a zero 'pages'
> >   argument, but a subsequent patch will make this possible.
> 
> With this memory_exchange(), as previously indicated, now needlessly
> prevents the call when !dec_count. I do think, as said there, that
> together with the addition here then redundant checks in callers
> should be dropped (and as it looks the named one is the only one).
> 

Ok, yes I missed that.

  Paul

> Jan
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [ANNOUNCE] Xen 4.14 Development Update

2020-01-29 Thread Durrant, Paul
> -Original Message-
> From: Marek Marczykowski-Górecki 
> Sent: 29 January 2020 12:44
> To: xen-devel@lists.xenproject.org; Durrant, Paul 
> Cc: Woodhouse, David ; luwei.k...@intel.com;
> andrew.coop...@citrix.com; roger@citrix.com
> Subject: Re: [ANNOUNCE] Xen 4.14 Development Update
> 
> On Wed, Jan 29, 2020 at 12:36:18PM +, Paul Durrant wrote:
> > *  Linux stub domains (RFC v2)
> >   -  Marek Marczykowski-Górecki
> 
> There is v4 series already.
> 

Thanks. Will update for next time.

  Paul
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [ANNOUNCE] Xen 4.14 Development Update

2020-01-29 Thread Durrant, Paul
> -Original Message-
> From: Jan Beulich 
> Sent: 29 January 2020 12:48
> To: Durrant, Paul 
> Cc: xen-devel@lists.xenproject.org
> Subject: Re: [ANNOUNCE] Xen 4.14 Development Update
> 
> On 29.01.2020 13:36, Paul Durrant wrote:
> > === x86 ===
> >
> > *  Intel Processor Trace virtualization enabling (v1)
> >   -  Luwei Kang
> >
> > *  Linux stub domains (RFC v2)
> >   -  Marek Marczykowski-Górecki
> >
> > *  Fixes to #DB injection
> >   -  Andrew Cooper
> >
> > *  CPUID/MSR Xen/toolstack improvements
> >   -  Andrew Cooper
> >
> > *  Improvements to domain_crash()
> >   -  Andrew Cooper
> >
> > *  EIBRS
> >   -  Andrew Cooper
> >
> > *  Xen ioreq server (v3)
> >   -  Roger Pau Monne
> 
> Do you want to add "x86/HVM: implement memory read caching"
> (https://lists.xenproject.org/archives/html/xen-devel/2018-
> 09/msg00976.html)
> here? I think I now have something coming a little closer to
> what Andrew wants. It has its own downsides, so there being a
> v4 (after well over a year) doesn't mean this will get it any
> closer to going in.

It sounds like something that is reasonable to track. I can add it in (starting 
at v3, if v4 has not been posted by that point)

> 
> Do you also want to add the ongoing x86 insn emulator work
> here? Some parts were posted, some parts are still in needed
> of finding time to carry out, and some parts are still pretty
> vague.

Ok. It's on-going work so it may as well be tracked.

  Paul

> 
> Jan
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [ANNOUNCE] Xen 4.14 Development Update

2020-01-29 Thread Durrant, Paul
> -Original Message-
> From: David Woodhouse 
> Sent: 29 January 2020 13:05
> To: xen-devel@lists.xenproject.org; Durrant, Paul 
> Cc: Woodhouse, David ; luwei.k...@intel.com;
> marma...@invisiblethingslab.com; andrew.coop...@citrix.com;
> roger@citrix.com
> Subject: Re: [ANNOUNCE] Xen 4.14 Development Update
> 
> On Wed, 2020-01-29 at 12:36 +, Paul Durrant wrote:
> > This email only tracks big items for xen.git tree. Please reply for
> items
> > you would like to see in 4.14 so that people have an idea what
> > is going on and prioritise accordingly.
> >
> > You're welcome to provide description and use cases of the feature
> you're
> > working on.
> >
> > = Timeline =
> >
> > We now adopt a fixed cut-off date scheme. We will release about every 8
> >  months.
> > The critical dates for Xen 4.14 are as follows:
> >
> > ---> We are here
> > * Last posting date: May 1st, 2020
> > * Hard code freeze: May 22nd, 2020
> > * Release: June 26th, 2020
> >
> > Note that we don't have a freeze exception scheme anymore. All patches
> > that wish to go into 4.14 must be posted initially no later than the
> > last posting date and finally no later than the hard code freeze.
> > All patches posted after that date will be automatically queued into
> next
> > release.
> >
> > RCs will be arranged immediately after freeze.
> >
> > There is also a jira instance to track all the tasks (not only big)
> > for the project. See:
> https://xenproject.atlassian.net/projects/XEN/issues.
> >
> > Some of the tasks tracked by this e-mail also have a corresponding jira
> task
> > referred by XEN-N.
> >
> > There is a version number for patch series associated to each feature.
> > Can each owner send an update giving the latest version number if the
> > series has been re-posted? Also, can the owners of any completed items
> > please respond so that the item can be moved into the 'Completed'
> section.
> >
> > = Projects =
> >
> > == Hypervisor ==
> >
> > *  Live-Updating Xen
> >   -  David Woodhouse
> 
> Latest RFC patchset posted is [RFC v2] at
> https://lists.xenproject.org/archives/html/xen-devel/2020-01/msg01764.html
> 
> The tree at
> https://xenbits.xen.org/gitweb/?p=people/dwmw2/xen.git;a=shortlog;h=refs/h
> eads/lu-master
> as moved on a little since then, and I'll post [RFC v3] shortly.
> 
> So far this is mostly just the basic handover of a stream of migration
> data from one Xen to the next, and allowing the new Xen to vmap and
> process it early enough, and reserve the pages which already contain
> domain data during its "boot".
> 
> In our development tree, we have a PV Dom0 surviving the transition.
> We're working on turning those hacks into something we can show in
> public.
> 
> I have updated the wiki page at https://wiki.xenproject.org/wiki/Live-
> Updating_Xen
> which includes a reference to the handover protocol documentation.
> 
> This also depends on the hypervisor side of non-cooperative live
> migration, mentioned below. But as well as that, parts of the memory
> management are going to intersect with the secret hiding work that you
> *didn't* call out in this email AFAICT.

Yes, I omitted Secret Hiding. Agreed it ought to be tracked.

  Paul

> 
> 
> > *  Non-Cooperative Live Migration
> >   -  Paul Durrant
> >
> > === x86 ===
> >
> > *  Intel Processor Trace virtualization enabling (v1)
> >   -  Luwei Kang
> >
> > *  Linux stub domains (RFC v2)
> >   -  Marek Marczykowski-Górecki
> >
> > *  Fixes to #DB injection
> >   -  Andrew Cooper
> >
> > *  CPUID/MSR Xen/toolstack improvements
> >   -  Andrew Cooper
> >
> > *  Improvements to domain_crash()
> >   -  Andrew Cooper
> >
> > *  EIBRS
> >   -  Andrew Cooper
> >
> > *  Xen ioreq server (v3)
> >   -  Roger Pau Monne
> >
> > === ARM ===
> >
> > == Completed ==
> >
> >
> > Paul Durrant

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v6 2/4] mm: modify domain_adjust_tot_pages() to better handle a zero adjustment

2020-01-29 Thread Durrant, Paul
> -Original Message-
> From: Jan Beulich 
> Sent: 29 January 2020 15:08
> To: Durrant, Paul 
> Cc: xen-devel@lists.xenproject.org; Andrew Cooper
> ; George Dunlap ;
> Ian Jackson ; Julien Grall ;
> Konrad Rzeszutek Wilk ; Stefano Stabellini
> ; Wei Liu 
> Subject: Re: [PATCH v6 2/4] mm: modify domain_adjust_tot_pages() to better
> handle a zero adjustment
> 
> On 29.01.2020 15:38, Paul Durrant wrote:
> > --- a/xen/common/memory.c
> > +++ b/xen/common/memory.c
> > @@ -727,8 +727,7 @@ static long
> memory_exchange(XEN_GUEST_HANDLE_PARAM(xen_memory_exchange_t) arg)
> >   (j * (1UL << exch.out.extent_order)));
> >
> >  spin_lock(&d->page_alloc_lock);
> > -drop_dom_ref = (dec_count &&
> > -!domain_adjust_tot_pages(d, -
> dec_count));
> > +drop_dom_ref = !domain_adjust_tot_pages(d, -dec_count);
> 
> And it's only now that I see it in this shape that it becomes
> clear to me why the change above shouldn't be done, and why in
> your other patch code should be written similar to the above:
> The abstract model requires that the domain reference be
> dropped only when ->tot_pages _transitions_ to zero. No drop
> should occur if the count was already zero. Granted this may
> be technically impossible in the specific case here, but the
> code would still better reflect this general model, to prevent
> it getting (mis-)cloned into other places.
> 

Ok, I guess I'll drop this and then make sure that free_domheap_pages() avoids 
an erroneous ref drop.

  Paul
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v6 3/4] mm: make MEMF_no_refcount pages safe to assign

2020-01-29 Thread Durrant, Paul


> -Original Message-
> From: Jan Beulich 
> Sent: 29 January 2020 15:15
> To: Durrant, Paul 
> Cc: xen-devel@lists.xenproject.org; Andrew Cooper
> ; George Dunlap ;
> Ian Jackson ; Julien Grall ;
> Konrad Rzeszutek Wilk ; Stefano Stabellini
> ; Wei Liu ; Volodymyr Babchuk
> ; Roger Pau Monné 
> Subject: Re: [PATCH v6 3/4] mm: make MEMF_no_refcount pages safe to assign
> 
> On 29.01.2020 15:38, Paul Durrant wrote:
> > @@ -2371,6 +2383,8 @@ void free_domheap_pages(struct page_info *pg,
> unsigned int order)
> >
> >  if ( likely(d) && likely(d != dom_cow) )
> >  {
> > +long pages = 0;
> > +
> >  /* NB. May recursively lock from relinquish_memory(). */
> >  spin_lock_recursive(&d->page_alloc_lock);
> >
> > @@ -2386,9 +2400,11 @@ void free_domheap_pages(struct page_info *pg,
> unsigned int order)
> >  BUG();
> >  }
> >  arch_free_heap_page(d, &pg[i]);
> > +if ( !(pg[i].count_info & PGC_no_refcount) )
> > +pages--;
> >  }
> >
> > -drop_dom_ref = !domain_adjust_tot_pages(d, -(1 << order));
> > +drop_dom_ref = !domain_adjust_tot_pages(d, pages);
> 
> Following from what I've just said on the previous patch, this needs
> further changing then as well. There'll need to be a per-domain
> "non-refcounted-pages" count, which - when transitioning from zero
> to non-zero is accompanied by obtaining a domain ref, and when
> transitioning back to zero causes this domain ref to be dropped.
> Otherwise, once the last ref-counted page was freed, the domain
> may become ready for final destruction, no matter how many non-
> refcounted pages there still are on its page lists. (An alternative
> model might be to include all pages in ->tot_pages, keep using just
> that for the domain ref acquire/release, and subtract the new
> count when e.g. comparing against ->max_pages.)

Yes, I think I'll adjust totpages unconditionally and then subtract the 
secondary count for comparison as it means I can leave the ref counting alone.

  Paul

> 
> Jan
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v4 2/2] docs/designs: Add a design document for migration of xenstore data

2020-01-30 Thread Durrant, Paul
> -Original Message-
> From: Marek Marczykowski-Górecki 
> Sent: 29 January 2020 21:23
> To: Durrant, Paul 
> Cc: xen-devel@lists.xenproject.org; Stefano Stabellini
> ; Julien Grall ; Wei Liu
> ; Konrad Rzeszutek Wilk ; George
> Dunlap ; Andrew Cooper
> ; Ian Jackson 
> Subject: Re: [Xen-devel] [PATCH v4 2/2] docs/designs: Add a design
> document for migration of xenstore data
> 
> On Wed, Jan 29, 2020 at 02:47:02PM +, Paul Durrant wrote:
> 
> 
> 
> > +**node data**
> > +
> > +
> > +`|||`
> > +
> > +
> > +`` is considered relative to the domain path
> `/local/domain/$domid`
> > +and hence must not begin with `/`.
> 
> How backend settings are going to be serialized?

They're not going to be. The toolstack will construct new backends; 
co-operation will be required from them in that they must support re-binding 
(which the latest versions of blkback and netback do). We will need to 
white-list backends that are known to do this and refuse non-cooperative 
migration if any other backend is use.

> For example vif backend
> has a bunch of feature-* entries, which should not change under the
> guest feet during non-cooperative migration.
> 

The frontend will normally come back in 'connected' state, in which case a 
change in any feature flags will be irrelevant until the next (if any) 
re-connection (since the protocols only sample them at that point). If the 
frontend is not connected then it will sample the feature flags in the usual 
way. If the frontend/backend are in transition then the locking in the backend 
should prevent the 'unbind' from completing until some level of stability has 
been reached.

> > +`` and `` should be suitable to formulate a `WRITE`
> operation
> > +to the receiving xenstore and `` should be similarly
> suitable
> > +to formulate a subsequent `SET_PERMS` operation.
> > +
> > +**watch data**
> > +
> > +
> > +`||`
> > +
> > +`` again is considered relative and, together with ``,
> should
> > +be suitable to formulate an `ADD_DOMAIN_WATCHES` operation (see below).
> > +
> > +
> > +### Protocol Extension
> > +
> > +The `WATCH` operation does not allow specification of a ``; it
> is
> > +assumed that the watch pertains to the domain that owns the shared ring
> > +over which the operation is passed. Hence, for the tool-stack to be
> able
> > +to register a watch on behalf of a domain a new operation is needed:
> > +
> > +```
> > +ADD_DOMAIN_WATCHES  ||+
> > +
> > +Adds watches on behalf of the specified domain.
> > +
> > + is a NUL separated tuple of |. The semantics of
> this
> > +operation are identical to the domain issuing WATCH || for
> > +each .
> > +```
> 
> Normal WATCH operation triggers an event immediately. Is it intended in
> this case too? On the other hand, guest should cope with spurious watch
> events, so probably not an issue.

I don't think it matters if one is triggered or not. As you say, the watch 
protocol allows them to spuriously fire so the guest should cope either way.

> 
> > +The watch information for a domain also needs to be extracted from the
> > +sending xenstored so the following operation is also needed:
> > +
> > +```
> > +GET_DOMAIN_WATCHES  |   ||*
> > +
> > +Gets the list of watches that are currently registered for the domain.
> > +
> > + is a NUL separated tuple of |. The sub-list
> returned
> > +will start at  into the the overall list of watches and may be
> > +truncated such that the returned data fits within XENSTORE_PAYLOAD_MAX.
> > +If  is beyond the end of the overall list then the returned sub-
> > +list will be empty. If the value of  changes then it indicates
> > +that the overall watch list has changed and thus it may be necessary
> > +to re-issue the operation for previous values of .
> > +```
> 
> In what units  is expressed? bytes? entries?

I thought entries was reasonably well implied since I refer to a list, but I 
can make it explicit.

> Can the response be truncated at arbitrary place, or only between
> records?
> 

Only between records. Again I'll add some words to make that clear.

  Paul

> --
> Best Regards,
> Marek Marczykowski-Górecki
> Invisible Things Lab
> A: Because it messes up the order in which people normally read text.
> Q: Why is top-posting such a bad thing?
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v4 1/2] docs/designs: Add a design document for non-cooperative live migration

2020-01-30 Thread Durrant, Paul
> -Original Message-
> From: Andrew Cooper 
> Sent: 29 January 2020 19:47
> To: Durrant, Paul ; xen-devel@lists.xenproject.org
> Cc: George Dunlap ; Ian Jackson
> ; Jan Beulich ; Julien Grall
> ; Konrad Rzeszutek Wilk ; Stefano
> Stabellini ; Wei Liu 
> Subject: Re: [PATCH v4 1/2] docs/designs: Add a design document for non-
> cooperative live migration
> 
> On 29/01/2020 14:47, Paul Durrant wrote:
> > diff --git a/docs/designs/non-cooperative-migration.md
> b/docs/designs/non-cooperative-migration.md
> > new file mode 100644
> > index 00..5db3939db5
> > --- /dev/null
> > +++ b/docs/designs/non-cooperative-migration.md
> > @@ -0,0 +1,272 @@
> > +# Non-Cooperative Migration of Guests on Xen
> > +
> > +## Background
> > +
> > +The normal model of migration in Xen is driven by the guest because it
> was
> > +originally implemented for PV guests, where the guest must be aware it
> is
> > +running under Xen and is hence expected to co-operate.
> 
> For PV guests, is more than "expected to co-operate".
> 
> Migrating a PV guest involves rewriting every pagetable entry with a
> different MFN, so even before you consider things like the PV protocols,
> there is no way this could be done without the cooperation of the guest.

Yes, the P2M will change and this is visible to the guest, but does a PV guest 
need to take action when this occurs? I'm not sure.

> 
> Sadly, this fact was depended upon for migration of the PV protocols,
> and has migrated (excuse the pun) into the HVM world as well.
> 

Alas yes.

> > This model dates from
> > +an era when it was assumed that the host administrator had control of
> at least
> > +the privileged software running in the guest (i.e. the guest kernel)
> which may
> > +still be true in an enterprise deployment but is not generally true in
> a cloud
> > +environment.
> 
> I haven't seen it discussed elsewhere, but even enterprise environments
> have problems.
> 
> Having host admin == guest admin doesn't mean that guest drivers aren't
> buggy, or that the VM doesn't explode on migrate.

No, but at least the host admin has a chance to test and update guest software 
to be 'reasonably' confident that migration will work before employing it en 
masse.

> 
> The simple fact is that involving the guest kernel adds unnecessary
> moving parts which can (and do with a non-zero probability) go wrong.
> 

Yes, having written the frontend side of migration in the Windows drivers it is 
*very* hard to get right, particularly in Windows where one has to deal with 
the complex and asynchronous PnP subsystem colliding with a migration. The 
network driver also requires a multi-reader/single-writer lock with odd 
semantics (w.r.t. to IRQL) which I had to code myself 
(https://xenbits.xen.org/gitweb/?p=pvdrivers/win/xenvif.git;a=blob;f=src/xenvif/mrsw.h).
 It took years of fixing subtle races (in that and elsewhere) to get to the 
(AFAIK) reliable code we have now. 
Avoiding execution of code like this (in all OS) certainly avoids the 
opportunity for subtle bugs to manifest themselves.

> >  The aim of this design is to provide a model which is purely host
> > +driven, requiring no co-operation from the software running in the
> > +guest, and is thus suitable for cloud scenarios.
> > +
> > +PV guests are out of scope for this project because, as is outlined
> above, they
> > +have a symbiotic relationship with the hypervisor and therefore a
> certain level
> > +of co-operation can be assumed.
> 
> If nothing else, I'd at least suggest s/can be assumed/is necessary/.

Ok. I'll make that modification.

> 
> > +Because the service domain’s domid is used directly by the guest in
> setting
> > +up grant entries and event channels, the backend drivers in the new
> host
> > +environment must be provided by service domain with the same domid.
> Also,
> > +because the guest can sample its own domid from the frontend area and
> use it in
> > +hypercalls (e.g. HVMOP_set_param) rather than DOMID_SELF, the guest
> domid must
> > +also be preserved to maintain the ABI.
> 
> Has this been true since forever?  The grant and event APIs took some
> care to avoid the guest needing to know its own domid.
> 

The guest doesn't need to know its domid; DOMID_SELF will work, but the guest 
*can* use its own domid in this case (whereas I think grant and event ops will 
insist on DOMID_SELF unless referring to another domain). As far as I know this 
has been the case since forever and so I don't think it is something we can 
change now unless we move to a new ABI.

> > +
> > +Furthermore, 

Re: [Xen-devel] [PATCH v5 01/12] MAINTAINERS: put Hyper-V code under Viridian maintainership

2020-01-30 Thread Durrant, Paul
> -Original Message-
> From: Xen-devel  On Behalf Of Wei
> Liu
> Sent: 29 January 2020 20:20
> To: Xen Development List 
> Cc: Stefano Stabellini ; Wei Liu
> ; Wei Liu ; Konrad Rzeszutek Wilk
> ; George Dunlap ;
> Andrew Cooper ; Durrant, Paul
> ; Ian Jackson ; Michael
> Kelley ; Julien Grall 
> Subject: [Xen-devel] [PATCH v5 01/12] MAINTAINERS: put Hyper-V code under
> Viridian maintainership
> 
> And add myself as a maintainer.
> 
> Sort the list alphabetically while at it.
> 
> Signed-off-by: Wei Liu 
> Signed-off-by: Wei Liu 

Reviewed-by: Paul Durrant 

> ---
>  MAINTAINERS | 5 -
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 1915e09f8b..04d91482cd 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -514,10 +514,13 @@ F:  xen/arch/x86/mm/shadow/
> 
>  X86 VIRIDIAN ENLIGHTENMENTS
>  M:   Paul Durrant 
> +M:   Wei Liu 
>  S:   Supported
> +F:   xen/arch/x86/guest/hyperv/
>  F:   xen/arch/x86/hvm/viridian/
> -F:   xen/include/asm-x86/hvm/viridian.h
> +F:   xen/include/asm-x86/guest/hyperv.h
>  F:   xen/include/asm-x86/guest/hyperv-tlfs.h
> +F:   xen/include/asm-x86/hvm/viridian.h
> 
>  XENTRACE
>  M:   George Dunlap 
> --
> 2.20.1
> 
> 
> ___
> Xen-devel mailing list
> Xen-devel@lists.xenproject.org
> https://lists.xenproject.org/mailman/listinfo/xen-devel
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v5 07/12] x86/hyperv: setup hypercall page

2020-01-30 Thread Durrant, Paul
> -Original Message-
> From: Wei Liu  On Behalf Of Wei Liu
> Sent: 29 January 2020 20:20
> To: Xen Development List 
> Cc: Durrant, Paul ; Michael Kelley
> ; Wei Liu ; Jan Beulich
> ; Andrew Cooper ; Wei Liu
> ; Roger Pau Monné 
> Subject: [PATCH v5 07/12] x86/hyperv: setup hypercall page
> 
> Hyper-V uses a technique called overlay page for its hypercall page. It
> will insert a backing page to the guest when the hypercall functionality
> is enabled. That means we can use a page that is not backed by real
> memory for hypercall page.
> 
> Use the top-most addressable page for that purpose. Adjust e820 code
> accordingly.
> 
> We also need to register Xen's guest OS ID to Hyper-V. Use 0x3 as the
> vendor ID. Fix the comment in hyperv-tlfs.h while at it.
> 
> Signed-off-by: Wei Liu 
> ---
> v5:
> 1. use hypervisor_reserve_top_pages
> 2. add a macro for hypercall page mfn
> 3. address other misc comments
> 
> v4:
> 1. Use fixmap
> 2. Follow routines listed in TLFS
> ---
>  xen/arch/x86/e820.c |  5 +++
>  xen/arch/x86/guest/hyperv/hyperv.c  | 57 +++--
>  xen/include/asm-x86/guest/hyperv-tlfs.h |  5 ++-
>  xen/include/asm-x86/guest/hyperv.h  |  3 ++
>  4 files changed, 65 insertions(+), 5 deletions(-)
> 
> diff --git a/xen/arch/x86/e820.c b/xen/arch/x86/e820.c
> index 3892c9cfb7..99643f3ea0 100644
> --- a/xen/arch/x86/e820.c
> +++ b/xen/arch/x86/e820.c
> @@ -343,6 +343,7 @@ static unsigned long __init find_max_pfn(void)
>  {
>  unsigned int i;
>  unsigned long max_pfn = 0;
> +unsigned long top_pfn = ((1ull << paddr_bits) - 1) >> PAGE_SHIFT;
> 
>  for (i = 0; i < e820.nr_map; i++) {
>  unsigned long start, end;
> @@ -357,6 +358,10 @@ static unsigned long __init find_max_pfn(void)
>  max_pfn = end;
>  }
> 
> +top_pfn -= hypervisor_reserve_top_pages();
> +if ( max_pfn >= top_pfn )
> +max_pfn = top_pfn;
> +
>  return max_pfn;
>  }
> 
> diff --git a/xen/arch/x86/guest/hyperv/hyperv.c
> b/xen/arch/x86/guest/hyperv/hyperv.c
> index 8d38313d7a..2bedcc438c 100644
> --- a/xen/arch/x86/guest/hyperv/hyperv.c
> +++ b/xen/arch/x86/guest/hyperv/hyperv.c
> @@ -19,15 +19,26 @@
>   * Copyright (c) 2019 Microsoft.
>   */
>  #include 
> +#include 
> 
> +#include 
>  #include 
>  #include 
> +#include 
> 
>  struct ms_hyperv_info __read_mostly ms_hyperv;
> 
> -static const struct hypervisor_ops ops = {
> -.name = "Hyper-V",
> -};
> +static uint64_t generate_guest_id(void)
> +{
> +uint64_t id;
> +
> +id = (uint64_t)HV_XEN_VENDOR_ID << 48;
> +id |= (xen_major_version() << 16) | xen_minor_version();
> +
> +return id;

I think this should use the hv_guest_os_id union. You can then set the values 
using the bit-fields and return the raw.

  Paul

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v5 08/12] x86/hyperv: provide Hyper-V hypercall functions

2020-01-30 Thread Durrant, Paul
> -Original Message-
> From: Xen-devel  On Behalf Of Wei
> Liu
> Sent: 29 January 2020 20:21
> To: Xen Development List 
> Cc: Stefano Stabellini ; Wei Liu
> ; Wei Liu ; Konrad Rzeszutek Wilk
> ; George Dunlap ;
> Andrew Cooper ; Durrant, Paul
> ; Ian Jackson ; Michael
> Kelley ; Julien Grall ; Roger Pau
> Monné 
> Subject: [Xen-devel] [PATCH v5 08/12] x86/hyperv: provide Hyper-V
> hypercall functions
> 
> These functions will be used later to make hypercalls to Hyper-V.
> 
> Signed-off-by: Wei Liu 

Reviewed-by: Paul Durrant 

> ---
> v5:
> 1. Switch back to direct call
> 2. Fix some issues pointed out by Jan
> 
> I tried using the asm(".equ ..") trick but hit a problem with %c again.
> 
> mm.c:5736:5: error: invalid 'asm': operand is not a condition code,
> invalid operand code 'c'
>asm ( ".equ HV_HCALL_PAGE, %c0; .global HV_HCALL_PAGE"
> ---
>  MAINTAINERS  |  1 +
>  xen/arch/x86/guest/hyperv/hyperv.c   |  6 ++
>  xen/arch/x86/xen.lds.S   |  4 +
>  xen/include/asm-x86/fixmap.h |  3 +-
>  xen/include/asm-x86/guest/hyperv-hcall.h | 96 
>  5 files changed, 108 insertions(+), 2 deletions(-)
>  create mode 100644 xen/include/asm-x86/guest/hyperv-hcall.h
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 04d91482cd..d0a5ed635b 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -519,6 +519,7 @@ S:Supported
>  F:   xen/arch/x86/guest/hyperv/
>  F:   xen/arch/x86/hvm/viridian/
>  F:   xen/include/asm-x86/guest/hyperv.h
> +F:   xen/include/asm-x86/guest/hyperv-hcall.h
>  F:   xen/include/asm-x86/guest/hyperv-tlfs.h
>  F:   xen/include/asm-x86/hvm/viridian.h
> 
> diff --git a/xen/arch/x86/guest/hyperv/hyperv.c
> b/xen/arch/x86/guest/hyperv/hyperv.c
> index 2bedcc438c..932a648ff7 100644
> --- a/xen/arch/x86/guest/hyperv/hyperv.c
> +++ b/xen/arch/x86/guest/hyperv/hyperv.c
> @@ -123,6 +123,12 @@ static const struct hypervisor_ops ops = {
>  .setup = setup,
>  };
> 
> +static void __maybe_unused build_assertions(void)
> +{
> +/* We use 1 in linker script */
> +BUILD_BUG_ON(FIX_X_HYPERV_HCALL != 1);
> +}
> +
>  /*
>   * Local variables:
>   * mode: C
> diff --git a/xen/arch/x86/xen.lds.S b/xen/arch/x86/xen.lds.S
> index 97f9c07891..8e02b4c648 100644
> --- a/xen/arch/x86/xen.lds.S
> +++ b/xen/arch/x86/xen.lds.S
> @@ -329,6 +329,10 @@ SECTIONS
>efi = .;
>  #endif
> 
> +#ifdef CONFIG_HYPERV_GUEST
> +  hv_hcall_page = ABSOLUTE(__fix_x_to_virt(1));
> +#endif
> +
>/* Sections to be discarded */
>/DISCARD/ : {
> *(.exit.text)
> diff --git a/xen/include/asm-x86/fixmap.h b/xen/include/asm-x86/fixmap.h
> index 8094546b75..a9bcb068cb 100644
> --- a/xen/include/asm-x86/fixmap.h
> +++ b/xen/include/asm-x86/fixmap.h
> @@ -16,6 +16,7 @@
> 
>  #define FIXADDR_TOP (VMAP_VIRT_END - PAGE_SIZE)
>  #define FIXADDR_X_TOP (XEN_VIRT_END - PAGE_SIZE)
> +#define __fix_x_to_virt(x) (FIXADDR_X_TOP - ((x) << PAGE_SHIFT))
> 
>  #ifndef __ASSEMBLY__
> 
> @@ -110,8 +111,6 @@ extern void __set_fixmap_x(
> 
>  #define clear_fixmap_x(idx) __set_fixmap_x(idx, 0, 0)
> 
> -#define __fix_x_to_virt(x) (FIXADDR_X_TOP - ((x) << PAGE_SHIFT))
> -
>  #define fix_x_to_virt(x)   ((void *)__fix_x_to_virt(x))
> 
>  #endif /* __ASSEMBLY__ */
> diff --git a/xen/include/asm-x86/guest/hyperv-hcall.h b/xen/include/asm-
> x86/guest/hyperv-hcall.h
> new file mode 100644
> index 00..5b7509b3b5
> --- /dev/null
> +++ b/xen/include/asm-x86/guest/hyperv-hcall.h
> @@ -0,0 +1,96 @@
> +/
> **
> + * asm-x86/guest/hyperv-hcall.h
> + *
> + * This program is free software; you can redistribute it and/or
> + * modify it under the terms and conditions of the GNU General Public
> + * License, version 2, as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public
> + * License along with this program; If not, see
> <http://www.gnu.org/licenses/>.
> + *
> + * Copyright (c) 2019 Microsoft.
> + */
> +
> +#ifndef __X86_HYPERV_HCALL_H__
> +#define __X86_HYPERV_HCALL_H__
> +
> +#include 
> +#include 
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +st

Re: [Xen-devel] [PATCH v5 10/12] x86/hyperv: provide percpu hypercall input page

2020-01-30 Thread Durrant, Paul
> -Original Message-
> From: Xen-devel  On Behalf Of Wei
> Liu
> Sent: 29 January 2020 20:21
> To: Xen Development List 
> Cc: Wei Liu ; Wei Liu ; Andrew Cooper
> ; Durrant, Paul ;
> Michael Kelley ; Roger Pau Monné
> 
> Subject: [Xen-devel] [PATCH v5 10/12] x86/hyperv: provide percpu hypercall
> input page
> 
> Hyper-V's input / output argument must be 8 bytes aligned an not cross
> page boundary. One way to satisfy those requirements is to use percpu
> page.
> 
> For the foreseeable future we only need to provide input for TLB
> and APIC hypercalls, so skip setting up an output page.
> 
> We will also need to provide an ap_setup hook for secondary cpus to
> setup its own input page.
> 
> Signed-off-by: Wei Liu 

Reviewed-by: Paul Durrant 

> ---
> v5:
> 1. Adjust to new ap_setup
> 2. Change variable name to hv_pcpu_input_page
> 
> v4:
> 1. Change wording in commit message
> 2. Prevent leak
> 3. Introduce a private header
> 
> v3:
> 1. Use xenheap page instead
> 2. Drop page tracking structure
> 3. Drop Paul's review tag
> ---
>  xen/arch/x86/guest/hyperv/hyperv.c  | 31 +
>  xen/arch/x86/guest/hyperv/private.h | 29 +++
>  2 files changed, 60 insertions(+)
>  create mode 100644 xen/arch/x86/guest/hyperv/private.h
> 
> diff --git a/xen/arch/x86/guest/hyperv/hyperv.c
> b/xen/arch/x86/guest/hyperv/hyperv.c
> index 4387b6541e..f0facccbad 100644
> --- a/xen/arch/x86/guest/hyperv/hyperv.c
> +++ b/xen/arch/x86/guest/hyperv/hyperv.c
> @@ -27,7 +27,10 @@
>  #include 
>  #include 
> 
> +#include "private.h"
> +
>  struct ms_hyperv_info __read_mostly ms_hyperv;
> +DEFINE_PER_CPU_READ_MOSTLY(void *, hv_pcpu_input_page);
> 
>  static uint64_t generate_guest_id(void)
>  {
> @@ -127,14 +130,42 @@ static void __init setup_hypercall_page(void)
>  }
>  }
> 
> +static int setup_hypercall_pcpu_arg(void)
> +{
> +void *mapping;
> +
> +if ( this_cpu(hv_pcpu_input_page) )
> +return 0;
> +
> +mapping = alloc_xenheap_page();
> +if ( !mapping )
> +{
> +printk("Failed to allocate hypercall input page for CPU%u\n",
> +   smp_processor_id());
> +return -ENOMEM;
> +}
> +
> +this_cpu(hv_pcpu_input_page) = mapping;
> +
> +return 0;
> +}
> +
>  static void __init setup(void)
>  {
>  setup_hypercall_page();
> +if ( setup_hypercall_pcpu_arg() )
> +panic("Hypercall percpu arg setup failed\n");
> +}
> +
> +static int ap_setup(void)
> +{
> +return setup_hypercall_pcpu_arg();
>  }
> 
>  static const struct hypervisor_ops ops = {
>  .name = "Hyper-V",
>  .setup = setup,
> +.ap_setup = ap_setup,
>  };
> 
>  static void __maybe_unused build_assertions(void)
> diff --git a/xen/arch/x86/guest/hyperv/private.h
> b/xen/arch/x86/guest/hyperv/private.h
> new file mode 100644
> index 00..a339274985
> --- /dev/null
> +++ b/xen/arch/x86/guest/hyperv/private.h
> @@ -0,0 +1,29 @@
> +/
> **
> + * arch/x86/guest/hyperv/private.h
> + *
> + * Definitions / declarations only useful to Hyper-V code.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program; If not, see <http://www.gnu.org/licenses/>.
> + *
> + * Copyright (c) 2020 Microsoft.
> + */
> +
> +#ifndef __XEN_HYPERV_PRIVIATE_H__
> +#define __XEN_HYPERV_PRIVIATE_H__
> +
> +#include 
> +
> +DECLARE_PER_CPU(void *, hv_pcpu_input_page);
> +
> +#endif /* __XEN_HYPERV_PRIVIATE_H__  */
> --
> 2.20.1
> 
> 
> ___
> Xen-devel mailing list
> Xen-devel@lists.xenproject.org
> https://lists.xenproject.org/mailman/listinfo/xen-devel
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v5 12/12] x86/hyperv: setup VP assist page

2020-01-30 Thread Durrant, Paul
> -Original Message-
> From: Wei Liu  On Behalf Of Wei Liu
> Sent: 29 January 2020 20:21
> To: Xen Development List 
> Cc: Durrant, Paul ; Michael Kelley
> ; Wei Liu ; Wei Liu
> ; Jan Beulich ; Andrew Cooper
> ; Roger Pau Monné 
> Subject: [PATCH v5 12/12] x86/hyperv: setup VP assist page
> 
> VP assist page is rather important as we need to toggle some bits in it
> for efficient nested virtualisation.
> 
> Signed-off-by: Wei Liu 
> ---
> v5:
> 1. Deal with error properly instead of always panicking
> 2. Swap percpu variables declarations' location
> 
> v4:
> 1. Use private.h
> 2. Prevent leak
> 
> v3:
> 1. Use xenheap page
> 2. Drop set_vp_assist
> 
> v2:
> 1. Use HV_HYP_PAGE_SHIFT instead
> ---
>  xen/arch/x86/guest/hyperv/hyperv.c  | 44 -
>  xen/arch/x86/guest/hyperv/private.h |  1 +
>  2 files changed, 44 insertions(+), 1 deletion(-)
> 
> diff --git a/xen/arch/x86/guest/hyperv/hyperv.c
> b/xen/arch/x86/guest/hyperv/hyperv.c
> index af0d6ed692..bc40a3d338 100644
> --- a/xen/arch/x86/guest/hyperv/hyperv.c
> +++ b/xen/arch/x86/guest/hyperv/hyperv.c
> @@ -31,6 +31,7 @@
> 
>  struct ms_hyperv_info __read_mostly ms_hyperv;
>  DEFINE_PER_CPU_READ_MOSTLY(void *, hv_pcpu_input_page);
> +DEFINE_PER_CPU_READ_MOSTLY(void *, hv_vp_assist);
>  DEFINE_PER_CPU_READ_MOSTLY(unsigned int, hv_vp_index);
> 
>  static uint64_t generate_guest_id(void)
> @@ -155,16 +156,57 @@ static int setup_hypercall_pcpu_arg(void)
>  return 0;
>  }
> 
> +static int setup_vp_assist(void)
> +{
> +void *mapping;
> +uint64_t val;
> +
> +mapping = this_cpu(hv_vp_assist);
> +
> +if ( !mapping )
> +{
> +mapping = alloc_xenheap_page();
> +if ( !mapping )
> +{
> +printk("Failed to allocate vp_assist page for CPU%u\n",
> +   smp_processor_id());
> +return -ENOMEM;
> +}
> +
> +clear_page(mapping);
> +this_cpu(hv_vp_assist) = mapping;
> +}
> +
> +val = (virt_to_mfn(mapping) << HV_HYP_PAGE_SHIFT)
> +| HV_X64_MSR_VP_ASSIST_PAGE_ENABLE;

Perhaps it would be neater to put the viridian_page_msr union into 
hyperv-tlfs.h and then use that.

  Paul

> +wrmsrl(HV_X64_MSR_VP_ASSIST_PAGE, val);
> +
> +return 0;
> +}
> +
>  static void __init setup(void)
>  {
>  setup_hypercall_page();
> +
>  if ( setup_hypercall_pcpu_arg() )
>  panic("Hypercall percpu arg setup failed\n");
> +
> +if ( setup_vp_assist() )
> +panic("VP assist page setup failed\n");
>  }
> 
>  static int ap_setup(void)
>  {
> -return setup_hypercall_pcpu_arg();
> +int rc;
> +
> +rc = setup_hypercall_pcpu_arg();
> +if ( rc )
> +goto out;
> +
> +rc = setup_vp_assist();
> +
> + out:
> +return rc;
>  }
> 
>  static const struct hypervisor_ops ops = {
> diff --git a/xen/arch/x86/guest/hyperv/private.h
> b/xen/arch/x86/guest/hyperv/private.h
> index c1c2431eff..fcddc47544 100644
> --- a/xen/arch/x86/guest/hyperv/private.h
> +++ b/xen/arch/x86/guest/hyperv/private.h
> @@ -25,6 +25,7 @@
>  #include 
> 
>  DECLARE_PER_CPU(void *, hv_pcpu_input_page);
> +DECLARE_PER_CPU(void *, hv_vp_assist);
>  DECLARE_PER_CPU(unsigned int, hv_vp_index);
> 
>  #endif /* __XEN_HYPERV_PRIVIATE_H__  */
> --
> 2.20.1


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v7 2/3] mm: make pages allocated with MEMF_no_refcount safe to assign

2020-01-30 Thread Durrant, Paul
> -Original Message-
> From: Jan Beulich 
> Sent: 30 January 2020 10:20
> To: Durrant, Paul 
> Cc: xen-devel@lists.xenproject.org; Andrew Cooper
> ; George Dunlap ;
> Ian Jackson ; Julien Grall ;
> Konrad Rzeszutek Wilk ; Stefano Stabellini
> ; Wei Liu ; Volodymyr Babchuk
> ; Roger Pau Monné 
> Subject: Re: [PATCH v7 2/3] mm: make pages allocated with MEMF_no_refcount
> safe to assign
> 
> On 29.01.2020 18:10, Paul Durrant wrote:
> > NOTE: steal_page() is also modified to decrement extra_pages in the case
> of
> >   a PGC_extra page being stolen from a domain.
> 
> I don't think stealing of such pages should be allowed. If anything,
> the replacement page then again should be an "extra" one, which I
> guess would be quite ugly to arrange for. But such "extra" pages
> aren't supposed to be properly exposed (and hence played with) to
> the domain in the first place.
> 
> > --- a/xen/common/page_alloc.c
> > +++ b/xen/common/page_alloc.c
> > @@ -2256,6 +2256,7 @@ int assign_pages(
> >  {
> >  int rc = 0;
> >  unsigned long i;
> > +unsigned int extra_pages = 0;
> >
> >  spin_lock(&d->page_alloc_lock);
> >
> > @@ -2267,13 +2268,19 @@ int assign_pages(
> >  goto out;
> >  }
> >
> > +for ( i = 0; i < (1 << order); i++ )
> > +if ( pg[i].count_info & PGC_extra )
> > +extra_pages++;
> 
> Perhaps assume (and maybe ASSERT()) that all pages in the batch
> are the same in this regard? Then you could ...
> 
> >  if ( !(memflags & MEMF_no_refcount) )
> >  {
> > -if ( unlikely((d->tot_pages + (1 << order)) > d->max_pages) )
> > +unsigned int max_pages = d->max_pages - d->extra_pages -
> extra_pages;
> > +
> > +if ( unlikely((d->tot_pages + (1 << order)) > max_pages) )
> >  {
> >  gprintk(XENLOG_INFO, "Over-allocation for domain %u: "
> >  "%u > %u\n", d->domain_id,
> > -d->tot_pages + (1 << order), d->max_pages);
> > +d->tot_pages + (1 << order), max_pages);
> >  rc = -E2BIG;
> >  goto out;
> >  }
> > @@ -2282,13 +2289,17 @@ int assign_pages(
> >  get_knownalive_domain(d);
> >  }
> >
> > +d->extra_pages += extra_pages;
> 
> ... arrange things like this, I think:
> 
> if ( pg[i].count_info & PGC_extra )
> d->extra_pages += 1U << order;
> else if ( !(memflags & MEMF_no_refcount) )
> {
> unsigned int max_pages = d->max_pages - d->extra_pages;
> ...
> 
> This would, afaict, then also eliminate the need to mask off
> MEMF_no_refcount in alloc_domheap_pages(), ...
> 
> 
> >  for ( i = 0; i < (1 << order); i++ )
> >  {
> > +unsigned long count_info = pg[i].count_info;
> > +
> >  ASSERT(page_get_owner(&pg[i]) == NULL);
> > -ASSERT(!pg[i].count_info);
> > +ASSERT(!(count_info & ~PGC_extra));
> 
> ... resulting in my prior comment on this one still applying.
> 
> Besides the changes you've made, what about the code handling
> XENMEM_set_pod_target? What about p2m-pod.c? And
> pv_shim_setup_dom()? I'm also not fully sure whether
> getdomaininfo() shouldn't subtract extra_pages, but I think
> this is the only way to avoid having an externally visible
> effect. There may be more. Perhaps it's best to introduce a
> domain_tot_pages() inline function returning the difference,
> and us it almost everywhere where ->tot_pages is used right
> now.

This is getting very very complicated now, which makes me think that my 
original approach using a 'normal' page and setting an initial max_pages in 
domain_create() was a better approach.

  Paul



___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v7 2/3] mm: make pages allocated with MEMF_no_refcount safe to assign

2020-01-30 Thread Durrant, Paul
> -Original Message-
> From: Jan Beulich 
> Sent: 30 January 2020 11:02
> To: Durrant, Paul 
> Cc: xen-devel@lists.xenproject.org; Andrew Cooper
> ; George Dunlap ;
> Ian Jackson ; Julien Grall ;
> Konrad Rzeszutek Wilk ; Stefano Stabellini
> ; Wei Liu ; Volodymyr Babchuk
> ; Roger Pau Monné 
> Subject: Re: [PATCH v7 2/3] mm: make pages allocated with MEMF_no_refcount
> safe to assign
> 
> (replying from seeing your reply on the list archives, i.e.
> threading lost/broken)
> 
> On 30.01.2020 10:40, Paul Durrant wrote:
> > This is getting very very complicated now, which makes me think that my
> > original approach using a 'normal' page and setting an initial max_pages
> in
> > domain_create() was a better approach.
> 
> I don't think so, no. I also don't thing auditing all ->{max,tot}_pages
> uses can be called "very very complicated". All I can say (again, I
> think) is that there was a reason this APIC page thing was done the
> way it was done. (It's another thing that this probably wasn't a
> _good_ reason.)
> 

I really want to get rid of shared xenheap pages though, so I will persist. 
I'll add the domain_tot_pages() helper as you suggest. I also agree that 
steal_page() ought not to encounter a PGC_extra page so I think I'll just make 
that an error case.

  Paul
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v8 2/4] add a domain_tot_pages() helper function

2020-01-30 Thread Durrant, Paul
> -Original Message-
> From: Jan Beulich 
> Sent: 30 January 2020 16:29
> To: Durrant, Paul 
> Cc: xen-devel@lists.xenproject.org; Andrew Cooper
> ; Wei Liu ; Roger Pau Monné
> ; George Dunlap ; Ian
> Jackson ; Julien Grall ; Konrad
> Rzeszutek Wilk ; Stefano Stabellini
> ; Tim Deegan 
> Subject: Re: [PATCH v8 2/4] add a domain_tot_pages() helper function
> 
> On 30.01.2020 15:57, Paul Durrant wrote:
> > --- a/xen/arch/x86/domain.c
> > +++ b/xen/arch/x86/domain.c
> > @@ -218,7 +218,7 @@ void dump_pageframe_info(struct domain *d)
> >
> >  printk("Memory pages belonging to domain %u:\n", d->domain_id);
> >
> > -if ( d->tot_pages >= 10 && d->is_dying < DOMDYING_dead )
> > +if ( domain_tot_pages(d) >= 10 && d->is_dying < DOMDYING_dead )
> 
> Before I go any further - are you simply replacing _all_
> ->tot_pages uses by the new helper?

Basically, apart from domain_adjust_tot_pages(), yes.

> In the case here, for
> example, I don't think this is what we want.
> 

Why not? I would have thought any 'extra' pages would always be of interest.

  Paul
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v4 5/7] libxl: allow creation of domains with a specified or random domid

2020-01-30 Thread Durrant, Paul
> -Original Message-
> From: Ian Jackson 
> Sent: 30 January 2020 17:26
> To: Durrant, Paul 
> Cc: xen-devel@lists.xenproject.org; Wei Liu ; Anthony Perard
> ; Andrew Cooper ;
> George Dunlap ; Jan Beulich ;
> Julien Grall ; Konrad Rzeszutek Wilk
> ; Stefano Stabellini ;
> Jason Andryuk 
> Subject: Re: [PATCH v4 5/7] libxl: allow creation of domains with a
> specified or random domid
> 
> Paul Durrant writes ("[PATCH v4 5/7] libxl: allow creation of domains with
> a specified or random domid"):
> > This patch adds a 'domid' field to libxl_domain_create_info and then
> > modifies libxl__domain_make() to have Xen use that value if it is valid.
> > If the domid value is invalid then Xen will choose the domid, as before,
> > unless the value is the new special RANDOM_DOMID value added to the API.
> > This value instructs libxl__domain_make() to choose a random domid value
> > for Xen to use.
> >
> > If Xen determines that a domid specified to or chosen by
> > libxl__domain_make() co-incides with an existing domain then the create
> > operation will fail. In this case, if RANDOM_DOMID was specified to
> > libxl__domain_make() then a new random value will be chosen and the
> create
> > operation will be re-tried, otherwise libxl__domain_make() will fail.
> >
> > After Xen has successfully created a new domain, libxl__domain_make()
> will
> > check whether its domid matches any recently used domid values. If it
> does
> > then the domain will be destroyed. If the domid used in creation was
> > specified to libxl__domain_make() then it will fail at this point,
> > otherwise the create operation will be re-tried with either a new random
> > or Xen-selected domid value.
> ...
> > -ret = xc_domain_create(ctx->xch, domid, &create);
> > +if (libxl_domid_valid_guest(info->domid))
> > +*domid = info->domid;
> > +
> > +again:
> > +for (;;) {
> > +if (info->domid == RANDOM_DOMID) {
> > +uint16_t v;
> > +
> > +ret = libxl__random_bytes(gc, (void *)&v, sizeof(v));
> > +if (ret < 0)
> > +break;
> > +
> > +v &= DOMID_MASK;
> > +if (!libxl_domid_valid_guest(v))
> > +continue;
> > +
> > +*domid = v;
> > +}
> > +
> > +ret = xc_domain_create(ctx->xch, domid, &create);
> > +if (ret == 0 || errno != EEXIST || info->domid !=
> RANDOM_DOMID)
> > +break;
> > +}
> > +
> >  if (ret < 0) {
> >  LOGED(ERROR, *domid, "domain creation fail");
> > +*domid = INVALID_DOMID;
> > +rc = ERROR_FAIL;
> > +goto out;
> > +}
> > +
> > +if (libxl__is_domid_recent(gc, *domid)) {
> > +if (*domid == info->domid) /* domid was specified */
> > +LOGED(ERROR, *domid, "domain id recently used");
> > +
> > +ret = xc_domain_destroy(ctx->xch, *domid);
> > +if (!ret) {
> > +*domid = INVALID_DOMID;
> > +
> > +/* If the domid was not specified then have another go
> */
> > +if (!libxl_domid_valid_guest(info->domid))
> > +goto again;
> > +}
> 
> You have written this as two nested loops, one of which is implemented
> as a goto, but actually logically this is surely only one loop ?
> Please could you reorganise this and then I'll read it again...
> 

Ok, I'll try to squash it down to one. It should be do-able.

  Paul

> Thanks,
> Ian.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v4 1/7] libxl: add definition of INVALID_DOMID to the API

2020-01-30 Thread Durrant, Paul
> -Original Message-
> From: Xen-devel  On Behalf Of Ian
> Jackson
> Sent: 30 January 2020 17:32
> To: Durrant, Paul 
> Cc: Anthony Perard ; xen-
> de...@lists.xenproject.org; Wei Liu 
> Subject: Re: [Xen-devel] [PATCH v4 1/7] libxl: add definition of
> INVALID_DOMID to the API
> 
> Paul Durrant writes ("[PATCH v4 1/7] libxl: add definition of
> INVALID_DOMID to the API"):
> > Currently both xl and libxl have internal definitions of INVALID_DOMID
> > which happen to be identical. However, for the purposes of describing
> the
> > behaviour of libxl_domain_create_new/restore() it is useful to have a
> > specified invalid value for a domain id.
> 
> Hi.  (I'm replying to the 1/ here because I don't seem to have any 0/
> in my inbox...)
> 

Oh, I must have forgot to copy the combined cc list onto the cover letter. 
Sorry about that.

> Thanks for all this.  As well as your use case, which is in itself
> valuable, I think this change is important for other reasons.
> So I want to see it go in.
> 
> You'll see I have replied with some comments about the implementation.
> I hope you won't be discouraged.  If you feel I am asking for too much
> work I might be inclined to pick up some of this myself; if so, please
> let me know.  I definitely don't want this to be dropped.

Don't worry, your feedback is fine... certainly not asking too much.

  Cheers,

Paul


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v4 7/7] xl: allow domid to be preserved on save/restore or migrate

2020-01-30 Thread Durrant, Paul
> -Original Message-
> From: Ian Jackson 
> Sent: 30 January 2020 17:29
> To: Durrant, Paul 
> Cc: xen-devel@lists.xenproject.org; Wei Liu ; Anthony Perard
> 
> Subject: Re: [PATCH v4 7/7] xl: allow domid to be preserved on
> save/restore or migrate
> 
> Paul Durrant writes ("[PATCH v4 7/7] xl: allow domid to be preserved on
> save/restore or migrate"):
> > This patch adds a '-D' command line option to save and migrate to allow
> > the domain id to be incorporated into the saved domain configuration and
> > hence be preserved.
> 
> Thanks.
> 
> In response to v3 6/6 I wrote:
> 
>   I wonder if this should be done more in libxl.  Should this be a
>   domain property ?  Wei, Anthony ?
> 
> This question remains unanswered.  I'm sorry that neither Wei nor
> Anthony had a chance to answer yet...
> 
> Paul, do you have a reason for doing it this way ?  My inclination is
> that think doing it at the libxl layer would make more sense.  Do you
> think that would be possible ?
> 

I'm not sure how it would work at the libxl level as the domid is part of 
create_info and that it transferred by xl on migration. IIUC we'd need a new 
libxl save record to transfer it at that level, and then I'm not sure whether 
we'd hit an ordering problem as we'd have to dig that save record out before we 
could actually create the domain.

  Paul

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v4 1/7] libxl: add definition of INVALID_DOMID to the API

2020-01-31 Thread Durrant, Paul
> -Original Message-
> From: Xen-devel  On Behalf Of
> Roger Pau Monné
> Sent: 22 January 2020 14:53
> To: Durrant, Paul 
> Cc: Anthony PERARD ; xen-
> de...@lists.xenproject.org; Ian Jackson ; Wei
> Liu 
> Subject: Re: [Xen-devel] [PATCH v4 1/7] libxl: add definition of
> INVALID_DOMID to the API
> 
> On Wed, Jan 22, 2020 at 02:44:40PM +, Paul Durrant wrote:
> > Currently both xl and libxl have internal definitions of INVALID_DOMID
> > which happen to be identical. However, for the purposes of describing
> the
> > behaviour of libxl_domain_create_new/restore() it is useful to have a
> > specified invalid value for a domain id.
> >
> > This patch therefore moves the libxl definition from libxl_internal.h to
> > libxl.h and removes the internal definition from xl_utils.h. The
> hardcoded
> > '-1' passed back via domcreate_complete() is then updated to
> INVALID_DOMID
> > and comment above libxl_domain_create_new/restore() is accordingly
> > modified.
> 
> Urg, it's kind of ugly to add another definition of invalid domid when
> there's already DOMID_INVALID in the public headers. I guess there's a
> reason I'm missing for not using DOMID_INVALID instead of introducing
> a new value?
> 

TBH I don't know. As far as xl/libxl goes, domids are uint32_ts so maybe 
DOMID_INVALID was for some reason not considered appropriate? Bear in mind, 
this patch is not truly introducing a new value; it's just making something 
that was internal but identical in both xl and libxl part of the interface.

> If so could this be mentioned in the commit message?
> 

I'll add a note to the commit comment to point out that this is not changing 
any value used by the toolstack.

  Paul

> Thanks, Roger.
> 
> ___
> Xen-devel mailing list
> Xen-devel@lists.xenproject.org
> https://lists.xenproject.org/mailman/listinfo/xen-devel
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v4 4/7] libxl: add infrastructure to track and query 'recent' domids

2020-01-31 Thread Durrant, Paul


> -Original Message-
> From: Anthony PERARD 
> Sent: 31 January 2020 10:55
> To: Durrant, Paul 
> Cc: xen-devel@lists.xenproject.org; Ian Jackson
> ; Wei Liu 
> Subject: Re: [PATCH v4 4/7] libxl: add infrastructure to track and query
> 'recent' domids
> 
> On Wed, Jan 22, 2020 at 02:44:43PM +, Paul Durrant wrote:
> > A domid is considered recent if the domain it represents was destroyed
> > less than a specified number of seconds ago. The number can be set using
> > the environment variable LIBXL_DOMID_REUSE_TIMEOUT. If the variable does
> > not exist then a default value of 60s is used.
> 
> Could you rewrite that part of the commit message? By reading it, it
> sounds to me like LIBXL_DOMID_REUSE_TIMEOUT is a configuration variable
> that a toolstack may want to set. Whereas the comment in
> libxl_internal.h indicates that it's for debuging.  Having env var for
> debugging sounds good, but having env var as configuration doesn't.
> 

Sure, I'll make the commit comment clearer.

  Paul

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v4 1/7] libxl: add definition of INVALID_DOMID to the API

2020-01-31 Thread Durrant, Paul
> -Original Message-
> From: Roger Pau Monné 
> Sent: 31 January 2020 11:06
> To: Durrant, Paul 
> Cc: Anthony PERARD ; xen-
> de...@lists.xenproject.org; Ian Jackson ; Wei
> Liu 
> Subject: Re: [Xen-devel] [PATCH v4 1/7] libxl: add definition of
> INVALID_DOMID to the API
> 
> On Fri, Jan 31, 2020 at 10:31:49AM +, Durrant, Paul wrote:
> > > -Original Message-
> > > From: Xen-devel  On Behalf Of
> > > Roger Pau Monné
> > > Sent: 22 January 2020 14:53
> > > To: Durrant, Paul 
> > > Cc: Anthony PERARD ; xen-
> > > de...@lists.xenproject.org; Ian Jackson ;
> Wei
> > > Liu 
> > > Subject: Re: [Xen-devel] [PATCH v4 1/7] libxl: add definition of
> > > INVALID_DOMID to the API
> > >
> > > On Wed, Jan 22, 2020 at 02:44:40PM +, Paul Durrant wrote:
> > > > Currently both xl and libxl have internal definitions of
> INVALID_DOMID
> > > > which happen to be identical. However, for the purposes of
> describing
> > > the
> > > > behaviour of libxl_domain_create_new/restore() it is useful to have
> a
> > > > specified invalid value for a domain id.
> > > >
> > > > This patch therefore moves the libxl definition from
> libxl_internal.h to
> > > > libxl.h and removes the internal definition from xl_utils.h. The
> > > hardcoded
> > > > '-1' passed back via domcreate_complete() is then updated to
> > > INVALID_DOMID
> > > > and comment above libxl_domain_create_new/restore() is accordingly
> > > > modified.
> > >
> > > Urg, it's kind of ugly to add another definition of invalid domid when
> > > there's already DOMID_INVALID in the public headers. I guess there's a
> > > reason I'm missing for not using DOMID_INVALID instead of introducing
> > > a new value?
> > >
> >
> > TBH I don't know. As far as xl/libxl goes, domids are uint32_ts so maybe
> DOMID_INVALID was for some reason not considered appropriate? Bear in
> mind, this patch is not truly introducing a new value; it's just making
> something that was internal but identical in both xl and libxl part of the
> interface.
> 
> Hm, OK. It seems quite a mess that libxl uses a 32bit value when the
> hypervisor is using a 16bit field, but I guess there's nothing that
> can be done at this point to fix this.
> 
> Since this will be part of the public interface now, it might make
> sense to define it to the same value as DOMID_INVALID (0x7FF4). And
> make sure domid values passed to libxl are truncated to 16bits.

That sounds like feature creep that I'd rather not go near in this patch. I 
suspect a can of worms awaits.

> 
> Maybe it's not that relevant, but domids passed to libxl would need to
> be sanitized in order to make sure Xen's DOMID_INVALID is not treated
> as a valid domid value from libxl's PoV. There are also other special
> domids that need to be filtered.

There is already a validity check: libxl_domid_valid_guest().

  Paul

> 
> > > If so could this be mentioned in the commit message?
> > >
> >
> > I'll add a note to the commit comment to point out that this is not
> changing any value used by the toolstack.
> 
> Thanks!
> 
> Roger.
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v4 1/7] libxl: add definition of INVALID_DOMID to the API

2020-01-31 Thread Durrant, Paul
> -Original Message-
> From: Andrew Cooper 
> Sent: 31 January 2020 12:08
> To: Durrant, Paul ; Roger Pau Monné
> 
> Cc: Anthony PERARD ; xen-
> de...@lists.xenproject.org; Ian Jackson ; Wei
> Liu 
> Subject: Re: [Xen-devel] [PATCH v4 1/7] libxl: add definition of
> INVALID_DOMID to the API
> 
> On 31/01/2020 11:10, Durrant, Paul wrote:
> >> -Original Message-
> >> From: Roger Pau Monné 
> >> Sent: 31 January 2020 11:06
> >> To: Durrant, Paul 
> >> Cc: Anthony PERARD ; xen-
> >> de...@lists.xenproject.org; Ian Jackson ;
> Wei
> >> Liu 
> >> Subject: Re: [Xen-devel] [PATCH v4 1/7] libxl: add definition of
> >> INVALID_DOMID to the API
> >>
> >> On Fri, Jan 31, 2020 at 10:31:49AM +, Durrant, Paul wrote:
> >>>> -Original Message-
> >>>> From: Xen-devel  On Behalf Of
> >>>> Roger Pau Monné
> >>>> Sent: 22 January 2020 14:53
> >>>> To: Durrant, Paul 
> >>>> Cc: Anthony PERARD ; xen-
> >>>> de...@lists.xenproject.org; Ian Jackson ;
> >> Wei
> >>>> Liu 
> >>>> Subject: Re: [Xen-devel] [PATCH v4 1/7] libxl: add definition of
> >>>> INVALID_DOMID to the API
> >>>>
> >>>> On Wed, Jan 22, 2020 at 02:44:40PM +, Paul Durrant wrote:
> >>>>> Currently both xl and libxl have internal definitions of
> >> INVALID_DOMID
> >>>>> which happen to be identical. However, for the purposes of
> >> describing
> >>>> the
> >>>>> behaviour of libxl_domain_create_new/restore() it is useful to have
> >> a
> >>>>> specified invalid value for a domain id.
> >>>>>
> >>>>> This patch therefore moves the libxl definition from
> >> libxl_internal.h to
> >>>>> libxl.h and removes the internal definition from xl_utils.h. The
> >>>> hardcoded
> >>>>> '-1' passed back via domcreate_complete() is then updated to
> >>>> INVALID_DOMID
> >>>>> and comment above libxl_domain_create_new/restore() is accordingly
> >>>>> modified.
> >>>> Urg, it's kind of ugly to add another definition of invalid domid
> when
> >>>> there's already DOMID_INVALID in the public headers. I guess there's
> a
> >>>> reason I'm missing for not using DOMID_INVALID instead of introducing
> >>>> a new value?
> >>>>
> >>> TBH I don't know. As far as xl/libxl goes, domids are uint32_ts so
> maybe
> >> DOMID_INVALID was for some reason not considered appropriate?
> 
> Read the commit message where I did the blanket change to use uint32_t.
> 
> c/s 5b42c82f55
> 
> Does this really need to enter the API?
> 

I need a 'magic' value for RANDOM_DOMID so it seemed reasonable to make this 
part of the xl/libxl API too. I don't think libxc is in scope here.

  Paul

> ~Andrew
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v8 1/4] x86 / vmx: move teardown from domain_destroy()...

2020-01-31 Thread Durrant, Paul
> -Original Message-
> From: Jan Beulich 
> Sent: 31 January 2020 13:32
> To: Durrant, Paul 
> Cc: xen-devel@lists.xenproject.org; Jun Nakajima ;
> Kevin Tian ; Andrew Cooper
> ; Wei Liu ; Roger Pau Monné
> ; George Dunlap 
> Subject: Re: [PATCH v8 1/4] x86 / vmx: move teardown from
> domain_destroy()...
> 
> On 30.01.2020 15:57, Paul Durrant wrote:
> > ... to domain_relinquish_resources().
> >
> > The teardown code frees the APICv page. This does not need to be done
> late
> > so do it in domain_relinquish_resources() rather than domain_destroy().
> >
> > Signed-off-by: Paul Durrant 
> 
> Btw., this can have my
> Reviewed-by: Jan Beulich 
> as soon as "x86/HVM: relinquish resources also from hvm_domain_destroy()"
> has gone in. But that's still pending an ack or otherwise by you.
> 

Ok, thanks, I'll get on and review that.

  Paul
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2] x86/HVM: relinquish resources also from hvm_domain_destroy()

2020-01-31 Thread Durrant, Paul
> -Original Message-
> From: Xen-devel  On Behalf Of Jan
> Beulich
> Sent: 29 January 2020 13:00
> To: xen-devel@lists.xenproject.org
> Cc: Andrew Cooper ; Paul Durrant
> ; Wei Liu ; Roger Pau Monné
> 
> Subject: [Xen-devel] [PATCH v2] x86/HVM: relinquish resources also from
> hvm_domain_destroy()
> 
> Domain creation failure paths don't call domain_relinquish_resources(),
> yet allocations and alike done from hvm_domain_initialize() need to be
> undone nevertheless. Call the function also from hvm_domain_destroy(),
> after making sure all descendants are idempotent.
> 
> Note that while viridian_{domain,vcpu}_deinit() were already used in
> ways suggesting they're idempotent, viridian_time_vcpu_deinit() actually
> wasn't: One can't kill a timer that was never initialized.
> 
> For hvm_destroy_all_ioreq_servers()'s purposes make
> relocate_portio_handler() return whether the to be relocated port range
> was actually found. This seems cheaper than introducing a flag into
> struct hvm_domain's ioreq_server sub-structure.
> 
> In hvm_domain_initialise() additionally
> - use XFREE() also to replace adjacent xfree(),
> - use hvm_domain_relinquish_resources() as being idempotent now.
> There as well as in hvm_domain_destroy() the explicit call to
> rtc_deinit() isn't needed anymore.
> 
> In hvm_domain_relinquish_resources() additionally drop a no longer
> relevant if().
> 
> Fixes: e7a9b5e72f26 ("viridian: separately allocate domain and vcpu
> structures")
> Fixes: 26fba3c85571 ("viridian: add implementation of synthetic timers")
> Signed-off-by: Jan Beulich 

LGTM

Reviewed-by: Paul Durrant 

> Reviewed-by: Roger Pau Monné 
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v8 2/4] add a domain_tot_pages() helper function

2020-01-31 Thread Durrant, Paul
> -Original Message-
> From: Jan Beulich 
> Sent: 31 January 2020 12:53
> To: Durrant, Paul 
> Cc: xen-devel@lists.xenproject.org; Andrew Cooper
> ; Wei Liu ; Roger Pau Monné
> ; George Dunlap ; Ian
> Jackson ; Julien Grall ; Konrad
> Rzeszutek Wilk ; Stefano Stabellini
> ; Tim Deegan 
> Subject: Re: [PATCH v8 2/4] add a domain_tot_pages() helper function
> 
> On 30.01.2020 15:57, Paul Durrant wrote:
> > v8:
> >  - New in v8
> > ---
> >  xen/arch/x86/domain.c   |  2 +-
> >  xen/arch/x86/mm.c   |  6 +++---
> >  xen/arch/x86/mm/p2m-pod.c   | 10 +-
> >  xen/arch/x86/mm/shadow/common.c |  2 +-
> >  xen/arch/x86/msi.c  |  2 +-
> >  xen/arch/x86/numa.c |  2 +-
> >  xen/arch/x86/pv/dom0_build.c| 25 +
> >  xen/arch/x86/pv/domain.c|  2 +-
> >  xen/common/domctl.c |  2 +-
> >  xen/common/grant_table.c|  4 ++--
> >  xen/common/keyhandler.c |  2 +-
> >  xen/common/memory.c |  4 ++--
> >  xen/common/page_alloc.c | 15 ---
> >  xen/include/public/memory.h |  4 ++--
> >  xen/include/xen/sched.h | 24 ++--
> >  15 files changed, 60 insertions(+), 46 deletions(-)
> 
> From this, with the comment you add next to the struct field, and
> with your reply yesterday, what about the uses in
> - arch/arm/arm64/domctl.c:switch_mode(),

TBH I'm not sure with that one. It looks to me like it needs to check whether 
the domain has *any* memory assigned. Perhaps checking page_list would be more 
appropriate. Perhaps Julien can comment?

> - arch/x86/pv/shim.c:pv_shim_setup_dom(),
> - arch/x86/pv/shim.c:write_start_info()?

It looks like both of those should be changed.

> 
> > --- a/xen/arch/x86/mm.c
> > +++ b/xen/arch/x86/mm.c
> > @@ -4194,8 +4194,8 @@ long do_mmu_update(
> >   * - page caching attributes cleaned up
> >   * - removed from the domain's page_list
> >   *
> > - * If MEMF_no_refcount is not set, the domain's tot_pages will be
> > - * adjusted.  If this results in the page count falling to 0,
> > + * If MEMF_no_refcount is not set, the domain_adjust_tot_pages() will
> > + * be called.  If this results in the page count falling to 0,
> >   * put_domain() will be called.
> 
> If you fiddle with this comment, please also drop the "the" ahead
> of the function name. Unless you as a native speaker would confirm
> it's appropriate there (it doesn't seem so to me). Of course I
> also wouldn't mind leaving this untouched altogether.
> 

Ok, I'll drop that hunk.

> > --- a/xen/common/memory.c
> > +++ b/xen/common/memory.c
> > @@ -717,7 +717,7 @@ static long
> memory_exchange(XEN_GUEST_HANDLE_PARAM(xen_memory_exchange_t) arg)
> >
> >  /*
> >   * Pages in in_chunk_list is stolen without
> > - * decreasing the tot_pages. If the domain is dying
> when
> > + * decreasing domain_tot_pages(). If the domain is
> dying when
> 
> I'd leave this comment alone, or at least not use the function
> name. Maybe do as you did in the public header?
> 

OK I'll leave this alone too.

> > --- a/xen/include/xen/sched.h
> > +++ b/xen/include/xen/sched.h
> > @@ -364,12 +364,18 @@ struct domain
> >  spinlock_t   page_alloc_lock; /* protects all the following
> fields  */
> >  struct page_list_head page_list;  /* linked list */
> >  struct page_list_head xenpage_list; /* linked list (size
> xenheap_pages) */
> > -unsigned int tot_pages;   /* number of pages currently
> possesed */
> > -unsigned int xenheap_pages;   /* # pages allocated from Xen
> heap*/
> > -unsigned int outstanding_pages; /* pages claimed but not
> possessed  */
> > -unsigned int max_pages;   /* maximum value for tot_pages
> */
> > -atomic_t shr_pages;   /* number of shared pages
> */
> > -atomic_t paged_pages; /* number of paged-out pages
> */
> > +
> > +/*
> > + * This field should only be directly accessed by
> domain_adjust_tot_pages()
> > + * and the domain_tot_pages() helper function defined below.
> > + */
> > +unsigned int tot_pages;
> 
> If the three missing ones got taken care of, with there being arguments
> both pro and con your change to dump_pageframe_info(), I'd be okay with
> it getting changed as you do, to not render this comment partially
> wrong.

Ok.

  Paul

> 
> Jan
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

<    1   2   3   4   5   >