date:20190829

Re: [Xen-devel] [PATCH] x86/domain: don't destroy IOREQ servers on soft reset

2019-08-29 Thread Paul Durrant

> -Original Message-
> From: Igor Druzhinin 
> Sent: 28 August 2019 21:40
> To: xen-devel@lists.xenproject.org
> Cc: jbeul...@suse.com; Andrew Cooper ; Paul Durrant
> ; w...@xen.org; Roger Pau Monne 
> ; Igor Druzhinin
> 
> Subject: [PATCH] x86/domain: don't destroy IOREQ servers on soft reset
> 
> Performing soft reset should not opportunistically kill IOREQ servers
> for device emulators that might be currently running for a domain.
> Every emulator is supposed to clean up IOREQ servers for itself on exit.
> This allows a toolstack to elect whether or not a particular device
> model should be restarted.
> 
> Since commit ba7fdd64b ("xen: cleanup IOREQ server on exit") QEMU now
> destroys IOREQ server for itself as every other device emulator
> is supposed to do. It's now safe to remove this code from soft reset
> path - existing systems with old QEMU should be able to work as
> even if there are IOREQ servers left behind, a new QEMU instance will
> override its ranges anyway.
> 
> Signed-off-by: Igor Druzhinin 

Code LGTM, but I think it also be useful to mention the commit that introduced 
hvm_domain_soft_reset():

3235cbfe "arch-specific hooks for domain_soft_reset()"

...since it specifically calls out destroying ioreq servers in its commit 
message. I suspect that was necessary at the time because the 'default' ioreq 
server had no means of cleaning up (because it was unaware of the API) but now 
that support for a default server has gone away, this patch should be fine.

  Paul

> ---
>  xen/arch/x86/domain.c | 2 --
>  xen/arch/x86/hvm/hvm.c| 5 -
>  xen/include/asm-x86/hvm/hvm.h | 1 -
>  3 files changed, 8 deletions(-)
> 
> diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
> index 2df3123..957f059 100644
> --- a/xen/arch/x86/domain.c
> +++ b/xen/arch/x86/domain.c
> @@ -713,8 +713,6 @@ int arch_domain_soft_reset(struct domain *d)
>  if ( !is_hvm_domain(d) )
>  return -EINVAL;
> 
> -hvm_domain_soft_reset(d);
> -
>  spin_lock(>event_lock);
>  for ( i = 0; i < d->nr_pirqs ; i++ )
>  {
> diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
> index 029eea3..2b81899 100644
> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -5045,11 +5045,6 @@ void hvm_toggle_singlestep(struct vcpu *v)
>  v->arch.hvm.single_step = !v->arch.hvm.single_step;
>  }
> 
> -void hvm_domain_soft_reset(struct domain *d)
> -{
> -hvm_destroy_all_ioreq_servers(d);
> -}
> -
>  /*
>   * Segment caches in VMCB/VMCS are inconsistent about which bits are checked,
>   * important, and preserved across vmentry/exit.  Cook the values to make 
> them
> diff --git a/xen/include/asm-x86/hvm/hvm.h b/xen/include/asm-x86/hvm/hvm.h
> index b327bd2..4e72d07 100644
> --- a/xen/include/asm-x86/hvm/hvm.h
> +++ b/xen/include/asm-x86/hvm/hvm.h
> @@ -240,7 +240,6 @@ extern const struct hvm_function_table *start_vmx(void);
>  int hvm_domain_initialise(struct domain *d);
>  void hvm_domain_relinquish_resources(struct domain *d);
>  void hvm_domain_destroy(struct domain *d);
> -void hvm_domain_soft_reset(struct domain *d);
> 
>  int hvm_vcpu_initialise(struct vcpu *v);
>  void hvm_vcpu_destroy(struct vcpu *v);
> --
> 2.7.4


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [qemu-mainline test] 140739: regressions - FAIL

2019-08-29 Thread osstest service owner

flight 140739 qemu-mainline real [real]
http://logs.test-lab.xenproject.org/osstest/logs/140739/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-xl-pvshim   12 guest-start  fail REGR. vs. 140282

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stopfail like 140282
 test-armhf-armhf-libvirt 14 saverestore-support-checkfail  like 140282
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop fail like 140282
 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stopfail like 140282
 test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail  like 140282
 test-amd64-i386-xl-pvshim12 guest-start  fail   never pass
 test-amd64-i386-libvirt-xsm  13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  14 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-arm64-arm64-xl-credit1  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  14 saverestore-support-checkfail   never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2  fail never pass
 test-arm64-arm64-xl-thunderx 13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 14 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  14 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 13 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 14 saverestore-support-checkfail  never pass
 test-armhf-armhf-xl-credit2  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 13 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 14 saverestore-support-checkfail never pass
 test-armhf-armhf-xl-vhd  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  13 saverestore-support-checkfail   never pass
 test-amd64-i386-xl-qemuu-ws16-amd64 17 guest-stop  fail never pass
 test-armhf-armhf-libvirt-raw 12 migrate-support-checkfail   never pass
 test-amd64-i386-xl-qemuu-win10-i386 10 windows-install fail never pass
 test-amd64-amd64-xl-qemuu-win10-i386 10 windows-installfail never pass

version targeted for testing:
 qemuu23919ddfd56135cad3cb468a8f54d5a595f024f4
baseline version:
 qemuuafd760539308a5524accf964107cdb1d54a059e3

Last test of basis   140282  2019-08-18 05:36:51 Z   11 days
Failing since140361  2019-08-19 11:36:26 Z9 days   11 attempts
Testing same since   140739  2019-08-28 07:28:02 Z1 days1 attempts


People who touched revisions under test:
  Alberto Garcia 
  Aleksandar Markovic 
  Alex Bennée 
  Alexey Kardashevskiy 
  Andrey Shinkevich 
  Anthony PERARD 
  Aurelien Jarno 
  BALATON Zoltan 
  Bastian Koppelmann 
  Carlo Marcelo Arenas Belón 
  Catherine Ho

Re: [Xen-devel] [RFC Patch] xen/pt: Emulate FLR capability

2019-08-29 Thread Roger Pau Monné

On Thu, Aug 29, 2019 at 05:02:27PM +0800, Chao Gao wrote:
> Currently, for a HVM on Xen, no reset method is virtualized. So in a VM's
> perspective, assigned devices cannot be reset. But some devices rely on PCI
> reset to recover from hardware hangs. When being assigned to a VM, those
> devices cannot be reset and won't work any longer if a hardware hang occurs.
> We have to reboot VM to trigger PCI reset on host to recover the device.
>
> This patch exposes FLR capability to VMs if the assigned device can be reset 
> on
> host. When VM initiates an FLR to a device, qemu cleans up the device state,
> (including disabling of intx and/or MSI and unmapping BARs from guest, 
> deleting
> emulated registers), then initiate PCI reset through 'reset' knob under the
> device's sysfs, finally initialize the device again.

I think you likely need to deassign the device from the VM, perform
the reset, and then assign the device again, so that there's no Xen
internal state carried over prior to the reset?

Thanks, Roger.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH V3 4/8] xen/common: Introduce xrealloc_flex_struct() helper macros

2019-08-29 Thread Jan Beulich

On 28.08.2019 20:23, Oleksandr wrote:
> --- a/xen/include/xen/xmalloc.h
> +++ b/xen/include/xen/xmalloc.h
> @@ -35,6 +35,18 @@
>   #define xzalloc_array(_type, _num) \
>   ((_type *)_xzalloc_array(sizeof(_type), __alignof__(_type), _num))
> 
> +/* Allocate space for a structure with a flexible array of typed 
> objects. */
> +#define xmalloc_flex_struct(type, field, nr) \
> +    ((type *)_xmalloc(offsetof(type, field[nr]), __alignof__(type)))
> +
> +/* Re-allocate space for a structure with a flexible array of typed 
> objects. */
> +#define xrealloc_flex_struct(ptr, type, field, nr) 
> ({    \
> +    typeof(*(ptr)) *ptr_ = 
> (ptr);    \
> +    /* Type checking: make sure that incoming pointer is of correct 
> type */  \
> +    (void)((ptr) == (type 
> *)0);  \
> +    (type *)_xrealloc(ptr_, offsetof(type, field[nr]), 
> __alignof__(type));   \
> +})
> +

What about

#define xrealloc_flex_struct(ptr, field, nr) \
(typeof(ptr))_xrealloc(ptr, offsetof(typeof(*(ptr)), field[nr]), \
   __alignof__(typeof(*(ptr

?

Jan

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2] Partially revert "x86/mm: Clean IOMMU flags from p2m-pt code"

2019-08-29 Thread Roger Pau Monné

On Thu, Aug 29, 2019 at 11:33:11AM +0200, Jan Beulich wrote:
> On 29.08.2019 10:49, Roger Pau Monne wrote:
> > @@ -640,9 +670,17 @@ p2m_pt_set_entry(struct p2m_domain *p2m, gfn_t gfn_, 
> > mfn_t mfn,
> >   && (gfn + (1UL << page_order) - 1 > p2m->max_mapped_pfn) )
> >  p2m->max_mapped_pfn = gfn + (1UL << page_order) - 1;
> >  
> > +if ( need_iommu_pt_sync(p2m->domain) &&
> > + (iommu_old_flags != iommu_pte_flags || old_mfn != mfn_x(mfn)) )
> > +rc = iommu_pte_flags ?
> > +iommu_legacy_map(d, _dfn(gfn), mfn, page_order,
> > + iommu_pte_flags) :
> > +iommu_legacy_unmap(d, _dfn(gfn), page_order);
> 
> Indentation of the if() body is one level too deep. With this
> corrected (easy enough to do while committing)
> Reviewed-by: Jan Beulich 

Indeed, my editor seems to have done the wrong thing. Thanks.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [RFC Patch] xen/pt: Emulate FLR capability

2019-08-29 Thread Jan Beulich

On 29.08.2019 11:02, Chao Gao wrote:
> Currently, for a HVM on Xen, no reset method is virtualized. So in a VM's
> perspective, assigned devices cannot be reset. But some devices rely on PCI
> reset to recover from hardware hangs. When being assigned to a VM, those
> devices cannot be reset and won't work any longer if a hardware hang occurs.
> We have to reboot VM to trigger PCI reset on host to recover the device.

Did you consider a hot-unplug, reset (by host), hot-plug cycle instead?

> +static int xen_pt_devctl_reg_write(XenPCIPassthroughState *s,
> +   XenPTReg *cfg_entry, uint16_t *val,
> +   uint16_t dev_value, uint16_t valid_mask)
> +{
> +if (s->real_device.is_resetable && (*val & PCI_EXP_DEVCTL_BCR_FLR)) {
> +xen_pt_reset(s);
> +}
> +return xen_pt_word_reg_write(s, cfg_entry, val, dev_value, valid_mask);

I think you also need to clear the bit before handing on the request,
such that reads will always observe it clear.

Jan

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v8 01/28] linkage: new macros for assembler symbols

2019-08-29 Thread Jiri Slaby

On 12. 08. 19, 19:06, Borislav Petkov wrote:
>> +  Again, every ``SYM_CODE_START*`` **shall** be coupled by ``SYM_CODE_END``.
> 
> Btw, this coupling: I haven't gone through the whole patchset but do we
> have automatic checking which makes sure a starting macro is coupled
> with the correct ending macro?

I do, but it's not part of this series. It's a pinch of link-time magic,
but it works reliably (see e.g. 1cbec37b3f9c). I will post it if/after
this gets merged. There were other approaches proposed in the past too
-- using objtool for that (not implemented).

>> +Overriding Macros
>> +~
>> +Architecture can also override any of the macros in their own
> 
> "Other architectures... "

Not only "other", x86 can override the types if need be too.

>> +``asm/linkage.h``, including macros specifying the type of a symbol
>> +(``SYM_T_FUNC``, ``SYM_T_OBJECT``, and ``SYM_T_NONE``).  As every macro
>> +described in this file is surrounded by ``#ifdef`` + ``#endif``, it is 
>> enough
>> +to define the macros differently in the aforementioned 
>> architecture-dependent
>> +header.

thanks,
-- 
js
suse labs

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2] Partially revert "x86/mm: Clean IOMMU flags from p2m-pt code"

2019-08-29 Thread George Dunlap

On 8/29/19 9:49 AM, Roger Pau Monne wrote:
> This partially reverts commit
> 854a49a7486a02edae5b3e53617bace526e9c1b1 by re-adding the logic that
> propagates changes to the domain physmap done by p2m_pt_set_entry into
> the iommu page tables. Without this logic changes to the guest physmap
> are not propagated to the iommu, leaving stale iommu entries that can
> leak data, or failing to add new entries.
> 
> Note that this commit doesn't re-introduce iommu flags to the cpu page
> table entries, since the logic to add/remove entries to the iommu page
> tables is based on the p2m type and the mfn.
> 
> Fixes: 854a49a7486a02 ('x86/mm: Clean IOMMU flags from p2m-pt code')
> Signed-off-by: Roger Pau Monné 

Sorry, the removal of the iommu update looked a bit suspicious; I should
have looked a bit closer.

Acked-by: George Dunlap 


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH V3 8/8] iommu/arm: Add Renesas IPMMU-VMSA support

2019-08-29 Thread Oleksandr


Hi, all



+
+static __init int ipmmu_init(struct dt_device_node *node, const void *data)
+{
+int ret;
+
+/*
+ * Even if the device can't be initialized, we don't want to give
+ * the IPMMU device to dom0.
+ */
+dt_device_set_used_by(node, DOMID_XEN);
+
+if ( !iommu_hap_pt_share )
+{
+printk_once(XENLOG_ERR "ipmmu: P2M table must always be shared between the 
CPU and the IPMMU\n");
+return -EINVAL;
+}
+
+if ( !ipmmu_stage2_supported() )
+{
+printk_once(XENLOG_ERR "ipmmu: P2M sharing is not supported in current SoC 
revision\n");
+return -ENODEV;
+}
+else
+{
+/*
+ * As 4-level translation table is not supported in IPMMU, we need
+ * to check IPA size used for P2M table beforehand to be sure it is
+ * 3-level and the IPMMU will be able to use it.
+ *
+ * TODO: First initialize the IOMMU and gather the requirements and
+ * then initialize the P2M. In the P2M code, take into the account
+ * the IOMMU requirements and restrict "pa_range" if necessary.
+ */
+if ( IPMMU_MAX_P2M_IPA_BITS < p2m_ipa_bits )
+{
+printk_once(XENLOG_ERR "ipmmu: P2M IPA size is not supported (P2M=%u 
IPMMU=%u)!\n",
+p2m_ipa_bits, IPMMU_MAX_P2M_IPA_BITS);
+return -ENODEV;
+}



There is a patch in ML which is intended to address this TODO:

https://lists.xenproject.org/archives/html/xen-devel/2019-08/msg02237.html


--
Regards,

Oleksandr Tyshchenko


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH] x86/boot: Annotate pagetables with STT_OBJECT

2019-08-29 Thread Jan Beulich

On 28.08.2019 20:24, Andrew Cooper wrote:
> On 27/08/2019 15:36, Jan Beulich wrote:
>> On 14.08.2019 12:44, Andrew Cooper wrote:
>>> Introduce a new ENDDATA() helper which sets type and size together.
>>
>> Except this isn't very natural: Setting the size late is quite
>> common, to avoid the need for an "end" label. But the type would
>> better be set together with the label definition, i.e. by a
>> GLOBAL() counterpart (e.g. GLOBAL_DATA()). However, if despite
>> this remark you think your approach is the way to go:
> 
> Well - this way is fewer moving parts.
> 
> GLOBAL and ENTRY are to do with visibility and alignment.  While ENTRY
> might not typically be used for data, both are commonly used for code.
> 
> We can either have a proliferation of {GLOBAL,ENTRY}{,_DATA,_FUNC,etc}
> and a single END, or we can have ENDDATA/ENDFUNC and need no changes to
> the existing GLOBAL/ENTRY infrastructure.

Yes, it's slightly more of a change the way I'd prefer, but as said,
it would (imo) yield a more natural result. Omission of END() would
then mean only the symbol's size to not be set, but all other
attributes to be correct nevertheless.

Jan

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v9 01/15] microcode/intel: extend microcode_update_match()

2019-08-29 Thread Jan Beulich

On 29.08.2019 09:15, Chao Gao wrote:
> On Wed, Aug 28, 2019 at 05:12:34PM +0200, Jan Beulich wrote:
>> On 19.08.2019 03:25, Chao Gao wrote:
>>> --- a/xen/arch/x86/microcode_intel.c
>>> +++ b/xen/arch/x86/microcode_intel.c
>>> @@ -134,14 +134,39 @@ static int collect_cpu_info(unsigned int cpu_num, 
>>> struct cpu_signature *csig)
>>>  return 0;
>>>  }
>>>  
>>> -static inline int microcode_update_match(
>>> -unsigned int cpu_num, const struct microcode_header_intel *mc_header,
>>> -int sig, int pf)
>>> +/* Check an update against the CPU signature and current update revision */
>>> +static enum microcode_match_result microcode_update_match(
>>> +const struct microcode_header_intel *mc_header, unsigned int cpu)
>>>  {
>>> -struct ucode_cpu_info *uci = _cpu(ucode_cpu_info, cpu_num);
>>> -
>>> -return (sigmatch(sig, uci->cpu_sig.sig, pf, uci->cpu_sig.pf) &&
>>> -(mc_header->rev > uci->cpu_sig.rev));
>>> +const struct extended_sigtable *ext_header;
>>> +const struct extended_signature *ext_sig;
>>> +unsigned int i;
>>> +struct ucode_cpu_info *uci = _cpu(ucode_cpu_info, cpu);
>>> +unsigned int sig = uci->cpu_sig.sig;
>>> +unsigned int pf = uci->cpu_sig.pf;
>>> +unsigned int rev = uci->cpu_sig.rev;
>>> +unsigned long data_size = get_datasize(mc_header);
>>> +const void *end = (const void *)mc_header + get_totalsize(mc_header);
>>> +
>>> +if ( sigmatch(sig, mc_header->sig, pf, mc_header->pf) )
>>> +return (mc_header->rev > rev) ? NEW_UCODE : OLD_UCODE;
>>
>> Didn't you lose a range check against "end" ahead of this if()?
>> get_totalsize() and get_datasize() aiui also would need to live
>> after a range check, just a sizeof() (i.e. MC_HEADER_SIZE) based
>> one. This would also affect the caller as it seems.
> 
> I think microcode_sanity_check() is for this purpose. We can do
> sanity check before the if(). Perhaps, we can just add an assertion
> that sanity check won't fail. Because whenever sanity check failed
> when pasing an ucode blob, we just drop the ucode; we won't pass an
> broken ucode to this function.

Well - that's the main question. The purpose of this patch, after all,
is (aiui) to allow calling the function in more cases. If all callers
are indeed supposed to check the basic properties, then yes, an ASSERT()
would be fine.

Jan

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v9 11/15] microcode: unify loading update during CPU resuming and AP wakeup

2019-08-29 Thread Chao Gao

On Fri, Aug 23, 2019 at 11:09:07AM +0200, Roger Pau Monné wrote:
>On Fri, Aug 23, 2019 at 12:44:34AM +0800, Chao Gao wrote:
>> On Thu, Aug 22, 2019 at 04:10:46PM +0200, Roger Pau Monné wrote:
>> >On Mon, Aug 19, 2019 at 09:25:24AM +0800, Chao Gao wrote:
>> >> Both are loading the cached patch. Since APs call the unified function,
>> >> microcode_update_one(), during wakeup, the 'start_update' parameter
>> >> which originally used to distinguish BSP and APs is redundant. So remove
>> >> this parameter.
>> >> 
>> >> Signed-off-by: Chao Gao 
>> >> ---
>> >> Note that here is a functional change: resuming a CPU would call
>> >> ->end_update() now while previously it wasn't. Not quite sure
>> >> whether it is correct.
>> >
>> >I guess that's required if it called start_update prior to calling
>> >end_update?
>> >
>> >> 
>> >> Changes in v9:
>> >>  - return -EOPNOTSUPP rather than 0 if microcode_ops is NULL in
>> >>microcode_update_one()
>> >>  - rebase and fix conflicts.
>> >> 
>> >> Changes in v8:
>> >>  - split out from the previous patch
>> >> ---
>> >>  xen/arch/x86/acpi/power.c   |  2 +-
>> >>  xen/arch/x86/microcode.c| 90 
>> >> ++---
>> >>  xen/arch/x86/smpboot.c  |  5 +--
>> >>  xen/include/asm-x86/processor.h |  4 +-
>> >>  4 files changed, 44 insertions(+), 57 deletions(-)
>> >> 
>> >> diff --git a/xen/arch/x86/acpi/power.c b/xen/arch/x86/acpi/power.c
>> >> index 4f21903..24798d5 100644
>> >> --- a/xen/arch/x86/acpi/power.c
>> >> +++ b/xen/arch/x86/acpi/power.c
>> >> @@ -253,7 +253,7 @@ static int enter_state(u32 state)
>> >>  
>> >>  console_end_sync();
>> >>  
>> >> -microcode_resume_cpu();
>> >> +microcode_update_one();
>> >>  
>> >>  if ( !recheck_cpu_features(0) )
>> >>  panic("Missing previously available feature(s)\n");
>> >> diff --git a/xen/arch/x86/microcode.c b/xen/arch/x86/microcode.c
>> >> index a2febc7..bdd9c9f 100644
>> >> --- a/xen/arch/x86/microcode.c
>> >> +++ b/xen/arch/x86/microcode.c
>> >> @@ -203,24 +203,6 @@ static struct microcode_patch *parse_blob(const char 
>> >> *buf, uint32_t len)
>> >>  return NULL;
>> >>  }
>> >>  
>> >> -int microcode_resume_cpu(void)
>> >> -{
>> >> -int err;
>> >> -struct cpu_signature *sig = _cpu(cpu_sig);
>> >> -
>> >> -if ( !microcode_ops )
>> >> -return 0;
>> >> -
>> >> -spin_lock(_mutex);
>> >> -
>> >> -err = microcode_ops->collect_cpu_info(sig);
>> >> -if ( likely(!err) )
>> >> -err = microcode_ops->apply_microcode(microcode_cache);
>> >> -spin_unlock(_mutex);
>> >> -
>> >> -return err;
>> >> -}
>> >> -
>> >>  void microcode_free_patch(struct microcode_patch *microcode_patch)
>> >>  {
>> >>  microcode_ops->free_patch(microcode_patch->mc);
>> >> @@ -384,11 +366,29 @@ static int __init microcode_init(void)
>> >>  }
>> >>  __initcall(microcode_init);
>> >>  
>> >> -int __init early_microcode_update_cpu(bool start_update)
>> >> +/* Load a cached update to current cpu */
>> >> +int microcode_update_one(void)
>> >> +{
>> >> +int rc;
>> >> +
>> >> +if ( !microcode_ops )
>> >> +return -EOPNOTSUPP;
>> >> +
>> >> +rc = microcode_update_cpu(NULL);
>> >> +
>> >> +if ( microcode_ops->end_update )
>> >> +microcode_ops->end_update();
>> >
>> >Don't you need to call start_update before calling
>> >microcode_update_cpu?
>> 
>> No. On AMD side, osvw_status records the hardware erratum in the system.
>> As we don't assume all CPUs have the same erratum, each cpu calls
>> end_update to update osvw_status after ucode loading.
>> start_update just resets osvw_status to 0. And it is called once prior
>> to ucode loading on any CPU so that osvw_status can be recomputed.
>
>Oh, I think I understand it. start_update must only be called once
>_before_ the sequence to update the microcode on all CPUs is
>performed, while end_update needs to be called on _each_ CPU after the
>update has been completed in order to account for any erratas.
>
>The name for those hooks should be improved, I guess renaming
>end_update to end_update_each or end_update_percpu would be clearer in
>order to make it clear that start_update is global, while end_update
>is percpu. Anyway, I don't want to delay this series for a naming nit.
>
>I'm still unsure where start_update is called for the resume from
>suspension case, I don't seem to see any call to start_update neither
>in enter_state or microcode_update_one, hence I think this is missing?

No. Actually, no call of start_update for resume case.

>
>I would expect you need to clean osvw_status also on resume from
>suspension, in case microcode loading fails? Or else you will be
>carrying a stale osvw_status.

Then we need to send IPI to all other CPUs to recompute osvw_state. But
I think it is not necessary. If ucode cache isn't changed during the
CPU's suspension period, there is not stale osvw bit (assuming OSVW on
the resuming CPU won't change). If the ucode cache is

Re: [Xen-devel] [PATCH v6 07/10] use is_iommu_enabled() where appropriate...

2019-08-29 Thread Paul Durrant

> -Original Message-
> From: Roger Pau Monne 
> Sent: 23 August 2019 11:56
> To: Paul Durrant 
> Cc: xen-devel@lists.xenproject.org; Stefano Stabellini 
> ; Julien Grall
> ; Volodymyr Babchuk ; Jan 
> Beulich
> ; Andrew Cooper ; Wei Liu 
> ; Jun Nakajima
> ; Kevin Tian ; George Dunlap 
> ;
> Suravee Suthikulpanit ; Brian Woods 
> ; Daniel De
> Graaf 
> Subject: Re: [PATCH v6 07/10] use is_iommu_enabled() where appropriate...
> 
> On Fri, Aug 16, 2019 at 06:19:58PM +0100, Paul Durrant wrote:
> > ...rather than testing the global iommu_enabled flag and ops pointer.
> >
> > Now that there is a per-domain flag indicating whether the domain is
> > permitted to use the IOMMU (which determines whether the ops pointer will
> > be set), many tests of the global iommu_enabled flag and ops pointer can
> > be translated into tests of the per-domain flag. Some of the other tests of
> > purely the global iommu_enabled flag can also be translated into tests of
> > the per-domain flag.
> >
> > NOTE: The comment in iommu_share_p2m_table() is also fixed; need_iommu()
> >   disappeared some time ago. Also, whilst the style of the 'if' in
> >   flask_iommu_resource_use_perm() is fixed, I have not translated any
> >   instances of u32 into uint32_t to keep consistency. IMO such a
> >   translation would be better done globally for the source module in
> >   a separate patch.
> 
> I've usually changed those instances as the line gets modified anyway,
> but I'm not going to ask everyone to do this if they don't feel like
> it.
> 
> The problem with doing wholesale changes of u32 -> uint32_t is that
> you introduce a lot of noise when trying to use git blame afterwards
> for example. Doing it when touching the line for legitimate changes
> avoids the noise.
> 
> >   The change in the initialization of the 'hd' variable in iommu_map()
> >   and iommu_unmap() is done to keep the PV shim build happy. Without
> >   this change it will fail to compile with errors of the form:
> >
> > iommu.c:361:32: error: unused variable ‘hd’ [-Werror=unused-variable]
> >  const struct domain_iommu *hd = dom_iommu(d);
> >  ^~
> >
> > Signed-off-by: Paul Durrant 
> 
> Reviewed-by: Roger Pau Monné 
> 

Thanks.

> I haven't looked however if there are further cases where
> iommu_enabled should be changed into is_iommu_enabled.
> 

Jan had clearly checked on a previous review iteration because he spotted a 
couple of places I had missed. I'm reasonably confident I've found them all now.

  Paul

> > @@ -556,8 +560,8 @@ int iommu_do_domctl(
> >  {
> >  int ret = -ENODEV;
> >
> > -if ( !iommu_enabled )
> > -return -ENOSYS;
> > +if ( !is_iommu_enabled(d) )
> > +return -EOPNOTSUPP;
> 
> I would use ENOENT here, but I don't have a strong opinion. The
> hypercall is supported, it's just that there's no iommu for this
> domain.
> 
> Thanks, Roger.
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH] x86: properly gate clearing of PKU feature

2019-08-29 Thread Jan Beulich

setup_clear_cpu_cap() is __init and hence may not be called post-boot.
Note that opt_pku nevertheless is not getting __initdata added - see
e.g. commit 43fa95ae6a ("mm: make opt_bootscrub non-init").

Signed-off-by: Jan Beulich 

--- a/xen/arch/x86/cpu/common.c
+++ b/xen/arch/x86/cpu/common.c
@@ -466,7 +466,7 @@ void identify_cpu(struct cpuinfo_x86 *c)
this_cpu->c_init(c);
 
 
-   if ( !opt_pku )
+   if (c == _cpu_data && !opt_pku)
setup_clear_cpu_cap(X86_FEATURE_PKU);
 
/*

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v9 10/15] microcode: split out apply_microcode() from cpu_request_microcode()

2019-08-29 Thread Jan Beulich

On 22.08.2019 15:59, Roger Pau Monné  wrote:
> Seeing how this works I'm not sure what's the best option here. As
> updating will be attempted on other CPUs, I'm not sure if it's OK to
> return an error if the update succeed on some CPUs but failed on
> others.

The overall result of a partially successful update should be an
error - mismatched ucode may, after all, be more of a problem
than outdated ucode.

Jan

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH] p2m/ept: pass correct level to atomic_write_ept_entry in ept_invalidate_emt

2019-08-29 Thread Roger Pau Monné

On Tue, Aug 27, 2019 at 05:23:33PM +0200, Jan Beulich wrote:
> On 23.08.2019 07:58,  Tian, Kevin  wrote:
> > > From: Roger Pau Monne [mailto:roger@citrix.com]
> > > Sent: Tuesday, August 20, 2019 11:38 PM
> > > 
> > > The level passed to ept_invalidate_emt corresponds to the EPT entry
> > > passed as the mfn parameter, which is a pointer to an EPT page table,
> > > hence the entries in that page table will have one level less than the
> > > parent.
> > > 
> > > Fix the call to atomic_write_ept_entry to pass the correct level, ie:
> > > one level less than the parent.
> > > 
> > > Fixes: 50fe6e73705 ('pvh dom0: add and remove foreign pages')
> > > Signed-off-by: Roger Pau Monné 
> > > ---
> > > Cc: Jun Nakajima 
> > > Cc: Kevin Tian 
> > > Cc: George Dunlap 
> > > Cc: Jan Beulich 
> > > Cc: Andrew Cooper 
> > > Cc: Wei Liu 
> > > Cc: Sergey Dyasli 
> > > Cc: Paul Durrant 
> > > ---
> > >   xen/arch/x86/mm/p2m-ept.c | 2 +-
> > >   1 file changed, 1 insertion(+), 1 deletion(-)
> > > 
> > > diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
> > > index 6b8468c793..23ae6e9da3 100644
> > > --- a/xen/arch/x86/mm/p2m-ept.c
> > > +++ b/xen/arch/x86/mm/p2m-ept.c
> > > @@ -367,7 +367,7 @@ static bool_t ept_invalidate_emt(struct p2m_domain
> > > *p2m, mfn_t mfn,
> > >   e.emt = MTRR_NUM_TYPES;
> > >   if ( recalc )
> > >   e.recalc = 1;
> > > -rc = atomic_write_ept_entry(p2m, [i], e, level);
> > > +rc = atomic_write_ept_entry(p2m, [i], e, level - 1);
> > >   ASSERT(rc == 0);
> > >   changed = 1;
> > >   }
> > 
> > Reviewed-by: Kevin Tian .
> > 
> > One small comment about readability. What about renaming 'level'
> > to 'parent_level' in this function?
> 
> And also taking the opportunity and making it unsigned int?

I've been thinking about this, and I'm not sure renaming level to
parent_level is correct, level actually matches the level of the mfn
also passed as a parameter, and hence it's correct. It's the function
itself that actually iterates over the page table entries, and hence
descends one level into the page tables.

ie: I could add something like ASSERT(level) to make sure no level 0
entries are passed to the function, but renaming doesn't seem correct
to me.

Thanks, Roger.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 3/6] remove late (on-demand) construction of IOMMU page tables

2019-08-29 Thread Jan Beulich

On 29.08.2019 12:20, Paul Durrant wrote:
>> -Original Message-
>> From: Jan Beulich 
>> Sent: 29 August 2019 10:52
>> To: Paul Durrant 
>> Cc: 'JulienGrall' ; 'Alexandru Isaila' 
>> ; 'Petre
>> Pircalabu' ; 'Razvan Cojocaru' 
>> ; Andrew Cooper
>> ; George Dunlap ; Ian 
>> Jackson
>> ; Roger Pau Monne ; 
>> 'VolodymyrBabchuk'
>> ; 'Stefano Stabellini' ; 
>> 'xen-
>> de...@lists.xenproject.org' ; 'Konrad 
>> Rzeszutek Wilk'
>> ; 'Tamas K Lengyel' ; Tim 
>> (Xen.org) ; 'Wei
>> Liu' 
>> Subject: Re: [Xen-devel] [PATCH 3/6] remove late (on-demand) construction of 
>> IOMMU page tables
>>
>> On 29.08.2019 11:33, Paul Durrant wrote:
>>> TBH I've seen sufficient numbers of domain create failures when using
>>> auto-ballooning that I stopped using it some time ago (besides the fact
>>> that it's slow). If you're happy with the simplistic
>>> double-the-page-table-reservation calculation then I can add that but
>>> IMO it would be better to add another patch to just remove auto-ballooning.
>>
>> I'd not be overly happy, but I could live with this. But this needs
>> consent by others, not the least the tool stack maintainers.
>>
> 
> Ok. I'll deal with that later. Looking again though, I'm not sure what
> you mean by 'the amount reserved for page tables'? Do you mean
> 'shadow_memkb' or something else?

This one, I think, yes, or the value it gets defaulted to when not set
in the guest config file.

Jan

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v9 11/15] microcode: unify loading update during CPU resuming and AP wakeup

2019-08-29 Thread Roger Pau Monné

On Thu, Aug 29, 2019 at 03:37:47PM +0800, Chao Gao wrote:
> On Fri, Aug 23, 2019 at 11:09:07AM +0200, Roger Pau Monné wrote:
> >On Fri, Aug 23, 2019 at 12:44:34AM +0800, Chao Gao wrote:
> >> On Thu, Aug 22, 2019 at 04:10:46PM +0200, Roger Pau Monné wrote:
> >> >On Mon, Aug 19, 2019 at 09:25:24AM +0800, Chao Gao wrote:
> >> >> Both are loading the cached patch. Since APs call the unified function,
> >> >> microcode_update_one(), during wakeup, the 'start_update' parameter
> >> >> which originally used to distinguish BSP and APs is redundant. So remove
> >> >> this parameter.
> >> >> 
> >> >> Signed-off-by: Chao Gao 
> >> >> ---
> >> >> Note that here is a functional change: resuming a CPU would call
> >> >> ->end_update() now while previously it wasn't. Not quite sure
> >> >> whether it is correct.
> >> >
> >> >I guess that's required if it called start_update prior to calling
> >> >end_update?
> >> >
> >> >> 
> >> >> Changes in v9:
> >> >>  - return -EOPNOTSUPP rather than 0 if microcode_ops is NULL in
> >> >>microcode_update_one()
> >> >>  - rebase and fix conflicts.
> >> >> 
> >> >> Changes in v8:
> >> >>  - split out from the previous patch
> >> >> ---
> >> >>  xen/arch/x86/acpi/power.c   |  2 +-
> >> >>  xen/arch/x86/microcode.c| 90 
> >> >> ++---
> >> >>  xen/arch/x86/smpboot.c  |  5 +--
> >> >>  xen/include/asm-x86/processor.h |  4 +-
> >> >>  4 files changed, 44 insertions(+), 57 deletions(-)
> >> >> 
> >> >> diff --git a/xen/arch/x86/acpi/power.c b/xen/arch/x86/acpi/power.c
> >> >> index 4f21903..24798d5 100644
> >> >> --- a/xen/arch/x86/acpi/power.c
> >> >> +++ b/xen/arch/x86/acpi/power.c
> >> >> @@ -253,7 +253,7 @@ static int enter_state(u32 state)
> >> >>  
> >> >>  console_end_sync();
> >> >>  
> >> >> -microcode_resume_cpu();
> >> >> +microcode_update_one();
> >> >>  
> >> >>  if ( !recheck_cpu_features(0) )
> >> >>  panic("Missing previously available feature(s)\n");
> >> >> diff --git a/xen/arch/x86/microcode.c b/xen/arch/x86/microcode.c
> >> >> index a2febc7..bdd9c9f 100644
> >> >> --- a/xen/arch/x86/microcode.c
> >> >> +++ b/xen/arch/x86/microcode.c
> >> >> @@ -203,24 +203,6 @@ static struct microcode_patch *parse_blob(const 
> >> >> char *buf, uint32_t len)
> >> >>  return NULL;
> >> >>  }
> >> >>  
> >> >> -int microcode_resume_cpu(void)
> >> >> -{
> >> >> -int err;
> >> >> -struct cpu_signature *sig = _cpu(cpu_sig);
> >> >> -
> >> >> -if ( !microcode_ops )
> >> >> -return 0;
> >> >> -
> >> >> -spin_lock(_mutex);
> >> >> -
> >> >> -err = microcode_ops->collect_cpu_info(sig);
> >> >> -if ( likely(!err) )
> >> >> -err = microcode_ops->apply_microcode(microcode_cache);
> >> >> -spin_unlock(_mutex);
> >> >> -
> >> >> -return err;
> >> >> -}
> >> >> -
> >> >>  void microcode_free_patch(struct microcode_patch *microcode_patch)
> >> >>  {
> >> >>  microcode_ops->free_patch(microcode_patch->mc);
> >> >> @@ -384,11 +366,29 @@ static int __init microcode_init(void)
> >> >>  }
> >> >>  __initcall(microcode_init);
> >> >>  
> >> >> -int __init early_microcode_update_cpu(bool start_update)
> >> >> +/* Load a cached update to current cpu */
> >> >> +int microcode_update_one(void)
> >> >> +{
> >> >> +int rc;
> >> >> +
> >> >> +if ( !microcode_ops )
> >> >> +return -EOPNOTSUPP;
> >> >> +
> >> >> +rc = microcode_update_cpu(NULL);
> >> >> +
> >> >> +if ( microcode_ops->end_update )
> >> >> +microcode_ops->end_update();
> >> >
> >> >Don't you need to call start_update before calling
> >> >microcode_update_cpu?
> >> 
> >> No. On AMD side, osvw_status records the hardware erratum in the system.
> >> As we don't assume all CPUs have the same erratum, each cpu calls
> >> end_update to update osvw_status after ucode loading.
> >> start_update just resets osvw_status to 0. And it is called once prior
> >> to ucode loading on any CPU so that osvw_status can be recomputed.
> >
> >Oh, I think I understand it. start_update must only be called once
> >_before_ the sequence to update the microcode on all CPUs is
> >performed, while end_update needs to be called on _each_ CPU after the
> >update has been completed in order to account for any erratas.
> >
> >The name for those hooks should be improved, I guess renaming
> >end_update to end_update_each or end_update_percpu would be clearer in
> >order to make it clear that start_update is global, while end_update
> >is percpu. Anyway, I don't want to delay this series for a naming nit.
> >
> >I'm still unsure where start_update is called for the resume from
> >suspension case, I don't seem to see any call to start_update neither
> >in enter_state or microcode_update_one, hence I think this is missing?
> 
> No. Actually, no call of start_update for resume case.
> 
> >
> >I would expect you need to clean osvw_status also on resume from
> >suspension, in case microcode loading fails? Or else you will be

[Xen-devel] [PATCH v2] Partially revert "x86/mm: Clean IOMMU flags from p2m-pt code"

2019-08-29 Thread Roger Pau Monne

This partially reverts commit
854a49a7486a02edae5b3e53617bace526e9c1b1 by re-adding the logic that
propagates changes to the domain physmap done by p2m_pt_set_entry into
the iommu page tables. Without this logic changes to the guest physmap
are not propagated to the iommu, leaving stale iommu entries that can
leak data, or failing to add new entries.

Note that this commit doesn't re-introduce iommu flags to the cpu page
table entries, since the logic to add/remove entries to the iommu page
tables is based on the p2m type and the mfn.

Fixes: 854a49a7486a02 ('x86/mm: Clean IOMMU flags from p2m-pt code')
Signed-off-by: Roger Pau Monné 
---
Cc: Alexandru Stefan ISAILA 
---
Changes since v1:
 - Remove the share-pt branch, there's no sharing on AMD hardware.
---
 xen/arch/x86/mm/p2m-pt.c | 42 ++--
 1 file changed, 40 insertions(+), 2 deletions(-)

diff --git a/xen/arch/x86/mm/p2m-pt.c b/xen/arch/x86/mm/p2m-pt.c
index 3a0a500d66..8483dc8142 100644
--- a/xen/arch/x86/mm/p2m-pt.c
+++ b/xen/arch/x86/mm/p2m-pt.c
@@ -508,7 +508,18 @@ p2m_pt_set_entry(struct p2m_domain *p2m, gfn_t gfn_, mfn_t 
mfn,
 l2_pgentry_t l2e_content;
 l3_pgentry_t l3e_content;
 int rc;
-unsigned int flags;
+unsigned int iommu_pte_flags = p2m_get_iommu_flags(p2mt, mfn);
+/*
+ * old_mfn and iommu_old_flags control possible flush/update needs on the
+ * IOMMU: We need to flush when MFN or flags (i.e. permissions) change.
+ * iommu_old_flags being initialized to zero covers the case of the entry
+ * getting replaced being a non-present (leaf or intermediate) one. For
+ * present leaf entries the real value will get calculated below, while
+ * for present intermediate entries ~0 (guaranteed != iommu_pte_flags)
+ * will be used (to cover all cases of what the leaf entries underneath
+ * the intermediate one might be).
+ */
+unsigned int flags, iommu_old_flags = 0;
 unsigned long old_mfn = mfn_x(INVALID_MFN);
 
 if ( !sve )
@@ -556,9 +567,17 @@ p2m_pt_set_entry(struct p2m_domain *p2m, gfn_t gfn_, mfn_t 
mfn,
 if ( flags & _PAGE_PRESENT )
 {
 if ( flags & _PAGE_PSE )
+{
 old_mfn = l1e_get_pfn(*p2m_entry);
+iommu_old_flags =
+p2m_get_iommu_flags(p2m_flags_to_type(flags),
+_mfn(old_mfn));
+}
 else
+{
+iommu_old_flags = ~0;
 intermediate_entry = *p2m_entry;
+}
 }
 
 check_entry(mfn, p2mt, p2m_flags_to_type(flags), page_order);
@@ -594,6 +613,9 @@ p2m_pt_set_entry(struct p2m_domain *p2m, gfn_t gfn_, mfn_t 
mfn,
0, L1_PAGETABLE_ENTRIES);
 ASSERT(p2m_entry);
 old_mfn = l1e_get_pfn(*p2m_entry);
+iommu_old_flags =
+p2m_get_iommu_flags(p2m_flags_to_type(l1e_get_flags(*p2m_entry)),
+_mfn(old_mfn));
 
 if ( mfn_valid(mfn) || p2m_allows_invalid_mfn(p2mt) )
 entry_content = p2m_l1e_from_pfn(mfn_x(mfn),
@@ -617,9 +639,17 @@ p2m_pt_set_entry(struct p2m_domain *p2m, gfn_t gfn_, mfn_t 
mfn,
 if ( flags & _PAGE_PRESENT )
 {
 if ( flags & _PAGE_PSE )
+{
 old_mfn = l1e_get_pfn(*p2m_entry);
+iommu_old_flags =
+p2m_get_iommu_flags(p2m_flags_to_type(flags),
+_mfn(old_mfn));
+}
 else
+{
+iommu_old_flags = ~0;
 intermediate_entry = *p2m_entry;
+}
 }
 
 check_entry(mfn, p2mt, p2m_flags_to_type(flags), page_order);
@@ -640,9 +670,17 @@ p2m_pt_set_entry(struct p2m_domain *p2m, gfn_t gfn_, mfn_t 
mfn,
  && (gfn + (1UL << page_order) - 1 > p2m->max_mapped_pfn) )
 p2m->max_mapped_pfn = gfn + (1UL << page_order) - 1;
 
+if ( need_iommu_pt_sync(p2m->domain) &&
+ (iommu_old_flags != iommu_pte_flags || old_mfn != mfn_x(mfn)) )
+rc = iommu_pte_flags ?
+iommu_legacy_map(d, _dfn(gfn), mfn, page_order,
+ iommu_pte_flags) :
+iommu_legacy_unmap(d, _dfn(gfn), page_order);
+
 /*
  * Free old intermediate tables if necessary.  This has to be the
- * last thing we do so as to avoid a potential use-after-free.
+ * last thing we do, after removal from the IOMMU tables, so as to
+ * avoid a potential use-after-free.
  */
 if ( l1e_get_flags(intermediate_entry) & _PAGE_PRESENT )
 p2m_free_entry(p2m, _entry, page_order);
-- 
2.22.0


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [OSSTEST PATCH 2/3] host properties: Firmware: Move default to selecthost

2019-08-29 Thread Ian Jackson

Drop the explicit default from all the call sites.  This centralises
the default.  This is going to be the new scheme for host properties
in general.

(Two of the call sites had a different default, "", which in their
context was semantically equivalent to "bios".)

Signed-off-by: Ian Jackson 
---
 Osstest/Debian.pm  |  4 ++--
 Osstest/TestSupport.pm | 13 +
 ts-xen-install |  2 +-
 3 files changed, 12 insertions(+), 7 deletions(-)

diff --git a/Osstest/Debian.pm b/Osstest/Debian.pm
index 911d8905..79aa2d24 100644
--- a/Osstest/Debian.pm
+++ b/Osstest/Debian.pm
@@ -443,7 +443,7 @@ sub setupboot_grub2 () {
 
 # Grub2 on jessie/stretch ARM* doesn't do multiboot, so we must chainload.
 my $need_uefi_chainload =
-get_host_property($ho, "firmware", "") eq "uefi" &&
+get_host_property($ho, "firmware") eq "uefi" &&
 $ho->{Suite} =~ m/jessie|stretch/ && $ho->{Arch} =~ m/^arm/;
 
 my $parsemenu= sub {
@@ -1498,7 +1498,7 @@ END
 
 preseed_microcode($ho,$sfx);
 
-if (get_host_property($ho, "firmware",'') eq "uefi") {
+if (get_host_property($ho, "firmware") eq "uefi") {
die unless $ho->{Suite} =~ m/jessie|stretch/;
# Prevent grub-install from making a new Debian boot entry, so
# we always reboot from the network. Debian bug #789798 proposes a
diff --git a/Osstest/TestSupport.pm b/Osstest/TestSupport.pm
index e554af38..b629fb7d 100644
--- a/Osstest/TestSupport.pm
+++ b/Osstest/TestSupport.pm
@@ -1218,7 +1218,10 @@ sub selecthost ($;$) {
 
 #- calculation of the host's properties -
 
-$ho->{Properties} = { };
+# Firstly, hardcoded defaults
+$ho->{Properties} = {
+Firmware => 'bios',
+};
 my $setprop = sub {
my ($pn,$val) = @_;
$ho->{Properties}{$pn} = $val;
@@ -1363,6 +1366,8 @@ sub propname_check ($) {
 
 # It is fine to call this on a guest object too, in which case it will
 # always return $defval.
+# Ideally all uses of $defval would be replaced by defaults in the
+# initial array in selecthost.
 sub get_host_property ($$;$) {
 my ($ho, $prop, $defval) = @_;
 return $defval unless $ho->{Properties};
@@ -2811,7 +2816,7 @@ sub host_netboot_file ($;$) {
 # in array context, returns (dir, pathtail)
 #  where dir does not depend on $templatekeytail
 my %v = %r;
-my $firmware = get_host_property($ho, "firmware", "bios");
+my $firmware = get_host_property($ho, "firmware");
 my $templatekeybase = $firmware eq 'uefi' ? 'NetGrub' : 'Pxe';
 $templatekeytail //= 'Templates';
 my $templatekey = $templatekeybase.$templatekeytail;
@@ -2937,7 +2942,7 @@ END
 
 sub setup_netboot_local ($) {
 my ($ho) = @_;
-my $firmware = get_host_property($ho, "firmware", "bios");
+my $firmware = get_host_property($ho, "firmware");
 $firmware =~ s/-/_/g;
 no strict qw(refs);
 return &{"setup_netboot_local_${firmware}"}($ho);
@@ -2945,7 +2950,7 @@ sub setup_netboot_local ($) {
 
 sub setup_netboot_di ($;%) {
 my ($ho,$kern,$initrd,$dicmd,$hocmd,%xopts) = @_;
-my $firmware = get_host_property($ho, "firmware", "bios");
+my $firmware = get_host_property($ho, "firmware");
 $firmware =~ s/-/_/g;
 no strict qw(refs);
 return &{"setup_netboot_di_${firmware}"}($ho,$kern,$initrd,
diff --git a/ts-xen-install b/ts-xen-install
index 2d3c644d..154f78c7 100755
--- a/ts-xen-install
+++ b/ts-xen-install
@@ -114,7 +114,7 @@ sub extradebs () {
 some_extradebs([ 'DebianExtraPackages', $suite ]);
 
 # $c{ DebianExtraPackages___ }
-my $firmware = get_host_property($ho, "firmware", "bios");
+my $firmware = get_host_property($ho, "firmware");
 some_extradebs([ 'DebianExtraPackages', $firmware, $ho->{Arch}, $suite ]);
 }
 
-- 
2.11.0


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [OSSTEST PATCH 1/3] mg-hosts mknetbootdir: Improve sub-option parser

2019-08-29 Thread Ian Jackson

No semantic change.

Signed-off-by: Ian Jackson 
---
 mg-hosts | 9 -
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/mg-hosts b/mg-hosts
index 58b4acc3..d68f7b4e 100755
--- a/mg-hosts
+++ b/mg-hosts
@@ -121,7 +121,14 @@ sub l ($) { return split /,/, $_[0]; }
 
 sub cmd_mknetbootdir () {
 my $dryrun = 0;
-if (@ARGV && $ARGV[0] eq '-n') { shift @ARGV; $dryrun= 1; }
+while (@ARGV && $ARGV[0] =~ m/^-/) {
+   $_ = shift @ARGV;
+   last if $_ =~ m/^--?$/;
+   while (m/^-./) {
+   if (s/^-n/-/) { $dryrun= 1; }
+   else { die "unknown mknetbootdir option $_"; }
+   }
+}
 die unless @ARGV>=1;
 my $sudo = $ENV{'OSSTEST_SUDO'} // 'sudo';
 foreach my $hn (@ARGV) {
-- 
2.11.0


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [OSSTEST PATCH 3/3] mg-hosts mknetbootdir: Introduce and require -F

2019-08-29 Thread Ian Jackson

If one runs
  ./mg-hosts mknetbootdir HOST
before having sorted out all the host configuration, it uses the
default configuration value for the host's firmware kind, which is
"bios".  If the configuration is then changed, things don't work.
This is confusing.

So ask the user to specify one or more -F, or -Fany.

CC: Dominic Brekau 
Signed-off-by: Ian Jackson 
---
 mg-hosts | 12 +++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/mg-hosts b/mg-hosts
index d68f7b4e..14f816ae 100755
--- a/mg-hosts
+++ b/mg-hosts
@@ -19,7 +19,7 @@
 
 # Usages:
 #
-#  ./mg-hosts mknetbootdir HOST...
+#  ./mg-hosts mknetbootdir -Fany | -F... HOST...
 #   Create directories for netboot as expected by the rest
 #   of osstest.  Will use "sudo". The HOST(s) must be
 #   allocated (via mg-allocate HOST).
@@ -121,19 +121,29 @@ sub l ($) { return split /,/, $_[0]; }
 
 sub cmd_mknetbootdir () {
 my $dryrun = 0;
+my %expect_firmware;
 while (@ARGV && $ARGV[0] =~ m/^-/) {
$_ = shift @ARGV;
last if $_ =~ m/^--?$/;
while (m/^-./) {
if (s/^-n/-/) { $dryrun= 1; }
+   elsif (s/^-F(.*)//) { $expect_firmware{$1} = 1 }
else { die "unknown mknetbootdir option $_"; }
}
 }
+die "need at least one -F or -Fany option"
+   unless %expect_firmware;
 die unless @ARGV>=1;
 my $sudo = $ENV{'OSSTEST_SUDO'} // 'sudo';
 foreach my $hn (@ARGV) {
 my $ho= selecthost("host=$hn");
my ($dir, $file) = host_netboot_file($ho);
+   if (!$expect_firmware{any}) {
+   my $got_firmware = get_host_property($ho, "firmware");
+   die
+ "host $hn configuration says firmware \`$got_firmware', not as expected\n"
+   unless $expect_firmware{$got_firmware};
+   }
 die unless defined $dir;
my ($rdir, $realfile) = host_netboot_file($ho, 'TemplatesReal');
$realfile //= $file;
-- 
2.11.0


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v9 11/15] microcode: unify loading update during CPU resuming and AP wakeup

2019-08-29 Thread Jan Beulich

On 29.08.2019 09:37, Chao Gao wrote:
> On Fri, Aug 23, 2019 at 11:09:07AM +0200, Roger Pau Monné wrote:
>> On Fri, Aug 23, 2019 at 12:44:34AM +0800, Chao Gao wrote:
>>> On Thu, Aug 22, 2019 at 04:10:46PM +0200, Roger Pau Monné wrote:
 On Mon, Aug 19, 2019 at 09:25:24AM +0800, Chao Gao wrote:
> Both are loading the cached patch. Since APs call the unified function,
> microcode_update_one(), during wakeup, the 'start_update' parameter
> which originally used to distinguish BSP and APs is redundant. So remove
> this parameter.
>
> Signed-off-by: Chao Gao 
> ---
> Note that here is a functional change: resuming a CPU would call
> ->end_update() now while previously it wasn't. Not quite sure
> whether it is correct.

 I guess that's required if it called start_update prior to calling
 end_update?

>
> Changes in v9:
>  - return -EOPNOTSUPP rather than 0 if microcode_ops is NULL in
>microcode_update_one()
>  - rebase and fix conflicts.
>
> Changes in v8:
>  - split out from the previous patch
> ---
>  xen/arch/x86/acpi/power.c   |  2 +-
>  xen/arch/x86/microcode.c| 90 
> ++---
>  xen/arch/x86/smpboot.c  |  5 +--
>  xen/include/asm-x86/processor.h |  4 +-
>  4 files changed, 44 insertions(+), 57 deletions(-)
>
> diff --git a/xen/arch/x86/acpi/power.c b/xen/arch/x86/acpi/power.c
> index 4f21903..24798d5 100644
> --- a/xen/arch/x86/acpi/power.c
> +++ b/xen/arch/x86/acpi/power.c
> @@ -253,7 +253,7 @@ static int enter_state(u32 state)
>  
>  console_end_sync();
>  
> -microcode_resume_cpu();
> +microcode_update_one();
>  
>  if ( !recheck_cpu_features(0) )
>  panic("Missing previously available feature(s)\n");
> diff --git a/xen/arch/x86/microcode.c b/xen/arch/x86/microcode.c
> index a2febc7..bdd9c9f 100644
> --- a/xen/arch/x86/microcode.c
> +++ b/xen/arch/x86/microcode.c
> @@ -203,24 +203,6 @@ static struct microcode_patch *parse_blob(const char 
> *buf, uint32_t len)
>  return NULL;
>  }
>  
> -int microcode_resume_cpu(void)
> -{
> -int err;
> -struct cpu_signature *sig = _cpu(cpu_sig);
> -
> -if ( !microcode_ops )
> -return 0;
> -
> -spin_lock(_mutex);
> -
> -err = microcode_ops->collect_cpu_info(sig);
> -if ( likely(!err) )
> -err = microcode_ops->apply_microcode(microcode_cache);
> -spin_unlock(_mutex);
> -
> -return err;
> -}
> -
>  void microcode_free_patch(struct microcode_patch *microcode_patch)
>  {
>  microcode_ops->free_patch(microcode_patch->mc);
> @@ -384,11 +366,29 @@ static int __init microcode_init(void)
>  }
>  __initcall(microcode_init);
>  
> -int __init early_microcode_update_cpu(bool start_update)
> +/* Load a cached update to current cpu */
> +int microcode_update_one(void)
> +{
> +int rc;
> +
> +if ( !microcode_ops )
> +return -EOPNOTSUPP;
> +
> +rc = microcode_update_cpu(NULL);
> +
> +if ( microcode_ops->end_update )
> +microcode_ops->end_update();

 Don't you need to call start_update before calling
 microcode_update_cpu?
>>>
>>> No. On AMD side, osvw_status records the hardware erratum in the system.
>>> As we don't assume all CPUs have the same erratum, each cpu calls
>>> end_update to update osvw_status after ucode loading.
>>> start_update just resets osvw_status to 0. And it is called once prior
>>> to ucode loading on any CPU so that osvw_status can be recomputed.
>>
>> Oh, I think I understand it. start_update must only be called once
>> _before_ the sequence to update the microcode on all CPUs is
>> performed, while end_update needs to be called on _each_ CPU after the
>> update has been completed in order to account for any erratas.
>>
>> The name for those hooks should be improved, I guess renaming
>> end_update to end_update_each or end_update_percpu would be clearer in
>> order to make it clear that start_update is global, while end_update
>> is percpu. Anyway, I don't want to delay this series for a naming nit.
>>
>> I'm still unsure where start_update is called for the resume from
>> suspension case, I don't seem to see any call to start_update neither
>> in enter_state or microcode_update_one, hence I think this is missing?
> 
> No. Actually, no call of start_update for resume case.
> 
>>
>> I would expect you need to clean osvw_status also on resume from
>> suspension, in case microcode loading fails? Or else you will be
>> carrying a stale osvw_status.
> 
> Then we need to send IPI to all other CPUs to recompute osvw_state. But
> I think it is not necessary. If ucode cache isn't changed during the
> CPU's suspension period, there is not

[Xen-devel] [RFC Patch] xen/pt: Emulate FLR capability

2019-08-29 Thread Chao Gao

Currently, for a HVM on Xen, no reset method is virtualized. So in a VM's
perspective, assigned devices cannot be reset. But some devices rely on PCI
reset to recover from hardware hangs. When being assigned to a VM, those
devices cannot be reset and won't work any longer if a hardware hang occurs.
We have to reboot VM to trigger PCI reset on host to recover the device.

This patch exposes FLR capability to VMs if the assigned device can be reset on
host. When VM initiates an FLR to a device, qemu cleans up the device state,
(including disabling of intx and/or MSI and unmapping BARs from guest, deleting
emulated registers), then initiate PCI reset through 'reset' knob under the
device's sysfs, finally initialize the device again.

Signed-off-by: Chao Gao 
---
Do we need to introduce an attribute, like "permissive" to explicitly
enable FLR capability emulation? During PCI reset, interrupts and BARs are
unmapped from the guest. It seems that guest cannot interact with the device
directly except access to device's configuration space which is emulated by
qemu. If proper method can be used to prevent qemu accessing the physical
device there is no new security hole caused by the FLR emulation.

VM's FLR may be backed by any reset function on host to the physical device,
for example: FLR, D3softreset, secondary bus reset. Not sure it is fine to mix
them. Given Linux kernel just uses an unified API to reset device and caller
cannot choose a specific one, it might be OK.
---
 hw/xen/xen-host-pci-device.c | 30 ++
 hw/xen/xen-host-pci-device.h |  3 +++
 hw/xen/xen_pt.c  |  9 +
 hw/xen/xen_pt.h  |  1 +
 hw/xen/xen_pt_config_init.c  | 30 +++---
 5 files changed, 70 insertions(+), 3 deletions(-)

diff --git a/hw/xen/xen-host-pci-device.c b/hw/xen/xen-host-pci-device.c
index 1b44dcafaf..d549656f42 100644
--- a/hw/xen/xen-host-pci-device.c
+++ b/hw/xen/xen-host-pci-device.c
@@ -198,6 +198,35 @@ static bool xen_host_pci_dev_is_virtfn(XenHostPCIDevice *d)
 return !stat(path, );
 }
 
+static bool xen_host_pci_resetable(XenHostPCIDevice *d)
+{
+char path[PATH_MAX];
+
+xen_host_pci_sysfs_path(d, "reset", path, sizeof(path));
+
+return !access(path, W_OK);
+}
+
+void xen_host_pci_reset(XenHostPCIDevice *d)
+{
+char path[PATH_MAX];
+int fd;
+
+xen_host_pci_sysfs_path(d, "reset", path, sizeof(path));
+
+fd = open(path, O_WRONLY);
+if (fd == -1) {
+XEN_HOST_PCI_LOG("Xen host pci reset: open error\n");
+return;
+}
+
+if (write(fd, "1", 1) != 1) {
+XEN_HOST_PCI_LOG("Xen host pci reset: write error\n");
+}
+
+return;
+}
+
 static void xen_host_pci_config_open(XenHostPCIDevice *d, Error **errp)
 {
 char path[PATH_MAX];
@@ -377,6 +406,7 @@ void xen_host_pci_device_get(XenHostPCIDevice *d, uint16_t 
domain,
 d->class_code = v;
 
 d->is_virtfn = xen_host_pci_dev_is_virtfn(d);
+d->is_resetable = xen_host_pci_resetable(d);
 
 return;
 
diff --git a/hw/xen/xen-host-pci-device.h b/hw/xen/xen-host-pci-device.h
index 4d8d34ecb0..cacf9b3df8 100644
--- a/hw/xen/xen-host-pci-device.h
+++ b/hw/xen/xen-host-pci-device.h
@@ -32,6 +32,7 @@ typedef struct XenHostPCIDevice {
 XenHostPCIIORegion rom;
 
 bool is_virtfn;
+bool is_resetable;
 
 int config_fd;
 } XenHostPCIDevice;
@@ -55,4 +56,6 @@ int xen_host_pci_set_block(XenHostPCIDevice *d, int pos, 
uint8_t *buf,
 
 int xen_host_pci_find_ext_cap_offset(XenHostPCIDevice *s, uint32_t cap);
 
+void xen_host_pci_reset(XenHostPCIDevice *d);
+
 #endif /* XEN_HOST_PCI_DEVICE_H */
diff --git a/hw/xen/xen_pt.c b/hw/xen/xen_pt.c
index 8fbaf2eae9..d750367c0a 100644
--- a/hw/xen/xen_pt.c
+++ b/hw/xen/xen_pt.c
@@ -938,6 +938,15 @@ static void xen_pt_unregister_device(PCIDevice *d)
 xen_pt_destroy(d);
 }
 
+void xen_pt_reset(XenPCIPassthroughState *s)
+{
+PCIDevice *d = PCI_DEVICE(s);
+
+xen_pt_unregister_device(d);
+xen_host_pci_reset(>real_device);
+xen_pt_realize(d, NULL);
+}
+
 static Property xen_pci_passthrough_properties[] = {
 DEFINE_PROP_PCI_HOST_DEVADDR("hostaddr", XenPCIPassthroughState, hostaddr),
 DEFINE_PROP_BOOL("permissive", XenPCIPassthroughState, permissive, false),
diff --git a/hw/xen/xen_pt.h b/hw/xen/xen_pt.h
index 9167bbaf6d..ed05bc0d39 100644
--- a/hw/xen/xen_pt.h
+++ b/hw/xen/xen_pt.h
@@ -332,4 +332,5 @@ int xen_pt_register_vga_regions(XenHostPCIDevice *dev);
 int xen_pt_unregister_vga_regions(XenHostPCIDevice *dev);
 void xen_pt_setup_vga(XenPCIPassthroughState *s, XenHostPCIDevice *dev,
  Error **errp);
+void xen_pt_reset(XenPCIPassthroughState *s);
 #endif /* XEN_PT_H */
diff --git a/hw/xen/xen_pt_config_init.c b/hw/xen/xen_pt_config_init.c
index 31ec5add1d..435abd7286 100644
--- a/hw/xen/xen_pt_config_init.c
+++ b/hw/xen/xen_pt_config_init.c
@@ -852,6 +852,30 @@ static inline uint8_t 
get_device_type(XenPCIPassthroughState *s,
 return (flag &

Re: [Xen-devel] [PATCH] x86/mmcfg: add "force" option for MCFG

2019-08-29 Thread Jan Beulich

On 29.08.2019 10:00, Roger Pau Monné  wrote:
> On Wed, Aug 28, 2019 at 04:24:22PM +0100, Igor Druzhinin wrote:
>> --- a/xen/arch/x86/x86_64/mmconfig-shared.c
>> +++ b/xen/arch/x86/x86_64/mmconfig-shared.c
>> @@ -26,33 +26,34 @@
>>  
>>  #include "mmconfig.h"
>>  
>> +static bool_t __read_mostly force_mmcfg = true;
>>  unsigned int pci_probe = PCI_PROBE_CONF1 | PCI_PROBE_MMCONF;
>>  
>>  static int __init parse_mmcfg(const char *s)
>>  {
>>  const char *ss;
>> -int rc = 0;
>> +int val, rc = 0;
>>  
>>  do {
>>  ss = strchr(s, ',');
>>  if ( !ss )
>>  ss = strchr(s, '\0');
>>  
>> -switch ( parse_bool(s, ss) )
>> -{
>> -case 0:
>> -pci_probe &= ~PCI_PROBE_MMCONF;
>> -break;
>> -case 1:
>> -break;
>> -default:
>> -if ( !cmdline_strcmp(s, "amd_fam10") ||
>> - !cmdline_strcmp(s, "amd-fam10") )
>> +if ( (val = parse_bool(s, ss)) >= 0 ) {
>> +if ( val )
>> +   pci_probe |= PCI_PROBE_MMCONF;
>> +else
>> +   pci_probe &= ~PCI_PROBE_MMCONF;
>> +} else if ( (val = parse_boolean("amd_fam10", s, ss)) >= 0 ||
>> +(val = parse_boolean("amd-fam10", s, ss)) >= 0 ) {
>> +if ( val )
>>  pci_probe |= PCI_CHECK_ENABLE_AMD_MMCONF;
>>  else
>> -rc = -EINVAL;
>> -break;
>> -}
>> +pci_probe &= ~PCI_CHECK_ENABLE_AMD_MMCONF;
>> +} else if ( (val = parse_boolean("force", s, ss)) >= 0)
>> +force_mmcfg = val;
> 
> You could also consider adding a new flag to pci_probe, ie:
> PCI_PROBE_FORCE_MMCFG.

Yes please, albeit to be in sync with the other flag perhaps
better PCI_PROBE_FORCE_MMCONF or PCI_PROBE_MMCONF_FORCE.

Jan

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH] x86: clear RDRAND CPUID bit on AMD family 15h/16h

2019-08-29 Thread Jan Beulich

Inspired by Linux commit c49a0a80137c7ca7d6ced4c812c9e07a949f6f24:

There have been reports of RDRAND issues after resuming from suspend on
some AMD family 15h and family 16h systems. This issue stems from a BIOS
not performing the proper steps during resume to ensure RDRAND continues
to function properly.

Update the CPU initialization to clear the RDRAND CPUID bit for any family
15h and 16h processor that supports RDRAND. If it is known that the family
15h or family 16h system does not have an RDRAND resume issue or that the
system will not be placed in suspend, the "cpuid=rdrand" kernel parameter
can be used to stop the clearing of the RDRAND CPUID bit.

Note, that clearing the RDRAND CPUID bit does not prevent a processor
that normally supports the RDRAND instruction from executing it. So any
code that determined the support based on family and model won't #UD.

Signed-off-by: Jan Beulich 
---
Slightly RFC, in particular because of the change to parse_xen_cpuid():
Alternative approach suggestions are welcome.

--- a/docs/misc/xen-command-line.pandoc
+++ b/docs/misc/xen-command-line.pandoc
@@ -488,6 +488,10 @@ The Speculation Control hardware feature
 be ignored, e.g. `no-ibrsb`, at which point Xen won't use them itself, and
 won't offer them to guests.
 
+`rdrand` can be used to override the default disabling of the feature on 
certain
+AMD systems.  Its negative form can of course also be used to suppress use and
+exposure of the feature.
+
 ### cpuid_mask_cpu
 > `= fam_0f_rev_[cdefg] | fam_10_rev_[bc] | fam_11_rev_b`
 
--- a/xen/arch/x86/cpu/amd.c
+++ b/xen/arch/x86/cpu/amd.c
@@ -641,6 +641,18 @@ static void init_amd(struct cpuinfo_x86
if (acpi_smi_cmd && (acpi_enable_value | acpi_disable_value))
amd_acpi_c1e_quirk = true;
break;
+
+   case 0x15: case 0x16:
+   /*
+* There are too many Fam15/Fam16 systems where upon resume
+* from S3 firmware fails to re-setup properly functioning
+* RDRAND.  Clear the feature unless force-enabled on the
+* command line.
+*/
+   if (c == _cpu_data &&
+   !is_forced_cpu_cap(X86_FEATURE_RDRAND))
+   setup_clear_cpu_cap(X86_FEATURE_RDRAND);
+   break;
}
 
display_cacheinfo(c);
--- a/xen/arch/x86/cpu/common.c
+++ b/xen/arch/x86/cpu/common.c
@@ -94,6 +94,11 @@ void __init setup_force_cpu_cap(unsigned
__set_bit(cap, boot_cpu_data.x86_capability);
 }
 
+bool __init is_forced_cpu_cap(unsigned int cap)
+{
+   return test_bit(cap, forced_caps);
+}
+
 static void default_init(struct cpuinfo_x86 * c)
 {
/* Not much we can do here... */
--- a/xen/arch/x86/cpuid.c
+++ b/xen/arch/x86/cpuid.c
@@ -74,6 +74,13 @@ static int __init parse_xen_cpuid(const
 setup_clear_cpu_cap(X86_FEATURE_AMD_SSBD);
 }
 }
+else if ( (val = parse_boolean("rdrand", s, ss)) >= 0 )
+{
+if ( !val )
+setup_clear_cpu_cap(X86_FEATURE_RDRAND);
+else if ( cpuid_ecx(1) & cpufeat_mask(X86_FEATURE_RDRAND) )
+setup_force_cpu_cap(X86_FEATURE_RDRAND);
+}
 else
 rc = -EINVAL;
 
--- a/xen/include/asm-x86/processor.h
+++ b/xen/include/asm-x86/processor.h
@@ -166,6 +166,7 @@ extern const struct x86_cpu_id *x86_matc
 extern void identify_cpu(struct cpuinfo_x86 *);
 extern void setup_clear_cpu_cap(unsigned int);
 extern void setup_force_cpu_cap(unsigned int);
+extern bool is_forced_cpu_cap(unsigned int);
 extern void print_cpu_info(unsigned int cpu);
 extern unsigned int init_intel_cacheinfo(struct cpuinfo_x86 *c);
 

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 3/6] remove late (on-demand) construction of IOMMU page tables

2019-08-29 Thread Paul Durrant

> -Original Message-
> From: Jan Beulich 
> Sent: 27 August 2019 08:46
> To: Paul Durrant 
> Cc: JulienGrall ; Alexandru Isaila 
> ; Petre Pircalabu
> ; Razvan Cojocaru ; 
> Andrew Cooper
> ; George Dunlap ; Ian 
> Jackson
> ; Roger Pau Monne ; 
> VolodymyrBabchuk
> ; Stefano Stabellini ; 
> xen-
> de...@lists.xenproject.org; Konrad Rzeszutek Wilk ; 
> Tamas K Lengyel
> ; Tim (Xen.org) ; Wei Liu 
> Subject: Re: [Xen-devel] [PATCH 3/6] remove late (on-demand) construction of 
> IOMMU page tables
> 
> On 12.08.2019 17:41, Paul Durrant wrote:
> >> From: Jan Beulich 
> >> Sent: 07 August 2019 11:32
> >>
> >> On 30.07.2019 15:44, Paul Durrant wrote:
> >>> @@ -625,8 +548,7 @@ static void iommu_dump_p2m_table(unsigned char key)
> >>>ops = iommu_get_ops();
> >>>for_each_domain(d)
> >>>{
> >>> -if ( is_hardware_domain(d) ||
> >>> - dom_iommu(d)->status < IOMMU_STATUS_initialized )
> >>> +if ( !is_iommu_enabled(d) )
> >>>continue;
> >>
> >> Why do you drop the hwdom check here?
> >
> > Because is_iommu_enabled() for the h/w domain will always be true if
> > iommu_enabled is true, so no need for a special case.
> 
> But the effect of the extra check was to _skip_ Dom0. If you mean to
> change this, then you should say so (and why) in the description.
> 

Ah, yes, it does still need to remain.

  Paul

> Jan
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH] x86/mmcfg: add "force" option for MCFG

2019-08-29 Thread Jan Beulich

On 28.08.2019 17:24, Igor Druzhinin wrote:
> If MCFG area is not reserved in E820 Xen by default will defer its usage
> until Dom0 registers it explicitly after ACPI parser recognizes it as
> a reserved resource in DSDT. Having it reserved in E820 is not
> mandatory according to "PCI Firmware Specification, rev 3.2" (par. 4.1.2)
> and firmware is free to keep a hole E820 in that place.
> 
> Unfortunately, keeping it disabled at this point makes Xen fail to
> recognize many of PCIe extended capabilities early enough for newly added
> devices. Namely, (1) some of VT-d quirks are not applied during Dom0
> device handoff, (2) currently MCFG reservation report comes
> too late from Dom0 after some of PCI devices being registered in Xen
> by Dom0 kernel that break initialization of a number of PCIe capabilities
> (e.g. SR-IOV VF BAR sizing).
> 
> Since introduction of ACPI parser in Xen is not possible add a "force"
> option that will allow Xen to use MCFG area even it's not properly
> reserved in E820.
> 
> Signed-off-by: Igor Druzhinin 
> ---
> 
> I think we need to have this option to at least have a way to workaround
> problem (1). Problem (2) could be solved in Dom0 kernel by invoking
> xen_mcfg_late() earlier but before the first PCI bus enumertaion which
> currently happens somwhere during ACPI scan I think.

Indeed (2) should be fixed as you say here. (1) should imo be fixed
differently too: pci_vtd_quirk() (and anything else that relies on
extended config space accesses) should be re-invoked once MCFG becomes
available for a given bus (range).

Nevertheless I think the command line option is a good thing to have.

> --- a/docs/misc/xen-command-line.pandoc
> +++ b/docs/misc/xen-command-line.pandoc
> @@ -1424,11 +1424,13 @@ ordinary DomU, control domain, hardware domain, and - 
> when supported
>  by the platform - DomU with pass-through device assigned).
>  
>  ### mmcfg (x86)
> -> `= [,amd-fam10]`
> +> `= List of [ , force, amd-fam10 ]`
>  
> -> Default: `1`
> +> Default: `true,force`
>  
> -Specify if the MMConfig space should be enabled.
> +Specify if the MMConfig space should be enabled. If force option is specified
> +(default) MMConfig space will be used by Xen early in boot even if it's
> +not reserved by firmware in the E820 table.

The term "force" is too heavy for my taste: You still require the
range to be at least in an E820 hole. "allow-e820-hole" otoh is a
little longish. Perhaps "relaxed"?

> --- a/xen/arch/x86/x86_64/mmconfig-shared.c
> +++ b/xen/arch/x86/x86_64/mmconfig-shared.c
> @@ -26,33 +26,34 @@
>  
>  #include "mmconfig.h"
>  
> +static bool_t __read_mostly force_mmcfg = true;

Just bool please.

>  unsigned int pci_probe = PCI_PROBE_CONF1 | PCI_PROBE_MMCONF;
>  
>  static int __init parse_mmcfg(const char *s)
>  {
>  const char *ss;
> -int rc = 0;
> +int val, rc = 0;
>  
>  do {
>  ss = strchr(s, ',');
>  if ( !ss )
>  ss = strchr(s, '\0');
>  
> -switch ( parse_bool(s, ss) )
> -{
> -case 0:
> -pci_probe &= ~PCI_PROBE_MMCONF;
> -break;
> -case 1:
> -break;
> -default:
> -if ( !cmdline_strcmp(s, "amd_fam10") ||
> - !cmdline_strcmp(s, "amd-fam10") )
> +if ( (val = parse_bool(s, ss)) >= 0 ) {
> +if ( val )
> +   pci_probe |= PCI_PROBE_MMCONF;
> +else
> +   pci_probe &= ~PCI_PROBE_MMCONF;
> +} else if ( (val = parse_boolean("amd_fam10", s, ss)) >= 0 ||
> +(val = parse_boolean("amd-fam10", s, ss)) >= 0 ) {
> +if ( val )
>  pci_probe |= PCI_CHECK_ENABLE_AMD_MMCONF;
>  else
> -rc = -EINVAL;
> -break;
> -}
> +pci_probe &= ~PCI_CHECK_ENABLE_AMD_MMCONF;
> +} else if ( (val = parse_boolean("force", s, ss)) >= 0)
> +force_mmcfg = val;
> +else
> +rc = -EINVAL;

Brace placement is wrong in your additions. I also don't really see
why you do away with the switch().

> @@ -355,6 +356,11 @@ static int __init is_mmconf_reserved(
> (unsigned int)cfg->start_bus_number,
> (unsigned int)cfg->end_bus_number);
>  }
> +} else if (!e820_any_mapped(addr, addr + old_size - 1, 0)) {
> +if (!force_mmcfg)
> +printk(KERN_WARNING "PCI: MCFG area at %lx is not reserved in 
> E820, "
> +   "use mmcfg=force option\n", addr);

I don't think saying "use" is a good idea. If anything, make it less
strict - either by attaching "if ..." or by saying "consider using".

Jan

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH] x86: properly gate clearing of PKU feature

2019-08-29 Thread Andrew Cooper

On 29/08/2019 10:27, Jan Beulich wrote:
> setup_clear_cpu_cap() is __init and hence may not be called post-boot.
> Note that opt_pku nevertheless is not getting __initdata added - see
> e.g. commit 43fa95ae6a ("mm: make opt_bootscrub non-init").
>
> Signed-off-by: Jan Beulich 

Acked-by: Andrew Cooper 

However, I'm tempted to suggest that this logic disappears entirely. 
Given that we don't support it for PV guests, and can't without taking a
CR4 write on toggle_kernel_mode(), all this actually controls is whether
the bit finds its way into the HVM max policy.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v9 12/15] microcode: reduce memory allocation and copy when creating a patch

2019-08-29 Thread Jan Beulich

On 19.08.2019 03:25, Chao Gao wrote:
> @@ -542,29 +505,21 @@ static struct microcode_patch 
> *cpu_request_microcode(const void *buf,
>  while ( (error = get_ucode_from_buffer_amd(mc_amd, buf, bufsize,
> )) == 0 )
>  {
> -struct microcode_patch *new_patch = alloc_microcode_patch(mc_amd);
> -
> -if ( IS_ERR(new_patch) )
> -{
> -error = PTR_ERR(new_patch);
> -break;
> -}
> -
>  /*
> - * If the new patch covers current CPU, compare patches and store the
> + * If the new ucode covers current CPU, compare ucodes and store the
>   * one with higher revision.
>   */
> -if ( (microcode_fits(new_patch->mc_amd) != MIS_UCODE) &&
> - (!patch || (compare_patch(new_patch, patch) == NEW_UCODE)) )
> +#define REV_ID(mpb) (((struct microcode_header_amd 
> *)(mpb))->processor_rev_id)
> +if ( (microcode_fits(mc_amd) != MIS_UCODE) &&
> + (!saved || (REV_ID(mc_amd->mpb) > REV_ID(saved))) )
> +#undef REV_ID

I'm not happy with this helper #define, the more that "saved" already is
of the correct type. compare_patch() in reality only acts on the header,
so I'd suggest having that function forward to a new compare_header()
(or some other suitable name) and use that new function here as well.

> @@ -379,47 +360,47 @@ static struct microcode_patch 
> *cpu_request_microcode(const void *buf,
>  {
>  long offset = 0;
>  int error = 0;
> -void *mc;
> +struct microcode_intel *mc, *saved = NULL;
>  struct microcode_patch *patch = NULL;
>  
> -while ( (offset = get_next_ucode_from_buffer(, buf, size, offset)) > 
> 0 )
> +while ( (offset = get_next_ucode_from_buffer((void **), buf,

Casts like this make me rather nervous. Please see about getting rid of
it (by using a union or a 2nd local variable).

Jan

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v9 01/15] microcode/intel: extend microcode_update_match()

2019-08-29 Thread Chao Gao

On Wed, Aug 28, 2019 at 05:12:34PM +0200, Jan Beulich wrote:
>On 19.08.2019 03:25, Chao Gao wrote:
>> to a more generic function. So that it can be used alone to check
>> an update against the CPU signature and current update revision.
>> 
>> Note that enum microcode_match_result will be used in common code
>> (aka microcode.c), it has been placed in the common header.
>> 
>> Signed-off-by: Chao Gao 
>> Reviewed-by: Roger Pau Monné 
>> Reviewed-by: Jan Beulich 
>
>I don't think these can be legitimately retained with ...
>
>> Changes in v9:
>>  - microcode_update_match() doesn't accept (sig, pf, rev) any longer.
>>  Hence, it won't be used to compare two arbitrary updates.
>
>... this kind of a change.

Will drop RBs.

>
>> --- a/xen/arch/x86/microcode_intel.c
>> +++ b/xen/arch/x86/microcode_intel.c
>> @@ -134,14 +134,39 @@ static int collect_cpu_info(unsigned int cpu_num, 
>> struct cpu_signature *csig)
>>  return 0;
>>  }
>>  
>> -static inline int microcode_update_match(
>> -unsigned int cpu_num, const struct microcode_header_intel *mc_header,
>> -int sig, int pf)
>> +/* Check an update against the CPU signature and current update revision */
>> +static enum microcode_match_result microcode_update_match(
>> +const struct microcode_header_intel *mc_header, unsigned int cpu)
>>  {
>> -struct ucode_cpu_info *uci = _cpu(ucode_cpu_info, cpu_num);
>> -
>> -return (sigmatch(sig, uci->cpu_sig.sig, pf, uci->cpu_sig.pf) &&
>> -(mc_header->rev > uci->cpu_sig.rev));
>> +const struct extended_sigtable *ext_header;
>> +const struct extended_signature *ext_sig;
>> +unsigned int i;
>> +struct ucode_cpu_info *uci = _cpu(ucode_cpu_info, cpu);
>> +unsigned int sig = uci->cpu_sig.sig;
>> +unsigned int pf = uci->cpu_sig.pf;
>> +unsigned int rev = uci->cpu_sig.rev;
>> +unsigned long data_size = get_datasize(mc_header);
>> +const void *end = (const void *)mc_header + get_totalsize(mc_header);
>> +
>> +if ( sigmatch(sig, mc_header->sig, pf, mc_header->pf) )
>> +return (mc_header->rev > rev) ? NEW_UCODE : OLD_UCODE;
>
>Didn't you lose a range check against "end" ahead of this if()?
>get_totalsize() and get_datasize() aiui also would need to live
>after a range check, just a sizeof() (i.e. MC_HEADER_SIZE) based
>one. This would also affect the caller as it seems.

I think microcode_sanity_check() is for this purpose. We can do
sanity check before the if(). Perhaps, we can just add an assertion
that sanity check won't fail. Because whenever sanity check failed
when pasing an ucode blob, we just drop the ucode; we won't pass an
broken ucode to this function.

Thanks
Chao

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 2/3] build: allow picking the env values for compiler variables

2019-08-29 Thread Roger Pau Monné

On Tue, Aug 27, 2019 at 11:59:33AM +0100, Ian Jackson wrote:
> Jan Beulich writes ("Re: [PATCH 2/3] build: allow picking the env values for 
> compiler variables"):
> > On 20.08.2019 09:58, Roger Pau Monné  wrote:
> > > I don't have things 'left' in the environment, neither have most build
> > > systems AFAIK. I do have things set that I want the build to
> > > acknowledge, instead of fight against it.
> > 
> > My view is this: XEN_* things coming from the environment are fine.
> > Generic (i.e. project agnostic) variables (like CC) otoh would
> > better be ignored, as it's not clear for what purpose they've been
> > set. (Istr a case where some FORTIFY option was set by RPM build
> > environments, breaking our build.) They may well have been meant
> > for some other project.
> 
> CC is set *specifically to cause build systems's like Xen's to use
> that compiler*.  That is its purpose.  It should be honoured, not
> ignored.
> 
> Likewise FORTIFY, even though it broke something.  FORTIFY_SOURCE was
> set specifically to cause the changes it did.  The people who set it
> didn't see all the implications, but that change *was* their design
> intent and the bugs were real bugs in what they were trying to do.
> 
> > Question is whether to take the above just for the hypervisor part
> > of the build, or also the tool stack etc ones.
> 
> *Definitely* for the toolstack build, we should honour CC et al.
> 
> The hypervisor is a more subtle question because people who set these
> variables often forget to think about kernel code, embedded code,
> etc.  That's what happened with FORTIFY_SOURCE.  However, I would
> argue that it is better, in such a situation, to honour the variable
> and break the build, than it is to simply ignore it.

I fully agree. It's true that some build systems will likely break
bulding Xen, but that's a build system issue, and we shouldn't try to
work around this in Xen. As said before, what build systems (like
travis or gitlab for instance) set in the environment should be
acknowledged by Xen, or else we are forcing everyone to have custom
workarounds for building Xen.

Thanks, Roger.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH V3 8/8] iommu/arm: Add Renesas IPMMU-VMSA support

2019-08-29 Thread Yoshihiro Shimoda

Hi Oleksandr-san,

> From: Oleksandr Tyshchenko, Sent: Wednesday, August 21, 2019 3:10 AM
> 
> From: Oleksandr Tyshchenko 
> 
> The IPMMU-VMSA is VMSA-compatible I/O Memory Management Unit (IOMMU)
> which provides address translation and access protection functionalities
> to processing units and interconnect networks.
> 
> Please note, current driver is supposed to work only with newest
> Gen3 SoCs revisions which IPMMU hardware supports stage 2 translation

This is still "Gen3", so that please replace it with "R-Car Gen3"...

> table format and is able to use CPU's P2M table as is if one is
> 3-level page table (up to 40 bit IPA).
> 
> The major differences compare to the Linux driver are:
> 
> 1. Stage 1/Stage 2 translation. Linux driver supports Stage 1
> translation only (with Stage 1 translation table format). It manages
> page table by itself. But Xen driver supports Stage 2 translation
> (with Stage 2 translation table format) to be able to share the P2M
> with the CPU. Stage 1 translation is always bypassed in Xen driver.
> 
> So, Xen driver is supposed to be used with newest R-Car Gen3 SoC revisions
> only (H3 ES3.0, M3-W+, etc.) which IPMMU H/W supports stage 2 translation
> table format.
> 
> 2. AArch64 support. Linux driver uses VMSAv8-32 mode, while Xen driver
> enables Armv8 VMSAv8-64 mode to cover up to 40 bit input address.
> 
> 3. Context bank (sets of page table) usage. In Xen, each context bank is
> mapped to one Xen domain. So, all devices being pass throughed to the
> same Xen domain share the same context bank.
> 
> 4. IPMMU device tracking. In Xen, all IOMMU devices are managed
> by single driver instance. So, driver uses global list to keep track
> of registered IPMMU devices.
> 
> Signed-off-by: Oleksandr Tyshchenko 
> CC: Julien Grall 
> CC: Yoshihiro Shimoda 

About this hardware handling, this patch seems good to me. But, since
I'm not familiar about Xen passthrough framework, I think I cannot
add my Reviewed-by tag into this patch. What do you think?

Best regards,
Yoshihiro Shimoda


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 3/6] remove late (on-demand) construction of IOMMU page tables

2019-08-29 Thread Paul Durrant

> -Original Message-
> From: Jan Beulich 
> Sent: 27 August 2019 08:49
> To: Paul Durrant 
> Cc: 'JulienGrall' ; 'Alexandru Isaila' 
> ; 'Petre
> Pircalabu' ; 'Razvan Cojocaru' 
> ; Andrew Cooper
> ; George Dunlap ; Ian 
> Jackson
> ; Roger Pau Monne ; 
> 'VolodymyrBabchuk'
> ; 'Stefano Stabellini' ; 
> 'xen-
> de...@lists.xenproject.org' ; 'Konrad 
> Rzeszutek Wilk'
> ; 'Tamas K Lengyel' ; Tim 
> (Xen.org) ; 'Wei
> Liu' 
> Subject: Re: [Xen-devel] [PATCH 3/6] remove late (on-demand) construction of 
> IOMMU page tables
> 
> On 14.08.2019 11:39, Paul Durrant wrote:
> >> -Original Message-
> >> From: Xen-devel  On Behalf Of Paul 
> >> Durrant
> >> Sent: 12 August 2019 17:26
> >> To: 'Jan Beulich' 
> >> Cc: 'Petre Pircalabu' ; 'Stefano Stabellini' 
> >> ;
> >> 'Wei Liu' ; 'Razvan Cojocaru' ; 
> >> 'Konrad Rzeszutek Wilk'
> >> ; Andrew Cooper ; Tim 
> >> (Xen.org) ;
> >> George Dunlap ; 'Julien Grall' 
> >> ; 'Tamas K Lengyel'
> >> ; Ian Jackson ; 'Alexandru 
> >> Isaila'
> >> ; 'xen-devel@lists.xenproject.org' 
> >> ;
> >> 'VolodymyrBabchuk' ; Roger Pau Monne 
> >> 
> >> Subject: Re: [Xen-devel] [PATCH 3/6] remove late (on-demand) construction 
> >> of IOMMU page tables
> >>
> >>> -Original Message-
> >> [snip]
> 
>  On 30.07.2019 15:44, Paul Durrant wrote:
> > NOTE: This patch will cause a small amount of extra resource to be used
> > to accommodate IOMMU page tables that may never be used, since 
> > the
> > per-domain IOMMU flag enable flag is currently set to the value
> > of the global iommu_enable flag. A subsequent patch will add an
> > option to the toolstack to allow it to be turned off if there is
> > no intention to assign passthrough hardware to the domain.
> 
>  In particular if the default of this is going to be "true" (I
>  didn't look at that patch yet, but the wording above makes me
>  assume so), in auto-ballooning mode without shared page tables
>  more memory should imo be ballooned out of Dom0 now. It has
>  always been a bug that IOMMU page tables weren't accounted for,
>  but it would become even more prominent then.
> >>>
> >>> Ultimately, once the whole series is applied, then nothing much should 
> >>> change for those specifying
> >>> passthrough h/w in an xl.cfg. The main difference will be that h/w cannot 
> >>> be passed through to a
> >>> domain that was not originally created with IOMMU pagetables.
> >>> With patches applied up to this point then, yes, every domain will get 
> >>> IOMMU page tables. I guess
> >> I'll
> >>> take a look at the auto-ballooning code and see what needs to be done.
> >>>
> >>
> >> Ok, I've had a look...
> >>
> >> I could make a rough calculation in libxl_domain_need_memory() based on 
> >> the domain's max_memkb and
> an
> >> assumption of a 4 level translation with 512 PTEs per level, or would 
> >> prefer such guestimation to
> be
> >> overridable using an xl.cfg parameter in a broadly similar way to 
> >> shadow_memkb?
> >>
> >
> > I think I'm going to say no to this anyway since, as you say, the overhead 
> > as never been accounted
> for and having to make assumptions about the IOMMU table structure is not 
> very attractive. Given that
> any issue is going to be transient (i.e. until patch #6 is committed) I don't 
> think fixing auto-
> ballooning ought to be in scope for this series.
> 
> I'm afraid I disagree: The series extends a pre-existing problem
> affecting some guests to all ones (at least by default).

TBH I've seen sufficient numbers of domain create failures when using 
auto-ballooning that I stopped using it some time ago (besides the fact that 
it's slow). If you're happy with the simplistic 
double-the-page-table-reservation calculation then I can add that but IMO it 
would be better to add another patch to just remove auto-ballooning.

 Paul

> 
> Jan
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2] Partially revert "x86/mm: Clean IOMMU flags from p2m-pt code"

2019-08-29 Thread Jan Beulich

On 29.08.2019 10:49, Roger Pau Monne wrote:
> @@ -640,9 +670,17 @@ p2m_pt_set_entry(struct p2m_domain *p2m, gfn_t gfn_, 
> mfn_t mfn,
>   && (gfn + (1UL << page_order) - 1 > p2m->max_mapped_pfn) )
>  p2m->max_mapped_pfn = gfn + (1UL << page_order) - 1;
>  
> +if ( need_iommu_pt_sync(p2m->domain) &&
> + (iommu_old_flags != iommu_pte_flags || old_mfn != mfn_x(mfn)) )
> +rc = iommu_pte_flags ?
> +iommu_legacy_map(d, _dfn(gfn), mfn, page_order,
> + iommu_pte_flags) :
> +iommu_legacy_unmap(d, _dfn(gfn), page_order);

Indentation of the if() body is one level too deep. With this
corrected (easy enough to do while committing)
Reviewed-by: Jan Beulich 

Jan

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 3/6] remove late (on-demand) construction of IOMMU page tables

2019-08-29 Thread Jan Beulich

On 29.08.2019 11:33, Paul Durrant wrote:
> TBH I've seen sufficient numbers of domain create failures when using
> auto-ballooning that I stopped using it some time ago (besides the fact
> that it's slow). If you're happy with the simplistic
> double-the-page-table-reservation calculation then I can add that but
> IMO it would be better to add another patch to just remove auto-ballooning.

I'd not be overly happy, but I could live with this. But this needs
consent by others, not the least the tool stack maintainers.

Jan

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 3/6] remove late (on-demand) construction of IOMMU page tables

2019-08-29 Thread Paul Durrant

> -Original Message-
> From: Jan Beulich 
> Sent: 29 August 2019 10:52
> To: Paul Durrant 
> Cc: 'JulienGrall' ; 'Alexandru Isaila' 
> ; 'Petre
> Pircalabu' ; 'Razvan Cojocaru' 
> ; Andrew Cooper
> ; George Dunlap ; Ian 
> Jackson
> ; Roger Pau Monne ; 
> 'VolodymyrBabchuk'
> ; 'Stefano Stabellini' ; 
> 'xen-
> de...@lists.xenproject.org' ; 'Konrad 
> Rzeszutek Wilk'
> ; 'Tamas K Lengyel' ; Tim 
> (Xen.org) ; 'Wei
> Liu' 
> Subject: Re: [Xen-devel] [PATCH 3/6] remove late (on-demand) construction of 
> IOMMU page tables
> 
> On 29.08.2019 11:33, Paul Durrant wrote:
> > TBH I've seen sufficient numbers of domain create failures when using
> > auto-ballooning that I stopped using it some time ago (besides the fact
> > that it's slow). If you're happy with the simplistic
> > double-the-page-table-reservation calculation then I can add that but
> > IMO it would be better to add another patch to just remove auto-ballooning.
> 
> I'd not be overly happy, but I could live with this. But this needs
> consent by others, not the least the tool stack maintainers.
> 

Ok. I'll deal with that later. Looking again though, I'm not sure what you mean 
by 'the amount reserved for page tables'? Do you mean 'shadow_memkb' or 
something else?

  Paul
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v3 1/5] xen/spinlocks: in debug builds store cpu holding the lock

2019-08-29 Thread Juergen Gross

Add the cpu currently holding the lock to struct lock_debug. This makes
analysis of locking errors easier and it can be tested whether the
correct cpu is releasing a lock again.

Signed-off-by: Juergen Gross 
---
V2: adjust types (Jan Beulich)
---
 xen/common/spinlock.c  | 33 ++---
 xen/include/xen/spinlock.h | 17 -
 2 files changed, 38 insertions(+), 12 deletions(-)

diff --git a/xen/common/spinlock.c b/xen/common/spinlock.c
index 6bc52d70c0..1be1b5ebe6 100644
--- a/xen/common/spinlock.c
+++ b/xen/common/spinlock.c
@@ -13,9 +13,9 @@
 
 static atomic_t spin_debug __read_mostly = ATOMIC_INIT(0);
 
-static void check_lock(struct lock_debug *debug)
+static void check_lock(union lock_debug *debug)
 {
-int irq_safe = !local_irq_is_enabled();
+bool irq_safe = !local_irq_is_enabled();
 
 if ( unlikely(atomic_read(_debug) <= 0) )
 return;
@@ -43,18 +43,21 @@ static void check_lock(struct lock_debug *debug)
  */
 if ( unlikely(debug->irq_safe != irq_safe) )
 {
-int seen = cmpxchg(>irq_safe, -1, irq_safe);
+union lock_debug seen, new = { 0 };
 
-if ( seen == !irq_safe )
+new.irq_safe = irq_safe;
+seen.val = cmpxchg(>val, LOCK_DEBUG_INITVAL, new.val);
+
+if ( !seen.unseen && seen.irq_safe == !irq_safe )
 {
 printk("CHECKLOCK FAILURE: prev irqsafe: %d, curr irqsafe %d\n",
-   seen, irq_safe);
+   seen.irq_safe, irq_safe);
 BUG();
 }
 }
 }
 
-static void check_barrier(struct lock_debug *debug)
+static void check_barrier(union lock_debug *debug)
 {
 if ( unlikely(atomic_read(_debug) <= 0) )
 return;
@@ -70,7 +73,18 @@ static void check_barrier(struct lock_debug *debug)
  * However, if we spin on an IRQ-unsafe lock with IRQs disabled then that
  * is clearly wrong, for the same reason outlined in check_lock() above.
  */
-BUG_ON(!local_irq_is_enabled() && (debug->irq_safe == 0));
+BUG_ON(!local_irq_is_enabled() && !debug->irq_safe);
+}
+
+static void got_lock(union lock_debug *debug)
+{
+debug->cpu = smp_processor_id();
+}
+
+static void rel_lock(union lock_debug *debug)
+{
+ASSERT(debug->cpu == smp_processor_id());
+debug->cpu = SPINLOCK_NO_CPU;
 }
 
 void spin_debug_enable(void)
@@ -87,6 +101,8 @@ void spin_debug_disable(void)
 
 #define check_lock(l) ((void)0)
 #define check_barrier(l) ((void)0)
+#define got_lock(l) ((void)0)
+#define rel_lock(l) ((void)0)
 
 #endif
 
@@ -150,6 +166,7 @@ void inline _spin_lock_cb(spinlock_t *lock, void (*cb)(void 
*), void *data)
 cb(data);
 arch_lock_relax();
 }
+got_lock(>debug);
 LOCK_PROFILE_GOT;
 preempt_disable();
 arch_lock_acquire_barrier();
@@ -181,6 +198,7 @@ void _spin_unlock(spinlock_t *lock)
 arch_lock_release_barrier();
 preempt_enable();
 LOCK_PROFILE_REL;
+rel_lock(>debug);
 add_sized(>tickets.head, 1);
 arch_lock_signal();
 }
@@ -224,6 +242,7 @@ int _spin_trylock(spinlock_t *lock)
 if ( cmpxchg(>tickets.head_tail,
  old.head_tail, new.head_tail) != old.head_tail )
 return 0;
+got_lock(>debug);
 #ifdef CONFIG_LOCK_PROFILE
 if (lock->profile)
 lock->profile->time_locked = NOW();
diff --git a/xen/include/xen/spinlock.h b/xen/include/xen/spinlock.h
index 2c7415e23a..88017e0a89 100644
--- a/xen/include/xen/spinlock.h
+++ b/xen/include/xen/spinlock.h
@@ -6,14 +6,21 @@
 #include 
 
 #ifndef NDEBUG
-struct lock_debug {
-s16 irq_safe; /* +1: IRQ-safe; 0: not IRQ-safe; -1: don't know yet */
+union lock_debug {
+uint16_t val;
+#define LOCK_DEBUG_INITVAL 0x
+struct {
+uint16_t cpu:12;
+uint16_t pad:2;
+bool irq_safe:1;
+bool unseen:1;
+};
 };
-#define _LOCK_DEBUG { -1 }
+#define _LOCK_DEBUG { LOCK_DEBUG_INITVAL }
 void spin_debug_enable(void);
 void spin_debug_disable(void);
 #else
-struct lock_debug { };
+union lock_debug { };
 #define _LOCK_DEBUG { }
 #define spin_debug_enable() ((void)0)
 #define spin_debug_disable() ((void)0)
@@ -142,7 +149,7 @@ typedef struct spinlock {
 #define SPINLOCK_NO_CPU 0xfffu
 u16 recurse_cnt:4;
 #define SPINLOCK_MAX_RECURSE 0xfu
-struct lock_debug debug;
+union lock_debug debug;
 #ifdef CONFIG_LOCK_PROFILE
 struct lock_profile *profile;
 #endif
-- 
2.16.4


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v3 3/5] xen: print lock profile info in panic()

2019-08-29 Thread Juergen Gross

Print the lock profile data when the system crashes and add some more
information for each lock data (lock address, cpu holding the lock).
While at it use the PRI_stime format specifier for printing time data.

This is especially beneficial for watchdog triggered crashes in case
of deadlocks.

In order to have the cpu holding the lock available let the
lock profile config option select DEBUG_LOCKS.

As printing the lock profile data will make use of locking, too, we
need to disable spinlock debugging before calling
spinlock_profile_printall() from panic().

While at it remove a superfluous #ifdef CONFIG_LOCK_PROFILE and rename
CONFIG_LOCK_PROFILE to CONFIG_DEBUG_LOCK_PROFILE.

Also move the .lockprofile.data section to init area in linker scripts
as the data is no longer needed after boot.

Signed-off-by: Juergen Gross 
---
V2:
- rename CONFIG_LOCK_PROFILE to CONFIG_DEBUG_LOCK_PROFILE (Jan Beulich)
- move .lockprofile.data section to init area in linker scripts
- use PRI_stime (Andrew Cooper)
- don't print "cpu=4095", but "not locked" (Andrew Cooper)
---
 xen/Kconfig.debug  |  3 ++-
 xen/arch/arm/xen.lds.S | 13 +++--
 xen/arch/x86/domain.c  |  2 +-
 xen/arch/x86/xen.lds.S | 13 +++--
 xen/common/keyhandler.c|  2 +-
 xen/common/spinlock.c  | 33 ++---
 xen/common/sysctl.c|  2 +-
 xen/drivers/char/console.c |  4 +++-
 xen/include/xen/spinlock.h | 12 +++-
 9 files changed, 47 insertions(+), 37 deletions(-)

diff --git a/xen/Kconfig.debug b/xen/Kconfig.debug
index 1faaa3ba6a..22573e74db 100644
--- a/xen/Kconfig.debug
+++ b/xen/Kconfig.debug
@@ -44,8 +44,9 @@ config COVERAGE
 
  If unsure, say N here.
 
-config LOCK_PROFILE
+config DEBUG_LOCK_PROFILE
bool "Lock Profiling"
+   select DEBUG_LOCKS
---help---
  Lock profiling allows you to see how often locks are taken and 
blocked.
  You can use serial console to print (and reset) using 'l' and 'L'
diff --git a/xen/arch/arm/xen.lds.S b/xen/arch/arm/xen.lds.S
index 16ce1dd01e..a497f6a48d 100644
--- a/xen/arch/arm/xen.lds.S
+++ b/xen/arch/arm/xen.lds.S
@@ -54,12 +54,6 @@ SECTIONS
*(.data.rel.ro)
*(.data.rel.ro.*)
 
-#ifdef CONFIG_LOCK_PROFILE
-   . = ALIGN(POINTER_ALIGN);
-   __lock_profile_start = .;
-   *(.lockprofile.data)
-   __lock_profile_end = .;
-#endif
. = ALIGN(POINTER_ALIGN);
__param_start = .;
*(.data.param)
@@ -173,6 +167,13 @@ SECTIONS
. = ALIGN(4);
*(.altinstr_replacement)
 
+#ifdef CONFIG_DEBUG_LOCK_PROFILE
+   . = ALIGN(POINTER_ALIGN);
+   __lock_profile_start = .;
+   *(.lockprofile.data)
+   __lock_profile_end = .;
+#endif
+
*(.init.data)
*(.init.data.rel)
*(.init.data.rel.*)
diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index bc082ed827..b1c67c2d28 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -305,7 +305,7 @@ struct domain *alloc_domain_struct(void)
 #endif
 
 
-#ifndef CONFIG_LOCK_PROFILE
+#ifndef CONFIG_DEBUG_LOCK_PROFILE
 BUILD_BUG_ON(sizeof(*d) > PAGE_SIZE);
 #endif
 d = alloc_xenheap_pages(order, MEMF_bits(bits));
diff --git a/xen/arch/x86/xen.lds.S b/xen/arch/x86/xen.lds.S
index 87fa02b9b5..111edb5360 100644
--- a/xen/arch/x86/xen.lds.S
+++ b/xen/arch/x86/xen.lds.S
@@ -128,12 +128,6 @@ SECTIONS
*(.ex_table.pre)
__stop___pre_ex_table = .;
 
-#ifdef CONFIG_LOCK_PROFILE
-   . = ALIGN(POINTER_ALIGN);
-   __lock_profile_start = .;
-   *(.lockprofile.data)
-   __lock_profile_end = .;
-#endif
. = ALIGN(POINTER_ALIGN);
__param_start = .;
*(.data.param)
@@ -251,6 +245,13 @@ SECTIONS
 *(.altinstructions)
 __alt_instructions_end = .;
 
+#ifdef CONFIG_DEBUG_LOCK_PROFILE
+   . = ALIGN(POINTER_ALIGN);
+   __lock_profile_start = .;
+   *(.lockprofile.data)
+   __lock_profile_end = .;
+#endif
+
. = ALIGN(8);
__ctors_start = .;
*(.ctors)
diff --git a/xen/common/keyhandler.c b/xen/common/keyhandler.c
index 57b360ee4b..c36baa4dff 100644
--- a/xen/common/keyhandler.c
+++ b/xen/common/keyhandler.c
@@ -62,7 +62,7 @@ static struct keyhandler {
 KEYHANDLER('P', perfc_reset, "reset performance counters", 0),
 #endif
 
-#ifdef CONFIG_LOCK_PROFILE
+#ifdef CONFIG_DEBUG_LOCK_PROFILE
 KEYHANDLER('l', spinlock_profile_printall, "print lock profile info", 1),
 KEYHANDLER('L', spinlock_profile_reset, "reset lock profile info", 0),
 #endif
diff --git a/xen/common/spinlock.c b/xen/common/spinlock.c
index 79e70a9947..c4f706c627 100644
--- a/xen/common/spinlock.c
+++ b/xen/common/spinlock.c
@@ -106,7 +106,7 @@ void spin_debug_disable(void)
 
 #endif
 
-#ifdef CONFIG_LOCK_PROFILE
+#ifdef CONFIG_DEBUG_LOCK_PROFILE
 
 #define LOCK_PROFILE_REL \
 if (lock->profile)   \
@@

Re: [Xen-devel] [PATCH v9 10/15] microcode: split out apply_microcode() from cpu_request_microcode()

2019-08-29 Thread Jan Beulich

On 19.08.2019 03:25, Chao Gao wrote:
> @@ -300,32 +322,44 @@ int microcode_update(XEN_GUEST_HANDLE_PARAM(const_void) 
> buf, unsigned long len)
>  if ( microcode_ops == NULL )
>  return -EINVAL;
>  
> -info = xmalloc_bytes(sizeof(*info) + len);
> -if ( info == NULL )
> +buffer = xmalloc_bytes(len);
> +if ( !buffer )
>  return -ENOMEM;
>  
> -ret = copy_from_guest(info->buffer, buf, len);
> -if ( ret != 0 )
> +if ( copy_from_guest(buffer, buf, len) )
>  {
> -xfree(info);
> -return ret;
> +ret = -EFAULT;
> +goto free;
>  }
>  
> -info->buffer_size = len;
> -info->error = 0;
> -info->cpu = cpumask_first(_online_map);
> -
>  if ( microcode_ops->start_update )
>  {
>  ret = microcode_ops->start_update();
>  if ( ret != 0 )
> -{
> -xfree(info);
> -return ret;
> -}
> +goto free;
>  }
>  
> -return continue_hypercall_on_cpu(info->cpu, do_microcode_update, info);
> +patch = parse_blob(buffer, len);
> +if ( IS_ERR(patch) )
> +{
> +ret = PTR_ERR(patch);
> +printk(XENLOG_INFO "Parsing microcode blob error %d\n", ret);

I think this wants to be at least XENLOG_WARNING.

> @@ -372,23 +406,46 @@ int __init early_microcode_update_cpu(bool start_update)
>  
>  microcode_ops->collect_cpu_info(_cpu(cpu_sig));
>  
> -if ( data )
> +if ( !data )
> +return -ENOMEM;
> +
> +if ( start_update )
>  {
> -if ( start_update && microcode_ops->start_update )
> +struct microcode_patch *patch;
> +
> +if ( microcode_ops->start_update )
>  rc = microcode_ops->start_update();
>  
>  if ( rc )
>  return rc;
>  
> -rc = microcode_update_cpu(data, len);
> +patch = parse_blob(data, len);
> +if ( IS_ERR(patch) )
> +{
> +printk(XENLOG_INFO "Parsing microcode blob error %ld\n",

Same here.

> +   PTR_ERR(patch));
> +return PTR_ERR(patch);
> +}
> +
> +if ( !patch )
> +{
> +printk(XENLOG_INFO "No ucode found. Update aborted!\n");

Here I'm not sure the message is worthwhile to have.

> @@ -41,8 +42,6 @@ struct cpu_signature {
>  DECLARE_PER_CPU(struct cpu_signature, cpu_sig);
>  extern const struct microcode_ops *microcode_ops;
>  
> -const struct microcode_patch *microcode_get_cache(void);
> -bool microcode_update_cache(struct microcode_patch *patch);

If you remove the declaration but not the definition, then the
latter should become static.

Jan

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v9 04/15] microcode: introduce a global cache of ucode patch

2019-08-29 Thread Jan Beulich

On 19.08.2019 03:25, Chao Gao wrote:
> +static enum microcode_match_result compare_patch(
> +const struct microcode_patch *new, const struct microcode_patch *old)
> +{
> +return (new->mc_intel->hdr.rev > old->mc_intel->hdr.rev) ?  NEW_UCODE :
> +OLD_UCODE;

There's one too many blank after the ? here. Also we commonly align
the : under the ? in cases like this one.

Jan

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v3 2/5] xen: add new CONFIG_DEBUG_LOCKS option

2019-08-29 Thread Juergen Gross

Instead of enabling debugging for debug builds only add a dedicated
Kconfig option for that purpose which defaults to DEBUG.

Signed-off-by: Juergen Gross 
---
V2:
- rename to CONFIG_DEBUG_LOCKS (Jan Beulich)
---
 xen/Kconfig.debug  | 7 +++
 xen/common/spinlock.c  | 4 ++--
 xen/include/xen/spinlock.h | 2 +-
 3 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/xen/Kconfig.debug b/xen/Kconfig.debug
index e10e314e25..1faaa3ba6a 100644
--- a/xen/Kconfig.debug
+++ b/xen/Kconfig.debug
@@ -51,6 +51,13 @@ config LOCK_PROFILE
  You can use serial console to print (and reset) using 'l' and 'L'
  respectively, or the 'xenlockprof' tool.
 
+config DEBUG_LOCKS
+   bool "Lock debugging"
+   default DEBUG
+   ---help---
+ Enable debugging features of lock handling.  Some additional
+ checks will be performed when acquiring and releasing locks.
+
 config PERF_COUNTERS
bool "Performance Counters"
---help---
diff --git a/xen/common/spinlock.c b/xen/common/spinlock.c
index 1be1b5ebe6..79e70a9947 100644
--- a/xen/common/spinlock.c
+++ b/xen/common/spinlock.c
@@ -9,7 +9,7 @@
 #include 
 #include 
 
-#ifndef NDEBUG
+#ifdef CONFIG_DEBUG_LOCKS
 
 static atomic_t spin_debug __read_mostly = ATOMIC_INIT(0);
 
@@ -97,7 +97,7 @@ void spin_debug_disable(void)
 atomic_dec(_debug);
 }
 
-#else /* defined(NDEBUG) */
+#else /* CONFIG_DEBUG_LOCKS */
 
 #define check_lock(l) ((void)0)
 #define check_barrier(l) ((void)0)
diff --git a/xen/include/xen/spinlock.h b/xen/include/xen/spinlock.h
index 88017e0a89..736490f52b 100644
--- a/xen/include/xen/spinlock.h
+++ b/xen/include/xen/spinlock.h
@@ -5,7 +5,7 @@
 #include 
 #include 
 
-#ifndef NDEBUG
+#ifdef CONFIG_DEBUG_LOCKS
 union lock_debug {
 uint16_t val;
 #define LOCK_DEBUG_INITVAL 0x
-- 
2.16.4


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH] p2m/ept: pass correct level to atomic_write_ept_entry in ept_invalidate_emt

2019-08-29 Thread Jan Beulich

On 29.08.2019 12:26, Roger Pau Monné  wrote:
> On Tue, Aug 27, 2019 at 05:23:33PM +0200, Jan Beulich wrote:
>> On 23.08.2019 07:58,  Tian, Kevin  wrote:
 From: Roger Pau Monne [mailto:roger@citrix.com]
 Sent: Tuesday, August 20, 2019 11:38 PM

 The level passed to ept_invalidate_emt corresponds to the EPT entry
 passed as the mfn parameter, which is a pointer to an EPT page table,
 hence the entries in that page table will have one level less than the
 parent.

 Fix the call to atomic_write_ept_entry to pass the correct level, ie:
 one level less than the parent.

 Fixes: 50fe6e73705 ('pvh dom0: add and remove foreign pages')
 Signed-off-by: Roger Pau Monné 
 ---
 Cc: Jun Nakajima 
 Cc: Kevin Tian 
 Cc: George Dunlap 
 Cc: Jan Beulich 
 Cc: Andrew Cooper 
 Cc: Wei Liu 
 Cc: Sergey Dyasli 
 Cc: Paul Durrant 
 ---
   xen/arch/x86/mm/p2m-ept.c | 2 +-
   1 file changed, 1 insertion(+), 1 deletion(-)

 diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
 index 6b8468c793..23ae6e9da3 100644
 --- a/xen/arch/x86/mm/p2m-ept.c
 +++ b/xen/arch/x86/mm/p2m-ept.c
 @@ -367,7 +367,7 @@ static bool_t ept_invalidate_emt(struct p2m_domain
 *p2m, mfn_t mfn,
   e.emt = MTRR_NUM_TYPES;
   if ( recalc )
   e.recalc = 1;
 -rc = atomic_write_ept_entry(p2m, [i], e, level);
 +rc = atomic_write_ept_entry(p2m, [i], e, level - 1);
   ASSERT(rc == 0);
   changed = 1;
   }
>>>
>>> Reviewed-by: Kevin Tian .
>>>
>>> One small comment about readability. What about renaming 'level'
>>> to 'parent_level' in this function?
>>
>> And also taking the opportunity and making it unsigned int?
> 
> I've been thinking about this, and I'm not sure renaming level to
> parent_level is correct, level actually matches the level of the mfn
> also passed as a parameter, and hence it's correct. It's the function
> itself that actually iterates over the page table entries, and hence
> descends one level into the page tables.
> 
> ie: I could add something like ASSERT(level) to make sure no level 0
> entries are passed to the function, but renaming doesn't seem correct
> to me.

Hmm, I'm afraid I've made the change as requested by Kevin while
committing. Personally I think either name is fine, but if Kevin
agrees with your response, then maybe we should undo that adjustment.

Jan

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH] x86: properly gate clearing of PKU feature

2019-08-29 Thread Jan Beulich

On 29.08.2019 12:46, Andrew Cooper wrote:
> On 29/08/2019 10:27, Jan Beulich wrote:
>> setup_clear_cpu_cap() is __init and hence may not be called post-boot.
>> Note that opt_pku nevertheless is not getting __initdata added - see
>> e.g. commit 43fa95ae6a ("mm: make opt_bootscrub non-init").
>>
>> Signed-off-by: Jan Beulich 
> 
> Acked-by: Andrew Cooper 
> 
> However, I'm tempted to suggest that this logic disappears entirely. 
> Given that we don't support it for PV guests, and can't without taking a
> CR4 write on toggle_kernel_mode(), all this actually controls is whether
> the bit finds its way into the HVM max policy.

Well, if you mean replacing opt_pku by an addition to
parse_xen_cpuid(), then I'll be happy to do that as a follow-on.
If there is another direction you've been considering, then
you'd have to at least outline it.

Jan

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v9 15/15] microcode: block #NMI handling when loading an ucode

2019-08-29 Thread Jan Beulich

On 27.08.2019 06:52, Chao Gao wrote:
> On Mon, Aug 26, 2019 at 04:07:59PM +0800, Chao Gao wrote:
>> On Fri, Aug 23, 2019 at 09:46:37AM +0100, Sergey Dyasli wrote:
>>> On 19/08/2019 02:25, Chao Gao wrote:
 register an nmi callback. And this callback does busy-loop on threads
 which are waiting for loading completion. Control threads send NMI to
 slave threads to prevent NMI acceptance during ucode loading.

 Signed-off-by: Chao Gao 
 ---
 Changes in v9:
  - control threads send NMI to all other threads. Slave threads will
  stay in the NMI handling to prevent NMI acceptance during ucode
  loading. Note that self-nmi is invalid according to SDM.
>>>
>>> To me this looks like a half-measure: why keep only slave threads in
>>> the NMI handler, when master threads can update the microcode from
>>> inside the NMI handler as well?
>>
>> No special reason. Because the issue we want to address is that slave
>> threads might go to handle NMI and access MSRs when master thread is
>> loading ucode. So we only keep slave threads in the NMI handler.
>>
>>>
>>> You mention that self-nmi is invalid, but Xen has self_nmi() which is
>>> used for apply_alternatives() during boot, so can be trusted to work.
>>
>> Sorry, I meant using self shorthand to send self-nmi. I tried to use
>> self shorthand but got APIC error. And I agree that it is better to
>> make slave thread call self_nmi() itself.
>>
>>>
>>> I experimented a bit with the following approach: after loading_state
>>> becomes LOADING_CALLIN, each cpu issues a self_nmi() and rendezvous
>>> via cpu_callin_map into LOADING_ENTER to do a ucode update directly in
>>> the NMI handler. And it seems to work.
>>>
>>> Separate question is about the safety of this approach: can we be sure
>>> that a ucode update would not reset the status of the NMI latch? I.e.
>>> can it cause another NMI to be delivered while Xen already handles one?
>>
>> Ashok, what's your opinion on Sergey's approach and his concern?
> 
> I talked with Ashok. We think your approach is better. I will follow
> your approach in v10. It would be much helpful if you post your patch
> so that I can just rebase it onto other patches.

Doing the actual ucode update inside an NMI handler seems rather risky
to me. Even if Ashok confirmed it would not be an issue on past and
current Intel CPUs - what about future ones, or ones from other vendors?

Jan

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v9 15/15] microcode: block #NMI handling when loading an ucode

2019-08-29 Thread Jan Beulich

On 19.08.2019 03:25, Chao Gao wrote:
> @@ -481,12 +478,28 @@ static int do_microcode_update(void *patch)
>  return ret;
>  }
>  
> +static int microcode_nmi_callback(const struct cpu_user_regs *regs, int cpu)
> +{
> +/* The first thread of a core is to load an update. Don't block it. */
> +if ( cpu == cpumask_first(per_cpu(cpu_sibling_mask, cpu)) ||
> + loading_state != LOADING_CALLIN )
> +return 0;
> +
> +cpumask_set_cpu(cpu, _callin_map);
> +
> +while ( loading_state != LOADING_EXIT )
> +cpu_relax();
> +
> +return 0;
> +}

By returning 0 you tell do_nmi() to continue processing the NMI.
Since you can't tell whether a non-IPI NMI has surfaced at about
the same time this is generally the right thing imo, but how do
you prevent unknown_nmi_error() from getting entered when do_nmi()
ends up setting handle_unknown to true? (The question is mostly
rhetorical, but there's a disconnect between do_nmi() checking
"cpu == 0" and the control thread running on
cpumask_first(_online_map), i.e. you introduce a well hidden
dependency on CPU 0 never going offline. IOW my request is to at
least make this less well hidden, such that it can be noticed if
and when someone endeavors to remove said limitation.)

Jan

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH] x86: clear RDRAND CPUID bit on AMD family 15h/16h

2019-08-29 Thread Juergen Gross


On 29.08.19 14:27, Jan Beulich wrote:

On 29.08.2019 14:01, Juergen Gross wrote:

On 29.08.19 13:42, Andrew Cooper wrote:

On 29/08/2019 10:28, Jan Beulich wrote:

Inspired by Linux commit c49a0a80137c7ca7d6ced4c812c9e07a949f6f24:

  There have been reports of RDRAND issues after resuming from suspend on
  some AMD family 15h and family 16h systems. This issue stems from a BIOS
  not performing the proper steps during resume to ensure RDRAND continues
  to function properly.

  Update the CPU initialization to clear the RDRAND CPUID bit for any family
  15h and 16h processor that supports RDRAND. If it is known that the family
  15h or family 16h system does not have an RDRAND resume issue or that the
  system will not be placed in suspend, the "cpuid=rdrand" kernel parameter
  can be used to stop the clearing of the RDRAND CPUID bit.

  Note, that clearing the RDRAND CPUID bit does not prevent a processor
  that normally supports the RDRAND instruction from executing it. So any
  code that determined the support based on family and model won't #UD.


Yeah, but it will no longer see random numbers after resume.


That's the implication; note that I've taken the text from the
Linux commit.


Signed-off-by: Jan Beulich 
---
Slightly RFC, in particular because of the change to parse_xen_cpuid():
Alternative approach suggestions are welcome.


This issue was on my radar, but I hadn't got around to starting a patch yet.

I was wondering if we could perhaps default it to whether S3 was
available, but looking at the code which prints "ACPI sleep modes: S3",
it doesn't appear to be related to any real ACPI information.


Wouldn't it make more sense to inhibit suspend/resume instead?


That's an alternative option. But if anything I'd see us providing
both, not the one you suggest instead of what the patch here does.


I'm quite sure a machine running Xen is very rarely put to S3.


I'm not at all sure about this.


Suspend/resume in Xen was broken for more than 3 months in the late 4.12
development phase and nobody noticed it. Only when I started my core
scheduling series I came across the issue.


Juergen

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH] x86: properly gate clearing of PKU feature

2019-08-29 Thread Andrew Cooper

On 29/08/2019 14:13, Jan Beulich wrote:
> On 29.08.2019 12:46, Andrew Cooper wrote:
>> On 29/08/2019 10:27, Jan Beulich wrote:
>>> setup_clear_cpu_cap() is __init and hence may not be called post-boot.
>>> Note that opt_pku nevertheless is not getting __initdata added - see
>>> e.g. commit 43fa95ae6a ("mm: make opt_bootscrub non-init").
>>>
>>> Signed-off-by: Jan Beulich 
>> Acked-by: Andrew Cooper 
>>
>> However, I'm tempted to suggest that this logic disappears entirely. 
>> Given that we don't support it for PV guests, and can't without taking a
>> CR4 write on toggle_kernel_mode(), all this actually controls is whether
>> the bit finds its way into the HVM max policy.
> Well, if you mean replacing opt_pku by an addition to
> parse_xen_cpuid(), then I'll be happy to do that as a follow-on.
> If there is another direction you've been considering, then
> you'd have to at least outline it.

Well - I did say "disappear entirely".  I don't see it as having, or
liable to gain, any useful purpose.

In reality, it was subsumed into cpuid= by my patch to allow all bits to
be tweak-able.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 1/3] x86/ACPI: re-park previously parked CPUs upon resume from S3

2019-08-29 Thread Andrew Cooper

On 14/06/2019 12:37, Jan Beulich wrote:
> Aiui when resuming from S3, CPUs come back out of RESET/INIT.

From a programmers point of view, all APs are in Wait-for-SIPI after
resume, and this is confirmed by several of the BWG flowcharts.  This is
a logical consequence of the CPU losing all power, meaning that its
startup sequence will be the same whether it is booting from S5 or S3.

>  Therefore
> they need to undergo the same procedure as was added elsewhere by
> commits d8f974f1a6 ("x86: command line option to avoid use of secondary
> hyper-threads") and 8797d20a6e ("x86: possibly bring up all CPUs even
> if not all are supposed to be used").
>
> Signed-off-by: Jan Beulich 

Acked-by: Andrew Cooper 

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2] x86/mm: Add mem access rights to NPT

2019-08-29 Thread Jan Beulich

On 22.08.2019 16:02, Alexandru Stefan ISAILA wrote:
> This patch adds access control for NPT mode.
> 
> The access rights are stored in the NPT p2m table 56:53.

Why starting from bit 53? I can't seem to find any use of bit 52.

> The bits are free after clearing the IOMMU flags [1].
> 
> Modify p2m_type_to_flags() to accept and interpret an access value,
> parallel to the ept code.
> 
> Add a set_default_access() method to the p2m-pt and p2m-ept versions
> of the p2m rather than setting it directly, to deal with different
> default permitted access values.

I think this would better be a separate change.

> --- a/xen/arch/x86/mm/p2m-pt.c
> +++ b/xen/arch/x86/mm/p2m-pt.c
> @@ -63,10 +63,13 @@
>  #define needs_recalc(level, ent) _needs_recalc(level##e_get_flags(ent))
>  #define valid_recalc(level, ent) (!(level##e_get_flags(ent) & 
> _PAGE_ACCESSED))
>  
> +#define _PAGE_ACCESS_BITS  (0x1e00) /* mem_access rights 16:13 */

I guess this is too disconnected from the two page.h headers where
the correlation between bit positions gets explained, so I guess
you want to extend the comment and either refer there, or replicate
some of it to make understandable why 16:13 matches 56:53.

I'm also concerned how easy it'll be for someone to find this
definition when wanting to use other of the available bits.

> @@ -104,8 +112,32 @@ static unsigned long p2m_type_to_flags(const struct 
> p2m_domain *p2m,
>  flags |= _PAGE_PWT;
>  ASSERT(!level);
>  }
> -return flags | P2M_BASE_FLAGS | _PAGE_PCD;
> +flags |= P2M_BASE_FLAGS | _PAGE_PCD;
> +break;
>  }
> +
> +switch ( access )
> +{
> +case p2m_access_r:
> +flags |= _PAGE_NX_BIT;
> +flags &= ~_PAGE_RW;
> +break;
> +case p2m_access_rw:
> +flags |= _PAGE_NX_BIT;
> +break;
> +case p2m_access_rx:
> +case p2m_access_rx2rw:
> +flags &= ~(_PAGE_NX_BIT | _PAGE_RW);
> +break;
> +case p2m_access_x:
> +flags &= ~_PAGE_RW;
> +break;

I can't seem to be able to follow you here. In fact I don't see
how you would be able to express execute-only with NPT. If this
is really needed for some reason, then a justifying comment
should be added.

> @@ -152,6 +184,17 @@ p2m_free_entry(struct p2m_domain *p2m, l1_pgentry_t 
> *p2m_entry, int page_order)
>  p2m_free_ptp(p2m, l1e_get_page(*p2m_entry));
>  }
>  
> +static void p2m_set_access(intpte_t *entry, p2m_access_t access)
> +{
> +*entry |= put_pte_flags((get_pte_flags(*entry) & ~_PAGE_ACCESS_BITS) |
> +(MASK_INSR(access, _PAGE_ACCESS_BITS)));
> +}
> +
> +static p2m_access_t p2m_get_access(intpte_t entry)
> +{
> +return (p2m_access_t)(MASK_EXTR(get_pte_flags(entry), 
> _PAGE_ACCESS_BITS));

Is the cast really needed here?

> @@ -226,6 +269,7 @@ p2m_next_level(struct p2m_domain *p2m, void **table,
>  {
>  new_entry = l1e_from_pfn(pfn | (i << ((level - 1) * 
> PAGETABLE_ORDER)),
>   flags);
> +p2m_set_access(_entry.l1, p2m->default_access);
>  rc = p2m->write_p2m_entry(p2m, gfn, l1_entry + i, new_entry, 
> level);
>  if ( rc )
>  {
> @@ -237,6 +281,7 @@ p2m_next_level(struct p2m_domain *p2m, void **table,
>  unmap_domain_page(l1_entry);
>  
>  new_entry = l1e_from_mfn(mfn, P2M_BASE_FLAGS | _PAGE_RW);
> +p2m_set_access(_entry.l1, p2m->default_access);
>  rc = p2m->write_p2m_entry(p2m, gfn, p2m_entry, new_entry,
>level + 1);
>  if ( rc )

Is it really intended to insert the access bits also into non-leaf
entries (which may be what is being processed here)? (May also
apply further down.)

> @@ -474,6 +520,32 @@ int p2m_pt_handle_deferred_changes(uint64_t gpa)
>  return rc;
>  }
>  
> +static int p2m_pt_check_access(p2m_access_t p2ma)
> +{
> +switch ( p2ma )
> +{
> +case p2m_access_n:
> +case p2m_access_w:
> +case p2m_access_wx:
> +case p2m_access_n2rwx:
> +return -EINVAL;

I'm not convinced EINVAL is appropriate here - the argument isn't
invalid, it's just that there's no way to represent it.

Jan

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH] x86: clear RDRAND CPUID bit on AMD family 15h/16h

2019-08-29 Thread Jan Beulich

On 29.08.2019 14:01, Juergen Gross wrote:
> On 29.08.19 13:42, Andrew Cooper wrote:
>> On 29/08/2019 10:28, Jan Beulich wrote:
>>> Inspired by Linux commit c49a0a80137c7ca7d6ced4c812c9e07a949f6f24:
>>>
>>>  There have been reports of RDRAND issues after resuming from suspend on
>>>  some AMD family 15h and family 16h systems. This issue stems from a 
>>> BIOS
>>>  not performing the proper steps during resume to ensure RDRAND 
>>> continues
>>>  to function properly.
>>>
>>>  Update the CPU initialization to clear the RDRAND CPUID bit for any 
>>> family
>>>  15h and 16h processor that supports RDRAND. If it is known that the 
>>> family
>>>  15h or family 16h system does not have an RDRAND resume issue or that 
>>> the
>>>  system will not be placed in suspend, the "cpuid=rdrand" kernel 
>>> parameter
>>>  can be used to stop the clearing of the RDRAND CPUID bit.
>>>
>>>  Note, that clearing the RDRAND CPUID bit does not prevent a processor
>>>  that normally supports the RDRAND instruction from executing it. So any
>>>  code that determined the support based on family and model won't #UD.
> 
> Yeah, but it will no longer see random numbers after resume.

That's the implication; note that I've taken the text from the
Linux commit.

>>> Signed-off-by: Jan Beulich 
>>> ---
>>> Slightly RFC, in particular because of the change to parse_xen_cpuid():
>>> Alternative approach suggestions are welcome.
>>
>> This issue was on my radar, but I hadn't got around to starting a patch yet.
>>
>> I was wondering if we could perhaps default it to whether S3 was
>> available, but looking at the code which prints "ACPI sleep modes: S3",
>> it doesn't appear to be related to any real ACPI information.
> 
> Wouldn't it make more sense to inhibit suspend/resume instead?

That's an alternative option. But if anything I'd see us providing
both, not the one you suggest instead of what the patch here does.

> I'm quite sure a machine running Xen is very rarely put to S3.

I'm not at all sure about this.

Jan

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 2/2] x86/AMD: Fix handling of x87 exception pointers on Fam17h hardware

2019-08-29 Thread Jan Beulich

On 19.08.2019 20:26, Andrew Cooper wrote:
> AMD Pre-Fam17h CPUs "optimise" {F,}X{SAVE,RSTOR} by not saving/restoring
> FOP/FIP/FDP if an x87 exception isn't pending.  This causes an information
> leak, CVE-2006-1056, and worked around by several OSes, including Xen.  AMD
> Fam17h CPUs no longer have this leak, and advertise so in a CPUID bit.
> 
> Introduce the RSTR_FP_ERR_PTRS feature, as specified by AMD, and expose to all
> guests by default.  While adjusting libxl's cpuid table, add CLZERO which
> looks to have been omitted previously.
> 
> Also introduce an X86_BUG bit to trigger the (F)XRSTOR workaround, and set it
> on AMD hardware where RSTR_FP_ERR_PTRS is not advertised.  Optimise the
> workaround path by dropping the data-dependent unpredictable conditions which
> will evalute to true for all 64bit OSes and most 32bit ones.

I definitely don't buy the "all 64bit OSes" part here: Anyone doing
full 80-bit FP operations will have to use the FPU, and hence may
want to have some unmasked exceptions. I'm also not sure why you
call them "unpredictable": If all (or most) cases match, the branch
there could be pretty well predicted (subject of course to capacity).

All in all I'd prefer if the conditions remained in place; my minimal
request would be for there to be a comment why there's no evaluation
of FSW/FCW.

> --- a/xen/arch/x86/i387.c
> +++ b/xen/arch/x86/i387.c
> @@ -43,20 +43,17 @@ static inline void fpu_fxrstor(struct vcpu *v)
>  const typeof(v->arch.xsave_area->fpu_sse) *fpu_ctxt = v->arch.fpu_ctxt;
>  
>  /*
> - * AMD CPUs don't save/restore FDP/FIP/FOP unless an exception
> + * Some CPUs don't save/restore FDP/FIP/FOP unless an exception

Are there any non-AMD CPUs known to have this issue? If not, is
there a particular reason you don't say "Some AMD CPUs ..."?

>   * is pending. Clear the x87 state here by setting it to fixed
>   * values. The hypervisor data segment can be sometimes 0 and
>   * sometimes new user value. Both should be ok. Use the FPU saved
>   * data block as a safe address because it should be in L1.
>   */
> -if ( !(fpu_ctxt->fsw & ~fpu_ctxt->fcw & 0x003f) &&
> - boot_cpu_data.x86_vendor == X86_VENDOR_AMD )
> -{
> +if ( cpu_bug_fpu_ptr_leak )
>  asm volatile ( "fnclex\n\t"
> "ffree %%st(7)\n\t" /* clear stack tag */
> "fildl %0"  /* load to clear state */
> : : "m" (*fpu_ctxt) );

If here and in the respective xsave instance you'd use alternatives
patching, I wouldn't mind the use of a X86_BUG_* for this (as made
possible by patch 1). But as said before, just like for synthetic
features I strongly think we should use simple boolean variables
when using them only in if()-s. Use of the feature(/bug) machinery
is needed only to not further complicate alternatives patching.

> @@ -169,11 +166,10 @@ static inline void fpu_fxsave(struct vcpu *v)
> : "=m" (*fpu_ctxt) : "R" (fpu_ctxt) );
>  
>  /*
> - * AMD CPUs don't save/restore FDP/FIP/FOP unless an exception
> - * is pending.
> + * Some CPUs don't save/restore FDP/FIP/FOP unless an exception is
> + * pending.  The restore code fills in suitable defaults.
>   */
> -if ( !(fpu_ctxt->fsw & 0x0080) &&
> - boot_cpu_data.x86_vendor == X86_VENDOR_AMD )
> +if ( cpu_bug_fpu_ptr_leak && !(fpu_ctxt->fsw & 0x0080) )
>  return;

The comment addition seems a little unmotivated: The code here isn't
about leaking data, but about having valid data to consume (down
from here). With this, keying the return to cpu_bug_* also doesn't
look very nice, but I admit I can't suggest a better alternative
(other than leaving the vendor check in place and checking the
X86_FEATURE_RSTR_FP_ERR_PTRS bit).

An option might be to give the construct a different name, without
"leak" in it (NO_FP_ERR_PTRS?).

Jan

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 3/3] x86: a little bit of 16-bit video mode setting code cleanup

2019-08-29 Thread Andrew Cooper

On 14/06/2019 12:38, Jan Beulich wrote:
> To "compensate" for the code size growth by an earlier change:
> - drop "trampoline" labels (in almost all cases the target label is
>   reachable with an 8-bit-displacement branch anyway, and a single 16-
>   bit-displacement branch is still better than a pair of two branches)

Do you happen to know why we any to start with?  It can't plausibly be
for manual code alignment reasons.

(probably) whatever the reason, good riddance.

> - drop an entirely dead insn
> - reduce code size in a few other (obvious I hope) cases, by more
>   suitable insn/operands selection

I'm afraid these are rather hard to identify, given no hints.  Comments
in line.

> Also drop redundant #define-s (move suitable #include a little earlier
> instead) and add two alignment directives.
>
> Signed-off-by: Jan Beulich 
>
> --- a/xen/arch/x86/boot/trampoline.S
> +++ b/xen/arch/x86/boot/trampoline.S
> @@ -176,6 +176,7 @@ start64:
>  
>  jmpq*%rdi
>  
> +#include "video.h"
>  #include "wakeup.S"
>  
>  .balign 8
> @@ -283,8 +284,6 @@ trampoline_boot_cpu_entry:
>  /* Jump to the common bootstrap entry point. */
>  jmp trampoline_protmode_entry
>  
> -#include "video.h"
> -
>  .align  2
>  /* Keep in sync with cmdline.c:early_boot_opts_t type! */
>  early_boot_opts:
> --- a/xen/arch/x86/boot/video.S
> +++ b/xen/arch/x86/boot/video.S
> @@ -384,9 +384,6 @@ lmbad:  leawbootsym(unknt), %si
>  jmp mode_menu
>  lmdef:  ret
>  
> -_setrec:jmp setrec  # Ugly...
> -_set_80x25: jmp set_80x25
> -
>  # Setting of user mode (AX=mode ID) => CF=success
>  mode_set:
>  movw%ax, bootsym(boot_vid_mode)
> @@ -396,7 +393,7 @@ mode_set:
>  je  setvesabysize
>  
>  testb   $VIDEO_RECALC>>8, %ah
> -jnz _setrec
> +jnz setrec
>  
>  cmpb$VIDEO_FIRST_SPECIAL>>8, %ah
>  jz  setspc
> @@ -421,7 +418,7 @@ setspc: xorb%bh, %bh
>  
>  setmenu:
>  orb %al, %al# 80x25 is an exception
> -jz  _set_80x25
> +jz  set_80x25
>  
>  pushw   %bx # Set mode chosen from menu
>  callmode_table  # Build the mode table
> @@ -441,36 +438,32 @@ check_vesa:
>  cmpw$0x004f, %ax
>  jnz setbad
>  
> -leawvesa_mode_info, %di
> -subb$VIDEO_FIRST_VESA>>8, %bh
> -movw%bx, %cx# Get mode information structure
> +leawvesa_mode_info, %di # Get mode information structure
> +leaw-VIDEO_FIRST_VESA(%bx), %cx
>  movw$0x4f01, %ax
>  int $0x10
> -addb$VIDEO_FIRST_VESA>>8, %bh

Is this the redundant instruction you are talking about, or ... (near
the end)?

I think I follow this as "no logical change", by leaving %bx intact
throughout, and only clearing %ch as part of the %bx=>%cx copy.

>  cmpw$0x004f, %ax
>  jnz setbad
>  
>  movb(%di), %al  # Check mode attributes.
>  andb$0x99, %al
>  cmpb$0x99, %al
> -jnz _setbad # Doh! No linear frame buffer.
> +jnz setbad  # Doh! No linear frame buffer.
>  
>  pushw   %bx
>  subb$VIDEO_FIRST_VESA>>8, %bh
> -orw $0x4000, %bx# Use linear frame buffer
> +orb $0x40, %bh  # Use linear frame buffer
>  movw$0x4f02, %ax# VESA BIOS mode set call
>  int $0x10
>  popw%bx
>  cmpw$0x004f, %ax# AL=4f if implemented
> -jnz _setbad # AH=0 if OK
> +jnz setbad  # AH=0 if OK
>  
>  movb$1, bootsym(graphic_mode)  # flag graphic mode
>  movw%bx, bootsym(video_mode)
>  stc
>  ret
>  
> -_setbad: jmpsetbad  # Ugly...
> -
>  # Recalculate vertical display end registers -- this fixes various
>  # inconsistencies of extended modes on many adapters. Called when
>  # the VIDEO_RECALC flag is set in the mode ID.
> @@ -515,7 +508,7 @@ setvesabysize:
>  leawmodelist,%si
>  1:  add $8,%si
>  cmpw$ASK_VGA,-8(%si)# End?
> -je  _setbad
> +je  setbad
>  movw-6(%si),%ax
>  cmpw%ax,bootsym(vesa_size)+0
>  jne 1b
> @@ -948,6 +941,7 @@ store_edid:
>  #endif
>  ret
>  
> +.p2align 1
>  mt_end: .word   0   # End of video mode table if built
>  edit_buf:   .space  6   # Line editor buffer
>  card_name:  .word   0   # Pointer to adapter name
> @@ -991,6 +985,7 @@ vesa_name:  .asciz  "VESA"
>  
>  name_bann:  .asciz  "Video adapter: "
>  
> +.p2align 1
>  force_size: .word   0   # Use this size

Re: [Xen-devel] [PATCH v6 10/10] introduce a 'passthrough' configuration option to xl.cfg...

2019-08-29 Thread Jan Beulich

On 16.08.2019 19:20, Paul Durrant wrote:
> ...and hence the ability to disable IOMMU mappings, and control EPT
> sharing.
> 
> This patch introduces a new 'libxl_passthrough' enumeration into
> libxl_domain_create_info. The value will be set by xl either when it parses
> a new 'passthrough' option in xl.cfg, or implicitly if there is passthrough
> hardware specified for the domain.
> 
> If the value of the passthrough configuration option is 'disabled' then
> the XEN_DOMCTL_CDF_iommu flag will be clear in the xen_domctl_createdomain
> flags, thus allowing the toolstack to control whether the domain gets
> IOMMU mappings or not (where previously they were globally set).
> 
> If the value of the passthrough configuration option is 'sync_pt' then
> a new 'iommu_opts' field in xen_domctl_createdomain will be set with the
> value XEN_DOMCTL_IOMMU_no_sharept. This will override the global default
> set in iommu_hap_pt_share, thus allowing the toolstack to control whether
> EPT sharing is used for the domain.
> 
> Signed-off-by: Paul Durrant 

Reviewed-by: Jan Beulich 
with a question/suggestion and a nit:

> --- a/xen/common/domain.c
> +++ b/xen/common/domain.c
> @@ -308,6 +308,13 @@ static int sanitise_domain_config(struct 
> xen_domctl_createdomain *config)
>  return -EINVAL;
>  }
>  
> +if ( !(config->flags & XEN_DOMCTL_CDF_iommu) && config->iommu_opts )
> +{
> +dprintk(XENLOG_INFO,
> +"IOMMU options specified but IOMMU not enabled\n");
> +return -EINVAL;
> +}

Seeing this I wonder whether XEN_DOMCTL_CDF_iommu wouldn't better be
bit 0 of iommu_opts.

> @@ -70,6 +70,10 @@ struct xen_domctl_createdomain {
>  
>  uint32_t flags;
>  
> +#define _XEN_DOMCTL_IOMMU_no_sharept  0
> +#define XEN_DOMCTL_IOMMU_no_sharept   (1U<<_XEN_DOMCTL_IOMMU_no_sharept)

Please can you add the missing blanks around << ?

Jan

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2 08/12] livepatch: Add support for inline asm hotpatching expectations

2019-08-29 Thread Konrad Rzeszutek Wilk

> diff --git a/xen/test/livepatch/Makefile b/xen/test/livepatch/Makefile
> index 23113d3418..067861903f 100644
> --- a/xen/test/livepatch/Makefile
> +++ b/xen/test/livepatch/Makefile
> @@ -27,6 +27,8 @@ LIVEPATCH_ACTION_HOOKS_NOFUNC := 
> xen_action_hooks_nofunc.livepatch
>  LIVEPATCH_ACTION_HOOKS_MARKER:= xen_action_hooks_marker.livepatch
>  LIVEPATCH_ACTION_HOOKS_NOAPPLY:= xen_action_hooks_noapply.livepatch
>  LIVEPATCH_ACTION_HOOKS_NOREVERT:= xen_action_hooks_norevert.livepatch
> +LIVEPATCH_EXPECTATIONS:= xen_expectations.livepatch
> +LIVEPATCH_EXPECTATIONS_FAIL:= xen_expectations_fail.livepatch
>  
>  LIVEPATCHES += $(LIVEPATCH)
>  LIVEPATCHES += $(LIVEPATCH_BYE)
> @@ -40,6 +42,8 @@ LIVEPATCHES += $(LIVEPATCH_ACTION_HOOKS_NOFUNC)
>  LIVEPATCHES += $(LIVEPATCH_ACTION_HOOKS_MARKER)
>  LIVEPATCHES += $(LIVEPATCH_ACTION_HOOKS_NOAPPLY)
>  LIVEPATCHES += $(LIVEPATCH_ACTION_HOOKS_NOREVERT)
> +LIVEPATCHES += $(LIVEPATCH_EXPECTATIONS)
> +LIVEPATCHES += $(LIVEPATCH_EXPECTATIONS_FAIL)
>  
>  LIVEPATCH_DEBUG_DIR ?= $(DEBUG_DIR)/xen-livepatch
>  
> @@ -54,7 +58,7 @@ uninstall:
>  
>  .PHONY: clean
>  clean::
> - rm -f *.o .*.o.d *.livepatch config.h
> + rm -f *.o .*.o.d *.livepatch config.h expect_config.h
>  
>  #
>  # To compute these values we need the binary files: xen-syms
> @@ -182,8 +186,27 @@ xen_actions_hooks_norevert.o: config.h
>  $(LIVEPATCH_ACTION_HOOKS_NOREVERT): xen_action_hooks_marker.o 
> xen_hello_world_func.o note.o xen_note.o
>   $(LD) $(LDFLAGS) $(build_id_linker) -r -o 
> $(LIVEPATCH_ACTION_HOOKS_NOREVERT) $^
>  
> +CODE_GET_EXPECT=$(shell objdump -d --insn-width=1 $(1) | grep -A6 -E 
> '<'$(2)'>:' | tail -n +2 | awk 'BEGIN {printf "{"} {printf "0x%s,", $$2}' | 
> sed 's/,$$/}/g')
> +.PHONY: expect_config.h
> +expect_config.h: EXPECT_BYTES=$(call 
> CODE_GET_EXPECT,$(BASEDIR)/xen-syms,xen_extra_version)
> +expect_config.h: EXPECT_BYTES_COUNT=6
> +expect_config.h: xen_expectations.o
> + (set -e; \
> +  echo "#define EXPECT_BYTES $(EXPECT_BYTES)"; \
> + echo "#define EXPECT_BYTES_COUNT $(EXPECT_BYTES_COUNT)") > $@
> +
> +xen_expectations.o: expect_config.h
> +
> +.PHONY: $(LIVEPATCH_EXPECTATIONS)
> +$(LIVEPATCH_EXPECTATIONS): xen_expectations.o xen_hello_world_func.o note.o 
> xen_note.o
> + $(LD) $(LDFLAGS) $(build_id_linker) -r -o $(LIVEPATCH_EXPECTATIONS) $^
> +
> +.PHONY: $(LIVEPATCH_EXPECTATIONS_FAIL)
> +$(LIVEPATCH_EXPECTATIONS_FAIL): xen_expectations_fail.o 
> xen_hello_world_func.o note.o xen_note.o
> + $(LD) $(LDFLAGS) $(build_id_linker) -r -o 
> $(LIVEPATCH_EXPECTATIONS_FAIL) $^
> +
>  .PHONY: livepatch
>  livepatch: $(LIVEPATCH) $(LIVEPATCH_BYE) $(LIVEPATCH_REPLACE) 
> $(LIVEPATCH_NOP) $(LIVEPATCH_NO_XEN_BUILDID) \
> $(LIVEPATCH_PREPOST_HOOKS) $(LIVEPATCH_PREPOST_HOOKS_FAIL) 
> $(LIVEPATCH_ACTION_HOOKS) \
> $(LIVEPATCH_ACTION_HOOKS_NOFUNC) $(LIVEPATCH_ACTION_HOOKS_MARKER) 
> $(LIVEPATCH_ACTION_HOOKS_NOAPPLY) \
> -   $(LIVEPATCH_ACTION_HOOKS_NOREVERT)
> +   $(LIVEPATCH_ACTION_HOOKS_NOREVERT) $(LIVEPATCH_EXPECTATIONS) 
> $(LIVEPATCH_EXPECTATIONS_FAIL)
> diff --git a/xen/test/livepatch/xen_expectations.c 
> b/xen/test/livepatch/xen_expectations.c
> new file mode 100644
> index 00..c8175a458b
> --- /dev/null
> +++ b/xen/test/livepatch/xen_expectations.c
> @@ -0,0 +1,41 @@
> +/*
> + * Copyright (c) 2019 Amazon.com, Inc. or its affiliates. All rights 
> reserved.
> + *
> + */
> +
> +#include "expect_config.h"
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#include 
> +
> +static const char livepatch_exceptions_str[] = "xen_extra_version";
> +extern const char *xen_hello_world(void);
> +
> +struct livepatch_func __section(".livepatch.funcs") livepatch_exceptions = {
> +.version = LIVEPATCH_PAYLOAD_VERSION,
> +.name = livepatch_exceptions_str,
> +.new_addr = xen_hello_world,
> +.old_addr = xen_extra_version,
> +.new_size = EXPECT_BYTES_COUNT,
> +.old_size = EXPECT_BYTES_COUNT,
> +.expect = {
> +.enabled = 1,
> +.len = EXPECT_BYTES_COUNT,
> +.data = EXPECT_BYTES
> +},
> +
> +};

When I compile with 32-bit ARM 'make tests' I get:

arm-eabi-ld-EL -EL --build-id=sha1 -r -o 
xen_action_hooks_norevert.livepatch xen_action_hooks_marker.o 
xen_hello_world_func.o note.o xen_note.o
make[3]: Circular expect_config.h <- xen_expectations.o dependency dropped.
objdump: can't disassemble for architecture UNKNOWN!

(set -e; \
 echo "#define EXPECT_BYTES {"; \
 echo "#define EXPECT_BYTES_COUNT 6") > expect_config.h
arm-eabi-gcc -marm -DBUILD_ID -fno-strict-aliasing -std=gnu99 -Wall 
-Wstrict-prototypes -Wdeclaration-after-statement -Wno-unused-but-set-variable 
-Wno-unused-local-typedefs   -O1 -fno-omit-frame-pointer -nostdinc -fno-builtin 
-fno-common -Werror -Wredundant-decls -Wno-pointer-arith -Wvla -pipe -D__XEN__ 
-include /home/konrad/A20/xen.git/xen/include/xen/config.h 
'-D__OBJECT_FILE__="xen_expectations.o"'

Re: [Xen-devel] [PATCH] x86: clear RDRAND CPUID bit on AMD family 15h/16h

2019-08-29 Thread Juergen Gross


On 29.08.19 13:42, Andrew Cooper wrote:

On 29/08/2019 10:28, Jan Beulich wrote:

Inspired by Linux commit c49a0a80137c7ca7d6ced4c812c9e07a949f6f24:

 There have been reports of RDRAND issues after resuming from suspend on
 some AMD family 15h and family 16h systems. This issue stems from a BIOS
 not performing the proper steps during resume to ensure RDRAND continues
 to function properly.

 Update the CPU initialization to clear the RDRAND CPUID bit for any family
 15h and 16h processor that supports RDRAND. If it is known that the family
 15h or family 16h system does not have an RDRAND resume issue or that the
 system will not be placed in suspend, the "cpuid=rdrand" kernel parameter
 can be used to stop the clearing of the RDRAND CPUID bit.

 Note, that clearing the RDRAND CPUID bit does not prevent a processor
 that normally supports the RDRAND instruction from executing it. So any
 code that determined the support based on family and model won't #UD.


Yeah, but it will no longer see random numbers after resume.



Signed-off-by: Jan Beulich 
---
Slightly RFC, in particular because of the change to parse_xen_cpuid():
Alternative approach suggestions are welcome.


This issue was on my radar, but I hadn't got around to starting a patch yet.

I was wondering if we could perhaps default it to whether S3 was
available, but looking at the code which prints "ACPI sleep modes: S3",
it doesn't appear to be related to any real ACPI information.


Wouldn't it make more sense to inhibit suspend/resume instead?

I'm quite sure a machine running Xen is very rarely put to S3.


Juergen

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [OSSTEST PATCH] make-hosts-flight: Prefer stretch to jessie

2019-08-29 Thread Ian Jackson

The list of suites in ALL_SUITES is in decreasing order of preference:
it gets passed to cs-hosts-list, where the order is significant.

Without this patch, we try to commission hosts by running jessie if
the host flags seem to say jessie would be supported.  That's not
really sensible.

CC: Dominic Brekau 
Signed-off-by: Ian Jackson 
---
 make-hosts-flight | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/make-hosts-flight b/make-hosts-flight
index f7422932..92da1c7c 100755
--- a/make-hosts-flight
+++ b/make-hosts-flight
@@ -26,7 +26,8 @@ blessing=$4
 buildflight=$5
 
 : ${ALL_ARCHES:=amd64 i386 arm64 armhf}
-: ${ALL_SUITES:=jessie stretch}
+: ${ALL_SUITES:=stretch jessie}
+# ^ most preferred suite first
 
 : ${PERHOST_MAXWAIT:=2} # seconds
 
-- 
2.11.0


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v6 07/10] use is_iommu_enabled() where appropriate...

2019-08-29 Thread Jan Beulich

On 16.08.2019 19:19, Paul Durrant wrote:
> ...rather than testing the global iommu_enabled flag and ops pointer.
> 
> Now that there is a per-domain flag indicating whether the domain is
> permitted to use the IOMMU (which determines whether the ops pointer will
> be set), many tests of the global iommu_enabled flag and ops pointer can
> be translated into tests of the per-domain flag. Some of the other tests of
> purely the global iommu_enabled flag can also be translated into tests of
> the per-domain flag.
> 
> NOTE: The comment in iommu_share_p2m_table() is also fixed; need_iommu()
>   disappeared some time ago. Also, whilst the style of the 'if' in
>   flask_iommu_resource_use_perm() is fixed, I have not translated any
>   instances of u32 into uint32_t to keep consistency. IMO such a
>   translation would be better done globally for the source module in
>   a separate patch.
>   The change in the initialization of the 'hd' variable in iommu_map()
>   and iommu_unmap() is done to keep the PV shim build happy. Without
>   this change it will fail to compile with errors of the form:
> 
> iommu.c:361:32: error: unused variable ‘hd’ [-Werror=unused-variable]
>  const struct domain_iommu *hd = dom_iommu(d);
>  ^~

Why would that be? Afaict there's no short-circuiting of
is_iommu_enabled() (albeit there could be, as we don't mean the shim
to have/use an IOMMU). Oh - I guess I see it: Instead of this change
what I think we want is for x86's iommu_call() to evaluate its 1st
argument. Otherwise we're liable to run into the same issue elsewhere
again.

> --- a/xen/xsm/flask/hooks.c
> +++ b/xen/xsm/flask/hooks.c
> @@ -886,7 +886,7 @@ static int flask_map_domain_msi (struct domain *d, int 
> irq, const void *data,
>  #endif
>  }
>  
> -static u32 flask_iommu_resource_use_perm(void)
> +static u32 flask_iommu_resource_use_perm(struct domain *d)

const?

With these adjustments
Reviewed-by: Jan Beulich 

Jan

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v6 08/10] remove late (on-demand) construction of IOMMU page tables

2019-08-29 Thread Jan Beulich

On 16.08.2019 19:19, Paul Durrant wrote:
> --- a/xen/drivers/passthrough/iommu.c
> +++ b/xen/drivers/passthrough/iommu.c
> @@ -146,6 +146,17 @@ static int __init parse_dom0_iommu_param(const char *s)
>  }
>  custom_param("dom0-iommu", parse_dom0_iommu_param);
>  
> +static void __hwdom_init check_hwdom_reqs(struct domain *d)

This really should have const, but I realize ...

> +{
> +if ( iommu_hwdom_none || !paging_mode_translate(d) )
> +return;
> +
> +arch_iommu_check_autotranslated_hwdom(d);

... this one wants non-const (for - afaict - no reason).

> @@ -159,129 +170,44 @@ int iommu_domain_init(struct domain *d)
>  return ret;
>  
>  hd->platform_ops = iommu_get_ops();
> -return hd->platform_ops->init(d);
> -}
> +ret = hd->platform_ops->init(d);
> +if ( ret )
> +return ret;
>  
> -static void __hwdom_init check_hwdom_reqs(struct domain *d)
> -{
> -if ( iommu_hwdom_none || !paging_mode_translate(d) )
> -return;
> +/*
> + * NB: 'relaxed' h/w domains don't need the IOMMU mappings to be kept
> + * in-sync with their assigned pages because all host RAM will be
> + * mapped during hwdom_init().
> + */

Doesn't this comment belong to ...

> +if ( is_hardware_domain(d) )
> +check_hwdom_reqs(d); /* may modify iommu_hwdom_strict */
>  
> -arch_iommu_check_autotranslated_hwdom(d);
> +if ( !is_hardware_domain(d) || iommu_hwdom_strict )
> +hd->need_sync = !iommu_use_hap_pt(d);

... this if()?

> @@ -629,8 +552,7 @@ static void iommu_dump_p2m_table(unsigned char key)
>  ops = iommu_get_ops();
>  for_each_domain(d)
>  {
> -if ( is_hardware_domain(d) ||
> - dom_iommu(d)->status < IOMMU_STATUS_initialized )
> +if ( !is_iommu_enabled(d) )
>  continue;

Didn't you agree to retain the hwdom part of the condition here?

Jan

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v9 13/15] x86/microcode: Synchronize late microcode loading

2019-08-29 Thread Jan Beulich

On 19.08.2019 03:25, Chao Gao wrote:
> @@ -232,6 +276,34 @@ bool microcode_update_cache(struct microcode_patch 
> *patch)
>  return true;
>  }
>  
> +/* Wait for a condition to be met with a timeout (us). */
> +static int wait_for_condition(int (*func)(void *data), void *data,
> + unsigned int timeout)
> +{
> +while ( !func(data) )
> +{
> +if ( !timeout-- )
> +{
> +printk("CPU%u: Timeout in %pS\n",
> +   smp_processor_id(), __builtin_return_address(0));
> +return -EBUSY;
> +}
> +udelay(1);
> +}
> +
> +return 0;
> +}
> +
> +static int wait_cpu_callin(void *nr)
> +{
> +return cpumask_weight(_callin_map) >= (unsigned long)nr;
> +}
> +
> +static int wait_cpu_callout(void *nr)
> +{
> +return atomic_read(_out) >= (unsigned long)nr;
> +}

Since wait_for_condition() is used with only these two functions
as callbacks, they should imo return bool and take const void *.

> @@ -265,37 +337,155 @@ static int microcode_update_cpu(const struct 
> microcode_patch *patch)
>  return err;
>  }
>  
> -static long do_microcode_update(void *patch)
> +static int slave_thread_fn(void)
> +{
> +unsigned int cpu = smp_processor_id();
> +unsigned int master = cpumask_first(this_cpu(cpu_sibling_mask));
> +
> +while ( loading_state != LOADING_CALLIN )
> +cpu_relax();
> +
> +cpumask_set_cpu(cpu, _callin_map);
> +
> +while ( loading_state != LOADING_EXIT )
> +cpu_relax();
> +
> +/* Copy update revision from the "master" thread. */
> +this_cpu(cpu_sig).rev = per_cpu(cpu_sig, master).rev;
> +
> +return 0;
> +}
> +
> +static int master_thread_fn(const struct microcode_patch *patch)
> +{
> +unsigned int cpu = smp_processor_id();
> +int ret = 0;
> +
> +while ( loading_state != LOADING_CALLIN )
> +cpu_relax();
> +
> +cpumask_set_cpu(cpu, _callin_map);
> +
> +while ( loading_state != LOADING_ENTER )
> +cpu_relax();
> +
> +/*
> + * If an error happened, control thread would set 'loading_state'
> + * to LOADING_EXIT. Don't perform ucode loading for this case
> + */
> +if ( loading_state == LOADING_EXIT )
> +return ret;

Even if the producer transitions this through ENTER to EXIT, the
observer here may never get to see the ENTER state, and hence
never exit the loop above. You want either < ENTER or == CALLIN.

> +ret = microcode_ops->apply_microcode(patch);
> +if ( !ret )
> +atomic_inc(_updated);
> +atomic_inc(_out);
> +
> +while ( loading_state != LOADING_EXIT )
> +cpu_relax();
> +
> +return ret;
> +}

As a cosmetic remark, I don't think "master" and "slave" are
suitable terms here. "primary" and "secondary" would imo come
closer to what the threads' relationship is.

> +static int control_thread_fn(const struct microcode_patch *patch)
>  {
> -unsigned int cpu;
> +unsigned int cpu = smp_processor_id(), done;
> +unsigned long tick;
> +int ret;
>  
> -/* Store the patch after a successful loading */
> -if ( !microcode_update_cpu(patch) && patch )
> +/* Allow threads to call in */
> +loading_state = LOADING_CALLIN;
> +smp_mb();

Why not just smp_wmb()? (Same further down then.)

> +cpumask_set_cpu(cpu, _callin_map);
> +
> +/* Waiting for all threads calling in */
> +ret = wait_for_condition(wait_cpu_callin,
> + (void *)(unsigned long)num_online_cpus(),
> + MICROCODE_CALLIN_TIMEOUT_US);
> +if ( ret ) {

Misplaced brace.

> +static int do_microcode_update(void *patch)

const?

> @@ -326,19 +523,67 @@ int microcode_update(XEN_GUEST_HANDLE_PARAM(const_void) 
> buf, unsigned long len)
>  {
>  ret = PTR_ERR(patch);
>  printk(XENLOG_INFO "Parsing microcode blob error %d\n", ret);
> -goto free;
> +goto put;
>  }
>  
>  if ( !patch )
>  {
>  printk(XENLOG_INFO "No ucode found. Update aborted!\n");
>  ret = -EINVAL;
> -goto free;
> +goto put;
> +}
> +
> +cpumask_clear(_callin_map);
> +atomic_set(_out, 0);
> +atomic_set(_updated, 0);
> +loading_state = LOADING_PREPARE;
> +
> +/* Calculate the number of online CPU core */
> +nr_cores = 0;
> +for_each_online_cpu(cpu)
> +if ( cpu == cpumask_first(per_cpu(cpu_sibling_mask, cpu)) )
> +nr_cores++;
> +
> +printk(XENLOG_INFO "%u cores are to update their microcode\n", nr_cores);
> +
> +/*
> + * We intend to disable interrupt for long time, which may lead to
> + * watchdog timeout.
> + */
> +watchdog_disable();
> +/*
> + * Late loading dance. Why the heavy-handed stop_machine effort?
> + *
> + * - HT siblings must be idle and not execute other code while the other
> + *   sibling is loading microcode in order to avoid any negative
> + *   interactions cause by the

Re: [Xen-devel] [PATCH] x86: properly gate clearing of PKU feature

2019-08-29 Thread Jan Beulich

On 29.08.2019 15:21, Andrew Cooper wrote:
> On 29/08/2019 14:13, Jan Beulich wrote:
>> On 29.08.2019 12:46, Andrew Cooper wrote:
>>> On 29/08/2019 10:27, Jan Beulich wrote:
 setup_clear_cpu_cap() is __init and hence may not be called post-boot.
 Note that opt_pku nevertheless is not getting __initdata added - see
 e.g. commit 43fa95ae6a ("mm: make opt_bootscrub non-init").

 Signed-off-by: Jan Beulich 
>>> Acked-by: Andrew Cooper 
>>>
>>> However, I'm tempted to suggest that this logic disappears entirely. 
>>> Given that we don't support it for PV guests, and can't without taking a
>>> CR4 write on toggle_kernel_mode(), all this actually controls is whether
>>> the bit finds its way into the HVM max policy.
>> Well, if you mean replacing opt_pku by an addition to
>> parse_xen_cpuid(), then I'll be happy to do that as a follow-on.
>> If there is another direction you've been considering, then
>> you'd have to at least outline it.
> 
> Well - I did say "disappear entirely".  I don't see it as having, or
> liable to gain, any useful purpose.
> 
> In reality, it was subsumed into cpuid= by my patch to allow all bits to
> be tweak-able.

Did you post this patch already (I don't seem to recall; I only recall
there being the plan to do this)?

Jan

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v6 09/10] iommu: tidy up iommu_use_hap_pt() and need_iommu_pt_sync() macros

2019-08-29 Thread Jan Beulich

On 16.08.2019 19:20, Paul Durrant wrote:
> --- a/xen/drivers/passthrough/iommu.c
> +++ b/xen/drivers/passthrough/iommu.c
> @@ -102,8 +102,10 @@ static int __init parse_iommu_param(const char *s)
>  iommu_hwdom_passthrough = val;
>  else if ( (val = parse_boolean("dom0-strict", s, ss)) >= 0 )
>  iommu_hwdom_strict = val;
> +#ifndef CONFIG_ARM
>  else if ( (val = parse_boolean("sharept", s, ss)) >= 0 )
>  iommu_hap_pt_share = val;
> +#endif
>  else
>  rc = -EINVAL;

I think you/we should go further here: Arm should #define this to
true, and here we should have "#ifndef iommu_hap_pt_share".

> --- a/xen/include/xen/iommu.h
> +++ b/xen/include/xen/iommu.h
> @@ -268,6 +268,17 @@ struct domain_iommu {
>  #define iommu_set_feature(d, f)   set_bit(f, dom_iommu(d)->features)
>  #define iommu_clear_feature(d, f) clear_bit(f, dom_iommu(d)->features)
>  
> +/* Are we using the domain P2M table as its IOMMU pagetable? */
> +#define iommu_use_hap_pt(d) \
> +(hap_enabled(d) && is_iommu_enabled(d) && iommu_hap_pt_share)
> +
> +/* Does the IOMMU pagetable need to be kept synchronized with the P2M */
> +#ifdef CONFIG_HAS_PASSTHROUGH
> +#define need_iommu_pt_sync(d) (dom_iommu(d)->need_sync)
> +#else
> +#define need_iommu_pt_sync(d) false

I think you'd better evaluate d here; one (somewhat in risk of
opposition) variant would be

#define need_iommu_pt_sync(d) (!(d))

Jan

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 3/3] x86: a little bit of 16-bit video mode setting code cleanup

2019-08-29 Thread Jan Beulich

On 29.08.2019 16:08, Andrew Cooper wrote:
> On 14/06/2019 12:38, Jan Beulich wrote:
>> To "compensate" for the code size growth by an earlier change:
>> - drop "trampoline" labels (in almost all cases the target label is
>>   reachable with an 8-bit-displacement branch anyway, and a single 16-
>>   bit-displacement branch is still better than a pair of two branches)
> 
> Do you happen to know why we any to start with?  It can't plausibly be
> for manual code alignment reasons.

I have no idea - my guess is that some pre-386 code was cloned, or
it was written by someone used to the pre-386 limitations.

>> @@ -421,7 +418,7 @@ setspc: xorb%bh, %bh
>>  
>>  setmenu:
>>  orb %al, %al# 80x25 is an exception
>> -jz  _set_80x25
>> +jz  set_80x25
>>  
>>  pushw   %bx # Set mode chosen from menu
>>  callmode_table  # Build the mode table
>> @@ -441,36 +438,32 @@ check_vesa:
>>  cmpw$0x004f, %ax
>>  jnz setbad
>>  
>> -leawvesa_mode_info, %di
>> -subb$VIDEO_FIRST_VESA>>8, %bh
>> -movw%bx, %cx# Get mode information structure
>> +leawvesa_mode_info, %di # Get mode information structure
>> +leaw-VIDEO_FIRST_VESA(%bx), %cx
>>  movw$0x4f01, %ax
>>  int $0x10
>> -addb$VIDEO_FIRST_VESA>>8, %bh
> 
> Is this the redundant instruction you are talking about, or ... (near
> the end)?

No, here it's simply ...

> I think I follow this as "no logical change", by leaving %bx intact
> throughout, and only clearing %ch as part of the %bx=>%cx copy.

... as you say.

>> --- a/xen/arch/x86/boot/wakeup.S
>> +++ b/xen/arch/x86/boot/wakeup.S
>> @@ -30,7 +30,7 @@ ENTRY(wakeup_start)
>>  jne bogus_real_magic
>>  
>>  # for acpi_sleep=s3_bios
>> -testl   $1, wakesym(video_flags)
>> +testb   $1, wakesym(video_flags)
> 
> video_flags is currently .long, and, AFAICT, uses 2 bits even after this
> series.  We could get better code reduction by shrinking it to .byte.

Well, the goal of this patch is to play with the assembly code. To
do what you suggest I'd have to touch a C (or really .h) file as
well. I'm fine doing this change though, but preferably in a separate
patch.

>> @@ -38,9 +38,9 @@ ENTRY(wakeup_start)
>>  movw%ax, %ss# Need this? How to ret if clobbered?
>>  
>>  1:  # for acpi_sleep=s3_mode
>> -testl   $2, wakesym(video_flags)
>> +testb   $2, wakesym(video_flags)
>>  jz  1f
>> -movlwakesym(video_mode), %eax
>> +movwwakesym(video_mode), %ax
> 
> Similarly, video_mode can become .word, I think.

But a word is less efficient to access (because of the operand size
override), so I'd prefer to not shrink this one. Just let me know
whether you agree, and I'll cook up a patch accordingly.

>> @@ -55,48 +55,26 @@ ENTRY(wakeup_start)
>>  lmsw%ax # Turn on CR0.PE 
>>  ljmpl   $BOOT_CS32, $bootsym_rel(wakeup_32, 6)
>>  
>> -/* This code uses an extended set of video mode numbers. These include:
>> - * Aliases for standard modes
>> - *  NORMAL_VGA (-1)
>> - *  EXTENDED_VGA (-2)
>> - *  ASK_VGA (-3)
>> - * Video modes numbered by menu position -- NOT RECOMMENDED because of lack
>> - * of compatibility when extending the table. These are between 0x00 and 
>> 0xff.
>> - */
>> -#define VIDEO_FIRST_MENU 0x
>> -
>> -/* Standard BIOS video modes (BIOS number + 0x0100) */
>> -#define VIDEO_FIRST_BIOS 0x0100
>> -
>> -/* VESA BIOS video modes (VESA number + 0x0200) */
>> -#define VIDEO_FIRST_VESA 0x0200
>> -
>> -/* Video7 special modes (BIOS number + 0x0900) */
>> -#define VIDEO_FIRST_V7 0x0900
>> -
>>  # Setting of user mode (AX=mode ID) => CF=success
>>  mode_setw:
>>  movw%ax, %bx
>>  cmpb$VIDEO_FIRST_VESA>>8, %ah
>>  jnc check_vesaw
>> -decb%ah
> 
> ... or is this the no functional change?  If so, I'm not sure I agree,
> given the clc below.

How does the CLC matter? CF, as the comment says, indicates success.
Whether or not there's a DEC ahead of it (which doesn't even alter
CF) doesn't matter, does it?

In any event - yes, that's the dead insn. I'll mention the function
name in the description.

Jan

>>  setbadw: clc
>>  ret

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH RFC 2/3] x86/ACPI: restore VESA mode upon resume from S3

2019-08-29 Thread Andrew Cooper

On 14/06/2019 12:37, Jan Beulich wrote:
> In order for "acpi_sleep=s3_mode" to have any effect, we should record
> the video mode we switched to during boot. Since right now there's mode
> setting code for VESA modes only in the resume case, record the mode
> just in that one case.
>
> Signed-off-by: Jan Beulich 
> ---
> RFC: On the box that I've been trying to test this on this didn't really
>  make a difference (in the random cases where resume works in the
>  first place there): The graphics card looks to remain powered off
>  even after the Dom0 kernel has resumed. Additionally using
>  "acpi_sleep=s3_bios" didn't make a difference either. Furthermore
>  it looks like the serial console (connected via PCI card) doesn't
>  work (yet) immediately after resume (I suppose it too is powered
>  down), and resume hangs altogether with it in use. Hence it's sort
>  of difficult to actually debug anything here.

While you were away, I had an awful time debugging c/s 7aae9c1c91, even
with COM1.

I think we should take some proactive steps to try and make the serial
settings more robust, so printk() does continue to function before the
console irq gets restored.

In the case of legacy COM1/COM2, this should be falling back to polled
mode which at least lets the characters get out, and for PCI serial
cards, we should probably make an attempt to bring it out of D3 ahead of
relying on dom0 to do this.

> ---
> I'm wondering actually whether the user having to explicitly request the
> mode restoration is a good model: Why would we _not_ want to restore the
> mode we've set during boot? In the worst case Dom0 kernel or X will
> change the mode another time.

I think I agree.  I can't see anything good which can come from offering
a choice here, and I am all for reducing the quantity of 16bit VGA code
we have.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [linux-next test] 140745: regressions - FAIL

2019-08-29 Thread osstest service owner

flight 140745 linux-next real [real]
http://logs.test-lab.xenproject.org/osstest/logs/140745/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-arm64-arm64-examine11 examine-serial/bootloader fail REGR. vs. 140676

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-libvirt   14 saverestore-support-check fail blocked in 140676
 test-armhf-armhf-libvirt-raw 13 saverestore-support-check fail blocked in 
140676
 test-amd64-i386-examine   8 reboot   fail  like 140676
 test-amd64-i386-libvirt   7 xen-boot fail  like 140676
 test-amd64-i386-qemut-rhel6hvm-intel  7 xen-boot  fail like 140676
 test-amd64-i386-libvirt-pair 10 xen-boot/src_hostfail  like 140676
 test-amd64-i386-libvirt-pair 11 xen-boot/dst_hostfail  like 140676
 test-amd64-i386-xl-qemuu-win10-i386  7 xen-boot   fail like 140676
 test-amd64-i386-qemut-rhel6hvm-amd  7 xen-bootfail like 140676
 test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm 7 xen-boot fail like 
140676
 test-amd64-i386-xl-shadow 7 xen-boot fail  like 140676
 test-amd64-i386-xl-qemuu-dmrestrict-amd64-dmrestrict 7 xen-boot fail like 
140676
 test-amd64-i386-xl-qemuu-debianhvm-amd64  7 xen-boot  fail like 140676
 test-amd64-i386-xl-xsm7 xen-boot fail  like 140676
 test-amd64-i386-xl7 xen-boot fail  like 140676
 test-amd64-i386-qemuu-rhel6hvm-amd  7 xen-bootfail like 140676
 test-amd64-i386-xl-qemut-debianhvm-amd64  7 xen-boot  fail like 140676
 test-amd64-i386-freebsd10-i386  7 xen-bootfail like 140676
 test-amd64-i386-xl-qemuu-debianhvm-i386-xsm  7 xen-boot   fail like 140676
 test-amd64-i386-xl-raw7 xen-boot fail  like 140676
 test-amd64-i386-pair 10 xen-boot/src_hostfail  like 140676
 test-amd64-i386-pair 11 xen-boot/dst_hostfail  like 140676
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm  7 xen-boot fail like 140676
 test-amd64-i386-xl-qemut-win10-i386  7 xen-boot   fail like 140676
 test-amd64-i386-xl-qemut-ws16-amd64  7 xen-boot   fail like 140676
 test-amd64-i386-libvirt-xsm   7 xen-boot fail  like 140676
 test-amd64-i386-xl-qemuu-debianhvm-amd64-shadow  7 xen-boot   fail like 140676
 test-amd64-i386-qemuu-rhel6hvm-intel  7 xen-boot  fail like 140676
 test-amd64-i386-xl-qemuu-ws16-amd64  7 xen-boot   fail like 140676
 test-amd64-i386-xl-qemut-debianhvm-i386-xsm  7 xen-boot   fail like 140676
 test-amd64-i386-xl-qemut-win7-amd64  7 xen-boot   fail like 140676
 test-amd64-i386-xl-pvshim 7 xen-boot fail  like 140676
 test-amd64-i386-freebsd10-amd64  7 xen-boot   fail like 140676
 test-amd64-i386-xl-qemuu-win7-amd64  7 xen-boot   fail like 140676
 test-amd64-i386-xl-qemuu-ovmf-amd64  7 xen-boot   fail like 140676
 test-amd64-amd64-xl-qemut-win7-amd64 17 guest-stopfail like 140676
 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stopfail like 140676
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stopfail like 140676
 test-amd64-amd64-xl-qemut-ws16-amd64 17 guest-stopfail like 140676
 test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  14 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2  fail never pass
 test-arm64-arm64-xl  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  13 migrate-support-checkfail   never pass

Re: [Xen-devel] [PATCH] x86: clear RDRAND CPUID bit on AMD family 15h/16h

2019-08-29 Thread Andrew Cooper

On 29/08/2019 13:39, Juergen Gross wrote:
> On 29.08.19 14:27, Jan Beulich wrote:
>> On 29.08.2019 14:01, Juergen Gross wrote:
>>> On 29.08.19 13:42, Andrew Cooper wrote:
 On 29/08/2019 10:28, Jan Beulich wrote:
> Inspired by Linux commit c49a0a80137c7ca7d6ced4c812c9e07a949f6f24:
>
>   There have been reports of RDRAND issues after resuming from
> suspend on
>   some AMD family 15h and family 16h systems. This issue stems
> from a BIOS
>   not performing the proper steps during resume to ensure
> RDRAND continues
>   to function properly.
>
>   Update the CPU initialization to clear the RDRAND CPUID bit
> for any family
>   15h and 16h processor that supports RDRAND. If it is known
> that the family
>   15h or family 16h system does not have an RDRAND resume
> issue or that the
>   system will not be placed in suspend, the "cpuid=rdrand"
> kernel parameter
>   can be used to stop the clearing of the RDRAND CPUID bit.
>
>   Note, that clearing the RDRAND CPUID bit does not prevent a
> processor
>   that normally supports the RDRAND instruction from executing
> it. So any
>   code that determined the support based on family and model
> won't #UD.
>>>
>>> Yeah, but it will no longer see random numbers after resume.
>>
>> That's the implication; note that I've taken the text from the
>> Linux commit.
>>
> Signed-off-by: Jan Beulich 
> ---
> Slightly RFC, in particular because of the change to
> parse_xen_cpuid():
> Alternative approach suggestions are welcome.

 This issue was on my radar, but I hadn't got around to starting a
 patch yet.

 I was wondering if we could perhaps default it to whether S3 was
 available, but looking at the code which prints "ACPI sleep modes:
 S3",
 it doesn't appear to be related to any real ACPI information.
>>>
>>> Wouldn't it make more sense to inhibit suspend/resume instead?
>>
>> That's an alternative option. But if anything I'd see us providing
>> both, not the one you suggest instead of what the patch here does.
>>
>>> I'm quite sure a machine running Xen is very rarely put to S3.
>>
>> I'm not at all sure about this.
>
> Suspend/resume in Xen was broken for more than 3 months in the late 4.12
> development phase and nobody noticed it. Only when I started my core
> scheduling series I came across the issue.

OpenXT and Qubes will come after you with sticks...

For traditional server scenarios, S3 is rare to non-existent, but Xen
also runs in a number of well known cases where S3 is very important. 
The reason S3 breaks frequently (see also Jan's IRQ adjustments broke it
for a while in staging), is because none of the developers who work in
these areas use it.

I now have a test scenario using a CoffeeLake system sat on my desk, but
it only gets testing when I'm poking the low level code, or when I
suspect a change might have an impact.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v6 08/10] remove late (on-demand) construction of IOMMU page tables

2019-08-29 Thread Paul Durrant

> -Original Message-
> From: Jan Beulich 
> Sent: 29 August 2019 14:39
> To: Paul Durrant 
> Cc: xen-devel@lists.xenproject.org; Julien Grall ; 
> Alexandru Isaila
> ; Petre Pircalabu ; 
> Razvan Cojocaru
> ; Andrew Cooper ; Roger 
> Pau Monne
> ; Volodymyr Babchuk ; 
> George Dunlap
> ; Ian Jackson ; Stefano 
> Stabellini
> ; KonradRzeszutek Wilk ; 
> Tamas K Lengyel
> ; Tim (Xen.org) ; Wei Liu 
> Subject: Re: [PATCH v6 08/10] remove late (on-demand) construction of IOMMU 
> page tables
> 
> On 16.08.2019 19:19, Paul Durrant wrote:
> > --- a/xen/drivers/passthrough/iommu.c
> > +++ b/xen/drivers/passthrough/iommu.c
> > @@ -146,6 +146,17 @@ static int __init parse_dom0_iommu_param(const char *s)
> >  }
> >  custom_param("dom0-iommu", parse_dom0_iommu_param);
> >
> > +static void __hwdom_init check_hwdom_reqs(struct domain *d)
> 
> This really should have const, but I realize ...
> 
> > +{
> > +if ( iommu_hwdom_none || !paging_mode_translate(d) )
> > +return;
> > +
> > +arch_iommu_check_autotranslated_hwdom(d);
> 
> ... this one wants non-const (for - afaict - no reason).
> 
> > @@ -159,129 +170,44 @@ int iommu_domain_init(struct domain *d)
> >  return ret;
> >
> >  hd->platform_ops = iommu_get_ops();
> > -return hd->platform_ops->init(d);
> > -}
> > +ret = hd->platform_ops->init(d);
> > +if ( ret )
> > +return ret;
> >
> > -static void __hwdom_init check_hwdom_reqs(struct domain *d)
> > -{
> > -if ( iommu_hwdom_none || !paging_mode_translate(d) )
> > -return;
> > +/*
> > + * NB: 'relaxed' h/w domains don't need the IOMMU mappings to be kept
> > + * in-sync with their assigned pages because all host RAM will be
> > + * mapped during hwdom_init().
> > + */
> 
> Doesn't this comment belong to ...
> 
> > +if ( is_hardware_domain(d) )
> > +check_hwdom_reqs(d); /* may modify iommu_hwdom_strict */
> >
> > -arch_iommu_check_autotranslated_hwdom(d);
> > +if ( !is_hardware_domain(d) || iommu_hwdom_strict )
> > +hd->need_sync = !iommu_use_hap_pt(d);
> 
> ... this if()?

Yes, it's probably adrift from where it should be now.

> 
> > @@ -629,8 +552,7 @@ static void iommu_dump_p2m_table(unsigned char key)
> >  ops = iommu_get_ops();
> >  for_each_domain(d)
> >  {
> > -if ( is_hardware_domain(d) ||
> > - dom_iommu(d)->status < IOMMU_STATUS_initialized )
> > +if ( !is_iommu_enabled(d) )
> >  continue;
> 
> Didn't you agree to retain the hwdom part of the condition here?
> 

Indeed I have as of today, but this was posted 2 weeks ago :-)

  Paul

> Jan
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 3/3] x86: a little bit of 16-bit video mode setting code cleanup

2019-08-29 Thread Andrew Cooper

On 29/08/2019 15:23, Jan Beulich wrote:
> On 29.08.2019 16:08, Andrew Cooper wrote:
>> On 14/06/2019 12:38, Jan Beulich wrote:
>>> To "compensate" for the code size growth by an earlier change:
>>> - drop "trampoline" labels (in almost all cases the target label is
>>>   reachable with an 8-bit-displacement branch anyway, and a single 16-
>>>   bit-displacement branch is still better than a pair of two branches)
>> Do you happen to know why we any to start with?  It can't plausibly be
>> for manual code alignment reasons.
> I have no idea - my guess is that some pre-386 code was cloned, or
> it was written by someone used to the pre-386 limitations.
>
>>> @@ -421,7 +418,7 @@ setspc: xorb%bh, %bh
>>>  
>>>  setmenu:
>>>  orb %al, %al# 80x25 is an exception
>>> -jz  _set_80x25
>>> +jz  set_80x25
>>>  
>>>  pushw   %bx # Set mode chosen from menu
>>>  callmode_table  # Build the mode table
>>> @@ -441,36 +438,32 @@ check_vesa:
>>>  cmpw$0x004f, %ax
>>>  jnz setbad
>>>  
>>> -leawvesa_mode_info, %di
>>> -subb$VIDEO_FIRST_VESA>>8, %bh
>>> -movw%bx, %cx# Get mode information structure
>>> +leawvesa_mode_info, %di # Get mode information structure
>>> +leaw-VIDEO_FIRST_VESA(%bx), %cx
>>>  movw$0x4f01, %ax
>>>  int $0x10
>>> -addb$VIDEO_FIRST_VESA>>8, %bh
>> Is this the redundant instruction you are talking about, or ... (near
>> the end)?
> No, here it's simply ...
>
>> I think I follow this as "no logical change", by leaving %bx intact
>> throughout, and only clearing %ch as part of the %bx=>%cx copy.
> ... as you say.
>
>>> --- a/xen/arch/x86/boot/wakeup.S
>>> +++ b/xen/arch/x86/boot/wakeup.S
>>> @@ -30,7 +30,7 @@ ENTRY(wakeup_start)
>>>  jne bogus_real_magic
>>>  
>>>  # for acpi_sleep=s3_bios
>>> -testl   $1, wakesym(video_flags)
>>> +testb   $1, wakesym(video_flags)
>> video_flags is currently .long, and, AFAICT, uses 2 bits even after this
>> series.  We could get better code reduction by shrinking it to .byte.
> Well, the goal of this patch is to play with the assembly code. To
> do what you suggest I'd have to touch a C (or really .h) file as
> well. I'm fine doing this change though, but preferably in a separate
> patch.

This is fine, and may indeed want to wait until David has finished
trampoline work.

>
>>> @@ -38,9 +38,9 @@ ENTRY(wakeup_start)
>>>  movw%ax, %ss# Need this? How to ret if clobbered?
>>>  
>>>  1:  # for acpi_sleep=s3_mode
>>> -testl   $2, wakesym(video_flags)
>>> +testb   $2, wakesym(video_flags)
>>>  jz  1f
>>> -movlwakesym(video_mode), %eax
>>> +movwwakesym(video_mode), %ax
>> Similarly, video_mode can become .word, I think.
> But a word is less efficient to access (because of the operand size
> override), so I'd prefer to not shrink this one. Just let me know
> whether you agree, and I'll cook up a patch accordingly.

This is 16 bit code so it is movl which has the operand size prefix, not
movw.

It is extern'd in C, but not wrapped in bootsym(), and I can't see it
being read anywhere.

>
>>> @@ -55,48 +55,26 @@ ENTRY(wakeup_start)
>>>  lmsw%ax # Turn on CR0.PE 
>>>  ljmpl   $BOOT_CS32, $bootsym_rel(wakeup_32, 6)
>>>  
>>> -/* This code uses an extended set of video mode numbers. These include:
>>> - * Aliases for standard modes
>>> - *  NORMAL_VGA (-1)
>>> - *  EXTENDED_VGA (-2)
>>> - *  ASK_VGA (-3)
>>> - * Video modes numbered by menu position -- NOT RECOMMENDED because of lack
>>> - * of compatibility when extending the table. These are between 0x00 and 
>>> 0xff.
>>> - */
>>> -#define VIDEO_FIRST_MENU 0x
>>> -
>>> -/* Standard BIOS video modes (BIOS number + 0x0100) */
>>> -#define VIDEO_FIRST_BIOS 0x0100
>>> -
>>> -/* VESA BIOS video modes (VESA number + 0x0200) */
>>> -#define VIDEO_FIRST_VESA 0x0200
>>> -
>>> -/* Video7 special modes (BIOS number + 0x0900) */
>>> -#define VIDEO_FIRST_V7 0x0900
>>> -
>>>  # Setting of user mode (AX=mode ID) => CF=success
>>>  mode_setw:
>>>  movw%ax, %bx
>>>  cmpb$VIDEO_FIRST_VESA>>8, %ah
>>>  jnc check_vesaw
>>> -decb%ah
>> ... or is this the no functional change?  If so, I'm not sure I agree,
>> given the clc below.
> How does the CLC matter? CF, as the comment says, indicates success.
> Whether or not there's a DEC ahead of it (which doesn't even alter
> CF) doesn't matter, does it?

I'd forgotten that dec left CF intact, and was thinking more like sub
$1, where it would matter.

> In any event - yes, that's the dead insn. I'll mention the function
> name in the description.

Please do.

~Andrew

___
Xen-devel mailing list

Re: [Xen-devel] [PATCH 3/3] x86: a little bit of 16-bit video mode setting code cleanup

2019-08-29 Thread Jan Beulich

On 29.08.2019 16:38, Andrew Cooper wrote:
> On 29/08/2019 15:23, Jan Beulich wrote:
>> On 29.08.2019 16:08, Andrew Cooper wrote:
>>> On 14/06/2019 12:38, Jan Beulich wrote:
 @@ -38,9 +38,9 @@ ENTRY(wakeup_start)
  movw%ax, %ss# Need this? How to ret if clobbered?
  
  1:  # for acpi_sleep=s3_mode
 -testl   $2, wakesym(video_flags)
 +testb   $2, wakesym(video_flags)
  jz  1f
 -movlwakesym(video_mode), %eax
 +movwwakesym(video_mode), %ax
>>> Similarly, video_mode can become .word, I think.
>> But a word is less efficient to access (because of the operand size
>> override), so I'd prefer to not shrink this one. Just let me know
>> whether you agree, and I'll cook up a patch accordingly.
> 
> This is 16 bit code so it is movl which has the operand size prefix, not
> movw.
> 
> It is extern'd in C, but not wrapped in bootsym(), and I can't see it
> being read anywhere.

Oh, indeed. I'll ditch the extern then.

Jan

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2 08/12] livepatch: Add support for inline asm hotpatching expectations

2019-08-29 Thread Wieczorkiewicz, Pawel



On 29. Aug 2019, at 16:34, Konrad Rzeszutek Wilk 
mailto:kon...@darnok.org>> wrote:



…snip...

+
+struct livepatch_func __section(".livepatch.funcs") livepatch_exceptions = {
+.version = LIVEPATCH_PAYLOAD_VERSION,
+.name = livepatch_exceptions_str,
+.new_addr = xen_hello_world,
+.old_addr = xen_extra_version,
+.new_size = EXPECT_BYTES_COUNT,
+.old_size = EXPECT_BYTES_COUNT,
+.expect = {
+.enabled = 1,
+.len = EXPECT_BYTES_COUNT,
+.data = EXPECT_BYTES
+},
+
+};

When I compile with 32-bit ARM 'make tests' I get:

arm-eabi-ld-EL -EL --build-id=sha1 -r -o 
xen_action_hooks_norevert.livepatch xen_action_hooks_marker.o 
xen_hello_world_func.o note.o xen_note.o
make[3]: Circular expect_config.h <- xen_expectations.o dependency dropped.
objdump: can't disassemble for architecture UNKNOWN!

(set -e; \
echo "#define EXPECT_BYTES {"; \
echo "#define EXPECT_BYTES_COUNT 6") > expect_config.h
arm-eabi-gcc -marm -DBUILD_ID -fno-strict-aliasing -std=gnu99 -Wall 
-Wstrict-prototypes -Wdeclaration-after-statement -Wno-unused-but-set-variable 
-Wno-unused-local-typedefs   -O1 -fno-omit-frame-pointer -nostdinc -fno-builtin 
-fno-common -Werror -Wredundant-decls -Wno-pointer-arith -Wvla -pipe -D__XEN__ 
-include /home/konrad/A20/xen.git/xen/include/xen/config.h 
'-D__OBJECT_FILE__="xen_expectations.o"' -Wa,--strip-local-absolute -g -MMD -MF 
./.xen_expectations.o.d -msoft-float -mcpu=cortex-a15 -DCONFIG_EARLY_PRINTK 
-DEARLY_PRINTK_INC=\"debug-8250.inc\" -DEARLY_PRINTK_BAUD= 
-DEARLY_UART_BASE_ADDRESS=0x01c28000 -DEARLY_UART_REG_SHIFT=2  
-I/home/konrad/A20/xen.git/xen/include -fno-stack-protector -fno-exceptions 
-Wnested-externs -DGCC_HAS_VISIBILITY_ATTRIBUTE -marm -DBUILD_ID 
-fno-strict-aliasing -std=gnu99 -Wall -Wstrict-prototypes 
-Wdeclaration-after-statement -Wno-unused-but-set-variable 
-Wno-unused-local-typedefs   -c xen_expectations.c -o xen_expectations.o
xen_expectations.c:31:2: error: expected '}' before ';' token
};
 ^
xen_expectations.c:18:76: note: to match this '{'
struct livepatch_func __section(".livepatch.funcs") livepatch_exceptions = {
   ^
make[3]: *** [/home/konrad/A20/xen.git/xen/Rules.mk:202: xen_expectations.o] 
Error 1
make[3]: Leaving directory '/home/konrad/A20/xen.git/xen/test/livepatch'


And this is what expect_config.h looks like:

#define EXPECT_BYTES {
#define EXPECT_BYTES_COUNT 6

Hi Konrad,

I think I fixed it with the following change on top:

[DIFF START]
diff --git a/xen/test/livepatch/Makefile b/xen/test/livepatch/Makefile
index 067861903f..20bdeb6c6d 100644
--- a/xen/test/livepatch/Makefile
+++ b/xen/test/livepatch/Makefile
@@ -186,10 +186,17 @@ xen_actions_hooks_norevert.o: config.h
 $(LIVEPATCH_ACTION_HOOKS_NOREVERT): xen_action_hooks_marker.o 
xen_hello_world_func.o note.o xen_note.o
$(LD) $(LDFLAGS) $(build_id_linker) -r -o 
$(LIVEPATCH_ACTION_HOOKS_NOREVERT) $^

-CODE_GET_EXPECT=$(shell objdump -d --insn-width=1 $(1) | grep -A6 -E 
'<'$(2)'>:' | tail -n +2 | awk 'BEGIN {printf "{"} {printf "0x%s,", $$2}' | sed 
's/,$$/}/g')
+ifeq ($(findstring arm, $(TARGET_ARCH) $(ARCH)),arm)
+INSN_WIDTH := 4
+INSN_PER_LINE := 2
+else
+INSN_WIDTH := 1
+INSN_PER_LINE := 8
+endif
+CODE_GET_EXPECT=$(shell $(OBJDUMP) -d --insn-width=$(INSN_WIDTH) $(1) | sed -n 
-e '/<'$(2)'>:$$/,/^$$/ p' | tail -n +2 | head -n $(INSN_PER_LINE) | awk 
'{printf "%s", $$2}' | sed 's/.\{2\}/0x&,/g' | sed 's/^/{/;s/,$$/}/g')
 .PHONY: expect_config.h
 expect_config.h: EXPECT_BYTES=$(call 
CODE_GET_EXPECT,$(BASEDIR)/xen-syms,xen_extra_version)
-expect_config.h: EXPECT_BYTES_COUNT=6
+expect_config.h: EXPECT_BYTES_COUNT=8
 expect_config.h: xen_expectations.o
(set -e; \
 echo "#define EXPECT_BYTES $(EXPECT_BYTES)"; \
[DIFF END]

I will test that some more and add it to the v3 of the patchset.

Best Regards,
Pawel Wieczorkiewicz






Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Ralf Herbrich
Eingetragen am Amtsgericht Charlottenburg unter HRB 149173 B
Sitz: Berlin
Ust-ID: DE 289 237 879


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v4 1/2] xen: Switch parameter in get_page_from_gfn to use typesafe gfn

2019-08-29 Thread Jan Beulich

On 19.08.2019 16:26, Julien Grall wrote:
> No functional change intended.
> 
> Only reasonable clean-ups are done in this patch. The rest will use _gfn
> for the time being.
> 
> The code in get_page_from_gfn is slightly reworked to also handle a bad
> behavior because it is not safe to use mfn_to_page(...) because
> mfn_valid(...) succeeds.

I guess the 2nd "because" is meant to be "before", but even then I
don't think I can easily agree: mfn_to_page() is just calculations,
no dereference.

> --- a/xen/arch/x86/hvm/domain.c
> +++ b/xen/arch/x86/hvm/domain.c
> @@ -296,8 +296,10 @@ int arch_set_info_hvm_guest(struct vcpu *v, const 
> vcpu_hvm_context_t *ctx)
>  if ( hvm_paging_enabled(v) && !paging_mode_hap(v->domain) )
>  {
>  /* Shadow-mode CR3 change. Check PDBR and update refcounts. */
> -struct page_info *page = get_page_from_gfn(v->domain,
> - v->arch.hvm.guest_cr[3] >> PAGE_SHIFT,
> +struct page_info *page;
> +
> +page = get_page_from_gfn(v->domain,
> + gaddr_to_gfn(v->arch.hvm.guest_cr[3]),
>   NULL, P2M_ALLOC);

Iirc I've said so before: I consider use of gaddr_to_gfn() here more
misleading than a plain shift by PAGE_SHIFT. Neither is really correct,
but in no event does CR3 as a whole hold an address. (Same elsewhere
then, and sadly in quite a few places.)

> --- a/xen/common/event_fifo.c
> +++ b/xen/common/event_fifo.c
> @@ -361,7 +361,7 @@ static const struct evtchn_port_ops evtchn_port_ops_fifo =
>  .print_state   = evtchn_fifo_print_state,
>  };
>  
> -static int map_guest_page(struct domain *d, uint64_t gfn, void **virt)
> +static int map_guest_page(struct domain *d, gfn_t gfn, void **virt)
>  {
>  struct page_info *p;
>  
> @@ -422,7 +422,7 @@ static int setup_control_block(struct vcpu *v)
>  return 0;
>  }
>  
> -static int map_control_block(struct vcpu *v, uint64_t gfn, uint32_t offset)
> +static int map_control_block(struct vcpu *v, gfn_t gfn, uint32_t offset)
>  {
>  void *virt;
>  unsigned int i;

Just as a remark (no action expected) - this makes a truncation issue
pretty apparent: On 32-bit platforms the upper 32 bits of a passed in
GFN get ignored. Correct (imo) behavior would be to reject the upper
bits being non-zero.

Jan

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [xen-unstable-smoke test] 140800: tolerable all pass - PUSHED

2019-08-29 Thread osstest service owner

flight 140800 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/140800/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  14 saverestore-support-checkfail   never pass

version targeted for testing:
 xen  41c7700a00011ad08be3c9d71126b67e08e58ac3
baseline version:
 xen  f1a94202128e677be5c14f84492f80a12dba9571

Last test of basis   140776  2019-08-28 21:01:11 Z0 days
Testing same since   140800  2019-08-29 14:00:34 Z0 days1 attempts


People who touched revisions under test:
  Andrew Cooper 
  George Dunlap 
  Jan Beulich 
  Roger Pau Monné 

jobs:
 build-arm64-xsm  pass
 build-amd64  pass
 build-armhf  pass
 build-amd64-libvirt  pass
 test-armhf-armhf-xl  pass
 test-arm64-arm64-xl-xsm  pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64pass
 test-amd64-amd64-libvirt pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/xen.git
   f1a9420212..41c7700a00  41c7700a00011ad08be3c9d71126b67e08e58ac3 -> smoke

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH RFC 2/3] x86/ACPI: restore VESA mode upon resume from S3

2019-08-29 Thread Jan Beulich

On 29.08.2019 16:45, Andrew Cooper wrote:
> On 14/06/2019 12:37, Jan Beulich wrote:
>> ---
>> I'm wondering actually whether the user having to explicitly request the
>> mode restoration is a good model: Why would we _not_ want to restore the
>> mode we've set during boot? In the worst case Dom0 kernel or X will
>> change the mode another time.
> 
> I think I agree.  I can't see anything good which can come from offering
> a choice here, and I am all for reducing the quantity of 16bit VGA code
> we have.

Well, mode restoration may of course hang due to BIOS issues. But in such
a case the question is how sensible it is to use S3 on such a system.

Also note that adjusting this isn't going to reduce the quantity of
16-bit code; it would only be a change to what video_flags defaults to.
Right now, without the command line option provided, mode restoration
would happen only if reset_videomode_after_s3() gets invoked, i.e.
just on a single Toshiba laptop model which I guess didn't even have
64-bit capable CPUs.

Jan

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [RFC Patch] xen/pt: Emulate FLR capability

2019-08-29 Thread Chao Gao

On Thu, Aug 29, 2019 at 12:03:44PM +0200, Jan Beulich wrote:
>On 29.08.2019 11:02, Chao Gao wrote:
>> Currently, for a HVM on Xen, no reset method is virtualized. So in a VM's
>> perspective, assigned devices cannot be reset. But some devices rely on PCI
>> reset to recover from hardware hangs. When being assigned to a VM, those
>> devices cannot be reset and won't work any longer if a hardware hang occurs.
>> We have to reboot VM to trigger PCI reset on host to recover the device.
>
>Did you consider a hot-unplug, reset (by host), hot-plug cycle instead?

Yes. I considered this means. But it needs host to initiate this action.
However, when a device needs reset is determined by the device driver
in VM. So in practice, VM still needs a way to notify host to do
unplug/reset/plug. As the standard FLR capability can meet the
requirement, I don't try to invent one.

>
>> +static int xen_pt_devctl_reg_write(XenPCIPassthroughState *s,
>> +   XenPTReg *cfg_entry, uint16_t *val,
>> +   uint16_t dev_value, uint16_t valid_mask)
>> +{
>> +if (s->real_device.is_resetable && (*val & PCI_EXP_DEVCTL_BCR_FLR)) {
>> +xen_pt_reset(s);
>> +}
>> +return xen_pt_word_reg_write(s, cfg_entry, val, dev_value, valid_mask);
>
>I think you also need to clear the bit before handing on the request,
>such that reads will always observe it clear.

Will do.

Thanks
Chao

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v3 01/11] checkpatch: check for nested (un)?likely() calls

2019-08-29 Thread Denis Efremov

IS_ERR(), IS_ERR_OR_NULL(), IS_ERR_VALUE() and WARN*() already contain
unlikely() optimization internally. Thus, there is no point in calling
these functions and defines under likely()/unlikely().

This check is based on the coccinelle rule developed by Enrico Weigelt
https://lore.kernel.org/lkml/1559767582-11081-1-git-send-email-i...@metux.net/

Signed-off-by: Denis Efremov 
Cc: Joe Perches 
Cc: Andrew Morton 
Cc: Andy Whitcroft 
---
 scripts/checkpatch.pl | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index 93a7edfe0f05..56969ce06df4 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -6480,6 +6480,12 @@ sub process {
 "Using $1 should generally have parentheses around 
the comparison\n" . $herecurr);
}
 
+# nested likely/unlikely calls
+   if ($line =~ 
/\b(?:(?:un)?likely)\s*\(\s*!?\s*(IS_ERR(?:_OR_NULL|_VALUE)?|WARN)/) {
+   WARN("LIKELY_MISUSE",
+"nested (un)?likely() calls, $1 already uses 
unlikely() internally\n" . $herecurr);
+   }
+
 # whine mightly about in_atomic
if ($line =~ /\bin_atomic\s*\(/) {
if ($realfile =~ m@^drivers/@) {
-- 
2.21.0


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v6 10/10] introduce a 'passthrough' configuration option to xl.cfg...

2019-08-29 Thread Paul Durrant

> -Original Message-
> From: Roger Pau Monne 
> Sent: 23 August 2019 15:17
> To: Paul Durrant 
> Cc: xen-devel@lists.xenproject.org; Ian Jackson ; Wei 
> Liu ; Andrew
> Cooper ; George Dunlap ; 
> Jan Beulich
> ; Julien Grall ; Konrad Rzeszutek 
> Wilk
> ; Stefano Stabellini ; Tim 
> (Xen.org) ;
> Anthony Perard ; Volodymyr Babchuk 
> 
> Subject: Re: [PATCH v6 10/10] introduce a 'passthrough' configuration option 
> to xl.cfg...
> 
> On Fri, Aug 16, 2019 at 06:20:01PM +0100, Paul Durrant wrote:
> > ...and hence the ability to disable IOMMU mappings, and control EPT
> > sharing.
> >
> > This patch introduces a new 'libxl_passthrough' enumeration into
> > libxl_domain_create_info. The value will be set by xl either when it parses
> > a new 'passthrough' option in xl.cfg, or implicitly if there is passthrough
> > hardware specified for the domain.
> >
> > If the value of the passthrough configuration option is 'disabled' then
> > the XEN_DOMCTL_CDF_iommu flag will be clear in the xen_domctl_createdomain
> > flags, thus allowing the toolstack to control whether the domain gets
> > IOMMU mappings or not (where previously they were globally set).
> >
> > If the value of the passthrough configuration option is 'sync_pt' then
> > a new 'iommu_opts' field in xen_domctl_createdomain will be set with the
> > value XEN_DOMCTL_IOMMU_no_sharept. This will override the global default
> > set in iommu_hap_pt_share, thus allowing the toolstack to control whether
> > EPT sharing is used for the domain.
> >
> > Signed-off-by: Paul Durrant 
> > ---
> > Cc: Ian Jackson 
> > Cc: Wei Liu 
> > Cc: Andrew Cooper 
> > Cc: George Dunlap 
> > Cc: Jan Beulich 
> > Cc: Julien Grall 
> > Cc: Konrad Rzeszutek Wilk 
> > Cc: Stefano Stabellini 
> > Cc: Tim Deegan 
> > Cc: Anthony PERARD 
> > Cc: Volodymyr Babchuk 
> > Cc: "Roger Pau Monné" 
> >
> > Previously part of series 
> > https://lists.xenproject.org/archives/html/xen-devel/2019-07/msg02267.html
> >
> > v6:
> >  - Remove the libxl_physinfo() call since it's usefulness is limited to x86
> >
> > v5:
> >  - Expand xen_domctl_createdomain flags field and hence bump interface
> >version
> >  - Fix spelling mistakes in context line
> > ---
> >  docs/man/xl.cfg.5.pod.in| 52 +
> >  tools/libxl/libxl.h |  5 
> >  tools/libxl/libxl_create.c  |  9 ++
> >  tools/libxl/libxl_types.idl |  7 +
> >  tools/xl/xl_parse.c | 38 
> >  xen/arch/arm/domain.c   | 10 ++-
> >  xen/arch/x86/domain.c   |  2 +-
> >  xen/common/domain.c |  7 +
> >  xen/drivers/passthrough/iommu.c | 13 -
> >  xen/include/public/domctl.h |  6 +++-
> >  xen/include/xen/iommu.h | 19 
> >  11 files changed, 158 insertions(+), 10 deletions(-)
> >
> > diff --git a/docs/man/xl.cfg.5.pod.in b/docs/man/xl.cfg.5.pod.in
> > index c99d40307e..fd35685e9e 100644
> > --- a/docs/man/xl.cfg.5.pod.in
> > +++ b/docs/man/xl.cfg.5.pod.in
> > @@ -605,6 +605,58 @@ option should only be used with a trusted device tree.
> >  Note that the partial device tree should avoid using the phandle 65000
> >  which is reserved by the toolstack.
> >
> > +=item B
> > +
> > +Specify whether IOMMU mappings are enabled for the domain and hence whether
> > +it will be enabled for passthrough hardware. Valid values for this option
> > +are:
> > +
> > +=over 4
> > +
> > +=item B
> > +
> > +IOMMU mappings are disabled for the domain and so hardware may not be
> > +passed through.
> > +
> > +This option is the default if no passthrough hardware is specified
> > +in the domain's configuration.
> > +
> > +=item B
> 
> I would maybe name this exclusive_pt, but historically it's been named
> sync_pt.
> 

Yes, I prefer to stick with the incumbent name.

[snip]
> > diff --git a/tools/xl/xl_parse.c b/tools/xl/xl_parse.c
> > index e105bda2bb..c904604008 100644
> > --- a/tools/xl/xl_parse.c
> > +++ b/tools/xl/xl_parse.c
> > @@ -2326,6 +2326,44 @@ skip_vfb:
> >  }
> >  }
> >
> > +if (!xlu_cfg_get_string(config, "passthrough", , 0)) {
> > +libxl_passthrough o;
> > +
> > +e = libxl_passthrough_from_string(buf, );
> > +if (e) {
> > +fprintf(stderr,
> > +"ERROR: unknown passthrough option '%s'\n",
> > +buf);
> > +exit(-ERROR_FAIL);
> > +}
> > +
> > +switch (o) {
> > +case LIBXL_PASSTHROUGH_DISABLED:
> > +if (d_config->num_pcidevs || d_config->num_dtdevs) {
> > +fprintf(stderr,
> > +"ERROR: passthrough disabled but devices are 
> > specified\n");
> > +exit(-ERROR_FAIL);
> > +}
> 
> Don't you need a break here?
> 
> > +case LIBXL_PASSTHROUGH_SHARE_PT:
> > +if (c_info->type == LIBXL_DOMAIN_TYPE_PV) {
> > +fprintf(stderr,
> > +"ERROR:

Re: [Xen-devel] [PATCH v4 2/2] xen/domain: Use typesafe gfn in map_vcpu_info

2019-08-29 Thread Jan Beulich

On 23.08.2019 16:16, Andrew Cooper wrote:
> On 19/08/2019 15:26, Julien Grall wrote:
>> diff --git a/xen/include/public/vcpu.h b/xen/include/public/vcpu.h
>> index 3623af932f..dc4c6a72a0 100644
>> --- a/xen/include/public/vcpu.h
>> +++ b/xen/include/public/vcpu.h
>> @@ -182,7 +182,7 @@ DEFINE_XEN_GUEST_HANDLE(vcpu_set_singleshot_timer_t);
>>   */
>>  #define VCPUOP_register_vcpu_info   10  /* arg == vcpu_register_vcpu_info_t 
>> */
>>  struct vcpu_register_vcpu_info {
>> -uint64_t mfn;/* mfn of page to place vcpu_info */
>> +uint64_t mfn;/* gfn of page to place vcpu_info */
> 
> I wonder if we can do some __XEN_INTERFACE_VERSION__ magic here to
> properly change the name to gfn.
> 
> Leaving it as mfn is going to be an ongoing source of confusion.

This would be pretty nice indeed.

Jan

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2 08/12] livepatch: Add support for inline asm hotpatching expectations

2019-08-29 Thread Konrad Rzeszutek Wilk

> +CODE_GET_EXPECT=$(shell objdump -d --insn-width=1 $(1) | grep -A6 -E 
> '<'$(2)'>:' | tail -n +2 | awk 'BEGIN {printf "{"} {printf "0x%s,", $$2}' | 
> sed 's/,$$/}/g')

Ony my Hikey 960 when I compile using an native compiler I get:

gcc  -DBUILD_ID -fno-strict-aliasing -std=gnu99 -Wall -Wstrict-prototypes 
-Wdeclaration-after-statement -Wno-unused-but-set-variable 
-Wno-unused-local-typedefs   -O1 -fno-omit-frame-pointer -nostdinc -fno-builtin 
-fno-common -Werror -Wredundant-decls -Wno-pointer-arith -Wvla -pipe -D__XEN__ 
-include /home/xen.git/xen/include/xen/config.h 
'-D__OBJECT_FILE__="xen_expectations.o"' -Wa,--strip-local-absolute -g -MMD -MF 
./.xen_expectations.o.d -mcpu=generic -mgeneral-regs-only   
-I/home/xen.git/xen/include -fno-stack-protector -fno-exceptions 
-Wnested-externs -DGCC_HAS_VISIBILITY_ATTRIBUTE  -DBUILD_ID 
-fno-strict-aliasing -std=gnu99 -Wall -Wstrict-prototypes 
-Wdeclaration-after-statement -Wno-unused-but-set-variable 
-Wno-unused-local-typedefs   -c xen_expectations.c -o xen_expectations.o
/home/xen.git/xen/Rules.mk:202: recipe for target 'xen_expectations.o' failed
make[3]: Circular expect_config.h <- xen_expectations.o dependency dropped.
In file included from xen_expectations.c:6:0:
expect_config.h:1:23: error: large integer implicitly truncated to unsigned type
 [-Werror=overflow]
 #define EXPECT_BYTES {0xf260,0x00f2,0xe000f000,0x12e000f0,0x9112e000,0x
c09112e0}
   ^
xen_expectations.c:28:17: note: in expansion of macro ‘EXPECT_BYTES’
 .data = EXPECT_BYTES
 ^~~~
expect_config.h:1:34: error: large integer implicitly truncated to unsigned type
 [-Werror=overflow]
 #define EXPECT_BYTES {0xf260,0x00f2,0xe000f000,0x12e000f0,0x9112e000,0x
c09112e0}
  ^
xen_expectations.c:28:17: note: in expansion of macro ‘EXPECT_BYTES’
 .data = EXPECT_BYTES
 ^~~~
expect_config.h:1:45: error: large integer implicitly truncated to unsigned type
 [-Werror=overflow]
 #define EXPECT_BYTES {0xf260,0x00f2,0xe000f000,0x12e000f0,0x9112e000,0x
c09112e0}
…

make[3]: Leaving directory '/home/xen.git/xen/test/livepatch'
Makefile:11: recipe for target 'build' failed
make[2]: Leaving directory '/home/xen.git/xen/test'
Makefile:85: recipe for target '_tests' failed
make[1]: Leaving directory '/home/xen.git/xen'
Makefile:45: recipe for target 'tests' failed
root@hikey960:/home/xen.git/xen# cat test/livepatch/expect_config.h 
#define EXPECT_BYTES 
{0xf260,0x00f2,0xe000f000,0x12e000f0,0x9112e000,0xc09112e0}
#define EXPECT_BYTES_COUNT 6



___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2 08/12] livepatch: Add support for inline asm hotpatching expectations

2019-08-29 Thread Wieczorkiewicz, Pawel



On 29. Aug 2019, at 17:58, Konrad Rzeszutek Wilk 
mailto:konrad.w...@oracle.com>> wrote:

+CODE_GET_EXPECT=$(shell objdump -d --insn-width=1 $(1) | grep -A6 -E 
'<'$(2)'>:' | tail -n +2 | awk 'BEGIN {printf "{"} {printf "0x%s,", $$2}' | sed 
's/,$$/}/g')

Ony my Hikey 960 when I compile using an native compiler I get:

gcc  -DBUILD_ID -fno-strict-aliasing -std=gnu99 -Wall -Wstrict-prototypes 
-Wdeclaration-after-statement -Wno-unused-but-set-variable 
-Wno-unused-local-typedefs   -O1 -fno-omit-frame-pointer -nostdinc -fno-builtin 
-fno-common -Werror -Wredundant-decls -Wno-pointer-arith -Wvla -pipe -D__XEN__ 
-include /home/xen.git/xen/include/xen/config.h 
'-D__OBJECT_FILE__="xen_expectations.o"' -Wa,--strip-local-absolute -g -MMD -MF 
./.xen_expectations.o.d -mcpu=generic -mgeneral-regs-only   
-I/home/xen.git/xen/include -fno-stack-protector -fno-exceptions 
-Wnested-externs -DGCC_HAS_VISIBILITY_ATTRIBUTE  -DBUILD_ID 
-fno-strict-aliasing -std=gnu99 -Wall -Wstrict-prototypes 
-Wdeclaration-after-statement -Wno-unused-but-set-variable 
-Wno-unused-local-typedefs   -c xen_expectations.c -o xen_expectations.o
/home/xen.git/xen/Rules.mk:202: recipe for target 'xen_expectations.o' failed
make[3]: Circular expect_config.h <- xen_expectations.o dependency dropped.
In file included from xen_expectations.c:6:0:
expect_config.h:1:23: error: large integer implicitly truncated to unsigned type
[-Werror=overflow]
#define EXPECT_BYTES {0xf260,0x00f2,0xe000f000,0x12e000f0,0x9112e000,0x
c09112e0}
  ^
xen_expectations.c:28:17: note: in expansion of macro ‘EXPECT_BYTES’
.data = EXPECT_BYTES
^~~~
expect_config.h:1:34: error: large integer implicitly truncated to unsigned type
[-Werror=overflow]
#define EXPECT_BYTES {0xf260,0x00f2,0xe000f000,0x12e000f0,0x9112e000,0x
c09112e0}
 ^
xen_expectations.c:28:17: note: in expansion of macro ‘EXPECT_BYTES’
.data = EXPECT_BYTES
^~~~
expect_config.h:1:45: error: large integer implicitly truncated to unsigned type
[-Werror=overflow]
#define EXPECT_BYTES {0xf260,0x00f2,0xe000f000,0x12e000f0,0x9112e000,0x
c09112e0}
…

make[3]: Leaving directory '/home/xen.git/xen/test/livepatch'
Makefile:11: recipe for target 'build' failed
make[2]: Leaving directory '/home/xen.git/xen/test'
Makefile:85: recipe for target '_tests' failed
make[1]: Leaving directory '/home/xen.git/xen'
Makefile:45: recipe for target 'tests' failed
root@hikey960:/home/xen.git/xen# cat test/livepatch/expect_config.h
#define EXPECT_BYTES 
{0xf260,0x00f2,0xe000f000,0x12e000f0,0x9112e000,0xc09112e0}
#define EXPECT_BYTES_COUNT 6



Could you please try with the patch attached?




Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Ralf Herbrich
Eingetragen am Amtsgericht Charlottenburg unter HRB 149173 B
Sitz: Berlin
Ust-ID: DE 289 237 879




0001-fixup-livepatch-Add-support-for-inline-asm-hotpatchi.patch
Description: 0001-fixup-livepatch-Add-support-for-inline-asm-hotpatchi.patch
If it still fails, could you please send me the output for the following command with this build?objdump -d xen-syms | sed -n -e '/:$/,/^$/ p' | tail -n +2
Best Regards,Pawel Wieczorkiewicz

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v4 1/2] xen: Switch parameter in get_page_from_gfn to use typesafe gfn

2019-08-29 Thread Julien Grall

Hi,

On 29/08/2019 17:41, Jan Beulich wrote:
> On 19.08.2019 16:26, Julien Grall wrote:
>> No functional change intended.
>>
>> Only reasonable clean-ups are done in this patch. The rest will use _gfn
>> for the time being.
>>
>> The code in get_page_from_gfn is slightly reworked to also handle a bad
>> behavior because it is not safe to use mfn_to_page(...) because
>> mfn_valid(...) succeeds.
> 
> I guess the 2nd "because" is meant to be "before", but even then I
> don't think I can easily agree: mfn_to_page() is just calculations,
> no dereference.

Regardless the s/because/before/. Do you disagree on the wording or the 
change? If the former, I just paraphrased what Andrew wrote in the 
previous version:

"This unfortunately propagates some bad behaviour, because it is not 
safe to use mfn_to_page(mfn); before mfn_valid(mfn) succeeds.  (In 
practice it works because mfn_to_page() is just pointer arithmetic.)"

This is x86 code, so please agree with Andrew on the approach/wording.

> 
>> --- a/xen/arch/x86/hvm/domain.c
>> +++ b/xen/arch/x86/hvm/domain.c
>> @@ -296,8 +296,10 @@ int arch_set_info_hvm_guest(struct vcpu *v, const 
>> vcpu_hvm_context_t *ctx)
>>   if ( hvm_paging_enabled(v) && !paging_mode_hap(v->domain) )
>>   {
>>   /* Shadow-mode CR3 change. Check PDBR and update refcounts. */
>> -struct page_info *page = get_page_from_gfn(v->domain,
>> - v->arch.hvm.guest_cr[3] >> PAGE_SHIFT,
>> +struct page_info *page;
>> +
>> +page = get_page_from_gfn(v->domain,
>> + gaddr_to_gfn(v->arch.hvm.guest_cr[3]),
>>NULL, P2M_ALLOC);
> 
> Iirc I've said so before: I consider use of gaddr_to_gfn() here more
> misleading than a plain shift by PAGE_SHIFT. Neither is really correct,
> but in no event does CR3 as a whole hold an address. (Same elsewhere
> then, and sadly in quite a few places.)

Well, this code has not changed since v3 and you acked it... I only 
dropped the ack here because the previous version was sent 9 months ago. 
I also can't see such comment made on any version of this series.

Anyway, I am more than happy to switch to _gfn((v->arch.hvm.guest_cr[3] 
 >> PAGE_SHIFT)) if you prefer it.

> 
>> --- a/xen/common/event_fifo.c
>> +++ b/xen/common/event_fifo.c
>> @@ -361,7 +361,7 @@ static const struct evtchn_port_ops evtchn_port_ops_fifo 
>> =
>>   .print_state   = evtchn_fifo_print_state,
>>   };
>>   
>> -static int map_guest_page(struct domain *d, uint64_t gfn, void **virt)
>> +static int map_guest_page(struct domain *d, gfn_t gfn, void **virt)
>>   {
>>   struct page_info *p;
>>   
>> @@ -422,7 +422,7 @@ static int setup_control_block(struct vcpu *v)
>>   return 0;
>>   }
>>   
>> -static int map_control_block(struct vcpu *v, uint64_t gfn, uint32_t offset)
>> +static int map_control_block(struct vcpu *v, gfn_t gfn, uint32_t offset)
>>   {
>>   void *virt;
>>   unsigned int i;
> 
> Just as a remark (no action expected) - this makes a truncation issue
> pretty apparent: On 32-bit platforms the upper 32 bits of a passed in
> GFN get ignored. Correct (imo) behavior would be to reject the upper
> bits being non-zero.

This is not only here but on pretty much all the hypercalls taking a gfn 
(on Arm it is 64-bit regardless the bitness).

I agree this is not nice, but I am afraid this is likely another can of 
worm that I am not to open it yet.

Cheers,

-- 
Julien Grall
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v6 10/10] introduce a 'passthrough' configuration option to xl.cfg...

2019-08-29 Thread Paul Durrant

> -Original Message-
> From: Jan Beulich 
> Sent: 29 August 2019 15:07
> To: Paul Durrant 
> Cc: xen-devel@lists.xenproject.org; Julien Grall ; 
> Andrew Cooper
> ; Anthony Perard ; 
> Roger Pau Monne
> ; Volodymyr Babchuk ; 
> George Dunlap
> ; Ian Jackson ; Stefano 
> Stabellini
> ; Konrad Rzeszutek Wilk ; Tim 
> (Xen.org) ;
> Wei Liu 
> Subject: Re: [PATCH v6 10/10] introduce a 'passthrough' configuration option 
> to xl.cfg...
> 
> On 16.08.2019 19:20, Paul Durrant wrote:
> > ...and hence the ability to disable IOMMU mappings, and control EPT
> > sharing.
> >
> > This patch introduces a new 'libxl_passthrough' enumeration into
> > libxl_domain_create_info. The value will be set by xl either when it parses
> > a new 'passthrough' option in xl.cfg, or implicitly if there is passthrough
> > hardware specified for the domain.
> >
> > If the value of the passthrough configuration option is 'disabled' then
> > the XEN_DOMCTL_CDF_iommu flag will be clear in the xen_domctl_createdomain
> > flags, thus allowing the toolstack to control whether the domain gets
> > IOMMU mappings or not (where previously they were globally set).
> >
> > If the value of the passthrough configuration option is 'sync_pt' then
> > a new 'iommu_opts' field in xen_domctl_createdomain will be set with the
> > value XEN_DOMCTL_IOMMU_no_sharept. This will override the global default
> > set in iommu_hap_pt_share, thus allowing the toolstack to control whether
> > EPT sharing is used for the domain.
> >
> > Signed-off-by: Paul Durrant 
> 
> Reviewed-by: Jan Beulich 

Thanks.

> with a question/suggestion and a nit:
> 
> > --- a/xen/common/domain.c
> > +++ b/xen/common/domain.c
> > @@ -308,6 +308,13 @@ static int sanitise_domain_config(struct 
> > xen_domctl_createdomain *config)
> >  return -EINVAL;
> >  }
> >
> > +if ( !(config->flags & XEN_DOMCTL_CDF_iommu) && config->iommu_opts )
> > +{
> > +dprintk(XENLOG_INFO,
> > +"IOMMU options specified but IOMMU not enabled\n");
> > +return -EINVAL;
> > +}
> 
> Seeing this I wonder whether XEN_DOMCTL_CDF_iommu wouldn't better be
> bit 0 of iommu_opts.
> 

I debated this with myself and I prefer to stick with separate flag and options.

> > @@ -70,6 +70,10 @@ struct xen_domctl_createdomain {
> >
> >  uint32_t flags;
> >
> > +#define _XEN_DOMCTL_IOMMU_no_sharept  0
> > +#define XEN_DOMCTL_IOMMU_no_sharept   (1U<<_XEN_DOMCTL_IOMMU_no_sharept)
> 
> Please can you add the missing blanks around << ?
> 

Sure.

  Paul

> Jan
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [linux-4.19 test] 140757: regressions - FAIL

2019-08-29 Thread osstest service owner

flight 140757 linux-4.19 real [real]
http://logs.test-lab.xenproject.org/osstest/logs/140757/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-amd64-xsm   6 xen-buildfail REGR. vs. 129313
 build-armhf-pvops 6 kernel-build fail REGR. vs. 129313

Tests which are failing intermittently (not blocking):
 test-amd64-amd64-xl-pvshim 20 guest-start/debian.repeat fail in 140708 pass in 
140638
 test-amd64-i386-qemut-rhel6hvm-amd 12 guest-start/redhat.repeat fail in 140708 
pass in 140757
 test-amd64-amd64-xl-pvshim   12 guest-startfail pass in 140708
 test-arm64-arm64-examine 11 examine-serial/bootloader  fail pass in 140708

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm 1 build-check(1) blocked 
n/a
 test-amd64-amd64-xl-qemuu-debianhvm-i386-xsm  1 build-check(1) blocked n/a
 test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm 1 build-check(1) blocked 
n/a
 test-armhf-armhf-xl-arndale   1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-multivcpu  1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-cubietruck  1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-rtds  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-xsm   1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-xsm1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
 test-armhf-armhf-xl   1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-credit2   1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-vhd   1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemuu-debianhvm-i386-xsm  1 build-check(1)  blocked n/a
 test-amd64-amd64-xl-qemut-debianhvm-i386-xsm  1 build-check(1) blocked n/a
 test-amd64-i386-xl-qemut-debianhvm-i386-xsm  1 build-check(1)  blocked n/a
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
 test-armhf-armhf-xl-credit1   1 build-check(1)   blocked  n/a
 test-armhf-armhf-examine  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt-raw  1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-xsm   1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemuu-ovmf-amd64 18 guest-start/debianhvm.repeat fail in 
140638 blocked in 129313
 test-amd64-i386-libvirt-xsm 13 migrate-support-check fail in 140708 never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-check fail in 140708 never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail in 140708 never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail in 140708 never pass
 test-amd64-i386-libvirt  13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-amd64-i386-xl-pvshim12 guest-start  fail   never pass
 test-arm64-arm64-xl-seattle  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  14 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail   never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2  fail never pass
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stop fail never pass
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop  fail never pass
 test-amd64-amd64-xl-qemut-win7-amd64 17 guest-stop fail never pass
 test-arm64-arm64-xl  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  14 saverestore-support-checkfail   never pass
 test-amd64-i386-xl-qemut-ws16-amd64 17 guest-stop  fail never pass
 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stop fail never pass

[Xen-devel] [PATCH v3 04/11] xen/events: Remove unlikely() from WARN() condition

2019-08-29 Thread Denis Efremov

"unlikely(WARN(x))" is excessive. WARN() already uses unlikely()
internally.

Signed-off-by: Denis Efremov 
Cc: Boris Ostrovsky 
Cc: Juergen Gross 
Cc: Joe Perches 
Cc: Andrew Morton 
Cc: xen-devel@lists.xenproject.org
---
 drivers/xen/events/events_base.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/xen/events/events_base.c b/drivers/xen/events/events_base.c
index 2e8570c09789..6c8843968a52 100644
--- a/drivers/xen/events/events_base.c
+++ b/drivers/xen/events/events_base.c
@@ -247,7 +247,7 @@ static void xen_irq_info_cleanup(struct irq_info *info)
  */
 unsigned int evtchn_from_irq(unsigned irq)
 {
-   if (unlikely(WARN(irq >= nr_irqs, "Invalid irq %d!\n", irq)))
+   if (WARN(irq >= nr_irqs, "Invalid irq %d!\n", irq))
return 0;
 
return info_for_irq(irq)->evtchn;
-- 
2.21.0


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH] x86/mmcfg: add "force" option for MCFG

2019-08-29 Thread Roger Pau Monné

On Wed, Aug 28, 2019 at 04:24:22PM +0100, Igor Druzhinin wrote:
> If MCFG area is not reserved in E820 Xen by default will defer its usage
> until Dom0 registers it explicitly after ACPI parser recognizes it as
> a reserved resource in DSDT. Having it reserved in E820 is not
> mandatory according to "PCI Firmware Specification, rev 3.2" (par. 4.1.2)
> and firmware is free to keep a hole E820 in that place.
> 
> Unfortunately, keeping it disabled at this point makes Xen fail to
> recognize many of PCIe extended capabilities early enough for newly added
> devices. Namely, (1) some of VT-d quirks are not applied during Dom0
> device handoff, (2) currently MCFG reservation report comes
> too late from Dom0 after some of PCI devices being registered in Xen
> by Dom0 kernel that break initialization of a number of PCIe capabilities
> (e.g. SR-IOV VF BAR sizing).
> 
> Since introduction of ACPI parser in Xen is not possible add a "force"
> option that will allow Xen to use MCFG area even it's not properly
> reserved in E820.
> 
> Signed-off-by: Igor Druzhinin 

Thanks! I think this is the best way to solve the issue.

> ---
> 
> I think we need to have this option to at least have a way to workaround
> problem (1). Problem (2) could be solved in Dom0 kernel by invoking
> xen_mcfg_late() earlier but before the first PCI bus enumertaion which
> currently happens somwhere during ACPI scan I think.
> 
> The question is what the defult value for this option should be?

Have you tested this against a variety of hardware?

Have you found any box that has a wrong MMCFG area in the MCFG static
table? (ie: one that doesn't match the motherboard resource
reservation in the dynamic ACPI tables?)

> 
> ---
>  docs/misc/xen-command-line.pandoc |  8 +---
>  xen/arch/x86/e820.c   | 20 
>  xen/arch/x86/x86_64/mmconfig-shared.c | 34 --
>  xen/include/asm-x86/e820.h|  1 +
>  4 files changed, 46 insertions(+), 17 deletions(-)
> 
> diff --git a/docs/misc/xen-command-line.pandoc 
> b/docs/misc/xen-command-line.pandoc
> index 7c72e31..f13b10c 100644
> --- a/docs/misc/xen-command-line.pandoc
> +++ b/docs/misc/xen-command-line.pandoc
> @@ -1424,11 +1424,13 @@ ordinary DomU, control domain, hardware domain, and - 
> when supported
>  by the platform - DomU with pass-through device assigned).
>  
>  ### mmcfg (x86)
> -> `= [,amd-fam10]`
> +> `= List of [ , force, amd-fam10 ]`
>  
> -> Default: `1`
> +> Default: `true,force`
>  
> -Specify if the MMConfig space should be enabled.
> +Specify if the MMConfig space should be enabled. If force option is specified
> +(default) MMConfig space will be used by Xen early in boot even if it's
> +not reserved by firmware in the E820 table.
>  
>  ### mmio-relax (x86)
>  > `=  | all`
> diff --git a/xen/arch/x86/e820.c b/xen/arch/x86/e820.c
> index 8e8a2c4..edef72a 100644
> --- a/xen/arch/x86/e820.c
> +++ b/xen/arch/x86/e820.c
> @@ -37,6 +37,26 @@ struct e820map e820;
>  struct e820map __initdata e820_raw;
>  
>  /*
> + * This function checks if any part of the range  is mapped
> + * with type.
> + */
> +int __init e820_any_mapped(u64 start, u64 end, unsigned type)

Please use uint64_t and unsigned int. Also it looks like this
function wants to return a bool instead of an int.

> +{
> +   int i;

unsigned int.

> +
> +   for (i = 0; i < e820.nr_map; i++) {

Coding style. Some parts of this file are using the Linux coding
style I think, but new additions should be using the Xen coding
style, see e820_change_range_type for example.

> +   struct e820entry *ei = [i];

const.

> +
> +   if (type && ei->type != type)
> +   continue;
> +   if (ei->addr >= end || ei->addr + ei->size <= start)
> +   continue;
> +   return 1;
> +   }
> +   return 0;
> +}
> +
> +/*
>   * This function checks if the entire range  is mapped with type.
>   *
>   * Note: this function only works correct if the e820 table is sorted and
> diff --git a/xen/arch/x86/x86_64/mmconfig-shared.c 
> b/xen/arch/x86/x86_64/mmconfig-shared.c
> index cc08b52..9fc0865 100644
> --- a/xen/arch/x86/x86_64/mmconfig-shared.c
> +++ b/xen/arch/x86/x86_64/mmconfig-shared.c
> @@ -26,33 +26,34 @@
>  
>  #include "mmconfig.h"
>  
> +static bool_t __read_mostly force_mmcfg = true;
>  unsigned int pci_probe = PCI_PROBE_CONF1 | PCI_PROBE_MMCONF;
>  
>  static int __init parse_mmcfg(const char *s)
>  {
>  const char *ss;
> -int rc = 0;
> +int val, rc = 0;
>  
>  do {
>  ss = strchr(s, ',');
>  if ( !ss )
>  ss = strchr(s, '\0');
>  
> -switch ( parse_bool(s, ss) )
> -{
> -case 0:
> -pci_probe &= ~PCI_PROBE_MMCONF;
> -break;
> -case 1:
> -break;
> -default:
> -if ( !cmdline_strcmp(s, "amd_fam10") ||
> - !cmdline_strcmp(s,

Re: [Xen-devel] [PATCH v6 06/10] domain: introduce XEN_DOMCTL_CDF_iommu flag

2019-08-29 Thread Paul Durrant

> -Original Message-
> From: Roger Pau Monne 
> Sent: 23 August 2019 11:33
> To: Paul Durrant 
> Cc: xen-devel@lists.xenproject.org; Ian Jackson ; Wei 
> Liu ;
> Anthony Perard ; Andrew Cooper 
> ; George Dunlap
> ; Jan Beulich ; Julien Grall 
> ;
> Konrad Rzeszutek Wilk ; Stefano Stabellini 
> ; Tim
> (Xen.org) ; Volodymyr Babchuk 
> Subject: Re: [PATCH v6 06/10] domain: introduce XEN_DOMCTL_CDF_iommu flag
> 
> On Fri, Aug 16, 2019 at 06:19:57PM +0100, Paul Durrant wrote:
> > This patch introduces a common domain creation flag to determine whether
> > the domain is permitted to make use of the IOMMU. Currently the flag is
> > always set (for both dom0 and domU) if the IOMMU is globally enabled
> > (i.e. iommu_enabled == 1). sanitise_domain_config() is modified to reject
> > the flag if !iommu_enabled.
> >
> > A new helper function, is_iommu_enabled(), is added to test the flag and
> > iommu_domain_init() will return immediately if !is_iommu_enabled(). This is
> > slightly different to the previous behaviour based on !iommu_enabled where
> > the call to arch_iommu_domain_init() was made regardless, however it appears
> > that this call was only necessary to initialize the dt_devices list for ARM
> > such that iommu_release_dt_devices() can be called unconditionally by
> > domain_relinquish_resources(). Adding a simple check of is_iommu_enabled()
> > into iommu_release_dt_devices() keeps this unconditional call working.
> >
> > No functional change should be observed with this patch applied.
> >
> > Subsequent patches will allow the toolstack to control whether use of the
> > IOMMU is enabled for a domain.
> >
> > NOTE: The introduction of the is_iommu_enabled() helper function might
> >   seem excessive but its use is expected to increase with subsequent
> >   patches. Also, having iommu_domain_init() bail before calling
> >   arch_iommu_domain_init() is not strictly necessary, but I think the
> >   consequent addition of the call to is_iommu_enabled() in
> >   iommu_release_dt_devices() makes the code clearer.
> >
> > Signed-off-by: Paul Durrant 
> 
> I have one ARM-related question and one 'nice to have', but the code
> LGTM:
> 
> Reviewed-by: Roger Pau Monné 
> 
> > ---
> > Cc: Ian Jackson 
> > Cc: Wei Liu 
> > Cc: Anthony PERARD 
> > Cc: Andrew Cooper 
> > Cc: George Dunlap 
> > Cc: Jan Beulich 
> > Cc: Julien Grall 
> > Cc: Konrad Rzeszutek Wilk 
> > Cc: Stefano Stabellini 
> > Cc: Tim Deegan 
> > Cc: Volodymyr Babchuk 
> > Cc: "Roger Pau Monné" 
> >
> > Previously part of series 
> > https://lists.xenproject.org/archives/html/xen-devel/2019-07/msg02267.html
> >
> > v6:
> >  - Remove the toolstack parts as there's no nice method of testing whether
> >the IOMMU is enabled in an architecture-neutral way
> >
> > v5:
> >  - Move is_iommu_enabled() check into iommu_domain_init()
> >  - Reject XEN_DOMCTL_CDF_iommu in sanitise_domain_config() if !iommu_enabled
> >  - Use evaluate_nospec() in defintion of is_iommu_enabled()
> > ---
> >  xen/arch/arm/setup.c  | 3 +++
> >  xen/arch/x86/setup.c  | 3 +++
> >  xen/common/domain.c   | 9 -
> >  xen/common/domctl.c   | 8 
> >  xen/drivers/passthrough/device_tree.c | 3 +++
> >  xen/drivers/passthrough/iommu.c   | 6 +++---
> >  xen/include/public/domctl.h   | 4 
> >  xen/include/xen/sched.h   | 5 +
> >  8 files changed, 37 insertions(+), 4 deletions(-)
> >
> > diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c
> > index 2c5d1372c0..20021ee0ca 100644
> > --- a/xen/arch/arm/setup.c
> > +++ b/xen/arch/arm/setup.c
> > @@ -914,6 +914,9 @@ void __init start_xen(unsigned long boot_phys_offset,
> >  dom0_cfg.arch.tee_type = tee_get_type();
> >  dom0_cfg.max_vcpus = dom0_max_vcpus();
> >
> > +if ( iommu_enabled )
> > +dom0_cfg.flags |= XEN_DOMCTL_CDF_iommu;
> > +
> >  dom0 = domain_create(0, _cfg, true);
> >  if ( IS_ERR(dom0) || (alloc_dom0_vcpu0(dom0) == NULL) )
> >  panic("Error creating domain 0\n");
> > diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
> > index d0b35b0ce2..fa226a2bab 100644
> > --- a/xen/arch/x86/setup.c
> > +++ b/xen/arch/x86/setup.c
> > @@ -1733,6 +1733,9 @@ void __init noreturn __start_xen(unsigned long mbi_p)
> >  }
> >  dom0_cfg.max_vcpus = dom0_max_vcpus();
> >
> > +if ( iommu_enabled )
> > +dom0_cfg.flags |= XEN_DOMCTL_CDF_iommu;
> > +
> >  /* Create initial domain 0. */
> >  dom0 = domain_create(get_initial_domain_id(), _cfg, !pv_shim);
> >  if ( IS_ERR(dom0) || (alloc_dom0_vcpu0(dom0) == NULL) )
> > diff --git a/xen/common/domain.c b/xen/common/domain.c
> > index 76e6976617..e832a5c4aa 100644
> > --- a/xen/common/domain.c
> > +++ b/xen/common/domain.c
> > @@ -301,7 +301,8 @@ static int sanitise_domain_config(struct 
> > xen_domctl_createdomain *config)
> > XEN_DOMCTL_CDF_hap |
> >

Re: [Xen-devel] [PATCH v6 08/10] remove late (on-demand) construction of IOMMU page tables

2019-08-29 Thread Paul Durrant

> -Original Message-
> From: Roger Pau Monne 
> Sent: 23 August 2019 12:34
> To: Paul Durrant 
> Cc: xen-devel@lists.xenproject.org; Alexandru Isaila 
> ; Stefano Stabellini
> ; Julien Grall ; Volodymyr 
> Babchuk
> ; Andrew Cooper ; 
> George Dunlap
> ; Ian Jackson ; Jan Beulich 
> ;
> Konrad Rzeszutek Wilk ; Tim (Xen.org) ; 
> Wei Liu ;
> Tamas K Lengyel ; Razvan Cojocaru 
> ; Petre Pircalabu
> 
> Subject: Re: [PATCH v6 08/10] remove late (on-demand) construction of IOMMU 
> page tables
> 
> On Fri, Aug 16, 2019 at 06:19:59PM +0100, Paul Durrant wrote:
> > Now that there is a per-domain IOMMU enable flag, which should be enabled if
> > any device is going to be passed through, stop deferring page table
> > construction until the assignment is done. Also don't tear down the tables
> > again when the last device is de-assigned; defer that task until domain
> > destruction.
> >
> > This allows the has_iommu_pt() helper and iommu_status enumeration to be
> > removed. Calls to has_iommu_pt() are simply replaced by calls to
> > is_iommu_enabled(). Remaining open-code tests of iommu_hap_pt_share can also
> > be replaced by calls to iommu_use_hap_pt().
> > The arch_iommu_populate_page_table() and iommu_construct() functions become
> > redundant, as does the 'strict mode' dom0 page_list mapping code in
> > iommu_hwdom_init(), and iommu_teardown() can be made static is its only
> > remaining caller, iommu_domain_destroy(), is within the same source
> > module.
> >
> > All in all, about 220 lines of code are removed.
> >
> > NOTE: This patch will cause a small amount of extra resource to be used
> >   to accommodate IOMMU page tables that may never be used, since the
> >   per-domain IOMMU flag enable flag is currently set to the value
> >   of the global iommu_enable flag. A subsequent patch will add an
> >   option to the toolstack to allow it to be turned off if there is
> >   no intention to assign passthrough hardware to the domain.
> >
> > Signed-off-by: Paul Durrant 
> > Reviewed-by: Alexandru Isaila 
> > ---
> > Cc: Stefano Stabellini 
> > Cc: Julien Grall 
> > Cc: Volodymyr Babchuk 
> > Cc: Andrew Cooper 
> > Cc: George Dunlap 
> > Cc: Ian Jackson 
> > Cc: Jan Beulich 
> > Cc: Konrad Rzeszutek Wilk 
> > Cc: Tim Deegan 
> > Cc: Wei Liu 
> > Cc: "Roger Pau Monné" 
> > Cc: Tamas K Lengyel 
> > Cc: George Dunlap 
> > Cc: Razvan Cojocaru 
> > Cc: Petre Pircalabu 
> >
> > Previously part of series 
> > https://lists.xenproject.org/archives/html/xen-devel/2019-07/msg02267.html
> >
> > v5:
> >  - Minor style fixes
> > ---
> >  xen/arch/arm/p2m.c|   2 +-
> >  xen/arch/x86/dom0_build.c |   2 +-
> >  xen/arch/x86/hvm/mtrr.c   |   5 +-
> >  xen/arch/x86/mm/mem_sharing.c |   2 +-
> >  xen/arch/x86/mm/paging.c  |   2 +-
> >  xen/arch/x86/x86_64/mm.c  |   2 +-
> >  xen/common/memory.c   |   4 +-
> >  xen/common/vm_event.c |   2 +-
> >  xen/drivers/passthrough/device_tree.c |  11 ---
> >  xen/drivers/passthrough/iommu.c   | 134 ++
> >  xen/drivers/passthrough/pci.c |  12 ---
> >  xen/drivers/passthrough/vtd/iommu.c   |  10 +-
> >  xen/drivers/passthrough/x86/iommu.c   |  95 --
> >  xen/include/asm-arm/iommu.h   |   2 +-
> >  xen/include/asm-x86/iommu.h   |   2 +-
> >  xen/include/xen/iommu.h   |  16 ---
> >  xen/include/xen/sched.h   |   2 -
> >  17 files changed, 42 insertions(+), 263 deletions(-)
> >
> > diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
> > index 7f1442932a..692565757e 100644
> > --- a/xen/arch/arm/p2m.c
> > +++ b/xen/arch/arm/p2m.c
> > @@ -1056,7 +1056,7 @@ static int __p2m_set_entry(struct p2m_domain *p2m,
> >   !mfn_eq(lpae_get_mfn(*entry), lpae_get_mfn(orig_pte)) )
> >  p2m_free_entry(p2m, orig_pte, level);
> >
> > -if ( has_iommu_pt(p2m->domain) &&
> > +if ( is_iommu_enabled(p2m->domain) &&
> >   (lpae_is_valid(orig_pte) || lpae_is_valid(*entry)) )
> >  {
> >  unsigned int flush_flags = 0;
> > diff --git a/xen/arch/x86/dom0_build.c b/xen/arch/x86/dom0_build.c
> > index d381784edd..7cfab2dc25 100644
> > --- a/xen/arch/x86/dom0_build.c
> > +++ b/xen/arch/x86/dom0_build.c
> > @@ -365,7 +365,7 @@ unsigned long __init dom0_compute_nr_pages(
> >  }
> >
> >  need_paging = is_hvm_domain(d) &&
> > -(!iommu_hap_pt_share || !paging_mode_hap(d));
> > +(!iommu_use_hap_pt(d) || !paging_mode_hap(d));
> 
> I'm not sure this is correct, at the point where dom0_compute_nr_pages
> gets called the iommu has not been initialized yet (the call to
> iommu_hwdom_init is done afterwards), so the iommu status field which
> is used by iommu_use_hap_pt is not yet initialized.

Note that this patch removes the iommu status field.

> 
> >  for ( ; ; need_paging = false )
> >  {
> >  nr_pages = get_memsize(_size,

Re: [Xen-devel] [PATCH v2] Partially revert "x86/mm: Clean IOMMU flags from p2m-pt code"

2019-08-29 Thread Alexandru Stefan ISAILA



On 29.08.2019 11:49, Roger Pau Monne wrote:
> This partially reverts commit
> 854a49a7486a02edae5b3e53617bace526e9c1b1 by re-adding the logic that
> propagates changes to the domain physmap done by p2m_pt_set_entry into
> the iommu page tables. Without this logic changes to the guest physmap
> are not propagated to the iommu, leaving stale iommu entries that can
> leak data, or failing to add new entries.
> 
> Note that this commit doesn't re-introduce iommu flags to the cpu page
> table entries, since the logic to add/remove entries to the iommu page
> tables is based on the p2m type and the mfn.
> 
> Fixes: 854a49a7486a02 ('x86/mm: Clean IOMMU flags from p2m-pt code')
> Signed-off-by: Roger Pau Monné 
> ---
> Cc: Alexandru Stefan ISAILA 
> ---
> Changes since v1:
>   - Remove the share-pt branch, there's no sharing on AMD hardware.
> ---

Reviewed-by: Alexandru Isaila 
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v9 11/15] microcode: unify loading update during CPU resuming and AP wakeup

2019-08-29 Thread Jan Beulich

On 19.08.2019 03:25, Chao Gao wrote:
> Both are loading the cached patch. Since APs call the unified function,
> microcode_update_one(), during wakeup, the 'start_update' parameter
> which originally used to distinguish BSP and APs is redundant. So remove
> this parameter.
> 
> Signed-off-by: Chao Gao 
> ---
> Note that here is a functional change: resuming a CPU would call
> ->end_update() now while previously it wasn't. Not quite sure
> whether it is correct.

I think it is, as implied by the other response I've sent. But it
should then (as also said) include calling ->start_update() as well.

Jan

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH V3 8/8] iommu/arm: Add Renesas IPMMU-VMSA support

2019-08-29 Thread Oleksandr



On 29.08.19 11:37, Yoshihiro Shimoda wrote:

Hi Oleksandr-san,


Hi Shimoda-san.





From: Oleksandr Tyshchenko, Sent: Wednesday, August 21, 2019 3:10 AM

From: Oleksandr Tyshchenko 

The IPMMU-VMSA is VMSA-compatible I/O Memory Management Unit (IOMMU)
which provides address translation and access protection functionalities
to processing units and interconnect networks.

Please note, current driver is supposed to work only with newest
Gen3 SoCs revisions which IPMMU hardware supports stage 2 translation

This is still "Gen3", so that please replace it with "R-Car Gen3"...


Will do.





table format and is able to use CPU's P2M table as is if one is
3-level page table (up to 40 bit IPA).

The major differences compare to the Linux driver are:

1. Stage 1/Stage 2 translation. Linux driver supports Stage 1
translation only (with Stage 1 translation table format). It manages
page table by itself. But Xen driver supports Stage 2 translation
(with Stage 2 translation table format) to be able to share the P2M
with the CPU. Stage 1 translation is always bypassed in Xen driver.

So, Xen driver is supposed to be used with newest R-Car Gen3 SoC revisions
only (H3 ES3.0, M3-W+, etc.) which IPMMU H/W supports stage 2 translation
table format.

2. AArch64 support. Linux driver uses VMSAv8-32 mode, while Xen driver
enables Armv8 VMSAv8-64 mode to cover up to 40 bit input address.

3. Context bank (sets of page table) usage. In Xen, each context bank is
mapped to one Xen domain. So, all devices being pass throughed to the
same Xen domain share the same context bank.

4. IPMMU device tracking. In Xen, all IOMMU devices are managed
by single driver instance. So, driver uses global list to keep track
of registered IPMMU devices.

Signed-off-by: Oleksandr Tyshchenko 
CC: Julien Grall 
CC: Yoshihiro Shimoda 

About this hardware handling, this patch seems good to me. But, since
I'm not familiar about Xen passthrough framework, I think I cannot
add my Reviewed-by tag into this patch. What do you think?


I am not completely sure regarding that, but I have seen cases when the 
people give their R-b not for the entire patch, but for a part of it.


Especially, when patch touches many sub-systems/archs. Would you mind if 
I specify what your R-b covers? -> [for the IPMMU H/W bits]



Thank you for the review.



Best regards,
Yoshihiro Shimoda


--
Regards,

Oleksandr Tyshchenko


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH] x86: clear RDRAND CPUID bit on AMD family 15h/16h

2019-08-29 Thread Andrew Cooper

On 29/08/2019 10:28, Jan Beulich wrote:
> Inspired by Linux commit c49a0a80137c7ca7d6ced4c812c9e07a949f6f24:
>
> There have been reports of RDRAND issues after resuming from suspend on
> some AMD family 15h and family 16h systems. This issue stems from a BIOS
> not performing the proper steps during resume to ensure RDRAND continues
> to function properly.
>
> Update the CPU initialization to clear the RDRAND CPUID bit for any family
> 15h and 16h processor that supports RDRAND. If it is known that the family
> 15h or family 16h system does not have an RDRAND resume issue or that the
> system will not be placed in suspend, the "cpuid=rdrand" kernel parameter
> can be used to stop the clearing of the RDRAND CPUID bit.
>
> Note, that clearing the RDRAND CPUID bit does not prevent a processor
> that normally supports the RDRAND instruction from executing it. So any
> code that determined the support based on family and model won't #UD.
>
> Signed-off-by: Jan Beulich 
> ---
> Slightly RFC, in particular because of the change to parse_xen_cpuid():
> Alternative approach suggestions are welcome.

This issue was on my radar, but I hadn't got around to starting a patch yet.

I was wondering if we could perhaps default it to whether S3 was
available, but looking at the code which prints "ACPI sleep modes: S3",
it doesn't appear to be related to any real ACPI information.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 06/11] swiotlb-xen: always use dma-direct helpers to alloc coherent pages

2019-08-29 Thread Stefano Stabellini

+ Boris, Juergen

On Mon, 26 Aug 2019, Christoph Hellwig wrote:
> x86 currently calls alloc_pages, but using dma-direct works as well
> there, with the added benefit of using the CMA pool if available.
> The biggest advantage is of course to remove a pointless bit of
> architecture specific code.
> 
> Signed-off-by: Christoph Hellwig 
> ---
>  arch/x86/include/asm/xen/page-coherent.h | 16 
>  drivers/xen/swiotlb-xen.c|  7 +++
>  include/xen/arm/page-coherent.h  | 12 
>  3 files changed, 3 insertions(+), 32 deletions(-)
> 
> diff --git a/arch/x86/include/asm/xen/page-coherent.h 
> b/arch/x86/include/asm/xen/page-coherent.h
> index 116777e7f387..8ee33c5edded 100644
> --- a/arch/x86/include/asm/xen/page-coherent.h
> +++ b/arch/x86/include/asm/xen/page-coherent.h
> @@ -5,22 +5,6 @@
>  #include 
>  #include 
>  
> -static inline void *xen_alloc_coherent_pages(struct device *hwdev, size_t 
> size,
> - dma_addr_t *dma_handle, gfp_t flags,
> - unsigned long attrs)
> -{
> - void *vstart = (void*)__get_free_pages(flags, get_order(size));
> - *dma_handle = virt_to_phys(vstart);

This is where we need Boris and Juergen's opinion. From an ARM POV it
looks OK.


> - return vstart;
> -}
> -
> -static inline void xen_free_coherent_pages(struct device *hwdev, size_t size,
> - void *cpu_addr, dma_addr_t dma_handle,
> - unsigned long attrs)
> -{
> - free_pages((unsigned long) cpu_addr, get_order(size));
> -}
> -
>  static inline void xen_dma_map_page(struct device *hwdev, struct page *page,
>dma_addr_t dev_addr, unsigned long offset, size_t size,
>enum dma_data_direction dir, unsigned long attrs) { }
> diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
> index b8808677ae1d..f9dd4cb6e4b3 100644
> --- a/drivers/xen/swiotlb-xen.c
> +++ b/drivers/xen/swiotlb-xen.c
> @@ -299,8 +299,7 @@ xen_swiotlb_alloc_coherent(struct device *hwdev, size_t 
> size,
>* address. In fact on ARM virt_to_phys only works for kernel direct
>* mapped RAM memory. Also see comment below.
>*/
> - ret = xen_alloc_coherent_pages(hwdev, size, dma_handle, flags, attrs);
> -
> + ret = dma_direct_alloc(hwdev, size, dma_handle, flags, attrs);
>   if (!ret)
>   return ret;
>  
> @@ -319,7 +318,7 @@ xen_swiotlb_alloc_coherent(struct device *hwdev, size_t 
> size,
>   else {
>   if (xen_create_contiguous_region(phys, order,
>fls64(dma_mask), dma_handle) 
> != 0) {
> - xen_free_coherent_pages(hwdev, size, ret, 
> (dma_addr_t)phys, attrs);
> + dma_direct_free(hwdev, size, ret, (dma_addr_t)phys, 
> attrs);
>   return NULL;
>   }
>   SetPageXenRemapped(virt_to_page(ret));
> @@ -351,7 +350,7 @@ xen_swiotlb_free_coherent(struct device *hwdev, size_t 
> size, void *vaddr,
>   TestClearPageXenRemapped(virt_to_page(vaddr)))
>   xen_destroy_contiguous_region(phys, order);
>  
> - xen_free_coherent_pages(hwdev, size, vaddr, (dma_addr_t)phys, attrs);
> + dma_direct_free(hwdev, size, vaddr, (dma_addr_t)phys, attrs);
>  }
>  
>  /*
> diff --git a/include/xen/arm/page-coherent.h b/include/xen/arm/page-coherent.h
> index a840d6949a87..0e244f4fec1a 100644
> --- a/include/xen/arm/page-coherent.h
> +++ b/include/xen/arm/page-coherent.h
> @@ -16,18 +16,6 @@ void __xen_dma_sync_single_for_cpu(struct device *hwdev,
>  void __xen_dma_sync_single_for_device(struct device *hwdev,
>   dma_addr_t handle, size_t size, enum dma_data_direction dir);
>  
> -static inline void *xen_alloc_coherent_pages(struct device *hwdev, size_t 
> size,
> - dma_addr_t *dma_handle, gfp_t flags, unsigned long attrs)
> -{
> - return dma_direct_alloc(hwdev, size, dma_handle, flags, attrs);
> -}
> -
> -static inline void xen_free_coherent_pages(struct device *hwdev, size_t size,
> - void *cpu_addr, dma_addr_t dma_handle, unsigned long attrs)
> -{
> - dma_direct_free(hwdev, size, cpu_addr, dma_handle, attrs);
> -}
> -
>  static inline void xen_dma_sync_single_for_cpu(struct device *hwdev,
>   dma_addr_t handle, size_t size, enum dma_data_direction dir)
>  {
> -- 
> 2.20.1
> 

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 03/11] xen/arm: simplify dma_cache_maint

2019-08-29 Thread Stefano Stabellini

On Tue, 27 Aug 2019, Christoph Hellwig wrote:
> And this was still buggy I think, it really needs some real Xen/Arm
> testing which I can't do.  Hopefully better version below:
> 
> --
> >From 5ad4b6e291dbb49f65480c9b769414931cbd485a Mon Sep 17 00:00:00 2001
> From: Christoph Hellwig 
> Date: Wed, 24 Jul 2019 15:26:08 +0200
> Subject: xen/arm: simplify dma_cache_maint
> 
> Calculate the required operation in the caller, and pass it directly
> instead of recalculating it for each page, and use simple arithmetics
> to get from the physical address to Xen page size aligned chunks.
> 
> Signed-off-by: Christoph Hellwig 
> ---
>  arch/arm/xen/mm.c | 61 ---
>  1 file changed, 21 insertions(+), 40 deletions(-)
> 
> diff --git a/arch/arm/xen/mm.c b/arch/arm/xen/mm.c
> index 90574d89d0d4..2fde161733b0 100644
> --- a/arch/arm/xen/mm.c
> +++ b/arch/arm/xen/mm.c
> @@ -35,64 +35,45 @@ unsigned long xen_get_swiotlb_free_pages(unsigned int 
> order)
>   return __get_free_pages(flags, order);
>  }
>  
> -enum dma_cache_op {
> -   DMA_UNMAP,
> -   DMA_MAP,
> -};
>  static bool hypercall_cflush = false;
>  
> -/* functions called by SWIOTLB */
> -
> -static void dma_cache_maint(dma_addr_t handle, unsigned long offset,
> - size_t size, enum dma_data_direction dir, enum dma_cache_op op)
> +/* buffers in highmem or foreign pages cannot cross page boundaries */
> +static void dma_cache_maint(dma_addr_t handle, size_t size, u32 op)
>  {
>   struct gnttab_cache_flush cflush;
> - unsigned long xen_pfn;
> - size_t left = size;
>  
> - xen_pfn = (handle >> XEN_PAGE_SHIFT) + offset / XEN_PAGE_SIZE;
> - offset %= XEN_PAGE_SIZE;
> + cflush.a.dev_bus_addr = handle & XEN_PAGE_MASK;
> + cflush.offset = xen_offset_in_page(handle);
> + cflush.op = op;
>  
>   do {
> - size_t len = left;
> - 
> - /* buffers in highmem or foreign pages cannot cross page
> -  * boundaries */
> - if (len + offset > XEN_PAGE_SIZE)
> - len = XEN_PAGE_SIZE - offset;
> -
> - cflush.op = 0;
> - cflush.a.dev_bus_addr = xen_pfn << XEN_PAGE_SHIFT;
> - cflush.offset = offset;
> - cflush.length = len;
> -
> - if (op == DMA_UNMAP && dir != DMA_TO_DEVICE)
> - cflush.op = GNTTAB_CACHE_INVAL;
> - if (op == DMA_MAP) {
> - if (dir == DMA_FROM_DEVICE)
> - cflush.op = GNTTAB_CACHE_INVAL;
> - else
> - cflush.op = GNTTAB_CACHE_CLEAN;
> - }
> - if (cflush.op)
> - HYPERVISOR_grant_table_op(GNTTABOP_cache_flush, 
> , 1);
> + if (size + cflush.offset > XEN_PAGE_SIZE)
> + cflush.length = XEN_PAGE_SIZE - cflush.offset;
> + else
> + cflush.length = size;
> +
> + HYPERVISOR_grant_table_op(GNTTABOP_cache_flush, , 1);
>  
> - offset = 0;
> - xen_pfn++;
> - left -= len;
> - } while (left);
> + cflush.offset = 0;
> + cflush.a.dev_bus_addr += cflush.length;
> + size -= cflush.length;

Yes that's better

Reviewed-by: Stefano Stabellini 


> + } while (size);
>  }
>  
>  static void __xen_dma_page_dev_to_cpu(struct device *hwdev, dma_addr_t 
> handle,
>   size_t size, enum dma_data_direction dir)
>  {
> - dma_cache_maint(handle & PAGE_MASK, handle & ~PAGE_MASK, size, dir, 
> DMA_UNMAP);
> + if (dir != DMA_TO_DEVICE)
> + dma_cache_maint(handle, size, GNTTAB_CACHE_INVAL);
>  }
>  
>  static void __xen_dma_page_cpu_to_dev(struct device *hwdev, dma_addr_t 
> handle,
>   size_t size, enum dma_data_direction dir)
>  {
> - dma_cache_maint(handle & PAGE_MASK, handle & ~PAGE_MASK, size, dir, 
> DMA_MAP);
> + if (dir == DMA_FROM_DEVICE)
> + dma_cache_maint(handle, size, GNTTAB_CACHE_INVAL);
> + else
> + dma_cache_maint(handle, size, GNTTAB_CACHE_CLEAN);
>  }
>  
>  void __xen_dma_map_page(struct device *hwdev, struct page *page,
> -- 
> 2.20.1
> 

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2 08/12] livepatch: Add support for inline asm hotpatching expectations

2019-08-29 Thread Konrad Rzeszutek Wilk

> Ah, I forgot about endianness of course...
> I am sending an improved patch. I hope it works this time as expected.

Works nicely! (Tested only on ARM64, hadn't tried ARM32 yet).

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 07/11] swiotlb-xen: use the same foreign page check everywhere

2019-08-29 Thread Stefano Stabellini

On Mon, 26 Aug 2019, Christoph Hellwig wrote:
> xen_dma_map_page uses a different and more complicated check for foreign
> pages than the other three cache maintainance helpers.  Switch it to the
> simpler pfn_valid method a well, and document the scheme with a single
> improved comment in xen_dma_map_page.
> 
> Signed-off-by: Christoph Hellwig 

Reviewed-by: Stefano Stabellini 


> ---
>  include/xen/arm/page-coherent.h | 31 +--
>  1 file changed, 9 insertions(+), 22 deletions(-)
> 
> diff --git a/include/xen/arm/page-coherent.h b/include/xen/arm/page-coherent.h
> index 0e244f4fec1a..07c104dbc21f 100644
> --- a/include/xen/arm/page-coherent.h
> +++ b/include/xen/arm/page-coherent.h
> @@ -41,23 +41,17 @@ static inline void xen_dma_map_page(struct device *hwdev, 
> struct page *page,
>dma_addr_t dev_addr, unsigned long offset, size_t size,
>enum dma_data_direction dir, unsigned long attrs)
>  {
> - unsigned long page_pfn = page_to_xen_pfn(page);
> - unsigned long dev_pfn = XEN_PFN_DOWN(dev_addr);
> - unsigned long compound_pages =
> - (1< - bool local = (page_pfn <= dev_pfn) &&
> - (dev_pfn - page_pfn < compound_pages);
> + unsigned long pfn = PFN_DOWN(dev_addr);
>  
>   /*
> -  * Dom0 is mapped 1:1, while the Linux page can span across
> -  * multiple Xen pages, it's not possible for it to contain a
> -  * mix of local and foreign Xen pages. So if the first xen_pfn
> -  * == mfn the page is local otherwise it's a foreign page
> -  * grant-mapped in dom0. If the page is local we can safely
> -  * call the native dma_ops function, otherwise we call the xen
> -  * specific function.
> +  * Dom0 is mapped 1:1, and while the Linux page can span across multiple
> +  * Xen pages, it is not possible for it to contain a mix of local and
> +  * foreign Xen pages.  Calling pfn_valid on a foreign mfn will always
> +  * return false, so if pfn_valid returns true the pages is local and we
> +  * can use the native dma-direct functions, otherwise we call the Xen
> +  * specific version.
>*/
> - if (local)
> + if (pfn_valid(pfn))
>   dma_direct_map_page(hwdev, page, offset, size, dir, attrs);
>   else
>   __xen_dma_map_page(hwdev, page, dev_addr, offset, size, dir, 
> attrs);
> @@ -67,14 +61,7 @@ static inline void xen_dma_unmap_page(struct device 
> *hwdev, dma_addr_t handle,
>   size_t size, enum dma_data_direction dir, unsigned long attrs)
>  {
>   unsigned long pfn = PFN_DOWN(handle);
> - /*
> -  * Dom0 is mapped 1:1, while the Linux page can be spanned accross
> -  * multiple Xen page, it's not possible to have a mix of local and
> -  * foreign Xen page. Dom0 is mapped 1:1, so calling pfn_valid on a
> -  * foreign mfn will always return false. If the page is local we can
> -  * safely call the native dma_ops function, otherwise we call the xen
> -  * specific function.
> -  */
> +
>   if (pfn_valid(pfn))
>   dma_direct_unmap_page(hwdev, handle, size, dir, attrs);
>   else
> -- 
> 2.20.1
> 

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 09/11] swiotlb-xen: remove page-coherent.h

2019-08-29 Thread Stefano Stabellini

On Mon, 26 Aug 2019, Christoph Hellwig wrote:
> The only thing left of page-coherent.h is two functions implemented by
> the architecture for non-coherent DMA support that are never called for
> fully coherent architectures.  Just move the prototypes for those to
> swiotlb-xen.h instead.
> 
> Signed-off-by: Christoph Hellwig 

Reviewed-by: Stefano Stabellini 


> ---
>  arch/arm/include/asm/xen/page-coherent.h   |  2 --
>  arch/arm64/include/asm/xen/page-coherent.h |  2 --
>  arch/x86/include/asm/xen/page-coherent.h   | 11 ---
>  drivers/xen/swiotlb-xen.c  |  3 ---
>  include/Kbuild |  1 -
>  include/xen/arm/page-coherent.h| 10 --
>  include/xen/swiotlb-xen.h  |  6 ++
>  7 files changed, 6 insertions(+), 29 deletions(-)
>  delete mode 100644 arch/arm/include/asm/xen/page-coherent.h
>  delete mode 100644 arch/arm64/include/asm/xen/page-coherent.h
>  delete mode 100644 arch/x86/include/asm/xen/page-coherent.h
>  delete mode 100644 include/xen/arm/page-coherent.h
> 
> diff --git a/arch/arm/include/asm/xen/page-coherent.h 
> b/arch/arm/include/asm/xen/page-coherent.h
> deleted file mode 100644
> index 27e984977402..
> --- a/arch/arm/include/asm/xen/page-coherent.h
> +++ /dev/null
> @@ -1,2 +0,0 @@
> -/* SPDX-License-Identifier: GPL-2.0 */
> -#include 
> diff --git a/arch/arm64/include/asm/xen/page-coherent.h 
> b/arch/arm64/include/asm/xen/page-coherent.h
> deleted file mode 100644
> index 27e984977402..
> --- a/arch/arm64/include/asm/xen/page-coherent.h
> +++ /dev/null
> @@ -1,2 +0,0 @@
> -/* SPDX-License-Identifier: GPL-2.0 */
> -#include 
> diff --git a/arch/x86/include/asm/xen/page-coherent.h 
> b/arch/x86/include/asm/xen/page-coherent.h
> deleted file mode 100644
> index c9c8398a31ff..
> --- a/arch/x86/include/asm/xen/page-coherent.h
> +++ /dev/null
> @@ -1,11 +0,0 @@
> -/* SPDX-License-Identifier: GPL-2.0 */
> -#ifndef _ASM_X86_XEN_PAGE_COHERENT_H
> -#define _ASM_X86_XEN_PAGE_COHERENT_H
> -
> -static inline void xen_dma_sync_single_for_cpu(struct device *hwdev,
> - dma_addr_t handle, size_t size, enum dma_data_direction dir) { }
> -
> -static inline void xen_dma_sync_single_for_device(struct device *hwdev,
> - dma_addr_t handle, size_t size, enum dma_data_direction dir) { }
> -
> -#endif /* _ASM_X86_XEN_PAGE_COHERENT_H */
> diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
> index a642e284f1e2..95911ff9c11c 100644
> --- a/drivers/xen/swiotlb-xen.c
> +++ b/drivers/xen/swiotlb-xen.c
> @@ -35,9 +35,6 @@
>  #include 
>  #include 
>  
> -#include 
> -#include 
> -
>  #include 
>  /*
>   * Used to do a quick range check in swiotlb_tbl_unmap_single and
> diff --git a/include/Kbuild b/include/Kbuild
> index c38f0d46b267..cce5cf6abf89 100644
> --- a/include/Kbuild
> +++ b/include/Kbuild
> @@ -1189,7 +1189,6 @@ header-test-+= video/vga.h
>  header-test- += video/w100fb.h
>  header-test- += xen/acpi.h
>  header-test- += xen/arm/hypercall.h
> -header-test- += xen/arm/page-coherent.h
>  header-test- += xen/arm/page.h
>  header-test- += xen/balloon.h
>  header-test- += xen/events.h
> diff --git a/include/xen/arm/page-coherent.h b/include/xen/arm/page-coherent.h
> deleted file mode 100644
> index 635492d41ebe..
> --- a/include/xen/arm/page-coherent.h
> +++ /dev/null
> @@ -1,10 +0,0 @@
> -/* SPDX-License-Identifier: GPL-2.0 */
> -#ifndef _XEN_ARM_PAGE_COHERENT_H
> -#define _XEN_ARM_PAGE_COHERENT_H
> -
> -void xen_dma_sync_for_cpu(struct device *dev, dma_addr_t handle,
> - phys_addr_t paddr, size_t size, enum dma_data_direction dir);
> -void xen_dma_sync_for_device(struct device *dev, dma_addr_t handle,
> - phys_addr_t paddr, size_t size, enum dma_data_direction dir);
> -
> -#endif /* _XEN_ARM_PAGE_COHERENT_H */
> diff --git a/include/xen/swiotlb-xen.h b/include/xen/swiotlb-xen.h
> index 5e4b83f83dbc..a7c642872568 100644
> --- a/include/xen/swiotlb-xen.h
> +++ b/include/xen/swiotlb-xen.h
> @@ -2,8 +2,14 @@
>  #ifndef __LINUX_SWIOTLB_XEN_H
>  #define __LINUX_SWIOTLB_XEN_H
>  
> +#include 
>  #include 
>  
> +void xen_dma_sync_for_cpu(struct device *dev, dma_addr_t handle,
> + phys_addr_t paddr, size_t size, enum dma_data_direction dir);
> +void xen_dma_sync_for_device(struct device *dev, dma_addr_t handle,
> + phys_addr_t paddr, size_t size, enum dma_data_direction dir);
> +
>  extern int xen_swiotlb_init(int verbose, bool early);
>  extern const struct dma_map_ops xen_swiotlb_dma_ops;
>  
> -- 
> 2.20.1
> 

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 08/11] swiotlb-xen: simplify cache maintainance

2019-08-29 Thread Stefano Stabellini

On Mon, 26 Aug 2019, Christoph Hellwig wrote:
> Now that we know we always have the dma-noncoherent.h helpers available
> if we are on an architecture with support for non-coherent devices,
> we can just call them directly, and remove the calls to the dma-direct
> routines, including the fact that we call the dma_direct_map_page
> routines but ignore the value returned from it.  Instead we now have
> Xen wrappers for the arch_sync_dma_for_{device,cpu} helpers that call
> the special Xen versions of those routines for foreign pages.
> 
> Note that the new helpers get the physical address passed in addition
> to the dma address to avoid another translation for the local cache
> maintainance.  The pfn_valid checks remain on the dma address as in
> the old code, even if that looks a little funny.
> 
> Signed-off-by: Christoph Hellwig 
>
> ---
>  arch/arm/xen/mm.c| 64 ++
>  arch/x86/include/asm/xen/page-coherent.h | 11 
>  drivers/xen/swiotlb-xen.c| 20 +++
>  include/xen/arm/page-coherent.h  | 69 ++--
>  4 files changed, 31 insertions(+), 133 deletions(-)

WOW nice! Now I really can see why this series was worth doing :-)

Reviewed-by: Stefano Stabellini 




> diff --git a/arch/arm/xen/mm.c b/arch/arm/xen/mm.c
> index b7d53415532b..7096652f5a1e 100644
> --- a/arch/arm/xen/mm.c
> +++ b/arch/arm/xen/mm.c
> @@ -61,63 +61,33 @@ static void dma_cache_maint(dma_addr_t handle, size_t 
> size, u32 op)
>   } while (size);
>  }
>  
> -static void __xen_dma_page_dev_to_cpu(struct device *hwdev, dma_addr_t 
> handle,
> - size_t size, enum dma_data_direction dir)
> +/*
> + * Dom0 is mapped 1:1, and while the Linux page can span across multiple Xen
> + * pages, it is not possible for it to contain a mix of local and foreign Xen
> + * pages.  Calling pfn_valid on a foreign mfn will always return false, so if
> + * pfn_valid returns true the pages is local and we can use the native
> + * dma-direct functions, otherwise we call the Xen specific version.
> + */
> +void xen_dma_sync_for_cpu(struct device *dev, dma_addr_t handle,
> + phys_addr_t paddr, size_t size, enum dma_data_direction dir)
>  {
> - if (dir != DMA_TO_DEVICE)
> + if (pfn_valid(PFN_DOWN(handle)))
> + arch_sync_dma_for_cpu(dev, paddr, size, dir);
> + else if (dir != DMA_TO_DEVICE)
>   dma_cache_maint(handle, size, GNTTAB_CACHE_INVAL);
>  }
>  
> -static void __xen_dma_page_cpu_to_dev(struct device *hwdev, dma_addr_t 
> handle,
> - size_t size, enum dma_data_direction dir)
> +void xen_dma_sync_for_device(struct device *dev, dma_addr_t handle,
> + phys_addr_t paddr, size_t size, enum dma_data_direction dir)
>  {
> - if (dir == DMA_FROM_DEVICE)
> + if (pfn_valid(PFN_DOWN(handle)))
> + arch_sync_dma_for_device(dev, paddr, size, dir);
> + else if (dir == DMA_FROM_DEVICE)
>   dma_cache_maint(handle, size, GNTTAB_CACHE_INVAL);
>   else
>   dma_cache_maint(handle, size, GNTTAB_CACHE_CLEAN);
>  }
>  
> -void __xen_dma_map_page(struct device *hwdev, struct page *page,
> -  dma_addr_t dev_addr, unsigned long offset, size_t size,
> -  enum dma_data_direction dir, unsigned long attrs)
> -{
> - if (dev_is_dma_coherent(hwdev))
> - return;
> - if (attrs & DMA_ATTR_SKIP_CPU_SYNC)
> - return;
> -
> - __xen_dma_page_cpu_to_dev(hwdev, dev_addr, size, dir);
> -}
> -
> -void __xen_dma_unmap_page(struct device *hwdev, dma_addr_t handle,
> - size_t size, enum dma_data_direction dir,
> - unsigned long attrs)
> -
> -{
> - if (dev_is_dma_coherent(hwdev))
> - return;
> - if (attrs & DMA_ATTR_SKIP_CPU_SYNC)
> - return;
> -
> - __xen_dma_page_dev_to_cpu(hwdev, handle, size, dir);
> -}
> -
> -void __xen_dma_sync_single_for_cpu(struct device *hwdev,
> - dma_addr_t handle, size_t size, enum dma_data_direction dir)
> -{
> - if (dev_is_dma_coherent(hwdev))
> - return;
> - __xen_dma_page_dev_to_cpu(hwdev, handle, size, dir);
> -}
> -
> -void __xen_dma_sync_single_for_device(struct device *hwdev,
> - dma_addr_t handle, size_t size, enum dma_data_direction dir)
> -{
> - if (dev_is_dma_coherent(hwdev))
> - return;
> - __xen_dma_page_cpu_to_dev(hwdev, handle, size, dir);
> -}
> -
>  bool xen_arch_need_swiotlb(struct device *dev,
>  phys_addr_t phys,
>  dma_addr_t dev_addr)
> diff --git a/arch/x86/include/asm/xen/page-coherent.h 
> b/arch/x86/include/asm/xen/page-coherent.h
> index 8ee33c5edded..c9c8398a31ff 100644
> --- a/arch/x86/include/asm/xen/page-coherent.h
> +++ b/arch/x86/include/asm/xen/page-coherent.h
> @@ -2,17 +2,6 @@
>  #ifndef _ASM_X86_XEN_PAGE_COHERENT_H
>  #define _ASM_X86_XEN_PAGE_COHERENT_H
>  
> -#include 
> -#include 
> -
> -static

Re: [Xen-devel] [PATCH 10/11] swiotlb-xen: merge xen_unmap_single into xen_swiotlb_unmap_page

2019-08-29 Thread Stefano Stabellini

On Mon, 26 Aug 2019, Christoph Hellwig wrote:
> No need for a no-op wrapper.
> 
> Signed-off-by: Christoph Hellwig 

Reviewed-by: Stefano Stabellini 

> ---
>  drivers/xen/swiotlb-xen.c | 15 ---
>  1 file changed, 4 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
> index 95911ff9c11c..384304a77020 100644
> --- a/drivers/xen/swiotlb-xen.c
> +++ b/drivers/xen/swiotlb-xen.c
> @@ -414,9 +414,8 @@ static dma_addr_t xen_swiotlb_map_page(struct device 
> *dev, struct page *page,
>   * After this call, reads by the cpu to the buffer are guaranteed to see
>   * whatever the device wrote there.
>   */
> -static void xen_unmap_single(struct device *hwdev, dma_addr_t dev_addr,
> -  size_t size, enum dma_data_direction dir,
> -  unsigned long attrs)
> +static void xen_swiotlb_unmap_page(struct device *hwdev, dma_addr_t dev_addr,
> + size_t size, enum dma_data_direction dir, unsigned long attrs)
>  {
>   phys_addr_t paddr = xen_bus_to_phys(dev_addr);
>  
> @@ -430,13 +429,6 @@ static void xen_unmap_single(struct device *hwdev, 
> dma_addr_t dev_addr,
>   swiotlb_tbl_unmap_single(hwdev, paddr, size, dir, attrs);
>  }
>  
> -static void xen_swiotlb_unmap_page(struct device *hwdev, dma_addr_t dev_addr,
> - size_t size, enum dma_data_direction dir,
> - unsigned long attrs)
> -{
> - xen_unmap_single(hwdev, dev_addr, size, dir, attrs);
> -}
> -
>  static void
>  xen_swiotlb_sync_single_for_cpu(struct device *dev, dma_addr_t dma_addr,
>   size_t size, enum dma_data_direction dir)
> @@ -477,7 +469,8 @@ xen_swiotlb_unmap_sg(struct device *hwdev, struct 
> scatterlist *sgl, int nelems,
>   BUG_ON(dir == DMA_NONE);
>  
>   for_each_sg(sgl, sg, nelems, i)
> - xen_unmap_single(hwdev, sg->dma_address, sg_dma_len(sg), dir, 
> attrs);
> + xen_swiotlb_unmap_page(hwdev, sg->dma_address, sg_dma_len(sg),
> + dir, attrs);
>  
>  }
>  
> -- 
> 2.20.1
> 

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 11/11] arm64: use asm-generic/dma-mapping.h

2019-08-29 Thread Stefano Stabellini

On Mon, 26 Aug 2019, Christoph Hellwig wrote:
> Now that the Xen special cases are gone nothing worth mentioning is
> left in the arm64  file, so switch to use the
> asm-generic version instead.
> 
> Signed-off-by: Christoph Hellwig 
> Acked-by: Will Deacon 

Reviewed-by: Stefano Stabellini 


> ---
>  arch/arm64/include/asm/Kbuild|  1 +
>  arch/arm64/include/asm/dma-mapping.h | 22 --
>  arch/arm64/mm/dma-mapping.c  |  1 +
>  3 files changed, 2 insertions(+), 22 deletions(-)
>  delete mode 100644 arch/arm64/include/asm/dma-mapping.h
> 
> diff --git a/arch/arm64/include/asm/Kbuild b/arch/arm64/include/asm/Kbuild
> index c52e151afab0..98a5405c8558 100644
> --- a/arch/arm64/include/asm/Kbuild
> +++ b/arch/arm64/include/asm/Kbuild
> @@ -4,6 +4,7 @@ generic-y += delay.h
>  generic-y += div64.h
>  generic-y += dma.h
>  generic-y += dma-contiguous.h
> +generic-y += dma-mapping.h
>  generic-y += early_ioremap.h
>  generic-y += emergency-restart.h
>  generic-y += hw_irq.h
> diff --git a/arch/arm64/include/asm/dma-mapping.h 
> b/arch/arm64/include/asm/dma-mapping.h
> deleted file mode 100644
> index 67243255a858..
> --- a/arch/arm64/include/asm/dma-mapping.h
> +++ /dev/null
> @@ -1,22 +0,0 @@
> -/* SPDX-License-Identifier: GPL-2.0-only */
> -/*
> - * Copyright (C) 2012 ARM Ltd.
> - */
> -#ifndef __ASM_DMA_MAPPING_H
> -#define __ASM_DMA_MAPPING_H
> -
> -#ifdef __KERNEL__
> -
> -#include 
> -#include 
> -
> -#include 
> -#include 
> -
> -static inline const struct dma_map_ops *get_arch_dma_ops(struct bus_type 
> *bus)
> -{
> - return NULL;
> -}
> -
> -#endif   /* __KERNEL__ */
> -#endif   /* __ASM_DMA_MAPPING_H */
> diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
> index 4b244a037349..6578abcfbbc7 100644
> --- a/arch/arm64/mm/dma-mapping.c
> +++ b/arch/arm64/mm/dma-mapping.c
> @@ -8,6 +8,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  
>  #include 
> -- 
> 2.20.1
> 

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [ovmf test] 140769: regressions - FAIL

2019-08-29 Thread osstest service owner

flight 140769 ovmf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/140769/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-xl-qemuu-ovmf-amd64 18 guest-start/debianhvm.repeat fail 
REGR. vs. 140559

version targeted for testing:
 ovmf 30b4abc6e9141356639207d1cc280058a0229a36
baseline version:
 ovmf abc0155b034230128ad4aaa51ac05a315acfa7c1

Last test of basis   140559  2019-08-23 04:52:13 Z6 days
Testing same since   140684  2019-08-26 22:09:16 Z2 days3 attempts


People who touched revisions under test:
  Michael D Kinney 
  Tim Lewis 

jobs:
 build-amd64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64 fail
 test-amd64-i386-xl-qemuu-ovmf-amd64  pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Not pushing.


commit 30b4abc6e9141356639207d1cc280058a0229a36
Author: Michael D Kinney 
Date:   Thu Aug 22 10:36:10 2019 +0800

EmulatorPkg/Win/Host: Fix SecPrint() log line endings

Update use of SecPrint() to consistently use \n\r for
line endings to fix formatting issues in the debug log.

Cc: Jordan Justen 
Reviewed-by: Ray Ni 
Cc: Andrew Fish 
Tested-by: Tim Lewis 
Signed-off-by: Michael D Kinney 

commit 10ccc27c9548ab2550689e68f6f45dc48120bc79
Author: Michael D Kinney 
Date:   Thu Aug 22 10:36:09 2019 +0800

EmulatorPkg/Win/Host: Fix image unload regression

https://bugzilla.tianocore.org/show_bug.cgi?id=2104

When UEFI Applications or UEFI Drivers are unloaded,
the PeCoffLoaderUnloadImageExtraAction() needs to unload
the image using FreeLibrary() if the image was successfully
loaded using LoadLibrrayEx().

This is a regression from the Nt32Pkg that supported
unloading applications and drivers as well as loading
the same application or driver multiple times.

Cc: Jordan Justen 
Reviewed-by: Ray Ni 
Cc: Andrew Fish 
Tested-by: Tim Lewis 
Signed-off-by: Michael D Kinney 

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2 11/12] livepatch: Add metadata runtime retrieval mechanism

2019-08-29 Thread Konrad Rzeszutek Wilk

On Tue, Aug 27, 2019 at 08:46:23AM +, Pawel Wieczorkiewicz wrote:
> Extend the livepatch list operation to fetch also payloads' metadata.
> This is achieved by extending the sysctl list interface with 2 extra
> guest handles:
> * metadata - an array of arbitrary size strings
> * metadata_len - an array of metadata strings' lengths (uin32_t each)
> 
> Payloads' metadata is a string of arbitrary size and does not have an
> upper bound limit. It may also vary in size between payloads.
> 
> In order to let the userland allocate enough space for the incoming
> data add a metadata total size field to the list sysctl operation and
> fill it with total size of all payloads' metadata.
> 
> Extend the libxc to handle the metadata back-to-back data transfers
> as well as metadata length array data transfers.
> 
> The xen-livepatch userland tool is extended to always display the
> metadata for each received module. The metadata is received with the
> following format: key=value\0key=value\0...key=value\0. The format is
> modified to the following one: key=value;key=value;...key=value.
> The new format allows to easily parse the metadata for a given module
> by a machine.
> 
> Signed-off-by: Pawel Wieczorkiewicz 
> Reviewed-by: Andra-Irina Paraschiv 
> Reviewed-by: Martin Pohlack 
> Reviewed-by: Norbert Manthey 
> ---
> Changed since v1:
>   * added corresponding documentation
>   * make metadata optional (do not display it when given payload
> does not have it)
> 
>  docs/misc/livepatch.pandoc| 31 -
>  tools/libxc/include/xenctrl.h | 22 +++
>  tools/libxc/xc_misc.c | 64 
> +++
>  tools/misc/xen-livepatch.c| 43 +++--
>  xen/common/livepatch.c| 22 +++
>  xen/include/public/sysctl.h   | 19 +
>  6 files changed, 157 insertions(+), 44 deletions(-)

And it may be worth folding this in:

diff --git a/xen/test/livepatch/Makefile b/xen/test/livepatch/Makefile
index a7857d3a2e..3f088e74b2 100644
--- a/xen/test/livepatch/Makefile
+++ b/xen/test/livepatch/Makefile
@@ -79,9 +79,17 @@ config.h: xen_hello_world_func.o
 xen_hello_world.o: config.h
 
 .PHONY: $(LIVEPATCH)
-$(LIVEPATCH): xen_hello_world_func.o xen_hello_world.o note.o xen_note.o
+$(LIVEPATCH): xen_hello_world_func.o xen_hello_world.o note.o xen_note.o 
modinfo.o
$(LD) $(LDFLAGS) $(build_id_linker) -r -o $(LIVEPATCH) $^
 
+.PHONY: modinfo.o
+modinfo.o:
+   (set -e; \
+printf "LIVEPATCH_RULEZ\0") > $@.bin
+   $(OBJCOPY) $(OBJCOPY_MAGIC) \
+  
--rename-section=.data=.modinfo,alloc,load,readonly,data,contents -S $@.bin $@
+   #rm -f $@.bin
+
 #
 # This target is only accessible if CONFIG_LIVEPATCH is defined, which
 # depends on $(build_id_linker) being available. Hence we do not

.. which the test-case can test for.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

1 2 >

1 - 100 of 125 matches

Mail list logo