[Xen-devel] [linux-4.9 test] 122824: FAIL

2018-05-17 Thread osstest service owner
flight 122824 linux-4.9 real [real]
http://logs.test-lab.xenproject.org/osstest/logs/122824/

Failures and problems with tests :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-arm64-arm64-libvirt-xsm broken  in 122722
 test-arm64-arm64-xl-credit2  broken  in 122722
 test-arm64-arm64-xl  broken  in 122722

Tests which are failing intermittently (not blocking):
 test-arm64-arm64-libvirt-xsm 4 host-install(4) broken in 122722 pass in 122824
 test-arm64-arm64-xl  4 host-install(4) broken in 122722 pass in 122824
 test-arm64-arm64-xl-credit2  4 host-install(4) broken in 122722 pass in 122824
 test-arm64-arm64-examine  5 host-install   broken in 122722 pass in 122824
 test-amd64-i386-xl-qemuu-ws16-amd64 15 guest-saverestore.2 fail in 122722 pass 
in 122824
 test-armhf-armhf-xl-rtds 16 guest-start/debian.repeat fail in 122722 pass in 
122824
 test-amd64-amd64-xl-qemut-debianhvm-amd64 16 guest-localmigrate/x10 fail pass 
in 122722

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemut-win7-amd64 17 guest-stopfail like 122564
 test-amd64-i386-xl-qemut-win7-amd64 17 guest-stop fail like 122564
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stopfail like 122564
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop fail like 122564
 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stopfail like 122564
 test-amd64-amd64-xl-pvhv2-intel 12 guest-start fail never pass
 test-amd64-amd64-xl-pvhv2-amd 12 guest-start  fail  never pass
 test-amd64-i386-xl-pvshim12 guest-start  fail   never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  14 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-armhf-armhf-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-xsm 14 saverestore-support-checkfail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-armhf-armhf-xl-arndale  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  14 saverestore-support-checkfail   never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2  fail never pass
 test-armhf-armhf-xl-multivcpu 13 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 14 saverestore-support-checkfail  never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt 13 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt 14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 13 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 14 saverestore-support-checkfail never pass
 test-armhf-armhf-xl-rtds 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-xsm  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-xsm  14 saverestore-support-checkfail   never pass
 test-amd64-amd64-xl-qemut-ws16-amd64 17 guest-stop fail never pass
 test-amd64-i386-xl-qemut-ws16-amd64 17 guest-stop  fail never pass
 test-amd64-i386-xl-qemuu-ws16-amd64 17 guest-stop  fail never pass
 test-armhf-armhf-libvirt-raw 12 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  12 migrate-support-checkfail   never pass
 test-armhf-arm

Re: [Xen-devel] [PATCHv3] xen: Add EFI_LOAD_OPTION support

2018-05-17 Thread Jan Beulich
>>> On 07.02.18 at 17:00,  wrote:
> This patch as-is correctly tells the two possible formats apart. I
> tested and Xen boots correctly both from the Shell and from the
> firmware boot menu. I would not like to start addressing hypothetical
> scenarios that I have no reasonable way to test against. If you are
> inclined to do that, that's your call but I'll just leave this patch
> here for now and I hope you would consider merging it.

Would you mind giving the tentative v4 (below) a try?

Jan

EFI: add EFI_LOAD_OPTION support

When booting Xen via UEFI the Xen config file can contain multiple
sections each describing different boot options. It is currently only
possible to choose which section to boot with if the buffer contains a
string. UEFI provides a different standard to pass optional arguments
to an application, and in this patch we make Xen properly parse this
buffer, thus making it possible to have separate EFI boot options
present for the different config sections.

Signed-off-by: Tamas K Lengyel 
Signed-off-by: Jan Beulich 
---
v4: Address my own review comments.

--- unstable.orig/xen/common/efi/boot.c
+++ unstable/xen/common/efi/boot.c
@@ -88,6 +88,14 @@ typedef struct _EFI_APPLE_PROPERTIES {
 EFI_APPLE_PROPERTIES_GETALL GetAll;
 } EFI_APPLE_PROPERTIES;
 
+typedef struct _EFI_LOAD_OPTION {
+UINT32 Attributes;
+UINT16 FilePathListLength;
+CHAR16 Description[];
+} EFI_LOAD_OPTION;
+
+#define LOAD_OPTION_ACTIVE  0x0001
+
 union string {
 CHAR16 *w;
 char *s;
@@ -275,6 +283,16 @@ static int __init wstrncmp(const CHAR16
 return n ? *s1 - *s2 : 0;
 }
 
+static const CHAR16 *__init wmemchr(const CHAR16 *s, CHAR16 c, UINTN n)
+{
+while ( n && *s != c )
+{
+--n;
+++s;
+}
+return n ? s : NULL;
+}
+
 static CHAR16 *__init s2w(union string *str)
 {
 const char *s = str->s;
@@ -374,14 +392,49 @@ static void __init PrintErrMesg(const CH
 }
 
 static unsigned int __init get_argv(unsigned int argc, CHAR16 **argv,
-CHAR16 *cmdline, UINTN cmdsize,
+VOID *data, UINTN size, UINTN *offset,
 CHAR16 **options)
 {
-CHAR16 *ptr = (CHAR16 *)(argv + argc + 1), *prev = NULL;
+CHAR16 *ptr = (CHAR16 *)(argv + argc + 1), *prev = NULL, *cmdline = NULL;
 bool prev_sep = true;
 
-for ( ; cmdsize > sizeof(*cmdline) && *cmdline;
-cmdsize -= sizeof(*cmdline), ++cmdline )
+if ( *offset < size )
+cmdline = data + *offset;
+else if ( size > sizeof(*cmdline) && !(size % sizeof(*cmdline)) &&
+  (wmemchr(data, 0, size / sizeof(*cmdline)) ==
+   data + size - sizeof(*cmdline)) )
+{
+*offset = 0;
+cmdline = data;
+}
+else if ( size > sizeof(EFI_LOAD_OPTION) )
+{
+const EFI_LOAD_OPTION *elo = data;
+/* The minimum size the buffer needs to be. */
+size_t elo_min = offsetof(EFI_LOAD_OPTION, Description[1]) +
+ elo->FilePathListLength;
+
+if ( (elo->Attributes & LOAD_OPTION_ACTIVE) && size > elo_min &&
+ !((size - elo_min) % sizeof(*cmdline)) )
+{
+const CHAR16 *desc = elo->Description;
+const CHAR16 *end = wmemchr(desc, 0,
+(size - elo_min) / sizeof(*desc) + 1);
+
+if ( end )
+{
+*offset = elo_min + (end - desc) * sizeof(*desc);
+if ( (size -= *offset) > sizeof(*cmdline) )
+cmdline = data + *offset;
+}
+}
+}
+
+if ( !cmdline )
+return 0;
+
+for ( ; size > sizeof(*cmdline) && *cmdline;
+size -= sizeof(*cmdline), ++cmdline )
 {
 bool cur_sep = *cmdline == L' ' || *cmdline == L'\t';
 
@@ -1095,15 +1148,17 @@ efi_start(EFI_HANDLE ImageHandle, EFI_SY
 
 if ( use_cfg_file )
 {
+UINTN offset = ~(UINTN)0;
+
 argc = get_argv(0, NULL, loaded_image->LoadOptions,
-loaded_image->LoadOptionsSize, NULL);
+loaded_image->LoadOptionsSize, &offset, NULL);
 if ( argc > 0 &&
  efi_bs->AllocatePool(EfiLoaderData,
   (argc + 1) * sizeof(*argv) +
   loaded_image->LoadOptionsSize,
   (void **)&argv) == EFI_SUCCESS )
 get_argv(argc, argv, loaded_image->LoadOptions,
- loaded_image->LoadOptionsSize, &options);
+ loaded_image->LoadOptionsSize, &offset, &options);
 else
 argc = 0;
 for ( i = 1; i < argc; ++i )



___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [RFC 1/3] xen/balloon: Allow allocating DMA buffers

2018-05-17 Thread Oleksandr Andrushchenko
From: Oleksandr Andrushchenko 

Signed-off-by: Oleksandr Andrushchenko 
---
 drivers/xen/balloon.c | 214 +++---
 drivers/xen/xen-balloon.c |   2 +
 include/xen/balloon.h |  11 +-
 3 files changed, 188 insertions(+), 39 deletions(-)

diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
index e4db19e88ab1..e3a145aa9f29 100644
--- a/drivers/xen/balloon.c
+++ b/drivers/xen/balloon.c
@@ -415,8 +415,10 @@ static bool balloon_is_inflated(void)
return balloon_stats.balloon_low || balloon_stats.balloon_high;
 }
 
-static enum bp_state increase_reservation(unsigned long nr_pages)
+static enum bp_state increase_reservation(unsigned long nr_pages,
+ struct page **ext_pages)
 {
+   enum bp_state ret = BP_DONE;
int rc;
unsigned long i;
struct page   *page;
@@ -425,32 +427,49 @@ static enum bp_state increase_reservation(unsigned long 
nr_pages)
.extent_order = EXTENT_ORDER,
.domid= DOMID_SELF
};
+   xen_pfn_t *frames;
 
-   if (nr_pages > ARRAY_SIZE(frame_list))
-   nr_pages = ARRAY_SIZE(frame_list);
+   if (nr_pages > ARRAY_SIZE(frame_list)) {
+   frames = kcalloc(nr_pages, sizeof(xen_pfn_t), GFP_KERNEL);
+   if (!frames)
+   return BP_ECANCELED;
+   } else {
+   frames = frame_list;
+   }
 
-   page = list_first_entry_or_null(&ballooned_pages, struct page, lru);
-   for (i = 0; i < nr_pages; i++) {
-   if (!page) {
-   nr_pages = i;
-   break;
-   }
+   /* XENMEM_populate_physmap requires a PFN based on Xen
+* granularity.
+*/
+   if (ext_pages) {
+   for (i = 0; i < nr_pages; i++)
+   frames[i] = page_to_xen_pfn(ext_pages[i]);
+   } else {
+   page = list_first_entry_or_null(&ballooned_pages,
+   struct page, lru);
+   for (i = 0; i < nr_pages; i++) {
+   if (!page) {
+   nr_pages = i;
+   break;
+   }
 
-   /* XENMEM_populate_physmap requires a PFN based on Xen
-* granularity.
-*/
-   frame_list[i] = page_to_xen_pfn(page);
-   page = balloon_next_page(page);
+   frames[i] = page_to_xen_pfn(page);
+   page = balloon_next_page(page);
+   }
}
 
-   set_xen_guest_handle(reservation.extent_start, frame_list);
+   set_xen_guest_handle(reservation.extent_start, frames);
reservation.nr_extents = nr_pages;
rc = HYPERVISOR_memory_op(XENMEM_populate_physmap, &reservation);
-   if (rc <= 0)
-   return BP_EAGAIN;
+   if (rc <= 0) {
+   ret = BP_EAGAIN;
+   goto out;
+   }
 
for (i = 0; i < rc; i++) {
-   page = balloon_retrieve(false);
+   if (ext_pages)
+   page = ext_pages[i];
+   else
+   page = balloon_retrieve(false);
BUG_ON(page == NULL);
 
 #ifdef CONFIG_XEN_HAVE_PVMMU
@@ -463,14 +482,14 @@ static enum bp_state increase_reservation(unsigned long 
nr_pages)
if (!xen_feature(XENFEAT_auto_translated_physmap)) {
unsigned long pfn = page_to_pfn(page);
 
-   set_phys_to_machine(pfn, frame_list[i]);
+   set_phys_to_machine(pfn, frames[i]);
 
/* Link back into the page tables if not highmem. */
if (!PageHighMem(page)) {
int ret;
ret = HYPERVISOR_update_va_mapping(
(unsigned long)__va(pfn << 
PAGE_SHIFT),
-   mfn_pte(frame_list[i], 
PAGE_KERNEL),
+   mfn_pte(frames[i], PAGE_KERNEL),
0);
BUG_ON(ret);
}
@@ -478,15 +497,22 @@ static enum bp_state increase_reservation(unsigned long 
nr_pages)
 #endif
 
/* Relinquish the page back to the allocator. */
-   __free_reserved_page(page);
+   if (!ext_pages)
+   __free_reserved_page(page);
}
 
-   balloon_stats.current_pages += rc;
+   if (!ext_pages)
+   balloon_stats.current_pages += rc;
 
-   return BP_DONE;
+out:
+   if (frames != frame_list)
+   kfree(frames);
+
+   return ret;
 }
 
-static enum bp_state decrease_reservation(unsigned long nr_pages, gfp_t gfp)
+static enum bp_state decrease_reservation(unsigned long n

[Xen-devel] [RFC 2/3] xen/grant-table: Extend API to work with DMA buffers

2018-05-17 Thread Oleksandr Andrushchenko
From: Oleksandr Andrushchenko 

Signed-off-by: Oleksandr Andrushchenko 
---
 drivers/xen/grant-table.c | 49 +++
 include/xen/grant_table.h |  7 ++
 2 files changed, 56 insertions(+)

diff --git a/drivers/xen/grant-table.c b/drivers/xen/grant-table.c
index bb36b1e1dbcc..c27bcc420575 100644
--- a/drivers/xen/grant-table.c
+++ b/drivers/xen/grant-table.c
@@ -729,6 +729,55 @@ void gnttab_free_pages(int nr_pages, struct page **pages)
 }
 EXPORT_SYMBOL(gnttab_free_pages);
 
+int gnttab_dma_alloc_pages(struct device *dev, bool coherent,
+  int nr_pages, struct page **pages,
+  void **vaddr, dma_addr_t *dev_bus_addr)
+{
+   int i;
+   int ret;
+
+   ret = alloc_dma_xenballooned_pages(dev, coherent, nr_pages, pages,
+  vaddr, dev_bus_addr);
+   if (ret < 0)
+   return ret;
+
+   for (i = 0; i < nr_pages; i++) {
+#if BITS_PER_LONG < 64
+   struct xen_page_foreign *foreign;
+
+   foreign = kzalloc(sizeof(*foreign), GFP_KERNEL);
+   if (!foreign) {
+   gnttab_dma_free_pages(dev, flags, nr_pages, pages,
+ *vaddr, *dev_bus_addr);
+   return -ENOMEM;
+   }
+   set_page_private(pages[i], (unsigned long)foreign);
+#endif
+   SetPagePrivate(pages[i]);
+   }
+   return 0;
+}
+EXPORT_SYMBOL(gnttab_dma_alloc_pages);
+
+void gnttab_dma_free_pages(struct device *dev, bool coherent,
+  int nr_pages, struct page **pages,
+  void *vaddr, dma_addr_t dev_bus_addr)
+{
+   int i;
+
+   for (i = 0; i < nr_pages; i++) {
+   if (PagePrivate(pages[i])) {
+#if BITS_PER_LONG < 64
+   kfree((void *)page_private(pages[i]));
+#endif
+   ClearPagePrivate(pages[i]);
+   }
+   }
+   free_dma_xenballooned_pages(dev, coherent, nr_pages, pages,
+   vaddr, dev_bus_addr);
+}
+EXPORT_SYMBOL(gnttab_dma_free_pages);
+
 /* Handling of paged out grant targets (GNTST_eagain) */
 #define MAX_DELAY 256
 static inline void
diff --git a/include/xen/grant_table.h b/include/xen/grant_table.h
index 34b1379f9777..20ee2b5ba965 100644
--- a/include/xen/grant_table.h
+++ b/include/xen/grant_table.h
@@ -195,6 +195,13 @@ void gnttab_free_auto_xlat_frames(void);
 int gnttab_alloc_pages(int nr_pages, struct page **pages);
 void gnttab_free_pages(int nr_pages, struct page **pages);
 
+int gnttab_dma_alloc_pages(struct device *dev, bool coherent,
+  int nr_pages, struct page **pages,
+  void **vaddr, dma_addr_t *dev_bus_addr);
+void gnttab_dma_free_pages(struct device *dev, bool coherent,
+  int nr_pages, struct page **pages,
+  void *vaddr, dma_addr_t dev_bus_addr);
+
 int gnttab_map_refs(struct gnttab_map_grant_ref *map_ops,
struct gnttab_map_grant_ref *kmap_ops,
struct page **pages, unsigned int count);
-- 
2.17.0


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [RFC 3/3] xen/gntdev: Add support for Linux dma buffers

2018-05-17 Thread Oleksandr Andrushchenko
From: Oleksandr Andrushchenko 

Signed-off-by: Oleksandr Andrushchenko 
---
 drivers/xen/gntdev.c  | 954 +-
 include/uapi/xen/gntdev.h | 101 
 include/xen/gntdev_exp.h  |  23 +
 3 files changed, 1066 insertions(+), 12 deletions(-)
 create mode 100644 include/xen/gntdev_exp.h

diff --git a/drivers/xen/gntdev.c b/drivers/xen/gntdev.c
index 9510f228efe9..0ee88e193362 100644
--- a/drivers/xen/gntdev.c
+++ b/drivers/xen/gntdev.c
@@ -4,6 +4,8 @@
  * Device for accessing (in user-space) pages that have been granted by other
  * domains.
  *
+ * DMA buffer implementation is based on drivers/gpu/drm/drm_prime.c.
+ *
  * Copyright (c) 2006-2007, D G Murray.
  *   (c) 2009 Gerd Hoffmann 
  *
@@ -37,6 +39,9 @@
 #include 
 #include 
 
+#include 
+#include 
+
 #include 
 #include 
 #include 
@@ -61,16 +66,39 @@ static atomic_t pages_mapped = ATOMIC_INIT(0);
 static int use_ptemod;
 #define populate_freeable_maps use_ptemod
 
+#ifndef GRANT_INVALID_REF
+/*
+ * Note on usage of grant reference 0 as invalid grant reference:
+ * grant reference 0 is valid, but never exposed to a driver,
+ * because of the fact it is already in use/reserved by the PV console.
+ */
+#define GRANT_INVALID_REF  0
+#endif
+
 struct gntdev_priv {
/* maps with visible offsets in the file descriptor */
struct list_head maps;
/* maps that are not visible; will be freed on munmap.
 * Only populated if populate_freeable_maps == 1 */
struct list_head freeable_maps;
+   /* List of dma-bufs. */
+   struct list_head dma_bufs;
/* lock protects maps and freeable_maps */
struct mutex lock;
struct mm_struct *mm;
struct mmu_notifier mn;
+
+   /* Private data of the hyper DMA buffers. */
+
+   struct device *dev;
+   /* List of exported DMA buffers. */
+   struct list_head dmabuf_exp_list;
+   /* List of wait objects. */
+   struct list_head dmabuf_exp_wait_list;
+   /* List of imported DMA buffers. */
+   struct list_head dmabuf_imp_list;
+   /* This is the lock which protects dma_buf_xxx lists. */
+   struct mutex dmabuf_lock;
 };
 
 struct unmap_notify {
@@ -95,10 +123,65 @@ struct grant_map {
struct gnttab_unmap_grant_ref *kunmap_ops;
struct page **pages;
unsigned long pages_vm_start;
+
+   /*
+* All the fields starting with dmabuf_ are only valid if this
+* mapping is used for exporting a DMA buffer.
+* If dmabuf_vaddr is not NULL then this mapping is backed by DMA
+* capable memory.
+*/
+
+   /* Flags used to create this DMA buffer: GNTDEV_DMABUF_FLAG_XXX. */
+   bool dmabuf_flags;
+   /* Virtual/CPU address of the DMA buffer. */
+   void *dmabuf_vaddr;
+   /* Bus address of the DMA buffer. */
+   dma_addr_t dmabuf_bus_addr;
+};
+
+struct hyper_dmabuf {
+   struct gntdev_priv *priv;
+   struct dma_buf *dmabuf;
+   struct list_head next;
+   int fd;
+
+   union {
+   struct {
+   /* Exported buffers are reference counted. */
+   struct kref refcount;
+   struct grant_map *map;
+   } exp;
+   struct {
+   /* Granted references of the imported buffer. */
+   grant_ref_t *refs;
+   /* Scatter-gather table of the imported buffer. */
+   struct sg_table *sgt;
+   /* dma-buf attachment of the imported buffer. */
+   struct dma_buf_attachment *attach;
+   } imp;
+   } u;
+
+   /* Number of pages this buffer has. */
+   int nr_pages;
+   /* Pages of this buffer. */
+   struct page **pages;
+};
+
+struct hyper_dmabuf_wait_obj {
+   struct list_head next;
+   struct hyper_dmabuf *hyper_dmabuf;
+   struct completion completion;
+};
+
+struct hyper_dambuf_attachment {
+   struct sg_table *sgt;
+   enum dma_data_direction dir;
 };
 
 static int unmap_grant_pages(struct grant_map *map, int offset, int pages);
 
+static struct miscdevice gntdev_miscdev;
+
 /* -- */
 
 static void gntdev_print_maps(struct gntdev_priv *priv,
@@ -120,8 +203,17 @@ static void gntdev_free_map(struct grant_map *map)
if (map == NULL)
return;
 
-   if (map->pages)
+   if (map->dmabuf_vaddr) {
+   bool coherent = map->dmabuf_flags &
+   GNTDEV_DMABUF_FLAG_DMA_COHERENT;
+
+   gnttab_dma_free_pages(gntdev_miscdev.this_device,
+ coherent, map->count, map->pages,
+ map->dmabuf_vaddr, map->dmabuf_bus_addr);
+   } else if (map->pages) {
gnttab_free_pages(map->count, map->pages);
+   }
+
kfree(map->pages);
kfre

[Xen-devel] [RFC 0/3] dma-buf support for gntdev

2018-05-17 Thread Oleksandr Andrushchenko
From: Oleksandr Andrushchenko 

This work is in response to my previous attempt to introduce Xen/DRM
zero-copy driver [1] to enable Linux dma-buf API [2] for Xen based
frontends/backends. There is also an existing hyper_dmabuf approach
available [3] which, if reworked to utilize the proposed solution,
can greatly benefit as well.

The original rationale behind this work was to enable zero-copying
use-cases while working with Xen para-virtual display driver [4]:
when using Xen PV DRM frontend driver then on backend side one will
need to do copying of display buffers' contents (filled by the
frontend's user-space) into buffers allocated at the backend side.
Taking into account the size of display buffers and frames per
second it may result in unneeded huge data bus occupation and
performance loss.

The helper driver [4] allows implementing zero-copying use-cases
when using Xen para-virtualized frontend display driver by implementing
a DRM/KMS helper driver running on backend's side.
It utilizes PRIME buffers API (implemented on top of Linux dma-buf)
to share frontend's buffers with physical device drivers on
backend's side:

 - a dumb buffer created on backend's side can be shared
   with the Xen PV frontend driver, so it directly writes
   into backend's domain memory (into the buffer exported from
   DRM/KMS driver of a physical display device)
 - a dumb buffer allocated by the frontend can be imported
   into physical device DRM/KMS driver, thus allowing to
   achieve no copying as well

Finally, it was discussed and decided ([1], [5]) that it is worth
implementing such use-cases via extension of the existing Xen gntdev
driver instead of introducing new DRM specific driver.
Please note, that the support of dma-buf is Linux only,
as dma-buf is a Linux only thing.

Now to the proposed solution. The changes  to the existing Xen drivers
in the Linux kernel fall into 2 categories:
1. DMA-able memory buffer allocation and ballooning in/out the pages
   of such a buffer.
   This is required if we are about to share dma-buf with the hardware
   that does require those to be allocated with dma_alloc_xxx API.
   (It is still possible to allocate a dma-buf from any other memory,
   e.g. system pages).
2. Extension of the gntdev driver to enable it to import/export dma-buf’s.

The first two patches in this series solve #1 and the last one is for #2.
The corresponding libxengnttab changes are available at [6].

All the above was tested with display backend [7] and its accompanying
helper library [8] on Renesas ARM64 based board.


*To all the communities*: I would like to ask you to review the proposed
solution and give feedback on it, so I can improve and send final
patches for review (this is still work in progress, but enough to start
discussing the implementation).


Thank you in advance,
Oleksandr Andrushchenko

[1] https://lists.freedesktop.org/archives/dri-devel/2018-April/173163.html
[2] 
https://elixir.bootlin.com/linux/v4.17-rc5/source/Documentation/driver-api/dma-buf.rst
[3] https://lists.xenproject.org/archives/html/xen-devel/2018-02/msg01202.html
[4] https://cgit.freedesktop.org/drm/drm-misc/tree/drivers/gpu/drm/xen
[5] https://patchwork.kernel.org/patch/10279681/
[6] https://github.com/andr2000/xen/tree/xen_dma_buf_v1
[7] https://github.com/andr2000/displ_be/tree/xen_dma_buf_v1
[8] https://github.com/andr2000/libxenbe/tree/xen_dma_buf_v1

Oleksandr Andrushchenko (3):
  xen/balloon: Allow allocating DMA buffers
  xen/grant-table: Extend API to work with DMA buffers
  xen/gntdev: Add support for Linux dma buffers

 drivers/xen/balloon.c | 214 +++--
 drivers/xen/gntdev.c  | 954 +-
 drivers/xen/grant-table.c |  49 ++
 drivers/xen/xen-balloon.c |   2 +
 include/uapi/xen/gntdev.h | 101 
 include/xen/balloon.h |  11 +-
 include/xen/gntdev_exp.h  |  23 +
 include/xen/grant_table.h |   7 +
 8 files changed, 1310 insertions(+), 51 deletions(-)
 create mode 100644 include/xen/gntdev_exp.h

-- 
2.17.0


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] migration regression in xen-4.11 and qemu-2.11 and qcow2

2018-05-17 Thread Olaf Hering
Am Mon, 7 May 2018 17:19:46 +0200
schrieb Olaf Hering :

> With qemu-2.11 the sender thinks everything is alright and the domU is moved.

Another case of breakage in qemu-2.11:
if the targethost does not even have access to the diskimage the sender still 
thinks everything is alright. qemu does not propagate the error to libxl to 
allow it to abort the migration.

Olaf


pgpPu0eGY7vdI.pgp
Description: Digitale Signatur von OpenPGP
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2 for-4.7/4.8] x86: Fix "x86: further CPUID handling adjustments"

2018-05-17 Thread Jan Beulich
>>> On 16.05.18 at 19:27,  wrote:
> c/s f9616884e (a backport of c/s 0d703a701 "x86/feature: Definitions for
> Indirect Branch Controls") missed a CPUID adjustment when calculating the raw
> featureset.  This impacts host administrator diagnostics.
> 
> Signed-off-by: Sergey Dyasli 

I continue to think this should remain to be a separate patch.

> c/s 62b187969 "x86: further CPUID handling adjustments" make some adjustments.
> However, it breaks levelling of guests, making it impossible for the toolstack
> to hide STIBP or IBPB from guests on hardware with up-to-date microcode.
> 
> The dom0 issue referenced in the commit message was fixed by the hunk
> adjusting the zeroing alone.  STIBP and IBPB don't need (and indeed, must not
> be for levelling purposes) OR'd into the leaf.
> 
> One final item which was missed in backport was the need to ignore the
> toolstack choice of STIBP, and set it equal to IBRSB.  This needs doing after
> the mask has been applied.

This last paragraph at least partly contradicts the first talking about tool 
stack
chosen hiding of STIBP. The intended net effect, aiui, is
- expose STIBP independent of tool stack choice when the tool stack has
  elected to expose IBRSB,
- hide STIBP according to tool stack choice when IBRSB is also hidden.

A similar implication (as mentioned before, and see below) is supposed to
exist from IBRSB to IBPB, I think.

Also this aspect wasn't missed in the original backport, but done wrongly: The
ORing in ahead of the masking was meant to take care of this, utilizing what
calculate_{hvm,pv}_featureset() do (overlooking the fact that the feature sets
may have the bits set while the tool stack may have cleared them for the given
domain).

> @@ -1188,7 +1191,6 @@ void pv_cpuid(struct cpu_user_regs *regs)
>  
>  case 0x8008:
>  a = paddr_bits | (vaddr_bits << 8);
> -b |= cpufeat_mask(X86_FEATURE_IBPB);
>  b &= pv_featureset[FEATURESET_e8b];
>  break;

You didn't address my respective v1 review comment, neither by adding code
here, nor verbally: How is Dom0 supposed to know of IBPB being available if
IBSRB is (in hardware), but IBPB isn't? The whole purpose of the respective
chunk of code in calculate_pv_featureset() is just that. And if we  I think 
this similarly
should be done for DomU-s, i.e. also in the HVM variant of this code.

Jan



___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] migration regression in xen-4.11 and qemu-2.11 and qcow2

2018-05-17 Thread Olaf Hering
Am Thu, 17 May 2018 08:30:58 +0200
schrieb Olaf Hering :

> I think the issue fixed by 5d6c599fe1d69a1bf8c5c4d3c58be2b31cd625ad is not 
> specific to HVM. It seems domain_suspend_common_guest_suspended would call 
> that changed function only for HVM. It seems the logic is wrong. It is not 
> about the device model, but about that fact that 'disk==qcow2' requires 
> qemu-upstream.

This fixes it for me. Now the question is how to do that properly for all sorts 
of diskimage types.

Olaf
--- a/tools/libxl/libxl_dom_suspend.c
+++ b/tools/libxl/libxl_dom_suspend.c
@@ -385,6 +385,21 @@ static void domain_suspend_common_guest_
 domain_suspend_common_done(egc, dsps, rc);
 return;
 }
+} else if (dsps->type == LIBXL_DOMAIN_TYPE_PV) {
+uint32_t const domid = dsps->domid;
+const char *const filename = dsps->dm_savefile;
+rc = libxl__qmp_stop(gc, domid);
+if (rc) {
+LOGD(ERROR, domid, "failure from libxl__qmp_stop domid:%d", rc);
+return;
+}
+/* Save DM state into filename */
+rc = libxl__qmp_save(gc, domid, filename, dsps->live);
+if (rc) {
+unlink(filename);
+LOGD(ERROR, domid, "failure from libxl__qmp_save domid:%d", rc);
+return;
+}
 }
 domain_suspend_common_done(egc, dsps, 0);
 }
@@ -466,6 +481,12 @@ int libxl__domain_resume(libxl__gc *gc,
 LOGD(ERROR, domid, "failed to resume device model:%d", rc);
 goto out;
 }
+} else if (type == LIBXL_DOMAIN_TYPE_PV) {
+if (libxl__qmp_resume(gc, domid)) {
+rc = ERROR_FAIL;
+LOGD(ERROR, domid, "failed to resume device model:%d", rc);
+goto out;
+}
 }
 
 if (xc_domain_resume(CTX->xch, domid, suspend_cancel)) {


pgpIVE99hQuKB.pgp
Description: Digitale Signatur von OpenPGP
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH for-4.11] x86/cacheattr: fix mtrr_pat_not_equal

2018-05-17 Thread Roger Pau Monne
The function is supposed to return whether the MTRR and PAT state of
two CPUs match. Currently this is not properly done because the test
for the deftype and the enable bits required both the deftype and the
enable bits to be different, while just one of those fields being
different can already cause the MTRR states on the vCPU to not match.

Fix this by changing the AND into an OR instead, so that either the
deftype or the enabled bits being different will cause the function to
return mismatching state.

Signed-off-by: Roger Pau Monné 
---
Cc: Jan Beulich 
Cc: Andrew Cooper 
Cc: Juergen Gross 
---
This is a bugfix and should go into 4.11
---
 xen/arch/x86/hvm/mtrr.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/xen/arch/x86/hvm/mtrr.c b/xen/arch/x86/hvm/mtrr.c
index b721c6330f..3e0435c503 100644
--- a/xen/arch/x86/hvm/mtrr.c
+++ b/xen/arch/x86/hvm/mtrr.c
@@ -492,9 +492,8 @@ bool_t mtrr_pat_not_equal(struct vcpu *vd, struct vcpu *vs)
 if ( res )
 return 1;
 
-/* Test default type MSR. */
-if ( (md->def_type != ms->def_type)
-&& (md->enabled != ms->enabled) )
+/* Test default type MSR and the enable bits. */
+if ( (md->def_type != ms->def_type) || (md->enabled != ms->enabled) )
 return 1;
 
 /* Test PAT. */
-- 
2.17.0


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH for-4.11] x86/cacheattr: fix mtrr_pat_not_equal

2018-05-17 Thread Juergen Gross
On 17/05/18 11:48, Roger Pau Monne wrote:
> The function is supposed to return whether the MTRR and PAT state of
> two CPUs match. Currently this is not properly done because the test
> for the deftype and the enable bits required both the deftype and the
> enable bits to be different, while just one of those fields being
> different can already cause the MTRR states on the vCPU to not match.
> 
> Fix this by changing the AND into an OR instead, so that either the
> deftype or the enabled bits being different will cause the function to
> return mismatching state.
> 
> Signed-off-by: Roger Pau Monné 

Release-acked-by: Juergen Gross 


Juergen

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v3 6/8] xen_backend: make the xen_feature_grant_copy flag private

2018-05-17 Thread Anthony PERARD
On Fri, May 04, 2018 at 08:26:05PM +0100, Paul Durrant wrote:
> There is no longer any use of this flag outside of the xen_backend code.
> 
> Signed-off-by: Paul Durrant 

Acked-by: Anthony PERARD 

-- 
Anthony PERARD

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v3 5/8] xen_disk: remove use of grant map/unmap

2018-05-17 Thread Anthony PERARD
On Fri, May 04, 2018 at 08:26:04PM +0100, Paul Durrant wrote:
> Now that the (native or emulated) xen_be_copy_grant_refs() helper is
> always available, the xen_disk code can be significantly simplified by
> removing direct use of grant map and unmap operations.
> 
> Signed-off-by: Paul Durrant 

Acked-by: Anthony PERARD 

-- 
Anthony PERARD

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH for-4.11] x86/cacheattr: fix mtrr_pat_not_equal

2018-05-17 Thread Jan Beulich
>>> On 17.05.18 at 11:48,  wrote:
> The function is supposed to return whether the MTRR and PAT state of
> two CPUs match. Currently this is not properly done because the test
> for the deftype and the enable bits required both the deftype and the
> enable bits to be different, while just one of those fields being
> different can already cause the MTRR states on the vCPU to not match.
> 
> Fix this by changing the AND into an OR instead, so that either the
> deftype or the enabled bits being different will cause the function to
> return mismatching state.

This is by far not enough, but I didn't view the function as critical
enough to warrant sending out the patch I have right away.

Jan
x86/HVM: correct mtrr_pat_not_equal()

The two vCPU-s differring in MTRR-enabled state means MTRR settings are
not equal. Both vCPU-s having MTRRs disabled means only PAT needs to be
compared. Along those lines for fixed range MTRRs. Differring variable
range counts likewise mean settings are different overall.

Constify types and convert bool_t to bool.

Signed-off-by: Jan Beulich 

--- unstable.orig/xen/arch/x86/hvm/mtrr.c
+++ unstable/xen/arch/x86/hvm/mtrr.c
@@ -476,35 +476,40 @@ bool_t mtrr_var_range_msr_set(
 return 1;
 }
 
-bool_t mtrr_pat_not_equal(struct vcpu *vd, struct vcpu *vs)
+bool mtrr_pat_not_equal(const struct vcpu *vd, const struct vcpu *vs)
 {
-struct mtrr_state *md = &vd->arch.hvm_vcpu.mtrr;
-struct mtrr_state *ms = &vs->arch.hvm_vcpu.mtrr;
-int32_t res;
-uint8_t num_var_ranges = (uint8_t)md->mtrr_cap;
-
-/* Test fixed ranges. */
-res = memcmp(md->fixed_ranges, ms->fixed_ranges,
-NUM_FIXED_RANGES*sizeof(mtrr_type));
-if ( res )
-return 1;
-
-/* Test var ranges. */
-res = memcmp(md->var_ranges, ms->var_ranges,
-num_var_ranges*sizeof(struct mtrr_var_range));
-if ( res )
-return 1;
-
-/* Test default type MSR. */
-if ( (md->def_type != ms->def_type)
-&& (md->enabled != ms->enabled) )
-return 1;
+const struct mtrr_state *md = &vd->arch.hvm_vcpu.mtrr;
+const struct mtrr_state *ms = &vs->arch.hvm_vcpu.mtrr;
 
-/* Test PAT. */
-if ( vd->arch.hvm_vcpu.pat_cr != vs->arch.hvm_vcpu.pat_cr )
-return 1;
+if ( (md->enabled ^ ms->enabled) & 2 )
+return true;
+
+if ( md->enabled & 2 )
+{
+unsigned int num_var_ranges = (uint8_t)md->mtrr_cap;
+
+/* Test default type MSR. */
+if ( md->def_type != ms->def_type )
+return true;
+
+/* Test fixed ranges. */
+if ( (md->enabled ^ ms->enabled) & 1 )
+return true;
+
+if ( (md->enabled & 1) &&
+ memcmp(md->fixed_ranges, ms->fixed_ranges,
+sizeof(md->fixed_ranges)) )
+return true;
+
+/* Test variable ranges. */
+if ( num_var_ranges != (uint8_t)ms->mtrr_cap ||
+ memcmp(md->var_ranges, ms->var_ranges,
+num_var_ranges * sizeof(*md->var_ranges)) )
+return true;
+}
 
-return 0;
+/* Test PAT. */
+return vd->arch.hvm_vcpu.pat_cr != vs->arch.hvm_vcpu.pat_cr;
 }
 
 struct hvm_mem_pinned_cacheattr_range {
--- unstable.orig/xen/include/asm-x86/mtrr.h
+++ unstable/xen/include/asm-x86/mtrr.h
@@ -92,6 +92,6 @@ extern void memory_type_changed(struct d
 extern bool_t pat_msr_set(uint64_t *pat, uint64_t msr);
 
 bool_t is_var_mtrr_overlapped(const struct mtrr_state *m);
-bool_t mtrr_pat_not_equal(struct vcpu *vd, struct vcpu *vs);
+bool mtrr_pat_not_equal(const struct vcpu *vd, const struct vcpu *vs);
 
 #endif /* __ASM_X86_MTRR_H__ */





___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v3 7/8] xen_disk: use a single entry iovec

2018-05-17 Thread Anthony PERARD
On Fri, May 04, 2018 at 08:26:06PM +0100, Paul Durrant wrote:
> Since xen_disk now always copies data to and from a guest there is no need
> to maintain a vector entry corresponding to every page of a request.
> This means there is less per-request state to maintain so the ioreq
> structure can shrink significantly.
> 
> Signed-off-by: Paul Durrant 

Acked-by: Anthony PERARD 

-- 
Anthony PERARD

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v3 8/8] xen_disk: be consistent with use of xendev and blkdev->xendev

2018-05-17 Thread Anthony PERARD
On Fri, May 04, 2018 at 08:26:07PM +0100, Paul Durrant wrote:
> Certain functions in xen_disk are called with a pointer to xendev
> (struct XenDevice *). They then use continer_of() to acces the surrounding

  ^ container_of

> blkdev (struct XenBlkDev) but then in various places use &blkdev->xendev
> when use of the original xendev pointer is shorter to express and clearly
> equivalent.
> 
> This patch is a purely cosmetic patch which makes sure there is a xendev
> pointer on stack for any function where the pointer is need on multiple
> occasions modified those functions to use it consistently.
> 
> Signed-off-by: Paul Durrant 

Acked-by: Anthony PERARD 

-- 
Anthony PERARD

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH for-4.11] x86/cacheattr: fix mtrr_pat_not_equal

2018-05-17 Thread Roger Pau Monné
On Thu, May 17, 2018 at 04:44:04AM -0600, Jan Beulich wrote:
> >>> On 17.05.18 at 11:48,  wrote:
> > The function is supposed to return whether the MTRR and PAT state of
> > two CPUs match. Currently this is not properly done because the test
> > for the deftype and the enable bits required both the deftype and the
> > enable bits to be different, while just one of those fields being
> > different can already cause the MTRR states on the vCPU to not match.
> > 
> > Fix this by changing the AND into an OR instead, so that either the
> > deftype or the enabled bits being different will cause the function to
> > return mismatching state.
> 
> This is by far not enough, but I didn't view the function as critical
> enough to warrant sending out the patch I have right away.

I've also realized that the logic there is wonky and would return true
in cases where the states are equal (ie: for example if fixed MTRRs
contents are different but FE is disabled).

Just wanted to do a minimal change that prevents wrongly reporting
that the state is equal when it's not (I think the other way around is
not that critical).

You change LGTM, and fixes some obvious cases where the current code
would return true even if the cache state is the same.

> Jan
> x86/HVM: correct mtrr_pat_not_equal()
> 
> The two vCPU-s differring in MTRR-enabled state means MTRR settings are
> not equal. Both vCPU-s having MTRRs disabled means only PAT needs to be
> compared. Along those lines for fixed range MTRRs. Differring variable
> range counts likewise mean settings are different overall.
> 
> Constify types and convert bool_t to bool.
> 
> Signed-off-by: Jan Beulich 

Reviewed-by: Roger Pau Monné 

> --- unstable.orig/xen/arch/x86/hvm/mtrr.c
> +++ unstable/xen/arch/x86/hvm/mtrr.c
> @@ -476,35 +476,40 @@ bool_t mtrr_var_range_msr_set(
>  return 1;
>  }
>  
> -bool_t mtrr_pat_not_equal(struct vcpu *vd, struct vcpu *vs)
> +bool mtrr_pat_not_equal(const struct vcpu *vd, const struct vcpu *vs)
>  {
> -struct mtrr_state *md = &vd->arch.hvm_vcpu.mtrr;
> -struct mtrr_state *ms = &vs->arch.hvm_vcpu.mtrr;
> -int32_t res;
> -uint8_t num_var_ranges = (uint8_t)md->mtrr_cap;
> -
> -/* Test fixed ranges. */
> -res = memcmp(md->fixed_ranges, ms->fixed_ranges,
> -NUM_FIXED_RANGES*sizeof(mtrr_type));
> -if ( res )
> -return 1;
> -
> -/* Test var ranges. */
> -res = memcmp(md->var_ranges, ms->var_ranges,
> -num_var_ranges*sizeof(struct mtrr_var_range));
> -if ( res )
> -return 1;
> -
> -/* Test default type MSR. */
> -if ( (md->def_type != ms->def_type)
> -&& (md->enabled != ms->enabled) )
> -return 1;
> +const struct mtrr_state *md = &vd->arch.hvm_vcpu.mtrr;
> +const struct mtrr_state *ms = &vs->arch.hvm_vcpu.mtrr;
>  
> -/* Test PAT. */
> -if ( vd->arch.hvm_vcpu.pat_cr != vs->arch.hvm_vcpu.pat_cr )
> -return 1;
> +if ( (md->enabled ^ ms->enabled) & 2 )
> +return true;
> +
> +if ( md->enabled & 2 )
> +{
> +unsigned int num_var_ranges = (uint8_t)md->mtrr_cap;
> +
> +/* Test default type MSR. */
> +if ( md->def_type != ms->def_type )
> +return true;
> +
> +/* Test fixed ranges. */
> +if ( (md->enabled ^ ms->enabled) & 1 )
> +return true;
> +
> +if ( (md->enabled & 1) &&
> + memcmp(md->fixed_ranges, ms->fixed_ranges,
> +sizeof(md->fixed_ranges)) )
> +return true;
> +
> +/* Test variable ranges. */
> +if ( num_var_ranges != (uint8_t)ms->mtrr_cap ||

Is it really possible to have two vCPUs on the same domain with a
different number of variable ranges?

> + memcmp(md->var_ranges, ms->var_ranges,
> +num_var_ranges * sizeof(*md->var_ranges)) )
> +return true;
> +}
>  
> -return 0;
> +/* Test PAT. */
> +return vd->arch.hvm_vcpu.pat_cr != vs->arch.hvm_vcpu.pat_cr;
>  }
>  
>  struct hvm_mem_pinned_cacheattr_range {
> --- unstable.orig/xen/include/asm-x86/mtrr.h
> +++ unstable/xen/include/asm-x86/mtrr.h
> @@ -92,6 +92,6 @@ extern void memory_type_changed(struct d
>  extern bool_t pat_msr_set(uint64_t *pat, uint64_t msr);
>  
>  bool_t is_var_mtrr_overlapped(const struct mtrr_state *m);
> -bool_t mtrr_pat_not_equal(struct vcpu *vd, struct vcpu *vs);
> +bool mtrr_pat_not_equal(const struct vcpu *vd, const struct vcpu *vs);
>  
>  #endif /* __ASM_X86_MTRR_H__ */
> 
> 
> 
> 

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [OSSTEST PATCH 02/10] Osstest/Debian: preseed: Force UEFI install regardless

2018-05-17 Thread Ian Jackson
This suppresses:

   Partition disks
   ---

   This machine's firmware has started the installer in UEFI mode but it looks
   like there may be existing operating systems already installed using "BIOS
   compatibility mode". If you continue to install Debian in UEFI mode, it might
   be difficult to reboot the machine into any BIOS-mode operating systems 
later.

   If you wish to install in UEFI mode and don't care about keeping the ability 
to
   boot one of the existing systems, you have the option to force that here. If
   you wish to keep the option to boot an existing operating system, you should
   choose NOT to force UEFI installation here.
   Force UEFI installation?
 1: Yes  2: No
 Prompt: '?' for help>

Signed-off-by: Ian Jackson 
---
 Osstest/Debian.pm | 1 +
 1 file changed, 1 insertion(+)

diff --git a/Osstest/Debian.pm b/Osstest/Debian.pm
index b46d222..cbd10d3 100644
--- a/Osstest/Debian.pm
+++ b/Osstest/Debian.pm
@@ -1333,6 +1333,7 @@ d-i partman-auto/disk string $disk
 
 d-i partman-ext3/no_mount_point boolean false
 d-i partman-basicmethods/method_only boolean false
+d-i partman-efi/non_efi_system true
 
 d-i partman-auto/expert_recipe string  \\
boot-root ::\\
-- 
2.1.4


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [OSSTEST PATCH 03/10] README.dev: Fix a typo

2018-05-17 Thread Ian Jackson
Signed-off-by: Ian Jackson 
---
 README.dev | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/README.dev b/README.dev
index 95fc66c..0f7fccd 100644
--- a/README.dev
+++ b/README.dev
@@ -30,8 +30,8 @@ Keeps running for the duration, so run it in a screen on the 
osstest VM.
 
 Or you can use mg-allocate.
 
-Commisioning a new machine
-==
+Commissioning a new machine
+===
 
 Firstly, arrange that it is hooked up to network, serial, and pdu.
 
-- 
2.1.4


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [OSSTEST PATCH 04/10] README.dev: Discuss setting Firmware for UEFI machines

2018-05-17 Thread Ian Jackson
This must occur before mknetbootdir, or mknetbootdir does not set
things up for UEFI booting.

Signed-off-by: Ian Jackson 
---
 README.dev | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/README.dev b/README.dev
index 0f7fccd..a2b93e0 100644
--- a/README.dev
+++ b/README.dev
@@ -97,7 +97,11 @@ For example, one might need something like this:
 (Many of these things are not needed in Massachusetts as the ansible
 playbook will provide it via an autogenerated config file - see above.)
 
-Create the tftp directory:
+Set the firmware, if it's UEFI:
+
+ $ ./mg-hosts setprops albana{0,1} -- Firmware uefi
+
+Create the tftp directory (must be done after firmware is set):
 
  $ ./mg-hosts mknetbootdir mudcake{0,1}
 
-- 
2.1.4


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [OSSTEST PATCH 10/10] mfi-common: Fall back to anointed builds in Executive mode

2018-05-17 Thread Ian Jackson
Previously, `make-*-flight' would not work unless FREEBSD_*_BUILDJOB
was set.  Now we use the anointed values if we can find them.

If we can't, mg-anoint retrieve will print a warning.

Signed-off-by: Ian Jackson 
CC: Roger Pau Monné 
---
 mfi-common | 8 
 1 file changed, 8 insertions(+)

diff --git a/mfi-common b/mfi-common
index 17b1b50..fddd1ce 100644
--- a/mfi-common
+++ b/mfi-common
@@ -130,6 +130,8 @@ set_freebsd_runvars () {
 # 3. Config file FreeBSDDist, FreeBSDVersion: same as 2. except that
 # they are set on the config file.
 #
+# 4. Look for an anointed build of FreeBSD `master' (Executive only)
+#
 local envvar="FREEBSD_${arch^^}_BUILDJOB"
 if [ -n "${!envvar}" ]; then
 freebsd_runvars="freebsdbuildjob=${!envvar}"
@@ -147,6 +149,12 @@ set_freebsd_runvars () {
  freebsd_version=$version"
 return
 fi
+local anointment="freebsd build master $arch"
+local flightjob=`./mg-anoint retrieve --tolerate-unprepared "$anointment"`
+if [ -n "$flightjob" ]; then
+freebsd_runvars="freebsdbuildjob=${flightjob/ /.}"
+return
+fi
 }
 
 create_build_jobs () {
-- 
2.1.4


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [OSSTEST PATCH 05/10] README.dev: Make example commisioning runes use $hn variable

2018-05-17 Thread Ian Jackson
This makes them better for cutting-and-pasting.

Signed-off-by: Ian Jackson 
---
 README.dev | 11 ++-
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/README.dev b/README.dev
index a2b93e0..441eee8 100644
--- a/README.dev
+++ b/README.dev
@@ -112,9 +112,10 @@ Firstly, a basic "host examination" that checks that we 
can install
 and boot Xen:
 
  $ basis=113124   # pick last good xen-unstable or osstest flight
- $ flight=`./make-hosts-flight play xen-unstable blessed-commission-mudcake 
commission-mudcake $basis`; echo $flight
+ $ hn=mudcake
+ $ flight=`./make-hosts-flight play xen-unstable blessed-commission-$hn 
commission-$hn $basis`; echo $flight
  113155
- $ ./mg-execute-flight -Bcommission-mudcake -eian.jack...@citrix.com $flight
+ $ ./mg-execute-flight -Bcommission-$hn -eian.jack...@citrix.com $flight
 
 This will email the specified address.  The examination should pass,
 completely.  If it does not then you may need to change the BIOS
@@ -130,8 +131,8 @@ right permissions set up.
 If that works, a more thorough test:
 
  $ basis=113124   # pick last good xen-unstable or osstest flight
- $ flight=`./cs-adjust-flight new:commission-mudcake copy $basis`; echo $flight
- $ ./mg-execute-flight -Bcommission-mudcake -eian.jack...@citrix.com -f$basis 
$flight
+ $ flight=`./cs-adjust-flight new:commission-$hn copy $basis`; echo $flight
+ $ ./mg-execute-flight -Bcommission-$hn -eian.jack...@citrix.com -f$basis 
$flight
 
 This should show no regressions.  (Or, at least, none that are a cause
 for concern.)
@@ -139,7 +140,7 @@ for concern.)
 For a new architecture, there may not be an existing flight with
 suitable jobs.  In that case, something like this can be useful:
 
- $ OSSTEST_BLESSING=commission-mudcake DAILY_BRANCH_PREEXEC_HOOK=false 
OSSTEST_BASELINES_ONLY=y ./cr-daily-branch xen-unstable
+ $ OSSTEST_BLESSING=commission-$hn DAILY_BRANCH_PREEXEC_HOOK=false 
OSSTEST_BASELINES_ONLY=y ./cr-daily-branch xen-unstable
 
 You'll need to fish the flight number out of the debug spew.
 
-- 
2.1.4


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [OSSTEST PATCH 00/10] UEFI and commissioning fixes

2018-05-17 Thread Ian Jackson
From: Ian Jackson 

I am trying to commission some new hosts.  These patches fix problems
I encountered.  There are fixes to:

 * Better support UEFI x86 hosts.
 * Improve the commissioning instructions in README.dev.
 * Handle the new FreeBSD anointments system in make-*-flight in
   production (Executive) mode.

Ian Jackson (10):
  Osstest/TestSupport: Use right arch for UEFI grub setup
  Osstest/Debian: preseed: Force UEFI install regardless
  README.dev: Fix a typo
  README.dev: Discuss setting Firmware for UEFI machines
  README.dev: Make example commisioning runes use $hn variable
  Perl @INC path: fix a few more scripts to use BEGIN
  mg-anoint: Make readonly operations "work" in standalone mode
  mg-anoint: Support mg-anoint retrieve --tolerate-unprepared
  mfi-common: set_freebsd_runvars: Never set freebsd_distpath to
`/amd64' etc.
  mfi-common: Fall back to anointed builds in Executive mode

 Osstest/Debian.pm   |  1 +
 Osstest/JobDB/Executive.pm  |  2 ++
 Osstest/JobDB/Standalone.pm |  2 ++
 Osstest/TestSupport.pm  |  4 ++--
 README.dev  | 21 +
 mfi-common  | 19 ---
 mg-anoint   | 34 ++
 ts-examine-hostprops-save   |  2 +-
 ts-freebsd-host-install |  2 +-
 9 files changed, 68 insertions(+), 19 deletions(-)

-- 
2.1.4


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [OSSTEST PATCH 07/10] mg-anoint: Make readonly operations "work" in standalone mode

2018-05-17 Thread Ian Jackson
This makes `mg-anoint' in standalone mode a view onto an empty set of
anointments.  So now it becomes ok to call mg-anoint in make-*-flight.

Signed-off-by: Ian Jackson 
CC: Roger Pau Monné 
---
 Osstest/JobDB/Executive.pm  |  2 ++
 Osstest/JobDB/Standalone.pm |  2 ++
 mg-anoint   | 20 +++-
 3 files changed, 23 insertions(+), 1 deletion(-)

diff --git a/Osstest/JobDB/Executive.pm b/Osstest/JobDB/Executive.pm
index a7a6696..fe8e7f6 100644
--- a/Osstest/JobDB/Executive.pm
+++ b/Osstest/JobDB/Executive.pm
@@ -409,4 +409,6 @@ sub jobdb_db_glob ($$) { #method
 return "LIKE E'$str'";
 }
 
+sub can_anoint ($) { return 1; }
+
 1;
diff --git a/Osstest/JobDB/Standalone.pm b/Osstest/JobDB/Standalone.pm
index d9a90fc..4f320cc 100644
--- a/Osstest/JobDB/Standalone.pm
+++ b/Osstest/JobDB/Standalone.pm
@@ -133,4 +133,6 @@ sub jobdb_db_glob ($) { #method
 return "GLOB '$str'";
 }
 
+sub can_anoint ($) { return 0; }
+
 1;
diff --git a/mg-anoint b/mg-anoint
index b007ab4..522cbdd 100755
--- a/mg-anoint
+++ b/mg-anoint
@@ -66,7 +66,6 @@ use DBI;
 BEGIN { unshift @INC, qw(.); }
 use Osstest;
 use Osstest::TestSupport;
-use Osstest::Executive;
 use IO::Handle;
 use Text::Glob qw(glob_to_regex);
 
@@ -93,6 +92,15 @@ END
 our $task_q;
 our $mostrecent_q;
 
+sub empty_unless_can_anoint () {
+return if $mjobdb->can_anoint();
+exit 0;
+}
+sub fail_unless_can_anoint () {
+return if $mjobdb->can_anoint();
+die "anointments not supported in this mode ($c{JobDB})\n"
+}
+
 sub prep_queries {
 $task_q = $dbh_tests->prepare(

[Xen-devel] [OSSTEST PATCH 08/10] mg-anoint: Support mg-anoint retrieve --tolerate-unprepared

2018-05-17 Thread Ian Jackson
make-*-flight is going to want this.

Signed-off-by: Ian Jackson 
CC: Roger Pau Monné 
---
 mg-anoint | 12 ++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/mg-anoint b/mg-anoint
index 522cbdd..d09124b 100755
--- a/mg-anoint
+++ b/mg-anoint
@@ -15,10 +15,12 @@
 #--allow-blessed=BLESSING,...   default is from `prepare'
 #--allow-job-status=STATUS,...  default is only `pass'
 #
-#  ./mg-anoint retrieve REFKEY
+#  ./mg-anoint retrieve [--tolerate-unprepared] REFKEY
 #  => FLIGHT JOB
 # if nothing anointed yet, prints nothing and exits 0
 # if anointment not prepared, fails
+#  With --tolerate-unprepared, it is not an error if nothing is
+#  reported because the anointment has not been prepared.
 #
 #  ./mg-anoint list-prepared REFKEY-GLOB
 #  => possibly empty list of REFKEYs
@@ -294,6 +296,11 @@ END
 }
 
 sub cmd_retrieve {
+my $tolerate_unprepared;
+if (@ARGV && $ARGV[0] eq '--tolerate-unprepared') {
+   shift @ARGV;
+   $tolerate_unprepared = 1;
+}
 die unless @ARGV==1;
 die if $ARGV[0] =~ m/^-/;
 my ($refkey) = @ARGV;
@@ -305,7 +312,8 @@ sub cmd_retrieve {
 @o = ();
 $task_q->execute($refkey);
my ($task) = $task_q->fetchrow_array();
-   die "no such anointment kind \`$refkey'" unless defined $task;
+   die "no such anointment kind \`$refkey'"
+   unless defined $task or $tolerate_unprepared;
 
$mostrecent_q->execute($task);
my $row = $mostrecent_q->fetchrow_hashref();
-- 
2.1.4


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [OSSTEST PATCH 09/10] mfi-common: set_freebsd_runvars: Never set freebsd_distpath to `/amd64' etc.

2018-05-17 Thread Ian Jackson
Logically, the final branch of the if should be qualified with a check
for the emptiness of FreeBSDDist.  This is awkward in the current
structure, since we really want to do the distpath lookup only if
needed.  (This is not very important right now, but we are about to
add another case which will do a more-likely-to-bomb-out and
more-likely-to-block-on-the-db lookup.)  So refactor into `return'
style.  This lets us introduce local variables in each branch.

Now gate the final branch appropriately.  The overall result is that
if no useful FreeBSD build is found, we simply do not set the
freebsd_* runvars, rather than setting them to wrong values (eg,
`freebsd_distpath=/i386'.

Signed-off-by: Ian Jackson 
CC: Roger Pau Monné 
---
 mfi-common | 11 ---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/mfi-common b/mfi-common
index cef28ad..17b1b50 100644
--- a/mfi-common
+++ b/mfi-common
@@ -133,14 +133,19 @@ set_freebsd_runvars () {
 local envvar="FREEBSD_${arch^^}_BUILDJOB"
 if [ -n "${!envvar}" ]; then
 freebsd_runvars="freebsdbuildjob=${!envvar}"
-elif [ -n "$FREEBSD_DIST" ] && [ -n "$FREEBSD_VERSION" ]; then
+return
+fi
+if [ -n "$FREEBSD_DIST" ] && [ -n "$FREEBSD_VERSION" ]; then
 freebsd_runvars="freebsd_distpath=$FREEBSD_DIST/$arch \
  freebsd_version=$FREEBSD_VERSION"
-else
-local distpath=`getconfig "FreeBSDDist"`
+return
+fi
+local distpath=`getconfig "FreeBSDDist"`
+if [ -n "$distpath" ]; then
 local version=`getconfig "FreeBSDVersion"`
 freebsd_runvars="freebsd_distpath=$distpath/$arch \
  freebsd_version=$version"
+return
 fi
 }
 
-- 
2.1.4


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [OSSTEST PATCH 06/10] Perl @INC path: fix a few more scripts to use BEGIN

2018-05-17 Thread Ian Jackson
Three more files which missed out on
  dea987c5ab11 "PERLLIB, @INC: Use BEGIN { }"

Signed-off-by: Ian Jackson 
CC: Roger Pau Monné 
---
 mg-anoint | 2 +-
 ts-examine-hostprops-save | 2 +-
 ts-freebsd-host-install   | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/mg-anoint b/mg-anoint
index 837e608..b007ab4 100755
--- a/mg-anoint
+++ b/mg-anoint
@@ -63,7 +63,7 @@
 
 use strict qw(vars refs);
 use DBI;
-unshift @INC, qw(.);
+BEGIN { unshift @INC, qw(.); }
 use Osstest;
 use Osstest::TestSupport;
 use Osstest::Executive;
diff --git a/ts-examine-hostprops-save b/ts-examine-hostprops-save
index 188773f..55d2339 100755
--- a/ts-examine-hostprops-save
+++ b/ts-examine-hostprops-save
@@ -19,7 +19,7 @@ use strict qw(vars);
 use DBI;
 use POSIX;
 
-unshift @INC, qw(.);
+BEGIN { unshift @INC, qw(.); }
 use Osstest;
 use Osstest::TestSupport;
 
diff --git a/ts-freebsd-host-install b/ts-freebsd-host-install
index cac8be9..984bdf0 100755
--- a/ts-freebsd-host-install
+++ b/ts-freebsd-host-install
@@ -35,7 +35,7 @@ use strict qw(vars);
 use DBI;
 use POSIX;
 
-unshift @INC, qw(.);
+BEGIN { unshift @INC, qw(.); }
 use Osstest;
 use Osstest::TestSupport;
 
-- 
2.1.4


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [OSSTEST PATCH 01/10] Osstest/TestSupport: Use right arch for UEFI grub setup

2018-05-17 Thread Ian Jackson
Signed-off-by: Ian Jackson 
---
 Osstest/TestSupport.pm | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/Osstest/TestSupport.pm b/Osstest/TestSupport.pm
index 6d5e667..983bb17 100644
--- a/Osstest/TestSupport.pm
+++ b/Osstest/TestSupport.pm
@@ -2646,8 +2646,8 @@ menuentry 'local' {
   insmod part_gpt
   insmod part_msdos
   set root=(hd0,gpt1)
-  echo "Chainloading (\${root})/EFI/BOOT/BOOTAA64.EFI"
-  chainloader (\${root})/EFI/BOOT/BOOTAA64.EFI
+  echo "Chainloading (\${root})/EFI/BOOT/BOOT$efi.EFI"
+  chainloader (\${root})/EFI/BOOT/BOOT$efi.EFI
   boot
 }
 END
-- 
2.1.4


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH for-4.11] x86/cacheattr: fix mtrr_pat_not_equal

2018-05-17 Thread Jan Beulich
>>> On 17.05.18 at 13:10,  wrote:
> On Thu, May 17, 2018 at 04:44:04AM -0600, Jan Beulich wrote:
>> >>> On 17.05.18 at 11:48,  wrote:
>> > The function is supposed to return whether the MTRR and PAT state of
>> > two CPUs match. Currently this is not properly done because the test
>> > for the deftype and the enable bits required both the deftype and the
>> > enable bits to be different, while just one of those fields being
>> > different can already cause the MTRR states on the vCPU to not match.
>> > 
>> > Fix this by changing the AND into an OR instead, so that either the
>> > deftype or the enabled bits being different will cause the function to
>> > return mismatching state.
>> 
>> This is by far not enough, but I didn't view the function as critical
>> enough to warrant sending out the patch I have right away.
> 
> I've also realized that the logic there is wonky and would return true
> in cases where the states are equal (ie: for example if fixed MTRRs
> contents are different but FE is disabled).
> 
> Just wanted to do a minimal change that prevents wrongly reporting
> that the state is equal when it's not (I think the other way around is
> not that critical).
> 
> You change LGTM, and fixes some obvious cases where the current code
> would return true even if the cache state is the same.
> 
>> Jan
>> x86/HVM: correct mtrr_pat_not_equal()
>> 
>> The two vCPU-s differring in MTRR-enabled state means MTRR settings are
>> not equal. Both vCPU-s having MTRRs disabled means only PAT needs to be
>> compared. Along those lines for fixed range MTRRs. Differring variable
>> range counts likewise mean settings are different overall.
>> 
>> Constify types and convert bool_t to bool.
>> 
>> Signed-off-by: Jan Beulich 
> 
> Reviewed-by: Roger Pau Monné 

Thanks.

>> --- unstable.orig/xen/arch/x86/hvm/mtrr.c
>> +++ unstable/xen/arch/x86/hvm/mtrr.c
>> @@ -476,35 +476,40 @@ bool_t mtrr_var_range_msr_set(
>>  return 1;
>>  }
>>  
>> -bool_t mtrr_pat_not_equal(struct vcpu *vd, struct vcpu *vs)
>> +bool mtrr_pat_not_equal(const struct vcpu *vd, const struct vcpu *vs)
>>  {
>> -struct mtrr_state *md = &vd->arch.hvm_vcpu.mtrr;
>> -struct mtrr_state *ms = &vs->arch.hvm_vcpu.mtrr;
>> -int32_t res;
>> -uint8_t num_var_ranges = (uint8_t)md->mtrr_cap;
>> -
>> -/* Test fixed ranges. */
>> -res = memcmp(md->fixed_ranges, ms->fixed_ranges,
>> -NUM_FIXED_RANGES*sizeof(mtrr_type));
>> -if ( res )
>> -return 1;
>> -
>> -/* Test var ranges. */
>> -res = memcmp(md->var_ranges, ms->var_ranges,
>> -num_var_ranges*sizeof(struct mtrr_var_range));
>> -if ( res )
>> -return 1;
>> -
>> -/* Test default type MSR. */
>> -if ( (md->def_type != ms->def_type)
>> -&& (md->enabled != ms->enabled) )
>> -return 1;
>> +const struct mtrr_state *md = &vd->arch.hvm_vcpu.mtrr;
>> +const struct mtrr_state *ms = &vs->arch.hvm_vcpu.mtrr;
>>  
>> -/* Test PAT. */
>> -if ( vd->arch.hvm_vcpu.pat_cr != vs->arch.hvm_vcpu.pat_cr )
>> -return 1;
>> +if ( (md->enabled ^ ms->enabled) & 2 )
>> +return true;
>> +
>> +if ( md->enabled & 2 )
>> +{
>> +unsigned int num_var_ranges = (uint8_t)md->mtrr_cap;
>> +
>> +/* Test default type MSR. */
>> +if ( md->def_type != ms->def_type )
>> +return true;
>> +
>> +/* Test fixed ranges. */
>> +if ( (md->enabled ^ ms->enabled) & 1 )
>> +return true;
>> +
>> +if ( (md->enabled & 1) &&
>> + memcmp(md->fixed_ranges, ms->fixed_ranges,
>> +sizeof(md->fixed_ranges)) )
>> +return true;
>> +
>> +/* Test variable ranges. */
>> +if ( num_var_ranges != (uint8_t)ms->mtrr_cap ||
> 
> Is it really possible to have two vCPUs on the same domain with a
> different number of variable ranges?

Right now this is more for cosmetic reasons than for functionality. In
theory it is possible, and it'll become possible in practice once we allow
the number to be controlled through the load operation (see one of
the other patches we've discussed yesterday).

Jan


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH for-4.11] x86/cacheattr: fix mtrr_pat_not_equal

2018-05-17 Thread Andrew Cooper
On 17/05/18 12:33, Jan Beulich wrote:
 On 17.05.18 at 13:10,  wrote:
>> On Thu, May 17, 2018 at 04:44:04AM -0600, Jan Beulich wrote:
>> On 17.05.18 at 11:48,  wrote:
 The function is supposed to return whether the MTRR and PAT state of
 two CPUs match. Currently this is not properly done because the test
 for the deftype and the enable bits required both the deftype and the
 enable bits to be different, while just one of those fields being
 different can already cause the MTRR states on the vCPU to not match.

 Fix this by changing the AND into an OR instead, so that either the
 deftype or the enabled bits being different will cause the function to
 return mismatching state.
>>> This is by far not enough, but I didn't view the function as critical
>>> enough to warrant sending out the patch I have right away.
>> I've also realized that the logic there is wonky and would return true
>> in cases where the states are equal (ie: for example if fixed MTRRs
>> contents are different but FE is disabled).
>>
>> Just wanted to do a minimal change that prevents wrongly reporting
>> that the state is equal when it's not (I think the other way around is
>> not that critical).
>>
>> You change LGTM, and fixes some obvious cases where the current code
>> would return true even if the cache state is the same.
>>
>>> Jan
>>> x86/HVM: correct mtrr_pat_not_equal()
>>>
>>> The two vCPU-s differring in MTRR-enabled state means MTRR settings are
>>> not equal. Both vCPU-s having MTRRs disabled means only PAT needs to be
>>> compared. Along those lines for fixed range MTRRs. Differring variable
>>> range counts likewise mean settings are different overall.
>>>
>>> Constify types and convert bool_t to bool.
>>>
>>> Signed-off-by: Jan Beulich 
>> Reviewed-by: Roger Pau Monné 
> Thanks.
>
>>> --- unstable.orig/xen/arch/x86/hvm/mtrr.c
>>> +++ unstable/xen/arch/x86/hvm/mtrr.c
>>> @@ -476,35 +476,40 @@ bool_t mtrr_var_range_msr_set(
>>>  return 1;
>>>  }
>>>  
>>> -bool_t mtrr_pat_not_equal(struct vcpu *vd, struct vcpu *vs)
>>> +bool mtrr_pat_not_equal(const struct vcpu *vd, const struct vcpu *vs)
>>>  {
>>> -struct mtrr_state *md = &vd->arch.hvm_vcpu.mtrr;
>>> -struct mtrr_state *ms = &vs->arch.hvm_vcpu.mtrr;
>>> -int32_t res;
>>> -uint8_t num_var_ranges = (uint8_t)md->mtrr_cap;
>>> -
>>> -/* Test fixed ranges. */
>>> -res = memcmp(md->fixed_ranges, ms->fixed_ranges,
>>> -NUM_FIXED_RANGES*sizeof(mtrr_type));
>>> -if ( res )
>>> -return 1;
>>> -
>>> -/* Test var ranges. */
>>> -res = memcmp(md->var_ranges, ms->var_ranges,
>>> -num_var_ranges*sizeof(struct mtrr_var_range));
>>> -if ( res )
>>> -return 1;
>>> -
>>> -/* Test default type MSR. */
>>> -if ( (md->def_type != ms->def_type)
>>> -&& (md->enabled != ms->enabled) )
>>> -return 1;
>>> +const struct mtrr_state *md = &vd->arch.hvm_vcpu.mtrr;
>>> +const struct mtrr_state *ms = &vs->arch.hvm_vcpu.mtrr;
>>>  
>>> -/* Test PAT. */
>>> -if ( vd->arch.hvm_vcpu.pat_cr != vs->arch.hvm_vcpu.pat_cr )
>>> -return 1;
>>> +if ( (md->enabled ^ ms->enabled) & 2 )
>>> +return true;
>>> +
>>> +if ( md->enabled & 2 )
>>> +{
>>> +unsigned int num_var_ranges = (uint8_t)md->mtrr_cap;
>>> +
>>> +/* Test default type MSR. */
>>> +if ( md->def_type != ms->def_type )
>>> +return true;
>>> +
>>> +/* Test fixed ranges. */
>>> +if ( (md->enabled ^ ms->enabled) & 1 )
>>> +return true;
>>> +
>>> +if ( (md->enabled & 1) &&
>>> + memcmp(md->fixed_ranges, ms->fixed_ranges,
>>> +sizeof(md->fixed_ranges)) )
>>> +return true;
>>> +
>>> +/* Test variable ranges. */
>>> +if ( num_var_ranges != (uint8_t)ms->mtrr_cap ||
>> Is it really possible to have two vCPUs on the same domain with a
>> different number of variable ranges?
> Right now this is more for cosmetic reasons than for functionality. In
> theory it is possible, and it'll become possible in practice once we allow
> the number to be controlled through the load operation (see one of
> the other patches we've discussed yesterday).

MTRR MSRs are yet another todo item on the grand "move to a usable MSR
infrastructure" list.

MTRRcap is a read-only MSR, so will live in the domain policy and be a
single value across the entire domain.

Nothing good will come of trying to formally support different MSR
capabilities on different vcpus, because you won't find any hardware
where you can do this in practice.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH for-4.11] x86/cacheattr: fix mtrr_pat_not_equal

2018-05-17 Thread Juergen Gross
On 17/05/18 12:44, Jan Beulich wrote:
 On 17.05.18 at 11:48,  wrote:
>> The function is supposed to return whether the MTRR and PAT state of
>> two CPUs match. Currently this is not properly done because the test
>> for the deftype and the enable bits required both the deftype and the
>> enable bits to be different, while just one of those fields being
>> different can already cause the MTRR states on the vCPU to not match.
>>
>> Fix this by changing the AND into an OR instead, so that either the
>> deftype or the enabled bits being different will cause the function to
>> return mismatching state.
> 
> This is by far not enough, but I didn't view the function as critical
> enough to warrant sending out the patch I have right away.
> 
> Jan
> x86/HVM: correct mtrr_pat_not_equal()
> 
> The two vCPU-s differring in MTRR-enabled state means MTRR settings are
> not equal. Both vCPU-s having MTRRs disabled means only PAT needs to be
> compared. Along those lines for fixed range MTRRs. Differring variable
> range counts likewise mean settings are different overall.
> 
> Constify types and convert bool_t to bool.
> 
> Signed-off-by: Jan Beulich 

My Rab tag for Roger's patch is applicable here, too:

Release-acked-by: Juergen Gross 


Juergen


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH for-4.11] x86/cacheattr: fix mtrr_pat_not_equal

2018-05-17 Thread Jan Beulich
>>> On 17.05.18 at 13:46,  wrote:
> Nothing good will come of trying to formally support different MSR
> capabilities on different vcpus, because you won't find any hardware
> where you can do this in practice.

I agree, but it's not really clear to me how you want to enforce identical
values yet at the same time allow save (or really load) records to define
the number of ranges (which currently is broken). Perhaps we could
refuse to launch/unpause a guest when there are inconsistencies.

Jan



___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH for-4.11] x86/cacheattr: fix mtrr_pat_not_equal

2018-05-17 Thread Jan Beulich
>>> On 17.05.18 at 13:49,  wrote:
> On 17/05/18 12:44, Jan Beulich wrote:
> On 17.05.18 at 11:48,  wrote:
>>> The function is supposed to return whether the MTRR and PAT state of
>>> two CPUs match. Currently this is not properly done because the test
>>> for the deftype and the enable bits required both the deftype and the
>>> enable bits to be different, while just one of those fields being
>>> different can already cause the MTRR states on the vCPU to not match.
>>>
>>> Fix this by changing the AND into an OR instead, so that either the
>>> deftype or the enabled bits being different will cause the function to
>>> return mismatching state.
>> 
>> This is by far not enough, but I didn't view the function as critical
>> enough to warrant sending out the patch I have right away.
>> 
>> Jan
>> x86/HVM: correct mtrr_pat_not_equal()
>> 
>> The two vCPU-s differring in MTRR-enabled state means MTRR settings are
>> not equal. Both vCPU-s having MTRRs disabled means only PAT needs to be
>> compared. Along those lines for fixed range MTRRs. Differring variable
>> range counts likewise mean settings are different overall.
>> 
>> Constify types and convert bool_t to bool.
>> 
>> Signed-off-by: Jan Beulich 
> 
> My Rab tag for Roger's patch is applicable here, too:
> 
> Release-acked-by: Juergen Gross 

Thanks, but that depends on whether Andrew is willing to ack the
patch (with or without minor modifications).

Jan



___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH for-4.11] x86/cacheattr: fix mtrr_pat_not_equal

2018-05-17 Thread Jan Beulich
>>> On 17.05.18 at 13:46,  wrote:
> Nothing good will come of trying to formally support different MSR
> capabilities on different vcpus, because you won't find any hardware
> where you can do this in practice.

Thinking of it - allowing this would be a nice way to allow testing OS
code when meaning it to cope with asymmetries.

Jan



___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2 for-4.7/4.8] x86: Fix "x86: further CPUID handling adjustments"

2018-05-17 Thread Jan Beulich
>>> On 16.05.18 at 19:27,  wrote:
> c/s 62b187969 "x86: further CPUID handling adjustments" make some adjustments.
> However, it breaks levelling of guests, making it impossible for the toolstack
> to hide STIBP or IBPB from guests on hardware with up-to-date microcode.
> 
> The dom0 issue referenced in the commit message was fixed by the hunk
> adjusting the zeroing alone.  STIBP and IBPB don't need (and indeed, must not
> be for levelling purposes) OR'd into the leaf.
> 
> One final item which was missed in backport was the need to ignore the
> toolstack choice of STIBP, and set it equal to IBRSB.  This needs doing after
> the mask has been applied.
> 
> Signed-off-by: Andrew Cooper 

What about the patch below instead? This then allows the tool stack to override
STIBP independent of IBRSB.

Jan

x86: correct "further CPUID handling adjustments"

Commit 62b187969 "x86: further CPUID handling adjustments" went too far,
breaking feature levelling of DomU-s. Restrict the PV overrides to just
Dom0 and undo the HVM overrides.

Reported-by: Andrew Cooper 
Signed-off-by: Jan Beulich 

--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -3587,7 +3587,6 @@ void hvm_cpuid(unsigned int input, unsig
 
 *ecx &= hvm_featureset[FEATURESET_7c0];
 
-*edx |= cpufeat_mask(X86_FEATURE_STIBP);
 *edx &= hvm_featureset[FEATURESET_7d0];
 
 /* Don't expose HAP-only features to non-hap guests. */
@@ -3761,7 +3760,6 @@ void hvm_cpuid(unsigned int input, unsig
 hvm_cpuid(0x8001, NULL, NULL, NULL, &_edx);
 *eax |= (_edx & cpufeat_mask(X86_FEATURE_LM) ? vaddr_bits : 32) << 8;
 
-*ebx |= cpufeat_mask(X86_FEATURE_IBPB);
 *ebx &= hvm_featureset[FEATURESET_e8b];
 break;
 }
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -1156,7 +1156,8 @@ void pv_cpuid(struct cpu_user_regs *regs
 
 c &= pv_featureset[FEATURESET_7c0];
 
-d |= cpufeat_mask(X86_FEATURE_STIBP);
+if ( is_hardware_domain(currd) || is_control_domain(currd) )
+d |= cpufeat_mask(X86_FEATURE_STIBP);
 d &= pv_featureset[FEATURESET_7d0];
 
 if ( !is_pvh_domain(currd) )
@@ -1271,7 +1272,8 @@ void pv_cpuid(struct cpu_user_regs *regs
 
 case 0x8008:
 a = paddr_bits | (vaddr_bits << 8);
-b |= cpufeat_mask(X86_FEATURE_IBPB);
+if ( is_hardware_domain(currd) || is_control_domain(currd) )
+b |= cpufeat_mask(X86_FEATURE_IBPB);
 b &= pv_featureset[FEATURESET_e8b];
 break;
 


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] Test for osstest, features used in Qubes OS

2018-05-17 Thread Ian Jackson
Marek Marczykowski-Górecki writes ("Test for osstest, features used in Qubes 
OS"):
> As discussed some time ago, I'd like to help with adding tests for some
> features we use in Qubes OS.
> 
> IMO the easiest thing to test is host suspend. You just need to execute
> "rtcwake -s 30 -m mem", and see if the host is back to live after ~30s.
> Right now I know it works on Xen 4.8, but supposedly is broken on
> staging (haven't tested the most recent version).
> Next step would be the same while having some domains running.
> 
> How the test should look like (where to add this? etc)?

I guess this should be a new
  ts-host-suspend-test
script.

Is it likely that this will depend on non-buggy host firmware ?  If so
then we need to make arrangements to test it and only do it on hosts
which are not buggy.  In practice this probably means wiring it up to
the automatic host examiner.

> Next things would be mostly related to PCI passthrough:
>  - PCI passthrough with qemu in stubdomain
>  - the same as above, but with Linux-based stubdomain (we need cleanup
>and send patches for that first, probably 4.12 material)
>  - guest suspend (recently added libxl_domain_suspend_only), for
>different guest types (PV, PVH, HVM), also with/without PCI device
> 
> For this, the machine obviously need to have IOMMU (I assume at least
> some of the hardware used in test lab have it), and some spare PCI
> device. I use sound card for some of such tests. But testing on USB
> controllers would be more useful (from out experience, one of the most
> problematic devices for suspend, sadly also lacking FLR or such...).

I doubt any of our x86 machines have sound cards. ...  Just looked at
one and it says
  00:03.0 Audio device: Intel Corporation Xeon E3-1200 v3/4th Gen Core
  Processor HD Audio Controller (rev 06)
which is obviously mad.

I'm pretty sure they all have usb controllers.  Almost all of them
have multiple NICs, often on different pci devices, although it is
difficult to tell if a NIC not connected to anything is working.

Eg,

  02:00.0 Ethernet controller: Intel Corporation I210 Gigabit Network
  Connection (rev 03)

  03:00.0 Ethernet controller: Intel Corporation I210 Gigabit Network
  Connection (rev 03)

Is there some kind of cheap USB HID, that is interactable-with, which
we could plug into each machine's USB port ?  I'm slightly concerned
that plugging in a storage device, or connecting the other NIC, might
interfere with booting.

If you want to get pci passthrough tests working I would suggest
testing it with non-stubdom first.  I assume the config etc. is the
same, so having got that working, osstest would be able to test it for
the stubdom tests too.

Ian.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [xen-unstable test] 122804: FAIL

2018-05-17 Thread Ian Jackson
Juergen Gross writes ("Re: [Xen-devel] [xen-unstable test] 122804: FAIL"):
> On 17/05/18 01:57, osstest service owner wrote:
> > Tests which are failing intermittently (not blocking):
> >  test-amd64-amd64-xl-qemut-debianhvm-amd64 16 guest-localmigrate/x10 fail 
> > pass in 122715
> >  test-amd64-amd64-xl-qemuu-debianhvm-amd64 16 guest-localmigrate/x10 fail 
> > pass in 122715
> 
> Not sure about those. Olaf has seen a similar problem in our SLE15 tests
> which seems to be related to qemu file locking issues. At least the logs
> look exactly the same. See the thread
> 
> https://lists.xen.org/archives/html/xen-devel/2018-05/msg00369.html

That thread seems to have a lot of different bugs in it.

The one you see in 122804 is a heisenbug, which means it is probably a
race somewhere.

Ian.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] how to recognize in libxl that a domU has qemu-xen attached?

2018-05-17 Thread Olaf Hering
In the other thread about the unsolved migration bugs in qemu-xen it became 
clear that "xen-save-devices-state" must not only be called for HVM, but for 
every domU that has qemu-xen attached to it. I wonder how to check for that 
fact from the migration code. While it can continue to rely on 
LIBXL_DOMAIN_TYPE_HVM make that call, for  LIBXL_DOMAIN_TYPE_PV it is 
apparently not that easy. Is libxl__need_xenpv_qemu the API to use for the 
decision if libxl__qmp_stop/libxl__qmp_save/libxl__qmp_resume have to be called 
during migration?

Olaf


pgpSuA6Ql5mQd.pgp
Description: Digitale Signatur von OpenPGP
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [OSSTEST PATCH 06/10] Perl @INC path: fix a few more scripts to use BEGIN

2018-05-17 Thread Roger Pau Monné
On Thu, May 17, 2018 at 12:16:55PM +0100, Ian Jackson wrote:
> Three more files which missed out on
>   dea987c5ab11 "PERLLIB, @INC: Use BEGIN { }"
> 
> Signed-off-by: Ian Jackson 

Reviewed-by: Roger Pau Monné 

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] how to recognize in libxl that a domU has qemu-xen attached?

2018-05-17 Thread Juergen Gross
On 17/05/18 14:33, Olaf Hering wrote:
> In the other thread about the unsolved migration bugs in qemu-xen it became 
> clear that "xen-save-devices-state" must not only be called for HVM, but for 
> every domU that has qemu-xen attached to it. I wonder how to check for that 
> fact from the migration code. While it can continue to rely on 
> LIBXL_DOMAIN_TYPE_HVM make that call, for  LIBXL_DOMAIN_TYPE_PV it is 
> apparently not that easy. Is libxl__need_xenpv_qemu the API to use for the 
> decision if libxl__qmp_stop/libxl__qmp_save/libxl__qmp_resume have to be 
> called during migration?

libxl__need_xenpv_qemu() is used to determine whether a pv domain needs
a qemu process for at least one backend. To check for the correct qemu
type you need to call libxl__device_model_version_running() and test the
return value being LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN. So you need:

if (libxl__need_xenpv_qemu(gc, d_config) &&
libxl__device_model_version_running(gc, domid) ==
LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN)
libxl__qmp_stop(...);


Juergen

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v3 2/2] Input: xen-kbdfront - allow better run-time configuration

2018-05-17 Thread Boris Ostrovsky
On 05/17/2018 01:31 AM, Oleksandr Andrushchenko wrote:
> I will go with the change you suggested and
> I'll send v4 tomorrow then.



Please make sure your changes to kbdif.h are in Xen first. I believe you
submitted a patch there but I don't see it in the staging tree yet.

-boris

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v3 2/2] Input: xen-kbdfront - allow better run-time configuration

2018-05-17 Thread Oleksandr Andrushchenko

On 05/17/2018 04:08 PM, Boris Ostrovsky wrote:

On 05/17/2018 01:31 AM, Oleksandr Andrushchenko wrote:

I will go with the change you suggested and
I'll send v4 tomorrow then.



Please make sure your changes to kbdif.h are in Xen first. I believe you
submitted a patch there but I don't see it in the staging tree yet.

Sure, I already have Release ack for the one which is not
in Xen tree yet, hope Konrad can apply it today,
so tomorrow Xen and Linux are both ok (or by the time
I send v4).

-boris

Thank you,
Oleksandr

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH] libxc/x86: don't hand through CPUID leaf 0x80000008 as is

2018-05-17 Thread Jan Beulich
Just like for HVM the feature set should be used for EBX output, while
EAX should be restricted to the low 16 bits and ECX/EDX should be zero.

Short of there being white listing in place just like on the HVM side,
also zap leaves 6, 9, and 0x8007 as well as unknown / reserved ones.

Signed-off-by: Jan Beulich 

--- a/tools/libxc/xc_cpuid_x86.c
+++ b/tools/libxc/xc_cpuid_x86.c
@@ -575,12 +575,26 @@ static void xc_cpuid_pv_policy(xc_interf
 break;
 }
 
+case 0x8008:
+regs[0] &= 0xu;
+regs[1] = info->featureset[featureword_of(X86_FEATURE_CLZERO)];
+regs[2] = regs[3] = 0;
+break;
+
 case 0x0005: /* MONITOR/MWAIT */
+case 0x0006: /* Thermal and Power Management */
+case 0x0008: /* reserved */
+case 0x0009: /* Direct Cache Access */
 case 0x000b: /* Extended Topology Enumeration */
+case 0x000c: /* reserved */
+case DEF_MAX_BASE + 1 ... 0x: /* unknown / reserved */
+case 0x8007: /* Power Management / RAS */
+case 0x8009: /* reserved */
 case 0x800a: /* SVM revision and features */
+case 0x800b ... 0x8018: /* reserved */
 case 0x801b: /* Instruction Based Sampling */
 case 0x801c: /* Light Weight Profiling */
-case 0x801e: /* Extended topology reporting */
+case max_c(DEF_MAX_INTELEXT, DEF_MAX_AMDEXT) + 1 ... 0x8000:
 regs[0] = regs[1] = regs[2] = regs[3] = 0;
 break;
 }
--- a/tools/libxc/xc_private.h
+++ b/tools/libxc/xc_private.h
@@ -408,6 +408,10 @@ int xc_ffs64(uint64_t x);
 #define max_t(type,x,y) \
 ({ type __x = (x); type __y = (y); __x > __y ? __x: __y; })
 
+/* Use these only in places where constant expressions are required. */
+#define min_c(x, y) ((x) < (y) ? (x) : (y))
+#define max_c(x, y) ((x) > (y) ? (x) : (y))
+
 #define DOMPRINTF(fmt, args...) xc_dom_printf(dom->xch, fmt, ## args)
 #define DOMPRINTF_CALLED(xch) xc_dom_printf((xch), "%s: called", __FUNCTION__)
 





___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH] libxc/x86: don't hand through CPUID leaf 0x80000008 as is

2018-05-17 Thread Andrew Cooper
On 17/05/18 14:20, Jan Beulich wrote:
> Just like for HVM the feature set should be used for EBX output, while
> EAX should be restricted to the low 16 bits and ECX/EDX should be zero.
>
> Short of there being white listing in place just like on the HVM side,
> also zap leaves 6, 9, and 0x8007 as well as unknown / reserved ones.
>
> Signed-off-by: Jan Beulich 

Do you want this for backporting?

The changes below are enforced by recalculate_cpuid_policy() (and in
particular, recaluclate_misc()) in the hypervisor for the past few
releases, and Sergey is currently in the process of making all of this
libxc logic disappear.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH] libxc/x86: don't hand through CPUID leaf 0x80000008 as is

2018-05-17 Thread Jan Beulich
>>> On 17.05.18 at 15:26,  wrote:
> On 17/05/18 14:20, Jan Beulich wrote:
>> Just like for HVM the feature set should be used for EBX output, while
>> EAX should be restricted to the low 16 bits and ECX/EDX should be zero.
>>
>> Short of there being white listing in place just like on the HVM side,
>> also zap leaves 6, 9, and 0x8007 as well as unknown / reserved ones.
>>
>> Signed-off-by: Jan Beulich 
> 
> Do you want this for backporting?

Not really, at least that wasn't a primary goal.

> The changes below are enforced by recalculate_cpuid_policy() (and in
> particular, recaluclate_misc()) in the hypervisor for the past few
> releases, and Sergey is currently in the process of making all of this
> libxc logic disappear.

Well, maybe I'm simply confused: Commit d297b56682 ("x86/cpuid: Handling
of IBRS/IBPB, STIBP and IBRS for guests") introduced similar code into
xc_cpuid_hvm_policy() without doing the same for xc_cpuid_pv_policy().
That's pretty recent a commit, and one that has been backported all the
way through to 4.6. Are you saying that was a pointless change then?

Jan



___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH] libxc/x86: don't hand through CPUID leaf 0x80000008 as is

2018-05-17 Thread Andrew Cooper
On 17/05/18 14:38, Jan Beulich wrote:
 On 17.05.18 at 15:26,  wrote:
>> On 17/05/18 14:20, Jan Beulich wrote:
>>> Just like for HVM the feature set should be used for EBX output, while
>>> EAX should be restricted to the low 16 bits and ECX/EDX should be zero.
>>>
>>> Short of there being white listing in place just like on the HVM side,
>>> also zap leaves 6, 9, and 0x8007 as well as unknown / reserved ones.
>>>
>>> Signed-off-by: Jan Beulich 
>> Do you want this for backporting?
> Not really, at least that wasn't a primary goal.
>
>> The changes below are enforced by recalculate_cpuid_policy() (and in
>> particular, recaluclate_misc()) in the hypervisor for the past few
>> releases, and Sergey is currently in the process of making all of this
>> libxc logic disappear.
> Well, maybe I'm simply confused: Commit d297b56682 ("x86/cpuid: Handling
> of IBRS/IBPB, STIBP and IBRS for guests") introduced similar code into
> xc_cpuid_hvm_policy() without doing the same for xc_cpuid_pv_policy().
> That's pretty recent a commit, and one that has been backported all the
> way through to 4.6. Are you saying that was a pointless change then?

No sorry - you're completely correct.

Without the PV side, a guest will by default get the same settings as
dom0.  The reason why my XTF tests doesn't notice this is because libxl
uses a separate path, and XenServer uses a yet-different path.

The PV side wants to gain a matching clzero hunk.

My comment about recalculate_cpuid_policy() applies to the clamping part
of the change.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v4 2/2] xen/kbdif: Add features to disable keyboard and pointer

2018-05-17 Thread Konrad Rzeszutek Wilk
On Thu, May 17, 2018 at 08:51:38AM +0300, Oleksandr Andrushchenko wrote:
> On 05/17/2018 08:50 AM, Juergen Gross wrote:
> > On 17/05/18 07:45, Oleksandr Andrushchenko wrote:
> > > Hi, Juergen!
> > > 
> > > This patch should go into 4.11 as it is needed for a related Linux
> > > kernel patch and the risk is next to zero for Xen due to only adding
> > > some macros not in use on Xen side.
> > > 
> > > Could you please release ack this
> > Release-acked-by: Juergen Gross 
> Thank you
> > > and apply?
> > This has to be done by a committer, which I'm not.
> Konrad, could you please apply?

Yes of course.
> > 
> > Juergen
> Thank you,
> Oleksandr

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [OSSTEST PATCH 07/10] mg-anoint: Make readonly operations "work" in standalone mode

2018-05-17 Thread Roger Pau Monné
On Thu, May 17, 2018 at 12:16:56PM +0100, Ian Jackson wrote:
> This makes `mg-anoint' in standalone mode a view onto an empty set of
> anointments.  So now it becomes ok to call mg-anoint in make-*-flight.
> 
> Signed-off-by: Ian Jackson 

Reviewed-by: Roger Pau Monné 

Thanks.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [xen-4.7-testing test] 122831: regressions - FAIL

2018-05-17 Thread osstest service owner
flight 122831 xen-4.7-testing real [real]
http://logs.test-lab.xenproject.org/osstest/logs/122831/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-arm64-arm64-libvirt-xsm broken  in 122678
 test-arm64-arm64-xl-xsm  broken  in 122678
 test-arm64-arm64-xl-credit2  broken  in 122678
 test-xtf-amd64-amd64-2 50 xtf/test-hvm64-lbr-tsx-vmentry fail REGR. vs. 122131

Tests which are failing intermittently (not blocking):
 test-arm64-arm64-libvirt-xsm 4 host-install(4) broken in 122678 pass in 122831
 test-arm64-arm64-xl-xsm  4 host-install(4) broken in 122678 pass in 122831
 test-arm64-arm64-xl-credit2  4 host-install(4) broken in 122678 pass in 122831
 test-amd64-amd64-xl-qemuu-debianhvm-amd64 16 guest-localmigrate/x10 fail in 
122678 pass in 122831
 test-amd64-amd64-xl-qemut-debianhvm-amd64-xsm 16 guest-localmigrate/x10 fail 
in 122678 pass in 122831
 test-xtf-amd64-amd64-3   50 xtf/test-hvm64-lbr-tsx-vmentry fail pass in 122678

Tests which did not succeed, but are not blocking:
 test-xtf-amd64-amd64-4  50 xtf/test-hvm64-lbr-tsx-vmentry fail like 122131
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stopfail like 122131
 test-armhf-armhf-libvirt 14 saverestore-support-checkfail  like 122131
 test-armhf-armhf-libvirt-xsm 14 saverestore-support-checkfail  like 122131
 test-amd64-i386-xl-qemut-win7-amd64 17 guest-stop fail like 122131
 test-armhf-armhf-xl-rtds 16 guest-start/debian.repeatfail  like 122131
 test-amd64-amd64-xl-qemut-win7-amd64 17 guest-stopfail like 122131
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop fail like 122131
 test-amd64-i386-xl-qemut-ws16-amd64 17 guest-stop fail like 122131
 test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail  like 122131
 test-amd64-i386-xl-qemuu-ws16-amd64 17 guest-stop fail like 122131
 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stopfail like 122131
 test-amd64-amd64-xl-qemut-ws16-amd64 17 guest-stopfail like 122131
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  13 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  13 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  14 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-armhf-armhf-xl-arndale  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  14 saverestore-support-checkfail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2  fail never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-xsm  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-xsm  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 13 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 14 saverestore-support-checkfail  never pass
 test-armhf-armhf-xl-cubietruck 13 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 14 saverestore-support-checkfail never pass
 test-armhf-armhf-xl-credit2  13 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-raw 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  13 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  14 saverestore-support-checkfail   never pass
 test-amd64-a

Re: [Xen-devel] [OSSTEST PATCH 08/10] mg-anoint: Support mg-anoint retrieve --tolerate-unprepared

2018-05-17 Thread Roger Pau Monné
On Thu, May 17, 2018 at 12:16:57PM +0100, Ian Jackson wrote:
> make-*-flight is going to want this.
> 
> Signed-off-by: Ian Jackson 

Reviewed-by: Roger Pau Monné 

Just a style comment below.

> ---
>  mg-anoint | 12 ++--
>  1 file changed, 10 insertions(+), 2 deletions(-)
> 
> diff --git a/mg-anoint b/mg-anoint
> index 522cbdd..d09124b 100755
> --- a/mg-anoint
> +++ b/mg-anoint
> @@ -15,10 +15,12 @@
>  #--allow-blessed=BLESSING,...   default is from `prepare'
>  #--allow-job-status=STATUS,...  default is only `pass'
>  #
> -#  ./mg-anoint retrieve REFKEY
> +#  ./mg-anoint retrieve [--tolerate-unprepared] REFKEY
>  #  => FLIGHT JOB
>  # if nothing anointed yet, prints nothing and exits 0
>  # if anointment not prepared, fails
> +#  With --tolerate-unprepared, it is not an error if nothing is
> +#  reported because the anointment has not been prepared.
>  #
>  #  ./mg-anoint list-prepared REFKEY-GLOB
>  #  => possibly empty list of REFKEYs
> @@ -294,6 +296,11 @@ END
>  }
>  
>  sub cmd_retrieve {
> +my $tolerate_unprepared;
> +if (@ARGV && $ARGV[0] eq '--tolerate-unprepared') {
> + shift @ARGV;
> + $tolerate_unprepared = 1;
> +}

There seems to be a mix between hard and soft tabs in the chunk above
(and below).

Thanks.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v1] xl: show full value of cpu_khz in xl info output

2018-05-17 Thread Olaf Hering
Am Tue,  3 Apr 2018 13:14:11 +0200
schrieb Olaf Hering :

> The exact value of cpu_khz can only be obtained via 'xl dmesg', and
> therefore can be lost after some time. 'xl info' truncates the value to
> full MHz. Adjust the output to show the full khz value.
> This helps the host admin to track how a host has calibrated itself. The
> value of cpu_khz is used during live migration for the decision if
> access to TSC should be emualted.

I just found this in my backlog of unapplied patches.
Any word on this change?

Olaf


pgpuDRbrpyezT.pgp
Description: Digitale Signatur von OpenPGP
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [OSSTEST PATCH 09/10] mfi-common: set_freebsd_runvars: Never set freebsd_distpath to `/amd64' etc.

2018-05-17 Thread Roger Pau Monné
On Thu, May 17, 2018 at 12:16:58PM +0100, Ian Jackson wrote:
> Logically, the final branch of the if should be qualified with a check
> for the emptiness of FreeBSDDist.  This is awkward in the current
> structure, since we really want to do the distpath lookup only if
> needed.  (This is not very important right now, but we are about to
> add another case which will do a more-likely-to-bomb-out and
> more-likely-to-block-on-the-db lookup.)  So refactor into `return'
> style.  This lets us introduce local variables in each branch.
> 
> Now gate the final branch appropriately.  The overall result is that
> if no useful FreeBSD build is found, we simply do not set the
> freebsd_* runvars, rather than setting them to wrong values (eg,
> `freebsd_distpath=/i386'.
> 
> Signed-off-by: Ian Jackson 

Reviewed-by: Roger Pau Monné 

Thanks.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v3 2/2] Input: xen-kbdfront - allow better run-time configuration

2018-05-17 Thread Konrad Rzeszutek Wilk
On Thu, May 17, 2018 at 04:09:04PM +0300, Oleksandr Andrushchenko wrote:
> On 05/17/2018 04:08 PM, Boris Ostrovsky wrote:
> > On 05/17/2018 01:31 AM, Oleksandr Andrushchenko wrote:
> > > I will go with the change you suggested and
> > > I'll send v4 tomorrow then.
> > 
> > 
> > Please make sure your changes to kbdif.h are in Xen first. I believe you
> > submitted a patch there but I don't see it in the staging tree yet.
> Sure, I already have Release ack for the one which is not
> in Xen tree yet, hope Konrad can apply it today,
> so tomorrow Xen and Linux are both ok (or by the time

It is checked in (on the Xen tree).
> I send v4).
> > -boris
> Thank you,
> Oleksandr

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v1] libxl: fix return code in qmp_synchronous_send

2018-05-17 Thread Olaf Hering
Use error code from libxl namespace, a plain -1 is not valid in this context.

Signed-off-by: Olaf Hering 
---
 tools/libxl/libxl_qmp.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/libxl/libxl_qmp.c b/tools/libxl/libxl_qmp.c
index be1fda18ba..0fe42813bf 100644
--- a/tools/libxl/libxl_qmp.c
+++ b/tools/libxl/libxl_qmp.c
@@ -627,7 +627,7 @@ static int qmp_synchronous_send(libxl__qmp_handler *qmp, 
const char *cmd,
 
 id = qmp_send(qmp, cmd, args, callback, opaque, &context);
 if (id <= 0) {
-return -1;
+return ERROR_FAIL;
 }
 qmp->wait_for_id = id;
 

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [OSSTEST PATCH 10/10] mfi-common: Fall back to anointed builds in Executive mode

2018-05-17 Thread Roger Pau Monné
On Thu, May 17, 2018 at 12:16:59PM +0100, Ian Jackson wrote:
> Previously, `make-*-flight' would not work unless FREEBSD_*_BUILDJOB
> was set.  Now we use the anointed values if we can find them.
> 
> If we can't, mg-anoint retrieve will print a warning.
> 
> Signed-off-by: Ian Jackson 

LGTM

Reviewed-by: Roger Pau Monné 

Thanks, Roger.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v1] xl: show full value of cpu_khz in xl info output

2018-05-17 Thread Wei Liu
On Thu, May 17, 2018 at 04:25:58PM +0200, Olaf Hering wrote:
> Am Tue,  3 Apr 2018 13:14:11 +0200
> schrieb Olaf Hering :
> 
> > The exact value of cpu_khz can only be obtained via 'xl dmesg', and
> > therefore can be lost after some time. 'xl info' truncates the value to
> > full MHz. Adjust the output to show the full khz value.
> > This helps the host admin to track how a host has calibrated itself. The
> > value of cpu_khz is used during live migration for the decision if
> > access to TSC should be emualted.
> 
> I just found this in my backlog of unapplied patches.
> Any word on this change?

Acked-by: Wei Liu 

Sorry I missed this one.

CC Juergen, I think this should be in 4.11.

Wei.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v3 1/2] xen/PVH: Set up GS segment for stack canary

2018-05-17 Thread Boris Ostrovsky
We are making calls to C code (e.g. xen_prepare_pvh()) which may use
stack canary (stored in GS segment).

Signed-off-by: Boris Ostrovsky 
---
 arch/x86/xen/xen-pvh.S | 25 -
 1 file changed, 24 insertions(+), 1 deletion(-)

diff --git a/arch/x86/xen/xen-pvh.S b/arch/x86/xen/xen-pvh.S
index e1a5fbe..0db540c 100644
--- a/arch/x86/xen/xen-pvh.S
+++ b/arch/x86/xen/xen-pvh.S
@@ -54,6 +54,9 @@
  * charge of setting up it's own stack, GDT and IDT.
  */
 
+#define PVH_GDT_ENTRY_CANARY   4
+#define PVH_CANARY_SEL (PVH_GDT_ENTRY_CANARY * 8)
+
 ENTRY(pvh_start_xen)
cld
 
@@ -64,6 +67,9 @@ ENTRY(pvh_start_xen)
mov %eax,%es
mov %eax,%ss
 
+   mov $PVH_CANARY_SEL,%eax
+   mov %eax,%gs
+
/* Stash hvm_start_info. */
mov $_pa(pvh_start_info), %edi
mov %ebx, %esi
@@ -98,6 +104,12 @@ ENTRY(pvh_start_xen)
/* 64-bit entry point. */
.code64
 1:
+   /* Set base address in stack canary descriptor. */
+   mov $MSR_GS_BASE,%ecx
+   mov $canary, %rax
+   cdq
+   wrmsr
+
call xen_prepare_pvh
 
/* startup_64 expects boot_params in %rsi. */
@@ -107,6 +119,14 @@ ENTRY(pvh_start_xen)
 
 #else /* CONFIG_X86_64 */
 
+   /* Set base address in stack canary descriptor. */
+   movl _pa(gdt_start),%eax
+   movl $_pa(canary),%ecx
+   movw %cx, (PVH_GDT_ENTRY_CANARY * 8) + 0(%eax)
+   shrl $16, %ecx
+   movb %cl, (PVH_GDT_ENTRY_CANARY * 8) + 2(%eax)
+   movb %ch, (PVH_GDT_ENTRY_CANARY * 8) + 5(%eax)
+
call mk_early_pgtbl_32
 
mov $_pa(initial_page_table), %eax
@@ -150,9 +170,12 @@ gdt_start:
.quad GDT_ENTRY(0xc09a, 0, 0xf) /* __KERNEL_CS */
 #endif
.quad GDT_ENTRY(0xc092, 0, 0xf) /* __KERNEL_DS */
+   .quad GDT_ENTRY(0x4090, 0, 0x18)/* PVH_CANARY_SEL */
 gdt_end:
 
-   .balign 4
+   .balign 16
+canary:
+   .fill 24, 1, 0
 early_stack:
.fill 256, 1, 0
 early_stack_end:
-- 
2.9.3


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v3 0/2] PVH GDT fixes

2018-05-17 Thread Boris Ostrovsky
Fix stack canary handling (in the first patch) and re-index PVH GDT to
make it explicit that the GDT PVH-specific

v3:
- Use GS base MSR for 64-bit mode

Boris Ostrovsky (2):
  xen/PVH: Set up GS segment for stack canary
  xen/PVH: Make GDT selectors PVH-specific

 arch/x86/xen/xen-pvh.S | 46 --
 1 file changed, 36 insertions(+), 10 deletions(-)

-- 
2.9.3


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2 06/10] x86/SVM: Add vcpu scheduling support for AVIC

2018-05-17 Thread Jan Beulich
>>> On 07.05.18 at 23:07,  wrote:
> @@ -65,6 +66,51 @@ avic_get_physical_id_entry(struct svm_domain *d, unsigned 
> int index)
>  return &d->avic_physical_id_table[index];
>  }
>  
> +static void avic_vcpu_load(struct vcpu *v)
> +{
> +struct arch_svm_struct *s = &v->arch.hvm_svm;
> +int h_phy_apic_id;

APIC IDs are of unsigned type.

> +ASSERT(!test_bit(_VPF_blocked, &v->pause_flags));
> +
> +/*
> + * Note: APIC ID = 0xff is used for broadcast.
> + *   APIC ID > 0xff is reserved.
> + */
> +h_phy_apic_id = cpu_data[v->processor].apicid;
> +ASSERT(h_phy_apic_id < AVIC_PHY_APIC_ID_MAX);
> +
> +s->avic_last_phy_id->host_phy_apic_id = h_phy_apic_id;
> +smp_wmb();
> +set_bit(IS_RUNNING_BIT, (u64*)(s->avic_last_phy_id));

You have a struct defined for this - please avoid such bogus casting.
I can see why you may not want to use the bitfield here - make the
struct a union with a "raw" field, define IS_RUNNING_BIT right there (so
one can easily see the correlation; may require renaming the constant),
and do the operation on &s->raw.

Jan



___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v3 2/2] xen/PVH: Make GDT selectors PVH-specific

2018-05-17 Thread Boris Ostrovsky
We don't need to share PVH GDT layout with other GDTs, especially
since we now have a PVH-speciific entry (for stack canary segment).

Define PVH's own selectors.

(As a side effect of this change we are also fixing improper
reference to __KERNEL_CS)

Signed-off-by: Boris Ostrovsky 
---
 arch/x86/xen/xen-pvh.S | 23 +--
 1 file changed, 13 insertions(+), 10 deletions(-)

diff --git a/arch/x86/xen/xen-pvh.S b/arch/x86/xen/xen-pvh.S
index 0db540c..f09350a 100644
--- a/arch/x86/xen/xen-pvh.S
+++ b/arch/x86/xen/xen-pvh.S
@@ -54,7 +54,11 @@
  * charge of setting up it's own stack, GDT and IDT.
  */
 
-#define PVH_GDT_ENTRY_CANARY   4
+#define PVH_GDT_ENTRY_CS   1
+#define PVH_GDT_ENTRY_DS   2
+#define PVH_GDT_ENTRY_CANARY   3
+#define PVH_CS_SEL (PVH_GDT_ENTRY_CS * 8)
+#define PVH_DS_SEL (PVH_GDT_ENTRY_DS * 8)
 #define PVH_CANARY_SEL (PVH_GDT_ENTRY_CANARY * 8)
 
 ENTRY(pvh_start_xen)
@@ -62,7 +66,7 @@ ENTRY(pvh_start_xen)
 
lgdt (_pa(gdt))
 
-   mov $(__BOOT_DS),%eax
+   mov $PVH_DS_SEL,%eax
mov %eax,%ds
mov %eax,%es
mov %eax,%ss
@@ -99,7 +103,7 @@ ENTRY(pvh_start_xen)
mov %eax, %cr0
 
/* Jump to 64-bit mode. */
-   ljmp $__KERNEL_CS, $_pa(1f)
+   ljmp $PVH_CS_SEL, $_pa(1f)
 
/* 64-bit entry point. */
.code64
@@ -136,13 +140,13 @@ ENTRY(pvh_start_xen)
or $(X86_CR0_PG | X86_CR0_PE), %eax
mov %eax, %cr0
 
-   ljmp $__BOOT_CS, $1f
+   ljmp $PVH_CS_SEL, $1f
 1:
call xen_prepare_pvh
mov $_pa(pvh_bootparams), %esi
 
/* startup_32 doesn't expect paging and PAE to be on. */
-   ljmp $__BOOT_CS, $_pa(2f)
+   ljmp $PVH_CS_SEL, $_pa(2f)
 2:
mov %cr0, %eax
and $~X86_CR0_PG, %eax
@@ -151,7 +155,7 @@ ENTRY(pvh_start_xen)
and $~X86_CR4_PAE, %eax
mov %eax, %cr4
 
-   ljmp $__BOOT_CS, $_pa(startup_32)
+   ljmp $PVH_CS_SEL, $_pa(startup_32)
 #endif
 END(pvh_start_xen)
 
@@ -163,13 +167,12 @@ gdt:
.word 0
 gdt_start:
.quad 0x/* NULL descriptor */
-   .quad 0x/* reserved */
 #ifdef CONFIG_X86_64
-   .quad GDT_ENTRY(0xa09a, 0, 0xf) /* __KERNEL_CS */
+   .quad GDT_ENTRY(0xa09a, 0, 0xf) /* PVH_CS_SEL */
 #else
-   .quad GDT_ENTRY(0xc09a, 0, 0xf) /* __KERNEL_CS */
+   .quad GDT_ENTRY(0xc09a, 0, 0xf) /* PVH_CS_SEL */
 #endif
-   .quad GDT_ENTRY(0xc092, 0, 0xf) /* __KERNEL_DS */
+   .quad GDT_ENTRY(0xc092, 0, 0xf) /* PVH_DS_SEL */
.quad GDT_ENTRY(0x4090, 0, 0x18)/* PVH_CANARY_SEL */
 gdt_end:
 
-- 
2.9.3


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2 07/10] x86/SVM: Add interrupt management code via AVIC

2018-05-17 Thread Jan Beulich
>>> On 07.05.18 at 23:07,  wrote:
> +void svm_avic_deliver_posted_intr(struct vcpu *v, u8 vec)
> +{
> +struct vlapic *vlapic = vcpu_vlapic(v);
> +
> +/* Fallback to use non-AVIC if vcpu is not enabled with AVIC. */
> +if ( !svm_avic_vcpu_enabled(v) )
> +{
> +if ( !vlapic_test_and_set_vector(vec, &vlapic->regs->data[APIC_IRR]) 
> )
> +vcpu_kick(v);
> +return;
> +}
> +
> +/* If interrupt is disabled, do not ignore the interrupt */
> +if ( !(guest_cpu_user_regs()->eflags & X86_EFLAGS_IF) )
> +return;

It seems to me that I did comment on this before - I don't think EFLAGS.IF
should be considered here:

> +if ( vlapic_test_and_set_vector(vec, &vlapic->regs->data[APIC_IRR]) )
> +return;

Latching the interrupt into IRR ought to keep it pending until the guest sets
EFLAGS.IF again.

Jan



___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] Draft NVDIMM proposal

2018-05-17 Thread George Dunlap
On 05/15/2018 07:06 PM, Dan Williams wrote:
> On Tue, May 15, 2018 at 7:19 AM, George Dunlap  
> wrote:
>> So, who decides what this SPA range and interleave set is?  Can the
>> operating system change these interleave sets and mappings, or change
>> data from PMEM to BLK, and is so, how?
> 
> The interleave-set to SPA range association and delineation of
> capacity between PMEM and BLK access modes is current out-of-scope for
> ACPI. The BIOS reports the configuration to the OS via the NFIT, but
> the configuration is currently written by vendor specific tooling.
> Longer term it would be great for this mechanism to become
> standardized and available to the OS, but for now it requires platform
> specific tooling to change the DIMM interleave configuration.

OK -- I was sort of assuming that different hardware would have
different drivers in Linux that ndctl knew how to drive (just like any
other hardware with vendor-specific interfaces); but it sounds a bit
more like at the moment it's binary blobs either in the BIOS/firmware,
or a vendor-supplied tool.

>> And so (here's another guess) -- when you're talking about namespaces
>> and label areas, you're talking about namespaces stored *within a
>> pre-existing SPA range*.  You use the same format as described in the
>> UEFI spec, but ignore all the stuff about interleave sets and whatever,
>> and use system physical addresses relative to the SPA range rather than
>> DPAs.
> 
> Well, we don't ignore it because we need to validate in the driver
> that the interleave set configuration matches a checksum that we
> generated when the namespace was first instantiated on the interleave
> set. However, you are right, for accesses at run time all we care
> about is the SPA for PMEM accesses.
[snip]
> They can change, but only under the control of the BIOS. All changes
> to the interleave set configuration need a reboot because the memory
> controller needs to be set up differently at system-init time.
[snip]
> No, the checksum I'm referring to is the interleave set cookie (see:
> "SetCookie" in the UEFI 2.7 specification). It validates that the
> interleave set backing the SPA has not changed configuration since the
> last boot.
[snip]
> The NVDIMM just provides storage area for the OS to write opaque data
> that just happens to conform to the UEFI Namespace label format. The
> interleave-set configuration is stored in yet another out-of-band
> location on the DIMM or on some platform-specific storage location and
> is consulted / restored by the BIOS each boot. The NFIT is the output
> from the platform specific physical mappings of the DIMMs, and
> Namespaces are logical volumes built on top of those hard-defined NFIT
> boundaries.

OK, so what I'm hearing is:

The label area isn't "within a pre-existing SPA range" as I was guessing
(i.e., similar to a partition table residing within a disk); it is the
per-DIMM label area as described by UEFI spec.

But, the interleave set data in the label area doesn't *control* the
hardware -- the NVDIMM controller / bios / firmware don't read it or do
anything based on what's in it.  Rather, the interleave set data in the
label area is there to *record*, for the operating system's benefit,
what the hardware configuration was when the labels were created, so
that if it changes, the OS knows that the label area is invalid; it must
either refrain from touching the NVRAM (if it wants to preserve the
data), or write a new label area.

The OS can also use labels to partition a single SPA range into several
namespaces.  It can't change the interleaving, but it can specify that
[0-A) is one namespace, [A-B) is another namespace, &c; and these
namespaces will naturally map into the SPA range advertised in the NFIT.

And if a controller allows the same memory to be used either as PMEM or
PBLK, it can write which *should* be used for which, and then can avoid
accessing the same underlying NVRAM in two different ways (which will
yield unpredictable results).

That makes sense.

>> If SPA regions don't change after boot, and if Xen can find its own
>> Xen-specific namespace to use for the frame tables by reading the NFIT
>> table, then that significantly reduces the amount of interaction it
>> needs with Linux.
>>
>> If SPA regions *can* change after boot, and if Xen must rely on Linux to
>> read labels and find out what it can safely use for frame tables, then
>> it makes things significantly more involved.  Not impossible by any
>> means, but a lot more complicated.
>>
>> Hope all that makes sense -- thanks again for your help.
> 
> I think it does, but it seems namespaces are out of reach for Xen
> without some agent / enabling that can execute the necessary AML
> methods.

Sure, we're pretty much used to that. :-)  We'll have Linux read the
label area and tell Xen what it needs to know.  But:

* Xen can know the SPA ranges of all potential NVDIMMs before dom0
starts.  So it can tell, for instance, if a page mapped by dom0 is
inside an

Re: [Xen-devel] [PATCH v2 09/10] x86/SVM: Introduce svm command line option

2018-05-17 Thread Jan Beulich
>>> On 07.05.18 at 23:07,  wrote:
> --- a/xen/arch/x86/hvm/svm/svm.c
> +++ b/xen/arch/x86/hvm/svm/svm.c
> @@ -64,6 +64,16 @@
>  #include 
>  #include 
>  
> +static int parse_svm_param(const char *s);

This is unnecessary if you move ...

> +/*
> + * The 'svm' parameter en/dis-ables various SVM features.
> + * Optional comma separated value may contain:
> + *
> + *   avic - Enable SVM Advanced Virtual Interrupt Controller (AVIC)
> + */
> +custom_param("svm", parse_svm_param);

... this after the function definition.

> @@ -89,6 +99,28 @@ static bool_t amd_erratum383_found __read_mostly;
>  static uint64_t osvw_length, osvw_status;
>  static DEFINE_SPINLOCK(osvw_lock);
>  
> +static int __init parse_svm_param(const char *s)
> +{
> +char *ss;
> +int val;
> +
> +do {
> +val = !!strncmp(s, "no-", 3);
> +if ( !val )
> +s += 3;

Please use parse_boolean().

> +ss = strchr(s, ',');
> +if ( ss )
> +*ss = '\0';
> +
> +if ( !strcmp(s, "avic") )
> +svm_avic = val;

else
ret = -EINVAL;

Jan



___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] how to recognize in libxl that a domU has qemu-xen attached?

2018-05-17 Thread Olaf Hering
Am Thu, 17 May 2018 14:55:10 +0200
schrieb Juergen Gross :

> libxl__need_xenpv_qemu() is used to determine whether a pv domain needs
> a qemu process for at least one backend. 

Thanks. Too bad, d_config is not available in that context. It is probably 
known somewhere by the callers. I guess such caller needs to pass a bool down 
to suspend/resume.

Olaf


pgpV619VpXj0Y.pgp
Description: Digitale Signatur von OpenPGP
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2 10/10] x86/SVM: Append AMD AVIC related data to IRQ keyhandler 'i'

2018-05-17 Thread Jan Beulich
>>> On 07.05.18 at 23:07,  wrote:
> @@ -2351,6 +2352,7 @@ static void dump_irqs(unsigned char key)
>  printk("   %#02x -> %ps()\n", i, direct_apic_vector[i]);
>  
>  dump_ioapic_irq_info();
> +dump_avic_info();
>  }

While this is better than a separate key, I still don't like it. These 
statistics
are unrelated to the purpose of 'i'. Why can't this be connected to e.g. 'v',
provided these statistics are all that relevant in the first place?

Jan



___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] Test for osstest, features used in Qubes OS

2018-05-17 Thread Marek Marczykowski-Górecki
On Thu, May 17, 2018 at 01:26:30PM +0100, Ian Jackson wrote:
> Marek Marczykowski-Górecki writes ("Test for osstest, features used in Qubes 
> OS"):
> > As discussed some time ago, I'd like to help with adding tests for some
> > features we use in Qubes OS.
> > 
> > IMO the easiest thing to test is host suspend. You just need to execute
> > "rtcwake -s 30 -m mem", and see if the host is back to live after ~30s.
> > Right now I know it works on Xen 4.8, but supposedly is broken on
> > staging (haven't tested the most recent version).
> > Next step would be the same while having some domains running.
> > 
> > How the test should look like (where to add this? etc)?
> 
> I guess this should be a new
>   ts-host-suspend-test
> script.
> 
> Is it likely that this will depend on non-buggy host firmware ?  If so
> then we need to make arrangements to test it and only do it on hosts
> which are not buggy.  In practice this probably means wiring it up to
> the automatic host examiner.

Yes, probably.

> > Next things would be mostly related to PCI passthrough:
> >  - PCI passthrough with qemu in stubdomain
> >  - the same as above, but with Linux-based stubdomain (we need cleanup
> >and send patches for that first, probably 4.12 material)
> >  - guest suspend (recently added libxl_domain_suspend_only), for
> >different guest types (PV, PVH, HVM), also with/without PCI device
> > 
> > For this, the machine obviously need to have IOMMU (I assume at least
> > some of the hardware used in test lab have it), and some spare PCI
> > device. I use sound card for some of such tests. But testing on USB
> > controllers would be more useful (from out experience, one of the most
> > problematic devices for suspend, sadly also lacking FLR or such...).
> 
> I doubt any of our x86 machines have sound cards. ...  Just looked at
> one and it says
>   00:03.0 Audio device: Intel Corporation Xeon E3-1200 v3/4th Gen Core
>   Processor HD Audio Controller (rev 06)
> which is obviously mad.
> 
> I'm pretty sure they all have usb controllers.  Almost all of them
> have multiple NICs, often on different pci devices, although it is
> difficult to tell if a NIC not connected to anything is working.
> 
> Eg,
> 
>   02:00.0 Ethernet controller: Intel Corporation I210 Gigabit Network
>   Connection (rev 03)
> 
>   03:00.0 Ethernet controller: Intel Corporation I210 Gigabit Network
>   Connection (rev 03)
> 
> Is there some kind of cheap USB HID, that is interactable-with, which
> we could plug into each machine's USB port ?  I'm slightly concerned
> that plugging in a storage device, or connecting the other NIC, might
> interfere with booting.

I use mass storage for tests... But if you use network boot, it
shouldn't really interfere, no?

> If you want to get pci passthrough tests working I would suggest
> testing it with non-stubdom first.  I assume the config etc. is the
> same, so having got that working, osstest would be able to test it for
> the stubdom tests too.

Oh, I though there are already tests for that...
Yes, good idea.

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?


signature.asc
Description: PGP signature
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v3 1/2] xen/PVH: Set up GS segment for stack canary

2018-05-17 Thread Jan Beulich
>>> On 17.05.18 at 16:47,  wrote:
> @@ -64,6 +67,9 @@ ENTRY(pvh_start_xen)
>   mov %eax,%es
>   mov %eax,%ss
>  
> + mov $PVH_CANARY_SEL,%eax
> + mov %eax,%gs

I doubt this is needed for 64-bit (you could equally well load zero or leave
in place what's there in that case), and loading the selector before setting
the base address in the descriptor won't have the intended effect.

> @@ -150,9 +170,12 @@ gdt_start:
>   .quad GDT_ENTRY(0xc09a, 0, 0xf) /* __KERNEL_CS */
>  #endif
>   .quad GDT_ENTRY(0xc092, 0, 0xf) /* __KERNEL_DS */
> + .quad GDT_ENTRY(0x4090, 0, 0x18)/* PVH_CANARY_SEL */
>  gdt_end:
>  
> - .balign 4
> + .balign 16
> +canary:
> + .fill 24, 1, 0

This is too little space for 64-bit afaict (the canary lives at offset 40 there
if I can trust asm/processor.h).

Jan



___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] how to recognize in libxl that a domU has qemu-xen attached?

2018-05-17 Thread Olaf Hering
Am Thu, 17 May 2018 16:54:00 +0200
schrieb Olaf Hering :

> Thanks. Too bad, d_config is not available in that context. It is probably 
> known somewhere by the callers. I guess such caller needs to pass a bool down 
> to suspend/resume.

It seems nothing inside libxl knows about libxl_domain_config, only callers of 
the public libxl_domain_suspend API do actually create it once. AFAICS only the 
LIBXL_SUSPEND_* 'flags' would allow to pass something down to the relevant 
code. But the resumer may not know about them...

Olaf


pgpYdf8_1Z9hN.pgp
Description: Digitale Signatur von OpenPGP
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] XSM in osstest, grub config, outstanding patch

2018-05-17 Thread Ian Jackson
Hi, I'm emailing you because I know you have an interest in XSM
(and therefore in its testing in osstest).

osstest manages the booting of its test hosts using the
distro-supplied bootloader arrangements for its dom0s.  For Debian
that is update-grub.  Currently, osstest has a hacked-up local copy of
the Xen bit of update-grub, /etc/grub.d/20_linux_xen.  This is in
serious danger of diverging from upstream, which is quite bad.

I am intending to drop this file from osstest installs of Debian dom0s
after stretch (ie, for Debian buster).  Currently all the deviations
from upstream we have been carrying are fixed, except for one
XSM-related change.

That change is in the one described in upstream bugtracker here:
  https://savannah.gnu.org/bugs/?43420
According to the osstest commit message for f12512e44919, this is not
quite the same version as is being used by osstest.

This upstream bug is blocked because of unanswered questions about the
naming and discovery of policy files.  According to Wei, we don't have
a good story about how a user-supplied policy file ought to supplant
the one which comes from the Xen build system.

Anyway, without this change, when osstest tries to set up XSM on
Debian buster it will not find a bootloader entry with the right
policy file.  It will then fail that test.

To avoid this in the most expedient way, it would be good to get a
version of this fix into grub upstream before then.

Failing that, as I would be reluctant to continue to carry an
ever-diverging piece of grub configuration, I think it would be
necessary for there to at least be an upstream bug report with a ready
(or nearly-ready) patch; in which case I could provide osstest with a
copy of buster's 20_linux_xen file with that patch applied.

In any case, we will want something close to a ready-to-apply patch in
the upstream bugtracker.

I am emailing you this now because I have just discovered it.  Happily
this will give people plenty of time to debate the policy file naming
issue.

Thanks,
Ian.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] Test for osstest, features used in Qubes OS

2018-05-17 Thread Ian Jackson
Marek Marczykowski-Górecki writes ("Re: Test for osstest, features used in 
Qubes OS"):
> On Thu, May 17, 2018 at 01:26:30PM +0100, Ian Jackson wrote:
> > Is it likely that this will depend on non-buggy host firmware ?  If so
> > then we need to make arrangements to test it and only do it on hosts
> > which are not buggy.  In practice this probably means wiring it up to
> > the automatic host examiner.
> 
> Yes, probably.

That's not entirely trivial then, especially for you, unless you want
to set up your own osstest production instance.  However, I can
probably do the osstest-machinery work if you will help debug it,
review logs, tell me what to do next, etc. :-).

> > Is there some kind of cheap USB HID, that is interactable-with, which
> > we could plug into each machine's USB port ?  I'm slightly concerned
> > that plugging in a storage device, or connecting the other NIC, might
> > interfere with booting.
> 
> I use mass storage for tests... But if you use network boot, it
> shouldn't really interfere, no?

We do both network boot and disk boot.  I think the BIOS disk boot has
to continue to work and boot the HDD.

> > If you want to get pci passthrough tests working I would suggest
> > testing it with non-stubdom first.  I assume the config etc. is the
> > same, so having got that working, osstest would be able to test it for
> > the stubdom tests too.
> 
> Oh, I though there are already tests for that...

There are no PCI passthrough tests at all.  For a while we had some
SRIOV NIC tests which were requested by Intel.  But they always failed
giving kernel stack dumps.  We kept poking Intel to get them to fix
them, or tell us how the tests were wrong, but to no avail.  So we
dropped them.

So any work in this area would be greatly appreciated!

Ian.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] how to recognize in libxl that a domU has qemu-xen attached?

2018-05-17 Thread Olaf Hering
Am Thu, 17 May 2018 16:54:00 +0200
schrieb Olaf Hering :

> Am Thu, 17 May 2018 14:55:10 +0200
> schrieb Juergen Gross :
> > libxl__need_xenpv_qemu() is used to determine whether a pv domain needs
> > a qemu process for at least one backend.   
> Thanks. Too bad, d_config is not available in that context. It is probably 
> known somewhere by the callers. I guess such caller needs to pass a bool down 
> to suspend/resume.

I think we may get around that missing d_config like that, I will test this 
approach:

--- xen-4.10.0-testing.orig/tools/libxl/libxl_dom_suspend.c
+++ xen-4.10.0-testing/tools/libxl/libxl_dom_suspend.c
@@ -377,7 +377,9 @@ static void domain_suspend_common_guest_
 libxl__ev_xswatch_deregister(gc, &dsps->guest_watch);
 libxl__ev_time_deregister(gc, &dsps->guest_timeout);
 
-if (dsps->type == LIBXL_DOMAIN_TYPE_HVM) {
+if (dsps->type == LIBXL_DOMAIN_TYPE_HVM ||
+libxl__device_model_version_running(gc, domid) ==
+LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN) {
 rc = libxl__domain_suspend_device_model(gc, dsps);
 if (rc) {
 LOGD(ERROR, dsps->domid,
@@ -460,7 +462,9 @@ int libxl__domain_resume(libxl__gc *gc,
 goto out;
 }
 
-if (type == LIBXL_DOMAIN_TYPE_HVM) {
+if (type == LIBXL_DOMAIN_TYPE_HVM ||
+libxl__device_model_version_running(gc, domid) ==
+LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN) {
 rc = libxl__domain_resume_device_model(gc, domid);
 if (rc) {
 LOGD(ERROR, domid, "failed to resume device model:%d", rc);

Olaf


pgpP8oF0VS8LS.pgp
Description: Digitale Signatur von OpenPGP
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v4 5/9] xen_backend: add an emulation of grant copy

2018-05-17 Thread Paul Durrant
Not all Xen environments support the xengnttab_grant_copy() operation.
E.g. where the OS is FreeBSD or Xen is older than 4.8.0.

This patch introduces an emulation of that operation using
xengnttab_map_domain_grant_refs() and memcpy() for those environments.

Signed-off-by: Paul Durrant 
---
Cc: Stefano Stabellini 
Cc: Anthony Perard 

v2:
 - New in v2
---
 hw/xen/xen_backend.c | 53 
 1 file changed, 53 insertions(+)

diff --git a/hw/xen/xen_backend.c b/hw/xen/xen_backend.c
index 50412d6..3c3fc2c 100644
--- a/hw/xen/xen_backend.c
+++ b/hw/xen/xen_backend.c
@@ -146,6 +146,55 @@ void xen_be_unmap_grant_refs(struct XenDevice *xendev, 
void *ptr,
 }
 }
 
+static int compat_copy_grant_refs(struct XenDevice *xendev,
+  bool to_domain,
+  XenGrantCopySegment segs[],
+  unsigned int nr_segs)
+{
+uint32_t *refs = g_new(uint32_t, nr_segs);
+int prot = to_domain ? PROT_WRITE : PROT_READ;
+void *pages;
+unsigned int i;
+
+for (i = 0; i < nr_segs; i++) {
+XenGrantCopySegment *seg = &segs[i];
+
+refs[i] = to_domain ?
+seg->dest.foreign.ref : seg->source.foreign.ref;
+}
+
+pages = xengnttab_map_domain_grant_refs(xendev->gnttabdev, nr_segs,
+xen_domid, refs, prot);
+if (!pages) {
+xen_pv_printf(xendev, 0,
+  "xengnttab_map_domain_grant_refs failed: %s\n",
+  strerror(errno));
+g_free(refs);
+return -1;
+}
+
+for (i = 0; i < nr_segs; i++) {
+XenGrantCopySegment *seg = &segs[i];
+void *page = pages + (i * XC_PAGE_SIZE);
+
+if (to_domain) {
+memcpy(page + seg->dest.foreign.offset, seg->source.virt,
+   seg->len);
+} else {
+memcpy(seg->dest.virt, page + seg->source.foreign.offset,
+   seg->len);
+}
+}
+
+if (xengnttab_unmap(xendev->gnttabdev, pages, nr_segs)) {
+xen_pv_printf(xendev, 0, "xengnttab_unmap failed: %s\n",
+  strerror(errno));
+}
+
+g_free(refs);
+return 0;
+}
+
 int xen_be_copy_grant_refs(struct XenDevice *xendev,
bool to_domain,
XenGrantCopySegment segs[],
@@ -157,6 +206,10 @@ int xen_be_copy_grant_refs(struct XenDevice *xendev,
 
 assert(xendev->ops->flags & DEVOPS_FLAG_NEED_GNTDEV);
 
+if (!xen_feature_grant_copy) {
+return compat_copy_grant_refs(xendev, to_domain, segs, nr_segs);
+}
+
 xengnttab_segs = g_new0(xengnttab_grant_copy_segment_t, nr_segs);
 
 for (i = 0; i < nr_segs; i++) {
-- 
2.1.4


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v4 1/9] xen: add a meaningful declaration of grant_copy_segment into xen_common.h

2018-05-17 Thread Paul Durrant
Currently the xen_disk source has to carry #ifdef exclusions to compile
against Xen older then 4.8. This is a bit messy so this patch lifts the
definition of struct xengnttab_grant_copy_segment and adds it into the
pre-4.8 compat area in xen_common.h, which allows xen_disk to be cleaned
up.

Signed-off-by: Paul Durrant 
---
Cc: Stefano Stabellini 
Cc: Anthony Perard 

v4:
 - New in v4
---
 hw/block/xen_disk.c | 18 --
 include/hw/xen/xen_common.h | 17 +++--
 2 files changed, 15 insertions(+), 20 deletions(-)

diff --git a/hw/block/xen_disk.c b/hw/block/xen_disk.c
index f74fcd4..78bfb41 100644
--- a/hw/block/xen_disk.c
+++ b/hw/block/xen_disk.c
@@ -496,8 +496,6 @@ static int ioreq_map(struct ioreq *ioreq)
 return 0;
 }
 
-#if CONFIG_XEN_CTRL_INTERFACE_VERSION >= 40800
-
 static void ioreq_free_copy_buffers(struct ioreq *ioreq)
 {
 int i;
@@ -579,22 +577,6 @@ static int ioreq_grant_copy(struct ioreq *ioreq)
 
 return rc;
 }
-#else
-static void ioreq_free_copy_buffers(struct ioreq *ioreq)
-{
-abort();
-}
-
-static int ioreq_init_copy_buffers(struct ioreq *ioreq)
-{
-abort();
-}
-
-static int ioreq_grant_copy(struct ioreq *ioreq)
-{
-abort();
-}
-#endif
 
 static int ioreq_runio_qemu_aio(struct ioreq *ioreq);
 
diff --git a/include/hw/xen/xen_common.h b/include/hw/xen/xen_common.h
index 5f1402b..bbf207d 100644
--- a/include/hw/xen/xen_common.h
+++ b/include/hw/xen/xen_common.h
@@ -667,8 +667,21 @@ static inline int xen_domain_create(xc_interface *xc, 
uint32_t ssidref,
 
 #if CONFIG_XEN_CTRL_INTERFACE_VERSION < 40800
 
-
-typedef void *xengnttab_grant_copy_segment_t;
+struct xengnttab_grant_copy_segment {
+union xengnttab_copy_ptr {
+void *virt;
+struct {
+uint32_t ref;
+uint16_t offset;
+uint16_t domid;
+} foreign;
+} source, dest;
+uint16_t len;
+uint16_t flags;
+int16_t status;
+};
+
+typedef struct xengnttab_grant_copy_segment xengnttab_grant_copy_segment_t;
 
 static inline int xengnttab_grant_copy(xengnttab_handle *xgt, uint32_t count,
xengnttab_grant_copy_segment_t *segs)
-- 
2.1.4


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v4 6/9] xen_disk: remove use of grant map/unmap

2018-05-17 Thread Paul Durrant
Now that the (native or emulated) xen_be_copy_grant_refs() helper is
always available, the xen_disk code can be significantly simplified by
removing direct use of grant map and unmap operations.

Signed-off-by: Paul Durrant 
Acked-by: Anthony Perard 
---
Cc: Stefano Stabellini 
Cc: Kevin Wolf 
Cc: Max Reitz 

v2:
 - Squashed in separate patche removing persistent grant use
 - Re-based
---
 hw/block/xen_disk.c | 352 
 1 file changed, 25 insertions(+), 327 deletions(-)

diff --git a/hw/block/xen_disk.c b/hw/block/xen_disk.c
index d3be45a..28be8b6 100644
--- a/hw/block/xen_disk.c
+++ b/hw/block/xen_disk.c
@@ -36,27 +36,9 @@
 
 /* - */
 
-static int batch_maps   = 0;
-
-/* - */
-
 #define BLOCK_SIZE  512
 #define IOCB_COUNT  (BLKIF_MAX_SEGMENTS_PER_REQUEST + 2)
 
-struct PersistentGrant {
-void *page;
-struct XenBlkDev *blkdev;
-};
-
-typedef struct PersistentGrant PersistentGrant;
-
-struct PersistentRegion {
-void *addr;
-int num;
-};
-
-typedef struct PersistentRegion PersistentRegion;
-
 struct ioreq {
 blkif_request_t req;
 int16_t status;
@@ -65,14 +47,11 @@ struct ioreq {
 off_t   start;
 QEMUIOVectorv;
 int presync;
-uint8_t mapped;
 
 /* grant mapping */
 uint32_trefs[BLKIF_MAX_SEGMENTS_PER_REQUEST];
-int prot;
 void*page[BLKIF_MAX_SEGMENTS_PER_REQUEST];
 void*pages;
-int num_unmap;
 
 /* aio status */
 int aio_inflight;
@@ -103,7 +82,6 @@ struct XenBlkDev {
 int protocol;
 blkif_back_rings_t  rings;
 int more_work;
-int cnt_map;
 
 /* request lists */
 QLIST_HEAD(inflight_head, ioreq) inflight;
@@ -114,13 +92,7 @@ struct XenBlkDev {
 int requests_finished;
 unsigned intmax_requests;
 
-/* Persistent grants extension */
 gbooleanfeature_discard;
-gbooleanfeature_persistent;
-GTree   *persistent_gnts;
-GSList  *persistent_regions;
-unsigned intpersistent_gnt_count;
-unsigned intmax_grants;
 
 /* qemu block driver */
 DriveInfo   *dinfo;
@@ -139,10 +111,8 @@ static void ioreq_reset(struct ioreq *ioreq)
 ioreq->status = 0;
 ioreq->start = 0;
 ioreq->presync = 0;
-ioreq->mapped = 0;
 
 memset(ioreq->refs, 0, sizeof(ioreq->refs));
-ioreq->prot = 0;
 memset(ioreq->page, 0, sizeof(ioreq->page));
 ioreq->pages = NULL;
 
@@ -156,37 +126,6 @@ static void ioreq_reset(struct ioreq *ioreq)
 qemu_iovec_reset(&ioreq->v);
 }
 
-static gint int_cmp(gconstpointer a, gconstpointer b, gpointer user_data)
-{
-uint ua = GPOINTER_TO_UINT(a);
-uint ub = GPOINTER_TO_UINT(b);
-return (ua > ub) - (ua < ub);
-}
-
-static void destroy_grant(gpointer pgnt)
-{
-PersistentGrant *grant = pgnt;
-struct XenBlkDev *blkdev = grant->blkdev;
-struct XenDevice *xendev = &blkdev->xendev;
-
-xen_be_unmap_grant_ref(xendev, grant->page);
-grant->blkdev->persistent_gnt_count--;
-xen_pv_printf(xendev, 3, "unmapped grant %p\n", grant->page);
-g_free(grant);
-}
-
-static void remove_persistent_region(gpointer data, gpointer dev)
-{
-PersistentRegion *region = data;
-struct XenBlkDev *blkdev = dev;
-struct XenDevice *xendev = &blkdev->xendev;
-
-xen_be_unmap_grant_refs(xendev, region->addr, region->num);
-xen_pv_printf(xendev, 3, "unmapped grant region %p with %d pages\n",
-  region->addr, region->num);
-g_free(region);
-}
-
 static struct ioreq *ioreq_start(struct XenBlkDev *blkdev)
 {
 struct ioreq *ioreq = NULL;
@@ -254,7 +193,6 @@ static int ioreq_parse(struct ioreq *ioreq)
   ioreq->req.handle, ioreq->req.id, ioreq->req.sector_number);
 switch (ioreq->req.operation) {
 case BLKIF_OP_READ:
-ioreq->prot = PROT_WRITE; /* to memory */
 break;
 case BLKIF_OP_FLUSH_DISKCACHE:
 ioreq->presync = 1;
@@ -263,7 +201,6 @@ static int ioreq_parse(struct ioreq *ioreq)
 }
 /* fall through */
 case BLKIF_OP_WRITE:
-ioreq->prot = PROT_READ; /* from memory */
 break;
 case BLKIF_OP_DISCARD:
 return 0;
@@ -310,171 +247,6 @@ err:
 return -1;
 }
 
-static void ioreq_unmap(struct ioreq *ioreq)
-{
-struct XenBlkDev *blkdev = ioreq->blkdev;
-struct XenDevice *xendev = &blkdev->xendev;
-int i;
-
-if (ioreq->num_unmap == 0 || ioreq->mapped == 0) {
-return;
-}
-if (batch_maps) {
-if (!ioreq->pages) {
-return;
-}
-xen_be_unmap_grant_refs(xendev, ioreq->pages, ioreq->num_unmap);
-   

[Xen-devel] [PATCH v4 8/9] xen_disk: use a single entry iovec

2018-05-17 Thread Paul Durrant
Since xen_disk now always copies data to and from a guest there is no need
to maintain a vector entry corresponding to every page of a request.
This means there is less per-request state to maintain so the ioreq
structure can shrink significantly.

Signed-off-by: Paul Durrant 
Acked-by: Anthony Perard 
---
Cc: Stefano Stabellini 
Cc: Kevin Wolf 
Cc: Max Reitz 

v3:
 - Un-break by fixing mis-placed qemu_iovec_add()

v2:
 - Re-based
---
 hw/block/xen_disk.c | 76 +++--
 1 file changed, 21 insertions(+), 55 deletions(-)

diff --git a/hw/block/xen_disk.c b/hw/block/xen_disk.c
index 28be8b6..28651c5 100644
--- a/hw/block/xen_disk.c
+++ b/hw/block/xen_disk.c
@@ -46,13 +46,10 @@ struct ioreq {
 /* parsed request */
 off_t   start;
 QEMUIOVectorv;
+void*buf;
+size_t  size;
 int presync;
 
-/* grant mapping */
-uint32_trefs[BLKIF_MAX_SEGMENTS_PER_REQUEST];
-void*page[BLKIF_MAX_SEGMENTS_PER_REQUEST];
-void*pages;
-
 /* aio status */
 int aio_inflight;
 int aio_errors;
@@ -110,12 +107,10 @@ static void ioreq_reset(struct ioreq *ioreq)
 memset(&ioreq->req, 0, sizeof(ioreq->req));
 ioreq->status = 0;
 ioreq->start = 0;
+ioreq->buf = NULL;
+ioreq->size = 0;
 ioreq->presync = 0;
 
-memset(ioreq->refs, 0, sizeof(ioreq->refs));
-memset(ioreq->page, 0, sizeof(ioreq->page));
-ioreq->pages = NULL;
-
 ioreq->aio_inflight = 0;
 ioreq->aio_errors = 0;
 
@@ -138,7 +133,7 @@ static struct ioreq *ioreq_start(struct XenBlkDev *blkdev)
 ioreq = g_malloc0(sizeof(*ioreq));
 ioreq->blkdev = blkdev;
 blkdev->requests_total++;
-qemu_iovec_init(&ioreq->v, BLKIF_MAX_SEGMENTS_PER_REQUEST);
+qemu_iovec_init(&ioreq->v, 1);
 } else {
 /* get one from freelist */
 ioreq = QLIST_FIRST(&blkdev->freelist);
@@ -183,7 +178,6 @@ static void ioreq_release(struct ioreq *ioreq, bool finish)
 static int ioreq_parse(struct ioreq *ioreq)
 {
 struct XenBlkDev *blkdev = ioreq->blkdev;
-uintptr_t mem;
 size_t len;
 int i;
 
@@ -230,13 +224,10 @@ static int ioreq_parse(struct ioreq *ioreq)
 goto err;
 }
 
-ioreq->refs[i]   = ioreq->req.seg[i].gref;
-
-mem = ioreq->req.seg[i].first_sect * blkdev->file_blk;
 len = (ioreq->req.seg[i].last_sect - ioreq->req.seg[i].first_sect + 1) 
* blkdev->file_blk;
-qemu_iovec_add(&ioreq->v, (void*)mem, len);
+ioreq->size += len;
 }
-if (ioreq->start + ioreq->v.size > blkdev->file_size) {
+if (ioreq->start + ioreq->size > blkdev->file_size) {
 xen_pv_printf(&blkdev->xendev, 0, "error: access beyond end of 
file\n");
 goto err;
 }
@@ -247,35 +238,6 @@ err:
 return -1;
 }
 
-static void ioreq_free_copy_buffers(struct ioreq *ioreq)
-{
-int i;
-
-for (i = 0; i < ioreq->v.niov; i++) {
-ioreq->page[i] = NULL;
-}
-
-qemu_vfree(ioreq->pages);
-}
-
-static int ioreq_init_copy_buffers(struct ioreq *ioreq)
-{
-int i;
-
-if (ioreq->v.niov == 0) {
-return 0;
-}
-
-ioreq->pages = qemu_memalign(XC_PAGE_SIZE, ioreq->v.niov * XC_PAGE_SIZE);
-
-for (i = 0; i < ioreq->v.niov; i++) {
-ioreq->page[i] = ioreq->pages + i * XC_PAGE_SIZE;
-ioreq->v.iov[i].iov_base = ioreq->page[i];
-}
-
-return 0;
-}
-
 static int ioreq_grant_copy(struct ioreq *ioreq)
 {
 struct XenBlkDev *blkdev = ioreq->blkdev;
@@ -284,25 +246,27 @@ static int ioreq_grant_copy(struct ioreq *ioreq)
 int i, count, rc;
 int64_t file_blk = ioreq->blkdev->file_blk;
 bool to_domain = (ioreq->req.operation == BLKIF_OP_READ);
+void *virt = ioreq->buf;
 
-if (ioreq->v.niov == 0) {
+if (ioreq->req.nr_segments == 0) {
 return 0;
 }
 
-count = ioreq->v.niov;
+count = ioreq->req.nr_segments;
 
 for (i = 0; i < count; i++) {
 if (to_domain) {
-segs[i].dest.foreign.ref = ioreq->refs[i];
+segs[i].dest.foreign.ref = ioreq->req.seg[i].gref;
 segs[i].dest.foreign.offset = ioreq->req.seg[i].first_sect * 
file_blk;
-segs[i].source.virt = ioreq->v.iov[i].iov_base;
+segs[i].source.virt = virt;
 } else {
-segs[i].source.foreign.ref = ioreq->refs[i];
+segs[i].source.foreign.ref = ioreq->req.seg[i].gref;
 segs[i].source.foreign.offset = ioreq->req.seg[i].first_sect * 
file_blk;
-segs[i].dest.virt = ioreq->v.iov[i].iov_base;
+segs[i].dest.virt = virt;
 }
 segs[i].len = (ioreq->req.seg[i].last_sect
- ioreq->req.seg[i].first_sect + 1) * file_blk;
+virt += segs[i].len;
 }
 
 rc = xen_be_copy_grant_refs(xendev, to_domain, segs, count);
@@ -348,14 +312,14 @

[Xen-devel] [PATCH v4 0/9] xen_disk: legacy code removal and cleanup

2018-05-17 Thread Paul Durrant
The grant copy operation was added to libxengnttab in Xen 4.8.0 (released
nearly 18 months ago) but the xen_disk PV backend QEMU is still carrying
a significant amount of code purely to remain compatible with older
versions of Xen.

As can be inferred from the diff stats below, removing this support for
older versions of Xen from QEMU reduces the size of the xen_disk source by
around 320 lines (~25%).

This versionseries maintains compatibility with older Xen, and OS
not supporting the grant copy operation, by adding an emulation of it
into the xen_backend code. Thus xen_disk can be simplified without
regressing support for any environment. This series also performs
general cleanup of the code by introducing and consistently using
helper functions for calling into libxenttab.

v4:
 - Added new patch #1 to negate the need for #ifdef exclusions in xen_disk
   thus allowing the patch #2 (previous patch #1) to remain unmodified
   from v3 but still compile against Xen 4.7.

Paul Durrant (9):
  xen: add a meaningful declaration of grant_copy_segment into
xen_common.h
  xen_backend: add grant table helpers
  xen_disk: remove open-coded use of libxengnttab
  xen: remove other open-coded use of libxengnttab
  xen_backend: add an emulation of grant copy
  xen_disk: remove use of grant map/unmap
  xen_backend: make the xen_feature_grant_copy flag private
  xen_disk: use a single entry iovec
  xen_disk: be consistent with use of xendev and blkdev->xendev

 hw/9pfs/xen-9p-backend.c |  32 ++-
 hw/block/xen_disk.c  | 614 +++
 hw/char/xen_console.c|   9 +-
 hw/net/xen_nic.c |  33 +--
 hw/usb/xen-usb.c |  37 ++-
 hw/xen/xen_backend.c | 178 -
 include/hw/xen/xen_backend.h |  34 ++-
 include/hw/xen/xen_common.h  |  17 +-
 8 files changed, 365 insertions(+), 589 deletions(-)
---

Cc: Anthony Perard 
Cc: Gerd Hoffmann 
Cc: Greg Kurz 
Cc: Jason Wang 
Cc: Kevin Wolf 
Cc: Max Reitz 
Cc: Paolo Bonzini 
Cc: Stefano Stabellini 

--
2.1.4


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v4 7/9] xen_backend: make the xen_feature_grant_copy flag private

2018-05-17 Thread Paul Durrant
There is no longer any use of this flag outside of the xen_backend code.

Signed-off-by: Paul Durrant 
Acked-by: Anthony Perard 
---
Cc: Stefano Stabellini 

v2:
 - New in v2
---
 hw/xen/xen_backend.c | 2 +-
 include/hw/xen/xen_backend.h | 1 -
 2 files changed, 1 insertion(+), 2 deletions(-)

diff --git a/hw/xen/xen_backend.c b/hw/xen/xen_backend.c
index 3c3fc2c..9a8e877 100644
--- a/hw/xen/xen_backend.c
+++ b/hw/xen/xen_backend.c
@@ -44,9 +44,9 @@ BusState *xen_sysbus;
 /* public */
 struct xs_handle *xenstore = NULL;
 const char *xen_protocol;
-bool xen_feature_grant_copy;
 
 /* private */
+static bool xen_feature_grant_copy;
 static int debug;
 
 int xenstore_write_be_str(struct XenDevice *xendev, const char *node, const 
char *val)
diff --git a/include/hw/xen/xen_backend.h b/include/hw/xen/xen_backend.h
index 29bf1c3..9c17fdd 100644
--- a/include/hw/xen/xen_backend.h
+++ b/include/hw/xen/xen_backend.h
@@ -16,7 +16,6 @@
 /* variables */
 extern struct xs_handle *xenstore;
 extern const char *xen_protocol;
-extern bool xen_feature_grant_copy;
 extern DeviceState *xen_sysdev;
 extern BusState *xen_sysbus;
 
-- 
2.1.4


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v4 9/9] xen_disk: be consistent with use of xendev and blkdev->xendev

2018-05-17 Thread Paul Durrant
Certain functions in xen_disk are called with a pointer to xendev
(struct XenDevice *). They then use container_of() to acces the surrounding
blkdev (struct XenBlkDev) but then in various places use &blkdev->xendev
when use of the original xendev pointer is shorter to express and clearly
equivalent.

This patch is a purely cosmetic patch which makes sure there is a xendev
pointer on stack for any function where the pointer is need on multiple
occasions modified those functions to use it consistently.

Signed-off-by: Paul Durrant 
---
Cc: Stefano Stabellini 
Cc: Anthony Perard 
Cc: Kevin Wolf 
Cc: Max Reitz 

v2:
 - Re-based
---
 hw/block/xen_disk.c | 90 +++--
 1 file changed, 46 insertions(+), 44 deletions(-)

diff --git a/hw/block/xen_disk.c b/hw/block/xen_disk.c
index 28651c5..9fbc0cd 100644
--- a/hw/block/xen_disk.c
+++ b/hw/block/xen_disk.c
@@ -178,10 +178,11 @@ static void ioreq_release(struct ioreq *ioreq, bool 
finish)
 static int ioreq_parse(struct ioreq *ioreq)
 {
 struct XenBlkDev *blkdev = ioreq->blkdev;
+struct XenDevice *xendev = &blkdev->xendev;
 size_t len;
 int i;
 
-xen_pv_printf(&blkdev->xendev, 3,
+xen_pv_printf(xendev, 3,
   "op %d, nr %d, handle %d, id %" PRId64 ", sector %" PRId64 
"\n",
   ioreq->req.operation, ioreq->req.nr_segments,
   ioreq->req.handle, ioreq->req.id, ioreq->req.sector_number);
@@ -199,28 +200,28 @@ static int ioreq_parse(struct ioreq *ioreq)
 case BLKIF_OP_DISCARD:
 return 0;
 default:
-xen_pv_printf(&blkdev->xendev, 0, "error: unknown operation (%d)\n",
+xen_pv_printf(xendev, 0, "error: unknown operation (%d)\n",
   ioreq->req.operation);
 goto err;
 };
 
 if (ioreq->req.operation != BLKIF_OP_READ && blkdev->mode[0] != 'w') {
-xen_pv_printf(&blkdev->xendev, 0, "error: write req for ro device\n");
+xen_pv_printf(xendev, 0, "error: write req for ro device\n");
 goto err;
 }
 
 ioreq->start = ioreq->req.sector_number * blkdev->file_blk;
 for (i = 0; i < ioreq->req.nr_segments; i++) {
 if (i == BLKIF_MAX_SEGMENTS_PER_REQUEST) {
-xen_pv_printf(&blkdev->xendev, 0, "error: nr_segments too big\n");
+xen_pv_printf(xendev, 0, "error: nr_segments too big\n");
 goto err;
 }
 if (ioreq->req.seg[i].first_sect > ioreq->req.seg[i].last_sect) {
-xen_pv_printf(&blkdev->xendev, 0, "error: first > last sector\n");
+xen_pv_printf(xendev, 0, "error: first > last sector\n");
 goto err;
 }
 if (ioreq->req.seg[i].last_sect * BLOCK_SIZE >= XC_PAGE_SIZE) {
-xen_pv_printf(&blkdev->xendev, 0, "error: page crossing\n");
+xen_pv_printf(xendev, 0, "error: page crossing\n");
 goto err;
 }
 
@@ -228,7 +229,7 @@ static int ioreq_parse(struct ioreq *ioreq)
 ioreq->size += len;
 }
 if (ioreq->start + ioreq->size > blkdev->file_size) {
-xen_pv_printf(&blkdev->xendev, 0, "error: access beyond end of 
file\n");
+xen_pv_printf(xendev, 0, "error: access beyond end of file\n");
 goto err;
 }
 return 0;
@@ -244,7 +245,7 @@ static int ioreq_grant_copy(struct ioreq *ioreq)
 struct XenDevice *xendev = &blkdev->xendev;
 XenGrantCopySegment segs[BLKIF_MAX_SEGMENTS_PER_REQUEST];
 int i, count, rc;
-int64_t file_blk = ioreq->blkdev->file_blk;
+int64_t file_blk = blkdev->file_blk;
 bool to_domain = (ioreq->req.operation == BLKIF_OP_READ);
 void *virt = ioreq->buf;
 
@@ -272,7 +273,7 @@ static int ioreq_grant_copy(struct ioreq *ioreq)
 rc = xen_be_copy_grant_refs(xendev, to_domain, segs, count);
 
 if (rc) {
-xen_pv_printf(&ioreq->blkdev->xendev, 0,
+xen_pv_printf(xendev, 0,
   "failed to copy data %d\n", rc);
 ioreq->aio_errors++;
 return -1;
@@ -287,11 +288,12 @@ static void qemu_aio_complete(void *opaque, int ret)
 {
 struct ioreq *ioreq = opaque;
 struct XenBlkDev *blkdev = ioreq->blkdev;
+struct XenDevice *xendev = &blkdev->xendev;
 
 aio_context_acquire(blkdev->ctx);
 
 if (ret != 0) {
-xen_pv_printf(&blkdev->xendev, 0, "%s I/O error\n",
+xen_pv_printf(xendev, 0, "%s I/O error\n",
   ioreq->req.operation == BLKIF_OP_READ ? "read" : 
"write");
 ioreq->aio_errors++;
 }
@@ -625,16 +627,17 @@ static void blk_alloc(struct XenDevice *xendev)
 
 static void blk_parse_discard(struct XenBlkDev *blkdev)
 {
+struct XenDevice *xendev = &blkdev->xendev;
 int enable;
 
 blkdev->feature_discard = true;
 
-if (xenstore_read_be_int(&blkdev->xendev, "discard-enable", &enable) == 0) 
{
+if (xenstore_read_be_int(xendev, "discard-enable", &enable) == 0) {
 blkdev->feature_discard = !!enable;
 }
 
 if (blkdev->feature_dis

[Xen-devel] [PATCH v4 4/9] xen: remove other open-coded use of libxengnttab

2018-05-17 Thread Paul Durrant
Now that helpers are available in xen_backend, use them throughout all
Xen PV backends.

Signed-off-by: Paul Durrant 
Acked-by: Anthony Perard 
---
Cc: Stefano Stabellini 
Cc: Greg Kurz 
Cc: Paolo Bonzini 
Cc: Jason Wang 
Cc: Gerd Hoffmann 

v2:
 - New in v2
---
 hw/9pfs/xen-9p-backend.c | 32 +++-
 hw/char/xen_console.c|  9 -
 hw/net/xen_nic.c | 33 ++---
 hw/usb/xen-usb.c | 37 +
 4 files changed, 50 insertions(+), 61 deletions(-)

diff --git a/hw/9pfs/xen-9p-backend.c b/hw/9pfs/xen-9p-backend.c
index 95e50c4..6026780 100644
--- a/hw/9pfs/xen-9p-backend.c
+++ b/hw/9pfs/xen-9p-backend.c
@@ -331,14 +331,14 @@ static int xen_9pfs_free(struct XenDevice *xendev)
 
 for (i = 0; i < xen_9pdev->num_rings; i++) {
 if (xen_9pdev->rings[i].data != NULL) {
-xengnttab_unmap(xen_9pdev->xendev.gnttabdev,
-xen_9pdev->rings[i].data,
-(1 << xen_9pdev->rings[i].ring_order));
+xen_be_unmap_grant_refs(&xen_9pdev->xendev,
+xen_9pdev->rings[i].data,
+(1 << xen_9pdev->rings[i].ring_order));
 }
 if (xen_9pdev->rings[i].intf != NULL) {
-xengnttab_unmap(xen_9pdev->xendev.gnttabdev,
-xen_9pdev->rings[i].intf,
-1);
+xen_be_unmap_grant_refs(&xen_9pdev->xendev,
+xen_9pdev->rings[i].intf,
+1);
 }
 if (xen_9pdev->rings[i].bh != NULL) {
 qemu_bh_delete(xen_9pdev->rings[i].bh);
@@ -390,11 +390,10 @@ static int xen_9pfs_connect(struct XenDevice *xendev)
 }
 g_free(str);
 
-xen_9pdev->rings[i].intf =  xengnttab_map_grant_ref(
-xen_9pdev->xendev.gnttabdev,
-xen_9pdev->xendev.dom,
-xen_9pdev->rings[i].ref,
-PROT_READ | PROT_WRITE);
+xen_9pdev->rings[i].intf =
+xen_be_map_grant_ref(&xen_9pdev->xendev,
+ xen_9pdev->rings[i].ref,
+ PROT_READ | PROT_WRITE);
 if (!xen_9pdev->rings[i].intf) {
 goto out;
 }
@@ -403,12 +402,11 @@ static int xen_9pfs_connect(struct XenDevice *xendev)
 goto out;
 }
 xen_9pdev->rings[i].ring_order = ring_order;
-xen_9pdev->rings[i].data = xengnttab_map_domain_grant_refs(
-xen_9pdev->xendev.gnttabdev,
-(1 << ring_order),
-xen_9pdev->xendev.dom,
-xen_9pdev->rings[i].intf->ref,
-PROT_READ | PROT_WRITE);
+xen_9pdev->rings[i].data =
+xen_be_map_grant_refs(&xen_9pdev->xendev,
+  xen_9pdev->rings[i].intf->ref,
+  (1 << ring_order),
+  PROT_READ | PROT_WRITE);
 if (!xen_9pdev->rings[i].data) {
 goto out;
 }
diff --git a/hw/char/xen_console.c b/hw/char/xen_console.c
index bdfaa40..8b4b4bf 100644
--- a/hw/char/xen_console.c
+++ b/hw/char/xen_console.c
@@ -233,12 +233,11 @@ static int con_initialise(struct XenDevice *xendev)
 if (!xendev->dev) {
 xen_pfn_t mfn = con->ring_ref;
 con->sring = xenforeignmemory_map(xen_fmem, con->xendev.dom,
-  PROT_READ|PROT_WRITE,
+  PROT_READ | PROT_WRITE,
   1, &mfn, NULL);
 } else {
-con->sring = xengnttab_map_grant_ref(xendev->gnttabdev, 
con->xendev.dom,
- con->ring_ref,
- PROT_READ|PROT_WRITE);
+con->sring = xen_be_map_grant_ref(xendev, con->ring_ref,
+  PROT_READ | PROT_WRITE);
 }
 if (!con->sring)
return -1;
@@ -267,7 +266,7 @@ static void con_disconnect(struct XenDevice *xendev)
 if (!xendev->dev) {
 xenforeignmemory_unmap(xen_fmem, con->sring, 1);
 } else {
-xengnttab_unmap(xendev->gnttabdev, con->sring, 1);
+xen_be_unmap_grant_ref(xendev, con->sring);
 }
 con->sring = NULL;
 }
diff --git a/hw/net/xen_nic.c b/hw/net/xen_nic.c
index 20c43a6..46a8dbf 100644
--- a/hw/net/xen_nic.c
+++ b/hw/net/xen_nic.c
@@ -160,9 +160,8 @@ static void net_tx_packets(struct XenNetDev *netdev)
   (txreq.flags & NETTXF_more_data)  ? " more_data" 
 : "",
   (txreq.flags & NETTXF_extra_info) ? " 
extra_info" : "");
 
-page = xengnttab_map_grant_ref(netdev->xendev.gnttabdev,
-   netdev->xendev.dom,
-   txr

[Xen-devel] [PATCH v4 2/9] xen_backend: add grant table helpers

2018-05-17 Thread Paul Durrant
This patch adds grant table helper functions to the xen_backend code to
localize error reporting and use of xen_domid.

The patch also defers the call to xengnttab_open() until just before the
initialise method in XenDevOps is invoked. This method is responsible for
mapping the shared ring. No prior method requires access to the grant table.

Signed-off-by: Paul Durrant 
---
Cc: Stefano Stabellini 
Cc: Anthony Perard 

v2:
 - New in v2
---
 hw/xen/xen_backend.c | 123 ++-
 include/hw/xen/xen_backend.h |  33 
 2 files changed, 144 insertions(+), 12 deletions(-)

diff --git a/hw/xen/xen_backend.c b/hw/xen/xen_backend.c
index 7445b50..50412d6 100644
--- a/hw/xen/xen_backend.c
+++ b/hw/xen/xen_backend.c
@@ -106,6 +106,103 @@ int xen_be_set_state(struct XenDevice *xendev, enum 
xenbus_state state)
 return 0;
 }
 
+void xen_be_set_max_grant_refs(struct XenDevice *xendev,
+   unsigned int nr_refs)
+{
+assert(xendev->ops->flags & DEVOPS_FLAG_NEED_GNTDEV);
+
+if (xengnttab_set_max_grants(xendev->gnttabdev, nr_refs)) {
+xen_pv_printf(xendev, 0, "xengnttab_set_max_grants failed: %s\n",
+  strerror(errno));
+}
+}
+
+void *xen_be_map_grant_refs(struct XenDevice *xendev, uint32_t *refs,
+unsigned int nr_refs, int prot)
+{
+void *ptr;
+
+assert(xendev->ops->flags & DEVOPS_FLAG_NEED_GNTDEV);
+
+ptr = xengnttab_map_domain_grant_refs(xendev->gnttabdev, nr_refs,
+  xen_domid, refs, prot);
+if (!ptr) {
+xen_pv_printf(xendev, 0,
+  "xengnttab_map_domain_grant_refs failed: %s\n",
+  strerror(errno));
+}
+
+return ptr;
+}
+
+void xen_be_unmap_grant_refs(struct XenDevice *xendev, void *ptr,
+ unsigned int nr_refs)
+{
+assert(xendev->ops->flags & DEVOPS_FLAG_NEED_GNTDEV);
+
+if (xengnttab_unmap(xendev->gnttabdev, ptr, nr_refs)) {
+xen_pv_printf(xendev, 0, "xengnttab_unmap failed: %s\n",
+  strerror(errno));
+}
+}
+
+int xen_be_copy_grant_refs(struct XenDevice *xendev,
+   bool to_domain,
+   XenGrantCopySegment segs[],
+   unsigned int nr_segs)
+{
+xengnttab_grant_copy_segment_t *xengnttab_segs;
+unsigned int i;
+int rc;
+
+assert(xendev->ops->flags & DEVOPS_FLAG_NEED_GNTDEV);
+
+xengnttab_segs = g_new0(xengnttab_grant_copy_segment_t, nr_segs);
+
+for (i = 0; i < nr_segs; i++) {
+XenGrantCopySegment *seg = &segs[i];
+xengnttab_grant_copy_segment_t *xengnttab_seg = &xengnttab_segs[i];
+
+if (to_domain) {
+xengnttab_seg->flags = GNTCOPY_dest_gref;
+xengnttab_seg->dest.foreign.domid = xen_domid;
+xengnttab_seg->dest.foreign.ref = seg->dest.foreign.ref;
+xengnttab_seg->dest.foreign.offset = seg->dest.foreign.offset;
+xengnttab_seg->source.virt = seg->source.virt;
+} else {
+xengnttab_seg->flags = GNTCOPY_source_gref;
+xengnttab_seg->source.foreign.domid = xen_domid;
+xengnttab_seg->source.foreign.ref = seg->source.foreign.ref;
+xengnttab_seg->source.foreign.offset =
+seg->source.foreign.offset;
+xengnttab_seg->dest.virt = seg->dest.virt;
+}
+
+xengnttab_seg->len = seg->len;
+}
+
+rc = xengnttab_grant_copy(xendev->gnttabdev, nr_segs, xengnttab_segs);
+
+if (rc) {
+xen_pv_printf(xendev, 0, "xengnttab_copy failed: %s\n",
+  strerror(errno));
+}
+
+for (i = 0; i < nr_segs; i++) {
+xengnttab_grant_copy_segment_t *xengnttab_seg =
+&xengnttab_segs[i];
+
+if (xengnttab_seg->status != GNTST_okay) {
+xen_pv_printf(xendev, 0, "segment[%u] status: %d\n", i,
+  xengnttab_seg->status);
+rc = -1;
+}
+}
+
+g_free(xengnttab_segs);
+return rc;
+}
+
 /*
  * get xen backend device, allocate a new one if it doesn't exist.
  */
@@ -149,18 +246,6 @@ static struct XenDevice *xen_be_get_xendev(const char 
*type, int dom, int dev,
 }
 qemu_set_cloexec(xenevtchn_fd(xendev->evtchndev));
 
-if (ops->flags & DEVOPS_FLAG_NEED_GNTDEV) {
-xendev->gnttabdev = xengnttab_open(NULL, 0);
-if (xendev->gnttabdev == NULL) {
-xen_pv_printf(NULL, 0, "can't open gnttab device\n");
-xenevtchn_close(xendev->evtchndev);
-qdev_unplug(DEVICE(xendev), NULL);
-return NULL;
-}
-} else {
-xendev->gnttabdev = NULL;
-}
-
 xen_pv_insert_xendev(xendev);
 
 if (xendev->ops->alloc) {
@@ -322,6 +407,16 @@ static int xen_be_try_initialise(struct XenDevice *xendev)
 }
 }
 
+if (xendev->ops->flags & DEVOPS_FLAG_NEED_GNTDEV) {

[Xen-devel] [PATCH v4 3/9] xen_disk: remove open-coded use of libxengnttab

2018-05-17 Thread Paul Durrant
Now that helpers are present in xen_backend, this patch removes open-coded
calls to libxengnttab from the xen_disk code.

This patch also fixes one whitspace error in the assignment of the
XenDevOps initialise method.

Signed-off-by: Paul Durrant 
Acked-by: Anthony Perard 
---
Cc: Stefano Stabellini 
Cc: Kevin Wolf 
Cc: Max Reitz 

v2:
 - New in v2
---
 hw/block/xen_disk.c | 122 ++--
 1 file changed, 32 insertions(+), 90 deletions(-)

diff --git a/hw/block/xen_disk.c b/hw/block/xen_disk.c
index 78bfb41..d3be45a 100644
--- a/hw/block/xen_disk.c
+++ b/hw/block/xen_disk.c
@@ -68,7 +68,6 @@ struct ioreq {
 uint8_t mapped;
 
 /* grant mapping */
-uint32_tdomids[BLKIF_MAX_SEGMENTS_PER_REQUEST];
 uint32_trefs[BLKIF_MAX_SEGMENTS_PER_REQUEST];
 int prot;
 void*page[BLKIF_MAX_SEGMENTS_PER_REQUEST];
@@ -142,7 +141,6 @@ static void ioreq_reset(struct ioreq *ioreq)
 ioreq->presync = 0;
 ioreq->mapped = 0;
 
-memset(ioreq->domids, 0, sizeof(ioreq->domids));
 memset(ioreq->refs, 0, sizeof(ioreq->refs));
 ioreq->prot = 0;
 memset(ioreq->page, 0, sizeof(ioreq->page));
@@ -168,16 +166,12 @@ static gint int_cmp(gconstpointer a, gconstpointer b, 
gpointer user_data)
 static void destroy_grant(gpointer pgnt)
 {
 PersistentGrant *grant = pgnt;
-xengnttab_handle *gnt = grant->blkdev->xendev.gnttabdev;
+struct XenBlkDev *blkdev = grant->blkdev;
+struct XenDevice *xendev = &blkdev->xendev;
 
-if (xengnttab_unmap(gnt, grant->page, 1) != 0) {
-xen_pv_printf(&grant->blkdev->xendev, 0,
-  "xengnttab_unmap failed: %s\n",
-  strerror(errno));
-}
+xen_be_unmap_grant_ref(xendev, grant->page);
 grant->blkdev->persistent_gnt_count--;
-xen_pv_printf(&grant->blkdev->xendev, 3,
-  "unmapped grant %p\n", grant->page);
+xen_pv_printf(xendev, 3, "unmapped grant %p\n", grant->page);
 g_free(grant);
 }
 
@@ -185,15 +179,10 @@ static void remove_persistent_region(gpointer data, 
gpointer dev)
 {
 PersistentRegion *region = data;
 struct XenBlkDev *blkdev = dev;
-xengnttab_handle *gnt = blkdev->xendev.gnttabdev;
+struct XenDevice *xendev = &blkdev->xendev;
 
-if (xengnttab_unmap(gnt, region->addr, region->num) != 0) {
-xen_pv_printf(&blkdev->xendev, 0,
-  "xengnttab_unmap region %p failed: %s\n",
-  region->addr, strerror(errno));
-}
-xen_pv_printf(&blkdev->xendev, 3,
-  "unmapped grant region %p with %d pages\n",
+xen_be_unmap_grant_refs(xendev, region->addr, region->num);
+xen_pv_printf(xendev, 3, "unmapped grant region %p with %d pages\n",
   region->addr, region->num);
 g_free(region);
 }
@@ -304,7 +293,6 @@ static int ioreq_parse(struct ioreq *ioreq)
 goto err;
 }
 
-ioreq->domids[i] = blkdev->xendev.dom;
 ioreq->refs[i]   = ioreq->req.seg[i].gref;
 
 mem = ioreq->req.seg[i].first_sect * blkdev->file_blk;
@@ -324,7 +312,8 @@ err:
 
 static void ioreq_unmap(struct ioreq *ioreq)
 {
-xengnttab_handle *gnt = ioreq->blkdev->xendev.gnttabdev;
+struct XenBlkDev *blkdev = ioreq->blkdev;
+struct XenDevice *xendev = &blkdev->xendev;
 int i;
 
 if (ioreq->num_unmap == 0 || ioreq->mapped == 0) {
@@ -334,11 +323,7 @@ static void ioreq_unmap(struct ioreq *ioreq)
 if (!ioreq->pages) {
 return;
 }
-if (xengnttab_unmap(gnt, ioreq->pages, ioreq->num_unmap) != 0) {
-xen_pv_printf(&ioreq->blkdev->xendev, 0,
-  "xengnttab_unmap failed: %s\n",
-  strerror(errno));
-}
+xen_be_unmap_grant_refs(xendev, ioreq->pages, ioreq->num_unmap);
 ioreq->blkdev->cnt_map -= ioreq->num_unmap;
 ioreq->pages = NULL;
 } else {
@@ -346,11 +331,7 @@ static void ioreq_unmap(struct ioreq *ioreq)
 if (!ioreq->page[i]) {
 continue;
 }
-if (xengnttab_unmap(gnt, ioreq->page[i], 1) != 0) {
-xen_pv_printf(&ioreq->blkdev->xendev, 0,
-  "xengnttab_unmap failed: %s\n",
-  strerror(errno));
-}
+xen_be_unmap_grant_ref(xendev, ioreq->page[i]);
 ioreq->blkdev->cnt_map--;
 ioreq->page[i] = NULL;
 }
@@ -360,14 +341,14 @@ static void ioreq_unmap(struct ioreq *ioreq)
 
 static int ioreq_map(struct ioreq *ioreq)
 {
-xengnttab_handle *gnt = ioreq->blkdev->xendev.gnttabdev;
-uint32_t domids[BLKIF_MAX_SEGMENTS_PER_REQUEST];
+struct XenBlkDev *blkdev = ioreq->blkdev;
+struct XenDevice *xendev = &blkdev->xendev;
 uint32_t refs[BLKIF_MAX_SEGMENTS_PER_REQUEST];
 void *page[BLKIF_MAX_SEGMENTS_PER_REQUEST];
 int i, j, new_maps = 0;

Re: [Xen-devel] [PATCH v4 9/9] xen_disk: be consistent with use of xendev and blkdev->xendev

2018-05-17 Thread Paul Durrant
> -Original Message-
> From: Paul Durrant [mailto:paul.durr...@citrix.com]
> Sent: 17 May 2018 16:36
> To: xen-devel@lists.xenproject.org; qemu-bl...@nongnu.org; qemu-
> de...@nongnu.org
> Cc: Paul Durrant ; Stefano Stabellini
> ; Anthony Perard ;
> Kevin Wolf ; Max Reitz 
> Subject: [PATCH v4 9/9] xen_disk: be consistent with use of xendev and
> blkdev->xendev
> 
> Certain functions in xen_disk are called with a pointer to xendev
> (struct XenDevice *). They then use container_of() to acces the surrounding
> blkdev (struct XenBlkDev) but then in various places use &blkdev->xendev
> when use of the original xendev pointer is shorter to express and clearly
> equivalent.
> 
> This patch is a purely cosmetic patch which makes sure there is a xendev
> pointer on stack for any function where the pointer is need on multiple
> occasions modified those functions to use it consistently.
> 
> Signed-off-by: Paul Durrant 

Apologies to Anthony. This is already:

Acked-by: Anthony PERARD 

> ---
> Cc: Stefano Stabellini 
> Cc: Anthony Perard 
> Cc: Kevin Wolf 
> Cc: Max Reitz 
> 
> v2:
>  - Re-based
> ---
>  hw/block/xen_disk.c | 90 +++--
> 
>  1 file changed, 46 insertions(+), 44 deletions(-)
> 
> diff --git a/hw/block/xen_disk.c b/hw/block/xen_disk.c
> index 28651c5..9fbc0cd 100644
> --- a/hw/block/xen_disk.c
> +++ b/hw/block/xen_disk.c
> @@ -178,10 +178,11 @@ static void ioreq_release(struct ioreq *ioreq, bool
> finish)
>  static int ioreq_parse(struct ioreq *ioreq)
>  {
>  struct XenBlkDev *blkdev = ioreq->blkdev;
> +struct XenDevice *xendev = &blkdev->xendev;
>  size_t len;
>  int i;
> 
> -xen_pv_printf(&blkdev->xendev, 3,
> +xen_pv_printf(xendev, 3,
>"op %d, nr %d, handle %d, id %" PRId64 ", sector %" PRId64 
> "\n",
>ioreq->req.operation, ioreq->req.nr_segments,
>ioreq->req.handle, ioreq->req.id, 
> ioreq->req.sector_number);
> @@ -199,28 +200,28 @@ static int ioreq_parse(struct ioreq *ioreq)
>  case BLKIF_OP_DISCARD:
>  return 0;
>  default:
> -xen_pv_printf(&blkdev->xendev, 0, "error: unknown operation
> (%d)\n",
> +xen_pv_printf(xendev, 0, "error: unknown operation (%d)\n",
>ioreq->req.operation);
>  goto err;
>  };
> 
>  if (ioreq->req.operation != BLKIF_OP_READ && blkdev->mode[0] != 'w') {
> -xen_pv_printf(&blkdev->xendev, 0, "error: write req for ro 
> device\n");
> +xen_pv_printf(xendev, 0, "error: write req for ro device\n");
>  goto err;
>  }
> 
>  ioreq->start = ioreq->req.sector_number * blkdev->file_blk;
>  for (i = 0; i < ioreq->req.nr_segments; i++) {
>  if (i == BLKIF_MAX_SEGMENTS_PER_REQUEST) {
> -xen_pv_printf(&blkdev->xendev, 0, "error: nr_segments too 
> big\n");
> +xen_pv_printf(xendev, 0, "error: nr_segments too big\n");
>  goto err;
>  }
>  if (ioreq->req.seg[i].first_sect > ioreq->req.seg[i].last_sect) {
> -xen_pv_printf(&blkdev->xendev, 0, "error: first > last 
> sector\n");
> +xen_pv_printf(xendev, 0, "error: first > last sector\n");
>  goto err;
>  }
>  if (ioreq->req.seg[i].last_sect * BLOCK_SIZE >= XC_PAGE_SIZE) {
> -xen_pv_printf(&blkdev->xendev, 0, "error: page crossing\n");
> +xen_pv_printf(xendev, 0, "error: page crossing\n");
>  goto err;
>  }
> 
> @@ -228,7 +229,7 @@ static int ioreq_parse(struct ioreq *ioreq)
>  ioreq->size += len;
>  }
>  if (ioreq->start + ioreq->size > blkdev->file_size) {
> -xen_pv_printf(&blkdev->xendev, 0, "error: access beyond end of
> file\n");
> +xen_pv_printf(xendev, 0, "error: access beyond end of file\n");
>  goto err;
>  }
>  return 0;
> @@ -244,7 +245,7 @@ static int ioreq_grant_copy(struct ioreq *ioreq)
>  struct XenDevice *xendev = &blkdev->xendev;
>  XenGrantCopySegment segs[BLKIF_MAX_SEGMENTS_PER_REQUEST];
>  int i, count, rc;
> -int64_t file_blk = ioreq->blkdev->file_blk;
> +int64_t file_blk = blkdev->file_blk;
>  bool to_domain = (ioreq->req.operation == BLKIF_OP_READ);
>  void *virt = ioreq->buf;
> 
> @@ -272,7 +273,7 @@ static int ioreq_grant_copy(struct ioreq *ioreq)
>  rc = xen_be_copy_grant_refs(xendev, to_domain, segs, count);
> 
>  if (rc) {
> -xen_pv_printf(&ioreq->blkdev->xendev, 0,
> +xen_pv_printf(xendev, 0,
>"failed to copy data %d\n", rc);
>  ioreq->aio_errors++;
>  return -1;
> @@ -287,11 +288,12 @@ static void qemu_aio_complete(void *opaque, int
> ret)
>  {
>  struct ioreq *ioreq = opaque;
>  struct XenBlkDev *blkdev = ioreq->blkdev;
> +struct XenDevice *xendev = &blkdev->xendev;
> 
>  aio_context_acquire(blkdev->ctx);
> 
>  if (ret != 0) {
> -xen_pv_printf(&blkdev

Re: [Xen-devel] [PATCH v4 1/2] x86/hvm: Introduce *save_one() functions

2018-05-17 Thread Jan Beulich
>>> On 07.05.18 at 10:24,  wrote:
> This patch introduces save_one() functions. They will be called in the
> *save() so we can extract data for a single instance.

Mostly fine, but please split up into one patch per save type.

> --- a/xen/arch/x86/cpu/mcheck/vmce.c
> +++ b/xen/arch/x86/cpu/mcheck/vmce.c
> @@ -349,6 +349,14 @@ int vmce_wrmsr(uint32_t msr, uint64_t val)
>  return ret;
>  }
>  
> +void vmce_save_vcpu_ctxt_one(struct vcpu *v, struct hvm_vmce_vcpu *ctxt)

static (also elsewhere)

> @@ -1173,6 +1184,18 @@ HVM_REGISTER_SAVE_RESTORE(CPU, hvm_save_cpu_ctxt, 
> hvm_load_cpu_ctxt,
> save_area) + \
>xstate_ctxt_size(xcr0))
>  
> +void hvm_save_cpu_xsave_states_one(struct vcpu *v, struct hvm_hw_cpu_xsave 
> **ctx, hvm_domain_context_t *h)

This is inconsistent with the others: Why the extra indirection for ctx?
And why the passing of h?

> +{
> +unsigned int size = HVM_CPU_XSAVE_SIZE(v->arch.xcr0_accum);
> +struct hvm_hw_cpu_xsave *ctxt = * ctx;
> +
> +h->cur += size;

This belongs in the caller afaict.

> @@ -1339,6 +1358,39 @@ static const uint32_t msrs_to_send[] = {
>  };
>  static unsigned int __read_mostly msr_count_max = ARRAY_SIZE(msrs_to_send);
>  
> +int hvm_save_cpu_msrs_one(struct vcpu *v, struct hvm_msr **ctx, 
> hvm_domain_context_t *h)

Same as above; I can't even spot where you use h in this function.

>  static int hvm_save_mtrr_msr(struct domain *d, hvm_domain_context_t *h)
>  {
> -int i;
>  struct vcpu *v;
>  struct hvm_hw_mtrr hw_mtrr;
> -struct mtrr_state *mtrr_state;
>  /* save mtrr&pat */

Please take the opportunity and add a blank line after the declarations.

> --- a/xen/arch/x86/hvm/viridian.c
> +++ b/xen/arch/x86/hvm/viridian.c
> @@ -1028,6 +1028,12 @@ static int viridian_load_domain_ctxt(struct domain *d, 
> hvm_domain_context_t *h)
>  HVM_REGISTER_SAVE_RESTORE(VIRIDIAN_DOMAIN, viridian_save_domain_ctxt,
>viridian_load_domain_ctxt, 1, HVMSR_PER_DOM);
>  
> +void viridian_save_vcpu_ctxt_one(struct vcpu *v, struct 
> hvm_viridian_vcpu_context *ctxt)
> +{
> +ctxt->vp_assist_msr = v->arch.hvm_vcpu.viridian.vp_assist.msr.raw;
> +ctxt->vp_assist_pending = v->arch.hvm_vcpu.viridian.vp_assist.pending;
> +}
> +
>  static int viridian_save_vcpu_ctxt(struct domain *d, hvm_domain_context_t *h)
>  {
>  struct vcpu *v;
> @@ -1036,10 +1042,9 @@ static int viridian_save_vcpu_ctxt(struct domain *d, 
> hvm_domain_context_t *h)
>  return 0;
>  
>  for_each_vcpu( d, v ) {
> -struct hvm_viridian_vcpu_context ctxt = {
> -.vp_assist_msr = v->arch.hvm_vcpu.viridian.vp_assist.msr.raw,
> -.vp_assist_pending = v->arch.hvm_vcpu.viridian.vp_assist.pending,
> -};
> +struct hvm_viridian_vcpu_context ctxt;
> +
> +viridian_save_vcpu_ctxt_one(v, &ctxt);

There is a reason ctxt has an initializer: You're now leaking 7 bytes of 
hypervisor
stack data (through the _pad[] array).

Jan



___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v1] libxl: always call qemus xen-save-devices-state in suspend/resume

2018-05-17 Thread Olaf Hering
If a domU has a qemu-xen instance attached, it is required to call qemus
"xen-save-devices-state" method. Without it, the receiving side of a PV
migration may be unable to lock the image:

xen be: qdisk-51712: xen be: qdisk-51712: error: Failed to get "write" lock
error: Failed to get "write" lock
xen be: qdisk-51712: xen be: qdisk-51712: initialise() failed
initialise() failed

The proper way to decide if a PV or PVH domU has a qemu-xen running is
to use the libxl__need_xenpv_qemu API. But since there is no copy of a
libxl_domain_config available in these places, it should be enough to
check if xenstore contains the relevant info.

Signed-off-by: Olaf Hering 
---
 tools/libxl/libxl_dom_suspend.c | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/tools/libxl/libxl_dom_suspend.c b/tools/libxl/libxl_dom_suspend.c
index 1e904bae8a..0d769eefd8 100644
--- a/tools/libxl/libxl_dom_suspend.c
+++ b/tools/libxl/libxl_dom_suspend.c
@@ -377,7 +377,9 @@ static void 
domain_suspend_common_guest_suspended(libxl__egc *egc,
 libxl__ev_xswatch_deregister(gc, &dsps->guest_watch);
 libxl__ev_time_deregister(gc, &dsps->guest_timeout);
 
-if (dsps->type == LIBXL_DOMAIN_TYPE_HVM) {
+if (dsps->type == LIBXL_DOMAIN_TYPE_HVM ||
+libxl__device_model_version_running(gc, dsps->domid) ==
+LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN) {
 rc = libxl__domain_suspend_device_model(gc, dsps);
 if (rc) {
 LOGD(ERROR, dsps->domid,
@@ -460,7 +462,9 @@ int libxl__domain_resume(libxl__gc *gc, uint32_t domid, int 
suspend_cancel)
 goto out;
 }
 
-if (type == LIBXL_DOMAIN_TYPE_HVM) {
+if (type == LIBXL_DOMAIN_TYPE_HVM ||
+libxl__device_model_version_running(gc, domid) ==
+LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN) {
 rc = libxl__domain_resume_device_model(gc, domid);
 if (rc) {
 LOGD(ERROR, domid, "failed to resume device model:%d", rc);

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v4 2/2] x86/domctl: Don't pause the whole domain if only getting vcpu state

2018-05-17 Thread Jan Beulich
>>> On 07.05.18 at 10:24,  wrote:
> --- a/xen/arch/x86/cpu/mcheck/vmce.c
> +++ b/xen/arch/x86/cpu/mcheck/vmce.c
> @@ -357,20 +357,14 @@ void vmce_save_vcpu_ctxt_one(struct vcpu *v, struct 
> hvm_vmce_vcpu *ctxt)
>  ctxt->mcg_ext_ctl = v->arch.vmce.mcg_ext_ctl;
>  }
>  
> -static int vmce_save_vcpu_ctxt(struct domain *d, hvm_domain_context_t *h)
> +static int vmce_save_vcpu_ctxt(struct domain *d, hvm_domain_context_t *h,
> +   struct vcpu *v)
>  {
> -struct vcpu *v;
>  int err = 0;
> +struct hvm_vmce_vcpu ctxt;
>  
> -for_each_vcpu ( d, v )
> -{
> -struct hvm_vmce_vcpu ctxt;
> -
> -vmce_save_vcpu_ctxt_one(v, &ctxt);
> -err = hvm_save_entry(VMCE_VCPU, v->vcpu_id, h, &ctxt);
> -if ( err )
> -break;
> -}
> +vmce_save_vcpu_ctxt_one(v, &ctxt);
> +err = hvm_save_entry(VMCE_VCPU, v->vcpu_id, h, &ctxt);
>  
>  return err;
>  }

At the example of this one: The idea of breaking out the patch introducing
the _one() functions was to avoid restructuring in this patch like what you
do here. Any such change not strictly fitting under the title of this patch
should be broken out. There may be multiple steps involved here.

As it stands, the function is now no longer meaningfully different from
vmce_save_vcpu_ctxt_one(), and this pattern recurs. Such redundancy
is undesirable. Together with you now passing v and d (when just v
would suffice) I think you want to further re-structure how handling of
save/restore happens, such that no stub functions like the one here
remain. IOW after having introduced the _one() functions, a second
transformation would be expected to eliminate the original ones, with
(as you do here) the loop moving into the caller.

Jan



___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH] xen-hvm: stop faking I/O to access PCI config space

2018-05-17 Thread Anthony PERARD
On Thu, May 03, 2018 at 12:18:40PM +0100, Paul Durrant wrote:
> This patch removes the current hackery where IOREQ_TYPE_PCI_CONFIG
> reqyests are handled by faking PIO to 0xcf8 and 0xcfc and replaces it

  ^ requests

> with direct calls to pci_host_config_read/write_common().
> Doing so necessitates mapping BDFs to PCIDevices but maintaining a simple
> QLIST in xen_device_realize/unrealize() will suffice.
> 
> NOTE: whilst config space accesses are currently limited to
>   PCI_CONFIG_SPACE_SIZE, this patch paves the way to increasing the
>   limit to PCIE_CONFIG_SPACE_SIZE when Xen gains the ability to
>   emulate MCFG table accesses.
> 
> Signed-off-by: Paul Durrant 

> +static void cpu_ioreq_config(XenIOState *state, ioreq_t *req)
> +{
> +uint32_t sbdf = req->addr >> 32;
> +uint32_t reg = req->addr;
> +XenPciDevice *xendev;
> +
> +if (req->size > sizeof(uint32_t)) {
> +hw_error("PCI config access: bad size (%u)", req->size);
> +}
> +
> +QLIST_FOREACH(xendev, &state->dev_list, entry) {
> +unsigned int i;
> +
> +if (xendev->sbdf != sbdf) {
> +continue;
> +}
> +
> +if (req->dir == IOREQ_READ) {
> +if (!req->data_is_ptr) {
> +req->data = pci_host_config_read_common(
> +xendev->pci_dev, reg, PCI_CONFIG_SPACE_SIZE,
> +req->size);
> +trace_cpu_ioreq_config_read(req, sbdf, reg, req->size,
> +req->data);
> +} else {
> +for (i = 0; i < req->count; i++) {
> +uint32_t tmp;
> +
> +tmp = pci_host_config_read_common(
> +xendev->pci_dev, reg, PCI_CONFIG_SPACE_SIZE,
> +req->size);

So, if data is a pointer, we just keep reading the same address
req->count time?

> +write_phys_req_item(req->data, req, i, &tmp);
> +}
> +}
> +} else if (req->dir == IOREQ_WRITE) {
> +if (!req->data_is_ptr) {
> +trace_cpu_ioreq_config_write(req, sbdf, reg, req->size,
> + req->data);
> +pci_host_config_write_common(
> +xendev->pci_dev, reg, PCI_CONFIG_SPACE_SIZE, req->data,
> +req->size);
> +} else {
> +for (i = 0; i < req->count; i++) {
> +uint32_t tmp = 0;
> +
> +read_phys_req_item(req->data, req, i, &tmp);
> +pci_host_config_write_common(
> +xendev->pci_dev, reg, PCI_CONFIG_SPACE_SIZE, tmp,
> +req->size);
> +}
> +}
> +}
> +}
> +}

-- 
Anthony PERARD

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [xen-unstable-smoke test] 122898: tolerable all pass - PUSHED

2018-05-17 Thread osstest service owner
flight 122898 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/122898/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  14 saverestore-support-checkfail   never pass

version targeted for testing:
 xen  8f853dbc06361416bb1faa6ca7721e3982adbf38
baseline version:
 xen  e1f912cbf7178798b0646c7d0753b8d67e139e75

Last test of basis   122888  2018-05-16 18:02:08 Z0 days
Failing since122898  2018-05-17 15:00:29 Z0 days1 attempts
Testing same since  (not found) 0 attempts


People who touched revisions under test:
  Konrad Rzeszutek Wilk 
  Oleksandr Andrushchenko 

jobs:
 test-armhf-armhf-xl  pass
 test-arm64-arm64-xl-xsm  pass
 test-amd64-amd64-xl-qemuu-debianhvm-i386 pass
 test-amd64-amd64-libvirt pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/xen.git
   e1f912cbf7..8f853dbc06  8f853dbc06361416bb1faa6ca7721e3982adbf38 -> smoke

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [RFC][PATCH] KVM: APPLES can improve the performance of applications and virtualized systems by up to 49%

2018-05-17 Thread Dario Faggioli
On Sat, 2018-05-12 at 16:27 +0800, Weiwei Jia wrote:
> We already have a prototype implementation based on KVM (Linux Kernel
> 3.19.8). Our patch for Linux Kernel 3.19.8 and the paper describing
> our idea are available in Github repository [1][2][3]. We are pleased
> to revise our patch in order to merge it into Linux/KVM and XEN. We
> hope that you can test and adopt our approach/techniques. We are
> pleased to get some comments/suggestions on the approach and on how
> the idea can be adopted/tested by Linux/KVM and XEN. Thank you.
> 
To see how this is being handled in Xen currently, I suggest you to
grep for:
 ple_gap
 ple_window
 SECONDARY_EXEC_PAUSE_LOOP_EXITING
 EXIT_REASON_PAUSE_INSTRUCTION
 pauseloop_exits

I am aware of your paper, but I haven't got to being able to read it
carefully yet. I'll do that (but can't promise when).

Regards,
Dario
-- 
<> (Raistlin Majere)
-
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Software Engineer @ SUSE https://www.suse.com/

signature.asc
Description: This is a digitally signed message part
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [distros-debian-wheezy test] 74722: all pass

2018-05-17 Thread Platform Team regression test user
flight 74722 distros-debian-wheezy real [real]
http://osstest.xs.citrite.net/~osstest/testlogs/logs/74722/

Perfect :-)
All tests in this flight passed as required
baseline version:
 flight   74702

jobs:
 build-amd64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-pvopspass
 build-armhf-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-amd64-wheezy-netboot-pvgrub pass
 test-amd64-i386-i386-wheezy-netboot-pvgrub   pass
 test-amd64-i386-amd64-wheezy-netboot-pygrub  pass
 test-amd64-amd64-i386-wheezy-netboot-pygrub  pass



sg-report-flight on osstest.xs.citrite.net
logs: /home/osstest/logs
images: /home/osstest/images

Logs, config files, etc. are available at
http://osstest.xs.citrite.net/~osstest/testlogs/logs

Test harness code can be found at
http://xenbits.xensource.com/gitweb?p=osstest.git;a=summary


Push not applicable.


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [xen-4.10-testing test] 122837: tolerable FAIL - PUSHED

2018-05-17 Thread osstest service owner
flight 122837 xen-4.10-testing real [real]
http://logs.test-lab.xenproject.org/osstest/logs/122837/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-pvhv2-intel 12 guest-start fail never pass
 test-amd64-amd64-xl-pvhv2-amd 12 guest-start  fail  never pass
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  14 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-armhf-armhf-xl-arndale  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  14 saverestore-support-checkfail   never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2  fail never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt 13 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt 14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-xsm  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-xsm  14 saverestore-support-checkfail   never pass
 test-amd64-i386-xl-qemut-win7-amd64 17 guest-stop  fail never pass
 test-armhf-armhf-xl-credit2  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 13 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 14 saverestore-support-checkfail  never pass
 test-armhf-armhf-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-xsm 14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 13 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 14 saverestore-support-checkfail never pass
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop  fail never pass
 test-amd64-amd64-xl-qemut-win7-amd64 17 guest-stop fail never pass
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stop fail never pass
 test-armhf-armhf-libvirt-raw 12 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail   never pass
 test-amd64-i386-xl-qemut-ws16-amd64 17 guest-stop  fail never pass
 test-amd64-amd64-xl-qemut-ws16-amd64 17 guest-stop fail never pass
 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stop fail never pass
 test-armhf-armhf-xl-vhd  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  13 saverestore-support-checkfail   never pass
 test-amd64-i386-xl-qemuu-ws16-amd64 17 guest-stop  fail never pass
 test-amd64-amd64-xl-qemuu-win10-i386 10 windows-installfail never pass
 test-amd64-amd64-xl-qemut-win10-i386 10 windows-installfail never pass
 test-amd64-i386-xl-qemuu-win10-i386 10 windows-install fail never pass
 test-amd64-i386-xl-qemut-win10-i386 10 windows-install fail never pass

version targeted for testing:
 xen  25e0657ed49e4febfb6fce729adb00a8d7b87042
baseline version:
 xen  c30ab3d97c8ff0d2ed8948dd013737befc7a2223

Last test of basis   122490  2018-04-28 06:03:56 Z   19 days
Failing since122560  2018-05-02 10:07:00 Z   15 days9 attempts
Testing same since   122688  2018-05-10 13:23:10 Z7 days3 attempts


People who touched revisions under test:
  Andrew Cooper 
  Jan Beulich 
  Roger Pau Monné 
  Xen Project Security Team 

jobs:
 build-amd64-xsm 

Re: [Xen-devel] [PATCHv3] xen: Add EFI_LOAD_OPTION support

2018-05-17 Thread Tamas K Lengyel
On Thu, May 17, 2018 at 2:03 AM, Jan Beulich  wrote:
 On 07.02.18 at 17:00,  wrote:
>> This patch as-is correctly tells the two possible formats apart. I
>> tested and Xen boots correctly both from the Shell and from the
>> firmware boot menu. I would not like to start addressing hypothetical
>> scenarios that I have no reasonable way to test against. If you are
>> inclined to do that, that's your call but I'll just leave this patch
>> here for now and I hope you would consider merging it.
>
> Would you mind giving the tentative v4 (below) a try?

Unfortunately this does not seem to work as intended:

# cat /boot/efi/EFI/xen/xen.cfg
[global]
default=old

[old]
options=console=vga
kernel=vmlinuz-4.9.0-6-amd64
root=UUID=19f184db-23a8-42c6-8dfa-67816c822573 ro quiet
ramdisk=initrd.img-4.9.0-6-amd64

[new]
options=console=vga,com1 com1=115200,8n1,amt loglvl=all
guest_loglvl=all altp2m=1
kernel=vmlinuz-4.9.0-6-amd64
root=UUID=19f184db-23a8-42c6-8dfa-67816c822573 ro quiet
ramdisk=initrd.img-4.9.0-6-amd64


# efibootmgr -v
BootCurrent: 0001
Timeout: 0 seconds
BootOrder: 0001,,0003,0004,0005,0006,0007
Boot* Xen
HD(1,GPT,ffc5e29b-fa57-483d-a5de-0053f87abdc4,0x800,0x10)/File(\EFI\xen\xen.efi)
Boot0001* Xen altp2m
HD(1,GPT,ffc5e29b-fa57-483d-a5de-0053f87abdc4,0x800,0x10)/File(\EFI\xen\xen.efi)n.e.w.

# xl info
...
xen_commandline: console=vga

As you can see boot option 1 (Xen altp2m) was used for booting but Xen
still used the default global option from the config file instead of
the one specified by the OptionalData.

Tamas

>
> Jan
>
> EFI: add EFI_LOAD_OPTION support
>
> When booting Xen via UEFI the Xen config file can contain multiple
> sections each describing different boot options. It is currently only
> possible to choose which section to boot with if the buffer contains a
> string. UEFI provides a different standard to pass optional arguments
> to an application, and in this patch we make Xen properly parse this
> buffer, thus making it possible to have separate EFI boot options
> present for the different config sections.
>
> Signed-off-by: Tamas K Lengyel 
> Signed-off-by: Jan Beulich 
> ---
> v4: Address my own review comments.
>
> --- unstable.orig/xen/common/efi/boot.c
> +++ unstable/xen/common/efi/boot.c
> @@ -88,6 +88,14 @@ typedef struct _EFI_APPLE_PROPERTIES {
>  EFI_APPLE_PROPERTIES_GETALL GetAll;
>  } EFI_APPLE_PROPERTIES;
>
> +typedef struct _EFI_LOAD_OPTION {
> +UINT32 Attributes;
> +UINT16 FilePathListLength;
> +CHAR16 Description[];
> +} EFI_LOAD_OPTION;
> +
> +#define LOAD_OPTION_ACTIVE  0x0001
> +
>  union string {
>  CHAR16 *w;
>  char *s;
> @@ -275,6 +283,16 @@ static int __init wstrncmp(const CHAR16
>  return n ? *s1 - *s2 : 0;
>  }
>
> +static const CHAR16 *__init wmemchr(const CHAR16 *s, CHAR16 c, UINTN n)
> +{
> +while ( n && *s != c )
> +{
> +--n;
> +++s;
> +}
> +return n ? s : NULL;
> +}
> +
>  static CHAR16 *__init s2w(union string *str)
>  {
>  const char *s = str->s;
> @@ -374,14 +392,49 @@ static void __init PrintErrMesg(const CH
>  }
>
>  static unsigned int __init get_argv(unsigned int argc, CHAR16 **argv,
> -CHAR16 *cmdline, UINTN cmdsize,
> +VOID *data, UINTN size, UINTN *offset,
>  CHAR16 **options)
>  {
> -CHAR16 *ptr = (CHAR16 *)(argv + argc + 1), *prev = NULL;
> +CHAR16 *ptr = (CHAR16 *)(argv + argc + 1), *prev = NULL, *cmdline = NULL;
>  bool prev_sep = true;
>
> -for ( ; cmdsize > sizeof(*cmdline) && *cmdline;
> -cmdsize -= sizeof(*cmdline), ++cmdline )
> +if ( *offset < size )
> +cmdline = data + *offset;
> +else if ( size > sizeof(*cmdline) && !(size % sizeof(*cmdline)) &&
> +  (wmemchr(data, 0, size / sizeof(*cmdline)) ==
> +   data + size - sizeof(*cmdline)) )
> +{
> +*offset = 0;
> +cmdline = data;
> +}
> +else if ( size > sizeof(EFI_LOAD_OPTION) )
> +{
> +const EFI_LOAD_OPTION *elo = data;
> +/* The minimum size the buffer needs to be. */
> +size_t elo_min = offsetof(EFI_LOAD_OPTION, Description[1]) +
> + elo->FilePathListLength;
> +
> +if ( (elo->Attributes & LOAD_OPTION_ACTIVE) && size > elo_min &&
> + !((size - elo_min) % sizeof(*cmdline)) )
> +{
> +const CHAR16 *desc = elo->Description;
> +const CHAR16 *end = wmemchr(desc, 0,
> +(size - elo_min) / sizeof(*desc) + 
> 1);
> +
> +if ( end )
> +{
> +*offset = elo_min + (end - desc) * sizeof(*desc);
> +if ( (size -= *offset) > sizeof(*cmdline) )
> +cmdline = data + *offset;
> +}
> +}
> +}
> +
> +if ( !cmdline )
> +return 0;
> +
> +for ( ; size > sizeof

Re: [Xen-devel] [PATCH v3 1/2] xen/PVH: Set up GS segment for stack canary

2018-05-17 Thread Boris Ostrovsky
On 05/17/2018 11:02 AM, Jan Beulich wrote:
 On 17.05.18 at 16:47,  wrote:
>> @@ -64,6 +67,9 @@ ENTRY(pvh_start_xen)
>>  mov %eax,%es
>>  mov %eax,%ss
>>  
>> +mov $PVH_CANARY_SEL,%eax
>> +mov %eax,%gs
> I doubt this is needed for 64-bit (you could equally well load zero or leave
> in place what's there in that case),

I don't understand this.


>  and loading the selector before setting
> the base address in the descriptor won't have the intended effect.


I wasn't sure about this either but then I noticed that
secondary_startup_64() does it in the same order (although not using the
MSR).


>
>> @@ -150,9 +170,12 @@ gdt_start:
>>  .quad GDT_ENTRY(0xc09a, 0, 0xf) /* __KERNEL_CS */
>>  #endif
>>  .quad GDT_ENTRY(0xc092, 0, 0xf) /* __KERNEL_DS */
>> +.quad GDT_ENTRY(0x4090, 0, 0x18)/* PVH_CANARY_SEL */
>>  gdt_end:
>>  
>> -.balign 4
>> +.balign 16
>> +canary:
>> +.fill 24, 1, 0
> This is too little space for 64-bit afaict (the canary lives at offset 40 
> there
> if I can trust asm/processor.h).

Yes, should be 48. I didn't realize the two modes use different offsets.

-boris


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH] xen_pt: Present the size of 64 bit BARs correctly

2018-05-17 Thread Anthony PERARD
On Mon, May 14, 2018 at 10:57:46AM +0100, Ross Lagerwall wrote:
> The full size of the BAR is stored in the lower PCIIORegion.size. The
> upper PCIIORegion.size is 0.  Calculate the size of the upper half
> correctly from the lower half otherwise the size read by the guest will
> be incorrect.
> 
> Signed-off-by: Ross Lagerwall 

Acked-by: Anthony PERARD 

> ---
>  hw/xen/xen_pt_config_init.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/hw/xen/xen_pt_config_init.c b/hw/xen/xen_pt_config_init.c
> index a3ce33e..aee31c6 100644
> --- a/hw/xen/xen_pt_config_init.c
> +++ b/hw/xen/xen_pt_config_init.c
> @@ -504,6 +504,8 @@ static int xen_pt_bar_reg_write(XenPCIPassthroughState 
> *s, XenPTReg *cfg_entry,
>  bar_ro_mask = XEN_PT_BAR_IO_RO_MASK | (r_size - 1);
>  break;
>  case XEN_PT_BAR_FLAG_UPPER:
> +assert(index > 0);
> +r_size = d->io_regions[index - 1].size >> 32;
>  bar_emu_mask = XEN_PT_BAR_ALLF;
>  bar_ro_mask = r_size ? r_size - 1 : 0;
>  break;


-- 
Anthony PERARD

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] Test for osstest, features used in Qubes OS

2018-05-17 Thread Sander Eikelenboom
Marek / Ian,

Nice to see PCI-passthrough getting some attention again.

On 17/05/18 17:12, Ian Jackson wrote:
> Marek Marczykowski-Górecki writes ("Re: Test for osstest, features used in 
> Qubes OS"):
>> On Thu, May 17, 2018 at 01:26:30PM +0100, Ian Jackson wrote:
>>> Is it likely that this will depend on non-buggy host firmware ?  If so
>>> then we need to make arrangements to test it and only do it on hosts
>>> which are not buggy.  In practice this probably means wiring it up to
>>> the automatic host examiner.
>>
>> Yes, probably.
> 
> That's not entirely trivial then, especially for you, unless you want
> to set up your own osstest production instance.  However, I can
> probably do the osstest-machinery work if you will help debug it,
> review logs, tell me what to do next, etc. :-).
> 
>>> Is there some kind of cheap USB HID, that is interactable-with, which
>>> we could plug into each machine's USB port ?  I'm slightly concerned
>>> that plugging in a storage device, or connecting the other NIC, might
>>> interfere with booting.
>>
>> I use mass storage for tests... But if you use network boot, it
>> shouldn't really interfere, no?
> 
> We do both network boot and disk boot.  I think the BIOS disk boot has
> to continue to work and boot the HDD.

As a user of pci-passthrough for quite some time and reporting some 
pci-passthrough bugs in the past,
I do have some comments:

- First of all it would be very nice to get some autotesting :).
- But if you want to thoroughly test pci-passthrough, 
  it will be far from easy since there is quite a multi-dimensional support 
matrix
  (I'm not implying that everything should be done or it won't be valuable if 
any is missing,
   it's only meant for reference):
  1) Guest side implementation: 
 - PV guest (pcifront)
 - HVM (qemu-traditional) 
 - HVM (qemu-xen) 
 - HVM (qemu-upstream) 
 - perhaps PVH support for pci passthrough coming around the corner.

  2) (Un)Binding method to pciback:
 - binding pci devices to pciback on host boot (command line) 
 - de/re/unbinding devices from dom0 while running.
 
  3) (Un)binding to guest:
 - On guest start (guest.cfg pci=[...])
 - After the guest has been started with 'xl pci-*' commands
  3) Device interrupts: legacy versus MSI versus MSI-X
  4) Other pci device features: roms, BAR sizes, etc.
  5) AMD versus Intel IOMMU

From the past reports, I know (1) and (3) did matter (problems being isolated 
to one of these variants only).


As for restarting guests and reassigning pci-devices again to other guests the 
current pciback reset support lacks
the bus-reset patches at present in upstream linux kernels. Passthrough of AMD 
Radeon graphics adapters works only one
time without it (if you stop and restart a guest it doesn't work anymore and 
you need to reboot the host). 
With the bus-reset patches (which have been posted to the list and seem to be 
in both Qubes and Xenserver 
in some form but not in upstream linux). Someone from Oracle had picked them up 
to get them upstream some time ago,
but that effort seems to have stalled.

The code in libxl seems to be quite messy for pci-passthrough especially for 
handling all the guest side implementations (1)
and xenstore interactions that go with it (or don't for qemu).

--
Sander

 
>>> If you want to get pci passthrough tests working I would suggest
>>> testing it with non-stubdom first.  I assume the config etc. is the
>>> same, so having got that working, osstest would be able to test it for
>>> the stubdom tests too.
>>
>> Oh, I though there are already tests for that...
> 
> There are no PCI passthrough tests at all.  For a while we had some
> SRIOV NIC tests which were requested by Intel.  But they always failed
> giving kernel stack dumps.  We kept poking Intel to get them to fix
> them, or tell us how the tests were wrong, but to no avail.  So we
> dropped them.
> 
> So any work in this area would be greatly appreciated!
> 
> Ian.
> 
> 


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [xen-unstable-smoke baseline test] 122899: tolerable all pass

2018-05-17 Thread osstest service owner
"Old" tested version had not actually been tested; therefore in this
flight we test it, rather than a new candidate.  The baseline, if
any, is the most recent actually tested revision.

flight 122899 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/122899/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  14 saverestore-support-checkfail   never pass

version targeted for testing:
 xen  8f853dbc06361416bb1faa6ca7721e3982adbf38
baseline version:
 xen  e1f912cbf7178798b0646c7d0753b8d67e139e75

Last test of basis   122888  2018-05-16 18:02:08 Z1 days
Failing since122898  2018-05-17 15:00:29 Z0 days2 attempts
Testing same since  (not found) 0 attempts


People who touched revisions under test:
  Konrad Rzeszutek Wilk 
  Oleksandr Andrushchenko 

jobs:
 test-armhf-armhf-xl  pass
 test-arm64-arm64-xl-xsm  pass
 test-amd64-amd64-xl-qemuu-debianhvm-i386 pass
 test-amd64-amd64-libvirt pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Push not applicable.


commit 8f853dbc06361416bb1faa6ca7721e3982adbf38
Author: Oleksandr Andrushchenko 
Date:   Wed May 2 17:49:19 2018 +0300

xen/kbdif: Add features to disable keyboard and pointer

It is now not fully possible to control if and which virtual devices
are created by the frontend, e.g. keyboard and pointer devices
are always created and multi-touch device is created if the
backend advertises multi-touch support. In some cases this
behavior is not desirable and better control over the frontend's
configuration is required.

Add new XenStore feature fields, so it is possible to individually
control set of exposed virtual devices for each guest OS:
 - set feature-disable-keyboard to 1 if no keyboard device needs
   to be created
 - set feature-disable-pointer to 1 if no pointer device needs
   to be created

Keep old behavior by default.

Signed-off-by: Oleksandr Andrushchenko 
Reviewed-by: Konrad Rzeszutek Wilk 
Signed-off-by: Konrad Rzeszutek Wilk 
Release-acked-by: Juergen Gross 
(qemu changes not included)

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v4 2/2] xen/kbdif: Add features to disable keyboard and pointer

2018-05-17 Thread Oleksandr Andrushchenko

On 05/17/2018 05:09 PM, Konrad Rzeszutek Wilk wrote:

On Thu, May 17, 2018 at 08:51:38AM +0300, Oleksandr Andrushchenko wrote:

On 05/17/2018 08:50 AM, Juergen Gross wrote:

On 17/05/18 07:45, Oleksandr Andrushchenko wrote:

Hi, Juergen!

This patch should go into 4.11 as it is needed for a related Linux
kernel patch and the risk is next to zero for Xen due to only adding
some macros not in use on Xen side.

Could you please release ack this

Release-acked-by: Juergen Gross 

Thank you

and apply?

This has to be done by a committer, which I'm not.

Konrad, could you please apply?

Yes of course.

Thank you

Juergen

Thank you,
Oleksandr



___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH UPSTREAM] xen-swiotlb: fix the check condition for xen_swiotlb_free_coherent

2018-05-17 Thread Joe Jin
When run raidconfig from Dom0 we found that the Xen DMA heap is reduced,
but Dom Heap is increased by the same size. Tracing raidconfig we found
that the related ioctl() in megaraid_sas will call dma_alloc_coherent()
to apply memory. If the memory allocated by Dom0 is not in the DMA area,
it will exchange memory with Xen to meet the requiment. Later drivers
call dma_free_coherent() to free the memory, on xen_swiotlb_free_coherent()
the check condition (dev_addr + size - 1 <= dma_mask) is always false,
it prevents calling xen_destroy_contiguous_region() to return the memory
to the Xen DMA heap.

This issue introduced by commit 6810df88dcfc2 "xen-swiotlb: When doing
coherent alloc/dealloc check before swizzling the MFNs.".

Signed-off-by: Joe Jin 
Tested-by: John Sobecki  
Reviewed-by: Rzeszutek Wilk 
Cc: sta...@vger.kernel.org
---
 drivers/xen/swiotlb-xen.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
index e1c60899fdbc..a6f9ba85dc4b 100644
--- a/drivers/xen/swiotlb-xen.c
+++ b/drivers/xen/swiotlb-xen.c
@@ -351,7 +351,7 @@ xen_swiotlb_free_coherent(struct device *hwdev, size_t 
size, void *vaddr,
 * physical address */
phys = xen_bus_to_phys(dev_addr);
 
-   if (((dev_addr + size - 1 > dma_mask)) ||
+   if (((dev_addr + size - 1 <= dma_mask)) ||
range_straddles_page_boundary(phys, size))
xen_destroy_contiguous_region(phys, order);
 
-- 
2.14.3 (Apple Git-98)


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH UPSTREAM] xen-swiotlb: fix the check condition for xen_swiotlb_free_coherent

2018-05-17 Thread Greg KH
On Thu, May 17, 2018 at 11:45:57AM -0700, Joe Jin wrote:
> When run raidconfig from Dom0 we found that the Xen DMA heap is reduced,
> but Dom Heap is increased by the same size. Tracing raidconfig we found
> that the related ioctl() in megaraid_sas will call dma_alloc_coherent()
> to apply memory. If the memory allocated by Dom0 is not in the DMA area,
> it will exchange memory with Xen to meet the requiment. Later drivers
> call dma_free_coherent() to free the memory, on xen_swiotlb_free_coherent()
> the check condition (dev_addr + size - 1 <= dma_mask) is always false,
> it prevents calling xen_destroy_contiguous_region() to return the memory
> to the Xen DMA heap.
> 
> This issue introduced by commit 6810df88dcfc2 "xen-swiotlb: When doing
> coherent alloc/dealloc check before swizzling the MFNs.".
> 
> Signed-off-by: Joe Jin 
> Tested-by: John Sobecki  
> Reviewed-by: Rzeszutek Wilk 
> Cc: sta...@vger.kernel.org
> ---
>  drivers/xen/swiotlb-xen.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

What does "PATCH UPSTREAM" mean?

confused,

greg k-h

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH UPSTREAM] xen-swiotlb: fix the check condition for xen_swiotlb_free_coherent

2018-05-17 Thread Joe Jin
On 5/17/18 12:10 PM, Greg KH wrote:
> On Thu, May 17, 2018 at 11:45:57AM -0700, Joe Jin wrote:
>> When run raidconfig from Dom0 we found that the Xen DMA heap is reduced,
>> but Dom Heap is increased by the same size. Tracing raidconfig we found
>> that the related ioctl() in megaraid_sas will call dma_alloc_coherent()
>> to apply memory. If the memory allocated by Dom0 is not in the DMA area,
>> it will exchange memory with Xen to meet the requiment. Later drivers
>> call dma_free_coherent() to free the memory, on xen_swiotlb_free_coherent()
>> the check condition (dev_addr + size - 1 <= dma_mask) is always false,
>> it prevents calling xen_destroy_contiguous_region() to return the memory
>> to the Xen DMA heap.
>>
>> This issue introduced by commit 6810df88dcfc2 "xen-swiotlb: When doing
>> coherent alloc/dealloc check before swizzling the MFNs.".
>>
>> Signed-off-by: Joe Jin 
>> Tested-by: John Sobecki  
>> Reviewed-by: Rzeszutek Wilk 
>> Cc: sta...@vger.kernel.org
>> ---
>>  drivers/xen/swiotlb-xen.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> What does "PATCH UPSTREAM" mean?

Oops I forgot to remove UPSTREAM, the tag for internal review.

Sorry for this, will resend it without the tag.

Thanks,
Joe

> 
> confused,
> 
> greg k-h
>

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH] xen-swiotlb: fix the check condition for xen_swiotlb_free_coherent

2018-05-17 Thread Joe Jin
When run raidconfig from Dom0 we found that the Xen DMA heap is reduced,
but Dom Heap is increased by the same size. Tracing raidconfig we found
that the related ioctl() in megaraid_sas will call dma_alloc_coherent()
to apply memory. If the memory allocated by Dom0 is not in the DMA area,
it will exchange memory with Xen to meet the requiment. Later drivers
call dma_free_coherent() to free the memory, on xen_swiotlb_free_coherent()
the check condition (dev_addr + size - 1 <= dma_mask) is always false,
it prevents calling xen_destroy_contiguous_region() to return the memory
to the Xen DMA heap.

This issue introduced by commit 6810df88dcfc2 "xen-swiotlb: When doing
coherent alloc/dealloc check before swizzling the MFNs.".

Signed-off-by: Joe Jin 
Tested-by: John Sobecki  
Reviewed-by: Rzeszutek Wilk 
Cc: sta...@vger.kernel.org
---
 drivers/xen/swiotlb-xen.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
index e1c60899fdbc..a6f9ba85dc4b 100644
--- a/drivers/xen/swiotlb-xen.c
+++ b/drivers/xen/swiotlb-xen.c
@@ -351,7 +351,7 @@ xen_swiotlb_free_coherent(struct device *hwdev, size_t 
size, void *vaddr,
 * physical address */
phys = xen_bus_to_phys(dev_addr);
 
-   if (((dev_addr + size - 1 > dma_mask)) ||
+   if (((dev_addr + size - 1 <= dma_mask)) ||
range_straddles_page_boundary(phys, size))
xen_destroy_contiguous_region(phys, order);
 
-- 
2.14.3 (Apple Git-98)


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

  1   2   >