RE: [PATCH v3 00/23] xl / libxl: named PCI pass-through devices

2020-11-23 Thread Paul Durrant
> -----Original Message-----
> From: Andrew Cooper 
> Sent: 23 November 2020 22:18
> To: Paul Durrant; xen-devel@lists.xenproject.org
> Cc: Paul Durrant; Anthony PERARD; Christian Lindig; David Scott;
> George Dunlap; Ian Jackson; Nick Rosbrook; Wei Liu
> Subject: Re: [PATCH v3 00/23] xl / libxl: named PCI pass-through devices
> 
> On 23/11/2020 17:44, Paul Durrant wrote:
> > From: Paul Durrant 
> >
> > Paul Durrant (23):
> >   xl / libxl: s/pcidev/pci and remove DEFINE_DEVICE_TYPE_STRUCT_X
> >   libxl: make libxl__device_list() work correctly for
> > LIBXL__DEVICE_KIND_PCI...
> >   libxl: Make sure devices added by pci-attach are reflected in the
> > config
> >   libxl: add/recover 'rdm_policy' to/from PCI backend in xenstore
> >   libxl: s/detatched/detached in libxl_pci.c
> >   libxl: remove extraneous arguments to do_pci_remove() in libxl_pci.c
> >   libxl: stop using aodev->device_config in libxl__device_pci_add()...
> >   libxl: generalise 'driver_path' xenstore access functions in
> > libxl_pci.c
> >   libxl: remove unnecessary check from libxl__device_pci_add()
> >   libxl: remove get_all_assigned_devices() from libxl_pci.c
> >   libxl: make sure callers of libxl_device_pci_list() free the list
> > after use
> >   libxl: add libxl_device_pci_assignable_list_free()...
> >   libxl: use COMPARE_PCI() macro is_pci_in_array()...
> >   docs/man: extract documentation of PCI_SPEC_STRING from the xl.cfg
> > manpage...
> >   docs/man: improve documentation of PCI_SPEC_STRING...
> >   docs/man: fix xl(1) documentation for 'pci' operations
> >   libxl: introduce 'libxl_pci_bdf' in the idl...
> >   libxlu: introduce xlu_pci_parse_spec_string()
> >   libxl: modify
> > libxl_device_pci_assignable_add/remove/list/list_free()...
> >   docs/man: modify xl(1) in preparation for naming of assignable devices
> >   xl / libxl: support naming of assignable devices
> >   docs/man: modify xl-pci-configuration(5) to add 'name' field to
> > PCI_SPEC_STRING
> >   xl / libxl: support 'xl pci-attach/detach' by name
> 
> We're trying to get the CI loop up and running.  It's not emailing
> xen-devel yet, but has found a real error somewhere in this series.
> 
> https://gitlab.com/xen-project/patchew/xen/-/pipelines/220153571
> 

Found it, thanks...

libxl_pci.c: In function 'libxl_device_pci_assignable_name2bdf':
libxl_pci.c:970:5: error: 'pcibdf' may be used uninitialized in this function [-Werror=maybe-uninitialized]
 return pcibdf;
 ^

Odd that my local build (debian 9.13) didn't pick it up. Will send a v4 shortly.

  Paul

> ~Andrew
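
For readers unfamiliar with this class of error: the libxl code itself is not quoted in this thread, so the following is only a generic sketch of the pattern that typically triggers -Wmaybe-uninitialized, with the usual fix applied. All names (struct bdf, name2bdf, the table) are hypothetical and not the libxl API. Whether a given compiler reports the problem depends on its version and optimisation level, which may be why one build catches it and another does not.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a PCI address; not the libxl type. */
struct bdf {
    unsigned int bus, dev, func;
    bool valid;
};

/* Hypothetical table mapping device names to addresses. */
static const struct {
    const char *name;
    unsigned int bus, dev, func;
} table[] = {
    { "nic0", 0x03, 0x00, 0x0 },
    { "gpu0", 0x65, 0x00, 0x1 },
};

static struct bdf name2bdf(const char *name)
{
    /*
     * Without this initialiser the "not found" path below would return
     * an indeterminate value, which is what the warning is about.
     */
    struct bdf bdf = { 0 };
    size_t i;

    assert(name != NULL);

    for (i = 0; i < sizeof(table) / sizeof(table[0]); i++) {
        if (strcmp(table[i].name, name) == 0) {
            bdf.bus = table[i].bus;
            bdf.dev = table[i].dev;
            bdf.func = table[i].func;
            bdf.valid = true;
            break;
        }
    }

    return bdf;
}
```

A caller would check the valid field before using the address, e.g. name2bdf("gpu0").valid.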




[linux-linus test] 156972: regressions - FAIL

2020-11-23 Thread osstest service owner
flight 156972 linux-linus real [real]
http://logs.test-lab.xenproject.org/osstest/logs/156972/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-i386-xl-qemuu-ws16-amd64 7 xen-install fail REGR. vs. 152332
 test-amd64-i386-xl-xsm 7 xen-install fail REGR. vs. 152332
 test-amd64-i386-qemut-rhel6hvm-intel 7 xen-install fail REGR. vs. 152332
 test-amd64-i386-xl-qemuu-debianhvm-amd64-shadow 7 xen-install fail REGR. vs. 152332
 test-amd64-i386-xl-qemut-debianhvm-amd64 7 xen-install fail REGR. vs. 152332
 test-amd64-i386-qemuu-rhel6hvm-intel 7 xen-install fail REGR. vs. 152332
 test-amd64-i386-xl-qemuu-debianhvm-i386-xsm 7 xen-install fail REGR. vs. 152332
 test-amd64-i386-examine 6 xen-install fail REGR. vs. 152332
 test-amd64-i386-xl-qemut-ws16-amd64 7 xen-install fail REGR. vs. 152332
 test-amd64-i386-libvirt 7 xen-install fail REGR. vs. 152332
 test-amd64-i386-xl-qemuu-debianhvm-amd64 7 xen-install fail REGR. vs. 152332
 test-amd64-coresched-i386-xl 7 xen-install fail REGR. vs. 152332
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 7 xen-install fail REGR. vs. 152332
 test-amd64-i386-qemuu-rhel6hvm-amd 7 xen-install fail REGR. vs. 152332
 test-amd64-i386-qemut-rhel6hvm-amd 7 xen-install fail REGR. vs. 152332
 test-amd64-i386-pair 10 xen-install/src_host fail REGR. vs. 152332
 test-amd64-i386-pair 11 xen-install/dst_host fail REGR. vs. 152332
 test-amd64-i386-libvirt-xsm 7 xen-install fail REGR. vs. 152332
 test-amd64-i386-xl-raw 7 xen-install fail REGR. vs. 152332
 test-amd64-i386-freebsd10-amd64 7 xen-install fail REGR. vs. 152332
 test-amd64-i386-xl-pvshim 7 xen-install fail REGR. vs. 152332
 test-amd64-i386-xl-qemut-debianhvm-i386-xsm 7 xen-install fail REGR. vs. 152332
 test-amd64-i386-xl-shadow 7 xen-install fail REGR. vs. 152332
 test-amd64-i386-freebsd10-i386 7 xen-install fail REGR. vs. 152332
 test-amd64-i386-xl-qemuu-ovmf-amd64 7 xen-install fail REGR. vs. 152332
 test-amd64-i386-xl-qemut-win7-amd64 7 xen-install fail REGR. vs. 152332
 test-amd64-i386-xl-qemuu-win7-amd64 7 xen-install fail REGR. vs. 152332
 test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm 7 xen-install fail REGR. vs. 152332
 test-amd64-i386-libvirt-pair 10 xen-install/src_host fail REGR. vs. 152332
 test-amd64-i386-libvirt-pair 11 xen-install/dst_host fail REGR. vs. 152332
 test-arm64-arm64-examine 8 reboot fail REGR. vs. 152332
 test-amd64-amd64-amd64-pvgrub 20 guest-stop fail REGR. vs. 152332
 test-amd64-amd64-i386-pvgrub 20 guest-stop fail REGR. vs. 152332
 test-amd64-i386-xl-qemuu-dmrestrict-amd64-dmrestrict 7 xen-install fail REGR. vs. 152332
 test-amd64-i386-xl 7 xen-install fail REGR. vs. 152332
 test-arm64-arm64-xl-xsm 8 xen-boot fail REGR. vs. 152332
 test-arm64-arm64-xl-credit2 8 xen-boot fail REGR. vs. 152332
 test-arm64-arm64-xl-credit1 12 debian-install fail in 156955 REGR. vs. 152332

Tests which are failing intermittently (not blocking):
 test-amd64-amd64-xl-rtds 20 guest-localmigrate/x10 fail in 156955 pass in 156972
 test-arm64-arm64-xl 8 xen-boot fail in 156964 pass in 156972
 test-arm64-arm64-xl-credit1 10 host-ping-check-xen fail pass in 156955
 test-arm64-arm64-xl 10 host-ping-check-xen fail pass in 156955
 test-arm64-arm64-xl-seattle 8 xen-boot fail pass in 156955
 test-arm64-arm64-libvirt-xsm 8 xen-boot fail pass in 156955
 test-armhf-armhf-libvirt-raw 8 xen-boot fail pass in 156964

Tests which did not succeed, but are not blocking:
 test-arm64-arm64-xl-seattle 11 leak-check/basis(11) fail in 156955 blocked in 152332
 test-arm64-arm64-libvirt-xsm 11 leak-check/basis(11) fail in 156955 blocked in 152332
 test-arm64-arm64-xl 11 leak-check/basis(11) fail in 156955 blocked in 152332
 test-armhf-armhf-libvirt-raw 15 saverestore-support-check fail in 156955 like 152332
 test-armhf-armhf-libvirt-raw 14 migrate-support-check fail in 156955 never pass
 test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stop fail like 152332
 test-armhf-armhf-libvirt 16 saverestore-support-check fail like 152332
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stop fail like 152332
 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stop fail like 152332
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stop fail like 152332
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 152332
 test-amd64-amd64-libvirt 15 migrate-support-check fail never pass
 test-amd64-amd64-libvirt-xsm 15 migrate-support-check fail

[PATCH v7 1/3] xen/events: access last_priority and last_vcpu_id together

2020-11-23 Thread Juergen Gross
The queue for a fifo event depends on the vcpu_id and the priority of
the event. When sending an event, it might happen that the event needs
to change queues, and the old queue needs to be remembered in order to
keep the links between queue elements intact. For this purpose the
event channel contains last_priority and last_vcpu_id values, which
identify the old queue.

In order to avoid races, always access last_priority and last_vcpu_id
with a single atomic operation, avoiding any inconsistencies.

Signed-off-by: Juergen Gross 
Reviewed-by: Julien Grall 
---
 xen/common/event_fifo.c | 25 +++--
 xen/include/xen/sched.h |  3 +--
 2 files changed, 20 insertions(+), 8 deletions(-)

diff --git a/xen/common/event_fifo.c b/xen/common/event_fifo.c
index c6e58d2a1a..79090c04ca 100644
--- a/xen/common/event_fifo.c
+++ b/xen/common/event_fifo.c
@@ -42,6 +42,14 @@ struct evtchn_fifo_domain {
 unsigned int num_evtchns;
 };
 
+union evtchn_fifo_lastq {
+uint32_t raw;
+struct {
+uint8_t last_priority;
+uint16_t last_vcpu_id;
+};
+};
+
 static inline event_word_t *evtchn_fifo_word_from_port(const struct domain *d,
unsigned int port)
 {
@@ -86,16 +94,18 @@ static struct evtchn_fifo_queue *lock_old_queue(const struct domain *d,
 struct vcpu *v;
 struct evtchn_fifo_queue *q, *old_q;
 unsigned int try;
+union evtchn_fifo_lastq lastq;
 
 for ( try = 0; try < 3; try++ )
 {
-v = d->vcpu[evtchn->last_vcpu_id];
-old_q = &v->evtchn_fifo->queue[evtchn->last_priority];
+lastq.raw = read_atomic(&evtchn->fifo_lastq);
+v = d->vcpu[lastq.last_vcpu_id];
+old_q = &v->evtchn_fifo->queue[lastq.last_priority];
 
 spin_lock_irqsave(&old_q->lock, *flags);
 
-v = d->vcpu[evtchn->last_vcpu_id];
-q = &v->evtchn_fifo->queue[evtchn->last_priority];
+v = d->vcpu[lastq.last_vcpu_id];
+q = &v->evtchn_fifo->queue[lastq.last_priority];
 
 if ( old_q == q )
 return old_q;
@@ -246,8 +256,11 @@ static void evtchn_fifo_set_pending(struct vcpu *v, struct evtchn *evtchn)
 /* Moved to a different queue? */
 if ( old_q != q )
 {
-evtchn->last_vcpu_id = v->vcpu_id;
-evtchn->last_priority = q->priority;
+union evtchn_fifo_lastq lastq = { };
+
+lastq.last_vcpu_id = v->vcpu_id;
+lastq.last_priority = q->priority;
+write_atomic(&evtchn->fifo_lastq, lastq.raw);
 
 spin_unlock_irqrestore(&old_q->lock, flags);
 spin_lock_irqsave(&q->lock, flags);
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index 7251b3ae3e..a345cc01f8 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -117,8 +117,7 @@ struct evtchn
 #ifndef NDEBUG
 u8 old_state;  /* State when taking lock in write mode. */
 #endif
-u8 last_priority;
-u16 last_vcpu_id;
+u32 fifo_lastq;/* Data for fifo events identifying last queue. */
 #ifdef CONFIG_XSM
 union {
 #ifdef XSM_NEED_GENERIC_EVTCHN_SSID
-- 
2.26.2
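
The idea behind patch 1/3 (pack both fields into one word, then load or store that word in a single access) can be sketched outside Xen as follows. This is a minimal illustration under stated assumptions, not the hypervisor code: it uses C11 <stdatomic.h> in place of Xen's read_atomic()/write_atomic(), and struct chan, chan_set_last() and chan_get_last() are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Same shape as the union in the patch: both fields share one 32-bit word. */
union lastq {
    uint32_t raw;
    struct {
        uint8_t  last_priority;
        uint16_t last_vcpu_id;
    };
};

_Static_assert(sizeof(union lastq) == sizeof(uint32_t),
               "both fields must fit in the single word that is accessed");

/* Hypothetical channel holding only the packed word. */
struct chan {
    _Atomic uint32_t fifo_lastq;
};

/* Writer: build the new value locally, publish it with one store. */
static void chan_set_last(struct chan *c, uint16_t vcpu_id, uint8_t priority)
{
    union lastq q = { .raw = 0 };   /* also zeroes the padding byte */

    q.last_vcpu_id = vcpu_id;
    q.last_priority = priority;
    atomic_store(&c->fifo_lastq, q.raw);
}

/* Reader: one load yields a vcpu id and priority that belong together. */
static union lastq chan_get_last(struct chan *c)
{
    union lastq q;

    q.raw = atomic_load(&c->fifo_lastq);
    return q;
}
```

With two separate fields a reader could observe the new vcpu id together with the old priority; with the packed word it sees either the old pair or the new pair.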




[PATCH v7 3/3] xen/events: rework fifo queue locking

2020-11-23 Thread Juergen Gross
Two cpus entering evtchn_fifo_set_pending() for the same event channel
can race in case the first one gets interrupted after setting
EVTCHN_FIFO_PENDING and when the other one manages to set
EVTCHN_FIFO_LINKED before the first one is testing that bit. This can
lead to evtchn_check_pollers() being called before the event is put
properly into the queue, resulting eventually in the guest not seeing
the event pending and thus blocking forever afterwards.

Note that commit 5f2df45ead7c1195 ("xen/evtchn: rework per event channel
lock") made the race just more obvious, while the fifo event channel
implementation had this race from the beginning when an unmask operation
was running in parallel with an event channel send operation.

To avoid this race, the queue locking in evtchn_fifo_set_pending()
needs to be reworked to cover the test of EVTCHN_FIFO_PENDING,
EVTCHN_FIFO_MASKED and EVTCHN_FIFO_LINKED, too. Additionally, when an
event channel needs to change queues, both queues need to be locked
initially.

Fixes: 5f2df45ead7c1195 ("xen/evtchn: rework per event channel lock")
Fixes: 88910061ec615b2d ("evtchn: add FIFO-based event channel hypercalls and 
port ops")
Signed-off-by: Juergen Gross 
---
 xen/common/event_fifo.c | 115 
 1 file changed, 58 insertions(+), 57 deletions(-)

diff --git a/xen/common/event_fifo.c b/xen/common/event_fifo.c
index 79090c04ca..a57d459cc2 100644
--- a/xen/common/event_fifo.c
+++ b/xen/common/event_fifo.c
@@ -87,38 +87,6 @@ static void evtchn_fifo_init(struct domain *d, struct evtchn *evtchn)
  d->domain_id, evtchn->port);
 }
 
-static struct evtchn_fifo_queue *lock_old_queue(const struct domain *d,
-struct evtchn *evtchn,
-unsigned long *flags)
-{
-struct vcpu *v;
-struct evtchn_fifo_queue *q, *old_q;
-unsigned int try;
-union evtchn_fifo_lastq lastq;
-
-for ( try = 0; try < 3; try++ )
-{
-lastq.raw = read_atomic(&evtchn->fifo_lastq);
-v = d->vcpu[lastq.last_vcpu_id];
-old_q = &v->evtchn_fifo->queue[lastq.last_priority];
-
-spin_lock_irqsave(&old_q->lock, *flags);
-
-v = d->vcpu[lastq.last_vcpu_id];
-q = &v->evtchn_fifo->queue[lastq.last_priority];
-
-if ( old_q == q )
-return old_q;
-
-spin_unlock_irqrestore(&old_q->lock, *flags);
-}
-
-gprintk(XENLOG_WARNING,
-"dom%d port %d lost event (too many queue changes)\n",
-d->domain_id, evtchn->port);
-return NULL;
-}  
-
 static int try_set_link(event_word_t *word, event_word_t *w, uint32_t link)
 {
 event_word_t new, old;
@@ -190,6 +158,9 @@ static void evtchn_fifo_set_pending(struct vcpu *v, struct evtchn *evtchn)
 event_word_t *word;
 unsigned long flags;
 bool_t was_pending;
+struct evtchn_fifo_queue *q, *old_q;
+unsigned int try;
+bool linked = true;
 
 port = evtchn->port;
 word = evtchn_fifo_word_from_port(d, port);
@@ -204,6 +175,48 @@ static void evtchn_fifo_set_pending(struct vcpu *v, struct evtchn *evtchn)
 return;
 }
 
+for ( try = 0; ; try++ )
+{
+union evtchn_fifo_lastq lastq;
+struct vcpu *old_v;
+
+lastq.raw = read_atomic(&evtchn->fifo_lastq);
+old_v = d->vcpu[lastq.last_vcpu_id];
+
+q = &v->evtchn_fifo->queue[evtchn->priority];
+old_q = &old_v->evtchn_fifo->queue[lastq.last_priority];
+
+if ( q <= old_q )
+{
+spin_lock_irqsave(&q->lock, flags);
+if ( q != old_q )
+spin_lock(&old_q->lock);
+}
+else
+{
+spin_lock_irqsave(&old_q->lock, flags);
+spin_lock(&q->lock);
+}
+
+lastq.raw = read_atomic(&evtchn->fifo_lastq);
+old_v = d->vcpu[lastq.last_vcpu_id];
+if ( q == &v->evtchn_fifo->queue[evtchn->priority] &&
+ old_q == &old_v->evtchn_fifo->queue[lastq.last_priority] )
+break;
+
+if ( q != old_q )
+spin_unlock(&old_q->lock);
+spin_unlock_irqrestore(&q->lock, flags);
+
+if ( try == 3 )
+{
+gprintk(XENLOG_WARNING,
+"dom%d port %d lost event (too many queue changes)\n",
+d->domain_id, evtchn->port);
+return;
+}
+}
+
 was_pending = guest_test_and_set_bit(d, EVTCHN_FIFO_PENDING, word);
 
 /*
@@ -212,9 +225,7 @@ static void evtchn_fifo_set_pending(struct vcpu *v, struct evtchn *evtchn)
 if ( !guest_test_bit(d, EVTCHN_FIFO_MASKED, word) &&
  !guest_test_bit(d, EVTCHN_FIFO_LINKED, word) )
 {
-struct evtchn_fifo_queue *q, *old_q;
 event_word_t *tail_word;
-bool_t linked = 0;
 
 /*
  * Control block not mapped.  The guest must not unmask an
@@ -228,22 +239,8 @@ static void evtchn_fifo_set_pending(struct vcpu *v, struct evtchn *evtchn)
 goto done;
 }
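
The "lock both queues" step in patch 3/3 relies on a classic rule: when two locks must be held together, always take them in a fixed global order (here: by address), so two CPUs can never each hold one lock while waiting for the other. A minimal stand-alone sketch, assuming C11 atomic_flag spinlocks as a stand-in for Xen's spin_lock() and ignoring the IRQ save/restore part; struct queue, lock_pair() and unlock_pair() are hypothetical names. The patch compares the queue pointers directly; the sketch casts to uintptr_t to stay within portable C.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical queue with a tiny test-and-set spinlock. */
struct queue {
    atomic_flag lock;
};

static void q_lock(struct queue *q)
{
    while (atomic_flag_test_and_set_explicit(&q->lock, memory_order_acquire))
        ;   /* spin until the holder clears the flag */
}

static void q_unlock(struct queue *q)
{
    atomic_flag_clear_explicit(&q->lock, memory_order_release);
}

/* Lock a and b in ascending address order; lock only once if a == b. */
static void lock_pair(struct queue *a, struct queue *b)
{
    if ((uintptr_t)a > (uintptr_t)b) {
        struct queue *tmp = a;

        a = b;
        b = tmp;
    }

    q_lock(a);
    if (a != b)
        q_lock(b);
}

/* Release both locks; the release order does not matter for correctness. */
static void unlock_pair(struct queue *a, struct queue *b)
{
    if (a != b)
        q_unlock(b);
    q_unlock(a);
}
```

As in the patch, a caller would re-read the state that selected the two queues after taking the locks and retry if it changed in the meantime.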
 

[PATCH v7 0/3] xen/events: further locking adjustments

2020-11-23 Thread Juergen Gross
This is an add-on to my event channel locking series.

It is a resend of the single patch from my V6 series that has not been
applied (which is why this one is named V7), plus two patches
addressing issues Jan identified with the previous approach (one issue
being more of a latent one, while the other has actually existed since
the introduction of fifo events and has just been made more probable
by the new locking scheme).

Juergen Gross (3):
  xen/events: access last_priority and last_vcpu_id together
  xen/events: modify struct evtchn layout
  xen/events: rework fifo queue locking

 xen/common/event_fifo.c | 128 ++--
 xen/include/xen/sched.h |  23 
 2 files changed, 83 insertions(+), 68 deletions(-)

-- 
2.26.2




[PATCH v7 2/3] xen/events: modify struct evtchn layout

2020-11-23 Thread Juergen Gross
In order to avoid latent races when updating an event channel, put the
xen_consumer and pending fields in different bytes.

At the same time move some other fields around to have less implicit
padding and to keep related fields more closely together.

Signed-off-by: Juergen Gross 
---
 xen/include/xen/sched.h | 22 --
 1 file changed, 12 insertions(+), 10 deletions(-)

diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index a345cc01f8..e6d09aa055 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -80,8 +80,7 @@ extern domid_t hardware_domid;
 #define EVTCHNS_PER_GROUP  (BUCKETS_PER_GROUP * EVTCHNS_PER_BUCKET)
 #define NR_EVTCHN_GROUPS   DIV_ROUND_UP(MAX_NR_EVTCHNS, EVTCHNS_PER_GROUP)
 
-#define XEN_CONSUMER_BITS 3
-#define NR_XEN_CONSUMERS ((1 << XEN_CONSUMER_BITS) - 1)
+#define NR_XEN_CONSUMERS 8
 
 struct evtchn
 {
@@ -94,9 +93,10 @@ struct evtchn
 #define ECS_VIRQ 5 /* Channel is bound to a virtual IRQ line.*/
 #define ECS_IPI  6 /* Channel is bound to a virtual IPI line.*/
 u8  state; /* ECS_* */
-u8  xen_consumer:XEN_CONSUMER_BITS; /* Consumer in Xen if nonzero */
-u8  pending:1;
-u16 notify_vcpu_id;/* VCPU for local delivery notification */
+#ifndef NDEBUG
+u8  old_state; /* State when taking lock in write mode. */
+#endif
+u8  xen_consumer;  /* Consumer in Xen if nonzero */
 u32 port;
 union {
 struct {
@@ -113,11 +113,13 @@ struct evtchn
 } pirq;/* state == ECS_PIRQ */
 u16 virq;  /* state == ECS_VIRQ */
 } u;
-u8 priority;
-#ifndef NDEBUG
-u8 old_state;  /* State when taking lock in write mode. */
-#endif
-u32 fifo_lastq;/* Data for fifo events identifying last queue. */
+
+/* FIFO event channels only. */
+u8  pending;
+u8  priority;
+u16 notify_vcpu_id;/* VCPU for local delivery notification */
+u32 fifo_lastq;/* Data for identifying last queue. */
+
 #ifdef CONFIG_XSM
 union {
 #ifdef XSM_NEED_GENERIC_EVTCHN_SSID
-- 
2.26.2
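
Why separate bytes matter in patch 2/3: two bit-fields that share a byte cannot be stored independently, because updating one is a read-modify-write of the byte that also contains the other. A minimal sketch of the two layouts follows; the struct and function names are hypothetical, and the size assertion assumes the usual GCC/Clang bit-field layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Old style: both fields live in the same byte. */
struct packed_flags {
    uint8_t xen_consumer:3;
    uint8_t pending:1;
};

/* New style: each field has a byte of its own. */
struct split_flags {
    uint8_t xen_consumer;
    uint8_t pending;
};

_Static_assert(sizeof(struct packed_flags) == 1,
               "the bit-fields share a single byte");
_Static_assert(offsetof(struct split_flags, pending) !=
               offsetof(struct split_flags, xen_consumer),
               "the split fields occupy distinct bytes");

/*
 * With the split layout this is a plain one-byte store that never reads
 * or rewrites xen_consumer.  With the packed layout the compiler would
 * have to load the shared byte, set one bit and store the byte back, so
 * a concurrent update of xen_consumer could be lost.
 */
static void set_pending(struct split_flags *f)
{
    f->pending = 1;
}
```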




[qemu-mainline test] 156970: regressions - FAIL

2020-11-23 Thread osstest service owner
flight 156970 qemu-mainline real [real]
flight 156976 qemu-mainline real-retest [real]
http://logs.test-lab.xenproject.org/osstest/logs/156970/
http://logs.test-lab.xenproject.org/osstest/logs/156976/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-libvirt-vhd 19 guest-start/debian.repeat fail REGR. vs. 152631
 test-amd64-amd64-xl-qcow2   21 guest-start/debian.repeat fail REGR. vs. 152631
 test-armhf-armhf-xl-vhd 17 guest-start/debian.repeat fail REGR. vs. 152631

Regressions which are regarded as allowable (not blocking):
 test-amd64-amd64-xl-rtds 20 guest-localmigrate/x10   fail REGR. vs. 152631

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stop fail like 152631
 test-amd64-i386-xl-qemuu-win7-amd64 19 guest-stop fail like 152631
 test-armhf-armhf-libvirt-raw 15 saverestore-support-check fail like 152631
 test-armhf-armhf-libvirt 16 saverestore-support-check fail like 152631
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stop fail like 152631
 test-amd64-i386-xl-qemuu-ws16-amd64 19 guest-stop fail like 152631
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 152631
 test-amd64-i386-xl-pvshim 14 guest-start fail never pass
 test-amd64-amd64-libvirt-xsm 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-seattle 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-seattle 16 saverestore-support-check fail never pass
 test-amd64-amd64-libvirt 15 migrate-support-check fail never pass
 test-amd64-i386-libvirt 15 migrate-support-check fail never pass
 test-amd64-i386-libvirt-xsm 15 migrate-support-check fail never pass
 test-amd64-amd64-libvirt-vhd 14 migrate-support-check fail never pass
 test-arm64-arm64-xl-xsm 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-xsm 16 saverestore-support-check fail never pass
 test-arm64-arm64-xl 15 migrate-support-check fail never pass
 test-arm64-arm64-xl 16 saverestore-support-check fail never pass
 test-arm64-arm64-libvirt-xsm 15 migrate-support-check fail never pass
 test-arm64-arm64-libvirt-xsm 16 saverestore-support-check fail never pass
 test-arm64-arm64-xl-credit2 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-credit2 16 saverestore-support-check fail never pass
 test-arm64-arm64-xl-thunderx 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-check fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check fail never pass
 test-armhf-armhf-xl-arndale 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-arndale 16 saverestore-support-check fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check fail never pass
 test-armhf-armhf-xl-cubietruck 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-cubietruck 16 saverestore-support-check fail never pass
 test-armhf-armhf-xl 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-multivcpu 15 migrate-support-check fail never pass
 test-armhf-armhf-xl 16 saverestore-support-check fail never pass
 test-armhf-armhf-xl-multivcpu 16 saverestore-support-check fail never pass
 test-armhf-armhf-xl-vhd 14 migrate-support-check fail never pass
 test-armhf-armhf-xl-vhd 15 saverestore-support-check fail never pass
 test-arm64-arm64-xl-credit1 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-credit1 16 saverestore-support-check fail never pass
 test-armhf-armhf-xl-credit2 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-credit2 16 saverestore-support-check fail never pass
 test-armhf-armhf-libvirt-raw 14 migrate-support-check fail never pass
 test-armhf-armhf-xl-credit1 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-credit1 16 saverestore-support-check fail never pass
 test-armhf-armhf-libvirt 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-rtds 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-rtds 16 saverestore-support-check fail never pass

version targeted for testing:
 qemuu                fb764373eaf7f65fd9e85377736f83aae09817b2
baseline version:
 qemuu                1d806cef0e38b5db8347a8e12f214d543204a314

Last test of basis   152631  2020-08-20 09:07:46 Z   95 days
Failing since        152659  2020-08-21 14:07:39 Z   94 days  201 attempts
Testing same since   156970  2020-11-23 20:07:42 Z    0 days    1 attempts


People who touched 

Re: [PATCH 000/141] Fix fall-through warnings for Clang

2020-11-23 Thread Finn Thain


On Mon, 23 Nov 2020, Joe Perches wrote:

> On Tue, 2020-11-24 at 11:58 +1100, Finn Thain wrote:
> > it's not for me to prove that such patches don't affect code 
> > generation. That's for the patch author and (unfortunately) for 
> > reviewers.
> 
> Ideally, that proof would be provided by the compilation system itself 
> and not patch authors nor reviewers nor maintainers.
> 
> Unfortunately gcc does not guarantee repeatability or deterministic 
> output. To my knowledge, neither does clang.
> 

Yes, I've said the same thing myself. But having attempted it, I now think 
this is a hard problem. YMMV.

https://lore.kernel.org/linux-scsi/alpine.LNX.2.22.394.2004281017310.12@nippy.intranet/
https://lore.kernel.org/linux-scsi/alpine.LNX.2.22.394.2005211358460.8@nippy.intranet/



Re: [PATCH 000/141] Fix fall-through warnings for Clang

2020-11-23 Thread Nick Desaulniers
On Sun, Nov 22, 2020 at 8:17 AM Kees Cook  wrote:
>
> On Fri, Nov 20, 2020 at 11:51:42AM -0800, Jakub Kicinski wrote:
> > If none of the 140 patches here fix a real bug, and there is no change
> > to machine code then it sounds to me like a W=2 kind of a warning.
>
> FWIW, this series has found at least one bug so far:
> https://lore.kernel.org/lkml/CAFCwf11izHF=g1mGry1fE5kvFFFrxzhPSM6qKAO8gxSp=kr...@mail.gmail.com/

So looks like the bulk of these are:
switch (x) {
  case 0:
    ++x;
  default:
    break;
}

I have a patch that fixes those up for clang:
https://reviews.llvm.org/D91895

There's 3 other cases that don't quite match between GCC and Clang I
observe in the kernel:
switch (x) {
  case 0:
    ++x;
  default:
    goto y;
}
y:;

switch (x) {
  case 0:
    ++x;
  default:
    return;
}

switch (x) {
  case 0:
    ++x;
  default:
    ;
}

Based on your link, and Nathan's comment on my patch, maybe Clang
should continue to warn for the above (at least the `default: return;`
case) and GCC should change?  While the last case looks harmless,
there were only 1 or 2 across the tree in my limited configuration
testing; I really think we should just add `break`s for those.
-- 
Thanks,
~Nick Desaulniers
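
For context on what the series adds: the kernel spells a deliberate fall-through as a `fallthrough;` statement, which expands to the compiler's fallthrough attribute where available. A minimal stand-alone sketch follows; the macro approximates the kernel's definition, and classify() is a hypothetical function, not taken from any of the patches.

```c
#include <assert.h>

/* Approximation of the kernel's definition of `fallthrough`. */
#if defined(__has_attribute)
# if __has_attribute(__fallthrough__)
#  define fallthrough __attribute__((__fallthrough__))
# endif
#endif
#ifndef fallthrough
# define fallthrough do {} while (0)   /* fallback: an empty statement */
#endif

/* Hypothetical example: case 0 deliberately shares case 1's code. */
static int classify(int x)
{
    int score = 0;

    switch (x) {
    case 0:
        score += 10;
        fallthrough;    /* annotated, so -Wimplicit-fallthrough stays quiet */
    case 1:
        score += 1;
        break;          /* explicit: do not run into default */
    default:
        score = -1;
        break;
    }

    return score;
}
```

Removing the `fallthrough;` line leaves the behaviour unchanged but makes the fall-through implicit, which is the situation the warning flags.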



Re: [PATCH 000/141] Fix fall-through warnings for Clang

2020-11-23 Thread Finn Thain


On Mon, 23 Nov 2020, Miguel Ojeda wrote:

> On Mon, 23 Nov 2020, Finn Thain wrote:
> 
> > On Sun, 22 Nov 2020, Miguel Ojeda wrote:
> > 
> > > 
> > > It isn't that much effort, isn't it? Plus we need to take into 
> > > account the future mistakes that it might prevent, too.
> > 
> > We should also take into account optimism about future improvements 
> > in tooling.
> > 
> Not sure what you mean here. There is no reliable way to guess what the 
> intention was with a missing fallthrough, even if you parsed whitespace 
> and indentation.
> 

What I meant was that you've used pessimism as if it was fact.

For example, "There is no way to guess what the effect would be if the 
compiler trained programmers to add a knee-jerk 'break' statement to avoid 
a warning".

Moreover, what I meant was that preventing programmer mistakes is a 
problem to be solved by development tools. The idea that retro-fitting new 
language constructs onto mature code is somehow necessary to "prevent 
future mistakes" is entirely questionable.

> > > So even if there were zero problems found so far, it is still a 
> > > positive change.
> > > 
> > 
> > It is if you want to spin it that way.
> > 
> How is that a "spin"? It is a fact that we won't get *implicit* 
> fallthrough mistakes anymore (in particular if we make it a hard error).
> 

Perhaps "handwaving" is a better term?

> > > I would agree if these changes were high risk, though; but they are 
> > > almost trivial.
> > > 
> > 
> > This is trivial:
> > 
> >  case 1:
> >     this();
> > +   fallthrough;
> >  case 2:
> >     that();
> > 
> > But what we inevitably get is changes like this:
> > 
> >  case 3:
> >     this();
> > +   break;
> >  case 4:
> >     hmmm();
> > 
> > Why? Mainly to silence the compiler. Also because the patch author 
> > argued successfully that they had found a theoretical bug, often in 
> > mature code.
> > 
> If someone changes control flow, that is on them. Every kernel developer 
> knows what `break` does.
> 

Sure. And if you put -Wimplicit-fallthrough into the Makefile and if that 
leads to well-intentioned patches that cause regressions, it is partly on 
you.

Have you ever considered the overall cost of the countless 
-Wpresume-incompetence flags?

Perhaps you pay the power bill for a build farm that produces logs that 
no-one reads? Perhaps you've run git bisect, knowing that the compiler 
messages are not interesting? Or compiled software using a language 
that generates impenetrable messages? If so, here's a tip:

# grep CFLAGS /etc/portage/make.conf 
CFLAGS="... -Wno-all -Wno-extra ..."
CXXFLAGS="${CFLAGS}"

Now allow me some pessimism: the hardware upgrades, gigawatt hours and 
wait time attributable to obligatory static analyses are a net loss.

> > But is anyone keeping score of the regressions? If unreported bugs 
> > count, what about unreported regressions?
> > 
> Introducing `fallthrough` does not change semantics. If you are really 
> keen, you can always compare the objects because the generated code 
> shouldn't change.
> 

No, it's not for me to prove that such patches don't affect code 
generation. That's for the patch author and (unfortunately) for reviewers.

> Cheers,
> Miguel
> 



Re: [PATCH 000/141] Fix fall-through warnings for Clang

2020-11-23 Thread Joe Perches
On Tue, 2020-11-24 at 11:58 +1100, Finn Thain wrote:
> it's not for me to prove that such patches don't affect code 
> generation. That's for the patch author and (unfortunately) for reviewers.

Ideally, that proof would be provided by the compilation system itself
and not patch authors nor reviewers nor maintainers.

Unfortunately gcc does not guarantee repeatability or deterministic output.
To my knowledge, neither does clang.





Re: [PATCH v2 4/8] lib: move parse_size_and_unit()

2020-11-23 Thread Andrew Cooper
On 23/10/2020 11:17, Jan Beulich wrote:
> ... into its own CU, to build it into an archive.
>
> Signed-off-by: Jan Beulich 
> ---
>  xen/common/lib.c | 39 --
>  xen/lib/Makefile |  1 +
>  xen/lib/parse-size.c | 50 
>  3 files changed, 51 insertions(+), 39 deletions(-)
>  create mode 100644 xen/lib/parse-size.c

What is the point of turning this into a library?  It isn't a leaf
function (calls simple_strtoull()) and doesn't have any plausible
way of losing all its callers in various configurations (given its
direct use by the cmdline parsing logic).

~Andrew



Re: [PATCH v2 7/8] lib: move bsearch code

2020-11-23 Thread Andrew Cooper
On 23/11/2020 22:49, Julien Grall wrote:
> Hi Jan,
>
> On 19/11/2020 10:27, Jan Beulich wrote:
>> On 18.11.2020 19:09, Julien Grall wrote:
>>> On 23/10/2020 11:19, Jan Beulich wrote:
>>>> --- a/xen/include/xen/compiler.h
>>>> +++ b/xen/include/xen/compiler.h
>>>> @@ -12,6 +12,7 @@
>>>>   #define inline        __inline__
>>>>   #define always_inline __inline__ __attribute__ ((__always_inline__))
>>>> +#define gnu_inline    __inline__ __attribute__ ((__gnu_inline__))
>>>
>>> bsearch() is only used by Arm and I haven't seen anyone so far
>>> complaining about the perf of I/O emulation.
>>>
>>> Therefore, I am not convinced that there is enough justification to
>>> introduce a GNU attribute just for this patch.
>>
>> Please settle this with Andrew: He had asked for the function to
>> become inline. I don't view making it static inline in the header
>> as an option here - if the compiler decides to not inline it, we
>> should not end up with multiple instances in different CUs.
>
> That's the cons of static inline... but then why is it suddenly a
> problem with this helper?
>
>> And
>> without making it static inline the attribute needs adding; at
>> least I'm unaware of an alternative which works with the various
>> compiler versions.
>
> The question we have to answer is: What is the gain with this approach?

Substantial.

>
> If it is not quantifiable, then introducing compiler specific
> attribute is not an option.
>
> IIRC, there are only two callers (all in Arm code) of this function.
> Even inlined, I don't believe you would drastically reduce the number
> of instructions compared to a full blown version. To be generous, I
> would say you may save ~20 instructions per copy.
>
> Therefore, so far, the compiler specific attribute doesn't look
> justified to me. As usual, I am happy to be proven wrong

There is a very good reason why this is the classic example used for
extern inline's in various libc's.

The gains are from the compiler being able to optimise away the function
pointer(s) entirely.  Instead of working on opaque objects, it can see
the accesses directly, implement compares as straight array reads, (for
sorting, the swap() call turns into memcpy()) and because it can see all
the memory accesses, doesn't have to assume that every call to cmp()
modifies arbitrary data in the array (i.e. doesn't have to reload the
objects from memory every iteration).

extern inline allows the compiler full flexibility to judge whether
inlining is a net win, based on optimisation settings and observing what
the practical memory access pattern would be from not inlining.

extern inline is the appropriate thing to use here, except for the big
note in the GCC manual saying "always use gnu_inline in this case" which
appears to be working around a change in the C99 standard which forces
any non-static inline to emit a body even when it's not called, due to
rules about global symbols.
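
To illustrate the optimisation argument: the sketch below keeps the body of a generic binary search visible in the caller's translation unit, so the compiler is free to inline the comparator and turn the calls through the function pointer into direct array compares. It is not the Xen code. It uses plain `static inline` instead of the `extern inline` plus gnu_inline combination discussed here, purely so that it links at any optimisation level, and all function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Generic binary search whose body is visible to every caller. */
static inline void *bsearch_inline(const void *key, const void *base,
                                   size_t num, size_t size,
                                   int (*cmp)(const void *, const void *))
{
    size_t lo = 0, hi = num;

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        const char *elem = (const char *)base + mid * size;
        int r = cmp(key, elem);

        if (r == 0)
            return (void *)elem;
        if (r < 0)
            hi = mid;
        else
            lo = mid + 1;
    }

    return NULL;
}

static int cmp_ulong(const void *a, const void *b)
{
    unsigned long x = *(const unsigned long *)a;
    unsigned long y = *(const unsigned long *)b;

    return (x > y) - (x < y);
}

/*
 * Hypothetical caller.  Both bsearch_inline() and cmp_ulong() are visible
 * here, so an optimising compiler may reduce the search to a loop of
 * direct loads and integer compares on arr[].
 */
static long find_index(const unsigned long *arr, size_t n, unsigned long key)
{
    const unsigned long *hit =
        bsearch_inline(&key, arr, n, sizeof(*arr), cmp_ulong);

    return hit ? (long)(hit - arr) : -1;
}
```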

Therefore, Reviewed-by: Andrew Cooper 

Some further observations:

For arch/arm/io.c, the handlers are sorted, so find_mmio_handler() will
be O(lg n), but it will surely be faster with the inlined version, and
this is the fastpath.

register_mmio_handler() OTOH is massively expensive, because sort()
turns the array into a heap and back into an array on every insertion,
just to insert an entry into an already sorted array.  It would be more
efficient to library-fy the work I did for VT-x MSR load/save lists
(again, extern inline) and reuse
"insert_$FOO_into_sorted_list_of_FOOs()" which is a search, single
memmove() to make a gap, and a memcpy() into place.
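A minimal sketch of that "search, memmove(), memcpy()" insertion, assuming a fixed-capacity array with room for one more element. `struct entry` and `insert_sorted()` are hypothetical names for illustration, not the VT-x MSR list code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct entry {
    unsigned long key;
    unsigned long val;
};

/*
 * Insert *ent into the sorted array tbl[0..*nr), keeping it sorted by key.
 * The caller guarantees room for one more element (*nr < cap).
 * Returns the index at which the entry was placed.
 */
static inline size_t insert_sorted(struct entry *tbl, size_t *nr, size_t cap,
                                   const struct entry *ent)
{
    size_t lo = 0, hi = *nr;

    assert(*nr < cap);

    /* Binary search for the first element with a key greater than ent's. */
    while ( lo < hi )
    {
        size_t mid = lo + (hi - lo) / 2;

        if ( tbl[mid].key <= ent->key )
            lo = mid + 1;
        else
            hi = mid;
    }

    /* One memmove() to open a gap, one memcpy() to fill it. */
    memmove(&tbl[lo + 1], &tbl[lo], (*nr - lo) * sizeof(*tbl));
    memcpy(&tbl[lo], ent, sizeof(*tbl));
    ++*nr;

    return lo;
}
```

Each insertion costs O(lg n) compares plus one block move, and the array is never turned into a heap and back.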

When you compile io.c with this patch in place, the delta is:

add/remove: 0/1 grow/shrink: 1/0 up/down: 92/-164 (-72)
Function                                     old     new   delta
try_handle_mmio                              720     812     +92
bsearch                                      164       -    -164
Total: Before=992489, After=992417, chg -0.01%

The reason cmp_mmio_handler (140 bytes) doesn't drop out is because it
is referenced by register_mmio_handler()'s call to sort().  All in all,
the inlined version is less than 1/3 the size of the out-of-lined
version, but I haven't characterised it further than that.


On a totally separate point, I wonder if we'd be better off compiling
with -fgnu89-inline because I can't see any case where we'd want the C99
inline semantics anywhere in Xen.

~Andrew



Re: [PATCH RFC 4/6] xen/arm: mm: Allow other mapping size in xen_pt_update_entry()

2020-11-23 Thread Stefano Stabellini
On Mon, 23 Nov 2020, Julien Grall wrote:
> Hi Stefano,
> 
> On 23/11/2020 22:27, Stefano Stabellini wrote:
> > On Fri, 20 Nov 2020, Julien Grall wrote:
> > > > >/*
> > > > > * For arm32, page-tables are different on each CPUs. Yet, they
> > > > > share
> > > > > @@ -1265,14 +1287,43 @@ static int xen_pt_update(unsigned long virt,
> > > > >  spin_lock(_pt_lock);
> > > > >-for ( ; addr < addr_end; addr += PAGE_SIZE )
> > > > > +while ( left )
> > > > >{
> > > > > -rc = xen_pt_update_entry(root, addr, mfn, flags);
> > > > > +unsigned int order;
> > > > > +unsigned long mask;
> > > > > +
> > > > > +/*
> > > > > + * Don't take into account the MFN when removing mapping (i.e
> > > > > + * MFN_INVALID) to calculate the correct target order.
> > > > > + *
> > > > > + * XXX: Support superpage mappings if nr is not aligned to a
> > > > > + * superpage size.
> > > > 
> > > > It would be good to add another sentence to explain that the checks
> > > > below are simply based on masks and rely on the mfn, vfn, and also
> > > > nr_mfn to be superpage aligned. (It took me some time to figure it out.)
> > > 
> > > I am not sure to understand what you wrote here. Could you suggest a
> > > sentence?
> > 
> > Something like the following:
> > 
> > /*
> >   * Don't take into account the MFN when removing mapping (i.e
> >   * MFN_INVALID) to calculate the correct target order.
> >   *
> >   * This loop relies on mfn, vfn, and nr_mfn, to be all superpage
> >   * aligned, and it uses `mask' to check for that.
> 
> Unfortunately, I am still not sure to understand this comment.
> The loop can deal with any (super)page size (4KB, 2MB, 1GB). There is no
> assumption on any alignment for mfn, vfn and nr_mfn.
> 
> By OR-ing the 3 components together, we can use it to find out the maximum
> size that can be used for the mapping.
> 
> So can you clarify what you mean?

In pseudo-code:

  mask = mfn | vfn | nr_mfns;
  if (mask & ((1<

> >   *
> >   * XXX: Support superpage mappings if nr_mfn is not aligned to a
> >   * superpage size.
> >   */
> > 
> > 
> > > Regarding the TODO itself, we have the exact same one in the P2M code. I
> > > couldn't find a clever way to deal with it yet. Any idea how this could be
> > > solved?
> >   I was thinking of a loop that starts with the highest possible superpage
> > size that virt and mfn are aligned to, and also smaller or equal to
> > nr_mfn. So rather than using the mask to also make sure nr_mfns is
> > aligned, I would only use the mask to check that mfn and virt are
> > aligned. Then, we only need to check that superpage_size <= left.
> > 
> > Concrete example: virt and mfn are 2MB aligned, nr_mfn is 5MB / 1280 4K
> > pages. We allocate 2MB superpages until only 1MB is left. At that point
> > superpage_size <= left fails and we go down to 4K allocations.
> > 
> > Would that work?
> 
> Unfortunately no, AFAICT, your assumption is that vfn/mfn are originally
> aligned to the highest possible superpage size. There are situations where
> this is not the case.

Yes, I was assuming that vfn/mfn are originally aligned to the highest
possible superpage size. It is more difficult without that assumption
:-)


> To give a concrete example, at the moment the RAM is mapped using 1GB
> superpage in Xen. But in the future, we will only want to map RAM regions in
> the directmap that haven't been marked as reserved [1].
> 
> Those reserved regions don't have architectural alignment or placement.
> 
> I will use an over-exaggerated example (or maybe not :)).
> 
> Imagine you have 4GB of RAM starting at 0. The HW/Software engineer decided to
> place a 2MB reserved region starting at 512MB.
> 
> As a result we would want to map two RAM regions:
>1) 0 to 512MB
>2) 514MB to 4GB
> 
> I will only focus on 2). In the ideal situation, we would want to map
>a) 514MB to 1GB using 2MB superpage
>b) 1GB to 4GB using 1GB superpage
> 
> We don't want to use 2MB superpages throughout because this would increase TLB
> pressure (we want to avoid Xen using too many TLB entries) and also increase
> the size of the page-tables.
> 
> Therefore, we want to select the best size for each iteration. For now, the
> only solution I can come up with is to OR vfn/mfn and then use a series of
> checks to compare the mask and nr_mfn.

Yeah, that's more or less what I was imagining too. Maybe we could use
ffs and friends to avoid or simplify some of those checks.
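As a rough sketch of what "ffs and friends" could look like: the lowest set bit of `vfn | mfn` gives the alignment, and the highest set bit of `left` gives the largest power-of-two chunk that still fits. This assumes a 4KB granule and the GCC/Clang builtins `__builtin_ctzl()` and `__builtin_clzl()`; `pick_order()` and the `ORDER_*` names are hypothetical and not taken from the patch.

```c
#include <assert.h>

/* Orders (in 4KB pages) of the mapping sizes available with a 4KB granule. */
#define ORDER_4K     0
#define ORDER_2M     9
#define ORDER_1G     18

/*
 * Pick the largest mapping order usable for the next chunk.
 *
 * vfn/mfn: frame numbers of the next page to map
 * left:    number of 4KB pages still to map (must be non-zero)
 */
static unsigned int pick_order(unsigned long vfn, unsigned long mfn,
                               unsigned long left)
{
    unsigned long mask = vfn | mfn;
    unsigned int align, fit;

    assert(left != 0);

    /* Alignment of both addresses: index of the lowest set bit of the mask. */
    align = mask ? (unsigned int)__builtin_ctzl(mask) : ORDER_1G;

    /* Largest power-of-two number of pages that still fits in 'left'. */
    fit = (unsigned int)(sizeof(left) * 8 - 1) -
          (unsigned int)__builtin_clzl(left);

    if ( align > fit )
        align = fit;

    /* Round down to an order the page-tables can express. */
    if ( align >= ORDER_1G )
        return ORDER_1G;
    if ( align >= ORDER_2M )
        return ORDER_2M;
    return ORDER_4K;
}
```

Called in a loop that advances vfn, mfn and left by `1UL << order`, this maps Julien's 514MB to 4GB region as 255 mappings of 2MB followed by 3 of 1GB, and the earlier 5MB example as two 2MB mappings followed by 256 4KB ones.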


> In addition to the "classic" mappings (i.e. 4KB, 2MB, 1GB), I would like to
> explore contiguous mappings (e.g. 64KB, 32MB) to further reduce the TLB
> pressure. Note that a processor may or may not take advantage of contiguous
> mappings to reduce the number of TLB entries used.
> 
> This will unfortunately increase the number of checks. I will try to come up
> with a patch and we can discuss from there.

OK



Re: [PATCH v2 3/3] ns16550: drop stray "#ifdef CONFIG_HAS_PCI"

2020-11-23 Thread Stefano Stabellini
On Mon, 23 Nov 2020, Jan Beulich wrote:
> There's no point wrapping the function invocation when
> - the function body is already suitably wrapped,
> - the function itself is unconditionally available.
> 
> Reported-by: Julien Grall 
> Signed-off-by: Jan Beulich 

Reviewed-by: Stefano Stabellini 


> --- a/xen/drivers/char/ns16550.c
> +++ b/xen/drivers/char/ns16550.c
> @@ -662,9 +662,7 @@ static int __init check_existence(struct
>  return 1; /* Everything is MMIO */
>  #endif
>  
> -#ifdef CONFIG_HAS_PCI
>  pci_serial_early_init(uart);
> -#endif
>  
>  /*
>   * Do a simple existence test first; if we fail this,
> 



Re: [PATCH v2 2/3] ns16550: "com=" command line options are x86-specific

2020-11-23 Thread Stefano Stabellini
On Mon, 23 Nov 2020, Jan Beulich wrote:
> Pure code motion (plus the addition of "#ifdef CONFIG_X86"); no
> functional change intended.
> 
> Reported-by: Julien Grall 
> Signed-off-by: Jan Beulich 

Great cleanup

Reviewed-by: Stefano Stabellini 


> ---
> v2: Re-base over new earlier patch.
> 
> --- a/docs/misc/xen-command-line.pandoc
> +++ b/docs/misc/xen-command-line.pandoc
> @@ -318,8 +318,8 @@ Interrupts.  Specifying zero disables CM
>  Flag to indicate whether to probe for a CMOS Real Time Clock irrespective of
>  ACPI indicating none to be there.
>  
> -### com1
> -### com2
> +### com1 (x86)
> +### com2 (x86)
>  > `= 
> [/][,[DPS][,[|pci|amt][,[|msi][,[][,[]]`
>  
>  Both option `com1` and `com2` follow the same format.
> --- a/xen/drivers/char/ns16550.c
> +++ b/xen/drivers/char/ns16550.c
> @@ -31,38 +31,6 @@
>  #include 
>  #endif
>  
> -/*
> - * Configure serial port with a string:
> - *   
> [/][,DPS[,[,[,[,].
> - * The tail of the string can be omitted if platform defaults are sufficient.
> - * If the baud rate is pre-configured, perhaps by a bootloader, then 'auto'
> - * can be specified in place of a numeric baud rate. Polled mode is specified
> - * by requesting irq 0.
> - */
> -static char __initdata opt_com1[128] = "";
> -static char __initdata opt_com2[128] = "";
> -string_param("com1", opt_com1);
> -string_param("com2", opt_com2);
> -
> -enum serial_param_type {
> -baud,
> -clock_hz,
> -data_bits,
> -io_base,
> -irq,
> -parity,
> -reg_shift,
> -reg_width,
> -stop_bits,
> -#ifdef CONFIG_HAS_PCI
> -bridge_bdf,
> -device,
> -port_bdf,
> -#endif
> -/* List all parameters before this line. */
> -num_serial_params
> -};
> -
>  static struct ns16550 {
>  int baud, clock_hz, data_bits, parity, stop_bits, fifo_size, irq;
>  u64 io_base;   /* I/O port or memory-mapped I/O address. */
> @@ -98,32 +66,6 @@ static struct ns16550 {
>  #endif
>  } ns16550_com[2] = { { 0 } };
>  
> -struct serial_param_var {
> -char name[12];
> -enum serial_param_type type;
> -};
> -
> -/*
> - * Enum struct keeping a table of all accepted parameter names for parsing
> - * com_console_options for serial port com1 and com2.
> - */
> -static const struct serial_param_var __initconst sp_vars[] = {
> -{"baud", baud},
> -{"clock-hz", clock_hz},
> -{"data-bits", data_bits},
> -{"io-base", io_base},
> -{"irq", irq},
> -{"parity", parity},
> -{"reg-shift", reg_shift},
> -{"reg-width", reg_width},
> -{"stop-bits", stop_bits},
> -#ifdef CONFIG_HAS_PCI
> -{"bridge", bridge_bdf},
> -{"dev", device},
> -{"port", port_bdf},
> -#endif
> -};
> -
>  #ifdef CONFIG_HAS_PCI
>  struct ns16550_config {
>  u16 vendor_id;
> @@ -674,6 +616,19 @@ static struct uart_driver __read_mostly
>  #endif
>  };
>  
> +static void ns16550_init_common(struct ns16550 *uart)
> +{
> +uart->clock_hz  = UART_CLOCK_HZ;
> +
> +/* Default is no transmit FIFO. */
> +uart->fifo_size = 1;
> +
> +/* Default lsr_mask = UART_LSR_THRE */
> +uart->lsr_mask  = UART_LSR_THRE;
> +}
> +
> +#ifdef CONFIG_X86
> +
>  static int __init parse_parity_char(int c)
>  {
>  switch ( c )
> @@ -1217,6 +1172,64 @@ pci_uart_config(struct ns16550 *uart, bo
>  #endif /* CONFIG_HAS_PCI */
>  
>  /*
> + * Configure serial port with a string:
> + *   
> [/][,DPS[,[,[,[,].
> + * The tail of the string can be omitted if platform defaults are sufficient.
> + * If the baud rate is pre-configured, perhaps by a bootloader, then 'auto'
> + * can be specified in place of a numeric baud rate. Polled mode is specified
> + * by requesting irq 0.
> + */
> +static char __initdata opt_com1[128] = "";
> +static char __initdata opt_com2[128] = "";
> +string_param("com1", opt_com1);
> +string_param("com2", opt_com2);
> +
> +enum serial_param_type {
> +baud,
> +clock_hz,
> +data_bits,
> +io_base,
> +irq,
> +parity,
> +reg_shift,
> +reg_width,
> +stop_bits,
> +#ifdef CONFIG_HAS_PCI
> +bridge_bdf,
> +device,
> +port_bdf,
> +#endif
> +/* List all parameters before this line. */
> +num_serial_params
> +};
> +
> +struct serial_param_var {
> +char name[12];
> +enum serial_param_type type;
> +};
> +
> +/*
> + * Enum struct keeping a table of all accepted parameter names for parsing
> + * com_console_options for serial port com1 and com2.
> + */
> +static const struct serial_param_var __initconst sp_vars[] = {
> +{"baud", baud},
> +{"clock-hz", clock_hz},
> +{"data-bits", data_bits},
> +{"io-base", io_base},
> +{"irq", irq},
> +{"parity", parity},
> +{"reg-shift", reg_shift},
> +{"reg-width", reg_width},
> +{"stop-bits", stop_bits},
> +#ifdef CONFIG_HAS_PCI
> +{"bridge", bridge_bdf},
> +{"dev", device},
> +{"port", port_bdf},
> +#endif
> +};
> +
> +/*
>   * Used to parse name value pairs and return which value it is along with
>   * pointer for the 

Re: [PATCH v2 1/3] ns16550: move PCI arrays next to the function using them

2020-11-23 Thread Stefano Stabellini
On Mon, 23 Nov 2020, Jan Beulich wrote:
> Pure code motion; no functional change intended.
> 
> Signed-off-by: Jan Beulich 

Reviewed-by: Stefano Stabellini 


> ---
> v2: New.
> 
> --- a/xen/drivers/char/ns16550.c
> +++ b/xen/drivers/char/ns16550.c
> @@ -153,312 +153,6 @@ struct ns16550_config_param {
>  unsigned int uart_offset;
>  unsigned int first_offset;
>  };
> -
> -/*
> - * Create lookup tables for specific devices. It is assumed that if
> - * the device found is MMIO, then you have indexed it here. Else, the
> - * driver does nothing for MMIO based devices.
> - */
> -static const struct ns16550_config_param __initconst uart_param[] = {
> -[param_default] = {
> -.reg_width = 1,
> -.lsr_mask = UART_LSR_THRE,
> -.max_ports = 1,
> -},
> -[param_trumanage] = {
> -.reg_shift = 2,
> -.reg_width = 1,
> -.fifo_size = 16,
> -.lsr_mask = (UART_LSR_THRE | UART_LSR_TEMT),
> -.mmio = 1,
> -.max_ports = 1,
> -},
> -[param_oxford] = {
> -.base_baud = 400,
> -.uart_offset = 0x200,
> -.first_offset = 0x1000,
> -.reg_width = 1,
> -.fifo_size = 16,
> -.lsr_mask = UART_LSR_THRE,
> -.mmio = 1,
> -.max_ports = 1, /* It can do more, but we would need more custom 
> code.*/
> -},
> -[param_oxford_2port] = {
> -.base_baud = 400,
> -.uart_offset = 0x200,
> -.first_offset = 0x1000,
> -.reg_width = 1,
> -.fifo_size = 16,
> -.lsr_mask = UART_LSR_THRE,
> -.mmio = 1,
> -.max_ports = 2,
> -},
> -[param_pericom_1port] = {
> -.base_baud = 921600,
> -.uart_offset = 8,
> -.reg_width = 1,
> -.fifo_size = 16,
> -.lsr_mask = UART_LSR_THRE,
> -.bar0 = 1,
> -.max_ports = 1,
> -},
> -[param_pericom_2port] = {
> -.base_baud = 921600,
> -.uart_offset = 8,
> -.reg_width = 1,
> -.fifo_size = 16,
> -.lsr_mask = UART_LSR_THRE,
> -.bar0 = 1,
> -.max_ports = 2,
> -},
> -/*
> - * Of the two following ones, we can't really use all of their ports,
> - * unless ns16550_com[] would get grown.
> - */
> -[param_pericom_4port] = {
> -.base_baud = 921600,
> -.uart_offset = 8,
> -.reg_width = 1,
> -.fifo_size = 16,
> -.lsr_mask = UART_LSR_THRE,
> -.bar0 = 1,
> -.max_ports = 4,
> -},
> -[param_pericom_8port] = {
> -.base_baud = 921600,
> -.uart_offset = 8,
> -.reg_width = 1,
> -.fifo_size = 16,
> -.lsr_mask = UART_LSR_THRE,
> -.bar0 = 1,
> -.max_ports = 8,
> -}
> -};
> -static const struct ns16550_config __initconst uart_config[] =
> -{
> -/* Broadcom TruManage device */
> -{
> -.vendor_id = PCI_VENDOR_ID_BROADCOM,
> -.dev_id = 0x160a,
> -.param = param_trumanage,
> -},
> -/* OXPCIe952 1 Native UART  */
> -{
> -.vendor_id = PCI_VENDOR_ID_OXSEMI,
> -.dev_id = 0xc11b,
> -.param = param_oxford,
> -},
> -/* OXPCIe952 1 Native UART  */
> -{
> -.vendor_id = PCI_VENDOR_ID_OXSEMI,
> -.dev_id = 0xc11f,
> -.param = param_oxford,
> -},
> -/* OXPCIe952 1 Native UART  */
> -{
> -.vendor_id = PCI_VENDOR_ID_OXSEMI,
> -.dev_id = 0xc138,
> -.param = param_oxford,
> -},
> -/* OXPCIe952 2 Native UART  */
> -{
> -.vendor_id = PCI_VENDOR_ID_OXSEMI,
> -.dev_id = 0xc158,
> -.param = param_oxford_2port,
> -},
> -/* OXPCIe952 1 Native UART  */
> -{
> -.vendor_id = PCI_VENDOR_ID_OXSEMI,
> -.dev_id = 0xc13d,
> -.param = param_oxford,
> -},
> -/* OXPCIe952 2 Native UART  */
> -{
> -.vendor_id = PCI_VENDOR_ID_OXSEMI,
> -.dev_id = 0xc15d,
> -.param = param_oxford_2port,
> -},
> -/* OXPCIe952 1 Native UART  */
> -{
> -.vendor_id = PCI_VENDOR_ID_OXSEMI,
> -.dev_id = 0xc40b,
> -.param = param_oxford,
> -},
> -/* OXPCIe200 1 Native UART */
> -{
> -.vendor_id = PCI_VENDOR_ID_OXSEMI,
> -.dev_id = 0xc40f,
> -.param = param_oxford,
> -},
> -/* OXPCIe200 1 Native UART  */
> -{
> -.vendor_id = PCI_VENDOR_ID_OXSEMI,
> -.dev_id = 0xc41b,
> -.param = param_oxford,
> -},
> -/* OXPCIe200 1 Native UART  */
> -{
> -.vendor_id = PCI_VENDOR_ID_OXSEMI,
> -.dev_id = 0xc41f,
> -.param = param_oxford,
> -},
> -/* OXPCIe200 1 Native UART  */
> -{
> -.vendor_id = PCI_VENDOR_ID_OXSEMI,
> -.dev_id = 0xc42b,
> -.param = param_oxford,
> -},
> -/* OXPCIe200 1 Native UART  */
> -{
> -.vendor_id = PCI_VENDOR_ID_OXSEMI,
> -.dev_id = 0xc42f,
> -.param = 

Re: [PATCH RFC 4/6] xen/arm: mm: Allow other mapping size in xen_pt_update_entry()

2020-11-23 Thread Julien Grall

Hi Stefano,

On 23/11/2020 22:27, Stefano Stabellini wrote:

On Fri, 20 Nov 2020, Julien Grall wrote:

   /*
* For arm32, page-tables are different on each CPUs. Yet, they
share
@@ -1265,14 +1287,43 @@ static int xen_pt_update(unsigned long virt,
 spin_lock(_pt_lock);
   -for ( ; addr < addr_end; addr += PAGE_SIZE )
+while ( left )
   {
-rc = xen_pt_update_entry(root, addr, mfn, flags);
+unsigned int order;
+unsigned long mask;
+
+/*
+ * Don't take into account the MFN when removing mapping (i.e
+ * MFN_INVALID) to calculate the correct target order.
+ *
+ * XXX: Support superpage mappings if nr is not aligned to a
+ * superpage size.


It would be good to add another sentence to explain that the checks
below are simply based on masks and rely on the mfn, vfn, and also
nr_mfn to be superpage aligned. (It took me some time to figure it out.)


I am not sure to understand what you wrote here. Could you suggest a sentence?


Something like the following:

/*
  * Don't take into account the MFN when removing mapping (i.e
  * MFN_INVALID) to calculate the correct target order.
  *
  * This loop relies on mfn, vfn, and nr_mfn, to be all superpage
  * aligned, and it uses `mask' to check for that.


Unfortunately, I am still not sure to understand this comment.
The loop can deal with any (super)page size (4KB, 2MB, 1GB). There is 
no assumption on any alignment for mfn, vfn and nr_mfn.


By OR-ing the 3 components together, we can use it to find out the 
maximum size that can be used for the mapping.


So can you clarify what you mean?


  *
  * XXX: Support superpage mappings if nr_mfn is not aligned to a
  * superpage size.
  */



Regarding the TODO itself, we have the exact same one in the P2M code. I
couldn't find a clever way to deal with it yet. Any idea how this could be
solved?
  
I was thinking of a loop that starts with the highest possible superpage
size that virt and mfn are aligned to, and also smaller or equal to
nr_mfn. So rather than using the mask to also make sure nr_mfns is
aligned, I would only use the mask to check that mfn and virt are
aligned. Then, we only need to check that superpage_size <= left.

Concrete example: virt and mfn are 2MB aligned, nr_mfn is 5MB / 1280 4K
pages. We allocate 2MB superpages until only 1MB is left. At that point
superpage_size <= left fails and we go down to 4K allocations.

Would that work?


Unfortunately no, AFAICT, your assumption is that vfn/mfn are originally 
aligned to the highest possible superpage size. There are situations where 
this is not the case.


To give a concrete example, at the moment the RAM is mapped using 1GB 
superpage in Xen. But in the future, we will only want to map RAM 
regions in the directmap that haven't been marked as reserved [1].


Those reserved regions don't have architectural alignment or placement.

I will use an over-exaggerated example (or maybe not :)).

Imagine you have 4GB of RAM starting at 0. The HW/Software engineer 
decided to place a 2MB reserved region starting at 512MB.


As a result we would want to map two RAM regions:
   1) 0 to 512MB
   2) 514MB to 4GB

I will only focus on 2). In the ideal situation, we would want to map
   a) 514MB to 1GB using 2MB superpage
   b) 1GB to 4GB using 1GB superpage

We don't want to use 2MB superpages throughout because this would increase 
TLB pressure (we want to avoid Xen using too many TLB entries) and also 
increase the size of the page-tables.


Therefore, we want to select the best size for each iteration. For now, 
the only solution I can come up with is to OR vfn/mfn and then use a 
series of checks to compare the mask and nr_mfn.


In addition to the "classic" mappings (i.e. 4KB, 2MB, 1GB), I would like 
to explore contiguous mappings (e.g. 64KB, 32MB) to further reduce the 
TLB pressure. Note that a processor may or may not take advantage of 
contiguous mappings to reduce the number of TLB entries used.


This will unfortunately increase the number of checks. I will try to 
come up with a patch and we can discuss from there.
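One way to keep the growing number of checks manageable is a table of candidate sizes walked from largest to smallest. The following is only a sketch under stated assumptions: a 4KB granule, with the contiguous hint covering 16 adjacent entries (hence 64KB and 32MB). `orders[]` and `pick_order_contig()` are hypothetical names, and this is not the patch referred to above.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Candidate mapping orders (in 4KB pages), largest first.  The 64KB and
 * 32MB entries correspond to 16 contiguous 4KB or 2MB entries.
 */
static const unsigned int orders[] = {
    18, /* 1GB */
    13, /* 32MB (contiguous 2MB entries) */
    9,  /* 2MB */
    4,  /* 64KB (contiguous 4KB entries) */
    0,  /* 4KB */
};

static unsigned int pick_order_contig(unsigned long vfn, unsigned long mfn,
                                      unsigned long left)
{
    unsigned long mask = vfn | mfn;
    size_t i;

    assert(left != 0);

    for ( i = 0; i < sizeof(orders) / sizeof(orders[0]); i++ )
    {
        unsigned long nr = 1UL << orders[i];

        /* Both addresses aligned to this size, and enough pages left. */
        if ( !(mask & (nr - 1)) && left >= nr )
            return orders[i];
    }

    return 0; /* Not reached: order 0 always matches when left != 0. */
}
```

For region 2) of the example (514MB to 4GB, identity-mapped), a loop around this helper selects 15 mappings of 2MB up to the 544MB boundary, 15 of 32MB up to 1GB, and then 3 of 1GB.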


Cheers,

[1] Reserved region may be marked as uncacheable and therefore we 
shouldn't map them in Xen address space to avoid break cache coherency.


--
Julien Grall



Re: [PATCH 058/141] xen-blkfront: Fix fall-through warnings for Clang

2020-11-23 Thread Gustavo A. R. Silva
On Fri, Nov 20, 2020 at 04:36:26PM -0500, boris.ostrov...@oracle.com wrote:
> 
> On 11/20/20 1:32 PM, Gustavo A. R. Silva wrote:
> > In preparation to enable -Wimplicit-fallthrough for Clang, fix a warning
> > by explicitly adding a break statement instead of letting the code fall
> > through to the next case.
> >
> > Link: https://github.com/KSPP/linux/issues/115
> > Signed-off-by: Gustavo A. R. Silva 
> > ---
> >  drivers/block/xen-blkfront.c | 1 +
> >  1 file changed, 1 insertion(+)
> >
> > diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c
> > index 48629d3433b4..34b028be78ab 100644
> > --- a/drivers/block/xen-blkfront.c
> > +++ b/drivers/block/xen-blkfront.c
> > @@ -2462,6 +2462,7 @@ static void blkback_changed(struct xenbus_device *dev,
> > break;
> > if (talk_to_blkback(dev, info))
> > break;
> > +   break;
> > case XenbusStateInitialising:
> > case XenbusStateInitialised:
> > case XenbusStateReconfiguring:
> 
> 
> Reviewed-by: Boris Ostrovsky 
> 
> 
> (for patch 138 as well)

Thank you for both reviews, Boris.

> Although I thought using 'fallthrough' attribute was the more common approach.

I've got it. I will consider that for a future patch.
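For reference, a small sketch contrasting the two approaches: an explicit break where nothing more should happen, and an annotated fall-through where continuing into the next case is intended. The macro below mirrors the idea of the kernel's `fallthrough` pseudo-keyword; `enum state` and `handle_state()` are hypothetical and unrelated to xen-blkfront.

```c
#include <assert.h>

/* Use the statement attribute where the compiler provides it. */
#if defined(__has_attribute)
# if __has_attribute(__fallthrough__)
#  define fallthrough __attribute__((__fallthrough__))
# endif
#endif
#ifndef fallthrough
# define fallthrough do {} while (0) /* fallthrough */
#endif

enum state { ST_CONNECTED, ST_INITIALISING, ST_CLOSING, ST_CLOSED };

/* Returns a bitmask of the actions taken, so the control flow is visible. */
static unsigned int handle_state(enum state s)
{
    unsigned int actions = 0;

    switch ( s )
    {
    case ST_CONNECTED:
        actions |= 1u << 0;   /* connect */
        break;                /* explicit: nothing more to do */

    case ST_INITIALISING:
        break;

    case ST_CLOSING:
        actions |= 1u << 1;   /* start closing ... */
        fallthrough;          /* ... and deliberately continue into CLOSED */

    case ST_CLOSED:
        actions |= 1u << 2;   /* release resources */
        break;

    default:
        assert(!"unknown state");
        break;
    }

    return actions;
}
```

With -Wimplicit-fallthrough enabled, both forms are accepted without a warning; which one is right depends on whether the next case is meant to run.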

Thanks
--
Gustavo



Re: [PATCH v2 7/8] lib: move bsearch code

2020-11-23 Thread Julien Grall

Hi Jan,

On 19/11/2020 10:27, Jan Beulich wrote:

On 18.11.2020 19:09, Julien Grall wrote:

On 23/10/2020 11:19, Jan Beulich wrote:

--- a/xen/include/xen/compiler.h
+++ b/xen/include/xen/compiler.h
@@ -12,6 +12,7 @@
   
   #define inline        __inline__
   #define always_inline __inline__ __attribute__ ((__always_inline__))
+#define gnu_inline    __inline__ __attribute__ ((__gnu_inline__))


bsearch() is only used by Arm and I haven't seen anyone so far
complaining about the perf of I/O emulation.

Therefore, I am not convinced that there is enough justification to
introduce a GNU attribute just for this patch.


Please settle this with Andrew: He had asked for the function to
become inline. I don't view making it static inline in the header
as an option here - if the compiler decides to not inline it, we
should not end up with multiple instances in different CUs.


That's the cons of static inline... but then why is it suddenly a 
problem with this helper?



And without making it static inline the attribute needs adding; at
least I'm unaware of an alternative which works with the various
compiler versions.


The question we have to answer is: What is the gain with this approach?

If it is not quantifiable, then introducing compiler specific attribute 
is not an option.


IIRC, there are only two callers (all in Arm code) of this function. 
Even inlined, I don't believe you would drastically reduce the number of 
instructions compared to a full-blown version. To be generous, I would 
say you may save ~20 instructions per copy.


Therefore, so far, the compiler specific attribute doesn't look 
justified to me. As usual, I am happy to be proven wrong.


Cheers,

--
Julien Grall



Re: [PATCH v2] xen: EXPERT clean-up and introduce UNSUPPORTED

2020-11-23 Thread Stefano Stabellini
On Fri, 20 Nov 2020, Jan Beulich wrote:
> On 19.11.2020 22:40, Stefano Stabellini wrote:
> > On Thu, 19 Nov 2020, Jan Beulich wrote:
> >> On 18.11.2020 22:00, Stefano Stabellini wrote:
> >>> On Wed, 18 Nov 2020, Jan Beulich wrote:
>  On 18.11.2020 01:50, Stefano Stabellini wrote:
> > 1) It is not obvious that "Configure standard Xen features (expert
> > users)" is actually the famous EXPERT we keep talking about on xen-devel
> 
>  Which can be addressed by simply changing the one prompt line.
> 
> > 2) It is not obvious when we need to enable EXPERT to get a specific
> > feature
> >
> > In particular if you want to enable ACPI support so that you can boot
> > Xen on an ACPI platform, you have to enable EXPERT first. But searching
> > through the kconfig menu it is really not clear (type '/' and "ACPI"):
> > nothing in the description tells you that you need to enable EXPERT to
> > get the option.
> 
>  And what causes this to be different once you switch to UNSUPPORTED?
> >>>
> >>> Two things: firstly, it doesn't and shouldn't take an expert to enable
> >>> ACPI support, even if ACPI support is experimental. So calling it
> >>> UNSUPPORTED helps a lot. This is particularly relevant to the ARM Kconfig
> >>> options changed by this patch. Secondly, this patch is adding
> >>> "(UNSUPPORTED)" in the oneline prompt so that it becomes easy to match
> >>> it with the option you need to enable.
> >>
> >> There's redundancy here then, which I think is in almost all cases
> >> better to avoid. That's first and foremost because the two places
> >> can go out of sync. Therefore, if the primary thing is to help
> >> "make menuconfig" (which I admit I don't normally use, as it's
> >> nothing that gets invoked implicitly by the build process afaict,
> >> i.e. one has to actively invoke it), perhaps we should enhance
> >> kconfig to attach at least a pre-determined subset of labels to
> >> the prompts automatically?
> >>
> >> And second, also in reply to what you've been saying further down,
> >> perhaps we would better go with a hierarchy of controls here, e.g.
> >> EXPERT -> EXPERIMENTAL -> UNSUPPORTED?
> > 
> > Both these are good ideas worth discussing; somebody else made a similar
> > suggestion some time back. I was already thinking this could be a great
> > candidate for one of the first "working groups" as defined by George
> > during the last community call because the topic is not purely
> > technical: a working group could help getting alignment and make
> > progress faster. We can propose it to George when he is back.
> > 
> > However, I don't think we need the working group to make progress on
> > this limited patch that only addresses the lowest hanging fruit.
> > 
> > I'd like to suggest to make progress on this patch in its current form,
> > and in parallel start a longer term discussion on how to do something
> > like you suggested above.
> 
> Okay, I guess I can accept this. So FAOD I'm not objecting to the
> change (with some suitable adjustments, as discussed), but I'm
> then also not going to be the one to ack it. Nevertheless I'd like
> to point out that doing such a partial solution may end up adding
> confusion rather than reducing it. Much depends on how exactly
> consumers interpret what we hand to them.

Thank you Jan. I'll clarify the patch and address your comments. I'll
also try to get the attention of one of the other maintainers for the
ack.



Re: [PATCH v3 1/3] xen/ns16550: Make ns16550 driver usable on ARM with HAS_PCI enabled.

2020-11-23 Thread Stefano Stabellini
On Mon, 23 Nov 2020, Jan Beulich wrote:
> Rahul,
> 
> On 23.11.2020 12:54, Rahul Singh wrote:
> > Hello Jan,
> 
> as an aside - it helps if you also put the addressee of your mail
> on the To list.
> 
> >> On 20 Nov 2020, at 12:14 am, Stefano Stabellini  
> >> wrote:
> >>
> >> On Thu, 19 Nov 2020, Julien Grall wrote:
> >>> On Thu, 19 Nov 2020, 23:38 Stefano Stabellini,  
> >>> wrote:
> >>>  On Thu, 19 Nov 2020, Rahul Singh wrote:
> > On 19/11/2020 09:53, Jan Beulich wrote:
> >> On 19.11.2020 10:21, Julien Grall wrote:
> >>> Hi Jan,
> >>>
> >>> On 19/11/2020 09:05, Jan Beulich wrote:
>  On 18.11.2020 16:50, Julien Grall wrote:
> > On 16/11/2020 12:25, Rahul Singh wrote:
> >> NS16550 driver has PCI support that is under HAS_PCI flag. When 
> >> HAS_PCI
> >> is enabled for ARM, compilation error is observed for ARM 
> >> architecture
> >> because ARM platforms do not have full PCI support available.
> >>
> >> Introducing new kconfig option CONFIG_HAS_NS16550_PCI to support
> >> ns16550 PCI for X86.
> >>
> >> For X86 platforms it is enabled by default. For ARM platforms it is
> >> disabled by default, once we have proper support for NS16550 PCI 
> >> for
> >> ARM we can enable it.
> >>
> >> No functional change.
> >
> > NIT: I would say "No functional change intended" to make clear this 
> > is
> > an expectation and hopefully will be correct :).
> >
> > Regarding the commit message itself, I would suggest the following 
> > to
> > address Jan's concern:
> 
>  While indeed this is a much better description, I continue to think
>  that the proposed Kconfig option is undesirable to have.
> >>>
> >>> I am yet to see an argument into why we should keep the PCI code
> >>> compiled on Arm when there will be no-use
> >> Well, see my patch suppressing building of quite a part of it.
> >
> > I will let Rahul figuring out whether your patch series is sufficient 
> > to fix compilation issues (this is what matters right
> >>>  now).
> 
>  I just checked the compilation error for ARM after enabling the HAS_PCI 
>  on ARM. I am observing the same compilation error
> >>>  what I observed previously.
>  There are two new errors related to struct uart_config and struct 
>  part_param as those struct defined globally but used under
> >>>  X86 flags.
> 
>  At top level:
>  ns16550.c:179:48: error: ‘uart_config’ defined but not used 
>  [-Werror=unused-const-variable=]
>    static const struct ns16550_config __initconst uart_config[] =
>   ^~~
>  ns16550.c:104:54: error: ‘uart_param’ defined but not used 
>  [-Werror=unused-const-variable=]
>    static const struct ns16550_config_param __initconst uart_param[] = {
> 
> 
> >
>  Either,
>  following the patch I've just sent, truly x86-specific things (at
>  least as far as current state goes - if any of this was to be
>  re-used by a future port, suitable further abstraction may be
>  needed) should be guarded by CONFIG_X86 (or abstracted into arch
>  hooks), or the HAS_PCI_MSI proposal would at least want further
>  investigating as to its feasibility to address the issues at hand.
> >>>
> >>> I would be happy with CONFIG_X86, despite the fact that this is only
> >>> deferring the problem.
> >>>
> >>> Regarding HAS_PCI_MSI, I don't really see the point of introducing 
> >>> given
> >>> that we are not going to use NS16550 PCI on Arm in the foreseeable
> >>> future.
> >> And I continue to fail to see what would guarantee this: As soon
> >> as you can plug in such a card into an Arm system, people will
> >> want to be able use it. That's why we had to add support for it
> >> on x86, after all.
> >
> > Well, plug-in PCI cards on Arm has been available for quite a while... 
> > Yet I haven't heard anyone asking for NS16550 PCI
> >>>  support.
> >
> > This is probably because SBSA compliant server should always provide an 
> > SBSA UART (a cut-down version of the PL011). So why
> >>>  would bother to lose a PCI slot for yet another UART?
> >
>  So why do we need a finer grained Kconfig?
> >> Because most of the involved code is indeed MSI-related?
> >
> > Possibly, yet it would not be necessary if we don't want NS16550 PCI 
> > support...
> 
>  To fix compilation error on ARM as per the discussion there are below 
>  options please suggest which one to use to proceed
> >>>  further.
> 
>  1. Use the newly introduced CONFIG_HAS_NS16550_PCI config options. This 
>  helps also non-x86 

RE: Xen data from meta-virtualization layer

2020-11-23 Thread Leo Krueger
Hi,

Thanks for your effort!

> -Original Message-
> From: Julien Grall 
> Sent: Monday, 23 November 2020 19:42
> To: Rahul Singh ; Leo Krueger
> 
> Cc: Stefano Stabellini ; Peng Fan
> ; bru...@xilinx.com; Cornelia Bruelhart
> ; oleksandr_andrushche...@epam.com; xen-
> de...@lists.xenproject.org; Bertrand Marquis
> 
> Subject: Re: Xen data from meta-virtualization layer
> 
> 
> 
> On 23/11/2020 11:41, Rahul Singh wrote:
> > Hello ,
> 
> Hi Rahul,
> 
> >> On 22 Nov 2020, at 10:55 pm, Leo Krueger  wrote:
> >> root@kontron-sal28:~# ip link set up dev gbe0
> >> (XEN) vgic-v3-its.c:902:d0v0 vITS  cmd 0x0c: 0017000c
> >> 0001  
> >> (XEN) vgic-v3-its.c:902:d0v0 vITS  cmd 0x05: 0005
>   
> >> [   34.034598] Atheros 8031 ethernet :00:00.3:05: attached PHY driver
> [Atheros 8031 ethernet] (mii_bus:phy_addr=:00:00.3:05, irq=POLL)
> >> [   34.04] 8021q: adding VLAN 0 to HW filter on device gbe0
> >> [   34.041209] IPv6: ADDRCONF(NETDEV_UP): gbe0: link is not ready
> >> root@kontron-sal28:~# [   35.041951] fsl_enetc :00:00.0 gbe0: Link is
> Down
> >> [   38.114426] fsl_enetc :00:00.0 gbe0: Link is Up - 1Gbps/Full - flow
> control off
> >> [   38.114508] IPv6: ADDRCONF(NETDEV_CHANGE): gbe0: link becomes
> ready
> >>
> >> Does that tell you anything?
> >>
> >
> > I just checked the shared logs. I found that there is an error while
> > booting when configuring the MSI for the PCI device. Because of that, the
> > out-of-band generated Device ID may not be mapped correctly to the ITS
> > device table created while initialising the MSI for the device.
> > I might be wrong; let someone else also comment on this.
> 
> I think there might be multiple issues. You spotted one below :).
> 
> > [0.173964] OF: /soc/pcie@1f000: Invalid msi-map translation - no
> match for rid 0xf8 on   (null)
> 
> Leo, just to confirm, this error message is not spotted when booting Linux on
> baremetal?

In fact it is:

[0.160077] OF: /soc/pcie@1f000: Invalid msi-map translation - no match 
for rid 0xf8 on   (null)

But everything works as expected here:

110:      34288          0   ITS-MSI   1 Edge      gbe0-rxtx0
111:          0       6196   ITS-MSI   2 Edge      gbe0-rxtx1

> 
> Cheers,

Best wishes

> 
> --
> Julien Grall


Re: [PATCH RFC 4/6] xen/arm: mm: Allow other mapping size in xen_pt_update_entry()

2020-11-23 Thread Stefano Stabellini
On Fri, 20 Nov 2020, Julien Grall wrote:
> > >   /*
> > >* For arm32, page-tables are different on each CPUs. Yet, they
> > > share
> > > @@ -1265,14 +1287,43 @@ static int xen_pt_update(unsigned long virt,
> > > spin_lock(&xen_pt_lock);
> > >   -for ( ; addr < addr_end; addr += PAGE_SIZE )
> > > +while ( left )
> > >   {
> > > -rc = xen_pt_update_entry(root, addr, mfn, flags);
> > > +unsigned int order;
> > > +unsigned long mask;
> > > +
> > > +/*
> > > + * Don't take into account the MFN when removing mapping (i.e
> > > + * MFN_INVALID) to calculate the correct target order.
> > > + *
> > > + * XXX: Support superpage mappings if nr is not aligned to a
> > > + * superpage size.
> > 
> > It would be good to add another sentence to explain that the checks
> > below are simply based on masks and rely on the mfn, vfn, and also
> > nr_mfn to be superpage aligned. (It took me some time to figure it out.)
> 
> I am not sure to understand what you wrote here. Could you suggest a sentence?

Something like the following:

/*
 * Don't take into account the MFN when removing mapping (i.e
 * MFN_INVALID) to calculate the correct target order.
 *
 * This loop relies on mfn, vfn, and nr_mfn, to be all superpage
 * aligned, and it uses `mask' to check for that.
 *
 * XXX: Support superpage mappings if nr_mfn is not aligned to a
 * superpage size.
 */


> Regarding the TODO itself, we have the exact same one in the P2M code. I
> couldn't find a clever way to deal with it yet. Any idea how this could be
> solved?
 
I was thinking of a loop that starts with the highest possible superpage
size that virt and mfn are aligned to, and that is also smaller than or
equal to nr_mfn. So rather than using the mask to also make sure nr_mfn is
aligned, I would only use the mask to check that mfn and virt are
aligned. Then, we only need to check that superpage_size <= left.

Concrete example: virt and mfn are 2MB aligned, nr_mfn is 5MB / 1280 4K
pages. We allocate 2MB superpages until only 1MB is left. At that point
superpage_size <= left fails and we go down to 4K allocations.

Would that work?
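
To make the idea a bit more concrete, here is a rough standalone sketch of
such a loop. It is not the Xen code: pick_level() and count_mappings() are
hypothetical names, everything is expressed in 4K page frames, the three
orders (1GB, 2MB, 4K) assume a 4K granule, and the removal case (where the
MFN should be ignored) is not modelled. It only counts how many mappings of
each size would be created.

```c
#include <assert.h>
#include <stddef.h>

/* Mapping orders in units of 4K pages, largest first: 1GB, 2MB, 4K. */
static const unsigned int orders[3] = { 18, 9, 0 };

/*
 * Hypothetical helper: return the index into orders[] of the largest
 * mapping such that vfn and mfn are both aligned to it and it does not
 * exceed the number of pages left. Only vfn and mfn are checked against
 * the mask; the remaining length is compared separately.
 */
static size_t pick_level(unsigned long vfn, unsigned long mfn,
                         unsigned long left)
{
    size_t i;

    assert(left > 0);

    for (i = 0; i < 3; i++) {
        unsigned long size = 1UL << orders[i];
        unsigned long mask = size - 1;

        if (!((vfn | mfn) & mask) && size <= left)
            return i;
    }

    return 2; /* not reached: order 0 always fits when left > 0 */
}

/* Walk the range and count the mappings created at each level. */
static void count_mappings(unsigned long vfn, unsigned long mfn,
                           unsigned long nr, unsigned long count[3])
{
    unsigned long left = nr;

    count[0] = count[1] = count[2] = 0;

    while (left) {
        size_t level = pick_level(vfn, mfn, left);
        unsigned long step = 1UL << orders[level];

        count[level]++;
        vfn += step;
        mfn += step;
        left -= step;
    }
}
```

With vfn and mfn 2MB aligned and nr = 1280, this yields two 2MB mappings
followed by 256 4K mappings, which matches the example above.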



Re: [PATCH v3 00/23] xl / libxl: named PCI pass-through devices

2020-11-23 Thread Andrew Cooper
On 23/11/2020 17:44, Paul Durrant wrote:
> From: Paul Durrant 
>
> Paul Durrant (23):
>   xl / libxl: s/pcidev/pci and remove DEFINE_DEVICE_TYPE_STRUCT_X
>   libxl: make libxl__device_list() work correctly for
> LIBXL__DEVICE_KIND_PCI...
>   libxl: Make sure devices added by pci-attach are reflected in the
> config
>   libxl: add/recover 'rdm_policy' to/from PCI backend in xenstore
>   libxl: s/detatched/detached in libxl_pci.c
>   libxl: remove extraneous arguments to do_pci_remove() in libxl_pci.c
>   libxl: stop using aodev->device_config in libxl__device_pci_add()...
>   libxl: generalise 'driver_path' xenstore access functions in
> libxl_pci.c
>   libxl: remove unnecessary check from libxl__device_pci_add()
>   libxl: remove get_all_assigned_devices() from libxl_pci.c
>   libxl: make sure callers of libxl_device_pci_list() free the list
> after use
>   libxl: add libxl_device_pci_assignable_list_free()...
>   libxl: use COMPARE_PCI() macro is_pci_in_array()...
>   docs/man: extract documentation of PCI_SPEC_STRING from the xl.cfg
> manpage...
>   docs/man: improve documentation of PCI_SPEC_STRING...
>   docs/man: fix xl(1) documentation for 'pci' operations
>   libxl: introduce 'libxl_pci_bdf' in the idl...
>   libxlu: introduce xlu_pci_parse_spec_string()
>   libxl: modify
> libxl_device_pci_assignable_add/remove/list/list_free()...
>   docs/man: modify xl(1) in preparation for naming of assignable devices
>   xl / libxl: support naming of assignable devices
>   docs/man: modify xl-pci-configuration(5) to add 'name' field to
> PCI_SPEC_STRING
>   xl / libxl: support 'xl pci-attach/detach' by name

We're trying to get the CI loop up and running.  It's not emailing
xen-devel yet, but has found a real error somewhere in this series.

https://gitlab.com/xen-project/patchew/xen/-/pipelines/220153571

~Andrew



[linux-linus test] 156964: regressions - FAIL

2020-11-23 Thread osstest service owner
flight 156964 linux-linus real [real]
http://logs.test-lab.xenproject.org/osstest/logs/156964/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-i386-xl-qemuu-ws16-amd64  7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-xl-xsm  7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-qemut-rhel6hvm-intel  7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-xl-qemuu-debianhvm-amd64-shadow  7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-xl-qemut-debianhvm-amd64  7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-qemuu-rhel6hvm-intel  7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-xl-qemuu-debianhvm-i386-xsm  7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-examine  6 xen-install  fail REGR. vs. 152332
 test-amd64-i386-xl-qemut-ws16-amd64  7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-libvirt  7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-xl-qemuu-debianhvm-amd64  7 xen-install  fail REGR. vs. 152332
 test-amd64-coresched-i386-xl  7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm  7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-qemuu-rhel6hvm-amd  7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-xl  7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-qemut-rhel6hvm-amd  7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-pair  10 xen-install/src_host  fail REGR. vs. 152332
 test-amd64-i386-pair  11 xen-install/dst_host  fail REGR. vs. 152332
 test-amd64-i386-libvirt-xsm  7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-xl-raw  7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-freebsd10-amd64  7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-xl-pvshim  7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-xl-qemut-debianhvm-i386-xsm  7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-xl-shadow  7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-freebsd10-i386  7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-xl-qemuu-ovmf-amd64  7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-xl-qemut-win7-amd64  7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-xl-qemuu-win7-amd64  7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm  7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-libvirt-pair  10 xen-install/src_host  fail REGR. vs. 152332
 test-amd64-i386-libvirt-pair  11 xen-install/dst_host  fail REGR. vs. 152332
 test-arm64-arm64-examine  8 reboot  fail REGR. vs. 152332
 test-amd64-amd64-amd64-pvgrub  20 guest-stop  fail REGR. vs. 152332
 test-amd64-amd64-i386-pvgrub  20 guest-stop  fail REGR. vs. 152332
 test-amd64-i386-xl-qemuu-dmrestrict-amd64-dmrestrict  7 xen-install  fail REGR. vs. 152332
 test-arm64-arm64-xl-xsm  8 xen-boot  fail REGR. vs. 152332
 test-arm64-arm64-xl-credit2  8 xen-boot  fail REGR. vs. 152332
 test-arm64-arm64-xl-credit1  12 debian-install  fail in 156955 REGR. vs. 152332

Tests which are failing intermittently (not blocking):
 test-amd64-amd64-xl-rtds  20 guest-localmigrate/x10  fail in 156955 pass in 156964
 test-arm64-arm64-xl-credit1  10 host-ping-check-xen  fail pass in 156955
 test-arm64-arm64-xl  8 xen-boot  fail pass in 156955
 test-arm64-arm64-xl-seattle  8 xen-boot  fail pass in 156955
 test-arm64-arm64-libvirt-xsm  8 xen-boot  fail pass in 156955

Tests which did not succeed, but are not blocking:
 test-arm64-arm64-xl-seattle  11 leak-check/basis(11)  fail in 156955 blocked in 152332
 test-arm64-arm64-libvirt-xsm  11 leak-check/basis(11)  fail in 156955 blocked in 152332
 test-arm64-arm64-xl  11 leak-check/basis(11)  fail in 156955 blocked in 152332
 test-amd64-amd64-xl-qemut-win7-amd64  19 guest-stop  fail like 152332
 test-armhf-armhf-libvirt  16 saverestore-support-check  fail like 152332
 test-amd64-amd64-xl-qemuu-win7-amd64  19 guest-stop  fail like 152332
 test-amd64-amd64-xl-qemut-ws16-amd64  19 guest-stop  fail like 152332
 test-armhf-armhf-libvirt-raw  15 saverestore-support-check  fail like 152332
 test-amd64-amd64-xl-qemuu-ws16-amd64  19 guest-stop  fail like 152332
 test-amd64-amd64-qemuu-nested-amd  20 debian-hvm-install/l1/l2  fail like 152332
 test-amd64-amd64-libvirt  15 migrate-support-check  fail never pass
 test-amd64-amd64-libvirt-xsm  15 migrate-support-check  fail never pass
 test-arm64-arm64-xl-thunderx  15 migrate-support-check  fail never pass
 test-arm64-arm64-xl-thunderx  16 saverestore-support-check  fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 

Re: [PATCH 000/141] Fix fall-through warnings for Clang

2020-11-23 Thread James Bottomley
On Mon, 2020-11-23 at 19:56 +0100, Miguel Ojeda wrote:
> On Mon, Nov 23, 2020 at 4:58 PM James Bottomley
>  wrote:
> > Well, I used git.  It says that as of today in Linus' tree we have
> > 889 patches related to fall throughs and the first series went in
> > in october 2017 ... ignoring a couple of outliers back to February.
> 
> I can see ~10k insertions over ~1k commits and 15 years that mention
> a fallthrough in the entire repo. That is including some commits
> (like the biggest one, 960 insertions) that have nothing to do with C
> fallthrough. A single kernel release has an order of magnitude more
> changes than this...
> 
> But if we do the math, for an author, at even 1 minute per line
> change and assuming nothing can be automated at all, it would take 1
> month of work. For maintainers, a couple of trivial lines is noise
> compared to many other patches.

So you think a one line patch should take one minute to produce ... I
really don't think that's grounded in reality.  I suppose a one line
patch only takes a minute to merge with b4 if no-one reviews or tests
it, but that's not really desirable.

> In fact, this discussion probably took more time than the time it
> would take to review the 200 lines. :-)

I'm framing the discussion in terms of the whole series of changes we
have done for fall through, i.e. what's in the tree currently (889
patches), both in terms of the producer and the consumer.  That's what I
used for my figures for cost.

> > We're also complaining about the inability to recruit maintainers:
> > 
> > https://www.theregister.com/2020/06/30/hard_to_find_linux_maintainers_says_torvalds/
> > 
> > And burn out:
> > 
> > http://antirez.com/news/129
> 
> Accepting trivial and useful 1-line patches

Part of what I'm trying to measure is the "and useful" bit because
that's not a given.

> is not what makes a voluntary maintainer quit...

so the proverb "straw which broke the camel's back" uniquely doesn't
apply to maintainers

>  Thankless work with demanding deadlines is.

That's another potential reason, but it doesn't make other reasons less
valid.

> > The whole crux of your argument seems to be maintainers' time isn't
> > important so we should accept all trivial patches
> 
> I have not said that, at all. In fact, I am a voluntary one and I
> welcome patches like this. It takes very little effort on my side to
> review and it helps the kernel overall.

Well, you know, subsystems are very different in terms of the amount of
patches a maintainer has to process per release cycle of the kernel. 
If a maintainer is close to capacity, additional patches, however
trivial, become a problem.  If a maintainer has spare cycles, trivial
patches may look easy.

> Paid maintainers are the ones that can take care of big
> features/reviews.
> 
> > What I'm actually trying to articulate is a way of measuring value
> > of the patch vs cost ... it has nothing really to do with who foots
> > the actual bill.
> 
> I understand your point, but you were the one putting it in terms of
> a junior FTE.

No, I evaluated the producer side in terms of an FTE.  What we're
mostly arguing about here is the consumer side: the maintainers and
people who have to rework their patch sets. I estimated that at 100h.

>  In my view, 1 month-work (worst case) is very much worth
> removing a class of errors from a critical codebase.
> 
> > One thesis I'm actually starting to formulate is that this
> > continual devaluing of maintainers is why we have so much
> > difficulty keeping and recruiting them.
> 
> That may very well be true, but I don't feel anybody has devalued
> maintainers in this discussion.

You seem to be saying that because you find it easy to merge trivial
patches, everyone should.  I'm reminded of a friend long ago who
thought being a Tees River Pilot was a sinecure because he could
navigate the Tees blindfold.  What he forgot, of course, is that just
because it's easy with a trawler doesn't mean it's easy with an oil
tanker.  In fact it takes longer to qualify as a Tees River Pilot than
it does to get a PhD.

James





[qemu-mainline test] 156962: regressions - FAIL

2020-11-23 Thread osstest service owner
flight 156962 qemu-mainline real [real]
flight 156968 qemu-mainline real-retest [real]
http://logs.test-lab.xenproject.org/osstest/logs/156962/
http://logs.test-lab.xenproject.org/osstest/logs/156968/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-libvirt-vhd 19 guest-start/debian.repeat fail REGR. vs. 152631
 test-amd64-amd64-xl-qcow2   21 guest-start/debian.repeat fail REGR. vs. 152631
 test-armhf-armhf-xl-vhd 17 guest-start/debian.repeat fail REGR. vs. 152631

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemuu-win7-amd64  19 guest-stop  fail like 152631
 test-amd64-i386-xl-qemuu-win7-amd64  19 guest-stop  fail like 152631
 test-armhf-armhf-libvirt-raw  15 saverestore-support-check  fail like 152631
 test-armhf-armhf-libvirt  16 saverestore-support-check  fail like 152631
 test-amd64-amd64-xl-qemuu-ws16-amd64  19 guest-stop  fail like 152631
 test-amd64-i386-xl-qemuu-ws16-amd64  19 guest-stop  fail like 152631
 test-amd64-amd64-qemuu-nested-amd  20 debian-hvm-install/l1/l2  fail like 152631
 test-amd64-i386-xl-pvshim  14 guest-start  fail never pass
 test-amd64-amd64-libvirt-xsm  15 migrate-support-check  fail never pass
 test-arm64-arm64-xl-seattle  15 migrate-support-check  fail never pass
 test-arm64-arm64-xl-seattle  16 saverestore-support-check  fail never pass
 test-amd64-amd64-libvirt  15 migrate-support-check  fail never pass
 test-amd64-i386-libvirt  15 migrate-support-check  fail never pass
 test-amd64-i386-libvirt-xsm  15 migrate-support-check  fail never pass
 test-amd64-amd64-libvirt-vhd  14 migrate-support-check  fail never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-check  fail never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-check  fail never pass
 test-arm64-arm64-xl  15 migrate-support-check  fail never pass
 test-arm64-arm64-xl  16 saverestore-support-check  fail never pass
 test-arm64-arm64-libvirt-xsm  15 migrate-support-check  fail never pass
 test-arm64-arm64-libvirt-xsm  16 saverestore-support-check  fail never pass
 test-arm64-arm64-xl-credit2  15 migrate-support-check  fail never pass
 test-arm64-arm64-xl-credit2  16 saverestore-support-check  fail never pass
 test-arm64-arm64-xl-thunderx  15 migrate-support-check  fail never pass
 test-arm64-arm64-xl-thunderx  16 saverestore-support-check  fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm  13 migrate-support-check  fail never pass
 test-armhf-armhf-xl-arndale  15 migrate-support-check  fail never pass
 test-armhf-armhf-xl-arndale  16 saverestore-support-check  fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm  13 migrate-support-check  fail never pass
 test-armhf-armhf-xl-cubietruck  15 migrate-support-check  fail never pass
 test-armhf-armhf-xl-cubietruck  16 saverestore-support-check  fail never pass
 test-armhf-armhf-xl  15 migrate-support-check  fail never pass
 test-armhf-armhf-xl-multivcpu  15 migrate-support-check  fail never pass
 test-armhf-armhf-xl  16 saverestore-support-check  fail never pass
 test-armhf-armhf-xl-multivcpu  16 saverestore-support-check  fail never pass
 test-armhf-armhf-xl-vhd  14 migrate-support-check  fail never pass
 test-armhf-armhf-xl-vhd  15 saverestore-support-check  fail never pass
 test-arm64-arm64-xl-credit1  15 migrate-support-check  fail never pass
 test-arm64-arm64-xl-credit1  16 saverestore-support-check  fail never pass
 test-armhf-armhf-xl-credit2  15 migrate-support-check  fail never pass
 test-armhf-armhf-xl-credit2  16 saverestore-support-check  fail never pass
 test-armhf-armhf-libvirt-raw  14 migrate-support-check  fail never pass
 test-armhf-armhf-xl-credit1  15 migrate-support-check  fail never pass
 test-armhf-armhf-xl-credit1  16 saverestore-support-check  fail never pass
 test-armhf-armhf-libvirt  15 migrate-support-check  fail never pass
 test-armhf-armhf-xl-rtds  15 migrate-support-check  fail never pass
 test-armhf-armhf-xl-rtds  16 saverestore-support-check  fail never pass

version targeted for testing:
 qemuu                8cc30eb1400fc01f2b139cdd3dc524f8b84dbe07
baseline version:
 qemuu                1d806cef0e38b5db8347a8e12f214d543204a314

Last test of basis   152631  2020-08-20 09:07:46 Z   95 days
Failing since        152659  2020-08-21 14:07:39 Z   94 days  200 attempts
Testing same since   156953  2020-11-23 00:08:31 Z    0 days    2 attempts


People who touched revisions under test:
Aaron Lindsay 
  Alberto Garcia 
  Aleksandar Markovic 
  Alex Bennée 
  Alex Chen 
  Alex Williamson 
  Alexander 

Re: [PATCH 000/141] Fix fall-through warnings for Clang

2020-11-23 Thread Jason Gunthorpe
On Fri, Nov 20, 2020 at 12:21:39PM -0600, Gustavo A. R. Silva wrote:

>   IB/hfi1: Fix fall-through warnings for Clang
>   IB/mlx4: Fix fall-through warnings for Clang
>   IB/qedr: Fix fall-through warnings for Clang
>   RDMA/mlx5: Fix fall-through warnings for Clang

I picked these four to the rdma tree, thanks

Jason



Re: [PATCH] MAINTINERS: Propose Ian Jackson as new release manager

2020-11-23 Thread Jürgen Groß

On 23.11.20 18:08, Ian Jackson wrote:

George Dunlap writes ("[PATCH] MAINTINERS: Propose Ian Jackson as new release 
manager"):

Ian Jackson has agreed to be the release manager for 4.15.  Signify
this by giving him maintainership over CHANGELOG.md.


Acked-by: Ian Jackson 

Obviously that signifies my consent but I think it needs more acks.

Wei, Juergen, Paul, I think I am likely to ask you some questions.
Any tips etc would be welcome.


Fine with me. :-)


Juergen



Re: [PATCH 000/141] Fix fall-through warnings for Clang

2020-11-23 Thread Miguel Ojeda
On Mon, Nov 23, 2020 at 4:58 PM James Bottomley
 wrote:
>
> Well, I used git.  It says that as of today in Linus' tree we have 889
> patches related to fall throughs and the first series went in in
> october 2017 ... ignoring a couple of outliers back to February.

I can see ~10k insertions over ~1k commits and 15 years that mention a
fallthrough in the entire repo. That is including some commits (like
the biggest one, 960 insertions) that have nothing to do with C
fallthrough. A single kernel release has an order of magnitude more
changes than this...

But if we do the math, for an author, at even 1 minute per line change
and assuming nothing can be automated at all, it would take 1 month of
work. For maintainers, a couple of trivial lines is noise compared to
many other patches.

In fact, this discussion probably took more time than the time it
would take to review the 200 lines. :-)

> We're also complaining about the inability to recruit maintainers:
>
> https://www.theregister.com/2020/06/30/hard_to_find_linux_maintainers_says_torvalds/
>
> And burn out:
>
> http://antirez.com/news/129

Accepting trivial and useful 1-line patches is not what makes a
voluntary maintainer quit... Thankless work with demanding deadlines is.

> The whole crux of your argument seems to be maintainers' time isn't
> important so we should accept all trivial patches

I have not said that, at all. In fact, I am a voluntary one and I
welcome patches like this. It takes very little effort on my side to
review and it helps the kernel overall. Paid maintainers are the ones
that can take care of big features/reviews.

> What I'm actually trying to articulate is a way of measuring value of
> the patch vs cost ... it has nothing really to do with who foots the
> actual bill.

I understand your point, but you were the one putting it in terms of a
junior FTE. In my view, 1 month-work (worst case) is very much worth
removing a class of errors from a critical codebase.

> One thesis I'm actually starting to formulate is that this continual
> devaluing of maintainers is why we have so much difficulty keeping and
> recruiting them.

That may very well be true, but I don't feel anybody has devalued
maintainers in this discussion.

Cheers,
Miguel



Re: Xen data from meta-virtualization layer

2020-11-23 Thread Julien Grall




On 23/11/2020 11:41, Rahul Singh wrote:

Hello ,


Hi Rahul,


On 22 Nov 2020, at 10:55 pm, Leo Krueger  wrote:
root@kontron-sal28:~# ip link set up dev gbe0
(XEN) vgic-v3-its.c:902:d0v0 vITS  cmd 0x0c: 0017000c 0001 
 
(XEN) vgic-v3-its.c:902:d0v0 vITS  cmd 0x05: 0005  
 
[   34.034598] Atheros 8031 ethernet :00:00.3:05: attached PHY driver 
[Atheros 8031 ethernet] (mii_bus:phy_addr=:00:00.3:05, irq=POLL)
[   34.04] 8021q: adding VLAN 0 to HW filter on device gbe0
[   34.041209] IPv6: ADDRCONF(NETDEV_UP): gbe0: link is not ready
root@kontron-sal28:~# [   35.041951] fsl_enetc :00:00.0 gbe0: Link is Down
[   38.114426] fsl_enetc :00:00.0 gbe0: Link is Up - 1Gbps/Full - flow 
control off
[   38.114508] IPv6: ADDRCONF(NETDEV_CHANGE): gbe0: link becomes ready

Does that tell you anything?



I just checked the shared logs. I found that there is an error while 
booting when configuring the MSI for the PCI device. Because of that, the 
out-of-band generated Device ID may not be mapped correctly to the ITS device 
table created while initialising the MSI for the device.
I might be wrong; let someone else also comment on this.


I think there might be multiple issues. You spotted one below :).


[0.173964] OF: /soc/pcie@1f000: Invalid msi-map translation - no match 
for rid 0xf8 on   (null)


Leo, just to confirm, this error message is not spotted when booting 
Linux on baremetal?


Cheers,

--
Julien Grall



Re: AW: AW: AW: AW: AW: Xen data from meta-virtualization layer

2020-11-23 Thread Julien Grall




On 22/11/2020 22:55, Leo Krueger wrote:

Hi Julien,


Hi Leo,



finally I could try out what you suggested, please find my answers inline.


Thank you for sending the logs!




-Original Message-
From: Julien Grall 
Sent: Wednesday, 18 November 2020 13:24
To: Stefano Stabellini ; Leo Krueger

Cc: Peng Fan ; bru...@xilinx.com; Cornelia Bruelhart
; oleksandr_andrushche...@epam.com; xen-
de...@lists.xenproject.org; bertrand.marq...@arm.com
Subject: Re: AW: AW: AW: AW: Xen data from meta-virtualization layer

Hi,

On 17/11/2020 23:53, Stefano Stabellini wrote:

Adding Bertrand, Oleksandr, Julien, and others -- they have a more
recent experience with GICv3 ITS than me and might be able to help.
I am attaching the device tree Leo sent a few days ago for reference.


Typically when you can set the ethernet link up and no packets are
exchanged it is because of a missing interrupt. In this case a missing
MSI.

Bertrand, I believe you tried the GIC ITS driver with PCI devices
recently. It is expected to work correctly with MSIs in Dom0, right?


OSSTest has some hardware (e.g. Thunder-X) where ITS is required to boot
Dom0. I haven't seen any failure on recent Xen. We are testing 4.11 and
onwards on Thunder-X.

However, it may be possible that some more work is necessary for other
hardware (e.g. workaround, missing code...). See more below.





On Tue, 17 Nov 2020, Leo Krueger wrote:

Hi,

I enabled CONFIG_HAS_ITS (what a stupid mistake by me to not set it
before...) but then had to add the following node to my device tree

gic_lpi_base: syscon@0x8000 {
compatible = "gic-lpi-base";


I couldn't find this compatible defined/used in Linux 5.10-rc4. @Leo, could
you clarify which flavor/version of Linux you are using?


It is Linux 4.19 from Yocto (Warrior release). XEN 4.13.2.


Do you have a link to the Linux tree? Are there any additional patches on 
top of vanilla?



While searching around the Internet for any solution, I came across [0] which 
contained the gic-lpi-base node.
So I just tried adding it (quite desperate I know) and voila, it at least 
brought me one step further (XEN exposing the ITS)...


I am slightly confused as to how this would help. Xen and, AFAICT, Linux 
don't understand gic-lpi-base. Do you have a modification in your Linux to 
use it?


Looking at the DT changes in [0], it looks like the node is not a child 
of gic@. So I think Xen will map the region to Dom0.


There are two things that I can notice:
  1) This region is RAM, but I can't find any reserve node. Is there 
any specific code in Linux to reserve it?
  2) The implementation in U-boot seems to suggest that the firmware 
will configure the LPIs and then enable them. If that's the case, then Xen 
needs to re-use the table in the DT rather than allocating a new one. 
However, I would have expected an error message in the log:


   "GICv3: CPUx: Cannot initialize LPIs"

At least Xen should not expose gic-lpi-base to the kernel, but I will 
wait on more details about the Linux kernel used before commenting more.


I would also be interested to know more details about the failure when 
gic-lpi-base is not added in your DT. In particular, I am interested to 
understand why Xen would not expose the ITS as we don't parse that node.


[...]


For XEN 4.13.2 I had to adapt your patch slightly [1], see below (yes I know, 
quite ugly in parts).


No worries, debug patches are not meant to be nice to read ;).


Find attached the boot log and an output of "xl dmesg" which is truncated due 
to the large amount of messages.

When enabling the network interface (gbe0), the following output is visible:

root@kontron-sal28:~# ip link set up dev gbe0
(XEN) vgic-v3-its.c:902:d0v0 vITS  cmd 0x0c: 0017000c 0001 
 
(XEN) vgic-v3-its.c:902:d0v0 vITS  cmd 0x05: 0005  
 


0xc is INV and 0x5 is SYNC. Most likely the driver unmasks the interrupt 
by writing to the property table (accesses are not trapped to Xen) and 
then requests invalidation of the cached state.
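
For reference when reading such traces, a small decoder along the following 
lines can map the command number to its mnemonic. This is only a sketch and 
not taken from Xen: its_cmd_name() is a hypothetical helper, and it assumes 
the command number sits in bits [7:0] of the first doubleword of the command, 
with the numbering used by GICv3 ITS commands (0x0c INV, 0x05 SYNC, etc.).

```c
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: return the mnemonic for an ITS command given the
 * first doubleword of the command. Unknown numbers map to "UNKNOWN".
 */
static const char *its_cmd_name(uint64_t dw0)
{
    switch (dw0 & 0xff) {        /* command number is in bits [7:0] */
    case 0x01: return "MOVI";
    case 0x03: return "INT";
    case 0x04: return "CLEAR";
    case 0x05: return "SYNC";
    case 0x08: return "MAPD";
    case 0x09: return "MAPC";
    case 0x0a: return "MAPTI";
    case 0x0b: return "MAPI";
    case 0x0c: return "INV";
    case 0x0d: return "INVALL";
    case 0x0e: return "MOVALL";
    case 0x0f: return "DISCARD";
    default:   return "UNKNOWN";
    }
}
```

Feeding it the two command numbers from the log above gives INV and SYNC 
respectively.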



[   34.034598] Atheros 8031 ethernet :00:00.3:05: attached PHY driver 
[Atheros 8031 ethernet] (mii_bus:phy_addr=:00:00.3:05, irq=POLL)
[   34.04] 8021q: adding VLAN 0 to HW filter on device gbe0
[   34.041209] IPv6: ADDRCONF(NETDEV_UP): gbe0: link is not ready
root@kontron-sal28:~# [   35.041951] fsl_enetc :00:00.0 gbe0: Link is Down
[   38.114426] fsl_enetc :00:00.0 gbe0: Link is Up - 1Gbps/Full - flow 
control off
[   38.114508] IPv6: ADDRCONF(NETDEV_CHANGE): gbe0: link becomes ready

Does that tell you anything?


It is at least a good sign because it means Linux is able to 
initialize/talk to the vITS.


I would lean towards one (or multiple) issues with pITS and/or the 
device-tree exposed to Linux. I am not entirely sure what exactly... I think 
having more details about the Linux setup would be helpful.


I will reply on 

[PATCH v3 12/23] libxl: add libxl_device_pci_assignable_list_free()...

2020-11-23 Thread Paul Durrant
From: Paul Durrant 

... to be used by callers of libxl_device_pci_assignable_list().

Currently there is no API for callers of libxl_device_pci_assignable_list()
to free the list. The xl function pciassignable_list() calls
libxl_device_pci_dispose() on each element of the returned list, but
libxl_pci_assignable() in libxl_pci.c does not. Neither does the implementation
of libxl_device_pci_assignable_list() call libxl_device_pci_init().

This patch adds the new API function, makes sure it is used everywhere and
also modifies libxl_device_pci_assignable_list() to initialize list
entries rather than just zeroing them.

Signed-off-by: Paul Durrant 
---
Cc: Ian Jackson 
Cc: Wei Liu 
Cc: Christian Lindig 
Cc: David Scott 
Cc: Anthony PERARD 
---
 tools/include/libxl.h|  7 +++
 tools/libs/light/libxl_pci.c | 14 --
 tools/ocaml/libs/xl/xenlight_stubs.c |  3 +--
 tools/xl/xl_pci.c|  3 +--
 4 files changed, 21 insertions(+), 6 deletions(-)

diff --git a/tools/include/libxl.h b/tools/include/libxl.h
index ee52d3cf7e..8225809d94 100644
--- a/tools/include/libxl.h
+++ b/tools/include/libxl.h
@@ -458,6 +458,12 @@
 #define LIBXL_HAVE_DEVICE_PCI_LIST_FREE 1
 
 /*
+ * LIBXL_HAVE_DEVICE_PCI_ASSIGNABLE_LIST_FREE indicates that the
+ * libxl_device_pci_assignable_list_free() function is defined.
+ */
+#define LIBXL_HAVE_DEVICE_PCI_ASSIGNABLE_LIST_FREE 1
+
+/*
  * libxl ABI compatibility
  *
  * The only guarantee which libxl makes regarding ABI compatibility
@@ -2369,6 +2375,7 @@ int libxl_device_events_handler(libxl_ctx *ctx,
 int libxl_device_pci_assignable_add(libxl_ctx *ctx, libxl_device_pci *pci, int 
rebind);
 int libxl_device_pci_assignable_remove(libxl_ctx *ctx, libxl_device_pci *pci, 
int rebind);
 libxl_device_pci *libxl_device_pci_assignable_list(libxl_ctx *ctx, int *num);
+void libxl_device_pci_assignable_list_free(libxl_device_pci *list, int num);
 
 /* CPUID handling */
 int libxl_cpuid_parse_config(libxl_cpuid_policy_list *cpuid, const char* str);
diff --git a/tools/libs/light/libxl_pci.c b/tools/libs/light/libxl_pci.c
index 0f41939d1f..5a3352c2ec 100644
--- a/tools/libs/light/libxl_pci.c
+++ b/tools/libs/light/libxl_pci.c
@@ -457,7 +457,7 @@ libxl_device_pci 
*libxl_device_pci_assignable_list(libxl_ctx *ctx, int *num)
 pcis = new;
 new = pcis + *num;
 
-memset(new, 0, sizeof(*new));
+libxl_device_pci_init(new);
 pci_struct_fill(new, dom, bus, dev, func, 0);
 
 if (pci_info_xs_read(gc, new, "domid")) /* already assigned */
@@ -472,6 +472,16 @@ out:
 return pcis;
 }
 
+void libxl_device_pci_assignable_list_free(libxl_device_pci *list, int num)
+{
+int i;
+
+for (i = 0; i < num; i++)
+libxl_device_pci_dispose(&list[i]);
+
+free(list);
+}
+
 /* Unbind device from its current driver, if any.  If driver_path is non-NULL,
  * store the path to the original driver in it. */
 static int sysfs_dev_unbind(libxl__gc *gc, libxl_device_pci *pci,
@@ -1490,7 +1500,7 @@ static int libxl_pci_assignable(libxl_ctx *ctx, 
libxl_device_pci *pci)
 pcis[i].func == pci->func)
 break;
 }
-free(pcis);
+libxl_device_pci_assignable_list_free(pcis, num);
 return i != num;
 }
 
diff --git a/tools/ocaml/libs/xl/xenlight_stubs.c 
b/tools/ocaml/libs/xl/xenlight_stubs.c
index 1181971da4..352a00134d 100644
--- a/tools/ocaml/libs/xl/xenlight_stubs.c
+++ b/tools/ocaml/libs/xl/xenlight_stubs.c
@@ -894,9 +894,8 @@ value stub_xl_device_pci_assignable_list(value ctx)
Field(list, 1) = temp;
temp = list;
Store_field(list, 0, Val_device_pci(&c_list[i]));
-   libxl_device_pci_dispose(&c_list[i]);
}
-   free(c_list);
+   libxl_device_pci_assignable_list_free(c_list, nb);
 
CAMLreturn(list);
 }
diff --git a/tools/xl/xl_pci.c b/tools/xl/xl_pci.c
index 7c0f102ac7..f71498cbb5 100644
--- a/tools/xl/xl_pci.c
+++ b/tools/xl/xl_pci.c
@@ -164,9 +164,8 @@ static void pciassignable_list(void)
 for (i = 0; i < num; i++) {
 printf("%04x:%02x:%02x.%01x\n",
pcis[i].domain, pcis[i].bus, pcis[i].dev, pcis[i].func);
-libxl_device_pci_dispose(&pcis[i]);
 }
-free(pcis);
+libxl_device_pci_assignable_list_free(pcis, num);
 }
 
 int main_pciassignable_list(int argc, char **argv)
-- 
2.11.0




[PATCH v3 21/23] xl / libxl: support naming of assignable devices

2020-11-23 Thread Paul Durrant
From: Paul Durrant 

This patch modifies libxl_device_pci_assignable_add() to take an optional
'name' argument, which (if supplied) is saved into xenstore and can hence be
used to refer to the now-assignable BDF in subsequent operations. To
facilitate this, a new libxl_device_pci_assignable_name2bdf() function is
added.

The xl code is modified to allow a name to be specified in the
'pci-assignable-add' operation and also allow an option to be specified to
'pci-assignable-list' requesting that names be displayed. The latter is
facilitated by a new libxl_device_pci_assignable_bdf2name() function. Finally
xl 'pci-assignable-remove' is modified so that either a name or BDF can be
supplied. The supplied 'identifier' is first assumed to be a name, but if
libxl_device_pci_assignable_name2bdf() fails to find a matching BDF the
identifier itself will be parsed as a BDF. Names may only include printable
characters and may not include whitespace.
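
The name rules described above can be sketched as a small standalone check. This is an illustration only, not the libxl code: the helper name pci_name_is_valid() and the PCI_NAME_MAX_LEN constant are hypothetical, and rejecting an empty name is an extra assumption that the commit message does not state. The 64-character bound and the use of isgraph() mirror the hunk in libxl_pci.c.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical constant; the patch hard-codes the same bound of 64. */
#define PCI_NAME_MAX_LEN 64

/*
 * Return true if 'name' could be used as an assignable-device name:
 * non-empty (an assumption of this sketch), at most 64 characters, and
 * made up only of printable, non-whitespace characters.
 */
static bool pci_name_is_valid(const char *name)
{
    size_t i, n;

    assert(name != NULL);

    n = strlen(name);
    if (n == 0 || n > PCI_NAME_MAX_LEN)
        return false;

    for (i = 0; i < n; i++) {
        /* isgraph() is false for spaces, tabs and control characters. */
        if (!isgraph((unsigned char)name[i]))
            return false;
    }

    return true;
}
```

With this check a name such as "gpu0" is accepted, while "my gpu" is rejected because of the embedded space.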

Signed-off-by: Paul Durrant 
---
Cc: Ian Jackson 
Cc: Wei Liu 
Cc: Christian Lindig 
Cc: David Scott 
Cc: Anthony PERARD 
---
 tools/include/libxl.h| 19 +++-
 tools/libs/light/libxl_pci.c | 86 +---
 tools/ocaml/libs/xl/xenlight_stubs.c |  3 +-
 tools/xl/xl_cmdtable.c   | 12 +++--
 tools/xl/xl_pci.c| 84 ---
 5 files changed, 166 insertions(+), 38 deletions(-)

diff --git a/tools/include/libxl.h b/tools/include/libxl.h
index 5703fdf367..4025d3a3d4 100644
--- a/tools/include/libxl.h
+++ b/tools/include/libxl.h
@@ -477,6 +477,14 @@
 #define LIBXL_HAVE_PCI_ASSIGNABLE_BDF 1
 
 /*
+ * LIBXL_HAVE_PCI_ASSIGNABLE_NAME indicates that the
+ * libxl_device_pci_assignable_add() function takes a 'name' argument
+ * and that the libxl_device_pci_assignable_name2bdf() and
+ * libxl_device_pci_assignable_bdf2name() functions are defined.
+ */
+#define LIBXL_HAVE_PCI_ASSIGNABLE_NAME 1
+
+/*
  * libxl ABI compatibility
  *
  * The only guarantee which libxl makes regarding ABI compatibility
@@ -2385,11 +2393,18 @@ int libxl_device_events_handler(libxl_ctx *ctx,
  * added or is not bound, the functions will emit a warning but return
  * SUCCESS.
  */
-int libxl_device_pci_assignable_add(libxl_ctx *ctx, libxl_pci_bdf *pcibdf, int 
rebind);
-int libxl_device_pci_assignable_remove(libxl_ctx *ctx, libxl_pci_bdf *pcibdf, 
int rebind);
+int libxl_device_pci_assignable_add(libxl_ctx *ctx, libxl_pci_bdf *pcibdf,
+const char *name, int rebind);
+int libxl_device_pci_assignable_remove(libxl_ctx *ctx, libxl_pci_bdf *pcibdf,
+   int rebind);
 libxl_pci_bdf *libxl_device_pci_assignable_list(libxl_ctx *ctx, int *num);
 void libxl_device_pci_assignable_list_free(libxl_pci_bdf *list, int num);
 
+libxl_pci_bdf *libxl_device_pci_assignable_name2bdf(libxl_ctx *ctx,
+const char *name);
+char *libxl_device_pci_assignable_bdf2name(libxl_ctx *ctx,
+   libxl_pci_bdf *pcibdf);
+
 /* CPUID handling */
 int libxl_cpuid_parse_config(libxl_cpuid_policy_list *cpuid, const char* str);
 int libxl_cpuid_parse_config_xend(libxl_cpuid_policy_list *cpuid,
diff --git a/tools/libs/light/libxl_pci.c b/tools/libs/light/libxl_pci.c
index f9ace1faec..a1c9ae0d5b 100644
--- a/tools/libs/light/libxl_pci.c
+++ b/tools/libs/light/libxl_pci.c
@@ -745,6 +745,7 @@ static int pciback_dev_unassign(libxl__gc *gc, 
libxl_pci_bdf *pcibdf)
 
 static int libxl__device_pci_assignable_add(libxl__gc *gc,
 libxl_pci_bdf *pcibdf,
+const char *name,
 int rebind)
 {
 libxl_ctx *ctx = libxl__gc_owner(gc);
@@ -753,6 +754,23 @@ static int libxl__device_pci_assignable_add(libxl__gc *gc,
 int rc;
 struct stat st;
 
+/* Sanitise any name that was passed */
+if (name) {
+unsigned int i, n = strlen(name);
+
+if (n > 64) { /* Reasonable upper bound on name length */
+LOG(ERROR, "Name too long");
+return ERROR_FAIL;
+}
+
+for (i = 0; i < n; i++) {
+if (!isgraph(name[i])) {
+LOG(ERROR, "Names may only include printable characters");
+return ERROR_FAIL;
+}
+}
+}
+
 /* Local copy for convenience */
 dom = pcibdf->domain;
 bus = pcibdf->bus;
@@ -773,7 +791,7 @@ static int libxl__device_pci_assignable_add(libxl__gc *gc,
 }
 if ( rc ) {
 LOG(WARN, PCI_BDF" already assigned to pciback", dom, bus, dev, func);
-goto quarantine;
+goto name;
 }
 
 /* Check to see if there's already a driver that we need to unbind from */
@@ -804,7 +822,12 @@ static int libxl__device_pci_assignable_add(libxl__gc *gc,
 return ERROR_FAIL;
 }
 
-quarantine:
+name:
+if (name)
+ 

[PATCH v3 19/23] libxl: modify libxl_device_pci_assignable_add/remove/list/list_free()...

2020-11-23 Thread Paul Durrant
From: Paul Durrant 

... to use 'libxl_pci_bdf' rather than 'libxl_device_pci'.

This patch modifies the API and callers accordingly. It also modifies
several internal functions in libxl_pci.c that support the API to also use
'libxl_pci_bdf'.

NOTE: The OCaml bindings are adjusted to contain the interface change. It
  should therefore not affect compatibility with OCaml-based utilities.

Signed-off-by: Paul Durrant 
---
Cc: Ian Jackson 
Cc: Wei Liu 
Cc: Christian Lindig 
Cc: David Scott 
Cc: Anthony PERARD 
---
 tools/include/libxl.h|  15 ++-
 tools/libs/light/libxl_pci.c | 215 +++
 tools/ocaml/libs/xl/xenlight_stubs.c |  15 ++-
 tools/xl/xl_pci.c|  32 +++---
 4 files changed, 157 insertions(+), 120 deletions(-)

diff --git a/tools/include/libxl.h b/tools/include/libxl.h
index 5edacccbd1..5703fdf367 100644
--- a/tools/include/libxl.h
+++ b/tools/include/libxl.h
@@ -470,6 +470,13 @@
 #define LIBXL_HAVE_PCI_BDF 1
 
 /*
+ * LIBXL_HAVE_PCI_ASSIGNABLE_BDF indicates that the
+ * libxl_device_pci_assignable_add/remove/list/list_free() functions all
+ * use the 'libxl_pci_bdf' type rather than 'libxl_device_pci' type.
+ */
+#define LIBXL_HAVE_PCI_ASSIGNABLE_BDF 1
+
+/*
  * libxl ABI compatibility
  *
  * The only guarantee which libxl makes regarding ABI compatibility
@@ -2378,10 +2385,10 @@ int libxl_device_events_handler(libxl_ctx *ctx,
  * added or is not bound, the functions will emit a warning but return
  * SUCCESS.
  */
-int libxl_device_pci_assignable_add(libxl_ctx *ctx, libxl_device_pci *pci, int 
rebind);
-int libxl_device_pci_assignable_remove(libxl_ctx *ctx, libxl_device_pci *pci, 
int rebind);
-libxl_device_pci *libxl_device_pci_assignable_list(libxl_ctx *ctx, int *num);
-void libxl_device_pci_assignable_list_free(libxl_device_pci *list, int num);
+int libxl_device_pci_assignable_add(libxl_ctx *ctx, libxl_pci_bdf *pcibdf, int 
rebind);
+int libxl_device_pci_assignable_remove(libxl_ctx *ctx, libxl_pci_bdf *pcibdf, 
int rebind);
+libxl_pci_bdf *libxl_device_pci_assignable_list(libxl_ctx *ctx, int *num);
+void libxl_device_pci_assignable_list_free(libxl_pci_bdf *list, int num);
 
 /* CPUID handling */
 int libxl_cpuid_parse_config(libxl_cpuid_policy_list *cpuid, const char* str);
diff --git a/tools/libs/light/libxl_pci.c b/tools/libs/light/libxl_pci.c
index 3cfba0e527..f9ace1faec 100644
--- a/tools/libs/light/libxl_pci.c
+++ b/tools/libs/light/libxl_pci.c
@@ -25,26 +25,33 @@
 #define PCI_BDF_XSPATH "%04x-%02x-%02x-%01x"
 #define PCI_PT_QDEV_ID "pci-pt-%02x_%02x.%01x"
 
-static unsigned int pci_encode_bdf(libxl_device_pci *pci)
+static unsigned int pci_encode_bdf(libxl_pci_bdf *pcibdf)
 {
 unsigned int value;
 
-value = pci->bdf.domain << 16;
-value |= (pci->bdf.bus & 0xff) << 8;
-value |= (pci->bdf.dev & 0x1f) << 3;
-value |= (pci->bdf.func & 0x7);
+value = pcibdf->domain << 16;
+value |= (pcibdf->bus & 0xff) << 8;
+value |= (pcibdf->dev & 0x1f) << 3;
+value |= (pcibdf->func & 0x7);
 
 return value;
 }
 
+static void pcibdf_struct_fill(libxl_pci_bdf *pcibdf, unsigned int domain,
+   unsigned int bus, unsigned int dev,
+   unsigned int func)
+{
+pcibdf->domain = domain;
+pcibdf->bus = bus;
+pcibdf->dev = dev;
+pcibdf->func = func;
+}
+
 static void pci_struct_fill(libxl_device_pci *pci, unsigned int domain,
 unsigned int bus, unsigned int dev,
 unsigned int func, unsigned int vdevfn)
 {
-pci->bdf.domain = domain;
-pci->bdf.bus = bus;
-pci->bdf.dev = dev;
-pci->bdf.func = func;
+pcibdf_struct_fill(&pci->bdf, domain, bus, dev, func);
 pci->vdevfn = vdevfn;
 }
 
@@ -350,8 +357,8 @@ static bool is_pci_in_array(libxl_device_pci *pcis, int num,
 }
 
 /* Write the standard BDF into the sysfs path given by sysfs_path. */
-static int sysfs_write_bdf(libxl__gc *gc, const char * sysfs_path,
-   libxl_device_pci *pci)
+static int sysfs_write_bdf(libxl__gc *gc, const char *sysfs_path,
+   libxl_pci_bdf *pcibdf)
 {
 int rc, fd;
 char *buf;
@@ -362,8 +369,8 @@ static int sysfs_write_bdf(libxl__gc *gc, const char * 
sysfs_path,
 return ERROR_FAIL;
 }
 
-buf = GCSPRINTF(PCI_BDF, pci->bdf.domain, pci->bdf.bus,
-pci->bdf.dev, pci->bdf.func);
+buf = GCSPRINTF(PCI_BDF, pcibdf->domain, pcibdf->bus,
+pcibdf->dev, pcibdf->func);
 rc = write(fd, buf, strlen(buf));
 /* Annoying to have two if's, but we need the errno */
 if (rc < 0)
@@ -378,22 +385,22 @@ static int sysfs_write_bdf(libxl__gc *gc, const char * 
sysfs_path,
 
 #define PCI_INFO_PATH "/libxl/pci"
 
-static char *pci_info_xs_path(libxl__gc *gc, libxl_device_pci *pci,
+static char *pci_info_xs_path(libxl__gc *gc, libxl_pci_bdf *pcibdf,
   

[PATCH v3 15/23] docs/man: improve documentation of PCI_SPEC_STRING...

2020-11-23 Thread Paul Durrant
From: Paul Durrant 

... and prepare for adding support for non-positional parsing of 'bdf' and
'vslot' in a subsequent patch.

Also document 'BDF' as a first-class parameter type and fix the documentation
to state that the default value of 'rdm_policy' is actually 'strict', not
'relaxed', as can be seen in libxl__device_pci_setdefault().
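
The documented 'vslot' behaviour (always hexadecimal, with virtual domain and bus fixed at 0) can be sketched as a conversion from the string to a devfn-style value. This is a minimal sketch under assumptions: vslot_to_vdevfn() is a hypothetical name, it uses strtoul() rather than whatever the real parser uses, and the 0x1f limit and the shift by 3 follow the parse_vslot() helper that appears later in this series.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Convert a vslot string (hexadecimal slot number, 0..1f) into a
 * virtual devfn with function 0. Returns 0 on success, -1 on error.
 */
static int vslot_to_vdevfn(const char *str, uint32_t *vdevfn)
{
    char *end;
    unsigned long val;

    assert(str != NULL && vdevfn != NULL);

    errno = 0;
    val = strtoul(str, &end, 16);
    if (errno || end == str || *end != '\0' || val > 0x1f)
        return -1;

    /* The slot number occupies bits 3-7; the function bits stay 0. */
    *vdevfn = (uint32_t)val << 3;
    return 0;
}
```

A vslot of "8" therefore becomes 0x40, which a guest would see as device 00:08.0.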

Signed-off-by: Paul Durrant 
---
Cc: Ian Jackson 
Cc: Wei Liu 
---
 docs/man/xl-pci-configuration.5.pod | 177 ++--
 1 file changed, 148 insertions(+), 29 deletions(-)

diff --git a/docs/man/xl-pci-configuration.5.pod 
b/docs/man/xl-pci-configuration.5.pod
index 72a27bd95d..4dd73bc498 100644
--- a/docs/man/xl-pci-configuration.5.pod
+++ b/docs/man/xl-pci-configuration.5.pod
@@ -6,32 +6,105 @@ xl-pci-configuration - XL PCI Configuration Syntax
 
 =head1 SYNTAX
 
-This document specifies the format for B<PCI_SPEC_STRING> which is used by
-the L<xl.cfg(5)> pci configuration option, and related L<xl(1)> commands.
+This document specifies the format for B<BDF> and B<PCI_SPEC_STRING> which are
+used by the L<xl.cfg(5)> pci configuration option, and related L<xl(1)>
+commands.
 
-Each B<PCI_SPEC_STRING> has the form of
-B<[DDDD:]BB:DD.F[@VSLOT],KEY=VALUE,KEY=VALUE,...> where:
+A B<BDF> has the following form:
+
+    [DDDD:]BB:SS.F
+
+B<DDDD> is the domain number, B<BB> is the bus number, B<SS> is the device (or
+slot) number, and B<F> is the function number. This is the same scheme as
+used in the output of L<lspci(1)> for the device in question. By default
+L<lspci(1)> will omit the domain (B<DDDD>) if it is zero and hence a zero
+value for domain may also be omitted when specifying a B<BDF>.
+
+Each B<PCI_SPEC_STRING> has one of the forms:
+
+=over 4
+
+    [<bdf>[@<vslot>,][<key>=<value>,]*
+    [<key>=<value>,]*
+
+=back
+
+For example, these strings are equivalent:
 
 =over 4
 
-=item B<[DDDD:]BB:DD.F>
+36:00.0@20,seize=1
+36:00.0,vslot=20,seize=1
+bdf=36:00.0,vslot=20,seize=1
 
-Identifies the PCI device from the host perspective in the domain
-(B<DDDD>), Bus (B<BB>), Device (B<DD>) and Function (B<F>) syntax. This is
-the same scheme as used in the output of B<lspci(1)> for the device in
-question.
+=back
+
+More formally, the string is a series of comma-separated keyword/value
+pairs, flags and positional parameters.  Parameters which are not bare
+keywords and which do not contain "=" symbols are assigned to the
+positional parameters, in the order specified below.  The positional
+parameters may also be specified by name.
+
+Each parameter may be specified at most once, either as a positional
+parameter or a named parameter.  Default values apply if the parameter
+is not specified, or if it is specified with an empty value (whether
+positionally or explicitly).
+
+B<NOTE>: In context of B<xl pci-detach> (see L<xl(1)>), parameters other than
+B<bdf> will be ignored.
+
+=head1 Positional Parameters
+
+=over 4
+
+=item B<bdf>=I<BDF>
+
+=over 4
 
-Note: by default B<lspci(1)> will omit the domain (B<DDDD>) if it
-is zero and it is optional here also. You may specify the function
-(B<F>) as B<*> to indicate all functions.
+=item Description
 
-=item B<@VSLOT>
+This identifies the PCI device from the host perspective.
 
-Specifies the virtual slot where the guest will see this
-device. This is equivalent to the B<DD> which the guest sees. In a
-guest B<DDDD> and B<BB> are C<0000:00>.
+In the context of a B<PCI_SPEC_STRING> you may specify the function (B<F>) as
+B<*> to indicate all functions of a multi-function device.
 
-=item B<permissive=BOOLEAN>
+=item Default Value
+
+None. This parameter is mandatory as it identifies the device.
+
+=back
+
+=item B<vslot>=I<NUMBER>
+
+=over 4
+
+=item Description
+
+Specifies the virtual slot (device) number where the guest will see this
+device. For example, running L<lspci(1)> in a Linux guest where B<vslot>
+was specified as C<8> would identify the device as C<00:08.0>. Virtual domain
+and bus numbers are always 0.
+
+B<NOTE:> This parameter is always parsed as a hexadecimal value.
+
+=item Default Value
+
+None. This parameter is not mandatory. An available B<vslot> will be selected
+if this parameter is not specified.
+
+=back
+
+=back
+
+=head1 Other Parameters and Flags
+
+=over 4
+
+=item B<permissive>=I<BOOLEAN>
+
+=over 4
+
+=item Description
 
 By default pciback only allows PV guests to write "known safe" values
 into PCI configuration space, likewise QEMU (both qemu-xen and
@@ -46,33 +119,79 @@ more control over the device, which may have security or 
stability
 implications.  It is recommended to only enable this option for
 trusted VMs under administrator's control.
 
-=item B<msitranslate=BOOLEAN>
+=item Default Value
+
+0
+
+=back
+
+=item B<msitranslate>=I<BOOLEAN>
+
+=over 4
+
+=item Description
 
 Specifies that MSI-INTx translation should be turned on for the PCI
 device. When enabled, MSI-INTx translation will always enable MSI on
-the PCI device regardless of whether the guest uses INTx or MSI. Some
-device drivers, such as NVIDIA's, detect an inconsistency and do not
+the PCI device regardless of whether the guest uses INTx or MSI.
+
+=item Default Value
+
+Some device drivers, such as NVIDIA's, detect an inconsistency and do not
 function when this option is enabled. Therefore the default is false (0).
 
-=item B<seize=BOOLEAN>
+=back
+
+=item B<seize>=I<BOOLEAN>
+
+=over 4
+
+=item Description
 
-Tells B to automatically attempt to 

[PATCH v3 18/23] libxlu: introduce xlu_pci_parse_spec_string()

2020-11-23 Thread Paul Durrant
From: Paul Durrant 

This patch largely re-writes the code to parse a PCI_SPEC_STRING and enters
it via the newly introduced function. The new parser also deals with 'bdf'
and 'vslot' as non-positional parameters, as per the documentation in
xl-pci-configuration(5).

The existing xlu_pci_parse_bdf() function remains, but now strictly parses
BDF values. Some existing callers of xlu_pci_parse_bdf() are
modified to call xlu_pci_parse_spec_string() as per the documentation in xl(1).

NOTE: Usage text in xl_cmdtable.c and error messages are also modified
  appropriately.
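
To make the "strictly parses BDF values" behaviour concrete, here is a standalone sketch of a strict BDF parser in the same spirit as the parse_bdf() helper in this patch. It is not the libxlu code: struct bdf and bdf_parse() are hypothetical names, the '*' wildcard for the function is deliberately left out, and the sketch is not hardened against every input that the scanf "%x" conversion tolerates (leading whitespace, signs, "0x" prefixes).

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>

/* Hypothetical stand-in for libxl_pci_bdf. */
struct bdf {
    unsigned int domain, bus, dev, func;
};

/*
 * Parse "[DDDD:]BB:SS.F" with hexadecimal fields.
 * Returns 0 on success and -1 on any error, including trailing text.
 */
static int bdf_parse(struct bdf *out, const char *str)
{
    const char *p;
    unsigned int colons = 0;
    unsigned int domain = 0, bus, dev, func;
    int n = 0;

    assert(out != NULL && str != NULL);

    /* Count ':' to decide whether the optional domain is present. */
    for (p = str; isxdigit((unsigned char)*p) || *p == ':'; p++) {
        if (*p == ':')
            colons++;
    }

    if (colons == 1) {
        if (sscanf(str, "%x:%x.%x%n", &bus, &dev, &func, &n) != 3)
            return -1;
    } else if (colons == 2) {
        if (sscanf(str, "%x:%x:%x.%x%n",
                   &domain, &bus, &dev, &func, &n) != 4)
            return -1;
    } else {
        return -1;
    }

    /* A strict BDF has nothing after the function number. */
    if (str[n] != '\0')
        return -1;

    if (domain > 0xffff || bus > 0xff || dev > 0x1f || func > 7)
        return -1;

    out->domain = domain;
    out->bus = bus;
    out->dev = dev;
    out->func = func;
    return 0;
}
```

Under this sketch "36:00.0" and "0001:0a:1f.7" parse successfully, whereas "36:00.0@20" is rejected because the '@' suffix belongs to a PCI_SPEC_STRING, not to a bare BDF.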

Fixes: d25cc3ec93eb ("libxl: workaround gcc 10.2 maybe-uninitialized warning")
Signed-off-by: Paul Durrant 
---
Cc: Ian Jackson 
Cc: Wei Liu 
Cc: Anthony PERARD 
---
 tools/include/libxlutil.h|   8 +-
 tools/libs/util/libxlu_pci.c | 354 +++
 tools/xl/xl_cmdtable.c   |   4 +-
 tools/xl/xl_parse.c  |   4 +-
 tools/xl/xl_pci.c|  37 +++--
 5 files changed, 220 insertions(+), 187 deletions(-)

diff --git a/tools/include/libxlutil.h b/tools/include/libxlutil.h
index 92e35c5462..cdd6aab4f8 100644
--- a/tools/include/libxlutil.h
+++ b/tools/include/libxlutil.h
@@ -109,9 +109,15 @@ int xlu_disk_parse(XLU_Config *cfg, int nspecs, const char 
*const *specs,
*/
 
 /*
+ * PCI BDF
+ */
+int xlu_pci_parse_bdf(XLU_Config *cfg, libxl_pci_bdf *bdf, const char *str);
+
+/*
  * PCI specification parsing
  */
-int xlu_pci_parse_bdf(XLU_Config *cfg, libxl_device_pci *pcidev, const char 
*str);
+int xlu_pci_parse_spec_string(XLU_Config *cfg, libxl_device_pci *pci,
+  const char *str);
 
 /*
  * RDM parsing
diff --git a/tools/libs/util/libxlu_pci.c b/tools/libs/util/libxlu_pci.c
index 5c107f2642..a8b6ce5427 100644
--- a/tools/libs/util/libxlu_pci.c
+++ b/tools/libs/util/libxlu_pci.c
@@ -1,5 +1,7 @@
 #define _GNU_SOURCE
 
+#include <ctype.h>
+
 #include "libxlu_internal.h"
 #include "libxlu_disk_l.h"
 #include "libxlu_disk_i.h"
@@ -9,185 +11,213 @@
 #define XLU__PCI_ERR(_c, _x, _a...) \
 if((_c) && (_c)->report) fprintf((_c)->report, _x, ##_a)
 
-static int hex_convert(const char *str, unsigned int *val, unsigned int mask)
+static int parse_bdf(libxl_pci_bdf *bdfp, uint32_t *vfunc_maskp,
+ const char *str, const char **endp)
 {
-unsigned long ret;
-char *end;
-
-ret = strtoul(str, &end, 16);
-if ( end == str || *end != '\0' )
-return -1;
-if ( ret & ~mask )
-return -1;
-*val = (unsigned int)ret & mask;
+const char *ptr = str;
+unsigned int colons = 0;
+unsigned int domain, bus, dev, func;
+int n;
+
+/* Count occurrences of ':' to determine presence/absence of the 'domain' */
+while (isxdigit(*ptr) || *ptr == ':') {
+if (*ptr == ':')
+colons++;
+ptr++;
+}
+
+ptr = str;
+switch (colons) {
+case 1:
+domain = 0;
+if (sscanf(ptr, "%x:%x.%n", &bus, &dev, &n) != 2)
+return ERROR_INVAL;
+break;
+case 2:
+if (sscanf(ptr, "%x:%x:%x.%n", &domain, &bus, &dev, &n) != 3)
+return ERROR_INVAL;
+break;
+default:
+return ERROR_INVAL;
+}
+
+if (domain > 0xffff || bus > 0xff || dev > 0x1f)
+return ERROR_INVAL;
+
+ptr += n;
+if (*ptr == '*') {
+if (!vfunc_maskp)
+return ERROR_INVAL;
+*vfunc_maskp = LIBXL_PCI_FUNC_ALL;
+func = 0;
+ptr++;
+} else {
+if (sscanf(ptr, "%x%n", &func, &n) != 1)
+return ERROR_INVAL;
+if (func > 7)
+return ERROR_INVAL;
+if (vfunc_maskp)
+*vfunc_maskp = 1;
+ptr += n;
+}
+
+bdfp->domain = domain;
+bdfp->bus = bus;
+bdfp->dev = dev;
+bdfp->func = func;
+
+if (endp)
+*endp = ptr;
+
 return 0;
 }
 
-static int pci_struct_fill(libxl_device_pci *pci, unsigned int domain,
-   unsigned int bus, unsigned int dev,
-   unsigned int func, unsigned int vdevfn)
+static int parse_vslot(uint32_t *vdevfnp, const char *str, const char **endp)
 {
-pci->bdf.domain = domain;
-pci->bdf.bus = bus;
-pci->bdf.dev = dev;
-pci->bdf.func = func;
-pci->vdevfn = vdevfn;
+const char *ptr = str;
+unsigned int val;
+int n;
+
+if (sscanf(ptr, "%x%n", &val, &n) != 1)
+return ERROR_INVAL;
+
+if (val > 0x1f)
+return ERROR_INVAL;
+
+ptr += n;
+
+*vdevfnp = val << 3;
+
+if (endp)
+*endp = ptr;
+
 return 0;
 }
 
-#define STATE_DOMAIN0
-#define STATE_BUS   1
-#define STATE_DEV   2
-#define STATE_FUNC  3
-#define STATE_VSLOT 4
-#define STATE_OPTIONS_K 6
-#define STATE_OPTIONS_V 7
-#define STATE_TERMINAL  8
-#define STATE_TYPE  9
-#define STATE_RDM_STRATEGY  10
-#define STATE_RESERVE_POLICY11
-#define INVALID 0xffffffff
-int xlu_pci_parse_bdf(XLU_Config *cfg, libxl_device_pci *pci, const char *str)
+static int 

[PATCH v3 17/23] libxl: introduce 'libxl_pci_bdf' in the idl...

2020-11-23 Thread Paul Durrant
From: Paul Durrant 

... and use in 'libxl_device_pci'

This patch is preparatory work for restricting the type passed to functions
that only require BDF information, rather than passing a 'libxl_device_pci'
structure which is only partially filled. In this patch only the minimal
mechanical changes necessary to deal with the structural changes are made.
Subsequent patches will adjust the code to make better use of the new type.

Signed-off-by: Paul Durrant 
---
Cc: George Dunlap 
Cc: Nick Rosbrook 
Cc: Ian Jackson 
Cc: Wei Liu 
Cc: Anthony PERARD 
---
 tools/golang/xenlight/helpers.gen.go |  77 --
 tools/golang/xenlight/types.gen.go   |   8 +-
 tools/include/libxl.h|   6 ++
 tools/libs/light/libxl_dm.c  |   8 +-
 tools/libs/light/libxl_internal.h|   3 +-
 tools/libs/light/libxl_pci.c | 148 +--
 tools/libs/light/libxl_types.idl |  16 ++--
 tools/libs/util/libxlu_pci.c |   8 +-
 tools/xl/xl_pci.c|   6 +-
 tools/xl/xl_sxp.c|   4 +-
 10 files changed, 167 insertions(+), 117 deletions(-)

diff --git a/tools/golang/xenlight/helpers.gen.go 
b/tools/golang/xenlight/helpers.gen.go
index c8605994e7..b7230f693c 100644
--- a/tools/golang/xenlight/helpers.gen.go
+++ b/tools/golang/xenlight/helpers.gen.go
@@ -1999,6 +1999,41 @@ xc.colo_checkpoint_port = 
C.CString(x.ColoCheckpointPort)}
  return nil
  }
 
+// NewPciBdf returns an instance of PciBdf initialized with defaults.
+func NewPciBdf() (*PciBdf, error) {
+var (
+x PciBdf
+xc C.libxl_pci_bdf)
+
+C.libxl_pci_bdf_init(&xc)
+defer C.libxl_pci_bdf_dispose(&xc)
+
+if err := x.fromC(&xc); err != nil {
+return nil, err }
+
+return &x, nil}
+
+func (x *PciBdf) fromC(xc *C.libxl_pci_bdf) error {
+ x.Func = byte(xc._func)
+x.Dev = byte(xc.dev)
+x.Bus = byte(xc.bus)
+x.Domain = int(xc.domain)
+
+ return nil}
+
+func (x *PciBdf) toC(xc *C.libxl_pci_bdf) (err error){defer func(){
+if err != nil{
+C.libxl_pci_bdf_dispose(xc)}
+}()
+
+xc._func = C.uint8_t(x.Func)
+xc.dev = C.uint8_t(x.Dev)
+xc.bus = C.uint8_t(x.Bus)
+xc.domain = C.int(x.Domain)
+
+ return nil
+ }
+
 // NewDevicePci returns an instance of DevicePci initialized with defaults.
 func NewDevicePci() (*DevicePci, error) {
 var (
@@ -2014,10 +2049,9 @@ return nil, err }
 return &x, nil}
 
 func (x *DevicePci) fromC(xc *C.libxl_device_pci) error {
- x.Func = byte(xc._func)
-x.Dev = byte(xc.dev)
-x.Bus = byte(xc.bus)
-x.Domain = int(xc.domain)
+ if err := x.Bdf.fromC(&xc.bdf);err != nil {
+return fmt.Errorf("converting field Bdf: %v", err)
+}
 x.Vdevfn = uint32(xc.vdevfn)
 x.VfuncMask = uint32(xc.vfunc_mask)
 x.Msitranslate = bool(xc.msitranslate)
@@ -2033,10 +2067,9 @@ if err != nil{
 C.libxl_device_pci_dispose(xc)}
 }()
 
-xc._func = C.uint8_t(x.Func)
-xc.dev = C.uint8_t(x.Dev)
-xc.bus = C.uint8_t(x.Bus)
-xc.domain = C.int(x.Domain)
+if err := x.Bdf.toC(&xc.bdf); err != nil {
+return fmt.Errorf("converting field Bdf: %v", err)
+}
 xc.vdevfn = C.uint32_t(x.Vdevfn)
 xc.vfunc_mask = C.uint32_t(x.VfuncMask)
 xc.msitranslate = C.bool(x.Msitranslate)
@@ -2766,13 +2799,13 @@ if err := x.Nics[i].fromC(&v); err != nil {
 return fmt.Errorf("converting field Nics: %v", err) }
 }
 }
-x.Pcidevs = nil
-if n := int(xc.num_pcidevs); n > 0 {
-cPcidevs := (*[1<<28]C.libxl_device_pci)(unsafe.Pointer(xc.pcidevs))[:n:n]
-x.Pcidevs = make([]DevicePci, n)
-for i, v := range cPcidevs {
-if err := x.Pcidevs[i].fromC(&v); err != nil {
-return fmt.Errorf("converting field Pcidevs: %v", err) }
+x.Pcis = nil
+if n := int(xc.num_pcis); n > 0 {
+cPcis := (*[1<<28]C.libxl_device_pci)(unsafe.Pointer(xc.pcis))[:n:n]
+x.Pcis = make([]DevicePci, n)
+for i, v := range cPcis {
+if err := x.Pcis[i].fromC(&v); err != nil {
+return fmt.Errorf("converting field Pcis: %v", err) }
 }
 }
 x.Rdms = nil
@@ -2922,13 +2955,13 @@ return fmt.Errorf("converting field Nics: %v", err)
 }
 }
 }
-if numPcidevs := len(x.Pcidevs); numPcidevs > 0 {
-xc.pcidevs = 
(*C.libxl_device_pci)(C.malloc(C.ulong(numPcidevs)*C.sizeof_libxl_device_pci))
-xc.num_pcidevs = C.int(numPcidevs)
-cPcidevs := 
(*[1<<28]C.libxl_device_pci)(unsafe.Pointer(xc.pcidevs))[:numPcidevs:numPcidevs]
-for i,v := range x.Pcidevs {
-if err := v.toC(&cPcidevs[i]); err != nil {
-return fmt.Errorf("converting field Pcidevs: %v", err)
+if numPcis := len(x.Pcis); numPcis > 0 {
+xc.pcis = 
(*C.libxl_device_pci)(C.malloc(C.ulong(numPcis)*C.sizeof_libxl_device_pci))
+xc.num_pcis = C.int(numPcis)
+cPcis := 
(*[1<<28]C.libxl_device_pci)(unsafe.Pointer(xc.pcis))[:numPcis:numPcis]
+for i,v := range x.Pcis {
+if err := v.toC(&cPcis[i]); err != nil {
+return fmt.Errorf("converting field Pcis: %v", err)
 }
 }
 }
diff --git a/tools/golang/xenlight/types.gen.go 
b/tools/golang/xenlight/types.gen.go
index b4c5df0f2c..bc62ae8ce9 100644
--- a/tools/golang/xenlight/types.gen.go
+++ b/tools/golang/xenlight/types.gen.go
@@ -707,11 +707,15 @@ ColoCheckpointHost string
 ColoCheckpointPort string
 }
 
-type DevicePci struct {
+type PciBdf struct {
 

[PATCH v3 14/23] docs/man: extract documentation of PCI_SPEC_STRING from the xl.cfg manpage...

2020-11-23 Thread Paul Durrant
From: Paul Durrant 

... and put it into a new xl-pci-configuration(5) manpage, akin to the
xl-network-configuration(5) and xl-disk-configuration(5) manpages.

This patch moves the content of the section verbatim. A subsequent patch
will improve the documentation, once it is in its new location.

Signed-off-by: Paul Durrant 
---
Cc: Ian Jackson 
Cc: Wei Liu 
---
 docs/man/xl-pci-configuration.5.pod | 78 +
 docs/man/xl.cfg.5.pod.in| 68 +---
 2 files changed, 79 insertions(+), 67 deletions(-)
 create mode 100644 docs/man/xl-pci-configuration.5.pod

diff --git a/docs/man/xl-pci-configuration.5.pod 
b/docs/man/xl-pci-configuration.5.pod
new file mode 100644
index 0000000000..72a27bd95d
--- /dev/null
+++ b/docs/man/xl-pci-configuration.5.pod
@@ -0,0 +1,78 @@
+=encoding utf8
+
+=head1 NAME
+
+xl-pci-configuration - XL PCI Configuration Syntax
+
+=head1 SYNTAX
+
+This document specifies the format for B<PCI_SPEC_STRING> which is used by
+the L<xl.cfg(5)> pci configuration option, and related L<xl(1)> commands.
+
+Each B<PCI_SPEC_STRING> has the form of
+B<[DDDD:]BB:DD.F[@VSLOT],KEY=VALUE,KEY=VALUE,...> where:
+
+=over 4
+
+=item B<[DDDD:]BB:DD.F>
+
+Identifies the PCI device from the host perspective in the domain
+(B<DDDD>), Bus (B<BB>), Device (B<DD>) and Function (B<F>) syntax. This is
+the same scheme as used in the output of B<lspci(1)> for the device in
+question.
+
+Note: by default B<lspci(1)> will omit the domain (B<DDDD>) if it
+is zero and it is optional here also. You may specify the function
+(B<F>) as B<*> to indicate all functions.
+
+=item B<@VSLOT>
+
+Specifies the virtual slot where the guest will see this
+device. This is equivalent to the B<DD> which the guest sees. In a
+guest B<DDDD> and B<BB> are C<0000:00>.
+
+=item B<permissive=BOOLEAN>
+
+By default pciback only allows PV guests to write "known safe" values
+into PCI configuration space, likewise QEMU (both qemu-xen and
+qemu-xen-traditional) imposes the same constraint on HVM guests.
+However, many devices require writes to other areas of the configuration space
+in order to operate properly.  This option tells the backend (pciback or QEMU)
+to allow all writes to the PCI configuration space of this device by this
+domain.
+
+B<This option should be enabled with caution:> it gives the guest much
+more control over the device, which may have security or stability
+implications.  It is recommended to only enable this option for
+trusted VMs under administrator's control.
+
+=item B<msitranslate=BOOLEAN>
+
+Specifies that MSI-INTx translation should be turned on for the PCI
+device. When enabled, MSI-INTx translation will always enable MSI on
+the PCI device regardless of whether the guest uses INTx or MSI. Some
+device drivers, such as NVIDIA's, detect an inconsistency and do not
+function when this option is enabled. Therefore the default is false (0).
+
+=item B<seize=BOOLEAN>
+
+Tells B<xl> to automatically attempt to re-assign a device to
+pciback if it is not already assigned.
+
+B<WARNING:> If you set this option, B<xl> will gladly re-assign a critical
+system device, such as a network or a disk controller being used by
+dom0 without confirmation.  Please use with care.
+
+=item B<power_mgmt=BOOLEAN>
+
+B<(HVM only)> Specifies that the VM should be able to program the
+D0-D3hot power management states for the PCI device. The default is false (0).
+
+=item B<rdm_policy=STRING>
+
+B<(HVM/x86 only)> This is the same as the policy setting inside the B<rdm>
+option but just specific to a given device. The default is "relaxed".
+
+Note: this would override global B<rdm> option.
+
+=back
diff --git a/docs/man/xl.cfg.5.pod.in b/docs/man/xl.cfg.5.pod.in
index 0532739c1f..b00644e852 100644
--- a/docs/man/xl.cfg.5.pod.in
+++ b/docs/man/xl.cfg.5.pod.in
@@ -1101,73 +1101,7 @@ option is valid only when the B option is 
specified.
 =item B
 
 Specifies the host PCI devices to passthrough to this guest.
-Each B<PCI_SPEC_STRING> has the form of
-B<[DDDD:]BB:DD.F[@VSLOT],KEY=VALUE,KEY=VALUE,...> where:
-
-=over 4
-
-=item B<[DDDD:]BB:DD.F>
-
-Identifies the PCI device from the host perspective in the domain
-(B<DDDD>), Bus (B<BB>), Device (B<DD>) and Function (B<F>) syntax. This is
-the same scheme as used in the output of B<lspci(1)> for the device in
-question.
-
-Note: by default B<lspci(1)> will omit the domain (B<DDDD>) if it
-is zero and it is optional here also. You may specify the function
-(B<F>) as B<*> to indicate all functions.
-
-=item B<@VSLOT>
-
-Specifies the virtual slot where the guest will see this
-device. This is equivalent to the B<DD> which the guest sees. In a
-guest B<DDDD> and B<BB> are C<0000:00>.
-
-=item B<permissive=BOOLEAN>
-
-By default pciback only allows PV guests to write "known safe" values
-into PCI configuration space, likewise QEMU (both qemu-xen and
-qemu-xen-traditional) imposes the same constraint on HVM guests.
-However, many devices require writes to other areas of the configuration space
-in order to operate properly.  This option tells the backend (pciback or QEMU)
-to allow all writes to the PCI configuration space of this device by this
-domain.
-
-B<This option should be enabled with caution:> it gives the guest much
-more control over the device, which may have security or stability
-implications.  It is recommended to only enable this 

[PATCH v3 23/23] xl / libxl: support 'xl pci-attach/detach' by name

2020-11-23 Thread Paul Durrant
From: Paul Durrant 

This patch adds a 'name' field into the idl for 'libxl_device_pci' and
libxlu_pci_parse_spec_string() is modified to parse the new 'name'
parameter of PCI_SPEC_STRING detailed in the updated documentation in
xl-pci-configuration(5).

If the 'name' field is non-NULL then both libxl_device_pci_add() and
libxl_device_pci_remove() will use it to look up the device BDF in
the list of assignable devices.
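
The name-to-BDF lookup described above can be pictured with a small in-memory table. This is only a sketch of the idea: the real libxl_device_pci_assignable_name2bdf() takes a libxl_ctx and consults state kept by libxl, whereas the table, struct assignable and name2bdf() below are hypothetical stand-ins.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for libxl_pci_bdf. */
struct bdf {
    unsigned int domain, bus, dev, func;
};

/* One assignable device; 'name' is NULL if the device was never named. */
struct assignable {
    const char *name;
    struct bdf bdf;
};

/*
 * Return the BDF registered under 'name', or NULL if no assignable
 * device carries that name. Unnamed entries never match.
 */
static const struct bdf *name2bdf(const struct assignable *tbl, size_t num,
                                  const char *name)
{
    size_t i;

    assert(name != NULL);

    for (i = 0; i < num; i++) {
        if (tbl[i].name && strcmp(tbl[i].name, name) == 0)
            return &tbl[i].bdf;
    }

    return NULL;
}
```

A NULL result corresponds to the failure path in the patch, where an unknown name makes the add or remove operation fail instead of falling back to whatever BDF the structure happens to contain.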

Signed-off-by: Paul Durrant 
---
Cc: Ian Jackson 
Cc: Wei Liu 
Cc: Anthony PERARD 
---
 tools/include/libxl.h|  6 
 tools/libs/light/libxl_pci.c | 67 +---
 tools/libs/light/libxl_types.idl |  1 +
 tools/libs/util/libxlu_pci.c |  7 -
 4 files changed, 75 insertions(+), 6 deletions(-)

diff --git a/tools/include/libxl.h b/tools/include/libxl.h
index 4025d3a3d4..5b55a20155 100644
--- a/tools/include/libxl.h
+++ b/tools/include/libxl.h
@@ -485,6 +485,12 @@
 #define LIBXL_HAVE_PCI_ASSIGNABLE_NAME 1
 
 /*
+ * LIBXL_HAVE_DEVICE_PCI_NAME indicates that the 'name' field of
+ * libxl_device_pci is defined.
+ */
+#define LIBXL_HAVE_DEVICE_PCI_NAME 1
+
+/*
  * libxl ABI compatibility
  *
  * The only guarantee which libxl makes regarding ABI compatibility
diff --git a/tools/libs/light/libxl_pci.c b/tools/libs/light/libxl_pci.c
index a1c9ae0d5b..986fb11d5c 100644
--- a/tools/libs/light/libxl_pci.c
+++ b/tools/libs/light/libxl_pci.c
@@ -60,6 +60,10 @@ static void libxl_create_pci_backend_device(libxl__gc *gc,
 int num,
 const libxl_device_pci *pci)
 {
+if (pci->name) {
+flexarray_append(back, GCSPRINTF("name-%d", num));
+flexarray_append(back, GCSPRINTF("%s", pci->name));
+}
 flexarray_append(back, GCSPRINTF("key-%d", num));
 flexarray_append(back, GCSPRINTF(PCI_BDF, pci->bdf.domain, pci->bdf.bus, 
pci->bdf.dev, pci->bdf.func));
 flexarray_append(back, GCSPRINTF("dev-%d", num));
@@ -284,6 +288,7 @@ retry_transaction:
 
 retry_transaction2:
 t = xs_transaction_start(ctx->xsh);
+xs_rm(ctx->xsh, t, GCSPRINTF("%s/name-%d", be_path, i));
 xs_rm(ctx->xsh, t, GCSPRINTF("%s/state-%d", be_path, i));
 xs_rm(ctx->xsh, t, GCSPRINTF("%s/key-%d", be_path, i));
 xs_rm(ctx->xsh, t, GCSPRINTF("%s/dev-%d", be_path, i));
@@ -322,6 +327,12 @@ retry_transaction2:
 xs_write(ctx->xsh, t, GCSPRINTF("%s/vdevfn-%d", be_path, j - 1), 
tmp, strlen(tmp));
 xs_rm(ctx->xsh, t, tmppath);
 }
+tmppath = GCSPRINTF("%s/name-%d", be_path, j);
+tmp = libxl__xs_read(gc, t, tmppath);
+if (tmp) {
+xs_write(ctx->xsh, t, GCSPRINTF("%s/name-%d", be_path, j - 1), 
tmp, strlen(tmp));
+xs_rm(ctx->xsh, t, tmppath);
+}
 }
 if (!xs_transaction_end(ctx->xsh, t, 0))
 if (errno == EAGAIN)
@@ -1619,6 +1630,23 @@ void libxl__device_pci_add(libxl__egc *egc, uint32_t 
domid,
 pas->starting = starting;
 pas->callback = device_pci_add_stubdom_done;
 
+if (pci->name) {
+libxl_pci_bdf *pcibdf =
+libxl_device_pci_assignable_name2bdf(CTX, pci->name);
+
+if (!pcibdf) {
+rc = ERROR_FAIL;
+goto out;
+}
+
+LOGD(DETAIL, domid, "'%s' -> %04x:%02x:%02x.%u", pci->name,
+ pcibdf->domain, pcibdf->bus, pcibdf->dev, pcibdf->func);
+
+libxl_pci_bdf_copy(CTX, &pci->bdf, pcibdf);
+libxl_pci_bdf_dispose(pcibdf);
+free(pcibdf);
+}
+
 if (libxl__domain_type(gc, domid) == LIBXL_DOMAIN_TYPE_HVM) {
 rc = xc_test_assign_device(ctx->xch, domid,
pci_encode_bdf(&pci->bdf));
@@ -1767,11 +1795,19 @@ static void device_pci_add_done(libxl__egc *egc,
 libxl_device_pci *pci = &pas->pci;
 
 if (rc) {
-LOGD(ERROR, domid,
- "libxl__device_pci_add  failed for "
- "PCI device %x:%x:%x.%x (rc %d)",
- pci->bdf.domain, pci->bdf.bus, pci->bdf.dev, pci->bdf.func,
- rc);
+if (pci->name) {
+LOGD(ERROR, domid,
+ "libxl__device_pci_add failed for "
+ "PCI device '%s' (rc %d)",
+ pci->name,
+ rc);
+} else {
+LOGD(ERROR, domid,
+ "libxl__device_pci_add failed for "
+ "PCI device %x:%x:%x.%x (rc %d)",
+ pci->bdf.domain, pci->bdf.bus, pci->bdf.dev, pci->bdf.func,
+ rc);
+}
 pci_info_xs_remove(gc, &pci->bdf, "domid");
 }
 libxl_device_pci_dispose(pci);
@@ -2288,6 +2324,23 @@ static void libxl__device_pci_remove_common(libxl__egc 
*egc,
 libxl__ev_time_init(&prs->timeout);
 libxl__ev_time_init(&prs->retry_timer);
 
+if (pci->name) {
+libxl_pci_bdf *pcibdf =
+libxl_device_pci_assignable_name2bdf(CTX, pci->name);
+
+if (!pcibdf) {
+rc = ERROR_FAIL;
+goto out;
+

[PATCH v3 22/23] docs/man: modify xl-pci-configuration(5) to add 'name' field to PCI_SPEC_STRING

2020-11-23 Thread Paul Durrant
From: Paul Durrant 

Since assignable devices can be named, a subsequent patch will support use
of a PCI_SPEC_STRING containing a 'name' parameter instead of a 'bdf'. In
this case the name will be used to look up the 'bdf' in the list of assignable
(or assigned) devices.

Signed-off-by: Paul Durrant 
---
Cc: Ian Jackson 
Cc: Wei Liu 
---
 docs/man/xl-pci-configuration.5.pod | 25 +++--
 1 file changed, 23 insertions(+), 2 deletions(-)

diff --git a/docs/man/xl-pci-configuration.5.pod 
b/docs/man/xl-pci-configuration.5.pod
index 4dd73bc498..db3360307c 100644
--- a/docs/man/xl-pci-configuration.5.pod
+++ b/docs/man/xl-pci-configuration.5.pod
@@ -51,7 +51,7 @@ is not specified, or if it is specified with an empty value 
(whether
 positionally or explicitly).
 
 B: In context of B (see L), parameters other than
-B will be ignored.
+B or B will be ignored.
 
 =head1 Positional Parameters
 
@@ -70,7 +70,11 @@ B<*> to indicate all functions of a multi-function device.
 
 =item Default Value
 
-None. This parameter is mandatory as it identifies the device.
+None. This parameter is mandatory in its positional form. As a non-positional
+parameter it is also mandatory unless a B parameter is present, in
+which case B must not be present since the B will be used to find
+the B in the list of assignable devices. See L for more information
+on naming assignable devices.
 
 =back
 
@@ -194,4 +198,21 @@ B: This overrides the global B option.
 
 =back
 
+=item B=I
+
+=over 4
+
+=item Description
+
+This is the name given when the B was made assignable. See L for
+more information on naming assignable devices.
+
+=item Default Value
+
+None. This parameter must not be present if a B parameter is present.
+If a B parameter is not present then B is mandatory as it is
+required to look up the B in the list of assignable devices.
+
+=back
+
 =back
-- 
2.11.0
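
For illustration, the rule documented in this patch (exactly one of 'bdf' and
'name' identifies the device) might be checked along the following lines. This
is only a sketch: 'struct pci_spec' and pci_spec_check() are hypothetical and
are not part of libxl or libxlu.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of a parsed PCI_SPEC_STRING. */
struct pci_spec {
    bool has_bdf;       /* a 'bdf' parameter was supplied */
    const char *name;   /* 'name' parameter, or NULL if absent */
};

/* Return 0 if the device is identified in exactly one way, -1 otherwise. */
static int pci_spec_check(const struct pci_spec *spec)
{
    assert(spec != NULL);

    bool has_name = spec->name != NULL;

    if (spec->has_bdf && has_name)
        return -1;  /* 'bdf' and 'name' must not both be present */
    if (!spec->has_bdf && !has_name)
        return -1;  /* one of the two is mandatory */
    return 0;
}
```

A spec carrying only a name passes the check here; looking that name up in the
list of assignable devices would be a separate step.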




[PATCH v3 20/23] docs/man: modify xl(1) in preparation for naming of assignable devices

2020-11-23 Thread Paul Durrant
From: Paul Durrant 

A subsequent patch will introduce code to allow a name to be specified to
'xl pci-assignable-add' such that the assignable device may be referred to
by that name in subsequent operations.

Signed-off-by: Paul Durrant 
---
Cc: Ian Jackson 
Cc: Wei Liu 
---
 docs/man/xl.1.pod.in | 19 ---
 1 file changed, 12 insertions(+), 7 deletions(-)

diff --git a/docs/man/xl.1.pod.in b/docs/man/xl.1.pod.in
index c5fbce3b5c..0822a58428 100644
--- a/docs/man/xl.1.pod.in
+++ b/docs/man/xl.1.pod.in
@@ -1595,19 +1595,23 @@ List virtual network interfaces for a domain.
 
 =over 4
 
-=item B
+=item B [I<-n>]
 
 List all the B of assignable PCI devices. See
-L for more information.
+L for more information. If the -n option is
+specified then any name supplied when the device was made assignable
+will also be displayed.
 
 These are devices in the system which are configured to be
 available for passthrough and are bound to a suitable PCI
 backend driver in domain 0 rather than a real driver.
 
-=item B I
+=item B [I<-n NAME>] I
 
 Make the device at B assignable to guests. See
-L for more information.
+L for more information. If the -n option is
+supplied then the assignable device entry will be named with the
+given B.
 
 This will bind the device to the pciback driver and assign it to the
 "quarantine domain".  If it is already bound to a driver, it will
@@ -1622,10 +1626,11 @@ not to do this on a device critical to domain 0's 
operation, such as
 storage controllers, network interfaces, or GPUs that are currently
 being used.
 
-=item B [I<-r>] I
+=item B [I<-r>] I|I
 
-Make the device at B not assignable to guests. See
-L for more information.
+Make a device non-assignable to guests. The device may be identified
+either by its B or the B supplied when the device was made
+assignable. See L for more information.
 
 This will at least unbind the device from pciback, and
 re-assign it from the "quarantine domain" back to domain 0.  If the -r
-- 
2.11.0




[PATCH v3 16/23] docs/man: fix xl(1) documentation for 'pci' operations

2020-11-23 Thread Paul Durrant
From: Paul Durrant 

Currently the documentation completely fails to mention the existence of
PCI_SPEC_STRING. This patch tidies things up, specifically clarifying that
'pci-assignable-add/remove' take <BDF> arguments whereas 'pci-attach/detach'
take <PCI_SPEC_STRING> arguments (which will be enforced in a subsequent
patch).

Signed-off-by: Paul Durrant 
---
Cc: Ian Jackson 
Cc: Wei Liu 
---
 docs/man/xl.1.pod.in | 28 +---
 1 file changed, 17 insertions(+), 11 deletions(-)

diff --git a/docs/man/xl.1.pod.in b/docs/man/xl.1.pod.in
index f92bacfa72..c5fbce3b5c 100644
--- a/docs/man/xl.1.pod.in
+++ b/docs/man/xl.1.pod.in
@@ -1597,14 +1597,18 @@ List virtual network interfaces for a domain.
 
 =item B
 
-List all the assignable PCI devices.
+List all the B of assignable PCI devices. See
+L for more information.
+
 These are devices in the system which are configured to be
 available for passthrough and are bound to a suitable PCI
 backend driver in domain 0 rather than a real driver.
 
 =item B I
 
-Make the device at PCI Bus/Device/Function BDF assignable to guests.
+Make the device at B assignable to guests. See
+L for more information.
+
 This will bind the device to the pciback driver and assign it to the
 "quarantine domain".  If it is already bound to a driver, it will
 first be unbound, and the original driver stored so that it can be
@@ -1620,8 +1624,10 @@ being used.
 
 =item B [I<-r>] I
 
-Make the device at PCI Bus/Device/Function BDF not assignable to
-guests.  This will at least unbind the device from pciback, and
+Make the device at B not assignable to guests. See
+L for more information.
+
+This will at least unbind the device from pciback, and
 re-assign it from the "quarantine domain" back to domain 0.  If the -r
 option is specified, it will also attempt to re-bind the device to its
 original driver, making it usable by Domain 0 again.  If the device is
@@ -1637,15 +1643,15 @@ As always, this should only be done if you trust the 
guest, or are
 confident that the particular device you're re-assigning to dom0 will
 cancel all in-flight DMA on FLR.
 
-=item B I I
+=item B I I
 
-Hot-plug a new pass-through pci device to the specified domain.
-B is the PCI Bus/Device/Function of the physical device to pass-through.
+Hot-plug a new pass-through pci device to the specified domain. See
+L for more information.
 
-=item B [I] I I
+=item B [I] I I
 
-Hot-unplug a previously assigned pci device from a domain. B is the PCI
-Bus/Device/Function of the physical device to be removed from the guest domain.
+Hot-unplug a pci device that was previously passed through to a domain. See
+L for more information.
 
 B
 
@@ -1660,7 +1666,7 @@ even without guest domain's collaboration.
 
 =item B I
 
-List pass-through pci devices for a domain.
+List the B of pci devices passed through to a domain.
 
 =back
 
-- 
2.11.0




[PATCH v3 10/23] libxl: remove get_all_assigned_devices() from libxl_pci.c

2020-11-23 Thread Paul Durrant
From: Paul Durrant 

Use of this function is a very inefficient way to check whether a device
has already been assigned.

This patch adds code that saves the domain id in xenstore at the point of
assignment, and removes it again when the device is de-assigned (or the
domain is destroyed). It is then straightforward to check whether a device
has been assigned by checking whether a device has a saved domain id.

NOTE: To facilitate the xenstore check it is necessary to move the
  pci_info_xs_read() earlier in libxl_pci.c. To keep related functions
  together, the rest of the pci_info_xs_XXX() functions are moved too.
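
As a rough illustration of the scheme described above, the sketch below uses a
toy in-memory table in place of xenstore. All function names here are
hypothetical, and the BDF is passed as an already-formatted string for
simplicity; only the idea (a device is assigned if and only if it has a saved
'domid' node) is taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Toy stand-in for xenstore: a tiny fixed-size key/value table. */
#define MAX_NODES 16
#define MAX_LEN   64

static struct {
    char key[MAX_LEN];
    char val[MAX_LEN];
    bool used;
} nodes[MAX_NODES];

static const char *store_read(const char *key)
{
    for (int i = 0; i < MAX_NODES; i++)
        if (nodes[i].used && !strcmp(nodes[i].key, key))
            return nodes[i].val;
    return NULL;
}

static void store_rm(const char *key)
{
    for (int i = 0; i < MAX_NODES; i++)
        if (nodes[i].used && !strcmp(nodes[i].key, key))
            nodes[i].used = false;
}

static int store_write(const char *key, const char *val)
{
    assert(key != NULL && val != NULL);

    store_rm(key);  /* overwrite any existing entry */
    for (int i = 0; i < MAX_NODES; i++) {
        if (!nodes[i].used) {
            snprintf(nodes[i].key, MAX_LEN, "%s", key);
            snprintf(nodes[i].val, MAX_LEN, "%s", val);
            nodes[i].used = true;
            return 0;
        }
    }
    return -1;  /* table full */
}

/* Build "/libxl/pci/<bdf>/domid" for a BDF given as a string. */
static void domid_path(char *buf, size_t len, const char *bdf)
{
    snprintf(buf, len, "/libxl/pci/%s/domid", bdf);
}

/* Record the owning domain at the point of assignment. */
static int pci_mark_assigned(const char *bdf, unsigned int domid)
{
    char path[MAX_LEN], val[16];

    domid_path(path, sizeof(path), bdf);
    snprintf(val, sizeof(val), "%u", domid);
    return store_write(path, val);
}

/* Drop the record when the device is de-assigned. */
static void pci_mark_unassigned(const char *bdf)
{
    char path[MAX_LEN];

    domid_path(path, sizeof(path), bdf);
    store_rm(path);
}

/* A device is assigned if and only if it has a saved domain id. */
static bool pci_is_assigned(const char *bdf)
{
    char path[MAX_LEN];

    domid_path(path, sizeof(path), bdf);
    return store_read(path) != NULL;
}
```

The check is a single lookup keyed by the device, rather than a walk over every
domain's backend area as get_all_assigned_devices() did.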

Signed-off-by: Paul Durrant 
---
Cc: Ian Jackson 
Cc: Wei Liu 
---
 tools/libs/light/libxl_pci.c | 149 ---
 1 file changed, 55 insertions(+), 94 deletions(-)

diff --git a/tools/libs/light/libxl_pci.c b/tools/libs/light/libxl_pci.c
index ec101f255f..d3c7a547c3 100644
--- a/tools/libs/light/libxl_pci.c
+++ b/tools/libs/light/libxl_pci.c
@@ -336,50 +336,6 @@ retry_transaction2:
 return 0;
 }
 
-static int get_all_assigned_devices(libxl__gc *gc, libxl_device_pci **list, 
int *num)
-{
-char **domlist;
-unsigned int nd = 0, i;
-
-*list = NULL;
-*num = 0;
-
-domlist = libxl__xs_directory(gc, XBT_NULL, "/local/domain", &nd);
-for(i = 0; i < nd; i++) {
-char *path, *num_devs;
-
-path = GCSPRINTF("/local/domain/0/backend/%s/%s/0/num_devs",
- libxl__device_kind_to_string(LIBXL__DEVICE_KIND_PCI),
- domlist[i]);
-num_devs = libxl__xs_read(gc, XBT_NULL, path);
-if ( num_devs ) {
-int ndev = atoi(num_devs), j;
-char *devpath, *bdf;
-
-for(j = 0; j < ndev; j++) {
-devpath = GCSPRINTF("/local/domain/0/backend/%s/%s/0/dev-%u",
-
libxl__device_kind_to_string(LIBXL__DEVICE_KIND_PCI),
-domlist[i], j);
-bdf = libxl__xs_read(gc, XBT_NULL, devpath);
-if ( bdf ) {
-unsigned dom, bus, dev, func;
-if ( sscanf(bdf, PCI_BDF, &dom, &bus, &dev, &func) != 4 )
-continue;
-
-*list = realloc(*list, sizeof(libxl_device_pci) * ((*num) 
+ 1));
-if (*list == NULL)
-return ERROR_NOMEM;
-pci_struct_fill(*list + *num, dom, bus, dev, func, 0);
-(*num)++;
-}
-}
-}
-}
-libxl__ptr_add(gc, *list);
-
-return 0;
-}
-
 static int is_pci_in_array(libxl_device_pci *assigned, int num_assigned,
int dom, int bus, int dev, int func)
 {
@@ -427,19 +383,58 @@ static int sysfs_write_bdf(libxl__gc *gc, const char * 
sysfs_path,
 return 0;
 }
 
+#define PCI_INFO_PATH "/libxl/pci"
+
+static char *pci_info_xs_path(libxl__gc *gc, libxl_device_pci *pci,
+  const char *node)
+{
+return node ?
+GCSPRINTF(PCI_INFO_PATH"/"PCI_BDF_XSPATH"/%s",
+  pci->domain, pci->bus, pci->dev, pci->func,
+  node) :
+GCSPRINTF(PCI_INFO_PATH"/"PCI_BDF_XSPATH,
+  pci->domain, pci->bus, pci->dev, pci->func);
+}
+
+
+static int pci_info_xs_write(libxl__gc *gc, libxl_device_pci *pci,
+  const char *node, const char *val)
+{
+char *path = pci_info_xs_path(gc, pci, node);
+int rc = libxl__xs_printf(gc, XBT_NULL, path, "%s", val);
+
+if (rc) LOGE(WARN, "Write of %s to node %s failed.", val, path);
+
+return rc;
+}
+
+static char *pci_info_xs_read(libxl__gc *gc, libxl_device_pci *pci,
+  const char *node)
+{
+char *path = pci_info_xs_path(gc, pci, node);
+
+return libxl__xs_read(gc, XBT_NULL, path);
+}
+
+static void pci_info_xs_remove(libxl__gc *gc, libxl_device_pci *pci,
+   const char *node)
+{
+char *path = pci_info_xs_path(gc, pci, node);
+libxl_ctx *ctx = libxl__gc_owner(gc);
+
+/* Remove the xenstore entry */
+xs_rm(ctx->xsh, XBT_NULL, path);
+}
+
 libxl_device_pci *libxl_device_pci_assignable_list(libxl_ctx *ctx, int *num)
 {
 GC_INIT(ctx);
-libxl_device_pci *pcis = NULL, *new, *assigned;
+libxl_device_pci *pcis = NULL, *new;
 struct dirent *de;
 DIR *dir;
-int r, num_assigned;
 
 *num = 0;
 
-r = get_all_assigned_devices(gc, &assigned, &num_assigned);
-if (r) goto out;
-
 dir = opendir(SYSFS_PCIBACK_DRIVER);
 if (NULL == dir) {
 if (errno == ENOENT) {
@@ -455,9 +450,6 @@ libxl_device_pci 
*libxl_device_pci_assignable_list(libxl_ctx *ctx, int *num)
 if (sscanf(de->d_name, PCI_BDF, &dom, &bus, &dev, &func) != 4)
 continue;
 
-if (is_pci_in_array(assigned, num_assigned, dom, bus, dev, func))
-continue;
-
 new = realloc(pcis, ((*num) + 1) * sizeof(*new));
 if 

[PATCH v3 11/23] libxl: make sure callers of libxl_device_pci_list() free the list after use

2020-11-23 Thread Paul Durrant
From: Paul Durrant 

A previous patch introduced libxl_device_pci_list_free() which should be used
by callers of libxl_device_pci_list() to properly dispose of the exported
'libxl_device_pci' types and then free the memory holding them. Whilst all
current callers do ensure the memory is freed, only the code in xl's
pcilist() function actually calls libxl_device_pci_dispose(). As it stands
this laxity does not lead to any memory leaks, but the simple addition of
e.g. a 'string' into the idl definition of 'libxl_device_pci' would lead
to leaks.

This patch makes sure all callers of libxl_device_pci_list() can call
libxl_device_pci_list_free() by keeping copies of 'libxl_device_pci'
structures inline in 'pci_add_state' and 'pci_remove_state' (and also making
sure these are properly disposed at the end of the operations) rather
than keeping pointers to the structures returned by libxl_device_pci_list().
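
The ownership pattern described above can be sketched in isolation as follows.
The types and helpers ('struct dev', dev_copy(), dev_dispose(),
dev_list_free(), 'struct op_state') are hypothetical stand-ins for the libxl
equivalents, assuming a device type that has gained one heap-allocated member.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical device type with one heap-allocated member. */
struct dev {
    unsigned int bus;
    char *name;     /* owned by the struct; may be NULL */
};

/* Release any heap-allocated members of a single device. */
static void dev_dispose(struct dev *d)
{
    free(d->name);
    d->name = NULL;
}

/* Deep copy: 'dst' gets its own copy of any heap-allocated members. */
static int dev_copy(struct dev *dst, const struct dev *src)
{
    assert(dst != NULL && src != NULL);

    dst->bus = src->bus;
    dst->name = NULL;
    if (src->name) {
        size_t len = strlen(src->name) + 1;

        dst->name = malloc(len);
        if (!dst->name)
            return -1;
        memcpy(dst->name, src->name, len);
    }
    return 0;
}

/* Dispose of every element, then free the array itself. */
static void dev_list_free(struct dev *list, int num)
{
    for (int i = 0; i < num; i++)
        dev_dispose(&list[i]);
    free(list);
}

/* Operation state holds the device inline, not a pointer into a list. */
struct op_state {
    struct dev dev;
};
```

Because the state owns a deep copy, the caller can free the whole list as soon
as the lookup is done, and the copy is disposed of when the operation ends.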

Signed-off-by: Paul Durrant 
---
Cc: Ian Jackson 
Cc: Wei Liu 
Cc: Anthony PERARD 
---
 tools/libs/light/libxl_pci.c | 68 
 tools/xl/xl_pci.c|  3 +-
 2 files changed, 38 insertions(+), 33 deletions(-)

diff --git a/tools/libs/light/libxl_pci.c b/tools/libs/light/libxl_pci.c
index d3c7a547c3..0f41939d1f 100644
--- a/tools/libs/light/libxl_pci.c
+++ b/tools/libs/light/libxl_pci.c
@@ -1025,7 +1025,7 @@ typedef struct pci_add_state {
 libxl__xswait_state xswait;
 libxl__ev_qmp qmp;
 libxl__ev_time timeout;
-libxl_device_pci *pci;
+libxl_device_pci pci;
 libxl_domid pci_domid;
 } pci_add_state;
 
@@ -1097,7 +1097,7 @@ static void pci_add_qemu_trad_watch_state_cb(libxl__egc 
*egc,
 
 /* Convenience aliases */
 libxl_domid domid = pas->domid;
-libxl_device_pci *pci = pas->pci;
+libxl_device_pci *pci = &pas->pci;
 
 rc = check_qemu_running(gc, domid, xswa, rc, state);
 if (rc == ERROR_NOT_READY)
@@ -1118,7 +1118,7 @@ static void pci_add_qmp_device_add(libxl__egc *egc, 
pci_add_state *pas)
 
 /* Convenience aliases */
 libxl_domid domid = pas->domid;
-libxl_device_pci *pci = pas->pci;
+libxl_device_pci *pci = &pas->pci;
 libxl__ev_qmp *const qmp = &pas->qmp;
 
 rc = libxl__ev_time_register_rel(ao, &pas->timeout,
@@ -1199,7 +1199,7 @@ static void pci_add_qmp_query_pci_cb(libxl__egc *egc,
 int dev_slot, dev_func;
 
 /* Convenience aliases */
-libxl_device_pci *pci = pas->pci;
+libxl_device_pci *pci = &pas->pci;
 
 if (rc) goto out;
 
@@ -1300,7 +1300,7 @@ static void pci_add_dm_done(libxl__egc *egc,
 
 /* Convenience aliases */
 bool starting = pas->starting;
-libxl_device_pci *pci = pas->pci;
+libxl_device_pci *pci = &pas->pci;
 bool hvm = libxl__domain_type(gc, domid) == LIBXL_DOMAIN_TYPE_HVM;
 
 libxl__ev_qmp_dispose(gc, &pas->qmp);
@@ -1516,7 +1516,10 @@ void libxl__device_pci_add(libxl__egc *egc, uint32_t 
domid,
 GCNEW(pas);
 pas->aodev = aodev;
 pas->domid = domid;
-pas->pci = pci;
+
+libxl_device_pci_copy(CTX, &pas->pci, pci);
+pci = &pas->pci;
+
 pas->starting = starting;
 pas->callback = device_pci_add_stubdom_done;
 
@@ -1555,12 +1558,6 @@ void libxl__device_pci_add(libxl__egc *egc, uint32_t 
domid,
 
 stubdomid = libxl_get_stubdom_id(ctx, domid);
 if (stubdomid != 0) {
-libxl_device_pci *pci_s;
-
-GCNEW(pci_s);
-libxl_device_pci_init(pci_s);
-libxl_device_pci_copy(CTX, pci_s, pci);
-pas->pci = pci_s;
 pas->callback = device_pci_add_stubdom_wait;
 
 do_pci_add(egc, stubdomid, pas); /* must be last */
@@ -1619,7 +1616,7 @@ static void device_pci_add_stubdom_done(libxl__egc *egc,
 
 /* Convenience aliases */
 libxl_domid domid = pas->domid;
-libxl_device_pci *pci = pas->pci;
+libxl_device_pci *pci = &pas->pci;
 
 if (rc) goto out;
 
@@ -1670,7 +1667,7 @@ static void device_pci_add_done(libxl__egc *egc,
 EGC_GC;
 libxl__ao_device *aodev = pas->aodev;
 libxl_domid domid = pas->domid;
-libxl_device_pci *pci = pas->pci;
+libxl_device_pci *pci = &pas->pci;
 
 if (rc) {
 LOGD(ERROR, domid,
@@ -1680,6 +1677,7 @@ static void device_pci_add_done(libxl__egc *egc,
  rc);
 pci_info_xs_remove(gc, pci, "domid");
 }
+libxl_device_pci_dispose(pci);
 aodev->rc = rc;
 aodev->callback(egc, aodev);
 }
@@ -1770,7 +1768,7 @@ static int qemu_pci_remove_xenstore(libxl__gc *gc, 
uint32_t domid,
 typedef struct pci_remove_state {
 libxl__ao_device *aodev;
 libxl_domid domid;
-libxl_device_pci *pci;
+libxl_device_pci pci;
 bool force;
 bool hvm;
 unsigned int orig_vdev;
@@ -1812,23 +1810,26 @@ static void do_pci_remove(libxl__egc *egc, 
pci_remove_state *prs)
 {
 STATE_AO_GC(prs->aodev->ao);
 libxl_ctx *ctx = libxl__gc_owner(gc);
-libxl_device_pci *assigned;
+libxl_device_pci *pcis;
+bool attached;
 uint32_t domid = prs->domid;
 libxl_domain_type type = libxl__domain_type(gc, domid);
-   

[PATCH v3 13/23] libxl: use COMPARE_PCI() macro is_pci_in_array()...

2020-11-23 Thread Paul Durrant
From: Paul Durrant 

... rather than an open-coded equivalent.

This patch tidies up the is_pci_in_array() function, making it take a single
'libxl_device_pci' argument rather than separate domain, bus, device and
function arguments. The already-available COMPARE_PCI() macro can then be
used and it is also modified to return 'bool' rather than 'int'.

The patch also modifies libxl_pci_assignable() to use is_pci_in_array() rather
than a separate open-coded equivalent, and also modifies it to return a
'bool' rather than an 'int'.

NOTE: The COMPARE_PCI() macro is also fixed to include the 'domain' in its
  comparison, which should always have been the case.

Signed-off-by: Paul Durrant 
---
Cc: Ian Jackson 
Cc: Wei Liu 
---
 tools/libs/light/libxl_internal.h |  7 ---
 tools/libs/light/libxl_pci.c  | 38 +-
 2 files changed, 17 insertions(+), 28 deletions(-)

diff --git a/tools/libs/light/libxl_internal.h 
b/tools/libs/light/libxl_internal.h
index ecee61b541..02f8a3179c 100644
--- a/tools/libs/light/libxl_internal.h
+++ b/tools/libs/light/libxl_internal.h
@@ -4746,9 +4746,10 @@ void libxl__xcinfo2xlinfo(libxl_ctx *ctx,
  * devices have same identifier. */
 #define COMPARE_DEVID(a, b) ((a)->devid == (b)->devid)
 #define COMPARE_DISK(a, b) (!strcmp((a)->vdev, (b)->vdev))
-#define COMPARE_PCI(a, b) ((a)->func == (b)->func &&\
-   (a)->bus == (b)->bus &&  \
-   (a)->dev == (b)->dev)
+#define COMPARE_PCI(a, b) ((a)->domain == (b)->domain && \
+   (a)->bus == (b)->bus &&   \
+   (a)->dev == (b)->dev &&   \
+   (a)->func == (b)->func)
 #define COMPARE_USB(a, b) ((a)->ctrl == (b)->ctrl && \
(a)->port == (b)->port)
 #define COMPARE_USBCTRL(a, b) ((a)->devid == (b)->devid)
diff --git a/tools/libs/light/libxl_pci.c b/tools/libs/light/libxl_pci.c
index 5a3352c2ec..e0b616fe18 100644
--- a/tools/libs/light/libxl_pci.c
+++ b/tools/libs/light/libxl_pci.c
@@ -336,24 +336,17 @@ retry_transaction2:
 return 0;
 }
 
-static int is_pci_in_array(libxl_device_pci *assigned, int num_assigned,
-   int dom, int bus, int dev, int func)
+static bool is_pci_in_array(libxl_device_pci *pcis, int num,
+libxl_device_pci *pci)
 {
 int i;
 
-for(i = 0; i < num_assigned; i++) {
-if ( assigned[i].domain != dom )
-continue;
-if ( assigned[i].bus != bus )
-continue;
-if ( assigned[i].dev != dev )
-continue;
-if ( assigned[i].func != func )
-continue;
-return 1;
+for (i = 0; i < num; i++) {
+if (COMPARE_PCI(pci, &pcis[i]))
+break;
 }
 
-return 0;
+return i < num;
 }
 
 /* Write the standard BDF into the sysfs path given by sysfs_path. */
@@ -1487,21 +1480,17 @@ int libxl_device_pci_add(libxl_ctx *ctx, uint32_t domid,
 return AO_INPROGRESS;
 }
 
-static int libxl_pci_assignable(libxl_ctx *ctx, libxl_device_pci *pci)
+static bool libxl_pci_assignable(libxl_ctx *ctx, libxl_device_pci *pci)
 {
 libxl_device_pci *pcis;
-int num, i;
+int num;
+bool assignable;
 
 pcis = libxl_device_pci_assignable_list(ctx, &num);
-for (i = 0; i < num; i++) {
-if (pcis[i].domain == pci->domain &&
-pcis[i].bus == pci->bus &&
-pcis[i].dev == pci->dev &&
-pcis[i].func == pci->func)
-break;
-}
+assignable = is_pci_in_array(pcis, num, pci);
 libxl_device_pci_assignable_list_free(pcis, num);
-return i != num;
+
+return assignable;
 }
 
 static void device_pci_add_stubdom_wait(libxl__egc *egc,
@@ -1834,8 +1823,7 @@ static void do_pci_remove(libxl__egc *egc, 
pci_remove_state *prs)
 goto out_fail;
 }
 
-attached = is_pci_in_array(pcis, num, pci->domain,
-   pci->bus, pci->dev, pci->func);
+attached = is_pci_in_array(pcis, num, pci);
 libxl_device_pci_list_free(pcis, num);
 
 rc = ERROR_INVAL;
-- 
2.11.0




[PATCH v3 07/23] libxl: stop using aodev->device_config in libxl__device_pci_add()...

2020-11-23 Thread Paul Durrant
From: Paul Durrant 

... to hold a pointer to the device.

There is already a 'pci' field in 'pci_add_state' so simply use that from
the start. This also allows the 'pci' (#3) argument to be dropped from
do_pci_add().

NOTE: This patch also changes the type of the 'pci_domid' field in
  'pci_add_state' from 'int' to 'libxl_domid' which is more appropriate
  given what the field is used for.

Signed-off-by: Paul Durrant 
---
Cc: Ian Jackson 
Cc: Wei Liu 
---
 tools/libs/light/libxl_pci.c | 19 +++
 1 file changed, 7 insertions(+), 12 deletions(-)

diff --git a/tools/libs/light/libxl_pci.c b/tools/libs/light/libxl_pci.c
index 41e4b2b571..77edd27345 100644
--- a/tools/libs/light/libxl_pci.c
+++ b/tools/libs/light/libxl_pci.c
@@ -1074,7 +1074,7 @@ typedef struct pci_add_state {
 libxl__ev_qmp qmp;
 libxl__ev_time timeout;
 libxl_device_pci *pci;
-int pci_domid;
+libxl_domid pci_domid;
 } pci_add_state;
 
 static void pci_add_qemu_trad_watch_state_cb(libxl__egc *egc,
@@ -1091,7 +1091,6 @@ static void pci_add_dm_done(libxl__egc *,
 
 static void do_pci_add(libxl__egc *egc,
libxl_domid domid,
-   libxl_device_pci *pci,
pci_add_state *pas)
 {
 STATE_AO_GC(pas->aodev->ao);
@@ -1101,7 +1100,6 @@ static void do_pci_add(libxl__egc *egc,
 /* init pci_add_state */
 libxl__xswait_init(&pas->xswait);
 libxl__ev_qmp_init(&pas->qmp);
-pas->pci = pci;
 pas->pci_domid = domid;
 libxl__ev_time_init(&pas->timeout);
 
@@ -1564,13 +1562,10 @@ void libxl__device_pci_add(libxl__egc *egc, uint32_t 
domid,
 int stubdomid = 0;
 pci_add_state *pas;
 
-/* Store *pci to be used by callbacks */
-aodev->device_config = pci;
-aodev->device_type = &libxl__pci_devtype;
-
 GCNEW(pas);
 pas->aodev = aodev;
 pas->domid = domid;
+pas->pci = pci;
 pas->starting = starting;
 pas->callback = device_pci_add_stubdom_done;
 
@@ -1624,9 +1619,10 @@ void libxl__device_pci_add(libxl__egc *egc, uint32_t 
domid,
 GCNEW(pci_s);
 libxl_device_pci_init(pci_s);
 libxl_device_pci_copy(CTX, pci_s, pci);
+pas->pci = pci_s;
 pas->callback = device_pci_add_stubdom_wait;
 
-do_pci_add(egc, stubdomid, pci_s, pas); /* must be last */
+do_pci_add(egc, stubdomid, pas); /* must be last */
 return;
 }
 
@@ -1681,9 +1677,8 @@ static void device_pci_add_stubdom_done(libxl__egc *egc,
 int i;
 
 /* Convenience aliases */
-libxl__ao_device *aodev = pas->aodev;
 libxl_domid domid = pas->domid;
-libxl_device_pci *pci = aodev->device_config;
+libxl_device_pci *pci = pas->pci;
 
 if (rc) goto out;
 
@@ -1718,7 +1713,7 @@ static void device_pci_add_stubdom_done(libxl__egc *egc,
 pci->vdevfn = orig_vdev;
 }
 pas->callback = device_pci_add_done;
-do_pci_add(egc, domid, pci, pas); /* must be last */
+do_pci_add(egc, domid, pas); /* must be last */
 return;
 }
 }
@@ -1734,7 +1729,7 @@ static void device_pci_add_done(libxl__egc *egc,
 EGC_GC;
 libxl__ao_device *aodev = pas->aodev;
 libxl_domid domid = pas->domid;
-libxl_device_pci *pci = aodev->device_config;
+libxl_device_pci *pci = pas->pci;
 
 if (rc) {
 LOGD(ERROR, domid,
-- 
2.11.0




[PATCH v3 09/23] libxl: remove unnecessary check from libxl__device_pci_add()

2020-11-23 Thread Paul Durrant
From: Paul Durrant 

The code currently checks explicitly whether the device is already assigned,
but this is actually unnecessary as assigned devices do not form part of
the list returned by libxl_device_pci_assignable_list() and hence the
libxl_pci_assignable() test would have already failed.

Signed-off-by: Paul Durrant 
---
Cc: Ian Jackson 
Cc: Wei Liu 
---
 tools/libs/light/libxl_pci.c | 16 +---
 1 file changed, 1 insertion(+), 15 deletions(-)

diff --git a/tools/libs/light/libxl_pci.c b/tools/libs/light/libxl_pci.c
index a5d5d2e78b..ec101f255f 100644
--- a/tools/libs/light/libxl_pci.c
+++ b/tools/libs/light/libxl_pci.c
@@ -1555,8 +1555,7 @@ void libxl__device_pci_add(libxl__egc *egc, uint32_t 
domid,
 {
 STATE_AO_GC(aodev->ao);
 libxl_ctx *ctx = libxl__gc_owner(gc);
-libxl_device_pci *assigned;
-int num_assigned, rc;
+int rc;
 int stubdomid = 0;
 pci_add_state *pas;
 
@@ -1595,19 +1594,6 @@ void libxl__device_pci_add(libxl__egc *egc, uint32_t 
domid,
 goto out;
 }
 
-rc = get_all_assigned_devices(gc, &assigned, &num_assigned);
-if ( rc ) {
-LOGD(ERROR, domid,
- "cannot determine if device is assigned, refusing to continue");
-goto out;
-}
-if ( is_pci_in_array(assigned, num_assigned, pci->domain,
- pci->bus, pci->dev, pci->func) ) {
-LOGD(ERROR, domid, "PCI device already attached to a domain");
-rc = ERROR_FAIL;
-goto out;
-}
-
 libxl__device_pci_reset(gc, pci->domain, pci->bus, pci->dev, pci->func);
 
 stubdomid = libxl_get_stubdom_id(ctx, domid);
-- 
2.11.0




[PATCH v3 08/23] libxl: generalise 'driver_path' xenstore access functions in libxl_pci.c

2020-11-23 Thread Paul Durrant
From: Paul Durrant 

For the purposes of re-binding a device to its previous driver
libxl__device_pci_assignable_add() writes the driver path into xenstore.
This path is then read back in libxl__device_pci_assignable_remove().

The functions that support this writing to and reading from xenstore are
currently dedicated for this purpose and hence the node name 'driver_path'
is hard-coded. This patch generalizes these utility functions and passes
'driver_path' as an argument. Subsequent patches will invoke them to
access other nodes.

NOTE: Because the functions will have a broader use (other than storing a
  driver path in lieu of pciback) the base xenstore path is also
  changed from '/libxl/pciback' to '/libxl/pci'.
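
To make the resulting layout concrete, here is a stand-alone sketch of the path
construction using snprintf() in place of GCSPRINTF(). The helper name
pci_info_path() is hypothetical, and the PCI_BDF_XSPATH format shown is an
assumption since its definition is not part of this patch.

```c
#include <assert.h>
#include <stdio.h>

#define PCI_INFO_PATH  "/libxl/pci"
/* Assumed format; the real PCI_BDF_XSPATH is defined elsewhere in libxl. */
#define PCI_BDF_XSPATH "%04x-%02x-%02x-%01x"

/*
 * Format the per-device path, or the path of 'node' beneath it if 'node'
 * is non-NULL. Returns snprintf()'s result.
 */
static int pci_info_path(char *buf, size_t len,
                         unsigned int domain, unsigned int bus,
                         unsigned int dev, unsigned int func,
                         const char *node)
{
    assert(buf != NULL);

    return node ?
        snprintf(buf, len, PCI_INFO_PATH "/" PCI_BDF_XSPATH "/%s",
                 domain, bus, dev, func, node) :
        snprintf(buf, len, PCI_INFO_PATH "/" PCI_BDF_XSPATH,
                 domain, bus, dev, func);
}
```

Passing "driver_path" as the node gives the path for the existing use; passing
NULL gives the device's base path.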

Signed-off-by: Paul Durrant 
---
Cc: Ian Jackson 
Cc: Wei Liu 
---
 tools/libs/light/libxl_pci.c | 66 +---
 1 file changed, 32 insertions(+), 34 deletions(-)

diff --git a/tools/libs/light/libxl_pci.c b/tools/libs/light/libxl_pci.c
index 77edd27345..a5d5d2e78b 100644
--- a/tools/libs/light/libxl_pci.c
+++ b/tools/libs/light/libxl_pci.c
@@ -737,48 +737,46 @@ static int pciback_dev_unassign(libxl__gc *gc, 
libxl_device_pci *pci)
 return 0;
 }
 
-#define PCIBACK_INFO_PATH "/libxl/pciback"
+#define PCI_INFO_PATH "/libxl/pci"
 
-static void pci_assignable_driver_path_write(libxl__gc *gc,
-libxl_device_pci *pci,
-char *driver_path)
+static char *pci_info_xs_path(libxl__gc *gc, libxl_device_pci *pci,
+  const char *node)
 {
-char *path;
+return node ?
+GCSPRINTF(PCI_INFO_PATH"/"PCI_BDF_XSPATH"/%s",
+  pci->domain, pci->bus, pci->dev, pci->func,
+  node) :
+GCSPRINTF(PCI_INFO_PATH"/"PCI_BDF_XSPATH,
+  pci->domain, pci->bus, pci->dev, pci->func);
+}
+
+
+static void pci_info_xs_write(libxl__gc *gc, libxl_device_pci *pci,
+  const char *node, const char *val)
+{
+char *path = pci_info_xs_path(gc, pci, node);
 
-path = GCSPRINTF(PCIBACK_INFO_PATH"/"PCI_BDF_XSPATH"/driver_path",
- pci->domain,
- pci->bus,
- pci->dev,
- pci->func);
-if ( libxl__xs_printf(gc, XBT_NULL, path, "%s", driver_path) < 0 ) {
-LOGE(WARN, "Write of %s to node %s failed.", driver_path, path);
+if ( libxl__xs_printf(gc, XBT_NULL, path, "%s", val) < 0 ) {
+LOGE(WARN, "Write of %s to node %s failed.", val, path);
 }
 }
 
-static char * pci_assignable_driver_path_read(libxl__gc *gc,
-  libxl_device_pci *pci)
+static char *pci_info_xs_read(libxl__gc *gc, libxl_device_pci *pci,
+  const char *node)
 {
-return libxl__xs_read(gc, XBT_NULL,
-  GCSPRINTF(
-   PCIBACK_INFO_PATH "/" PCI_BDF_XSPATH "/driver_path",
-   pci->domain,
-   pci->bus,
-   pci->dev,
-   pci->func));
+char *path = pci_info_xs_path(gc, pci, node);
+
+return libxl__xs_read(gc, XBT_NULL, path);
 }
 
-static void pci_assignable_driver_path_remove(libxl__gc *gc,
-  libxl_device_pci *pci)
+static void pci_info_xs_remove(libxl__gc *gc, libxl_device_pci *pci,
+   const char *node)
 {
+char *path = pci_info_xs_path(gc, pci, node);
 libxl_ctx *ctx = libxl__gc_owner(gc);
 
 /* Remove the xenstore entry */
-xs_rm(ctx->xsh, XBT_NULL,
-  GCSPRINTF(PCIBACK_INFO_PATH "/" PCI_BDF_XSPATH,
-pci->domain,
-pci->bus,
-pci->dev,
-pci->func) );
+xs_rm(ctx->xsh, XBT_NULL, path);
 }
 
 static int libxl__device_pci_assignable_add(libxl__gc *gc,
@@ -824,9 +822,9 @@ static int libxl__device_pci_assignable_add(libxl__gc *gc,
 /* Store driver_path for rebinding to dom0 */
 if ( rebind ) {
 if ( driver_path ) {
-pci_assignable_driver_path_write(gc, pci, driver_path);
+pci_info_xs_write(gc, pci, "driver_path", driver_path);
 } else if ( (driver_path =
- pci_assignable_driver_path_read(gc, pci)) != NULL ) {
+ pci_info_xs_read(gc, pci, "driver_path")) != NULL ) {
 LOG(INFO, PCI_BDF" not bound to a driver, will be rebound to %s",
 dom, bus, dev, func, driver_path);
 } else {
@@ -834,7 +832,7 @@ static int libxl__device_pci_assignable_add(libxl__gc *gc,
 dom, bus, dev, func);
 }
 } else {
-pci_assignable_driver_path_remove(gc, pci);
+pci_info_xs_remove(gc, pci, "driver_path");
 }
 
 if ( pciback_dev_assign(gc, pci) ) {
@@ -884,7 +882,7 @@ static int 

[PATCH v3 03/23] libxl: Make sure devices added by pci-attach are reflected in the config

2020-11-23 Thread Paul Durrant
From: Paul Durrant 

Currently libxl__device_pci_add_xenstore() is broken in that it does not
update the domain's configuration for the first device added (which causes
creation of the overall backend area in xenstore). This can be easily observed
by running 'xl list -l' after adding a single device: the device will be
missing.

This patch fixes the problem and adds a DEBUG log line to allow easy
verification that the domain configuration is being modified. Also, the use
of libxl__device_generic_add() is dropped as it leads to a confusing situation
where only partial backend information is written under the xenstore
'/libxl' path. For LIBXL__DEVICE_KIND_PCI devices the only definitive
information in xenstore is under '/local/domain/0/backend' (the '0' being
hard-coded).

NOTE: This patch includes a whitespace change in add_pcis_done().

Signed-off-by: Paul Durrant 
---
Cc: Ian Jackson 
Cc: Wei Liu 
Cc: Anthony PERARD 

v2:
 - Avoid having two completely different ways of adding devices into xenstore

v3:
 - Revert some changes from v2 as there is confusion over use of the libxl
   and backend xenstore paths which needs to be fixed
---
 tools/libs/light/libxl_pci.c | 87 +++-
 1 file changed, 45 insertions(+), 42 deletions(-)

diff --git a/tools/libs/light/libxl_pci.c b/tools/libs/light/libxl_pci.c
index 9d44b28f0a..da01c77ba2 100644
--- a/tools/libs/light/libxl_pci.c
+++ b/tools/libs/light/libxl_pci.c
@@ -79,39 +79,55 @@ static void libxl__device_from_pci(libxl__gc *gc, uint32_t 
domid,
 device->kind = LIBXL__DEVICE_KIND_PCI;
 }
 
-static int libxl__create_pci_backend(libxl__gc *gc, uint32_t domid,
- const libxl_device_pci *pci,
- int num)
+static void libxl__create_pci_backend(libxl__gc *gc, xs_transaction_t t,
+  uint32_t domid, const libxl_device_pci 
*pci)
 {
-flexarray_t *front = NULL;
-flexarray_t *back = NULL;
-libxl__device device;
-int i;
+libxl_ctx *ctx = libxl__gc_owner(gc);
+flexarray_t *front, *back;
+char *fe_path, *be_path;
+struct xs_permissions fe_perms[2], be_perms[2];
+
+LOGD(DEBUG, domid, "Creating pci backend");
 
 front = flexarray_make(gc, 16, 1);
 back = flexarray_make(gc, 16, 1);
 
-LOGD(DEBUG, domid, "Creating pci backend");
-
-/* add pci device */
-libxl__device_from_pci(gc, domid, pci, &device);
+fe_path = libxl__domain_device_frontend_path(gc, domid, 0,
+ LIBXL__DEVICE_KIND_PCI);
+be_path = libxl__domain_device_backend_path(gc, 0, domid, 0,
+LIBXL__DEVICE_KIND_PCI);
 
+flexarray_append_pair(back, "frontend", fe_path);
 flexarray_append_pair(back, "frontend-id", GCSPRINTF("%d", domid));
-flexarray_append_pair(back, "online", "1");
+flexarray_append_pair(back, "online", GCSPRINTF("%d", 1));
 flexarray_append_pair(back, "state", GCSPRINTF("%d", 
XenbusStateInitialising));
 flexarray_append_pair(back, "domain", libxl__domid_to_name(gc, domid));
 
-for (i = 0; i < num; i++, pci++)
-libxl_create_pci_backend_device(gc, back, i, pci);
+be_perms[0].id = 0;
+be_perms[0].perms = XS_PERM_NONE;
+be_perms[1].id = domid;
+be_perms[1].perms = XS_PERM_READ;
+
+xs_rm(ctx->xsh, t, be_path);
+xs_mkdir(ctx->xsh, t, be_path);
+xs_set_permissions(ctx->xsh, t, be_path, be_perms,
+   ARRAY_SIZE(be_perms));
+libxl__xs_writev(gc, t, be_path, libxl__xs_kvs_of_flexarray(gc, back));
 
-flexarray_append_pair(back, "num_devs", GCSPRINTF("%d", num));
+flexarray_append_pair(front, "backend", be_path);
 flexarray_append_pair(front, "backend-id", GCSPRINTF("%d", 0));
 flexarray_append_pair(front, "state", GCSPRINTF("%d", 
XenbusStateInitialising));
 
-return libxl__device_generic_add(gc, XBT_NULL, &device,
- libxl__xs_kvs_of_flexarray(gc, back),
- libxl__xs_kvs_of_flexarray(gc, front),
- NULL);
+fe_perms[0].id = domid;
+fe_perms[0].perms = XS_PERM_NONE;
+fe_perms[1].id = 0;
+fe_perms[1].perms = XS_PERM_READ;
+
+xs_rm(ctx->xsh, t, fe_path);
+xs_mkdir(ctx->xsh, t, fe_path);
+xs_set_permissions(ctx->xsh, t, fe_path,
+   fe_perms, ARRAY_SIZE(fe_perms));
+libxl__xs_writev(gc, t, fe_path, libxl__xs_kvs_of_flexarray(gc, front));
 }
 
 static int libxl__device_pci_add_xenstore(libxl__gc *gc,
@@ -135,8 +151,6 @@ static int libxl__device_pci_add_xenstore(libxl__gc *gc,
 be_path = libxl__domain_device_backend_path(gc, 0, domid, 0,
 LIBXL__DEVICE_KIND_PCI);
 num_devs = libxl__xs_read(gc, XBT_NULL, GCSPRINTF("%s/num_devs", be_path));
-if (!num_devs)
-return libxl__create_pci_backend(gc, domid, pci, 1);
 
 

[PATCH v3 05/23] libxl: s/detatched/detached in libxl_pci.c

2020-11-23 Thread Paul Durrant
From: Paul Durrant 

Simple spelling correction. Purely cosmetic fix.

Signed-off-by: Paul Durrant 
---
Cc: Ian Jackson 
Cc: Wei Liu 
---
 tools/libs/light/libxl_pci.c | 22 +++---
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/tools/libs/light/libxl_pci.c b/tools/libs/light/libxl_pci.c
index 50c96cbfa6..de617e95eb 100644
--- a/tools/libs/light/libxl_pci.c
+++ b/tools/libs/light/libxl_pci.c
@@ -1864,7 +1864,7 @@ static void pci_remove_qmp_query_cb(libxl__egc *egc,
 libxl__ev_qmp *qmp, const libxl__json_object *response, int rc);
 static void pci_remove_timeout(libxl__egc *egc,
 libxl__ev_time *ev, const struct timeval *requested_abs, int rc);
-static void pci_remove_detatched(libxl__egc *egc,
+static void pci_remove_detached(libxl__egc *egc,
 pci_remove_state *prs, int rc);
 static void pci_remove_stubdom_done(libxl__egc *egc,
 libxl__ao_device *aodev);
@@ -1978,7 +1978,7 @@ skip1:
 skip_irq:
 rc = 0;
 out_fail:
-pci_remove_detatched(egc, prs, rc); /* must be last */
+pci_remove_detached(egc, prs, rc); /* must be last */
 }
 
 static void pci_remove_qemu_trad_watch_state_cb(libxl__egc *egc,
@@ -2002,7 +2002,7 @@ static void 
pci_remove_qemu_trad_watch_state_cb(libxl__egc *egc,
 rc = qemu_pci_remove_xenstore(gc, domid, pci, prs->force);
 
 out:
-pci_remove_detatched(egc, prs, rc);
+pci_remove_detached(egc, prs, rc);
 }
 
 static void pci_remove_qmp_device_del(libxl__egc *egc,
@@ -2028,7 +2028,7 @@ static void pci_remove_qmp_device_del(libxl__egc *egc,
 return;
 
 out:
-pci_remove_detatched(egc, prs, rc);
+pci_remove_detached(egc, prs, rc);
 }
 
 static void pci_remove_qmp_device_del_cb(libxl__egc *egc,
@@ -2051,7 +2051,7 @@ static void pci_remove_qmp_device_del_cb(libxl__egc *egc,
 return;
 
 out:
-pci_remove_detatched(egc, prs, rc);
+pci_remove_detached(egc, prs, rc);
 }
 
 static void pci_remove_qmp_retry_timer_cb(libxl__egc *egc, libxl__ev_time *ev,
@@ -2067,7 +2067,7 @@ static void pci_remove_qmp_retry_timer_cb(libxl__egc 
*egc, libxl__ev_time *ev,
 return;
 
 out:
-pci_remove_detatched(egc, prs, rc);
+pci_remove_detached(egc, prs, rc);
 }
 
 static void pci_remove_qmp_query_cb(libxl__egc *egc,
@@ -2127,7 +2127,7 @@ static void pci_remove_qmp_query_cb(libxl__egc *egc,
 }
 
 out:
-pci_remove_detatched(egc, prs, rc); /* must be last */
+pci_remove_detached(egc, prs, rc); /* must be last */
 }
 
 static void pci_remove_timeout(libxl__egc *egc, libxl__ev_time *ev,
@@ -2146,12 +2146,12 @@ static void pci_remove_timeout(libxl__egc *egc, 
libxl__ev_time *ev,
 /* If we timed out, we might still want to keep destroying the device
  * (when force==true), so let the next function decide what to do on
  * error */
-pci_remove_detatched(egc, prs, rc);
+pci_remove_detached(egc, prs, rc);
 }
 
-static void pci_remove_detatched(libxl__egc *egc,
- pci_remove_state *prs,
- int rc)
+static void pci_remove_detached(libxl__egc *egc,
+pci_remove_state *prs,
+int rc)
 {
 STATE_AO_GC(prs->aodev->ao);
 int stubdomid = 0;
-- 
2.11.0




[PATCH v3 06/23] libxl: remove extraneous arguments to do_pci_remove() in libxl_pci.c

2020-11-23 Thread Paul Durrant
From: Paul Durrant 

Both 'domid' and 'pci' are available in 'pci_remove_state' so there is no
need to also pass them as separate arguments.

Signed-off-by: Paul Durrant 
---
Cc: Ian Jackson 
Cc: Wei Liu 
---
 tools/libs/light/libxl_pci.c | 9 -
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/tools/libs/light/libxl_pci.c b/tools/libs/light/libxl_pci.c
index de617e95eb..41e4b2b571 100644
--- a/tools/libs/light/libxl_pci.c
+++ b/tools/libs/light/libxl_pci.c
@@ -1871,14 +1871,14 @@ static void pci_remove_stubdom_done(libxl__egc *egc,
 static void pci_remove_done(libxl__egc *egc,
 pci_remove_state *prs, int rc);
 
-static void do_pci_remove(libxl__egc *egc, uint32_t domid,
-  libxl_device_pci *pci, int force,
-  pci_remove_state *prs)
+static void do_pci_remove(libxl__egc *egc, pci_remove_state *prs)
 {
 STATE_AO_GC(prs->aodev->ao);
 libxl_ctx *ctx = libxl__gc_owner(gc);
 libxl_device_pci *assigned;
+uint32_t domid = prs->domid;
 libxl_domain_type type = libxl__domain_type(gc, domid);
+libxl_device_pci *pci = prs->pci;
 int rc, num;
 uint32_t domainid = domid;
 
@@ -2275,7 +2275,6 @@ static void device_pci_remove_common_next(libxl__egc *egc,
 EGC_GC;
 
 /* Convenience aliases */
-libxl_domid domid = prs->domid;
 libxl_device_pci *const pci = prs->pci;
 libxl__ao_device *const aodev = prs->aodev;
 const unsigned int pfunc_mask = prs->pfunc_mask;
@@ -2293,7 +2292,7 @@ static void device_pci_remove_common_next(libxl__egc *egc,
 } else {
 pci->vdevfn = orig_vdev;
 }
-do_pci_remove(egc, domid, pci, prs->force, prs);
+do_pci_remove(egc, prs);
 return;
 }
 }
-- 
2.11.0




[PATCH v3 02/23] libxl: make libxl__device_list() work correctly for LIBXL__DEVICE_KIND_PCI...

2020-11-23 Thread Paul Durrant
From: Paul Durrant 

... devices.

Currently there is an assumption built into libxl__device_list() that device
backends are fully enumerated under the '/libxl' path in xenstore. This is
not the case for PCI backend devices, which are only properly enumerated
under '/local/domain/0/backend'.

This patch adds a new get_path() method to libxl__device_type to allow a
backend implementation (such as PCI) to specify the xenstore path where
devices are enumerated and modifies libxl__device_list() to use this method
if it is available. Also, if the get_num() method is defined then the
from_xenstore() method expects to be passed the backend path without the device
number concatenated, so this issue is also rectified.

Having made libxl__device_list() work correctly, this patch removes the
open-coded libxl_device_pci_list() in favour of an evaluation of the
LIBXL_DEFINE_DEVICE_LIST() macro. This has the side-effect of also defining
libxl_device_pci_list_free() which will be used in subsequent patches.
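
[Editorial illustration, not part of the patch] The optional-method fallback
described above can be sketched in isolation as follows. This is a minimal
sketch only: 'struct device_type', pci_get_path() and device_list_path() are
hypothetical names, the real libxl__device_type has more members and takes a
gc, and the xenstore paths are written out by hand on the assumption that
they follow the '/libxl/<domid>/device/<kind>' and
'/local/domain/0/backend/pci/<domid>/0' layouts referred to in this series.

```c
#include <stdio.h>
#include <stdint.h>

/* Hypothetical, cut-down device type: only the hook this sketch needs. */
struct device_type {
    const char *kind;
    /* Optional hook: write the enumeration path to buf, return 0 on success. */
    int (*get_path)(uint32_t domid, char *buf, size_t len);
};

/* PCI-style override: devices are enumerated under the backend area. */
static int pci_get_path(uint32_t domid, char *buf, size_t len)
{
    int n = snprintf(buf, len, "/local/domain/0/backend/pci/%u/0",
                     (unsigned int)domid);

    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Use the hook when it is defined, otherwise fall back to the '/libxl' path. */
static int device_list_path(const struct device_type *dt, uint32_t domid,
                            char *buf, size_t len)
{
    int n;

    if (dt->get_path)
        return dt->get_path(domid, buf, len);

    n = snprintf(buf, len, "/libxl/%u/device/%s",
                 (unsigned int)domid, dt->kind);

    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

A device type that leaves get_path NULL keeps the old behaviour, so only
backends that need a different enumeration path have to provide the hook.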

Signed-off-by: Paul Durrant 
---
Cc: Ian Jackson 
Cc: Wei Liu 
Cc: Anthony PERARD 

v3:
 - New in v3 (replacing "libxl: use LIBXL_DEFINE_DEVICE_LIST for pci devices")
---
 tools/include/libxl.h |  7 +
 tools/libs/light/libxl_device.c   | 66 +--
 tools/libs/light/libxl_internal.h |  2 ++
 tools/libs/light/libxl_pci.c  | 29 +
 4 files changed, 52 insertions(+), 52 deletions(-)

diff --git a/tools/include/libxl.h b/tools/include/libxl.h
index fbe4c81ba5..ee52d3cf7e 100644
--- a/tools/include/libxl.h
+++ b/tools/include/libxl.h
@@ -452,6 +452,12 @@
 #define LIBXL_HAVE_CONFIG_PCIS 1
 
 /*
+ * LIBXL_HAVE_DEVICE_PCI_LIST_FREE indicates that the
+ * libxl_device_pci_list_free() function is defined.
+ */
+#define LIBXL_HAVE_DEVICE_PCI_LIST_FREE 1
+
+/*
  * libxl ABI compatibility
  *
  * The only guarantee which libxl makes regarding ABI compatibility
@@ -2321,6 +2327,7 @@ int libxl_device_pci_destroy(libxl_ctx *ctx, uint32_t 
domid,
 
 libxl_device_pci *libxl_device_pci_list(libxl_ctx *ctx, uint32_t domid,
 int *num);
+void libxl_device_pci_list_free(libxl_device_pci* list, int num);
 
 /*
  * Turns the current process into a backend device service daemon
diff --git a/tools/libs/light/libxl_device.c b/tools/libs/light/libxl_device.c
index e081faf9a9..ac173a043d 100644
--- a/tools/libs/light/libxl_device.c
+++ b/tools/libs/light/libxl_device.c
@@ -2011,7 +2011,7 @@ void *libxl__device_list(libxl__gc *gc, const 
libxl__device_type *dt,
 void *r = NULL;
 void *list = NULL;
 void *item = NULL;
-char *libxl_path;
+char *path;
 char **dir = NULL;
 unsigned int ndirs = 0;
 unsigned int ndevs = 0;
@@ -2019,42 +2019,46 @@ void *libxl__device_list(libxl__gc *gc, const 
libxl__device_type *dt,
 
 *num = 0;
 
-libxl_path = GCSPRINTF("%s/device/%s",
-   libxl__xs_libxl_path(gc, domid),
-   libxl__device_kind_to_string(dt->type));
-
-dir = libxl__xs_directory(gc, XBT_NULL, libxl_path, &ndirs);
+if (dt->get_path) {
+rc = dt->get_path(gc, domid, &path);
+if (rc) goto out;
+} else {
+path = GCSPRINTF("%s/device/%s",
+ libxl__xs_libxl_path(gc, domid),
+ libxl__device_kind_to_string(dt->type));
+}
 
-if (dir && ndirs) {
-if (dt->get_num) {
-if (ndirs != 1) {
-LOGD(ERROR, domid, "multiple entries in %s\n", libxl_path);
-rc = ERROR_FAIL;
-goto out;
-}
-rc = dt->get_num(gc, GCSPRINTF("%s/%s", libxl_path, *dir), &ndevs);
-if (rc) goto out;
-} else {
+if (dt->get_num) {
+rc = dt->get_num(gc, path, &ndevs);
+if (rc) goto out;
+} else {
+dir = libxl__xs_directory(gc, XBT_NULL, path, &ndirs);
+if (dir && ndirs)
 ndevs = ndirs;
-}
-list = libxl__malloc(NOGC, dt->dev_elem_size * ndevs);
-item = list;
+}
 
-while (*num < ndevs) {
-dt->init(item);
+if (!ndevs)
+return NULL;
 
-if (dt->from_xenstore) {
-int nr = dt->get_num ? *num : atoi(*dir);
-char *device_libxl_path = GCSPRINTF("%s/%s", libxl_path, *dir);
-rc = dt->from_xenstore(gc, device_libxl_path, nr, item);
-if (rc) goto out;
-}
+list = libxl__malloc(NOGC, dt->dev_elem_size * ndevs);
+item = list;
 
-item = (uint8_t *)item + dt->dev_elem_size;
-++(*num);
-if (!dt->get_num)
-++dir;
+while (*num < ndevs) {
+dt->init(item);
+
+if (dt->from_xenstore) {
+int nr = dt->get_num ? *num : atoi(*dir);
+char *device_path = dt->get_num ? path :
+GCSPRINTF("%s/%d", path, nr);
+
+rc = dt->from_xenstore(gc, 

[PATCH v3 01/23] xl / libxl: s/pcidev/pci and remove DEFINE_DEVICE_TYPE_STRUCT_X

2020-11-23 Thread Paul Durrant
From: Paul Durrant 

The seemingly arbitrary use of 'pci' and 'pcidev' in the code in libxl_pci.c
is confusing and also compromises use of some macros used for other device
types. Indeed it seems that DEFINE_DEVICE_TYPE_STRUCT_X exists solely because
of this duality.

This patch purges use of 'pcidev' from the libxl code, allowing evaluation of
DEFINE_DEVICE_TYPE_STRUCT_X to be replaced with DEFINE_DEVICE_TYPE_STRUCT,
hence allowing removal of the former.

For consistency the xl and libs/util code is also modified, but in this case
it is purely cosmetic.

NOTE: Some of the more gross formatting errors (such as lack of spaces after
  keywords) that came into context have been fixed in libxl_pci.c.

Signed-off-by: Paul Durrant 
---
Cc: Ian Jackson 
Cc: Wei Liu 
Cc: Anthony PERARD 
---
 tools/include/libxl.h |  17 +-
 tools/libs/light/libxl_create.c   |   6 +-
 tools/libs/light/libxl_dm.c   |  18 +-
 tools/libs/light/libxl_internal.h |  45 ++-
 tools/libs/light/libxl_pci.c  | 582 +++---
 tools/libs/light/libxl_types.idl  |   2 +-
 tools/libs/util/libxlu_pci.c  |  36 +--
 tools/xl/xl_parse.c   |  28 +-
 tools/xl/xl_pci.c |  68 ++---
 tools/xl/xl_sxp.c |  12 +-
 10 files changed, 409 insertions(+), 405 deletions(-)

diff --git a/tools/include/libxl.h b/tools/include/libxl.h
index 1ea5b4f446..fbe4c81ba5 100644
--- a/tools/include/libxl.h
+++ b/tools/include/libxl.h
@@ -445,6 +445,13 @@
 #define LIBXL_HAVE_DISK_SAFE_REMOVE 1
 
 /*
+ * LIBXL_HAVE_CONFIG_PCIS indicates that the 'pcidevs' and 'num_pcidevs'
+ * fields in libxl_domain_config have been renamed to 'pcis' and 'num_pcis'
+ * respectively.
+ */
+#define LIBXL_HAVE_CONFIG_PCIS 1
+
+/*
  * libxl ABI compatibility
  *
  * The only guarantee which libxl makes regarding ABI compatibility
@@ -2300,15 +2307,15 @@ int libxl_device_pvcallsif_destroy(libxl_ctx *ctx, 
uint32_t domid,
 
 /* PCI Passthrough */
 int libxl_device_pci_add(libxl_ctx *ctx, uint32_t domid,
- libxl_device_pci *pcidev,
+ libxl_device_pci *pci,
  const libxl_asyncop_how *ao_how)
  LIBXL_EXTERNAL_CALLERS_ONLY;
 int libxl_device_pci_remove(libxl_ctx *ctx, uint32_t domid,
-libxl_device_pci *pcidev,
+libxl_device_pci *pci,
 const libxl_asyncop_how *ao_how)
 LIBXL_EXTERNAL_CALLERS_ONLY;
 int libxl_device_pci_destroy(libxl_ctx *ctx, uint32_t domid,
- libxl_device_pci *pcidev,
+ libxl_device_pci *pci,
  const libxl_asyncop_how *ao_how)
  LIBXL_EXTERNAL_CALLERS_ONLY;
 
@@ -2352,8 +2359,8 @@ int libxl_device_events_handler(libxl_ctx *ctx,
  * added or is not bound, the functions will emit a warning but return
  * SUCCESS.
  */
-int libxl_device_pci_assignable_add(libxl_ctx *ctx, libxl_device_pci *pcidev, 
int rebind);
-int libxl_device_pci_assignable_remove(libxl_ctx *ctx, libxl_device_pci 
*pcidev, int rebind);
+int libxl_device_pci_assignable_add(libxl_ctx *ctx, libxl_device_pci *pci, int 
rebind);
+int libxl_device_pci_assignable_remove(libxl_ctx *ctx, libxl_device_pci *pci, 
int rebind);
 libxl_device_pci *libxl_device_pci_assignable_list(libxl_ctx *ctx, int *num);
 
 /* CPUID handling */
diff --git a/tools/libs/light/libxl_create.c b/tools/libs/light/libxl_create.c
index 321a13e519..1f5052c520 100644
--- a/tools/libs/light/libxl_create.c
+++ b/tools/libs/light/libxl_create.c
@@ -1100,7 +1100,7 @@ int libxl__domain_config_setdefault(libxl__gc *gc,
 goto error_out;
 }
 
-bool need_pt = d_config->num_pcidevs || d_config->num_dtdevs;
+bool need_pt = d_config->num_pcis || d_config->num_dtdevs;
 if (c_info->passthrough == LIBXL_PASSTHROUGH_DEFAULT) {
 c_info->passthrough = need_pt
 ? LIBXL_PASSTHROUGH_ENABLED : LIBXL_PASSTHROUGH_DISABLED;
@@ -1141,7 +1141,7 @@ int libxl__domain_config_setdefault(libxl__gc *gc,
  * assignment when PoD is enabled.
  */
 if (d_config->c_info.type != LIBXL_DOMAIN_TYPE_PV &&
-d_config->num_pcidevs && pod_enabled) {
+d_config->num_pcis && pod_enabled) {
 ret = ERROR_INVAL;
 LOGD(ERROR, domid,
  "PCI device assignment for HVM guest failed due to PoD enabled");
@@ -1817,7 +1817,7 @@ const libxl__device_type *device_type_tbl[] = {
 &libxl__vtpm_devtype,
 &libxl__usbctrl_devtype,
 &libxl__usbdev_devtype,
-&libxl__pcidev_devtype,
+&libxl__pci_devtype,
 &libxl__dtdev_devtype,
 &libxl__vdispl_devtype,
 &libxl__vsnd_devtype,
diff --git a/tools/libs/light/libxl_dm.c b/tools/libs/light/libxl_dm.c
index 3da83259c0..8ebe1b60c9 100644
--- a/tools/libs/light/libxl_dm.c
+++ b/tools/libs/light/libxl_dm.c
@@ -442,7 +442,7 @@ int libxl__domain_device_construct_rdm(libxl__gc *gc,
 
 /* Might 

[PATCH v3 00/23] xl / libxl: named PCI pass-through devices

2020-11-23 Thread Paul Durrant
From: Paul Durrant 

Paul Durrant (23):
  xl / libxl: s/pcidev/pci and remove DEFINE_DEVICE_TYPE_STRUCT_X
  libxl: make libxl__device_list() work correctly for
LIBXL__DEVICE_KIND_PCI...
  libxl: Make sure devices added by pci-attach are reflected in the
config
  libxl: add/recover 'rdm_policy' to/from PCI backend in xenstore
  libxl: s/detatched/detached in libxl_pci.c
  libxl: remove extraneous arguments to do_pci_remove() in libxl_pci.c
  libxl: stop using aodev->device_config in libxl__device_pci_add()...
  libxl: generalise 'driver_path' xenstore access functions in
libxl_pci.c
  libxl: remove unnecessary check from libxl__device_pci_add()
  libxl: remove get_all_assigned_devices() from libxl_pci.c
  libxl: make sure callers of libxl_device_pci_list() free the list
after use
  libxl: add libxl_device_pci_assignable_list_free()...
  libxl: use COMPARE_PCI() macro is_pci_in_array()...
  docs/man: extract documentation of PCI_SPEC_STRING from the xl.cfg
manpage...
  docs/man: improve documentation of PCI_SPEC_STRING...
  docs/man: fix xl(1) documentation for 'pci' operations
  libxl: introduce 'libxl_pci_bdf' in the idl...
  libxlu: introduce xlu_pci_parse_spec_string()
  libxl: modify
libxl_device_pci_assignable_add/remove/list/list_free()...
  docs/man: modify xl(1) in preparation for naming of assignable devices
  xl / libxl: support naming of assignable devices
  docs/man: modify xl-pci-configuration(5) to add 'name' field to
PCI_SPEC_STRING
  xl / libxl: support 'xl pci-attach/detach' by name

 docs/man/xl-pci-configuration.5.pod  |  218 +++
 docs/man/xl.1.pod.in |   39 +-
 docs/man/xl.cfg.5.pod.in |   68 +--
 tools/golang/xenlight/helpers.gen.go |   77 ++-
 tools/golang/xenlight/types.gen.go   |8 +-
 tools/include/libxl.h|   67 ++-
 tools/include/libxlutil.h|8 +-
 tools/libs/light/libxl_create.c  |6 +-
 tools/libs/light/libxl_device.c  |   66 ++-
 tools/libs/light/libxl_dm.c  |   18 +-
 tools/libs/light/libxl_internal.h|   55 +-
 tools/libs/light/libxl_pci.c | 1048 ++
 tools/libs/light/libxl_types.idl |   19 +-
 tools/libs/util/libxlu_pci.c |  359 ++--
 tools/ocaml/libs/xl/xenlight_stubs.c |   19 +-
 tools/xl/xl_cmdtable.c   |   16 +-
 tools/xl/xl_parse.c  |   30 +-
 tools/xl/xl_pci.c|  163 +++---
 tools/xl/xl_sxp.c|   12 +-
 19 files changed, 1367 insertions(+), 929 deletions(-)
 create mode 100644 docs/man/xl-pci-configuration.5.pod
---
Cc: Anthony PERARD 
Cc: Christian Lindig 
Cc: David Scott 
Cc: George Dunlap 
Cc: Ian Jackson 
Cc: Nick Rosbrook 
Cc: Wei Liu 
-- 
2.11.0




[PATCH v3 04/23] libxl: add/recover 'rdm_policy' to/from PCI backend in xenstore

2020-11-23 Thread Paul Durrant
From: Paul Durrant 

Other parameters, such as 'msitranslate' and 'permissive' are dealt with
but 'rdm_policy' appears to have been completely missed.

Signed-off-by: Paul Durrant 
---
Cc: Ian Jackson 
Cc: Wei Liu 
---
 tools/libs/light/libxl_pci.c | 9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/tools/libs/light/libxl_pci.c b/tools/libs/light/libxl_pci.c
index da01c77ba2..50c96cbfa6 100644
--- a/tools/libs/light/libxl_pci.c
+++ b/tools/libs/light/libxl_pci.c
@@ -61,9 +61,9 @@ static void libxl_create_pci_backend_device(libxl__gc *gc,
 flexarray_append_pair(back, GCSPRINTF("vdevfn-%d", num), 
GCSPRINTF("%x", pci->vdevfn));
 flexarray_append(back, GCSPRINTF("opts-%d", num));
 flexarray_append(back,
-  GCSPRINTF("msitranslate=%d,power_mgmt=%d,permissive=%d",
- pci->msitranslate, pci->power_mgmt,
- pci->permissive));
+  
GCSPRINTF("msitranslate=%d,power_mgmt=%d,permissive=%d,rdm_policy=%s",
+pci->msitranslate, pci->power_mgmt,
+pci->permissive, 
libxl_rdm_reserve_policy_to_string(pci->rdm_policy)));
 flexarray_append_pair(back, GCSPRINTF("state-%d", num), GCSPRINTF("%d", 
XenbusStateInitialising));
 }
 
@@ -2374,6 +2374,9 @@ static int libxl__device_pci_from_xs_be(libxl__gc *gc,
 } else if (!strcmp(p, "permissive")) {
 p = strtok_r(NULL, ",=", &saveptr);
 pci->permissive = atoi(p);
+} else if (!strcmp(p, "rdm_policy")) {
+p = strtok_r(NULL, ",=", &saveptr);
+libxl_rdm_reserve_policy_from_string(p, &pci->rdm_policy);
 }
 } while ((p = strtok_r(NULL, ",=", &saveptr)) != NULL);
 }
-- 
2.11.0
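
[Editorial illustration, not part of the patch] The round trip that this
patch completes (write 'rdm_policy' into the 'opts-%d' string, then recover it
with strtok_r()) can be sketched standalone as below. This is a minimal
sketch under stated assumptions: 'struct pci_opts', pci_opts_format() and
pci_opts_parse() are hypothetical names, and the policy is kept as plain text
here, whereas libxl converts it with libxl_rdm_reserve_policy_to_string() and
libxl_rdm_reserve_policy_from_string().

```c
#define _POSIX_C_SOURCE 200809L   /* for strtok_r() and strdup() */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the relevant libxl_device_pci fields. */
struct pci_opts {
    int msitranslate;
    int power_mgmt;
    int permissive;
    char rdm_policy[16];   /* kept as text in this sketch */
};

/* Serialise the options in the same key=value,... layout as the patch. */
static int pci_opts_format(const struct pci_opts *o, char *buf, size_t len)
{
    int n;

    assert(o && buf);
    n = snprintf(buf, len,
                 "msitranslate=%d,power_mgmt=%d,permissive=%d,rdm_policy=%s",
                 o->msitranslate, o->power_mgmt, o->permissive,
                 o->rdm_policy);

    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Parse the string back; unknown keys are skipped along with their value. */
static int pci_opts_parse(const char *s, struct pci_opts *o)
{
    char *copy, *saveptr = NULL, *key, *val;

    assert(s && o);
    copy = strdup(s);               /* strtok_r() modifies its input */
    if (!copy)
        return -1;

    memset(o, 0, sizeof(*o));
    for (key = strtok_r(copy, ",=", &saveptr); key;
         key = strtok_r(NULL, ",=", &saveptr)) {
        val = strtok_r(NULL, ",=", &saveptr);
        if (!val)
            break;                  /* key without a value: stop parsing */
        if (!strcmp(key, "msitranslate"))
            o->msitranslate = atoi(val);
        else if (!strcmp(key, "power_mgmt"))
            o->power_mgmt = atoi(val);
        else if (!strcmp(key, "permissive"))
            o->permissive = atoi(val);
        else if (!strcmp(key, "rdm_policy"))
            snprintf(o->rdm_policy, sizeof(o->rdm_policy), "%s", val);
    }

    free(copy);
    return 0;
}
```

Before this patch the equivalent of the last 'else if' branch was missing, so
a policy written by one side would have been silently dropped on read-back.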




Re: NetBSD dom0 PVH: hardware interrupts stalls

2020-11-23 Thread Manuel Bouyer
On Mon, Nov 23, 2020 at 06:06:10PM +0100, Roger Pau Monné wrote:
> OK, I'm afraid this is likely too verbose and messes with the timings.
> 
> I've been looking (again) into the code, and I found something weird
> that I think could be related to the issue you are seeing, but haven't
> managed to try to boot the NetBSD kernel provided in order to assert
> whether it solves the issue or not (or even whether I'm able to
> repro it). Would you mind giving the patch below a try?

With this, I get the same hang but XEN outputs don't wake up the interrupt
any more. The NetBSD counter shows only one interrupt for ioapic2 pin 2,
while I would have about 8 at the time of the hang.

So, now it looks like interrupts are blocked forever. At
http://www-soc.lip6.fr/~bouyer/xen-log5.txt
you'll find the output of the 'i' key.

-- 
Manuel Bouyer 
 NetBSD: 26 ans d'experience feront toujours la difference
--



Re: [PATCH] MAINTINERS: Propose Ian Jackson as new release manager

2020-11-23 Thread Julien Grall

Hi George,

NIT: s/MAINTINERS/MAINTAINERS/

On 23/11/2020 16:04, George Dunlap wrote:

Ian Jackson has agreed to be the release manager for 4.15.  Signify
this by giving him maintainership over CHANGELOG.md.

Signed-off-by: George Dunlap 


Acked-by: Julien Grall 

Cheers,

--
Julien Grall



Re: [PATCH v3 1/3] xen/ns16550: Make ns16550 driver usable on ARM with HAS_PCI enabled.

2020-11-23 Thread Julien Grall

Hi Stefano,

On 20/11/2020 00:14, Stefano Stabellini wrote:

On Thu, 19 Nov 2020, Julien Grall wrote:

On Thu, 19 Nov 2020, 23:38 Stefano Stabellini,  wrote:
   On Thu, 19 Nov 2020, Rahul Singh wrote:
   > > On 19/11/2020 09:53, Jan Beulich wrote:
   > >> On 19.11.2020 10:21, Julien Grall wrote:
   > >>> Hi Jan,
   > >>>
   > >>> On 19/11/2020 09:05, Jan Beulich wrote:
   >  On 18.11.2020 16:50, Julien Grall wrote:
   > > On 16/11/2020 12:25, Rahul Singh wrote:
   > >> NS16550 driver has PCI support that is under HAS_PCI flag. When 
HAS_PCI
   > >> is enabled for ARM, compilation error is observed for ARM 
architecture
   > >> because ARM platforms do not have full PCI support available.
   > >   >
   > >> Introducing new kconfig option CONFIG_HAS_NS16550_PCI to support
   > >> ns16550 PCI for X86.
   > >>
   > >> For X86 platforms it is enabled by default. For ARM platforms 
it is
   > >> disabled by default, once we have proper support for NS16550 
PCI for
   > >> ARM we can enable it.
   > >>
   > >> No functional change.
   > >
   > > NIT: I would say "No functional change intended" to make clear 
this is
   > > an expectation and hopefully will be correct :).
   > >
   > > Regarding the commit message itself, I would suggest the 
following to
   > > address Jan's concern:
   > 
   >  While indeed this is a much better description, I continue to 
think
   >  that the proposed Kconfig option is undesirable to have.
   > >>>
   > >>> I am yet to see an argument into why we should keep the PCI code
   > >>> compiled on Arm when there will be no-use
   > >> Well, see my patch suppressing building of quite a part of it.
   > >
   > > I will let Rahul figuring out whether your patch series is 
sufficient to fix compilation issues (this is what matters right
   now).
   >
   > I just checked the compilation error for ARM after enabling the 
HAS_PCI on ARM. I am observing the same compilation error
   what I observed previously.
   > There are two new errors related to struct uart_config and struct 
part_param as those struct defined globally but used under
   X86 flags.
   >
   > At top level:
   > ns16550.c:179:48: error: ‘uart_config’ defined but not used 
[-Werror=unused-const-variable=]
   >  static const struct ns16550_config __initconst uart_config[] =
   >                                                 ^~~
   > ns16550.c:104:54: error: ‘uart_param’ defined but not used 
[-Werror=unused-const-variable=]
   >  static const struct ns16550_config_param __initconst uart_param[] = {
   >
   >
   > >
   >  Either,
   >  following the patch I've just sent, truly x86-specific things (at
   >  least as far as current state goes - if any of this was to be
   >  re-used by a future port, suitable further abstraction may be
   >  needed) should be guarded by CONFIG_X86 (or abstracted into arch
   >  hooks), or the HAS_PCI_MSI proposal would at least want further
   >  investigating as to its feasibility to address the issues at hand.
   > >>>
   > >>> I would be happy with CONFIG_X86, despite the fact that this is 
only
   > >>> deferring the problem.
   > >>>
   > >>> Regarding HAS_PCI_MSI, I don't really see the point of introducing 
given
   > >>> that we are not going to use NS16550 PCI on Arm in the forseeable
   > >>> future.
   > >> And I continue to fail to see what would guarantee this: As soon
   > >> as you can plug in such a card into an Arm system, people will
   > >> want to be able use it. That's why we had to add support for it
   > >> on x86, after all.
   > >
   > > Well, plug-in PCI cards on Arm has been available for quite a 
while... Yet I haven't heard anyone asking for NS16550 PCI
   support.
   > >
   > > This is probably because SBSA compliant server should always provide 
an SBSA UART (a cut-down version of the PL011). So why
   would bother to lose a PCI slot for yet another UART?
   > >
   > >> >> So why do we need a finer graine Kconfig?
   > >> Because most of the involved code is indeed MSI-related?
   > >
   > > Possibly, yet it would not be necessary if we don't want NS16550 PCI 
support...
   >
   > To fix compilation error on ARM as per the discussion there are below 
options please suggest which one to use to proceed
   further.
   >
   > 1. Use the newly introduced CONFIG_HAS_NS16550_PCI config options. 
This helps also non-x86 architecture in the future not to
   have compilation error
   > what we are observing now when HAS_PCI is enabled.
   >
   > 2. Guard the remaining x86 specific code with 

Re: [PATCH] MAINTINERS: Propose Ian Jackson as new release manager

2020-11-23 Thread Ian Jackson
George Dunlap writes ("[PATCH] MAINTINERS: Propose Ian Jackson as new release 
manager"):
> Ian Jackson has agreed to be the release manager for 4.15.  Signify
> this by giving him maintainership over CHANGELOG.md.

Acked-by: Ian Jackson 

Obviously that signifies my consent but I think it needs more acks.

Wei, Juergen, Paul, I think I am likely to ask you some questions.
Any tips etc would be welcome.

Thanks,
Ian.



Re: [RFC] MAINTAINERS tag for cleanup robot

2020-11-23 Thread Tom Rix


On 11/22/20 10:22 AM, Joe Perches wrote:
> On Sun, 2020-11-22 at 08:33 -0800, Tom Rix wrote:
>> On 11/21/20 9:10 AM, Joe Perches wrote:
>>> On Sat, 2020-11-21 at 08:50 -0800, t...@redhat.com wrote:
>>>> A difficult part of automating commits is composing the subsystem
>>>> preamble in the commit log.  For the ongoing effort of a fixer producing
>>>> one or two fixes a release the use of 'treewide:' does not seem
>>>> appropriate.
>>>>
>>>> It would be better if the normal prefix was used.  Unfortunately normal is
>>>> not consistent across the tree.
>>>>
>>>> So I am looking for comments for adding a new tag to the MAINTAINERS file
>>>>
>>>> D: Commit subsystem prefix
>>>>
>>>> ex/ for FPGA DFL DRIVERS
>>>>
>>>> D: fpga: dfl:
>>> I'm all for it.  Good luck with the effort.  It's not completely trivial.
>>>
>>> From a decade ago:
>>>
>>> https://lore.kernel.org/lkml/1289919077.28741.50.camel@Joe-Laptop/
>>>
>>> (and that thread started with extra semicolon patches too)
>> Reading the history, how about this.
>>
>> get_maintainer.pl outputs a single prefix, if multiple files have the
>> same prefix it works, if they don't it's an error.
>>
>> Another script 'commit_one_file.sh' does the call to get_maintainer.pl
>> to get the prefix and be called by run-clang-tools.py to get the fixer
>> specific message.
> It's not whether the script used is get_maintainer or any other script,
> the question is really if the MAINTAINERS file is the appropriate place
> to store per-subsystem patch specific prefixes.
>
> It is.
>
> Then the question should be how are the forms described and what is the
> inheritance priority.  My preference would be to have a default of
> inherit the parent base and add basename(subsystem dirname).
>
> Commit history seems to have standardized on using colons as the separator
> between the commit prefix and the subject.
>
> A good mechanism to explore how various subsystems have used prefixes in
> the past might be something like:
>
> $ git log --no-merges --pretty='%s' -  | \
>   perl -n -e 'print substr($_, 0, rindex($_, ":") + 1) . "\n";' | \
>   sort | uniq -c | sort -rn

Thanks, I have shamelessly stolen this line and limited the commits to the 
maintainer.

I will post something once the generation of the prefixes is done.

Tom
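
[Editorial illustration, not part of the original mail] The perl one-liner
quoted above keeps everything up to and including the last colon of each
commit subject. A minimal C sketch of that same extraction step follows;
subject_prefix() is a hypothetical helper name, and the counting and sorting
done by 'sort | uniq -c' are deliberately left out.

```c
#include <stdio.h>
#include <string.h>

/*
 * Copy the part of 'subject' up to and including its last ':' into 'out'.
 * Returns the number of characters copied; 0 means no prefix was found.
 * The result is truncated if 'out' is too small.
 */
static size_t subject_prefix(const char *subject, char *out, size_t len)
{
    const char *colon = strrchr(subject, ':');
    size_t n = colon ? (size_t)(colon - subject) + 1 : 0;

    if (len == 0)
        return 0;
    if (n >= len)
        n = len - 1;        /* leave room for the terminator */

    memcpy(out, subject, n);
    out[n] = '\0';
    return n;
}
```

Because the last colon is used, a subject whose free text also contains a
colon yields a longer "prefix" than intended, which is one reason the raw
output still needs to be reviewed per subsystem.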




Re: NetBSD dom0 PVH: hardware interrupts stalls

2020-11-23 Thread Roger Pau Monné
On Mon, Nov 23, 2020 at 03:31:50PM +0100, Manuel Bouyer wrote:
> On Mon, Nov 23, 2020 at 01:51:12PM +0100, Roger Pau Monné wrote:
> > Hm, yes, it's quite weird. Do you know whether a NetBSD kernel can be
> > multibooted from pxelinux with Xen? I would like to see if I can
> > reproduce this myself.
> 
> Yes, if Xen+linux can boot, Xen+netbsd should boot too.
> In a previous mail I wrote:
> In case it helps, I put my Xen and netbsd kernels at
> http://www-soc.lip6.fr/~bouyer/netbsd-dom0-pvh/
> I boot it from the NetBSD boot loader with:
> menu=Boot Xen PVH:load /netbsd-test console=com0 root=dk0 -vx; multiboot 
> /xen-te
> st.gz dom0_mem=1024M console=com2 com2=57600,8n1 loglvl=all guest_loglvl=all 
> gnt
> tab_max_nr_frames=64 dom0=pvh iommu=debug
> I guess with grub this would be
> kernel /xen-test.gz dom0_mem=1024M console=com2 com2=57600,8n1 loglvl=all 
> guest_
> loglvl=all gnttab_max_nr_frames=64 dom0=pvh iommu=debug
> module /netbsd-test console=com0 root=dk0 -vx
> 
> (yes, com2 for xen and com0 for netbsd, that's not a bug :)
> You can enter the NetBSD debugger with
> +
> you can then enter commands, like
> sh ev /i
> to see the interrupt counters
> 
> > 
> > I have the following patch also which will print a warning message
> > when GSI 34 is injected from hardware or when Xen performs an EOI
> > (either from a time out or when reacting to a guest one). I would
> > expect at least the interrupt injection one to trigger together with
> > the existing message.
> 
> It's quite verbose. I put the full log at
> http://www-soc.lip6.fr/~bouyer/xen-log4.txt

OK, I'm afraid this is likely too verbose and messes with the timings.

I've been looking (again) into the code, and I found something weird
that I think could be related to the issue you are seeing, but haven't
managed to try to boot the NetBSD kernel provided in order to assert
whether it solves the issue or not (or even whether I'm able to
repro it). Would you mind giving the patch below a try?

Thanks, Roger.
---8<---
diff --git a/xen/drivers/passthrough/io.c b/xen/drivers/passthrough/io.c
index 6b1305a3e5..ebd6c8e933 100644
--- a/xen/drivers/passthrough/io.c
+++ b/xen/drivers/passthrough/io.c
@@ -174,7 +174,6 @@ static void pt_irq_time_out(void *data)
  * In the identity mapped case the EOI can also be done now, this way
  * the iteration over the list of domain pirqs is avoided.
  */
-hvm_gsi_deassert(irq_map->dom, dpci_pirq(irq_map)->pirq);
 irq_map->flags |= HVM_IRQ_DPCI_EOI_LATCH;
 pt_irq_guest_eoi(irq_map->dom, irq_map, NULL);
 spin_unlock(&irq_map->dom->event_lock);




Re: [PATCH V2 12/23] xen/ioreq: Remove "hvm" prefixes from involved function names

2020-11-23 Thread Oleksandr



On 23.11.20 17:54, Paul Durrant wrote:

Hi Paul

>> -Original Message-
>> From: Oleksandr 
>> Sent: 23 November 2020 15:48
>> To: Jan Beulich ; Paul Durrant 
>> Cc: Oleksandr Tyshchenko ; Andrew Cooper 
>> ;
>> Roger Pau Monné ; Wei Liu ; George Dunlap
>> ; Ian Jackson ; Julien Grall 
>> ; Stefano
>> Stabellini ; Jun Nakajima ; 
>> Kevin Tian
>> ; Julien Grall ; 
>> xen-devel@lists.xenproject.org
>> Subject: Re: [PATCH V2 12/23] xen/ioreq: Remove "hvm" prefixes from involved 
>> function names
>>
>>
>> On 23.11.20 16:56, Jan Beulich wrote:
>>
>> Hi Jan, Paul
>>
>>> On 23.11.2020 15:39, Oleksandr wrote:
>>>> As it was agreed, below the list of proposed renaming (naming) within
>>>> current series.
>>> Thanks for compiling this. A couple of suggestions for consideration:
>>>
>>>> 1. Global (existing):
>>>> hvm_map_mem_type_to_ioreq_server     -> ioreq_server_map_mem_type
>>>> hvm_select_ioreq_server              -> ioreq_server_select
>>>> hvm_send_ioreq                       -> ioreq_send
>>>> hvm_ioreq_init                       -> ioreq_init
>>> ioreq_domain_init() (or, imo less desirable domain_ioreq_init())?
>> On Arm (for example) I see two variants are present:
>> 1. That starts with subsystem:
>> - tee_domain_init
>> - iommu_domain_init
>>
>>
>> 2. Where subsystem in the middle:
>> - domain_io_init
>> - domain_vuart_init
>> - domain_vtimer_init
>>
>> If there is no rule, but a matter of taste then I would use
>> ioreq_domain_init(),
>> so arch_ioreq_init() wants to be arch_ioreq_domain_init().
>>
>>>
>>>> hvm_destroy_all_ioreq_servers        -> ioreq_server_destroy_all
>>>> hvm_all_ioreq_servers_add_vcpu       -> ioreq_server_add_vcpu_all
>>>> hvm_all_ioreq_servers_remove_vcpu    -> ioreq_server_remove_vcpu_all
>>>> hvm_broadcast_ioreq                  -> ioreq_broadcast
>>>> hvm_create_ioreq_server              -> ioreq_server_create
>>>> hvm_get_ioreq_server_info            -> ioreq_server_get_info
>>>> hvm_map_io_range_to_ioreq_server     -> ioreq_server_map_io_range
>>>> hvm_unmap_io_range_from_ioreq_server -> ioreq_server_unmap_io_range
>>>> hvm_set_ioreq_server_state           -> ioreq_server_set_state
>>>> hvm_destroy_ioreq_server             -> ioreq_server_destroy
>>>> hvm_get_ioreq_server_frame           -> ioreq_server_get_frame
>>>> hvm_ioreq_needs_completion           -> ioreq_needs_completion
>>>> hvm_mmio_first_byte                  -> ioreq_mmio_first_byte
>>>> hvm_mmio_last_byte                   -> ioreq_mmio_last_byte
>>>> send_invalidate_req                  -> ioreq_signal_mapcache_invalidate
>>>>
>>>> handle_hvm_io_completion             -> handle_io_completion
>>> For this one I'm not sure what to suggest, but I'm not overly happy
>>> with the name.
>>
>> I also failed to find a better name. Probably ioreq_ or vcpu_ioreq_
>> prefix wants to be added here?
>>
>>>
>>>> hvm_io_pending                       -> io_pending
>>> vcpu_ioreq_pending() or vcpu_any_ioreq_pending()?
>>
>> I am fine with vcpu_ioreq_pending()
>>
> ...in which case vcpu_ioreq_handle_completion() seems like a reasonable choice.

ok, will rename here ...

>>>
>>>> 2. Global (new):
>>>> arch_io_completion

and here arch_vcpu_ioreq_completion() (without handle in the middle).

>>>> arch_ioreq_server_map_pages
>>>> arch_ioreq_server_unmap_pages
>>>> arch_ioreq_server_enable
>>>> arch_ioreq_server_disable
>>>> arch_ioreq_server_destroy
>>>> arch_ioreq_server_map_mem_type
>>>> arch_ioreq_server_destroy_all
>>>> arch_ioreq_server_get_type_addr
>>>> arch_ioreq_init
>>> Assuming this is the arch hook of the similarly named function
>>> further up, a similar adjustment may then be wanted here.
>>
>> Yes.
>>
>>>
>>>> domain_has_ioreq_server
>>>>
>>>>
>>>> 3. Local (existing) in common ioreq.c:
>>>> hvm_alloc_ioreq_mfn                  -> ioreq_alloc_mfn
>>>> hvm_free_ioreq_mfn                   -> ioreq_free_mfn
>>> These two are server functions, so should imo be ioreq_server_...().
>>
>> ok, but ...
>>
>>> However, if they're static (as they're now), no distinguishing
>>> prefix is strictly necessary, i.e. alloc_mfn() and free_mfn() may
>>> be fine. The two names may be too short for Paul's taste, though.
>>> Some similar shortening may be possible for some or all of the ones
>>
>> ... In general I would be fine with any option. However, using the
>> shortening rule for all
>> we are going to end up with single-word function names (enable, init, etc).
>> So I would prefer to leave locals as is (but dropping hvm prefixes of
>> course and
>> clarify ioreq_server_alloc_mfn/ioreq_server_free_mfn).
>>
>> Paul, Jan what do you think?
>
> I prefer ioreq_server_alloc_mfn/ioreq_server_free_mfn. The problem with 
> shortening is that function names become ambiguous within the source base and 
> hence harder to find.

Got it.


Thank you

--
Regards,

Oleksandr Tyshchenko




Re: [PATCH 000/141] Fix fall-through warnings for Clang

2020-11-23 Thread Joe Perches
On Mon, 2020-11-23 at 07:58 -0800, James Bottomley wrote:
> We're also complaining about the inability to recruit maintainers:
> 
> https://www.theregister.com/2020/06/30/hard_to_find_linux_maintainers_says_torvalds/
> 
> And burn out:
> 
> http://antirez.com/news/129

https://www.wired.com/story/open-source-coders-few-tired/

> What I'm actually trying to articulate is a way of measuring value of
> the patch vs cost ... it has nothing really to do with who foots the
> actual bill.

It's unclear how to measure value in consistency.

But one way that costs can be reduced is by automation and _not_
involving maintainers when the patch itself is provably correct.

> One thesis I'm actually starting to formulate is that this continual
> devaluing of maintainers is why we have so much difficulty keeping and
> recruiting them.

The linux kernel has something like 1500 different maintainers listed
in the MAINTAINERS file.  That's not a trivial number.

$ git grep '^M:' MAINTAINERS | sort | uniq -c | wc -l
1543
$ git grep '^M:' MAINTAINERS| cut -f1 -d'<' | sort | uniq -c | wc -l
1446

I think the question you are asking is about trust and how it
affects development.

And back to that wired story, the actual number of what you might
be considering to be maintainers is likely less than 10% of the
listed numbers above.





Re: [Intel-wired-lan] [PATCH 000/141] Fix fall-through warnings for Clang

2020-11-23 Thread James Bottomley
On Mon, 2020-11-23 at 07:03 -0600, Gustavo A. R. Silva wrote:
> On Sun, Nov 22, 2020 at 11:53:55AM -0800, James Bottomley wrote:
> > On Sun, 2020-11-22 at 11:22 -0800, Joe Perches wrote:
> > > On Sun, 2020-11-22 at 11:12 -0800, James Bottomley wrote:
> > > > On Sun, 2020-11-22 at 10:25 -0800, Joe Perches wrote:
> > > > > On Sun, 2020-11-22 at 10:21 -0800, James Bottomley wrote:
> > > > > > Please tell me our reward for all this effort isn't a
> > > > > > single missing error print.
> > > > > 
> > > > > There were quite literally dozens of logical defects found
> > > > > by the fallthrough additions.  Very few were logging only.
> > > > 
> > > > So can you give us the best examples (or indeed all of them if
> > > > someone is keeping score)?  hopefully this isn't a US election
> > > > situation ...
> > > 
> > > Gustavo?  Are you running for congress now?
> > > 
> > > https://lwn.net/Articles/794944/
> > 
> > That's 21 reported fixes of which about 50% seem to produce no
> > change in code behaviour at all, a quarter seem to have no user
> > visible effect with the remaining quarter producing unexpected
> > errors on obscure configuration parameters, which is why no-one
> > really noticed them before.
> 
> The really important point here is the number of bugs this has
> prevented and will prevent in the future. See an example of this,
> below:
> 
> https://lore.kernel.org/linux-iio/20190813135802.gb27...@kroah.com/

I think this falls into the same category as the other six bugs: it
changes the output/input for parameters but no-one has really noticed,
usually because the command is obscure or the bias effect is minor.

> This work is still relevant, even if the total number of issues/bugs
> we find in the process is zero (which is not the case).

Really, no ... something which produces no improvement has no value at
all ... we really shouldn't be wasting maintainer time with it because
it has a cost to merge.  I'm not sure we understand where the balance
lies in value vs cost to merge but I am confident in the zero value
case.

> "The sucky thing about doing hard work to deploy hardening is that
> the result is totally invisible by definition (things not happening)
> [..]"
> - Dmitry Vyukov

Really, no.  Something that can't be measured at all doesn't exist.

And actually hardening is one of those things you can measure (which I
do have to admit isn't true for everything in the security space) ...
it's number of exploitable bugs found before you did it vs number of
exploitable bugs found after you did it.  Usually hardening eliminates
a class of bug, so the way I've measured hardening before is to go
through the CVE list for the last couple of years for product X, find
all the bugs that are of the class we're looking to eliminate and say
if we had hardened X against this class of bug we'd have eliminated Y%
of the exploits.  It can be quite impressive if Y is a suitably big
number.
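
A minimal sketch of the measurement described above, for illustration only.
The function name and the toy input are assumptions, not real CVE data: given
one bug-class label per past CVE for product X, it returns the share (Y%) that
falls into the class the hardening would eliminate.

```python
def share_eliminated(cve_classes, hardened_class):
    """Return the percentage of past CVEs that belong to `hardened_class`.

    `cve_classes` holds one bug-class label per CVE (hypothetical labels).
    """
    if not cve_classes:
        raise ValueError("need at least one CVE to compute a share")
    hits = sum(1 for label in cve_classes if label == hardened_class)
    return 100.0 * hits / len(cve_classes)


# Toy data, not taken from any real CVE list.
past_cves = ["overflow", "use-after-free", "overflow", "race"]
print(share_eliminated(past_cves, "overflow"))
```

The number is only as good as the classification of each CVE, which is the
manual part of going through the list.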

James





Re: [PATCH 000/141] Fix fall-through warnings for Clang

2020-11-23 Thread Rafael J. Wysocki
On Mon, Nov 23, 2020 at 4:58 PM James Bottomley
 wrote:
>
> On Mon, 2020-11-23 at 15:19 +0100, Miguel Ojeda wrote:
> > On Sun, Nov 22, 2020 at 11:36 PM James Bottomley
> >  wrote:

[cut]

> >
> > Maintainers routinely review 1-line trivial patches, not to mention
> > internal API changes, etc.
>
> We're also complaining about the inability to recruit maintainers:
>
> https://www.theregister.com/2020/06/30/hard_to_find_linux_maintainers_says_torvalds/
>
> And burn out:
>
> http://antirez.com/news/129

Right.

> The whole crux of your argument seems to be maintainers' time isn't
> important so we should accept all trivial patches ... I'm pushing back
> on that assumption in two places, firstly the valuelessness of the time
> and secondly that all trivial patches are valuable.
>
> > If some company does not want to pay for that, that's fine, but they
> > don't get to be maintainers and claim `Supported`.
>
> What I'm actually trying to articulate is a way of measuring value of
> the patch vs cost ... it has nothing really to do with who foots the
> actual bill.
>
> One thesis I'm actually starting to formulate is that this continual
> devaluing of maintainers is why we have so much difficulty keeping and
> recruiting them.

Absolutely.

This is just one of the factors involved, but a significant one IMV.



Re: [RFC] MAINTAINERS tag for cleanup robot

2020-11-23 Thread Lukas Bulwahn
On Mon, Nov 23, 2020 at 4:52 PM Jani Nikula  wrote:
>
> On Sat, 21 Nov 2020, James Bottomley  
> wrote:
> > On Sat, 2020-11-21 at 08:50 -0800, t...@redhat.com wrote:
> >> A difficult part of automating commits is composing the subsystem
> >> preamble in the commit log.  For the ongoing effort of a fixer
> >> producing
> >> one or two fixes a release the use of 'treewide:' does not seem
> >> appropriate.
> >>
> >> It would be better if the normal prefix was used.  Unfortunately
> >> normal is
> >> not consistent across the tree.
> >>
> >>
> >>  D: Commit subsystem prefix
> >>
> >> ex/ for FPGA DFL DRIVERS
> >>
> >>  D: fpga: dfl:
> >>
> >
> > I've got to bet this is going to cause more issues than it solves.
>
> Agreed.
>

Tom, this is a problem only kernel janitors encounter; all other
developers really do not have that issue. The time spent on creating
the patch is much larger than the time saved if the commit log
header line prefix were derived automatically. I believe Julia
Lawall, Arnd Bergmann and Nathan Chancellor, as long-term
high-frequency janitors, already have scripted approaches to that
issue. Maybe they simply need to share these scripts with you, and you
consolidate them and share them with everyone?

Also, making clean-up patches cumbersome has a positive side as well;
maintainers are not swamped with fully automated patch submissions.
There have been some bad experiences with some submitters on that in
the past...

> > SCSI uses scsi: : for drivers but not every driver has a
> > MAINTAINERS entry.  We use either scsi: or scsi: core: for mid layer
> > things, but we're not consistent.  Block uses blk-: for all
> > of its stuff but almost no s have a MAINTAINERS entry.  So
> > the next thing you're going to cause is an explosion of suggested
> > MAINTAINERs entries.
>
> On the one hand, adoption of new MAINTAINERS entries has been really
> slow. Look at B, C, or P, for instance. On the other hand, if this were
> to get adopted, you'll potentially get conflicting prefixes for patches
> touching multiple files. Then what?
>
> I'm guessing a script looking at git log could come up with better
> suggestions for prefixes via popularity contest than manually maintained
> MAINTAINERS entries. It might not always get it right, but then human
> outsiders aren't going to always get it right either.
>
> Now you'll only need Someone(tm) to write the script. ;)
>
> Something quick like this:
>
> git log --since={1year} --pretty=format:%s --  |\
> grep -v "^\(Merge\|Revert\)" |\
> sed 's/:[^:]*$//' |\
> sort | uniq -c | sort -rn | head -5
>
> already gives me results that really aren't worse than some of the
> prefixes invented by drive-by contributors.
>

I agree; I do not see the need to introduce something in MAINTAINERS.
From my observations maintaining MAINTAINERS, there is sufficient work
on adoption and maintenance of the existing entries already without
yet another additional entry. Some entries are outdated or
wrong, and the janitor task of cleaning those up is already enough work
for involved janitors and enough churn for involved maintainers. So a
machine-learned approach as above is probably good enough, but if you
think you need more complex rules, try to learn them from the data at
hand... certainly a nice task to do with machine learning on commit
message prefixes.
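
For reference, a rough Python sketch of the popularity-contest idea from the
quoted shell pipeline. The function name rank_prefixes is made up for this
illustration, and it takes the subject lines as a plain list (for example the
output of the git log command quoted above, split into lines) rather than
calling git itself. Unlike the sed step in the pipeline, it also skips
subjects that contain no colon at all.

```python
import re
from collections import Counter


def rank_prefixes(subjects, top=5):
    """Return the `top` most common subject prefixes as (prefix, count) pairs."""
    counts = Counter()
    for subject in subjects:
        # Skip merges and reverts, like the grep -v step in the pipeline.
        if re.match(r"(Merge|Revert)", subject):
            continue
        # A subject without a colon carries no prefix (assumption of this sketch).
        if ":" not in subject:
            continue
        # Drop everything from the last colon on, like sed 's/:[^:]*$//'.
        prefix = subject.rsplit(":", 1)[0]
        counts[prefix] += 1
    return counts.most_common(top)


example = [
    "scsi: core: fix foo",
    "scsi: core: fix bar",
    "scsi: ufs: tweak baz",
    "Merge branch 'for-next'",
    "no prefix here",
]
print(rank_prefixes(example))
```

As with the shell version, the result is only a suggestion; a patch touching
several files can still yield conflicting prefixes.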

Lukas



RE: [PATCH] MAINTINERS: Propose Ian Jackson as new release manager

2020-11-23 Thread Paul Durrant
> -Original Message-
> From: George Dunlap 
> Sent: 23 November 2020 16:04
> To: xen-devel@lists.xenproject.org
> Cc: George Dunlap ; Ian Jackson 
> ; Wei Liu
> ; Andrew Cooper ; Jan Beulich 
> ; Roger Pau
> Monne ; Stefano Stabellini ; 
> Julien Grall
> ; Paul Durrant 
> Subject: [PATCH] MAINTINERS: Propose Ian Jackson as new release manager
> 
> Ian Jackson has agreed to be the release manager for 4.15.  Signify
> this by giving him maintainership over CHANGELOG.md.
> 
> Signed-off-by: George Dunlap 

Thank you Ian.

Acked-by: Paul Durrant 

> ---
> CC: Ian Jackson 
> CC: Wei Liu 
> CC: Andrew Cooper 
> CC: Jan Beulich 
> CC: Roger Pau Monne 
> CC: Stefano Stabellini 
> CC: Julien Grall 
> CC: Paul Durrant 
> ---
>  MAINTAINERS | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index dab38a6a14..a9872df1de 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -250,7 +250,7 @@ F:xen/include/public/arch-arm/
>  F:   xen/include/public/arch-arm.h
> 
>  Change Log
> -M:   Paul Durrant 
> +M:   Ian Jackson 
>  R:   Community Manager 
>  S:   Maintained
>  F:   CHANGELOG.md
> --
> 2.25.1





Re: [PATCH 2/4] x86/ACPI: fix S3 wakeup vector mapping

2020-11-23 Thread Andrew Cooper
On 23/11/2020 16:07, Roger Pau Monné wrote:
> On Mon, Nov 23, 2020 at 04:30:05PM +0100, Jan Beulich wrote:
>> On 23.11.2020 16:24, Roger Pau Monné wrote:
>>> On Mon, Nov 23, 2020 at 01:40:12PM +0100, Jan Beulich wrote:
>>>> --- a/xen/arch/x86/acpi/power.c
>>>> +++ b/xen/arch/x86/acpi/power.c
>>>> @@ -174,17 +174,20 @@ static void acpi_sleep_prepare(u32 state
>>>>  if ( state != ACPI_STATE_S3 )
>>>>  return;
>>>>
>>>> -wakeup_vector_va = __acpi_map_table(
>>>> -acpi_sinfo.wakeup_vector, sizeof(uint64_t));
>>>> -
>>>>  /* TBoot will set resume vector itself (when it is safe to do so). */
>>>>  if ( tboot_in_measured_env() )
>>>>  return;
>>>>
>>>> +set_fixmap(FIX_ACPI_END, acpi_sinfo.wakeup_vector);
>>>> +wakeup_vector_va = fix_to_virt(FIX_ACPI_END) +
>>>> +   PAGE_OFFSET(acpi_sinfo.wakeup_vector);
>>>> +
>>>>  if ( acpi_sinfo.vector_width == 32 )
>>>>  *(uint32_t *)wakeup_vector_va = bootsym_phys(wakeup_start);
>>>>  else
>>>>  *(uint64_t *)wakeup_vector_va = bootsym_phys(wakeup_start);
>>>> +
>>>> +clear_fixmap(FIX_ACPI_END);
>>> Why not use vmap here instead of the fixmap?
>> Considering the S3 path is relatively fragile (as in: we end up
>> breaking it more often than about anything else) I wanted to
>> make as little of a change as possible. Hence I decided to stick
>> to the fixmap use that was (indirectly) used before as well.
> Unless there's a restriction to use the ACPI fixmap entry I would just
> switch to use vmap, as it's used extensively in the code and less
> likely to trigger issues in the future, or else a bunch of other stuff
> would also be broken.
>
> IMO doing the mapping differently here when it's not required will end
> up turning this code more fragile in the long run.

We can't enter S3 at all until dom0 has booted, as one detail has to
come from AML.

Therefore, we're fully up and running by this point, and vmap() will be
fine.

However, why are we re-writing the wakeup vector every time?  It's fixed
by the position of the trampoline, so we'd actually simplify the S3 path
by only setting it up once.

~Andrew

(The fix for fragility is to actually test it, not shy away from making
any change)



Re: [PATCH] MAINTINERS: Propose Ian Jackson as new release manager

2020-11-23 Thread Roger Pau Monné
On Mon, Nov 23, 2020 at 04:04:00PM +, George Dunlap wrote:
> Ian Jackson has agreed to be the release manager for 4.15.  Signify
> this by giving him maintainership over CHANGELOG.md.
> 
> Signed-off-by: George Dunlap 

Acked-by: Roger Pau Monné 

Congratulations!



Re: [PATCH 2/4] x86/ACPI: fix S3 wakeup vector mapping

2020-11-23 Thread Roger Pau Monné
On Mon, Nov 23, 2020 at 04:30:05PM +0100, Jan Beulich wrote:
> On 23.11.2020 16:24, Roger Pau Monné wrote:
> > On Mon, Nov 23, 2020 at 01:40:12PM +0100, Jan Beulich wrote:
> >> --- a/xen/arch/x86/acpi/power.c
> >> +++ b/xen/arch/x86/acpi/power.c
> >> @@ -174,17 +174,20 @@ static void acpi_sleep_prepare(u32 state
> >>  if ( state != ACPI_STATE_S3 )
> >>  return;
> >>  
> >> -wakeup_vector_va = __acpi_map_table(
> >> -acpi_sinfo.wakeup_vector, sizeof(uint64_t));
> >> -
> >>  /* TBoot will set resume vector itself (when it is safe to do so). */
> >>  if ( tboot_in_measured_env() )
> >>  return;
> >>  
> >> +set_fixmap(FIX_ACPI_END, acpi_sinfo.wakeup_vector);
> >> +wakeup_vector_va = fix_to_virt(FIX_ACPI_END) +
> >> +   PAGE_OFFSET(acpi_sinfo.wakeup_vector);
> >> +
> >>  if ( acpi_sinfo.vector_width == 32 )
> >>  *(uint32_t *)wakeup_vector_va = bootsym_phys(wakeup_start);
> >>  else
> >>  *(uint64_t *)wakeup_vector_va = bootsym_phys(wakeup_start);
> >> +
> >> +clear_fixmap(FIX_ACPI_END);
> > 
> > Why not use vmap here instead of the fixmap?
> 
> Considering the S3 path is relatively fragile (as in: we end up
> breaking it more often than about anything else) I wanted to
> make as little of a change as possible. Hence I decided to stick
> to the fixmap use that was (indirectly) used before as well.

Unless there's a restriction to use the ACPI fixmap entry I would just
switch to use vmap, as it's used extensively in the code and less
likely to trigger issues in the future, or else a bunch of other stuff
would also be broken.

IMO doing the mapping differently here when it's not required will end
up turning this code more fragile in the long run.

Thanks, Roger.



[PATCH] MAINTINERS: Propose Ian Jackson as new release manager

2020-11-23 Thread George Dunlap
Ian Jackson has agreed to be the release manager for 4.15.  Signify
this by giving him maintainership over CHANGELOG.md.

Signed-off-by: George Dunlap 
---
CC: Ian Jackson 
CC: Wei Liu 
CC: Andrew Cooper 
CC: Jan Beulich 
CC: Roger Pau Monne 
CC: Stefano Stabellini 
CC: Julien Grall 
CC: Paul Durrant 
---
 MAINTAINERS | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index dab38a6a14..a9872df1de 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -250,7 +250,7 @@ F:  xen/include/public/arch-arm/
 F: xen/include/public/arch-arm.h
 
 Change Log
-M: Paul Durrant 
+M: Ian Jackson 
 R: Community Manager 
 S: Maintained
 F: CHANGELOG.md
-- 
2.25.1




Re: [PATCH v1 00/23] reduce overhead during live migration

2020-11-23 Thread Olaf Hering
There was no feedback to this series within the past three weeks.

Please review this series.

Thanks,
Olaf

On Thu, 29 Oct 2020 18:19:40 +0100
Olaf Hering wrote:

> The current live migration code can easily saturate a 1Gb link.
> There is still room for improvement with faster network connections.
> Even with this series reviewed and applied.
> See description of patch #6.
> 
> Olaf
> 
> Olaf Hering (23):
>   tools: add readv_exact to libxenctrl
>   tools: add xc_is_known_page_type to libxenctrl
>   tools: use xc_is_known_page_type
>   tools: unify type checking for data pfns in migration stream
>   tools: show migration transfer rate in send_dirty_pages
>   tools/guest: prepare to allocate arrays once
>   tools/guest: save: move batch_pfns
>   tools/guest: save: move mfns array
>   tools/guest: save: move types array
>   tools/guest: save: move errors array
>   tools/guest: save: move iov array
>   tools/guest: save: move rec_pfns array
>   tools/guest: save: move guest_data array
>   tools/guest: save: move local_pages array
>   tools/guest: restore: move pfns array
>   tools/guest: restore: move types array
>   tools/guest: restore: move mfns array
>   tools/guest: restore: move map_errs array
>   tools/guest: restore: move mfns array in populate_pfns
>   tools/guest: restore: move pfns array in populate_pfns
>   tools/guest: restore: split record processing
>   tools/guest: restore: split handle_page_data
>   tools/guest: restore: write data directly into guest
> 
>  tools/libs/ctrl/xc_private.c  |  54 ++-
>  tools/libs/ctrl/xc_private.h  |  34 ++
>  tools/libs/guest/xg_sr_common.c   |  33 +-
>  tools/libs/guest/xg_sr_common.h   |  86 +++-
>  tools/libs/guest/xg_sr_restore.c  | 562 +-
>  tools/libs/guest/xg_sr_save.c | 158 
>  tools/libs/guest/xg_sr_save_x86_hvm.c |   5 +-
>  tools/libs/guest/xg_sr_save_x86_pv.c  |  31 +-
>  8 files changed, 666 insertions(+), 297 deletions(-)




Re: [PATCH 000/141] Fix fall-through warnings for Clang

2020-11-23 Thread James Bottomley
On Mon, 2020-11-23 at 15:19 +0100, Miguel Ojeda wrote:
> On Sun, Nov 22, 2020 at 11:36 PM James Bottomley
>  wrote:
> > Well, it seems to be three years of someone's time plus the
> > maintainer review time and series disruption of nearly a thousand
> > patches.  Let's be conservative and assume the producer worked
> > about 30% on the series and it takes about 5-10 minutes per patch
> > to review, merge and for others to rework existing series.  So
> > let's say it's cost a person year of a relatively junior engineer
> > producing the patches and say 100h of review and application
> > time.  The latter is likely the big ticket item because it's what
> > we have in least supply in the kernel (even though it's 20x vs the
> > producer time).
> 
> How are you arriving at such numbers? It is a total of ~200 trivial
> lines.

Well, I used git.  It says that as of today in Linus' tree we have 889
patches related to fall throughs and the first series went in in
October 2017 ... ignoring a couple of outliers back to February.
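
A back-of-the-envelope sketch of that estimate, using only the figures quoted
in this thread (889 patches, 5-10 minutes per patch, three years at about
30%). The variable and function names are made up for the illustration.

```python
PATCHES = 889                # fall-through patches counted via git, as stated above
MINUTES_PER_PATCH = (5, 10)  # review/merge/rework estimate quoted in this thread
YEARS = 3                    # "three years of someone's time"
PRODUCER_SHARE = 0.30        # "worked about 30% on the series"


def review_hours(patches, minutes_per_patch):
    """Total review time in hours for `patches` patches."""
    return patches * minutes_per_patch / 60


low, high = (review_hours(PATCHES, m) for m in MINUTES_PER_PATCH)
producer_years = YEARS * PRODUCER_SHARE
print(f"review: {low:.0f}-{high:.0f} h, producer: {producer_years:.1f} person-years")
```

With these inputs the "say 100h" figure sits inside the computed review range,
and the producer time comes out slightly under one person-year.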

> > It's not about the risk of the changes it's about the cost of
> > implementing them.  Even if you discount the producer time (which
> > someone gets to pay for, and if I were the engineering manager, I'd
> > be unhappy about), the review/merge/rework time is pretty
> > significant in exchange for six minor bug fixes.  Fine, when a new
> > compiler warning comes along it's certainly reasonable to see if we
> > can benefit from it and the fact that the compiler people think
> > it's worthwhile is enough evidence to assume this initially.  But
> > at some point you have to ask whether that assumption is supported
> > by the evidence we've accumulated over the time we've been using
> > it.  And if the evidence doesn't support it perhaps it is time to
> > stop the experiment.
> 
> Maintainers routinely review 1-line trivial patches, not to mention
> internal API changes, etc.

We're also complaining about the inability to recruit maintainers:

https://www.theregister.com/2020/06/30/hard_to_find_linux_maintainers_says_torvalds/

And burn out:

http://antirez.com/news/129

The whole crux of your argument seems to be maintainers' time isn't
important so we should accept all trivial patches ... I'm pushing back
on that assumption in two places, firstly the valuelessness of the time
and secondly that all trivial patches are valuable.

> If some company does not want to pay for that, that's fine, but they
> don't get to be maintainers and claim `Supported`.

What I'm actually trying to articulate is a way of measuring value of
the patch vs cost ... it has nothing really to do with who foots the
actual bill.

One thesis I'm actually starting to formulate is that this continual
devaluing of maintainers is why we have so much difficulty keeping and
recruiting them.

James






[xen-unstable test] 156956: tolerable FAIL

2020-11-23 Thread osstest service owner
flight 156956 xen-unstable real [real]
http://logs.test-lab.xenproject.org/osstest/logs/156956/

Failures :-/ but no regressions.

Tests which are failing intermittently (not blocking):
 test-amd64-amd64-xl-rtds 20 guest-localmigrate/x10 fail pass in 156935
 test-amd64-i386-xl-qemuu-debianhvm-i386-xsm 12 debian-hvm-install fail pass in 156935
 test-armhf-armhf-xl-rtds 18 guest-start/debian.repeat fail pass in 156935

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stop                fail like 156935
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stop                fail like 156935
 test-amd64-i386-xl-qemut-ws16-amd64 19 guest-stop                fail like 156935
 test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stop                fail like 156935
 test-amd64-i386-xl-qemut-win7-amd64 19 guest-stop                fail like 156935
 test-armhf-armhf-libvirt     16 saverestore-support-check fail like 156935
 test-armhf-armhf-libvirt-raw 15 saverestore-support-check fail like 156935
 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stop                fail like 156935
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2  fail like 156935
 test-amd64-i386-xl-qemuu-win7-amd64 19 guest-stop                fail like 156935
 test-amd64-i386-xl-qemuu-ws16-amd64 19 guest-stop                fail like 156935
 test-amd64-i386-xl-pvshim    14 guest-start               fail   never pass
 test-amd64-amd64-libvirt     15 migrate-support-check     fail   never pass
 test-amd64-i386-libvirt-xsm  15 migrate-support-check     fail   never pass
 test-amd64-amd64-libvirt-xsm 15 migrate-support-check     fail   never pass
 test-amd64-i386-libvirt      15 migrate-support-check     fail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check fail never pass
 test-arm64-arm64-xl          15 migrate-support-check     fail   never pass
 test-arm64-arm64-xl          16 saverestore-support-check fail   never pass
 test-arm64-arm64-xl-credit1  15 migrate-support-check     fail   never pass
 test-arm64-arm64-xl-credit1  16 saverestore-support-check fail   never pass
 test-arm64-arm64-xl-xsm      15 migrate-support-check     fail   never pass
 test-arm64-arm64-xl-xsm      16 saverestore-support-check fail   never pass
 test-arm64-arm64-xl-credit2  15 migrate-support-check     fail   never pass
 test-arm64-arm64-xl-credit2  16 saverestore-support-check fail   never pass
 test-arm64-arm64-xl-thunderx 15 migrate-support-check     fail   never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-check fail   never pass
 test-arm64-arm64-libvirt-xsm 15 migrate-support-check     fail   never pass
 test-arm64-arm64-libvirt-xsm 16 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-arndale  15 migrate-support-check     fail   never pass
 test-armhf-armhf-xl-arndale  16 saverestore-support-check fail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check fail never pass
 test-amd64-amd64-libvirt-vhd 14 migrate-support-check     fail   never pass
 test-armhf-armhf-xl          15 migrate-support-check     fail   never pass
 test-armhf-armhf-xl          16 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-cubietruck 15 migrate-support-check     fail   never pass
 test-armhf-armhf-xl-cubietruck 16 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-credit1  15 migrate-support-check     fail   never pass
 test-armhf-armhf-xl-credit1  16 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-credit2  15 migrate-support-check     fail   never pass
 test-armhf-armhf-xl-credit2  16 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-multivcpu 15 migrate-support-check     fail   never pass
 test-armhf-armhf-xl-multivcpu 16 saverestore-support-check fail   never pass
 test-arm64-arm64-xl-seattle  15 migrate-support-check     fail   never pass
 test-arm64-arm64-xl-seattle  16 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-rtds     15 migrate-support-check     fail   never pass
 test-armhf-armhf-xl-rtds     16 saverestore-support-check fail   never pass
 test-armhf-armhf-libvirt     15 migrate-support-check     fail   never pass
 test-armhf-armhf-libvirt-raw 14 migrate-support-check     fail   never pass
 test-armhf-armhf-xl-vhd      14 migrate-support-check     fail   never pass
 test-armhf-armhf-xl-vhd      15 saverestore-support-check fail   never pass

version targeted for testing:
 xen  b659a5cebd611dbe698e63c03485b5fe8cd964ad
baseline version:
 xen  b659a5cebd611dbe698e63c03485b5fe8cd964ad

Last test of basis   156956  2020-11-23 01:51:33 Z    0 days
Testing same since  (not found) 0 attempts

jobs:
 build-amd64-xsm  pass
 

RE: [PATCH V2 12/23] xen/ioreq: Remove "hvm" prefixes from involved function names

2020-11-23 Thread Paul Durrant
> -Original Message-
> From: Oleksandr 
> Sent: 23 November 2020 15:48
> To: Jan Beulich ; Paul Durrant 
> Cc: Oleksandr Tyshchenko ; Andrew Cooper 
> ;
> Roger Pau Monné ; Wei Liu ; George Dunlap
> ; Ian Jackson ; Julien Grall 
> ; Stefano
> Stabellini ; Jun Nakajima ; 
> Kevin Tian
> ; Julien Grall ; 
> xen-devel@lists.xenproject.org
> Subject: Re: [PATCH V2 12/23] xen/ioreq: Remove "hvm" prefixes from involved 
> function names
> 
> 
> On 23.11.20 16:56, Jan Beulich wrote:
> 
> Hi Jan, Paul
> 
> > On 23.11.2020 15:39, Oleksandr wrote:
> >> As it was agreed, below the list of proposed renaming (naming) within
> >> current series.
> > Thanks for compiling this. A couple of suggestions for consideration:
> >
> >> 1. Global (existing):
> >> hvm_map_mem_type_to_ioreq_server     -> ioreq_server_map_mem_type
> >> hvm_select_ioreq_server              -> ioreq_server_select
> >> hvm_send_ioreq                       -> ioreq_send
> >> hvm_ioreq_init                       -> ioreq_init
> > ioreq_domain_init() (or, imo less desirable domain_ioreq_init())?
> On Arm (for example) I see two variants are present:
> 1. That starts with subsystem:
> - tee_domain_init
> - iommu_domain_init
> 
> 
> 2. Where subsystem in the middle:
> - domain_io_init
> - domain_vuart_init
> - domain_vtimer_init
> 
> If there is no rule, but a matter of taste then I would use
> ioreq_domain_init(),
> so arch_ioreq_init() wants to be arch_ioreq_domain_init().
> 
> >
> >> hvm_destroy_all_ioreq_servers        -> ioreq_server_destroy_all
> >> hvm_all_ioreq_servers_add_vcpu       -> ioreq_server_add_vcpu_all
> >> hvm_all_ioreq_servers_remove_vcpu    -> ioreq_server_remove_vcpu_all
> >> hvm_broadcast_ioreq                  -> ioreq_broadcast
> >> hvm_create_ioreq_server              -> ioreq_server_create
> >> hvm_get_ioreq_server_info            -> ioreq_server_get_info
> >> hvm_map_io_range_to_ioreq_server     -> ioreq_server_map_io_range
> >> hvm_unmap_io_range_from_ioreq_server -> ioreq_server_unmap_io_range
> >> hvm_set_ioreq_server_state           -> ioreq_server_set_state
> >> hvm_destroy_ioreq_server             -> ioreq_server_destroy
> >> hvm_get_ioreq_server_frame           -> ioreq_server_get_frame
> >> hvm_ioreq_needs_completion           -> ioreq_needs_completion
> >> hvm_mmio_first_byte                  -> ioreq_mmio_first_byte
> >> hvm_mmio_last_byte                   -> ioreq_mmio_last_byte
> >> send_invalidate_req                  -> ioreq_signal_mapcache_invalidate
> >>
> >> handle_hvm_io_completion             -> handle_io_completion
> > For this one I'm not sure what to suggest, but I'm not overly happy
> > with the name.
> 
> I also failed to find a better name. Probably ioreq_ or vcpu_ioreq_
> prefix wants to be added here?
> 
> 
> >
> >> hvm_io_pending   -> io_pending
> > vcpu_ioreq_pending() or vcpu_any_ioreq_pending()?
> 
> I am fine with vcpu_ioreq_pending()
> 

...in which case vcpu_ioreq_handle_completion() seems like a reasonable choice.

> 
> >
> >> 2. Global (new):
> >> arch_io_completion
> >> arch_ioreq_server_map_pages
> >> arch_ioreq_server_unmap_pages
> >> arch_ioreq_server_enable
> >> arch_ioreq_server_disable
> >> arch_ioreq_server_destroy
> >> arch_ioreq_server_map_mem_type
> >> arch_ioreq_server_destroy_all
> >> arch_ioreq_server_get_type_addr
> >> arch_ioreq_init
> > Assuming this is the arch hook of the similarly named function
> > further up, a similar adjustment may then be wanted here.
> 
> Yes.
> 
> 
> >
> >> domain_has_ioreq_server
> >>
> >>
> >> 3. Local (existing) in common ioreq.c:
> >> hvm_alloc_ioreq_mfn   -> ioreq_alloc_mfn
> >> hvm_free_ioreq_mfn-> ioreq_free_mfn
> > These two are server functions, so should imo be ioreq_server_...().
> 
> ok, but ...
> 
> 
> > However, if they're static (as they're now), no distinguishing
> > prefix is strictly necessary, i.e. alloc_mfn() and free_mfn() may
> > be fine. The two names may be too short for Paul's taste, though.
> > Some similar shortening may be possible for some or all of the ones
> 
> 
> ... In general I would be fine with any option. However, using the
> shortening rule for all
> we are going to end up with single-word function names (enable, init, etc).
> So I would prefer to leave locals as is (but dropping hvm prefixes of
> course and
> clarify ioreq_server_alloc_mfn/ioreq_server_free_mfn).
> 
> Paul, Jan what do you think?

I prefer ioreq_server_alloc_mfn/ioreq_server_free_mfn. The problem with 
shortening is that function names become ambiguous within the source base and 
hence harder to find.

  Paul




Re: [RFC] MAINTAINERS tag for cleanup robot

2020-11-23 Thread Jani Nikula
On Sat, 21 Nov 2020, James Bottomley wrote:
> On Sat, 2020-11-21 at 08:50 -0800, t...@redhat.com wrote:
>> A difficult part of automating commits is composing the subsystem
>> preamble in the commit log.  For the ongoing effort of a fixer
>> producing
>> one or two fixes a release the use of 'treewide:' does not seem
>> appropriate.
>> 
>> It would be better if the normal prefix was used.  Unfortunately
>> normal is
>> not consistent across the tree.
>> 
>> 
>>  D: Commit subsystem prefix
>> 
>> ex/ for FPGA DFL DRIVERS
>> 
>>  D: fpga: dfl:
>> 
>
> I've got to bet this is going to cause more issues than it solves.

Agreed.

> SCSI uses scsi: : for drivers but not every driver has a
> MAINTAINERS entry.  We use either scsi: or scsi: core: for mid layer
> things, but we're not consistent.  Block uses blk-: for all
> of its stuff but almost no s have a MAINTAINERS entry.  So
> the next thing you're going to cause is an explosion of suggested
> MAINTAINERs entries.

On the one hand, adoption of new MAINTAINERS entries has been really
slow. Look at B, C, or P, for instance. On the other hand, if this were
to get adopted, you'll potentially get conflicting prefixes for patches
touching multiple files. Then what?

I'm guessing a script looking at git log could come up with better
suggestions for prefixes via popularity contest than manually maintained
MAINTAINERS entries. It might not always get it right, but then human
outsiders aren't going to always get it right either.

Now you'll only need Someone(tm) to write the script. ;)

Something quick like this:

git log --since={1year} --pretty=format:%s --  |\
grep -v "^\(Merge\|Revert\)" |\
sed 's/:[^:]*$//' |\
sort | uniq -c | sort -rn | head -5

already gives me results that really aren't worse than some of the
prefixes invented by drive-by contributors.
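
To make the sed step above concrete: 's/:[^:]*$//' drops everything from
the last ':' to the end of the subject, leaving the prefix that then gets
counted. A minimal C sketch of that one step follows; subject_prefix() is a
hypothetical helper name used only for illustration, not part of any
existing tool.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy the prefix of a commit subject into out: everything before the
 * last ':' (what sed 's/:[^:]*$//' keeps). A subject without any ':' is
 * copied unchanged. The result is truncated to fit outsz and is always
 * NUL-terminated when outsz is non-zero.
 */
void subject_prefix(const char *subject, char *out, size_t outsz)
{
    const char *colon = strrchr(subject, ':');
    size_t len = colon ? (size_t)(colon - subject) : strlen(subject);

    if (outsz == 0)
        return;
    if (len >= outsz)
        len = outsz - 1;
    memcpy(out, subject, len);
    out[len] = '\0';
}
```

For "fpga: dfl: add bar" this yields "fpga: dfl", so multi-level prefixes
survive intact. A subject whose summary itself contains a ':' would keep
too much, which is one reason the popularity count can occasionally be off.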

> Has anyone actually complained about treewide:?

As Joe said, I'd feel silly applying patches to drivers with that
prefix. If it gets applied by someone else higher up, literally
treewide, then no complaints.

BR,
Jani.


-- 
Jani Nikula, Intel Open Source Graphics Center



Re: [PATCH V2 12/23] xen/ioreq: Remove "hvm" prefixes from involved function names

2020-11-23 Thread Oleksandr



On 23.11.20 16:56, Jan Beulich wrote:

Hi Jan, Paul

> On 23.11.2020 15:39, Oleksandr wrote:
>> As it was agreed, below the list of proposed renaming (naming) within
>> current series.
> Thanks for compiling this. A couple of suggestions for consideration:
>
>> 1. Global (existing):
>> hvm_map_mem_type_to_ioreq_server     -> ioreq_server_map_mem_type
>> hvm_select_ioreq_server              -> ioreq_server_select
>> hvm_send_ioreq                       -> ioreq_send
>> hvm_ioreq_init                       -> ioreq_init
> ioreq_domain_init() (or, imo less desirable domain_ioreq_init())?

On Arm (for example) I see two variants are present:
1. That starts with subsystem:
- tee_domain_init
- iommu_domain_init

2. Where subsystem in the middle:
- domain_io_init
- domain_vuart_init
- domain_vtimer_init

If there is no rule, but a matter of taste then I would use
ioreq_domain_init(),
so arch_ioreq_init() wants to be arch_ioreq_domain_init().

>> hvm_destroy_all_ioreq_servers        -> ioreq_server_destroy_all
>> hvm_all_ioreq_servers_add_vcpu       -> ioreq_server_add_vcpu_all
>> hvm_all_ioreq_servers_remove_vcpu    -> ioreq_server_remove_vcpu_all
>> hvm_broadcast_ioreq                  -> ioreq_broadcast
>> hvm_create_ioreq_server              -> ioreq_server_create
>> hvm_get_ioreq_server_info            -> ioreq_server_get_info
>> hvm_map_io_range_to_ioreq_server     -> ioreq_server_map_io_range
>> hvm_unmap_io_range_from_ioreq_server -> ioreq_server_unmap_io_range
>> hvm_set_ioreq_server_state           -> ioreq_server_set_state
>> hvm_destroy_ioreq_server             -> ioreq_server_destroy
>> hvm_get_ioreq_server_frame           -> ioreq_server_get_frame
>> hvm_ioreq_needs_completion           -> ioreq_needs_completion
>> hvm_mmio_first_byte                  -> ioreq_mmio_first_byte
>> hvm_mmio_last_byte                   -> ioreq_mmio_last_byte
>> send_invalidate_req                  -> ioreq_signal_mapcache_invalidate
>>
>> handle_hvm_io_completion             -> handle_io_completion
> For this one I'm not sure what to suggest, but I'm not overly happy
> with the name.

I also failed to find a better name. Probably ioreq_ or vcpu_ioreq_
prefix wants to be added here?

>> hvm_io_pending                       -> io_pending
> vcpu_ioreq_pending() or vcpu_any_ioreq_pending()?

I am fine with vcpu_ioreq_pending()

>> 2. Global (new):
>> arch_io_completion
>> arch_ioreq_server_map_pages
>> arch_ioreq_server_unmap_pages
>> arch_ioreq_server_enable
>> arch_ioreq_server_disable
>> arch_ioreq_server_destroy
>> arch_ioreq_server_map_mem_type
>> arch_ioreq_server_destroy_all
>> arch_ioreq_server_get_type_addr
>> arch_ioreq_init
> Assuming this is the arch hook of the similarly named function
> further up, a similar adjustment may then be wanted here.

Yes.

>> domain_has_ioreq_server
>>
>>
>> 3. Local (existing) in common ioreq.c:
>> hvm_alloc_ioreq_mfn                  -> ioreq_alloc_mfn
>> hvm_free_ioreq_mfn                   -> ioreq_free_mfn
> These two are server functions, so should imo be ioreq_server_...().

ok, but ...

> However, if they're static (as they're now), no distinguishing
> prefix is strictly necessary, i.e. alloc_mfn() and free_mfn() may
> be fine. The two names may be too short for Paul's taste, though.
> Some similar shortening may be possible for some or all of the ones

... In general I would be fine with any option. However, using the
shortening rule for all
we are going to end up with single-word function names (enable, init, etc).
So I would prefer to leave locals as is (but dropping hvm prefixes of
course and
clarify ioreq_server_alloc_mfn/ioreq_server_free_mfn).

Paul, Jan what do you think?

> below here.
>
> Jan
>
>> hvm_update_ioreq_evtchn              -> ioreq_update_evtchn
>> hvm_ioreq_server_add_vcpu            -> ioreq_server_add_vcpu
>> hvm_ioreq_server_remove_vcpu         -> ioreq_server_remove_vcpu
>> hvm_ioreq_server_remove_all_vcpus    -> ioreq_server_remove_all_vcpus
>> hvm_ioreq_server_alloc_pages         -> ioreq_server_alloc_pages
>> hvm_ioreq_server_free_pages          -> ioreq_server_free_pages
>> hvm_ioreq_server_free_rangesets      -> ioreq_server_free_rangesets
>> hvm_ioreq_server_alloc_rangesets     -> ioreq_server_alloc_rangesets
>> hvm_ioreq_server_enable              -> ioreq_server_enable
>> hvm_ioreq_server_disable             -> ioreq_server_disable
>> hvm_ioreq_server_init                -> ioreq_server_init
>> hvm_ioreq_server_deinit              -> ioreq_server_deinit
>> hvm_send_buffered_ioreq              -> ioreq_send_buffered
>>
>> hvm_wait_for_io                      -> wait_for_io
>>
>> 4. Local (existing) in x86 ioreq.c:
>> Everything related to legacy interface (hvm_alloc_legacy_ioreq_gfn, etc)
>> are going
>> to remain as is.




--
Regards,

Oleksandr Tyshchenko




Re: [PATCH 4/4] x86/ACPI: don't invalidate S5 data when S3 wakeup vector cannot be determined

2020-11-23 Thread Roger Pau Monné
On Mon, Nov 23, 2020 at 01:41:06PM +0100, Jan Beulich wrote:
> We can be more tolerant as long as the data collected from FACS is only
> needed to enter S3. A prior change already added suitable checking to
> acpi_enter_sleep().
> 
> Signed-off-by: Jan Beulich 

Acked-by: Roger Pau Monné 

Thanks, Roger.



Re: [PATCH 3/4] x86/DMI: fix table mapping when one lives above 1Mb

2020-11-23 Thread Roger Pau Monné
On Mon, Nov 23, 2020 at 01:40:30PM +0100, Jan Beulich wrote:
> Use of __acpi_map_table() is kind of an abuse here, and doesn't work
> anymore for the majority of cases if any of the tables lives outside the
> low first Mb. Keep this (ab)use only prior to reaching SYS_STATE_boot,
> primarily to avoid needing to audit whether any of the calls here can
> happen this early in the first place; quite likely this isn't necessary
> at all - at least dmi_scan_machine() gets called late enough.
> 
> For the "normal" case, call __vmap() directly, despite effectively
> duplicating acpi_os_map_memory(). There's one difference though: We
> shouldn't need to establish UC- mappings, WP or r/o WB mappings ought to
> be fine, as the tables are going to live in either RAM or ROM. Short of
> having PAGE_HYPERVISOR_WP and wanting to map the tables r/o anyway, use
> the latter of the two options. The r/o mapping implies some
> constification of code elsewhere in the file. For code touched anyway
> also switch to void (where possible) or uint8_t.
> 
> Fixes: 1c4aa69ca1e1 ("xen/acpi: Rework acpi_os_map_memory() and 
> acpi_os_unmap_memory()")
> Signed-off-by: Jan Beulich 

Acked-by: Roger Pau Monné 

Thanks, Roger.



Re: [PATCH 2/4] x86/ACPI: fix S3 wakeup vector mapping

2020-11-23 Thread Jan Beulich
On 23.11.2020 16:24, Roger Pau Monné wrote:
> On Mon, Nov 23, 2020 at 01:40:12PM +0100, Jan Beulich wrote:
>> --- a/xen/arch/x86/acpi/power.c
>> +++ b/xen/arch/x86/acpi/power.c
>> @@ -174,17 +174,20 @@ static void acpi_sleep_prepare(u32 state
>>  if ( state != ACPI_STATE_S3 )
>>  return;
>>  
>> -wakeup_vector_va = __acpi_map_table(
>> -acpi_sinfo.wakeup_vector, sizeof(uint64_t));
>> -
>>  /* TBoot will set resume vector itself (when it is safe to do so). */
>>  if ( tboot_in_measured_env() )
>>  return;
>>  
>> +set_fixmap(FIX_ACPI_END, acpi_sinfo.wakeup_vector);
>> +wakeup_vector_va = fix_to_virt(FIX_ACPI_END) +
>> +   PAGE_OFFSET(acpi_sinfo.wakeup_vector);
>> +
>>  if ( acpi_sinfo.vector_width == 32 )
>>  *(uint32_t *)wakeup_vector_va = bootsym_phys(wakeup_start);
>>  else
>>  *(uint64_t *)wakeup_vector_va = bootsym_phys(wakeup_start);
>> +
>> +clear_fixmap(FIX_ACPI_END);
> 
> Why not use vmap here instead of the fixmap?

Considering the S3 path is relatively fragile (as in: we end up
breaking it more often than about anything else) I wanted to
make as little of a change as possible. Hence I decided to stick
to the fixmap use that was (indirectly) used before as well.

Jan



Re: [PATCH 2/4] x86/ACPI: fix S3 wakeup vector mapping

2020-11-23 Thread Roger Pau Monné
On Mon, Nov 23, 2020 at 01:40:12PM +0100, Jan Beulich wrote:
> Use of __acpi_map_table() here was at least close to an abuse already
> before, but it will now consistently return NULL here. Drop the layering
> violation and use set_fixmap() directly. Re-use of the ACPI fixmap area
> is hopefully going to remain "fine" for the time being.
> 
> Add checks to acpi_enter_sleep(): The vector now needs to be contained
> within a single page, but the ACPI spec requires 64-byte alignment of
> FACS anyway. Also bail if no wakeup vector was determined in the first
> place, in part as preparation for a subsequent relaxation change.
> 
> Fixes: 1c4aa69ca1e1 ("xen/acpi: Rework acpi_os_map_memory() and 
> acpi_os_unmap_memory()")
> Signed-off-by: Jan Beulich 
> 
> --- a/xen/arch/x86/acpi/boot.c
> +++ b/xen/arch/x86/acpi/boot.c
> @@ -443,6 +443,11 @@ acpi_fadt_parse_sleep_info(struct acpi_t
>   "FACS is shorter than ACPI spec allow: %#x",
>   facs->length);
>  
> + if (facs_pa % 64)
> + printk(KERN_WARNING PREFIX
> + "FACS is not 64-byte aligned: %#lx",
> + facs_pa);
> +
>   acpi_sinfo.wakeup_vector = facs_pa + 
>   offsetof(struct acpi_table_facs, firmware_waking_vector);
>   acpi_sinfo.vector_width = 32;
> --- a/xen/arch/x86/acpi/power.c
> +++ b/xen/arch/x86/acpi/power.c
> @@ -174,17 +174,20 @@ static void acpi_sleep_prepare(u32 state
>  if ( state != ACPI_STATE_S3 )
>  return;
>  
> -wakeup_vector_va = __acpi_map_table(
> -acpi_sinfo.wakeup_vector, sizeof(uint64_t));
> -
>  /* TBoot will set resume vector itself (when it is safe to do so). */
>  if ( tboot_in_measured_env() )
>  return;
>  
> +set_fixmap(FIX_ACPI_END, acpi_sinfo.wakeup_vector);
> +wakeup_vector_va = fix_to_virt(FIX_ACPI_END) +
> +   PAGE_OFFSET(acpi_sinfo.wakeup_vector);
> +
>  if ( acpi_sinfo.vector_width == 32 )
>  *(uint32_t *)wakeup_vector_va = bootsym_phys(wakeup_start);
>  else
>  *(uint64_t *)wakeup_vector_va = bootsym_phys(wakeup_start);
> +
> +clear_fixmap(FIX_ACPI_END);

Why not use vmap here instead of the fixmap?

Thanks, Roger.
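
The quoted commit message requires the wakeup vector to sit within a
single page, because only one fixmap slot gets mapped. A minimal sketch of
such a check follows, assuming 4 KiB pages; within_one_page() and
SKETCH_PAGE_SIZE are hypothetical names for illustration and are not taken
from the patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u   /* assumption: 4 KiB pages */

/*
 * Return true if the size bytes starting at physical address pa all lie
 * within the same page, i.e. a single-page mapping covers the whole field.
 */
bool within_one_page(uint64_t pa, size_t size)
{
    if (size == 0 || size > SKETCH_PAGE_SIZE)
        return false;
    return (pa / SKETCH_PAGE_SIZE) == ((pa + size - 1) / SKETCH_PAGE_SIZE);
}
```

Since the page size is a multiple of 64, any field lying wholly inside the
first 64 bytes of a 64-byte aligned table can never straddle a page
boundary, which is why the alignment warning and this kind of check go
together.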



[PATCH v3 8/8] lib: move sort code

2020-11-23 Thread Jan Beulich
Build this code into an archive, partly paralleling bsearch().

Signed-off-by: Jan Beulich 
Acked-by: Julien Grall 
---
 xen/common/Makefile| 1 -
 xen/lib/Makefile   | 1 +
 xen/{common => lib}/sort.c | 0
 3 files changed, 1 insertion(+), 1 deletion(-)
 rename xen/{common => lib}/sort.c (100%)

diff --git a/xen/common/Makefile b/xen/common/Makefile
index e8ce23acea67..7a4e652b575e 100644
--- a/xen/common/Makefile
+++ b/xen/common/Makefile
@@ -36,7 +36,6 @@ obj-y += rcupdate.o
 obj-y += rwlock.o
 obj-y += shutdown.o
 obj-y += softirq.o
-obj-y += sort.o
 obj-y += smp.o
 obj-y += spinlock.o
 obj-y += stop_machine.o
diff --git a/xen/lib/Makefile b/xen/lib/Makefile
index f12dab7a737a..42cf7a1164ef 100644
--- a/xen/lib/Makefile
+++ b/xen/lib/Makefile
@@ -6,3 +6,4 @@ lib-y += ctype.o
 lib-y += list-sort.o
 lib-y += parse-size.o
 lib-y += rbtree.o
+lib-y += sort.o
diff --git a/xen/common/sort.c b/xen/lib/sort.c
similarity index 100%
rename from xen/common/sort.c
rename to xen/lib/sort.c




[PATCH v3 7/8] lib: move bsearch code

2020-11-23 Thread Jan Beulich
Convert this code to an inline function (backed by an instance in an
archive in case the compiler decides against inlining), which results
in not having it in x86 final binaries. This saves a little bit of dead
code.

Signed-off-by: Jan Beulich 
---
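Note on the bsearch() comment carried over by this patch: it says the key
need not have the same type as the array elements. A minimal userspace
sketch of that case follows, using the C standard library's bsearch() as a
stand-in (it has the same parameter order); struct unit, units[], cmp_name()
and find_unit() are made-up names for illustration only.

```c
#include <stdlib.h>
#include <string.h>

struct unit {
    const char *name;    /* the table is sorted by this field */
    unsigned int shift;
};

/* Sorted in ascending strcmp() order of name, as bsearch() requires. */
static const struct unit units[] = {
    { "G", 30 },
    { "K", 10 },
    { "M", 20 },
    { "T", 40 },
};

/* key is a plain string, elt is a struct unit: the two types differ. */
static int cmp_name(const void *key, const void *elt)
{
    return strcmp(key, ((const struct unit *)elt)->name);
}

/* Return the matching table entry, or NULL if name is not present. */
const struct unit *find_unit(const char *name)
{
    return bsearch(name, units, sizeof(units) / sizeof(units[0]),
                   sizeof(units[0]), cmp_name);
}
```

Because the comparison function always receives the key first and the
element second, it can safely treat the two arguments as different types.
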
 xen/common/Makefile|  1 -
 xen/common/bsearch.c   | 51 --
 xen/include/xen/compiler.h |  1 +
 xen/include/xen/lib.h  | 42 ++-
 xen/lib/Makefile   |  1 +
 xen/lib/bsearch.c  | 13 ++
 6 files changed, 56 insertions(+), 53 deletions(-)
 delete mode 100644 xen/common/bsearch.c
 create mode 100644 xen/lib/bsearch.c

diff --git a/xen/common/Makefile b/xen/common/Makefile
index d65c9fe9cb4e..e8ce23acea67 100644
--- a/xen/common/Makefile
+++ b/xen/common/Makefile
@@ -1,6 +1,5 @@
 obj-$(CONFIG_ARGO) += argo.o
 obj-y += bitmap.o
-obj-y += bsearch.o
 obj-$(CONFIG_HYPFS_CONFIG) += config_data.o
 obj-$(CONFIG_CORE_PARKING) += core_parking.o
 obj-y += cpu.o
diff --git a/xen/common/bsearch.c b/xen/common/bsearch.c
deleted file mode 100644
index 7090930aab5c..
--- a/xen/common/bsearch.c
+++ /dev/null
@@ -1,51 +0,0 @@
-/*
- * A generic implementation of binary search for the Linux kernel
- *
- * Copyright (C) 2008-2009 Ksplice, Inc.
- * Author: Tim Abbott 
- *
- * This program is free software; you can redistribute it and/or
- * modify it under the terms of the GNU General Public License as
- * published by the Free Software Foundation; version 2.
- */
-
-#include 
-
-/*
- * bsearch - binary search an array of elements
- * @key: pointer to item being searched for
- * @base: pointer to first element to search
- * @num: number of elements
- * @size: size of each element
- * @cmp: pointer to comparison function
- *
- * This function does a binary search on the given array.  The
- * contents of the array should already be in ascending sorted order
- * under the provided comparison function.
- *
- * Note that the key need not have the same type as the elements in
- * the array, e.g. key could be a string and the comparison function
- * could compare the string with the struct's name field.  However, if
- * the key and elements in the array are of the same type, you can use
- * the same comparison function for both sort() and bsearch().
- */
-void *bsearch(const void *key, const void *base, size_t num, size_t size,
- int (*cmp)(const void *key, const void *elt))
-{
-   size_t start = 0, end = num;
-   int result;
-
-   while (start < end) {
-   size_t mid = start + (end - start) / 2;
-
-   result = cmp(key, base + mid * size);
-   if (result < 0)
-   end = mid;
-   else if (result > 0)
-   start = mid + 1;
-   else
-   return (void *)base + mid * size;
-   }
-
-   return NULL;
-}
diff --git a/xen/include/xen/compiler.h b/xen/include/xen/compiler.h
index c0e0ee9f27be..2b7acdf3b188 100644
--- a/xen/include/xen/compiler.h
+++ b/xen/include/xen/compiler.h
@@ -12,6 +12,7 @@
 
 #define inline        __inline__
 #define always_inline __inline__ __attribute__ ((__always_inline__))
+#define gnu_inline    __inline__ __attribute__ ((__gnu_inline__))
 #define noinline  __attribute__((__noinline__))
 
 #define noreturn  __attribute__((__noreturn__))
diff --git a/xen/include/xen/lib.h b/xen/include/xen/lib.h
index a9679c913d5c..48429b69b8df 100644
--- a/xen/include/xen/lib.h
+++ b/xen/include/xen/lib.h
@@ -204,8 +204,48 @@ void dump_execstate(struct cpu_user_regs *);
 
 void init_constructors(void);
 
+/*
+ * bsearch - binary search an array of elements
+ * @key: pointer to item being searched for
+ * @base: pointer to first element to search
+ * @num: number of elements
+ * @size: size of each element
+ * @cmp: pointer to comparison function
+ *
+ * This function does a binary search on the given array.  The
+ * contents of the array should already be in ascending sorted order
+ * under the provided comparison function.
+ *
+ * Note that the key need not have the same type as the elements in
+ * the array, e.g. key could be a string and the comparison function
+ * could compare the string with the struct's name field.  However, if
+ * the key and elements in the array are of the same type, you can use
+ * the same comparison function for both sort() and bsearch().
+ */
+#ifndef BSEARCH_IMPLEMENTATION
+extern gnu_inline
+#endif
 void *bsearch(const void *key, const void *base, size_t num, size_t size,
-  int (*cmp)(const void *key, const void *elt));
+  int (*cmp)(const void *key, const void *elt))
+{
+size_t start = 0, end = num;
+int result;
+
+while ( start < end )
+{
+size_t mid = start + (end - start) / 2;
+
+result = cmp(key, base + mid * size);
+if ( result < 0 )
+end = mid;
+else if ( result > 0 )
+start = mid + 1;
+   

[PATCH v3 6/8] lib: move rbtree code

2020-11-23 Thread Jan Beulich
Build this code into an archive, which results in not linking it into
x86 final binaries. This saves about 1.5k of dead code.

While moving the source file, take the opportunity and drop the
pointless EXPORT_SYMBOL() and an instance of trailing whitespace.

Signed-off-by: Jan Beulich 
---
 xen/common/Makefile  | 1 -
 xen/lib/Makefile | 1 +
 xen/{common => lib}/rbtree.c | 9 +
 3 files changed, 2 insertions(+), 9 deletions(-)
 rename xen/{common => lib}/rbtree.c (98%)

diff --git a/xen/common/Makefile b/xen/common/Makefile
index 332e7d667cec..d65c9fe9cb4e 100644
--- a/xen/common/Makefile
+++ b/xen/common/Makefile
@@ -33,7 +33,6 @@ obj-y += preempt.o
 obj-y += random.o
 obj-y += rangeset.o
 obj-y += radix-tree.o
-obj-y += rbtree.o
 obj-y += rcupdate.o
 obj-y += rwlock.o
 obj-y += shutdown.o
diff --git a/xen/lib/Makefile b/xen/lib/Makefile
index 72c72fffecf2..b0fe8c72acf5 100644
--- a/xen/lib/Makefile
+++ b/xen/lib/Makefile
@@ -4,3 +4,4 @@ lib-y += ctors.o
 lib-y += ctype.o
 lib-y += list-sort.o
 lib-y += parse-size.o
+lib-y += rbtree.o
diff --git a/xen/common/rbtree.c b/xen/lib/rbtree.c
similarity index 98%
rename from xen/common/rbtree.c
rename to xen/lib/rbtree.c
index 9f5498a89d4e..95e045d52461 100644
--- a/xen/common/rbtree.c
+++ b/xen/lib/rbtree.c
@@ -25,7 +25,7 @@
 #include 
 
 /*
- * red-black trees properties:  http://en.wikipedia.org/wiki/Rbtree 
+ * red-black trees properties:  http://en.wikipedia.org/wiki/Rbtree
  *
  *  1) A node is either red or black
  *  2) The root is black
@@ -223,7 +223,6 @@ void rb_insert_color(struct rb_node *node, struct rb_root 
*root)
}
}
 }
-EXPORT_SYMBOL(rb_insert_color);
 
 static void __rb_erase_color(struct rb_node *parent, struct rb_root *root)
 {
@@ -467,7 +466,6 @@ void rb_erase(struct rb_node *node, struct rb_root *root)
if (rebalance)
__rb_erase_color(rebalance, root);
 }
-EXPORT_SYMBOL(rb_erase);
 
 /*
  * This function returns the first node (in sort order) of the tree.
@@ -483,7 +481,6 @@ struct rb_node *rb_first(const struct rb_root *root)
n = n->rb_left;
return n;
 }
-EXPORT_SYMBOL(rb_first);
 
 struct rb_node *rb_last(const struct rb_root *root)
 {
@@ -496,7 +493,6 @@ struct rb_node *rb_last(const struct rb_root *root)
n = n->rb_right;
return n;
 }
-EXPORT_SYMBOL(rb_last);
 
 struct rb_node *rb_next(const struct rb_node *node)
 {
@@ -528,7 +524,6 @@ struct rb_node *rb_next(const struct rb_node *node)
 
return parent;
 }
-EXPORT_SYMBOL(rb_next);
 
 struct rb_node *rb_prev(const struct rb_node *node)
 {
@@ -557,7 +552,6 @@ struct rb_node *rb_prev(const struct rb_node *node)
 
return parent;
 }
-EXPORT_SYMBOL(rb_prev);
 
 void rb_replace_node(struct rb_node *victim, struct rb_node *new,
 struct rb_root *root)
@@ -574,4 +568,3 @@ void rb_replace_node(struct rb_node *victim, struct rb_node 
*new,
/* Copy the pointers/colour from the victim to the replacement */
*new = *victim;
 }
-EXPORT_SYMBOL(rb_replace_node);




[PATCH v3 5/8] lib: move init_constructors()

2020-11-23 Thread Jan Beulich
... into its own CU, for being unrelated to other things in
common/lib.c.

Signed-off-by: Jan Beulich 
---
 xen/common/lib.c | 14 --
 xen/lib/Makefile |  1 +
 xen/lib/ctors.c  | 25 +
 3 files changed, 26 insertions(+), 14 deletions(-)
 create mode 100644 xen/lib/ctors.c

diff --git a/xen/common/lib.c b/xen/common/lib.c
index 6cfa332142a5..f5ca179a0af4 100644
--- a/xen/common/lib.c
+++ b/xen/common/lib.c
@@ -1,6 +1,5 @@
 #include 
 #include 
-#include 
 #include 
 
 /*
@@ -423,19 +422,6 @@ uint64_t muldiv64(uint64_t a, uint32_t b, uint32_t c)
 #endif
 }
 
-typedef void (*ctor_func_t)(void);
-extern const ctor_func_t __ctors_start[], __ctors_end[];
-
-void __init init_constructors(void)
-{
-const ctor_func_t *f;
-for ( f = __ctors_start; f < __ctors_end; ++f )
-(*f)();
-
-/* Putting this here seems as good (or bad) as any other place. */
-BUILD_BUG_ON(sizeof(size_t) != sizeof(ssize_t));
-}
-
 /*
  * Local variables:
  * mode: C
diff --git a/xen/lib/Makefile b/xen/lib/Makefile
index 99f857540c99..72c72fffecf2 100644
--- a/xen/lib/Makefile
+++ b/xen/lib/Makefile
@@ -1,5 +1,6 @@
 obj-$(CONFIG_X86) += x86/
 
+lib-y += ctors.o
 lib-y += ctype.o
 lib-y += list-sort.o
 lib-y += parse-size.o
diff --git a/xen/lib/ctors.c b/xen/lib/ctors.c
new file mode 100644
index ..5bdc591cd50a
--- /dev/null
+++ b/xen/lib/ctors.c
@@ -0,0 +1,25 @@
+#include 
+#include 
+
+typedef void (*ctor_func_t)(void);
+extern const ctor_func_t __ctors_start[], __ctors_end[];
+
+void __init init_constructors(void)
+{
+const ctor_func_t *f;
+for ( f = __ctors_start; f < __ctors_end; ++f )
+(*f)();
+
+/* Putting this here seems as good (or bad) as any other place. */
+BUILD_BUG_ON(sizeof(size_t) != sizeof(ssize_t));
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
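
For readers unfamiliar with the pattern moved by this patch:
init_constructors() walks a linker-provided range of function pointers and
calls each one. A minimal userspace sketch follows, where an ordinary array
stands in for the __ctors_start[] .. __ctors_end[] range; ctor_a(), ctor_b(),
run_ctors() and run_demo() are hypothetical names for illustration.

```c
typedef void (*ctor_func_t)(void);

static int calls;                       /* records which ctors have run */
static void ctor_a(void) { calls += 1; }
static void ctor_b(void) { calls += 10; }

/* Stand-in for the linker-provided constructor table. */
static const ctor_func_t ctors[] = { ctor_a, ctor_b };

/* Walk [start, end) and call each entry in order. */
void run_ctors(const ctor_func_t *start, const ctor_func_t *end)
{
    const ctor_func_t *f;

    for ( f = start; f < end; ++f )
        (*f)();
}

/* Run every entry of the stand-in table once and report the tally. */
int run_demo(void)
{
    calls = 0;
    run_ctors(ctors, ctors + sizeof(ctors) / sizeof(ctors[0]));
    return calls;
}
```

In the real code the start and end symbols come from the linker script, so
the table grows automatically as more constructors get linked in.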




[PATCH v3 4/8] lib: move parse_size_and_unit()

2020-11-23 Thread Jan Beulich
... into its own CU, to build it into an archive.

Signed-off-by: Jan Beulich 
Acked-by: Julien Grall 
---
 xen/common/lib.c | 39 --
 xen/lib/Makefile |  1 +
 xen/lib/parse-size.c | 50 
 3 files changed, 51 insertions(+), 39 deletions(-)
 create mode 100644 xen/lib/parse-size.c

diff --git a/xen/common/lib.c b/xen/common/lib.c
index a224efa8f6e8..6cfa332142a5 100644
--- a/xen/common/lib.c
+++ b/xen/common/lib.c
@@ -423,45 +423,6 @@ uint64_t muldiv64(uint64_t a, uint32_t b, uint32_t c)
 #endif
 }
 
-unsigned long long parse_size_and_unit(const char *s, const char **ps)
-{
-unsigned long long ret;
-const char *s1;
-
-ret = simple_strtoull(s, &s1, 0);
-
-switch ( *s1 )
-{
-case 'T': case 't':
-ret <<= 10;
-/* fallthrough */
-case 'G': case 'g':
-ret <<= 10;
-/* fallthrough */
-case 'M': case 'm':
-ret <<= 10;
-/* fallthrough */
-case 'K': case 'k':
-ret <<= 10;
-/* fallthrough */
-case 'B': case 'b':
-s1++;
-break;
-case '%':
-if ( ps )
-break;
-/* fallthrough */
-default:
-ret <<= 10; /* default to kB */
-break;
-}
-
-if ( ps != NULL )
-*ps = s1;
-
-return ret;
-}
-
 typedef void (*ctor_func_t)(void);
 extern const ctor_func_t __ctors_start[], __ctors_end[];
 
diff --git a/xen/lib/Makefile b/xen/lib/Makefile
index 764f3624b5f9..99f857540c99 100644
--- a/xen/lib/Makefile
+++ b/xen/lib/Makefile
@@ -2,3 +2,4 @@ obj-$(CONFIG_X86) += x86/
 
 lib-y += ctype.o
 lib-y += list-sort.o
+lib-y += parse-size.o
diff --git a/xen/lib/parse-size.c b/xen/lib/parse-size.c
new file mode 100644
index ..ec980cadfff3
--- /dev/null
+++ b/xen/lib/parse-size.c
@@ -0,0 +1,50 @@
+#include 
+
+unsigned long long parse_size_and_unit(const char *s, const char **ps)
+{
+unsigned long long ret;
+const char *s1;
+
+ret = simple_strtoull(s, &s1, 0);
+
+switch ( *s1 )
+{
+case 'T': case 't':
+ret <<= 10;
+/* fallthrough */
+case 'G': case 'g':
+ret <<= 10;
+/* fallthrough */
+case 'M': case 'm':
+ret <<= 10;
+/* fallthrough */
+case 'K': case 'k':
+ret <<= 10;
+/* fallthrough */
+case 'B': case 'b':
+s1++;
+break;
+case '%':
+if ( ps )
+break;
+/* fallthrough */
+default:
+ret <<= 10; /* default to kB */
+break;
+}
+
+if ( ps != NULL )
+*ps = s1;
+
+return ret;
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
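
The cascade of fallthrough cases in parse_size_and_unit() can be hard to
read at a glance: each suffix adds one more shift by 10, and a missing
suffix defaults to kB. A minimal userspace sketch of the same logic
follows, with the standard strtoull() standing in for Xen's
simple_strtoull(); parse_size_sketch() is a made-up name for illustration.

```c
#include <stdlib.h>

/* Userspace sketch: strtoull() stands in for Xen's simple_strtoull(). */
unsigned long long parse_size_sketch(const char *s, const char **ps)
{
    char *end;
    unsigned long long ret = strtoull(s, &end, 0);
    const char *s1 = end;

    switch ( *s1 )
    {
    case 'T': case 't':
        ret <<= 10;
        /* fallthrough */
    case 'G': case 'g':
        ret <<= 10;
        /* fallthrough */
    case 'M': case 'm':
        ret <<= 10;
        /* fallthrough */
    case 'K': case 'k':
        ret <<= 10;
        /* fallthrough */
    case 'B': case 'b':
        s1++;               /* consume the unit suffix */
        break;
    case '%':
        if ( ps )
            break;          /* caller will look at the '%' itself */
        /* fallthrough */
    default:
        ret <<= 10;         /* no recognised suffix: default to kB */
        break;
    }

    if ( ps != NULL )
        *ps = s1;

    return ret;
}
```

With this sketch, "4G" yields 4 << 30, a bare "16" yields 16 << 10, and
"512B" yields 512. "50%" returns 50 with *ps left pointing at the '%' when
ps is non-NULL, and falls back to the kB default when ps is NULL.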




[PATCH v3 3/8] lib: move list sorting code

2020-11-23 Thread Jan Beulich
Build the source file always, as by putting it into an archive it still
won't be linked into final binaries when not needed. This way possible
build breakage will be easier to notice, and it's more consistent with
us unconditionally building other library kind of code (e.g. sort() or
bsearch()).

While moving the source file, take the opportunity and drop the
pointless EXPORT_SYMBOL() and an unnecessary #include.

Signed-off-by: Jan Beulich 
---
 xen/arch/arm/Kconfig| 4 +---
 xen/common/Kconfig  | 3 ---
 xen/common/Makefile | 1 -
 xen/lib/Makefile| 1 +
 xen/{common/list_sort.c => lib/list-sort.c} | 2 --
 5 files changed, 2 insertions(+), 9 deletions(-)
 rename xen/{common/list_sort.c => lib/list-sort.c} (98%)

diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
index f5b1bcda0323..38b6c31ba5dd 100644
--- a/xen/arch/arm/Kconfig
+++ b/xen/arch/arm/Kconfig
@@ -56,9 +56,7 @@ config HVM
 def_bool y
 
 config NEW_VGIC
-   bool
-   prompt "Use new VGIC implementation"
-   select NEEDS_LIST_SORT
+   bool "Use new VGIC implementation"
---help---
 
This is an alternative implementation of the ARM GIC interrupt
diff --git a/xen/common/Kconfig b/xen/common/Kconfig
index 3e2cf2508899..0661328a99e7 100644
--- a/xen/common/Kconfig
+++ b/xen/common/Kconfig
@@ -66,9 +66,6 @@ config MEM_ACCESS
 config NEEDS_LIBELF
bool
 
-config NEEDS_LIST_SORT
-   bool
-
 menu "Speculative hardening"
 
 config SPECULATIVE_HARDEN_ARRAY
diff --git a/xen/common/Makefile b/xen/common/Makefile
index d109f279a490..332e7d667cec 100644
--- a/xen/common/Makefile
+++ b/xen/common/Makefile
@@ -21,7 +21,6 @@ obj-y += keyhandler.o
 obj-$(CONFIG_KEXEC) += kexec.o
 obj-$(CONFIG_KEXEC) += kimage.o
 obj-y += lib.o
-obj-$(CONFIG_NEEDS_LIST_SORT) += list_sort.o
 obj-$(CONFIG_LIVEPATCH) += livepatch.o livepatch_elf.o
 obj-$(CONFIG_MEM_ACCESS) += mem_access.o
 obj-y += memory.o
diff --git a/xen/lib/Makefile b/xen/lib/Makefile
index b8814361d63e..764f3624b5f9 100644
--- a/xen/lib/Makefile
+++ b/xen/lib/Makefile
@@ -1,3 +1,4 @@
 obj-$(CONFIG_X86) += x86/
 
 lib-y += ctype.o
+lib-y += list-sort.o
diff --git a/xen/common/list_sort.c b/xen/lib/list-sort.c
similarity index 98%
rename from xen/common/list_sort.c
rename to xen/lib/list-sort.c
index af2b2f6519f1..f8d8bbf28178 100644
--- a/xen/common/list_sort.c
+++ b/xen/lib/list-sort.c
@@ -15,7 +15,6 @@
  * this program; If not, see .
  */
 
-#include 
 #include 
 
 #define MAX_LIST_LENGTH_BITS 20
@@ -154,4 +153,3 @@ void list_sort(void *priv, struct list_head *head,
 
merge_and_restore_back_links(priv, cmp, head, part[max_lev], list);
 }
-EXPORT_SYMBOL(list_sort);




[PATCH v3 2/8] lib: collect library files in an archive

2020-11-23 Thread Jan Beulich
In order to (subsequently) drop odd things like CONFIG_NEEDS_LIST_SORT
just to avoid bloating binaries when only some arch-es and/or
configurations need generic library routines, combine objects under lib/
into an archive, which the linker then can pick the necessary objects
out of.

Note that we can't use thin archives just yet, until we've raised the
minimum required binutils version suitably.

Signed-off-by: Jan Beulich 
---
 xen/Rules.mk  | 29 +
 xen/arch/arm/Makefile |  6 +++---
 xen/arch/x86/Makefile |  8 
 xen/lib/Makefile  |  3 ++-
 4 files changed, 34 insertions(+), 12 deletions(-)

diff --git a/xen/Rules.mk b/xen/Rules.mk
index d5e5eb33de39..aba6ca2a90f5 100644
--- a/xen/Rules.mk
+++ b/xen/Rules.mk
@@ -41,12 +41,16 @@ ALL_OBJS-y   += $(BASEDIR)/xsm/built_in.o
 ALL_OBJS-y   += $(BASEDIR)/arch/$(TARGET_ARCH)/built_in.o
 ALL_OBJS-$(CONFIG_CRYPTO)   += $(BASEDIR)/crypto/built_in.o
 
+ALL_LIBS-y   := $(BASEDIR)/lib/lib.a
+
 # Initialise some variables
+lib-y :=
 targets :=
 CFLAGS-y :=
 AFLAGS-y :=
 
 ALL_OBJS := $(ALL_OBJS-y)
+ALL_LIBS := $(ALL_LIBS-y)
 
 SPECIAL_DATA_SECTIONS := rodata $(foreach a,1 2 4 8 16, \
 $(foreach w,1 2 4, \
@@ -60,7 +64,14 @@ include Makefile
 # ---
 
 quiet_cmd_ld = LD  $@
-cmd_ld = $(LD) $(XEN_LDFLAGS) -r -o $@ $(real-prereqs)
+cmd_ld = $(LD) $(XEN_LDFLAGS) -r -o $@ $(filter-out %.a,$(real-prereqs)) \
+   --start-group $(filter %.a,$(real-prereqs)) --end-group
+
+# Archive
+# ---
+
+quiet_cmd_ar = AR  $@
+cmd_ar = rm -f $@; $(AR) cPrs $@ $(real-prereqs)
 
 # Objcopy
 # ---
@@ -86,6 +97,10 @@ obj-y:= $(patsubst %/, %/built_in.o, $(obj-y))
 # tell kbuild to descend
 subdir-obj-y := $(filter %/built_in.o, $(obj-y))
 
+# Libraries are always collected in one lib file.
+# Filter out objects already built-in
+lib-y := $(filter-out $(obj-y), $(sort $(lib-y)))
+
 $(filter %.init.o,$(obj-y) $(obj-bin-y) $(extra-y)): CFLAGS-y += 
-DINIT_SECTIONS_ONLY
 
 ifeq ($(CONFIG_COVERAGE),y)
@@ -129,7 +144,7 @@ include $(BASEDIR)/arch/$(TARGET_ARCH)/Rules.mk
 c_flags += $(CFLAGS-y)
 a_flags += $(CFLAGS-y) $(AFLAGS-y)
 
-built_in.o: $(obj-y) $(extra-y)
+built_in.o: $(obj-y) $(if $(strip $(lib-y)),lib.a) $(extra-y)
 ifeq ($(strip $(obj-y)),)
$(CC) $(c_flags) -c -x c /dev/null -o $@
 else
@@ -140,8 +155,14 @@ else
 endif
 endif
 
+lib.a: $(lib-y) FORCE
+   $(call if_changed,ar)
+
 targets += built_in.o
-targets += $(filter-out $(subdir-obj-y), $(obj-y)) $(extra-y)
+ifneq ($(strip $(lib-y)),)
+targets += lib.a
+endif
+targets += $(filter-out $(subdir-obj-y), $(obj-y) $(lib-y)) $(extra-y)
 targets += $(MAKECMDGOALS)
 
 built_in_bin.o: $(obj-bin-y) $(extra-y)
@@ -155,7 +176,7 @@ endif
 PHONY += FORCE
 FORCE:
 
-%/built_in.o: FORCE
+%/built_in.o %/lib.a: FORCE
$(MAKE) -f $(BASEDIR)/Rules.mk -C $* built_in.o
 
 %/built_in_bin.o: FORCE
diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
index 296c5e68bbc3..612a83b315c8 100644
--- a/xen/arch/arm/Makefile
+++ b/xen/arch/arm/Makefile
@@ -90,14 +90,14 @@ endif
 
 ifeq ($(CONFIG_LTO),y)
 # Gather all LTO objects together
-prelink_lto.o: $(ALL_OBJS)
-   $(LD_LTO) -r -o $@ $^
+prelink_lto.o: $(ALL_OBJS) $(ALL_LIBS)
+   $(LD_LTO) -r -o $@ $(filter-out %.a,$^) --start-group $(filter %.a,$^) --end-group
 
 # Link it with all the binary objects
 prelink.o: $(patsubst %/built_in.o,%/built_in_bin.o,$(ALL_OBJS)) prelink_lto.o
$(call if_changed,ld)
 else
-prelink.o: $(ALL_OBJS) FORCE
+prelink.o: $(ALL_OBJS) $(ALL_LIBS) FORCE
$(call if_changed,ld)
 endif
 
diff --git a/xen/arch/x86/Makefile b/xen/arch/x86/Makefile
index 9b368632fb43..8f2180485b2b 100644
--- a/xen/arch/x86/Makefile
+++ b/xen/arch/x86/Makefile
@@ -132,8 +132,8 @@ EFI_OBJS-$(XEN_BUILD_EFI) := efi/relocs-dummy.o
 
 ifeq ($(CONFIG_LTO),y)
 # Gather all LTO objects together
-prelink_lto.o: $(ALL_OBJS)
-   $(LD_LTO) -r -o $@ $^
+prelink_lto.o: $(ALL_OBJS) $(ALL_LIBS)
+   $(LD_LTO) -r -o $@ $(filter-out %.a,$^) --start-group $(filter %.a,$^) --end-group
 
 # Link it with all the binary objects
 prelink.o: $(patsubst %/built_in.o,%/built_in_bin.o,$(ALL_OBJS)) prelink_lto.o $(EFI_OBJS-y) FORCE
@@ -142,10 +142,10 @@ prelink.o: $(patsubst %/built_in.o,%/built_in_bin.o,$(ALL_OBJS)) prelink_lto.o $
 prelink-efi.o: $(patsubst %/built_in.o,%/built_in_bin.o,$(ALL_OBJS)) prelink_lto.o FORCE
$(call if_changed,ld)
 else
-prelink.o: $(ALL_OBJS) $(EFI_OBJS-y) FORCE
+prelink.o: $(ALL_OBJS) $(ALL_LIBS) $(EFI_OBJS-y) FORCE
$(call if_changed,ld)
 
-prelink-efi.o: $(ALL_OBJS) FORCE
+prelink-efi.o: $(ALL_OBJS) $(ALL_LIBS) FORCE
$(call if_changed,ld)
 endif
 
diff --git 

[PATCH v3 1/8] xen: fix build when $(obj-y) consists of just blanks

2020-11-23 Thread Jan Beulich
This case can occur when combining empty lists

obj-y :=
...
obj-y += $(empty)

or

obj-y := $(empty) $(empty)

where (only) blanks would accumulate. This was only a latent issue until
now, but would become an active issue for Arm once lib/ gets populated
with all respective objects going into the to-be-introduced lib.a.

Also address a related issue on this occasion: When an empty built_in.o
gets created, .built_in.o.d will have its dependencies recorded. If, on
a subsequent incremental build, an actual constituent of built_in.o
appeared, the $(filter-out ) would leave these recorded dependencies in
place. But of course the linker won't know what to do with C header
files. (The apparent alternative of not passing $(c_flags) or
$(a_flags) would not be reliable afaict, as among these flags there may
be some affecting information conveyed via the object file to the
linker. The linker, finding inconsistent flags across object files, may
then error out.) Using just $(obj-y) won't work either: It breaks when
the same object file is listed more than once.

Reported-by: Julien Grall 
Signed-off-by: Jan Beulich 
---
 xen/Rules.mk | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/xen/Rules.mk b/xen/Rules.mk
index 333e19bec343..d5e5eb33de39 100644
--- a/xen/Rules.mk
+++ b/xen/Rules.mk
@@ -130,13 +130,13 @@ c_flags += $(CFLAGS-y)
 a_flags += $(CFLAGS-y) $(AFLAGS-y)
 
 built_in.o: $(obj-y) $(extra-y)
-ifeq ($(obj-y),)
+ifeq ($(strip $(obj-y)),)
$(CC) $(c_flags) -c -x c /dev/null -o $@
 else
 ifeq ($(CONFIG_LTO),y)
-   $(LD_LTO) -r -o $@ $(filter-out $(extra-y),$^)
+   $(LD_LTO) -r -o $@ $(filter $(obj-y),$^)
 else
-   $(LD) $(XEN_LDFLAGS) -r -o $@ $(filter-out $(extra-y),$^)
+   $(LD) $(XEN_LDFLAGS) -r -o $@ $(filter $(obj-y),$^)
 endif
 endif
 
@@ -145,10 +145,10 @@ targets += $(filter-out $(subdir-obj-y), $(obj-y)) $(extra-y)
 targets += $(MAKECMDGOALS)
 
 built_in_bin.o: $(obj-bin-y) $(extra-y)
-ifeq ($(obj-bin-y),)
+ifeq ($(strip $(obj-bin-y)),)
$(CC) $(a_flags) -c -x assembler /dev/null -o $@
 else
-   $(LD) $(XEN_LDFLAGS) -r -o $@ $(filter-out $(extra-y),$^)
+   $(LD) $(XEN_LDFLAGS) -r -o $@ $(filter $(obj-bin-y),$^)
 endif
 
 # Force execution of pattern rules (for which PHONY cannot be directly used).
-- 
2.22.0





[PATCH v3 0/8] xen: beginnings of moving library-like code into an archive

2020-11-23 Thread Jan Beulich
In a few cases we link in library-like functions when they're not
actually needed. While we could use Kconfig options for each one
of them, I think the better approach for such generic code is to
build it always (thus making sure a build issue can't be introduced
for these in any however exotic configuration) and then put it into
an archive, for the linker to pick up as needed. The series here
presents a first few tiny steps towards such a goal.

Note that we can't use thin archives yet, due to our tool chain
(binutils) baseline being too low.

Further almost immediate steps I'd like to take if the approach
meets no opposition are
- split and move the rest of common/lib.c,
- split and move common/string.c, dropping the need for all the
  __HAVE_ARCH_* (implying possible per-arch archives then need to
  be specified ahead of lib/lib.a on the linker command lines),
- move common/libelf/ and common/libfdt/.

v3 has a new 1st patch and some review feedback addressed. See
individual patches.

1: xen: fix build when $(obj-y) consists of just blanks
2: lib: collect library files in an archive
3: lib: move list sorting code
4: lib: move parse_size_and_unit()
5: lib: move init_constructors()
6: lib: move rbtree code
7: lib: move bsearch code
8: lib: move sort code

Jan



Re: [PATCH v2 05/12] x86: rework arch_local_irq_restore() to not use popf

2020-11-23 Thread Andy Lutomirski





> On Nov 22, 2020, at 9:22 PM, Jürgen Groß  wrote:
> 
> On 22.11.20 22:44, Andy Lutomirski wrote:
>>> On Sat, Nov 21, 2020 at 10:55 PM Jürgen Groß  wrote:
>>> 
>>> On 20.11.20 12:59, Peter Zijlstra wrote:
>>>> On Fri, Nov 20, 2020 at 12:46:23PM +0100, Juergen Gross wrote:
>>>>> +static __always_inline void arch_local_irq_restore(unsigned long flags)
>>>>> +{
>>>>> +	if (!arch_irqs_disabled_flags(flags))
>>>>> +		arch_local_irq_enable();
>>>>> +}
>>>>
>>>> If someone were to write horrible code like:
>>>>
>>>>    local_irq_disable();
>>>>    local_irq_save(flags);
>>>>    local_irq_enable();
>>>>    local_irq_restore(flags);
>>>>
>>>> we'd be up some creek without a paddle... now I don't _think_ we have
>>>> genius code like that, but I'd feel safer if we can haz an assertion in
>>>> there somewhere...
>>>>
>>>> Maybe something like:
>>>>
>>>> #ifdef CONFIG_DEBUG_ENTRY // for lack of something saner
>>>>    WARN_ON_ONCE((arch_local_save_flags() ^ flags) & X86_EFLAGS_IF);
>>>> #endif
>>>>
>>>> At the end?
>>> 
>>> I'd like to, but using WARN_ON_ONCE() in include/asm/irqflags.h sounds
>>> like a perfect recipe for include dependency hell.
>>> 
>>> We could use a plain asm("ud2") instead.
>> How about out-of-lining it:
>> #ifdef CONFIG_DEBUG_ENTRY
>> extern void warn_bogus_irqrestore();
>> #endif
>> static __always_inline void arch_local_irq_restore(unsigned long flags)
>> {
>>if (!arch_irqs_disabled_flags(flags)) {
>>arch_local_irq_enable();
>>} else {
>> #ifdef CONFIG_DEBUG_ENTRY
>>if (unlikely(arch_local_irq_save() & X86_EFLAGS_IF))
>> warn_bogus_irqrestore();
>> #endif
>> }
> 
> This couldn't be a WARN_ON_ONCE() then (or it would be a catch all).

If you put the WARN_ON_ONCE in the out-of-line helper, it should work 
reasonably well.
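
To illustrate the out-of-line idea, here is a minimal user-space sketch, not kernel code. It assumes a simulated flags word (sim_flags) in place of EFLAGS, a warn_count counter added only so the effect can be observed, and it reads the current state with arch_local_save_flags() rather than arch_local_irq_save(); all helper bodies are stand-ins. The point is only that the "once" state lives in a single out-of-line function instead of in every inlined copy.

```c
#include <stdbool.h>
#include <stdio.h>

#define X86_EFLAGS_IF 0x200UL  /* interrupt-enable flag, bit 9 of EFLAGS */

/* Hypothetical stand-in for the CPU flags register. */
static unsigned long sim_flags = X86_EFLAGS_IF;
/* Illustration only: counts how often the warning actually fired. */
static unsigned int warn_count;

static unsigned long arch_local_save_flags(void) { return sim_flags; }
static void arch_local_irq_enable(void)  { sim_flags |= X86_EFLAGS_IF; }
static void arch_local_irq_disable(void) { sim_flags &= ~X86_EFLAGS_IF; }

static bool arch_irqs_disabled_flags(unsigned long flags)
{
    return !(flags & X86_EFLAGS_IF);
}

void warn_bogus_irqrestore(void);

/* Out of line: the "warned" state exists exactly once in the program. */
void warn_bogus_irqrestore(void)
{
    static bool warned;

    if (warned)
        return;
    warned = true;
    warn_count++;
    fprintf(stderr, "irq restore with interrupts unexpectedly enabled\n");
}

static inline void arch_local_irq_restore(unsigned long flags)
{
    if (!arch_irqs_disabled_flags(flags))
        arch_local_irq_enable();
    else if (arch_local_save_flags() & X86_EFLAGS_IF)
        warn_bogus_irqrestore(); /* caller asked for "disabled", IRQs are on */
}
```

With this shape the inline function stays free of WARN_ON_ONCE() and its header dependencies; whether the helper then uses WARN_ON_ONCE() or something else is a decision local to one .c file.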

> Another approach might be to open-code the WARN_ON_ONCE(), like:
> 
> #ifdef CONFIG_DEBUG_ENTRY
> extern void warn_bogus_irqrestore(bool *once);
> #endif
> 
> static __always_inline void arch_local_irq_restore(unsigned long flags)
> {
>if (!arch_irqs_disabled_flags(flags))
>arch_local_irq_enable();
> #ifdef CONFIG_DEBUG_ENTRY
>{
>static bool once;
> 
>if (unlikely(arch_local_irq_save() & X86_EFLAGS_IF))
>warn_bogus_irqrestore(&once);
>}
> #endif
> }
> 

I don’t know precisely what a static variable in an __always_inline function 
will do, but I imagine it will be, at best, erratic, especially when modules 
are involved.

> 
> Juergen
> 



[PATCH v2 2/2] x86/IRQ: reduce casting involved in guest action retrieval

2020-11-23 Thread Jan Beulich
Introduce a helper function covering both the IRQ_GUEST check and the
cast involved in obtaining the (correctly typed) pointer. Where possible
add const and/or reduce variable scope.
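
As a standalone illustration of the pattern (a sketch only: the struct layouts are heavily trimmed, the IRQ_GUEST value is made up, and desc_is_shared() is a hypothetical caller, not a function from irq.c), the helper folds the status check and the cast into one place so callers only need a NULL test:

```c
#include <stddef.h>

#define IRQ_GUEST 0x10u  /* hypothetical bit value, for illustration only */

/* Simplified stand-ins for the real Xen types. */
struct irqaction { unsigned int placeholder; };

typedef struct {
    unsigned int nr_guests;
    unsigned int in_flight;
} irq_guest_action_t;

struct irq_desc {
    unsigned int status;
    struct irqaction *action;
};

/* Same shape as the helper in the patch: NULL unless the IRQ is a guest one. */
static irq_guest_action_t *guest_action(const struct irq_desc *desc)
{
    return desc->status & IRQ_GUEST ? (void *)desc->action : NULL;
}

/* Hypothetical caller: one NULL check replaces the status test plus cast. */
static int desc_is_shared(const struct irq_desc *desc)
{
    const irq_guest_action_t *action = guest_action(desc);

    return action && action->nr_guests > 1;
}
```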

Signed-off-by: Jan Beulich 

--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -1042,6 +1042,11 @@ typedef struct {
 struct domain *guest[IRQ_MAX_GUESTS];
 } irq_guest_action_t;
 
+static irq_guest_action_t *guest_action(const struct irq_desc *desc)
+{
+return desc->status & IRQ_GUEST ? (void *)desc->action : NULL;
+}
+
 /*
  * Stack of interrupts awaiting EOI on each CPU. These must be popped in
  * order, as only the current highest-priority pending irq can be EOIed.
@@ -1111,11 +1116,9 @@ static void irq_guest_eoi_timer_fn(void
 
 spin_lock_irq(&desc->lock);
 
-if ( !(desc->status & IRQ_GUEST) )
+if ( !(action = guest_action(desc)) )
 goto out;
 
-action = (irq_guest_action_t *)desc->action;
-
 ASSERT(action->ack_type != ACKTYPE_NONE);
 
 /*
@@ -1351,16 +1354,15 @@ static void flush_ready_eoi(void)
 pending_eoi_sp(peoi) = sp+1;
 }
 
-static void __set_eoi_ready(struct irq_desc *desc)
+static void __set_eoi_ready(const struct irq_desc *desc)
 {
-irq_guest_action_t *action = (irq_guest_action_t *)desc->action;
+irq_guest_action_t *action = guest_action(desc);
 struct pending_eoi *peoi = this_cpu(pending_eoi);
 int irq, sp;
 
 irq = desc - irq_desc;
 
-if ( !(desc->status & IRQ_GUEST) ||
- (action->in_flight != 0) ||
+if ( !action || action->in_flight ||
  !cpumask_test_and_clear_cpu(smp_processor_id(),
  action->cpu_eoi_map) )
 return;
@@ -1400,18 +1402,11 @@ void pirq_guest_eoi(struct pirq *pirq)
 
 void desc_guest_eoi(struct irq_desc *desc, struct pirq *pirq)
 {
-irq_guest_action_t *action;
+irq_guest_action_t *action = guest_action(desc);
 cpumask_t   cpu_eoi_map;
 
-if ( !(desc->status & IRQ_GUEST) )
-{
-spin_unlock_irq(&desc->lock);
-return;
-}
-
-action = (irq_guest_action_t *)desc->action;
-
-if ( unlikely(!test_and_clear_bool(pirq->masked)) ||
+if ( unlikely(!action) ||
+ unlikely(!test_and_clear_bool(pirq->masked)) ||
  unlikely(--action->in_flight != 0) )
 {
 spin_unlock_irq(&desc->lock);
@@ -1510,8 +1505,8 @@ static int irq_acktype(const struct irq_
 
 int pirq_shared(struct domain *d, int pirq)
 {
-struct irq_desc *desc;
-irq_guest_action_t *action;
+struct irq_desc*desc;
+const irq_guest_action_t *action;
 unsigned long   flags;
 int shared;
 
@@ -1519,8 +1514,8 @@ int pirq_shared(struct domain *d, int pi
 if ( desc == NULL )
 return 0;
 
-action = (irq_guest_action_t *)desc->action;
-shared = ((desc->status & IRQ_GUEST) && (action->nr_guests > 1));
+action = guest_action(desc);
+shared = (action && (action->nr_guests > 1));
 
 spin_unlock_irqrestore(&desc->lock, flags);
 
@@ -1544,9 +1539,7 @@ int pirq_guest_bind(struct vcpu *v, stru
 goto out;
 }
 
-action = (irq_guest_action_t *)desc->action;
-
-if ( !(desc->status & IRQ_GUEST) )
+if ( !(action = guest_action(desc)) )
 {
 if ( desc->action != NULL )
 {
@@ -1659,21 +1652,18 @@ int pirq_guest_bind(struct vcpu *v, stru
 static irq_guest_action_t *__pirq_guest_unbind(
 struct domain *d, struct pirq *pirq, struct irq_desc *desc)
 {
-irq_guest_action_t *action;
+irq_guest_action_t *action = guest_action(desc);
 cpumask_t   cpu_eoi_map;
 int i;
 
-action = (irq_guest_action_t *)desc->action;
-
 if ( unlikely(action == NULL) )
 {
 dprintk(XENLOG_G_WARNING, "dom%d: pirq %d: desc->action is NULL!\n",
 d->domain_id, pirq->pirq);
+BUG_ON(!(desc->status & IRQ_GUEST));
 return NULL;
 }
 
-BUG_ON(!(desc->status & IRQ_GUEST));
-
 for ( i = 0; (i < action->nr_guests) && (action->guest[i] != d); i++ )
 continue;
 BUG_ON(i == action->nr_guests);
@@ -1793,14 +1783,12 @@ static bool pirq_guest_force_unbind(stru
 desc = pirq_spin_lock_irq_desc(pirq, NULL);
 BUG_ON(desc == NULL);
 
-if ( !(desc->status & IRQ_GUEST) )
-goto out;
-
-action = (irq_guest_action_t *)desc->action;
+action = guest_action(desc);
 if ( unlikely(action == NULL) )
 {
-dprintk(XENLOG_G_WARNING, "dom%d: pirq %d: desc->action is NULL!\n",
-d->domain_id, pirq->pirq);
+if ( desc->status & IRQ_GUEST )
+dprintk(XENLOG_G_WARNING, "%pd: pirq %d: desc->action is NULL!\n",
+d, pirq->pirq);
 goto out;
 }
 
@@ -1827,7 +1815,7 @@ static bool pirq_guest_force_unbind(stru
 
 static void do_IRQ_guest(struct irq_desc *desc, unsigned int vector)
 {
-irq_guest_action_t *action = (irq_guest_action_t *)desc->action;
+irq_guest_action_t *action = 

[PATCH v2 1/2] x86/IRQ: drop three unused variables

2020-11-23 Thread Jan Beulich
I didn't bother figuring out which commit(s) should have deleted them while
removing their last uses.

Signed-off-by: Jan Beulich 
---
v2: Yet one more.

--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -1402,7 +1402,6 @@ void desc_guest_eoi(struct irq_desc *des
 {
 irq_guest_action_t *action;
 cpumask_t   cpu_eoi_map;
-int irq;
 
 if ( !(desc->status & IRQ_GUEST) )
 {
@@ -1411,7 +1410,6 @@ void desc_guest_eoi(struct irq_desc *des
 }
 
 action = (irq_guest_action_t *)desc->action;
-irq = desc - irq_desc;
 
 if ( unlikely(!test_and_clear_bool(pirq->masked)) ||
  unlikely(--action->in_flight != 0) )
@@ -1531,7 +1529,6 @@ int pirq_shared(struct domain *d, int pi
 
 int pirq_guest_bind(struct vcpu *v, struct pirq *pirq, int will_share)
 {
-unsigned intirq;
 struct irq_desc *desc;
 irq_guest_action_t *action, *newaction = NULL;
 int rc = 0;
@@ -1548,7 +1545,6 @@ int pirq_guest_bind(struct vcpu *v, stru
 }
 
 action = (irq_guest_action_t *)desc->action;
-irq = desc - irq_desc;
 
 if ( !(desc->status & IRQ_GUEST) )
 {
@@ -1663,13 +1659,11 @@ int pirq_guest_bind(struct vcpu *v, stru
 static irq_guest_action_t *__pirq_guest_unbind(
 struct domain *d, struct pirq *pirq, struct irq_desc *desc)
 {
-unsigned intirq;
 irq_guest_action_t *action;
 cpumask_t   cpu_eoi_map;
 int i;
 
 action = (irq_guest_action_t *)desc->action;
-irq = desc - irq_desc;
 
 if ( unlikely(action == NULL) )
 {




[PATCH v2 0/2] x86/IRQ: a little bit of tidying

2020-11-23 Thread Jan Beulich
1: drop three unused variables
2: reduce casting involved in guest action retrieval

Jan



Re: [PATCH V2 12/23] xen/ioreq: Remove "hvm" prefixes from involved function names

2020-11-23 Thread Jan Beulich
On 23.11.2020 15:39, Oleksandr wrote:
> As agreed, below is the list of proposed renamings (namings) within the 
> current series.

Thanks for compiling this. A couple of suggestions for consideration:

> 1. Global (existing):
> hvm_map_mem_type_to_ioreq_server -> ioreq_server_map_mem_type
> hvm_select_ioreq_server  -> ioreq_server_select
> hvm_send_ioreq   -> ioreq_send
> hvm_ioreq_init   -> ioreq_init

ioreq_domain_init() (or, imo less desirable domain_ioreq_init())?

> hvm_destroy_all_ioreq_servers    -> ioreq_server_destroy_all
> hvm_all_ioreq_servers_add_vcpu   -> ioreq_server_add_vcpu_all
> hvm_all_ioreq_servers_remove_vcpu    -> ioreq_server_remove_vcpu_all
> hvm_broadcast_ioreq  -> ioreq_broadcast
> hvm_create_ioreq_server  -> ioreq_server_create
> hvm_get_ioreq_server_info    -> ioreq_server_get_info
> hvm_map_io_range_to_ioreq_server -> ioreq_server_map_io_range
> hvm_unmap_io_range_from_ioreq_server -> ioreq_server_unmap_io_range
> hvm_set_ioreq_server_state   -> ioreq_server_set_state
> hvm_destroy_ioreq_server -> ioreq_server_destroy
> hvm_get_ioreq_server_frame   -> ioreq_server_get_frame
> hvm_ioreq_needs_completion   -> ioreq_needs_completion
> hvm_mmio_first_byte  -> ioreq_mmio_first_byte
> hvm_mmio_last_byte   -> ioreq_mmio_last_byte
> send_invalidate_req  -> ioreq_signal_mapcache_invalidate
> 
> handle_hvm_io_completion -> handle_io_completion

For this one I'm not sure what to suggest, but I'm not overly happy
with the name.

> hvm_io_pending   -> io_pending

vcpu_ioreq_pending() or vcpu_any_ioreq_pending()?

> 2. Global (new):
> arch_io_completion
> arch_ioreq_server_map_pages
> arch_ioreq_server_unmap_pages
> arch_ioreq_server_enable
> arch_ioreq_server_disable
> arch_ioreq_server_destroy
> arch_ioreq_server_map_mem_type
> arch_ioreq_server_destroy_all
> arch_ioreq_server_get_type_addr
> arch_ioreq_init

Assuming this is the arch hook of the similarly named function
further up, a similar adjustment may then be wanted here.

> domain_has_ioreq_server
> 
> 
> 3. Local (existing) in common ioreq.c:
> hvm_alloc_ioreq_mfn   -> ioreq_alloc_mfn
> hvm_free_ioreq_mfn    -> ioreq_free_mfn

These two are server functions, so should imo be ioreq_server_...().

However, if they're static (as they're now), no distinguishing
prefix is strictly necessary, i.e. alloc_mfn() and free_mfn() may
be fine. The two names may be too short for Paul's taste, though.
Some similar shortening may be possible for some or all of the ones
below here.

Jan

> hvm_update_ioreq_evtchn   -> ioreq_update_evtchn
> hvm_ioreq_server_add_vcpu -> ioreq_server_add_vcpu
> hvm_ioreq_server_remove_vcpu  -> ioreq_server_remove_vcpu
> hvm_ioreq_server_remove_all_vcpus -> ioreq_server_remove_all_vcpus
> hvm_ioreq_server_alloc_pages  -> ioreq_server_alloc_pages
> hvm_ioreq_server_free_pages   -> ioreq_server_free_pages
> hvm_ioreq_server_free_rangesets   -> ioreq_server_free_rangesets
> hvm_ioreq_server_alloc_rangesets  -> ioreq_server_alloc_rangesets
> hvm_ioreq_server_enable   -> ioreq_server_enable
> hvm_ioreq_server_disable  -> ioreq_server_disable
> hvm_ioreq_server_init -> ioreq_server_init
> hvm_ioreq_server_deinit   -> ioreq_server_deinit
> hvm_send_buffered_ioreq   -> ioreq_send_buffered
> 
> hvm_wait_for_io   -> wait_for_io
> 
> 4. Local (existing) in x86 ioreq.c:
> Everything related to the legacy interface (hvm_alloc_legacy_ioreq_gfn, etc.) 
> is going to remain as is.
> 
> 
> 




Re: [PATCH V2 12/23] xen/ioreq: Remove "hvm" prefixes from involved function names

2020-11-23 Thread Oleksandr



Hi Jan.


As agreed, below is the list of proposed renamings (namings) within the 
current series.


If there are no objections, I will follow the proposed renaming. If there 
are any, please let me know.



1. Global (existing):
hvm_map_mem_type_to_ioreq_server -> ioreq_server_map_mem_type
hvm_select_ioreq_server  -> ioreq_server_select
hvm_send_ioreq   -> ioreq_send
hvm_ioreq_init   -> ioreq_init
hvm_destroy_all_ioreq_servers    -> ioreq_server_destroy_all
hvm_all_ioreq_servers_add_vcpu   -> ioreq_server_add_vcpu_all
hvm_all_ioreq_servers_remove_vcpu    -> ioreq_server_remove_vcpu_all
hvm_broadcast_ioreq  -> ioreq_broadcast
hvm_create_ioreq_server  -> ioreq_server_create
hvm_get_ioreq_server_info    -> ioreq_server_get_info
hvm_map_io_range_to_ioreq_server -> ioreq_server_map_io_range
hvm_unmap_io_range_from_ioreq_server -> ioreq_server_unmap_io_range
hvm_set_ioreq_server_state   -> ioreq_server_set_state
hvm_destroy_ioreq_server -> ioreq_server_destroy
hvm_get_ioreq_server_frame   -> ioreq_server_get_frame
hvm_ioreq_needs_completion   -> ioreq_needs_completion
hvm_mmio_first_byte  -> ioreq_mmio_first_byte
hvm_mmio_last_byte   -> ioreq_mmio_last_byte
send_invalidate_req  -> ioreq_signal_mapcache_invalidate

handle_hvm_io_completion -> handle_io_completion
hvm_io_pending   -> io_pending


2. Global (new):
arch_io_completion
arch_ioreq_server_map_pages
arch_ioreq_server_unmap_pages
arch_ioreq_server_enable
arch_ioreq_server_disable
arch_ioreq_server_destroy
arch_ioreq_server_map_mem_type
arch_ioreq_server_destroy_all
arch_ioreq_server_get_type_addr
arch_ioreq_init
domain_has_ioreq_server


3. Local (existing) in common ioreq.c:
hvm_alloc_ioreq_mfn   -> ioreq_alloc_mfn
hvm_free_ioreq_mfn    -> ioreq_free_mfn
hvm_update_ioreq_evtchn   -> ioreq_update_evtchn
hvm_ioreq_server_add_vcpu -> ioreq_server_add_vcpu
hvm_ioreq_server_remove_vcpu  -> ioreq_server_remove_vcpu
hvm_ioreq_server_remove_all_vcpus -> ioreq_server_remove_all_vcpus
hvm_ioreq_server_alloc_pages  -> ioreq_server_alloc_pages
hvm_ioreq_server_free_pages   -> ioreq_server_free_pages
hvm_ioreq_server_free_rangesets   -> ioreq_server_free_rangesets
hvm_ioreq_server_alloc_rangesets  -> ioreq_server_alloc_rangesets
hvm_ioreq_server_enable   -> ioreq_server_enable
hvm_ioreq_server_disable  -> ioreq_server_disable
hvm_ioreq_server_init -> ioreq_server_init
hvm_ioreq_server_deinit   -> ioreq_server_deinit
hvm_send_buffered_ioreq   -> ioreq_send_buffered

hvm_wait_for_io   -> wait_for_io

4. Local (existing) in x86 ioreq.c:
Everything related to the legacy interface (hvm_alloc_legacy_ioreq_gfn, etc.) 
is going to remain as is.



--
Regards,

Oleksandr Tyshchenko




[PATCH v2 17/17] x86emul: support {LD,ST}TILECFG

2020-11-23 Thread Jan Beulich
While ver 041 of the ISA extensions doc also specifies
xcr0_supports_palette() returning false as one of the #GP(0) reasons for
LDTILECFG, the earlier #UD conditions look to make this fully dead.

Signed-off-by: Jan Beulich 
---
v2: New.
---
SDE: -spr
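
For readers following the test below, a small sketch of the 64-byte configuration block it stores and loads. The struct mirrors the anonymous one in the test; the struct tag, the static assertions, fake_sttilecfg() and guards_intact() are assumptions added here for illustration and are not part of the patch. fake_sttilecfg() merely copies 64 bytes (or zeroes them, as one would expect after tilerelease); it does not execute the instruction.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Same field layout as the tilecfg object in the test. */
struct tilecfg {
    uint8_t palette, start_row;
    uint8_t res[14];
    uint16_t colsb[16];
    uint8_t rows[16];
};

static_assert(sizeof(struct tilecfg) == 64, "tile config block is 64 bytes");
static_assert(offsetof(struct tilecfg, colsb) == 16, "colsb starts at byte 16");
static_assert(offsetof(struct tilecfg, rows) == 48, "rows start at byte 48");

/* Copy of the helper from the patch: first byte that differs from c, or NULL. */
static void *memchr_inv(const void *s, int c, size_t n)
{
    const unsigned char *p = s;

    while ( n-- )
        if ( (unsigned char)c != *p++ )
            return (void *)(p - 1);

    return NULL;
}

/* Hypothetical stand-in for the store: NULL cfg models the released state. */
static void fake_sttilecfg(void *dst, const struct tilecfg *cfg)
{
    if ( cfg )
        memcpy(dst, cfg, sizeof(*cfg));
    else
        memset(dst, 0, sizeof(*cfg));
}

/* The test pre-fills an 18-word buffer with ~0; words 0 and 17 are guards. */
static int guards_intact(const uint32_t res[18])
{
    return !~res[0] && !~res[17];
}
```

This is why the test uses memset(res, ~0, 72) and then checks res[0], res[17] and the 64 bytes starting at res + 1: any store wider or narrower than 64 bytes, or at the wrong displacement, shows up in one of the three checks.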

--- a/tools/tests/x86_emulator/predicates.c
+++ b/tools/tests/x86_emulator/predicates.c
@@ -1335,6 +1335,8 @@ static const struct vex {
 { { 0x45 }, 2, T, R, pfx_66, Wn, Ln }, /* vpsrlv{d,q} */
 { { 0x46 }, 2, T, R, pfx_66, W0, Ln }, /* vpsravd */
 { { 0x47 }, 2, T, R, pfx_66, Wn, Ln }, /* vpsllv{d,q} */
+{ { 0x49, 0x00 }, 2, F, R, pfx_no, W0, L0 }, /* ldtilecfg */
+{ { 0x49, 0x00 }, 2, F, W, pfx_66, W0, L0 }, /* sttilecfg */
 { { 0x49, 0xc0 }, 2, F, N, pfx_no, W0, L0 }, /* tilerelease */
 { { 0x50 }, 2, T, R, pfx_66, W0, Ln }, /* vpdpbusd */
 { { 0x51 }, 2, T, R, pfx_66, W0, Ln }, /* vpdpbusds */
--- a/tools/tests/x86_emulator/test_x86_emulator.c
+++ b/tools/tests/x86_emulator/test_x86_emulator.c
@@ -898,6 +898,16 @@ int main(int argc, char **argv)
 int rc;
 #ifdef __x86_64__
 unsigned int vendor_native;
+static const struct {
+uint8_t palette, start_row;
+uint8_t res[14];
+uint16_t colsb[16];
+uint8_t rows[16];
+} tilecfg = {
+.palette = 1,
+.colsb = { 2, 4, 5, 3 },
+.rows = { 2, 4, 3, 5 },
+};
 #else
 unsigned int bcdres_native, bcdres_emul;
 #endif
@@ -4463,6 +4473,74 @@ int main(int argc, char **argv)
 printf("skipped\n");
 
 #ifdef __x86_64__
+printf("%-40s", "Testing tilerelease;sttilecfg 4(%rcx)...");
+if ( stack_exec && cpu_has_amx_tile )
+{
+decl_insn(tilerelease);
+
+asm volatile ( put_insn(tilerelease,
+/* tilerelease */
+".byte 0xC4, 0xE2, 0x78, 0x49, 0xC0;"
+/* sttilecfg 4(%0) */
+".byte 0xC4, 0xE2, 0x79, 0x49, 0x41, 0x04")
+:: "c" (NULL) );
+
+memset(res, ~0, 72);
+set_insn(tilerelease);
+regs.ecx = (unsigned long)res;
+rc = x86_emulate(&ctxt, &emulops);
+if ( rc == X86EMUL_OKAY )
+rc = x86_emulate(&ctxt, &emulops);
+if ( rc != X86EMUL_OKAY || !check_eip(tilerelease) ||
+ ~res[0] || ~res[17] || memchr_inv(res + 1, 0, 64) )
+goto fail;
+printf("okay\n");
+}
+else
+printf("skipped\n");
+
+printf("%-40s", "Testing ldtilecfg (%rdx)...");
+if ( stack_exec && cpu_has_amx_tile )
+{
+decl_insn(ldtilecfg);
+
+asm volatile ( put_insn(ldtilecfg,
+/* ldtilecfg (%0) */
+".byte 0xC4, 0xE2, 0x78, 0x49, 0x02")
+:: "d" (NULL) );
+
+set_insn(ldtilecfg);
+regs.edx = (unsigned long)&tilecfg;
+rc = x86_emulate(&ctxt, &emulops);
+if ( rc != X86EMUL_OKAY || !check_eip(ldtilecfg) )
+goto fail;
+printf("pending\n");
+}
+else
+printf("skipped\n");
+
+printf("%-40s", "Testing sttilecfg -4(%rcx)...");
+if ( stack_exec && cpu_has_amx_tile )
+{
+decl_insn(sttilecfg);
+
+asm volatile ( put_insn(sttilecfg,
+/* sttilecfg -4(%0) */
+".byte 0xC4, 0xE2, 0x79, 0x49, 0x41, 0xfc")
+:: "c" (NULL) );
+
+memset(res, ~0, 72);
+set_insn(sttilecfg);
+regs.ecx = (unsigned long)(res + 2);
+rc = x86_emulate(&ctxt, &emulops);
+if ( rc != X86EMUL_OKAY || !check_eip(sttilecfg) ||
+ ~res[0] || ~res[17] || memcmp(res + 1, &tilecfg, 64) )
+goto fail;
+printf("okay\n");
+}
+else
+printf("skipped\n");
+
 printf("%-40s", "Testing vzeroupper (compat)...");
 if ( cpu_has_avx )
 {
--- a/tools/tests/x86_emulator/x86-emulate.h
+++ b/tools/tests/x86_emulator/x86-emulate.h
@@ -67,6 +67,17 @@
 
 #define is_canonical_address(x) (((int64_t)(x) >> 47) == ((int64_t)(x) >> 63))
 
+static inline void *memchr_inv(const void *s, int c, size_t n)
+{
+const unsigned char *p = s;
+
+while ( n-- )
+if ( (unsigned char)c != *p++ )
+return (void *)(p - 1);
+
+return NULL;
+}
+
 extern uint32_t mxcsr_mask;
 extern struct cpuid_policy cp;
 
@@ -170,6 +181,8 @@ static inline bool xcr0_mask(uint64_t ma
 #define cpu_has_avx512_4fmaps (cp.feat.avx512_4fmaps && xcr0_mask(0xe6))
 #define cpu_has_avx512_vp2intersect (cp.feat.avx512_vp2intersect && xcr0_mask(0xe6))
 #define cpu_has_serialize  cp.feat.serialize
+#define cpu_has_amx_tile   (cp.feat.amx_tile && \
+xcr0_mask(X86_XCR0_TILECFG | X86_XCR0_TILEDATA))
 #define cpu_has_avx_vnni   (cp.feat.avx_vnni && xcr0_mask(6))
 #define cpu_has_avx512_bf16 (cp.feat.avx512_bf16 && xcr0_mask(0xe6))
 
--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ 

[PATCH v2 16/17] x86emul: support TILERELEASE

2020-11-23 Thread Jan Beulich
This is relatively straightforward, and hence best suited to introduce a
few other general pieces.

Testing of this will be added once a sensible test can be put together,
i.e. when support for other insns is also there.

Signed-off-by: Jan Beulich 
---
v2: New.
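
A rough sketch of how the encoding maps onto the checks in the new case block, using the tilerelease byte sequence C4 E2 78 49 C0. The struct vex3 type and both functions are hypothetical illustrations, not the emulator's own decoder; only the generic 3-byte VEX field layout is relied upon.

```c
#include <stdint.h>

/* Fields of a 3-byte (C4) VEX prefix plus the two bytes following it. */
struct vex3 {
    unsigned int map;   /* mmmmm: 1 = 0f, 2 = 0f38, 3 = 0f3a */
    unsigned int w;     /* VEX.W */
    unsigned int vvvv;  /* extra register operand, un-inverted here */
    unsigned int l;     /* VEX.L */
    unsigned int pp;    /* 0 = none, 1 = 66, 2 = f3, 3 = f2 */
    uint8_t opcode, modrm;
};

/* Returns 0 on success, -1 if the bytes do not start with a C4 prefix. */
static int decode_vex3(const uint8_t *b, struct vex3 *v)
{
    if ( b[0] != 0xc4 )
        return -1;

    v->map    = b[1] & 0x1f;
    v->w      = (b[2] >> 7) & 1;
    v->vvvv   = ((b[2] >> 3) & 0xf) ^ 0xf; /* stored inverted in the prefix */
    v->l      = (b[2] >> 2) & 1;
    v->pp     = b[2] & 3;
    v->opcode = b[3];
    v->modrm  = b[4];

    return 0;
}

/* Mirrors the patch: 0f38 map, opcode 49, no prefix, W0, L0, ModRM == c0. */
static int is_tilerelease(const struct vex3 *v)
{
    return v->map == 2 && v->opcode == 0x49 && v->pp == 0 &&
           !v->w && !v->l && v->modrm == 0xc0;
}
```

The same opcode with a memory operand (ModRM mod != 3) is what the new case leaves as unimplemented_insn for now, while any other register form is treated as unrecognized.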

--- a/tools/tests/x86_emulator/predicates.c
+++ b/tools/tests/x86_emulator/predicates.c
@@ -1335,6 +1335,7 @@ static const struct vex {
 { { 0x45 }, 2, T, R, pfx_66, Wn, Ln }, /* vpsrlv{d,q} */
 { { 0x46 }, 2, T, R, pfx_66, W0, Ln }, /* vpsravd */
 { { 0x47 }, 2, T, R, pfx_66, Wn, Ln }, /* vpsllv{d,q} */
+{ { 0x49, 0xc0 }, 2, F, N, pfx_no, W0, L0 }, /* tilerelease */
 { { 0x50 }, 2, T, R, pfx_66, W0, Ln }, /* vpdpbusd */
 { { 0x51 }, 2, T, R, pfx_66, W0, Ln }, /* vpdpbusds */
 { { 0x52 }, 2, T, R, pfx_66, W0, Ln }, /* vpdpwssd */
--- a/tools/tests/x86_emulator/x86-emulate.c
+++ b/tools/tests/x86_emulator/x86-emulate.c
@@ -247,6 +247,9 @@ int emul_test_get_fpu(
 break;
 default:
 return X86EMUL_UNHANDLEABLE;
+
+case X86EMUL_FPU_tile:
+return cpu_has_amx_tile ? X86EMUL_OKAY : X86EMUL_UNHANDLEABLE;
 }
 return X86EMUL_OKAY;
 }
--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -475,6 +475,7 @@ static const struct ext0f38_table {
 [0x43] = { .simd_size = simd_scalar_vexw, .d8s = d8s_dq },
 [0x44] = { .simd_size = simd_packed_int, .two_op = 1, .d8s = d8s_vl },
 [0x45 ... 0x47] = { .simd_size = simd_packed_int, .d8s = d8s_vl },
+[0x49] = { .simd_size = simd_other, .two_op = 1 },
 [0x4c] = { .simd_size = simd_packed_fp, .two_op = 1, .d8s = d8s_vl },
 [0x4d] = { .simd_size = simd_scalar_vexw, .d8s = d8s_dq },
 [0x4e] = { .simd_size = simd_packed_fp, .two_op = 1, .d8s = d8s_vl },
@@ -2014,6 +2015,7 @@ amd_like(const struct x86_emulate_ctxt *
 #define vcpu_has_avx512_4fmaps() (ctxt->cpuid->feat.avx512_4fmaps)
 #define vcpu_has_avx512_vp2intersect() (ctxt->cpuid->feat.avx512_vp2intersect)
 #define vcpu_has_serialize()   (ctxt->cpuid->feat.serialize)
+#define vcpu_has_amx_tile()(ctxt->cpuid->feat.amx_tile)
 #define vcpu_has_avx_vnni()(ctxt->cpuid->feat.avx_vnni)
 #define vcpu_has_avx512_bf16() (ctxt->cpuid->feat.avx512_bf16)
 
@@ -9460,6 +9462,24 @@ x86_emulate(
 generate_exception_if(vex.l, EXC_UD);
 goto simd_0f_avx;
 
+case X86EMUL_OPC_VEX(0x0f38, 0x49):
+generate_exception_if(!mode_64bit() || vex.l || vex.w, EXC_UD);
+if ( ea.type == OP_REG )
+{
+switch ( modrm )
+{
+case 0xc0: /* tilerelease */
+host_and_vcpu_must_have(amx_tile);
+get_fpu(X86EMUL_FPU_tile);
+op_bytes = 1; /* fake */
+goto simd_0f_common;
+
+default:
+goto unrecognized_insn;
+}
+}
+goto unimplemented_insn;
+
 case X86EMUL_OPC_VEX_66(0x0f38, 0x50): /* vpdpbusd [xy]mm/mem,[xy]mm,[xy]mm */
 case X86EMUL_OPC_VEX_66(0x0f38, 0x51): /* vpdpbusds [xy]mm/mem,[xy]mm,[xy]mm */
 case X86EMUL_OPC_VEX_66(0x0f38, 0x52): /* vpdpwssd [xy]mm/mem,[xy]mm,[xy]mm */
--- a/xen/include/asm-x86/cpufeature.h
+++ b/xen/include/asm-x86/cpufeature.h
@@ -131,6 +131,7 @@
 #define cpu_has_avx512_vp2intersect boot_cpu_has(X86_FEATURE_AVX512_VP2INTERSECT)
 #define cpu_has_tsx_force_abort boot_cpu_has(X86_FEATURE_TSX_FORCE_ABORT)
 #define cpu_has_serialize   boot_cpu_has(X86_FEATURE_SERIALIZE)
+#define cpu_has_amx_tileboot_cpu_has(X86_FEATURE_AMX_TILE)
 
 /* CPUID level 0x0007:1.eax */
 #define cpu_has_avx_vnniboot_cpu_has(X86_FEATURE_AVX_VNNI)



