Re: [Xen-devel] [V9 2/3] Refactor rangeset structure for better performance.

2015-12-31 Thread Yu, Zhang



On 12/21/2015 10:38 PM, Jan Beulich wrote:

On 15.12.15 at 03:05,  wrote:

This patch refactors struct rangeset to base it on a red-black
tree instead of the current doubly linked list. At present, ioreq
leverages rangeset to keep track of the I/O and memory resources
to be emulated. Yet when the number of ranges inside one ioreq
server is very high, traversing a doubly linked list can be time
consuming. With this patch, the time complexity of searching a
rangeset improves from O(n) to O(log(n)). The interfaces of
rangeset remain the same, and no new APIs are introduced.
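
(Illustrative aside, not part of the patch: the gain comes from a range
lookup walking a balanced tree instead of a list. A minimal user-space
sketch of the idea follows, with hypothetical names; the actual patch
reuses Xen's rb_tree and keeps the rangeset API unchanged.)

/* Minimal sketch: O(log n) range lookup on a balanced tree of
 * non-overlapping [s, e] ranges (hypothetical types, not Xen's). */
#include <stdio.h>

struct range { unsigned long s, e; struct range *l, *r; };

static struct range *range_find(struct range *node, unsigned long addr)
{
    while (node) {
        if (addr < node->s)
            node = node->l;           /* go left of this range */
        else if (addr > node->e)
            node = node->r;           /* go right of this range */
        else
            return node;              /* s <= addr <= e: hit */
    }
    return NULL;
}

int main(void)
{
    struct range r1 = { 0x100, 0x1ff, NULL, NULL };
    struct range r2 = { 0x300, 0x3ff, NULL, NULL };
    struct range root = { 0x200, 0x2ff, &r1, &r2 };

    printf("%s\n", range_find(&root, 0x350) ? "hit" : "miss");
    return 0;
}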


So this indeed addresses one of the two original concerns. But
what about the other (resource use due to thousands of ranges
in use by a single VM)? IOW I'm still unconvinced this is the way
to go.



Thank you, Jan. As you saw in patch 3/3, the other concern was
addressed by extending the rangeset size, which may not be convincing
to you. But I believe this patch - refactoring the rangeset to an
rb_tree - not only solves XenGT's performance issue, but may also be
helpful in the future, e.g. if someday the rangeset is not allocated
from the xen heap and can hold a great number of ranges. :)

Yu


Jan




Re: [Xen-devel] [V9 0/3] Refactor ioreq server for better performance.

2015-12-31 Thread Yu, Zhang

Shuai, thank you very much for helping me push these patches!
And sorry for the delay due to my illness.
Now I'm back and will pick this up. :)

B.R.
Yu

On 12/15/2015 10:05 AM, Shuai Ruan wrote:

From: Yu Zhang 

XenGT leverages the ioreq server to track and forward accesses to
GPU I/O resources, e.g. the PPGTTs (per-process graphic translation
tables). Currently, the ioreq server uses a rangeset to track the BDF/
PIO/MMIO ranges to be emulated. To select an ioreq server, the
rangeset is searched to see if the I/O range is recorded. However,
traversing the linked list inside the rangeset can be time consuming
when the number of ranges is high. On the HSW platform, the number of
PPGTTs for each vGPU can be several hundred; on BDW, several thousand.
This patch series refactors rangeset to base it on a red-black tree,
so that searching is more efficient.

Besides, this patchset also splits the tracking of MMIO and guest
ram ranges into different rangesets. And to accommodate more ranges,
the limit on the number of ranges in an ioreq server, MAX_NR_IO_RANGES,
is changed - future patches might be provided to tune this with other
approaches. (A schematic of the lookup being sped up follows below.)
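
(Schematic aside: the lookup this series speeds up is, roughly, "find
the ioreq server whose rangeset of the matching type contains the
address". A simplified stand-alone model, with stand-in types rather
than the real Xen API:)

/* Stand-in model of per-type range lookup across ioreq servers. */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

enum io_range_type { RANGE_PORT, RANGE_MEMORY, RANGE_PCI, NR_RANGE_TYPES };

struct ioreq_server {
    /* one membership test per tracked resource type */
    bool (*range_contains[NR_RANGE_TYPES])(uint64_t addr);
};

static bool gpu_mmio(uint64_t addr) { return addr >= 0xf0000000u; }

static struct ioreq_server *select_server(struct ioreq_server *s, int n,
                                          enum io_range_type t, uint64_t addr)
{
    for (int i = 0; i < n; i++)
        if (s[i].range_contains[t] && s[i].range_contains[t](addr))
            return &s[i];
    return NULL;    /* fall through to the default ioreq server */
}

int main(void)
{
    struct ioreq_server srv = {
        .range_contains = { [RANGE_MEMORY] = gpu_mmio },
    };

    printf("%s\n", select_server(&srv, 1, RANGE_MEMORY, 0xf0001000u)
                   ? "matched server" : "default");
    return 0;
}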

Changes in v9:
1> Change order of patch 2 and patch3.
2> Introduce a static const array before hvm_ioreq_server_alloc_rangesets().
3> Coding style changes.

Changes in v8:
Use a clearer API name to map/unmap the write-protected memory in
ioreq server.

Changes in v7:
1> Coding style changes;
2> Fix a typo in hvm_select_ioreq_server().

Changes in v6:
Break the identical relationship between ioreq type and rangeset
index inside ioreq server.

Changes in v5:
1> Use gpfn, instead of gpa to track guest write-protected pages;
2> Remove redundant conditional statement in routine find_range().

Changes in v4:
Keep the name HVMOP_IO_RANGE_MEMORY for MMIO resources, and add
a new one, HVMOP_IO_RANGE_WP_MEM, for write-protected memory.

Changes in v3:
1> Use a separate rangeset for guest ram pages in ioreq server;
2> Refactor rangeset, instead of introduce a new data structure.

Changes in v2:
1> Split the original patch into 2;
2> Take Paul Durrant's comments:
   a> Add a name member in the struct rb_rangeset, and use the 'q'
debug key to dump the ranges in ioreq server;
   b> Keep original routine names for hvm ioreq server;
   c> Commit message changes - mention that a future patch will change
the maximum number of ranges inside the ioreq server.


Yu Zhang (3):
   Remove identical relationship between ioreq type and rangeset type.
   Refactor rangeset structure for better performance.
   Differentiate IO/mem resources tracked by ioreq server

  tools/libxc/include/xenctrl.h| 31 +++
  tools/libxc/xc_domain.c  | 61 ++
  xen/arch/x86/hvm/hvm.c   | 43 ++---
  xen/common/rangeset.c| 82 +---
  xen/include/asm-x86/hvm/domain.h |  4 +-
  xen/include/public/hvm/hvm_op.h  |  1 +
  6 files changed, 185 insertions(+), 37 deletions(-)





Re: [Xen-devel] [V9 3/3] Differentiate IO/mem resources tracked by ioreq server

2015-12-31 Thread Yu, Zhang



On 12/21/2015 10:45 PM, Jan Beulich wrote:

On 15.12.15 at 03:05,  wrote:

--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -935,6 +935,9 @@ static void hvm_ioreq_server_free_rangesets(struct hvm_ioreq_server *s,
  rangeset_destroy(s->range[i]);
  }

+static const char *io_range_name[ NR_IO_RANGE_TYPES ] =


const


OK. Thanks.




+{"port", "mmio", "pci", "wp-ed memory"};


As brief as possible, but still understandable - e.g. "wp-mem"?



Got it. Thanks.


@@ -2593,6 +2597,16 @@ struct hvm_ioreq_server *hvm_select_ioreq_server(struct domain *d,
  type = (p->type == IOREQ_TYPE_PIO) ?
  HVMOP_IO_RANGE_PORT : HVMOP_IO_RANGE_MEMORY;
  addr = p->addr;
+if ( type == HVMOP_IO_RANGE_MEMORY )
+{
+ ram_page = get_page_from_gfn(d, p->addr >> PAGE_SHIFT,
+  &p2mt, P2M_UNSHARE);
+ if ( p2mt == p2m_mmio_write_dm )
+ type = HVMOP_IO_RANGE_WP_MEM;
+
+ if ( ram_page )
+ put_page(ram_page);
+}


You evaluate the page's current type here - what if it subsequently
changes? I don't think it is appropriate to leave the hypervisor at
the mercy of the device model here.



Well, I do not quite understand your concern. :)
Here, get_page_from_gfn() is used to determine whether the addr is MMIO
or write-protected RAM. If this p2m type is changed, the change should
be triggered by the guest and the device model, e.g. when this RAM is no
longer supposed to be used as a graphic translation table, and that
should be fine. But I also wonder: is there any other routine more
appropriate for getting a p2m type from a gfn?
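
(For illustration only - a stand-alone model of the classification the
quoted hunk performs, with enum stand-ins instead of the real p2m types
and get_page_from_gfn():)

/* Model: route write-protected guest RAM to its own rangeset type. */
#include <stdio.h>

enum p2m_type { p2m_ram_rw, p2m_mmio_dm, p2m_mmio_write_dm };
enum range_type { HVMOP_IO_RANGE_MEMORY, HVMOP_IO_RANGE_WP_MEM };

static enum range_type classify(enum p2m_type p2mt)
{
    /* wp-mem gets its own rangeset; everything else stays MMIO */
    return p2mt == p2m_mmio_write_dm ? HVMOP_IO_RANGE_WP_MEM
                                     : HVMOP_IO_RANGE_MEMORY;
}

int main(void)
{
    printf("wp ram -> %d, mmio -> %d\n",
           classify(p2m_mmio_write_dm), classify(p2m_mmio_dm));
    return 0;
}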


--- a/xen/include/asm-x86/hvm/domain.h
+++ b/xen/include/asm-x86/hvm/domain.h
@@ -48,8 +48,8 @@ struct hvm_ioreq_vcpu {
  bool_t   pending;
  };

-#define NR_IO_RANGE_TYPES (HVMOP_IO_RANGE_PCI + 1)
-#define MAX_NR_IO_RANGES  256
+#define NR_IO_RANGE_TYPES (HVMOP_IO_RANGE_WP_MEM + 1)
+#define MAX_NR_IO_RANGES  8192


I'm sure I've objected before to this universal bumping of the limit:
Even if I were to withdraw my objection to the higher limit on the
new kind of tracked resource, I would continue to object to all
other resources getting their limits bumped too.



Hah. So how about we keep MAX_NR_IO_RANGES as 256, and use a new value,
say MAX_NR_WR_MEM_RANGES, set to 8192 in this patch? :)
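
(One possible shape of that split, as a sketch - MAX_NR_WR_MEM_RANGES
is the hypothetical new constant proposed above, not an existing one:)

#include <stdio.h>

#define MAX_NR_IO_RANGES      256   /* port/mmio/pci limits: unchanged */
#define MAX_NR_WR_MEM_RANGES  8192  /* wp-mem only: many PPGTT pages */

enum { RANGE_PORT, RANGE_MEMORY, RANGE_PCI, RANGE_WP_MEM };

static unsigned int max_ranges(int type)
{
    return type == RANGE_WP_MEM ? MAX_NR_WR_MEM_RANGES
                                : MAX_NR_IO_RANGES;
}

int main(void)
{
    printf("mmio: %u, wp-mem: %u\n",
           max_ranges(RANGE_MEMORY), max_ranges(RANGE_WP_MEM));
    return 0;
}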

Thanks a lot & happy new year!


Yu


Jan




Re: [Xen-devel] [PATCH v3 40/62] arm/acpi: Estimate memory required for acpi/efi tables

2015-12-31 Thread Shannon Zhao


On 2015/11/27 0:39, Stefano Stabellini wrote:
> On Tue, 17 Nov 2015, shannon.z...@linaro.org wrote:
>> From: Shannon Zhao 
>>
>> Estimate the memory required for loading acpi/efi tables in Dom0.
>> Allocate the pages to store the newly created EFI and ACPI tables,
>> and free these pages when destroying the domain.
> 
> Could you please explain what you are page aligning exactly and why?
> 

At least the table start address should be 64-bit aligned. If it is
not, the guest will throw an alignment fault:

ACPI: Using GIC for interrupt routing
Unhandled fault: alignment fault (0x9621) at 0xff86c19c
Internal error: : 9621 [#1] PREEMPT SMP
Modules linked in:
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.3.0+ #551
Hardware name: (null) (DT)
task: ffc00887 ti: ffc00884c000 task.ti: ffc00884c000
PC is at acpi_get_phys_id+0x264/0x290
LR is at acpi_get_phys_id+0x178/0x290
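
(To illustrate the requirement: the faulting address in the log above
is not 8-byte aligned. A tiny stand-alone check - ALIGN_UP is a local
helper here, not Xen's macro:)

#include <stdint.h>
#include <stdio.h>

#define ALIGN_UP(x, a)  (((x) + ((a) - 1)) & ~((uint64_t)(a) - 1))

int main(void)
{
    uint64_t addr = 0xff86c19c;     /* faulting address from the oops */

    printf("8-byte aligned? %s\n", (addr & 7) ? "no" : "yes");
    printf("next 8-byte boundary: 0x%llx\n",
           (unsigned long long)ALIGN_UP(addr, 8));
    return 0;
}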

> 
>>
>> Signed-off-by: Parth Dixit 
>> Signed-off-by: Shannon Zhao 
>> ---
>>  xen/arch/arm/domain.c   |  4 +++
>>  xen/arch/arm/domain_build.c | 80 
>> -
>>  xen/common/efi/boot.c   | 21 
>>  xen/include/asm-arm/setup.h |  2 ++
>>  4 files changed, 106 insertions(+), 1 deletion(-)
>>
>> diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
>> index 880d0a6..10c58c4 100644
>> --- a/xen/arch/arm/domain.c
>> +++ b/xen/arch/arm/domain.c
>> @@ -638,6 +638,10 @@ void arch_domain_destroy(struct domain *d)
>>  domain_vgic_free(d);
>>  domain_vuart_free(d);
>>  free_xenheap_page(d->shared_info);
>> +#ifdef CONFIG_ACPI
>> +free_xenheap_pages(d->arch.efi_acpi_table,
>> +   get_order_from_bytes(d->arch.efi_acpi_len));
>> +#endif
>>  }
>>  
>>  void arch_domain_shutdown(struct domain *d)
>> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
>> index 0c3441a..b5ed44c 100644
>> --- a/xen/arch/arm/domain_build.c
>> +++ b/xen/arch/arm/domain_build.c
>> @@ -12,6 +12,8 @@
>>  #include 
>>  #include 
>>  #include 
>> +#include 
>> +#include 
>>  #include 
>>  #include 
>>  #include 
>> @@ -1354,6 +1356,78 @@ static int prepare_dtb(struct domain *d, struct kernel_info *kinfo)
>>  return -EINVAL;
>>  }
>>  
>> +#ifdef CONFIG_ACPI
>> +static int estimate_acpi_efi_size(struct domain *d, struct kernel_info *kinfo)
>> +{
>> +u64 efi_size, acpi_size = 0, addr;
>> +u32 madt_size;
>> +struct acpi_table_rsdp *rsdp_tbl;
>> +struct acpi_table_header *table = NULL;
>> +
>> +efi_size = estimate_efi_size(kinfo->mem.nr_banks);
>> +
>> +acpi_size += PAGE_ALIGN(sizeof(struct acpi_table_fadt));
>> +acpi_size += PAGE_ALIGN(sizeof(struct acpi_table_stao));
>> +
>> +madt_size = sizeof(struct acpi_table_madt)
>> ++ sizeof(struct acpi_madt_generic_interrupt) * d->max_vcpus
>> ++ sizeof(struct acpi_madt_generic_distributor);
>> +if ( d->arch.vgic.version == GIC_V3 )
>> +madt_size += sizeof(struct acpi_madt_generic_redistributor)
>> + * d->arch.vgic.nr_regions;
>> +acpi_size += PAGE_ALIGN(madt_size);
> 
> are the MADT and FADT tables guaranteed to be on separate pages or is it
> just an estimate?
> 
> 
>> +addr = acpi_os_get_root_pointer();
>> +if ( !addr )
>> +{
>> +printk("Unable to get acpi root pointer\n");
>> +return -EINVAL;
>> +}
>> +rsdp_tbl = acpi_os_map_memory(addr, sizeof(struct acpi_table_rsdp));
>> +table = acpi_os_map_memory(rsdp_tbl->xsdt_physical_address,
>> +   sizeof(struct acpi_table_header));
>> +acpi_size += PAGE_ALIGN(table->length + sizeof(u64));
>> +acpi_os_unmap_memory(table, sizeof(struct acpi_table_header));
>> +acpi_os_unmap_memory(rsdp_tbl, sizeof(struct acpi_table_rsdp));
>> +
>> +acpi_size += PAGE_ALIGN(sizeof(struct acpi_table_rsdp));
>> +d->arch.efi_acpi_len = PAGE_ALIGN(efi_size) + PAGE_ALIGN(acpi_size);
>> +
>> +return 0;
>> +}
>> +
>> +static int prepare_acpi(struct domain *d, struct kernel_info *kinfo)
>> +{
>> +int rc = 0;
>> +int order;
>> +
>> +rc = estimate_acpi_efi_size(d, kinfo);
>> +if ( rc != 0 )
>> +return rc;
>> +
>> +order = get_order_from_bytes(d->arch.efi_acpi_len);
>> +d->arch.efi_acpi_table = alloc_xenheap_pages(order, 0);
>> +if ( d->arch.efi_acpi_table == NULL )
>> +{
>> +printk("unable to allocate memory!\n");
>> +return -1;
> 
> ENOMEM
> 
> 
>> +}
>> +memset(d->arch.efi_acpi_table, 0, d->arch.efi_acpi_len);
>> +
>> +/* For ACPI, Dom0 doesn't use kinfo->gnttab_start to get the grant table
>> + * region. So we use it as the ACPI table mapped address. */
>> +d->arch.efi_acpi_gpa = kinfo->gnttab_start;
>> +
>> +return 0;
>> +}
>> +#else
>> +static int prepare_acpi(struct domain *d, struct kernel_info *kinfo)
>> +{
>> +/* Only booting with 

[Xen-devel] [PATCH 1/1] Improved RTDS scheduler

2015-12-31 Thread Tianyang Chen
Budget replenishment is now handled by a dedicated timer which is
triggered at the most imminent release time of all runnable vcpus.

Signed-off-by: Tianyang Chen 
Signed-off-by: Meng Xu 
Signed-off-by: Dagaen Golomb 
---
 -cover-letter.patch|   16 +++
 0001-Improved-RTDS-scheduler.patch |  280 
 bak|   62 
 xen/common/sched_rt.c  |  159 +++-
 4 files changed, 453 insertions(+), 64 deletions(-)
 create mode 100644 -cover-letter.patch
 create mode 100644 0001-Improved-RTDS-scheduler.patch
 create mode 100644 bak

diff --git a/-cover-letter.patch b/-cover-letter.patch
new file mode 100644
index 000..f5aca91
--- /dev/null
+++ b/-cover-letter.patch
@@ -0,0 +1,16 @@
+From 25ca27ea281885eb9873244a11f08e6987efb36e Mon Sep 17 00:00:00 2001
+From: Tianyang Chen 
+Date: Thu, 31 Dec 2015 04:05:21 -0500
+Subject: [PATCH] *** SUBJECT HERE ***
+
+*** BLURB HERE ***
+
+Tianyang Chen (1):
+  Improved RTDS scheduler
+
+ xen/common/sched_rt.c |  159 +
+ 1 file changed, 95 insertions(+), 64 deletions(-)
+
+-- 
+1.7.9.5
+
diff --git a/0001-Improved-RTDS-scheduler.patch 
b/0001-Improved-RTDS-scheduler.patch
new file mode 100644
index 000..02cb1f7
--- /dev/null
+++ b/0001-Improved-RTDS-scheduler.patch
@@ -0,0 +1,280 @@
+From 25ca27ea281885eb9873244a11f08e6987efb36e Mon Sep 17 00:00:00 2001
+From: Tianyang Chen 
+Date: Thu, 31 Dec 2015 01:55:19 -0500
+Subject: [PATCH] Improved RTDS scheduler
+
+Budget replenishment is now handled by a dedicated timer which is
+triggered at the most imminent release time of all runnable vcpus.
+
+Signed-off-by: Tianyang Chen 
+Signed-off-by: Meng Xu 
+Signed-off-by: Dagaen Golomb 
+---
+ xen/common/sched_rt.c |  159 +
+ 1 file changed, 95 insertions(+), 64 deletions(-)
+
+diff --git a/xen/common/sched_rt.c b/xen/common/sched_rt.c
+index 4372486..d522272 100644
+--- a/xen/common/sched_rt.c
++++ b/xen/common/sched_rt.c
+@@ -16,6 +16,7 @@
+ #include 
+ #include 
+ #include 
++#include 
+ #include 
+ #include 
+ #include 
+@@ -147,6 +148,16 @@ static unsigned int nr_rt_ops;
+  * Global lock is referenced by schedule_data.schedule_lock from all
+  * physical cpus. It can be grabbed via vcpu_schedule_lock_irq()
+  */
++
++/* dedicated timer for replenishment */
++static struct timer repl_timer;
++
++/* controls when to first start the timer*/
++static int timer_started;
++
++/* handler for the replenishment timer */
++static void repl_handler(void *data); 
++
+ struct rt_private {
+ spinlock_t lock;/* the global coarse grand lock */
+ struct list_head sdom;  /* list of availalbe domains, used for dump */
+@@ -426,6 +437,7 @@ __runq_insert(const struct scheduler *ops, struct rt_vcpu *svc)
+ static int
+ rt_init(struct scheduler *ops)
+ {
++const int cpu = smp_processor_id(); 
+ struct rt_private *prv = xzalloc(struct rt_private);
+ 
+ printk("Initializing RTDS scheduler\n"
+@@ -454,6 +466,8 @@ rt_init(struct scheduler *ops)
+ 
+ ops->sched_data = prv;
+ 
++init_timer(&repl_timer, repl_handler, ops, 0);
++
+ return 0;
+ 
+  no_mem:
+@@ -473,6 +487,9 @@ rt_deinit(const struct scheduler *ops)
+ xfree(_cpumask_scratch);
+ _cpumask_scratch = NULL;
+ }
++
++kill_timer(&repl_timer);
++
+ xfree(prv);
+ }
+ 
+@@ -635,6 +652,13 @@ rt_vcpu_insert(const struct scheduler *ops, struct vcpu *vc)
+ 
+ /* add rt_vcpu svc to scheduler-specific vcpu list of the dom */
+ list_add_tail(&svc->sdom_elem, &svc->sdom->vcpu);
++
++if(!timer_started)
++{   
++/* the first vcpu starts the timer for the first time*/
++timer_started = 1;
++set_timer(&repl_timer,svc->cur_deadline);
++}
+ }
+ 
+ /*
+@@ -792,44 +816,6 @@ __runq_pick(const struct scheduler *ops, const cpumask_t *mask)
+ }
+ 
+ /*
+- * Update vcpu's budget and
+- * sort runq by insert the modifed vcpu back to runq
+- * lock is grabbed before calling this function
+- */
+-static void
+-__repl_update(const struct scheduler *ops, s_time_t now)
+-{
+-struct list_head *runq = rt_runq(ops);
+-struct list_head *depletedq = rt_depletedq(ops);
+-struct list_head *iter;
+-struct list_head *tmp;
+-struct rt_vcpu *svc = NULL;
+-
+-list_for_each_safe(iter, tmp, runq)
+-{
+-svc = __q_elem(iter);
+-if ( now < svc->cur_deadline )
+-break;
+-
+-rt_update_deadline(now, svc);
+-/* reinsert the vcpu if its deadline is updated */
+-__q_remove(svc);
+-__runq_insert(ops, svc);
+-}
+-
+-list_for_each_safe(iter, tmp, depletedq)
+-{
+-svc = __q_elem(iter);
+-if ( now >= 

[Xen-devel] [PATCH V2 1/1] Improved RTDS scheduler

2015-12-31 Thread Tianyang Chen
Budget replenishment is now handled by a dedicated timer which is
triggered at the most imminent release time of all runnable vcpus.

Changes since V1:
None

Signed-off-by: Tianyang Chen 
Signed-off-by: Meng Xu 
Signed-off-by: Dagaen Golomb 
---
 xen/common/sched_rt.c |  159 +
 1 file changed, 95 insertions(+), 64 deletions(-)

diff --git a/xen/common/sched_rt.c b/xen/common/sched_rt.c
index 4372486..d522272 100644
--- a/xen/common/sched_rt.c
+++ b/xen/common/sched_rt.c
@@ -16,6 +16,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -147,6 +148,16 @@ static unsigned int nr_rt_ops;
  * Global lock is referenced by schedule_data.schedule_lock from all
  * physical cpus. It can be grabbed via vcpu_schedule_lock_irq()
  */
+
+/* dedicated timer for replenishment */
+static struct timer repl_timer;
+
+/* controls when to first start the timer*/
+static int timer_started;
+
+/* handler for the replenishment timer */
+static void repl_handler(void *data); 
+
 struct rt_private {
 spinlock_t lock;/* the global coarse grand lock */
 struct list_head sdom;  /* list of availalbe domains, used for dump */
@@ -426,6 +437,7 @@ __runq_insert(const struct scheduler *ops, struct rt_vcpu *svc)
 static int
 rt_init(struct scheduler *ops)
 {
+const int cpu = smp_processor_id(); 
 struct rt_private *prv = xzalloc(struct rt_private);
 
 printk("Initializing RTDS scheduler\n"
@@ -454,6 +466,8 @@ rt_init(struct scheduler *ops)
 
 ops->sched_data = prv;
 
+init_timer(&repl_timer, repl_handler, ops, 0);
+
 return 0;
 
  no_mem:
@@ -473,6 +487,9 @@ rt_deinit(const struct scheduler *ops)
 xfree(_cpumask_scratch);
 _cpumask_scratch = NULL;
 }
+
+kill_timer(&repl_timer);
+
 xfree(prv);
 }
 
@@ -635,6 +652,13 @@ rt_vcpu_insert(const struct scheduler *ops, struct vcpu *vc)
 
 /* add rt_vcpu svc to scheduler-specific vcpu list of the dom */
 list_add_tail(&svc->sdom_elem, &svc->sdom->vcpu);
+
+if(!timer_started)
+{   
+/* the first vcpu starts the timer for the first time*/
+timer_started = 1;
+set_timer(&repl_timer,svc->cur_deadline);
+}
 }
 
 /*
@@ -792,44 +816,6 @@ __runq_pick(const struct scheduler *ops, const cpumask_t *mask)
 }
 
 /*
- * Update vcpu's budget and
- * sort runq by insert the modifed vcpu back to runq
- * lock is grabbed before calling this function
- */
-static void
-__repl_update(const struct scheduler *ops, s_time_t now)
-{
-struct list_head *runq = rt_runq(ops);
-struct list_head *depletedq = rt_depletedq(ops);
-struct list_head *iter;
-struct list_head *tmp;
-struct rt_vcpu *svc = NULL;
-
-list_for_each_safe(iter, tmp, runq)
-{
-svc = __q_elem(iter);
-if ( now < svc->cur_deadline )
-break;
-
-rt_update_deadline(now, svc);
-/* reinsert the vcpu if its deadline is updated */
-__q_remove(svc);
-__runq_insert(ops, svc);
-}
-
-list_for_each_safe(iter, tmp, depletedq)
-{
-svc = __q_elem(iter);
-if ( now >= svc->cur_deadline )
-{
-rt_update_deadline(now, svc);
-__q_remove(svc); /* remove from depleted queue */
-__runq_insert(ops, svc); /* add to runq */
-}
-}
-}
-
-/*
  * schedule function for rt scheduler.
  * The lock is already grabbed in schedule.c, no need to lock here
  */
@@ -848,7 +834,6 @@ rt_schedule(const struct scheduler *ops, s_time_t now, bool_t tasklet_work_sched
 /* burn_budget would return for IDLE VCPU */
 burn_budget(ops, scurr, now);
 
-__repl_update(ops, now);
 
 if ( tasklet_work_scheduled )
 {
@@ -889,7 +874,7 @@ rt_schedule(const struct scheduler *ops, s_time_t now, bool_t tasklet_work_sched
 }
 }
 
-ret.time = MIN(snext->budget, MAX_SCHEDULE); /* sched quantum */
+ret.time = snext->budget; /* invoke the scheduler next time */
 ret.task = snext->vcpu;
 
 /* TRACE */
@@ -1033,10 +1018,6 @@ rt_vcpu_wake(const struct scheduler *ops, struct vcpu *vc)
 {
 struct rt_vcpu * const svc = rt_vcpu(vc);
 s_time_t now = NOW();
-struct rt_private *prv = rt_priv(ops);
-struct rt_vcpu *snext = NULL; /* highest priority on RunQ */
-struct rt_dom *sdom = NULL;
-cpumask_t *online;
 
 BUG_ON( is_idle_vcpu(vc) );
 
@@ -1074,14 +1055,7 @@ rt_vcpu_wake(const struct scheduler *ops, struct vcpu *vc)
 /* insert svc to runq/depletedq because svc is not in queue now */
 __runq_insert(ops, svc);
 
-__repl_update(ops, now);
-
-ASSERT(!list_empty(&prv->sdom));
-sdom = list_entry(prv->sdom.next, struct rt_dom, sdom_elem);
-online = cpupool_scheduler_cpumask(sdom->dom->cpupool);
-snext = __runq_pick(ops, online); /* pick snext from ALL valid cpus */
-
-runq_tickle(ops, snext);
+runq_tickle(ops, svc);
 
  

[Xen-devel] [PATCH 0/1] Improved RTDS scheduler

2015-12-31 Thread Tianyang Chen
The current RTDS scheduler is time driven and is invoked every 1ms. During each
scheduler call, repl_update() scans both the runq and the depletedq, which
might not be necessary every 1ms.

Since each vcpu is implemented as a deferrable server, budget is preserved
during its period and refilled in the next. It is not necessary to check every
1ms, as the current design does. Replenishment is only needed at the nearest
next period (nearest cur_deadline) of all runnable vcpus.

This improved design reduces scheduler invocations by using an event-driven
approach: rt_schedule() returns a value telling schedule() when the scheduler
needs to be called next. In addition, sched_rt gets one dedicated timer to
handle replenishment when necessary. In other words, budget replenishment and
the scheduling decision (rt_schedule) are separated.

Based on previous discussions between Dario, Dagaen and Meng, the improved
design is implemented/modified as follows:

rt_schedule(): picks the highest-priority runnable vcpu based on cpu affinity;
ret.time is passed back to schedule().

rt_vcpu_wake(): when a vcpu wakes up, it tickles a pcpu instead of picking one
vcpu from the runq.

rt_context_saved(): when a context switch is finished, the preempted vcpu is
put back into the runq. Picking from the runq and tickling are removed.

repl_handler(): a timer handler which is reprogrammed to fire at the nearest
vcpu deadline, to replenish vcpus on the depletedq while keeping the runq
sorted. When the replenishment is done, each replenished vcpu in the runq
should tickle a pcpu to see if it needs to preempt any running vcpus. (See the
sketch further below.)


An extra field to record the last replenishing time will be added.

schedule.c SCHEDULE_SOFTIRQ:
rt_schedule():
[spin_lock]
burn_budget(scurr)
snext = runq_pick()
[spin_unlock]


sched_rt.c TIMER_SOFTIRQ
replenishment_timer_handler()
[spin_lock]

program_timer()
[spin_unlock]

The transient behavior should be noted: it happens between the time a vcpu
tickles a pcpu and the time that pcpu actually picks it up. As per previous
discussions, this is unavoidable.
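
(A stand-alone model of the replenishment step, for illustration -
names and the refill rule are simplified stand-ins; the real handler
works on the runq/depletedq under the global scheduler lock:)

#include <stdio.h>

typedef long long s_time_t;

struct vcpu { s_time_t cur_deadline, budget, period; };

/* Replenish every vcpu whose deadline has passed, then return the
 * earliest remaining deadline so the timer can be reprogrammed to it
 * (i.e. the value the handler would pass to set_timer()). */
static s_time_t replenish(struct vcpu *v, int n, s_time_t now)
{
    s_time_t next = -1;

    for (int i = 0; i < n; i++) {
        if (now >= v[i].cur_deadline) {
            v[i].budget = v[i].period / 2;     /* stand-in refill rule */
            v[i].cur_deadline += v[i].period;  /* next release time */
        }
        if (next < 0 || v[i].cur_deadline < next)
            next = v[i].cur_deadline;
    }
    return next;
}

int main(void)
{
    struct vcpu v[] = { { 10, 0, 10 }, { 14, 0, 4 } };

    printf("next timer fire: %lld\n", replenish(v, 2, 12));
    return 0;
}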

Previous discussions:
http://lists.xenproject.org/archives/html/xen-devel/2015-06/msg02629.html

Signed-off-by: Tianyang Chen 
Signed-off-by: Meng Xu 
Signed-off-by: Dagaen Golomb 


*** BLURB HERE ***

Tianyang Chen (1):
  Improved RTDS scheduler

 -cover-letter.patch|   16 +++
 0001-Improved-RTDS-scheduler.patch |  280 
 bak|   62 
 xen/common/sched_rt.c  |  159 +++-
 4 files changed, 453 insertions(+), 64 deletions(-)
 create mode 100644 -cover-letter.patch
 create mode 100644 0001-Improved-RTDS-scheduler.patch
 create mode 100644 bak

-- 
1.7.9.5




Re: [Xen-devel] [PATCH V2 1/1] Improved RTDS scheduler

2015-12-31 Thread Meng Xu
On Thu, Dec 31, 2015 at 6:20 PM, Tianyang Chen  wrote:
>
> Budget replenishment is now handled by a dedicated timer which is
> triggered at the most imminent release time of all runnable vcpus.
>
> Changes since V1:
> None
>
> Signed-off-by: Tianyang Chen 
> Signed-off-by: Meng Xu 
> Signed-off-by: Dagaen Golomb 
> ---
>  xen/common/sched_rt.c |  159 +
>  1 file changed, 95 insertions(+), 64 deletions(-)
>
> diff --git a/xen/common/sched_rt.c b/xen/common/sched_rt.c
> index 4372486..d522272 100644
> --- a/xen/common/sched_rt.c
> +++ b/xen/common/sched_rt.c
> @@ -16,6 +16,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -147,6 +148,16 @@ static unsigned int nr_rt_ops;
>   * Global lock is referenced by schedule_data.schedule_lock from all
>   * physical cpus. It can be grabbed via vcpu_schedule_lock_irq()
>   */
> +
> +/* dedicated timer for replenishment */
> +static struct timer repl_timer;
> +
> +/* controls when to first start the timer*/


missing a space between * and / at the end of this line.


>
> +static int timer_started;
> +
> +/* handler for the replenishment timer */
> +static void repl_handler(void *data);
> +
>  struct rt_private {
>  spinlock_t lock;/* the global coarse grand lock */
>  struct list_head sdom;  /* list of availalbe domains, used for dump 
> */
> @@ -426,6 +437,7 @@ __runq_insert(const struct scheduler *ops, struct rt_vcpu *svc)
>  static int
>  rt_init(struct scheduler *ops)
>  {
> +const int cpu = smp_processor_id();


Is "cpu" actually used anywhere? If not, please remove it here.


>
>  struct rt_private *prv = xzalloc(struct rt_private);
>
>  printk("Initializing RTDS scheduler\n"
> @@ -454,6 +466,8 @@ rt_init(struct scheduler *ops)
>
>  ops->sched_data = prv;
>
> +init_timer(&repl_timer, repl_handler, ops, 0);
> +
>  return 0;
>
>   no_mem:
> @@ -473,6 +487,9 @@ rt_deinit(const struct scheduler *ops)
>  xfree(_cpumask_scratch);
>  _cpumask_scratch = NULL;
>  }
> +
> +kill_timer(&repl_timer);
> +
>  xfree(prv);
>  }
>
> @@ -635,6 +652,13 @@ rt_vcpu_insert(const struct scheduler *ops, struct vcpu *vc)
>
>  /* add rt_vcpu svc to scheduler-specific vcpu list of the dom */
>  list_add_tail(&svc->sdom_elem, &svc->sdom->vcpu);
> +
> +if(!timer_started)


A space should be added after "if".


>
> +{
> +/* the first vcpu starts the timer for the first time*/
> +timer_started = 1;
> +set_timer(&repl_timer,svc->cur_deadline);
> +}
>  }
>
>  /*
> @@ -792,44 +816,6 @@ __runq_pick(const struct scheduler *ops, const cpumask_t *mask)
>  }
>
>  /*
> - * Update vcpu's budget and
> - * sort runq by insert the modifed vcpu back to runq
> - * lock is grabbed before calling this function
> - */
> -static void
> -__repl_update(const struct scheduler *ops, s_time_t now)
> -{
> -struct list_head *runq = rt_runq(ops);
> -struct list_head *depletedq = rt_depletedq(ops);
> -struct list_head *iter;
> -struct list_head *tmp;
> -struct rt_vcpu *svc = NULL;
> -
> -list_for_each_safe(iter, tmp, runq)
> -{
> -svc = __q_elem(iter);
> -if ( now < svc->cur_deadline )
> -break;
> -
> -rt_update_deadline(now, svc);
> -/* reinsert the vcpu if its deadline is updated */
> -__q_remove(svc);
> -__runq_insert(ops, svc);
> -}
> -
> -list_for_each_safe(iter, tmp, depletedq)
> -{
> -svc = __q_elem(iter);
> -if ( now >= svc->cur_deadline )
> -{
> -rt_update_deadline(now, svc);
> -__q_remove(svc); /* remove from depleted queue */
> -__runq_insert(ops, svc); /* add to runq */
> -}
> -}
> -}
> -
> -/*
>   * schedule function for rt scheduler.
>   * The lock is already grabbed in schedule.c, no need to lock here
>   */
> @@ -848,7 +834,6 @@ rt_schedule(const struct scheduler *ops, s_time_t now, bool_t tasklet_work_sched
>  /* burn_budget would return for IDLE VCPU */
>  burn_budget(ops, scurr, now);
>
> -__repl_update(ops, now);
>
>  if ( tasklet_work_scheduled )
>  {
> @@ -889,7 +874,7 @@ rt_schedule(const struct scheduler *ops, s_time_t now, bool_t tasklet_work_sched
>  }
>  }
>
> -ret.time = MIN(snext->budget, MAX_SCHEDULE); /* sched quantum */
> +ret.time = snext->budget; /* invoke the scheduler next time */
>  ret.task = snext->vcpu;
>
>  /* TRACE */
> @@ -1033,10 +1018,6 @@ rt_vcpu_wake(const struct scheduler *ops, struct vcpu *vc)
>  {
>  struct rt_vcpu * const svc = rt_vcpu(vc);
>  s_time_t now = NOW();
> -struct rt_private *prv = rt_priv(ops);
> -struct rt_vcpu *snext = NULL; /* highest priority on RunQ */
> -struct rt_dom *sdom = NULL;
> -cpumask_t *online;
>
>  BUG_ON( 

[Xen-devel] [PATCH V2 0/1] Improved RTDS scheduler

2015-12-31 Thread Tianyang Chen
Changes since V1:
Removed redundant cover letter
Removed redundant patch file that was added with the last commit
Please disregard V1, as it was in the wrong format. Sorry about that.

The current RTDS scheduler is time driven and is invoked every 1ms. During each
scheduler call, repl_update() scans both the runq and the depletedq, which
might not be necessary every 1ms.

Since each vcpu is implemented as a deferrable server, budget is preserved
during its period and refilled in the next. It is not necessary to check every
1ms, as the current design does. Replenishment is only needed at the nearest
next period (nearest cur_deadline) of all runnable vcpus.

This improved design reduces scheduler invocations by using an event-driven
approach: rt_schedule() returns a value telling schedule() when the scheduler
needs to be called next. In addition, sched_rt gets one dedicated timer to
handle replenishment when necessary. In other words, budget replenishment and
the scheduling decision (rt_schedule) are separated.

Based on previous discussions between Dario, Dagaen and Meng, the improved
design is implemented/modified as follows:

rt_schedule(): picks the highest-priority runnable vcpu based on cpu affinity;
ret.time is passed back to schedule().

rt_vcpu_wake(): when a vcpu wakes up, it tickles a pcpu instead of picking one
vcpu from the runq.

rt_context_saved(): when a context switch is finished, the preempted vcpu is
put back into the runq. Picking from the runq and tickling are removed.

repl_handler(): a timer handler which is reprogrammed to fire at the nearest
vcpu deadline, to replenish vcpus on the runq and depletedq. When the
replenishment is done, each replenished vcpu in the runq should tickle a pcpu
to see if it needs to preempt any running vcpus. (A simplified sketch of this
decision follows below.)
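
(Sketch of the tickle decision described above - purely illustrative
with made-up types; the real code compares deadlines against what each
pcpu is currently running, under the global lock:)

#include <stdbool.h>
#include <stdio.h>

struct vcpu { long long cur_deadline; };

/* EDF-style check: a freshly replenished vcpu should tickle a pcpu
 * iff its deadline is earlier than that of the vcpu running there. */
static bool should_tickle(const struct vcpu *replenished,
                          const struct vcpu *running)
{
    return !running || replenished->cur_deadline < running->cur_deadline;
}

int main(void)
{
    struct vcpu woke = { 20 }, curr = { 35 };

    printf("tickle: %d\n", should_tickle(&woke, &curr));
    return 0;
}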


schedule.c SCHEDULE_SOFTIRQ:
rt_schedule():
[spin_lock]
burn_budget(scurr)
snext = runq_pick()
[spin_unlock]


sched_rt.c TIMER_SOFTIRQ
replenishment_timer_handler()
[spin_lock]

program_timer()
[spin_unlock]

The transient behavior should be noted: it happens between the time a vcpu
tickles a pcpu and the time that pcpu actually picks it up. As per previous
discussions, this is unavoidable.

Previous discussions:
http://lists.xenproject.org/archives/html/xen-devel/2015-06/msg02629.html

Signed-off-by: Tianyang Chen 
Signed-off-by: Meng Xu 
Signed-off-by: Dagaen Golomb 


Tianyang Chen (1):
  Improved RTDS scheduler

 xen/common/sched_rt.c |  159 +
 1 file changed, 95 insertions(+), 64 deletions(-)

-- 
1.7.9.5




[Xen-devel] [distros-debian-wheezy test] 38576: all pass

2015-12-31 Thread Platform Team regression test user
flight 38576 distros-debian-wheezy real [real]
http://osstest.xs.citrite.net/~osstest/testlogs/logs/38576/

Perfect :-)
All tests in this flight passed
baseline version:
 flight   38557

jobs:
 build-amd64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-pvopspass
 build-armhf-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-amd64-wheezy-netboot-pvgrub pass
 test-amd64-i386-i386-wheezy-netboot-pvgrub   pass
 test-amd64-i386-amd64-wheezy-netboot-pygrub  pass
 test-amd64-amd64-i386-wheezy-netboot-pygrub  pass



sg-report-flight on osstest.xs.citrite.net
logs: /home/osstest/logs
images: /home/osstest/images

Logs, config files, etc. are available at
http://osstest.xs.citrite.net/~osstest/testlogs/logs

Test harness code can be found at
http://xenbits.xensource.com/gitweb?p=osstest.git;a=summary


Push not applicable.




[Xen-devel] [PATCH v2 31/32] sh: support a 2-byte smp_store_mb

2015-12-31 Thread Michael S. Tsirkin
At the moment, xchg on sh only supports 4- and 1-byte values, so using
it from smp_store_mb means attempts to store a 2-byte value using this
macro fail.

And that happens to be exactly what virtio drivers want to do.

Check size and fall back to a slower, but safe, WRITE_ONCE+smp_mb.

Signed-off-by: Michael S. Tsirkin 
---
 arch/sh/include/asm/barrier.h | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/arch/sh/include/asm/barrier.h b/arch/sh/include/asm/barrier.h
index f887c64..0cc5735 100644
--- a/arch/sh/include/asm/barrier.h
+++ b/arch/sh/include/asm/barrier.h
@@ -32,7 +32,15 @@
 #define ctrl_barrier() __asm__ __volatile__ ("nop;nop;nop;nop;nop;nop;nop;nop")
 #endif
 
-#define __smp_store_mb(var, value) do { (void)xchg(&var, value); } while (0)
+#define __smp_store_mb(var, value) do { \
+   if (sizeof(var) != 4 && sizeof(var) != 1) { \
+WRITE_ONCE(var, value); \
+   __smp_mb(); \
+   } else { \
+   (void)xchg(&var, value);  \
+   } \
+} while (0)
+
 #define smp_store_mb(var, value) __smp_store_mb(var, value)
 
 #include 
-- 
MST




[Xen-devel] [PATCH v2 32/32] virtio_ring: use virt_store_mb

2015-12-31 Thread Michael S. Tsirkin
We need a full barrier after writing out the event index; using
virt_store_mb there seems better than open-coding it.  As usual, we
need a wrapper to account for strong barriers.

It's tempting to use this in vhost as well; for that, we'll
need a variant of smp_store_mb that works on __user pointers.

Signed-off-by: Michael S. Tsirkin 
---
 include/linux/virtio_ring.h  | 12 
 drivers/virtio/virtio_ring.c | 15 +--
 2 files changed, 21 insertions(+), 6 deletions(-)

diff --git a/include/linux/virtio_ring.h b/include/linux/virtio_ring.h
index f3fa55b..3a74d91 100644
--- a/include/linux/virtio_ring.h
+++ b/include/linux/virtio_ring.h
@@ -45,6 +45,18 @@ static inline void virtio_wmb(bool weak_barriers)
wmb();
 }
 
+static inline void virtio_store_mb(bool weak_barriers,
+  __virtio16 *p, __virtio16 v)
+{
+   if (weak_barriers)
+   virt_store_mb(*p, v);
+   else
+   {
+   WRITE_ONCE(*p, v);
+   mb();
+   }
+}
+
 struct virtio_device;
 struct virtqueue;
 
diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index ee663c4..e12e385 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -517,10 +517,10 @@ void *virtqueue_get_buf(struct virtqueue *_vq, unsigned int *len)
/* If we expect an interrupt for the next entry, tell host
 * by writing event index and flush out the write before
 * the read in the next get_buf call. */
-   if (!(vq->avail_flags_shadow & VRING_AVAIL_F_NO_INTERRUPT)) {
-   vring_used_event(&vq->vring) = cpu_to_virtio16(_vq->vdev, vq->last_used_idx);
-   virtio_mb(vq->weak_barriers);
-   }
+   if (!(vq->avail_flags_shadow & VRING_AVAIL_F_NO_INTERRUPT))
+   virtio_store_mb(vq->weak_barriers,
+   &vring_used_event(&vq->vring),
+   cpu_to_virtio16(_vq->vdev, vq->last_used_idx));
 
 #ifdef DEBUG
vq->last_add_time_valid = false;
@@ -653,8 +653,11 @@ bool virtqueue_enable_cb_delayed(struct virtqueue *_vq)
}
/* TODO: tune this threshold */
bufs = (u16)(vq->avail_idx_shadow - vq->last_used_idx) * 3 / 4;
-   vring_used_event(&vq->vring) = cpu_to_virtio16(_vq->vdev, vq->last_used_idx + bufs);
-   virtio_mb(vq->weak_barriers);
+
+   virtio_store_mb(vq->weak_barriers,
+   &vring_used_event(&vq->vring),
+   cpu_to_virtio16(_vq->vdev, vq->last_used_idx + bufs));
+
if (unlikely((u16)(virtio16_to_cpu(_vq->vdev, vq->vring.used->idx) - vq->last_used_idx) > bufs)) {
END_USE(vq);
return false;
-- 
MST




[Xen-devel] [PATCH v2 24/32] sparc: define __smp_xxx

2015-12-31 Thread Michael S. Tsirkin
This defines __smp_xxx barriers for sparc,
for use by virtualization.

smp_xxx barriers are removed as they are
defined correctly by asm-generic/barriers.h

Signed-off-by: Michael S. Tsirkin 
Acked-by: Arnd Bergmann 
---
 arch/sparc/include/asm/barrier_64.h | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/sparc/include/asm/barrier_64.h b/arch/sparc/include/asm/barrier_64.h
index 26c3f72..c9f6ee6 100644
--- a/arch/sparc/include/asm/barrier_64.h
+++ b/arch/sparc/include/asm/barrier_64.h
@@ -37,14 +37,14 @@ do {__asm__ __volatile__("ba,pt %%xcc, 1f\n\t" \
 #define rmb()  __asm__ __volatile__("":::"memory")
 #define wmb()  __asm__ __volatile__("":::"memory")
 
-#define smp_store_release(p, v)    \
+#define __smp_store_release(p, v)  \
 do {   \
compiletime_assert_atomic_type(*p); \
barrier();  \
WRITE_ONCE(*p, v);  \
 } while (0)
 
-#define smp_load_acquire(p)\
+#define __smp_load_acquire(p)  \
 ({ \
typeof(*p) ___p1 = READ_ONCE(*p);   \
compiletime_assert_atomic_type(*p); \
@@ -52,8 +52,8 @@ do { \
___p1;  \
 })
 
-#define smp_mb__before_atomic()barrier()
-#define smp_mb__after_atomic() barrier()
+#define __smp_mb__before_atomic()  barrier()
+#define __smp_mb__after_atomic()   barrier()
 
 #include 
 
-- 
MST




[Xen-devel] [PATCH v2 21/32] mips: define __smp_xxx

2015-12-31 Thread Michael S. Tsirkin
This defines __smp_xxx barriers for mips,
for use by virtualization.

smp_xxx barriers are removed as they are
defined correctly by asm-generic/barriers.h

Note: the only exception is smp_mb__before_llsc which is mips-specific.
We define both the __smp_mb__before_llsc variant (for use in
asm/barriers.h) and smp_mb__before_llsc (for use elsewhere on this
architecture).

Signed-off-by: Michael S. Tsirkin 
Acked-by: Arnd Bergmann 
---
 arch/mips/include/asm/barrier.h | 26 ++
 1 file changed, 14 insertions(+), 12 deletions(-)

diff --git a/arch/mips/include/asm/barrier.h b/arch/mips/include/asm/barrier.h
index 3eac4b9..d296633 100644
--- a/arch/mips/include/asm/barrier.h
+++ b/arch/mips/include/asm/barrier.h
@@ -85,20 +85,20 @@
 #define wmb()  fast_wmb()
 #define rmb()  fast_rmb()
 
-#if defined(CONFIG_WEAK_ORDERING) && defined(CONFIG_SMP)
+#if defined(CONFIG_WEAK_ORDERING)
 # ifdef CONFIG_CPU_CAVIUM_OCTEON
-#  define smp_mb() __sync()
-#  define smp_rmb()barrier()
-#  define smp_wmb()__syncw()
+#  define __smp_mb()   __sync()
+#  define __smp_rmb()  barrier()
+#  define __smp_wmb()  __syncw()
 # else
-#  define smp_mb() __asm__ __volatile__("sync" : : :"memory")
-#  define smp_rmb()__asm__ __volatile__("sync" : : :"memory")
-#  define smp_wmb()__asm__ __volatile__("sync" : : :"memory")
+#  define __smp_mb()   __asm__ __volatile__("sync" : : :"memory")
+#  define __smp_rmb()  __asm__ __volatile__("sync" : : :"memory")
+#  define __smp_wmb()  __asm__ __volatile__("sync" : : :"memory")
 # endif
 #else
-#define smp_mb()   barrier()
-#define smp_rmb()  barrier()
-#define smp_wmb()  barrier()
+#define __smp_mb() barrier()
+#define __smp_rmb()barrier()
+#define __smp_wmb()barrier()
 #endif
 
 #if defined(CONFIG_WEAK_REORDERING_BEYOND_LLSC) && defined(CONFIG_SMP)
@@ -111,6 +111,7 @@
 
 #ifdef CONFIG_CPU_CAVIUM_OCTEON
 #define smp_mb__before_llsc() smp_wmb()
+#define __smp_mb__before_llsc() __smp_wmb()
 /* Cause previous writes to become visible on all CPUs as soon as possible */
 #define nudge_writes() __asm__ __volatile__(".set push\n\t"\
".set arch=octeon\n\t"  \
@@ -118,11 +119,12 @@
".set pop" : : : "memory")
 #else
 #define smp_mb__before_llsc() smp_llsc_mb()
+#define __smp_mb__before_llsc() smp_llsc_mb()
 #define nudge_writes() mb()
 #endif
 
-#define smp_mb__before_atomic()smp_mb__before_llsc()
-#define smp_mb__after_atomic() smp_llsc_mb()
+#define __smp_mb__before_atomic()  __smp_mb__before_llsc()
+#define __smp_mb__after_atomic()   smp_llsc_mb()
 
 #include 
 
-- 
MST




[Xen-devel] [PATCH v2 22/32] s390: define __smp_xxx

2015-12-31 Thread Michael S. Tsirkin
This defines __smp_xxx barriers for s390,
for use by virtualization.

Some smp_xxx barriers are removed as they are
defined correctly by asm-generic/barriers.h

Note: smp_mb, smp_rmb and smp_wmb are defined as full barriers
unconditionally on this architecture.

Signed-off-by: Michael S. Tsirkin 
Acked-by: Arnd Bergmann 
---
 arch/s390/include/asm/barrier.h | 15 +--
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/arch/s390/include/asm/barrier.h b/arch/s390/include/asm/barrier.h
index c358c31..fbd25b2 100644
--- a/arch/s390/include/asm/barrier.h
+++ b/arch/s390/include/asm/barrier.h
@@ -26,18 +26,21 @@
 #define wmb()  barrier()
 #define dma_rmb()  mb()
 #define dma_wmb()  mb()
-#define smp_mb()   mb()
-#define smp_rmb()  rmb()
-#define smp_wmb()  wmb()
-
-#define smp_store_release(p, v)    \
+#define __smp_mb() mb()
+#define __smp_rmb()rmb()
+#define __smp_wmb()wmb()
+#define smp_mb()   __smp_mb()
+#define smp_rmb()  __smp_rmb()
+#define smp_wmb()  __smp_wmb()
+
+#define __smp_store_release(p, v)  \
 do {   \
compiletime_assert_atomic_type(*p); \
barrier();  \
WRITE_ONCE(*p, v);  \
 } while (0)
 
-#define smp_load_acquire(p)\
+#define __smp_load_acquire(p)  \
 ({ \
typeof(*p) ___p1 = READ_ONCE(*p);   \
compiletime_assert_atomic_type(*p); \
-- 
MST




[Xen-devel] [PATCH v2 28/32] asm-generic: implement virt_xxx memory barriers

2015-12-31 Thread Michael S. Tsirkin
Guests running within virtual machines might be affected by SMP effects even if
the guest itself is compiled without SMP support.  This is an artifact of
interfacing with an SMP host while running an UP kernel.  Using mandatory
barriers for this use-case would be possible but is often suboptimal.

In particular, virtio uses a bunch of confusing ifdefs to work around
this, while xen just uses the mandatory barriers.

To better handle this case, low-level virt_mb() etc macros are made available.
These are implemented trivially using the low-level __smp_xxx macros,
the purpose of these wrappers is to annotate those specific cases.

These have the same effect as smp_mb() etc when SMP is enabled, but generate
identical code for SMP and non-SMP systems. For example, virtual machine guests
should use virt_mb() rather than smp_mb() when synchronizing against a
(possibly SMP) host.
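
(Usage sketch, not part of the patch: a UP guest publishing a ring
index to a possibly-SMP host. virt_store_mb is emulated here with C11
atomics so the example compiles stand-alone; in-kernel code would get
it from asm-generic/barrier.h, and the ring layout is made up.)

#include <stdatomic.h>
#include <stdio.h>

/* Stand-alone stand-in for the kernel macro of the same name. */
#define virt_store_mb(var, value) \
    atomic_store_explicit(&(var), (value), memory_order_seq_cst)

struct ring { char buf[64]; _Atomic unsigned short idx; };

static void publish(struct ring *r, char byte)
{
    unsigned short i = atomic_load_explicit(&r->idx, memory_order_relaxed);

    r->buf[i % 64] = byte;
    /* Even a !CONFIG_SMP guest needs this ordering: the host may
     * observe buf and idx from another physical CPU. */
    virt_store_mb(r->idx, (unsigned short)(i + 1));
}

int main(void)
{
    struct ring r = { { 0 } };

    publish(&r, 'x');
    printf("idx=%u\n", (unsigned)atomic_load(&r.idx));
    return 0;
}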

Suggested-by: David Miller 
Signed-off-by: Michael S. Tsirkin 
---
 include/asm-generic/barrier.h | 11 +++
 Documentation/memory-barriers.txt | 28 +++-
 2 files changed, 34 insertions(+), 5 deletions(-)

diff --git a/include/asm-generic/barrier.h b/include/asm-generic/barrier.h
index 8752964..1cceca14 100644
--- a/include/asm-generic/barrier.h
+++ b/include/asm-generic/barrier.h
@@ -196,5 +196,16 @@ do { \
 
 #endif
 
+/* Barriers for virtual machine guests when talking to an SMP host */
+#define virt_mb() __smp_mb()
+#define virt_rmb() __smp_rmb()
+#define virt_wmb() __smp_wmb()
+#define virt_read_barrier_depends() __smp_read_barrier_depends()
+#define virt_store_mb(var, value) __smp_store_mb(var, value)
+#define virt_mb__before_atomic() __smp_mb__before_atomic()
+#define virt_mb__after_atomic() __smp_mb__after_atomic()
+#define virt_store_release(p, v) __smp_store_release(p, v)
+#define virt_load_acquire(p) __smp_load_acquire(p)
+
 #endif /* !__ASSEMBLY__ */
 #endif /* __ASM_GENERIC_BARRIER_H */
diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt
index aef9487..8f4a93a 100644
--- a/Documentation/memory-barriers.txt
+++ b/Documentation/memory-barriers.txt
@@ -1655,17 +1655,18 @@ macro is a good place to start looking.
 SMP memory barriers are reduced to compiler barriers on uniprocessor compiled
 systems because it is assumed that a CPU will appear to be self-consistent,
 and will order overlapping accesses correctly with respect to itself.
+However, see the subsection on "Virtual Machine Guests" below.
 
 [!] Note that SMP memory barriers _must_ be used to control the ordering of
 references to shared memory on SMP systems, though the use of locking instead
 is sufficient.
 
 Mandatory barriers should not be used to control SMP effects, since mandatory
-barriers unnecessarily impose overhead on UP systems. They may, however, be
-used to control MMIO effects on accesses through relaxed memory I/O windows.
-These are required even on non-SMP systems as they affect the order in which
-memory operations appear to a device by prohibiting both the compiler and the
-CPU from reordering them.
+barriers impose unnecessary overhead on both SMP and UP systems. They may,
+however, be used to control MMIO effects on accesses through relaxed memory I/O
+windows.  These barriers are required even on non-SMP systems as they affect
+the order in which memory operations appear to a device by prohibiting both the
+compiler and the CPU from reordering them.
 
 
 There are some more advanced barrier functions:
@@ -2948,6 +2949,23 @@ The Alpha defines the Linux kernel's memory barrier model.
 
 See the subsection on "Cache Coherency" above.
 
+VIRTUAL MACHINE GUESTS
+----------------------
+
+Guests running within virtual machines might be affected by SMP effects even if
+the guest itself is compiled without SMP support.  This is an artifact of
+interfacing with an SMP host while running an UP kernel.  Using mandatory
+barriers for this use-case would be possible but is often suboptimal.
+
+To handle this case optimally, low-level virt_mb() etc macros are available.
+These have the same effect as smp_mb() etc when SMP is enabled, but generate
+identical code for SMP and non-SMP systems. For example, virtual machine guests
+should use virt_mb() rather than smp_mb() when synchronizing against a
+(possibly SMP) host.
+
+These are equivalent to smp_mb() etc counterparts in all other respects,
+in particular, they do not control MMIO effects: to control
+MMIO effects, use mandatory barriers.
 
 
 EXAMPLE USES
-- 
MST




[Xen-devel] [PATCH v2 29/32] Revert "virtio_ring: Update weak barriers to use dma_wmb/rmb"

2015-12-31 Thread Michael S. Tsirkin
This reverts commit 9e1a27ea42691429e31f158cce6fc61bc79bb2e9.

While that commit optimizes !CONFIG_SMP, it mixes
up DMA and SMP concepts, making the code hard
to figure out.

A better way to optimize this is with the new __smp_XXX
barriers.

As a first step, go back to full rmb/wmb barriers
for !SMP.
We switch to __smp_XXX barriers in the next patch.

Cc: Peter Zijlstra 
Cc: Alexander Duyck 
Signed-off-by: Michael S. Tsirkin 
---
 include/linux/virtio_ring.h | 23 +++
 1 file changed, 19 insertions(+), 4 deletions(-)

diff --git a/include/linux/virtio_ring.h b/include/linux/virtio_ring.h
index 8e50888..67e06fe 100644
--- a/include/linux/virtio_ring.h
+++ b/include/linux/virtio_ring.h
@@ -21,20 +21,19 @@
  * actually quite cheap.
  */
 
+#ifdef CONFIG_SMP
 static inline void virtio_mb(bool weak_barriers)
 {
-#ifdef CONFIG_SMP
if (weak_barriers)
smp_mb();
else
-#endif
mb();
 }
 
 static inline void virtio_rmb(bool weak_barriers)
 {
if (weak_barriers)
-   dma_rmb();
+   smp_rmb();
else
rmb();
 }
@@ -42,10 +41,26 @@ static inline void virtio_rmb(bool weak_barriers)
 static inline void virtio_wmb(bool weak_barriers)
 {
if (weak_barriers)
-   dma_wmb();
+   smp_wmb();
else
wmb();
 }
+#else
+static inline void virtio_mb(bool weak_barriers)
+{
+   mb();
+}
+
+static inline void virtio_rmb(bool weak_barriers)
+{
+   rmb();
+}
+
+static inline void virtio_wmb(bool weak_barriers)
+{
+   wmb();
+}
+#endif
 
 struct virtio_device;
 struct virtqueue;
-- 
MST




[Xen-devel] [PATCH v2 30/32] virtio_ring: update weak barriers to use __smp_XXX

2015-12-31 Thread Michael S. Tsirkin
virtio ring uses smp_wmb on SMP and wmb on !SMP,
the reason for the latter being that it might be
talking to another kernel on the same SMP machine.

This is exactly what __smp_XXX barriers do,
so switch to these instead of homegrown ifdef hacks.

Cc: Peter Zijlstra 
Cc: Alexander Duyck 
Signed-off-by: Michael S. Tsirkin 
---
 include/linux/virtio_ring.h | 25 -
 1 file changed, 4 insertions(+), 21 deletions(-)

diff --git a/include/linux/virtio_ring.h b/include/linux/virtio_ring.h
index 67e06fe..f3fa55b 100644
--- a/include/linux/virtio_ring.h
+++ b/include/linux/virtio_ring.h
@@ -12,7 +12,7 @@
  * anyone care?
  *
  * For virtio_pci on SMP, we don't need to order with respect to MMIO
- * accesses through relaxed memory I/O windows, so smp_mb() et al are
+ * accesses through relaxed memory I/O windows, so virt_mb() et al are
  * sufficient.
  *
  * For using virtio to talk to real devices (eg. other heterogeneous
@@ -21,11 +21,10 @@
  * actually quite cheap.
  */
 
-#ifdef CONFIG_SMP
 static inline void virtio_mb(bool weak_barriers)
 {
if (weak_barriers)
-   smp_mb();
+   virt_mb();
else
mb();
 }
@@ -33,7 +32,7 @@ static inline void virtio_mb(bool weak_barriers)
 static inline void virtio_rmb(bool weak_barriers)
 {
if (weak_barriers)
-   smp_rmb();
+   virt_rmb();
else
rmb();
 }
@@ -41,26 +40,10 @@ static inline void virtio_rmb(bool weak_barriers)
 static inline void virtio_wmb(bool weak_barriers)
 {
if (weak_barriers)
-   smp_wmb();
+   virt_wmb();
else
wmb();
 }
-#else
-static inline void virtio_mb(bool weak_barriers)
-{
-   mb();
-}
-
-static inline void virtio_rmb(bool weak_barriers)
-{
-   rmb();
-}
-
-static inline void virtio_wmb(bool weak_barriers)
-{
-   wmb();
-}
-#endif
 
 struct virtio_device;
 struct virtqueue;
-- 
MST




[Xen-devel] [PATCH v2 27/32] x86: define __smp_xxx

2015-12-31 Thread Michael S. Tsirkin
This defines __smp_xxx barriers for x86,
for use by virtualization.

smp_xxx barriers are removed as they are
defined correctly by asm-generic/barriers.h

Signed-off-by: Michael S. Tsirkin 
Acked-by: Arnd Bergmann 
---
 arch/x86/include/asm/barrier.h | 31 ---
 1 file changed, 12 insertions(+), 19 deletions(-)

diff --git a/arch/x86/include/asm/barrier.h b/arch/x86/include/asm/barrier.h
index cc4c2a7..a584e1c 100644
--- a/arch/x86/include/asm/barrier.h
+++ b/arch/x86/include/asm/barrier.h
@@ -31,17 +31,10 @@
 #endif
 #define dma_wmb()  barrier()
 
-#ifdef CONFIG_SMP
-#define smp_mb()   mb()
-#define smp_rmb()  dma_rmb()
-#define smp_wmb()  barrier()
-#define smp_store_mb(var, value) do { (void)xchg(&var, value); } while (0)
-#else /* !SMP */
-#define smp_mb()   barrier()
-#define smp_rmb()  barrier()
-#define smp_wmb()  barrier()
-#define smp_store_mb(var, value) do { WRITE_ONCE(var, value); barrier(); } while (0)
-#endif /* SMP */
+#define __smp_mb() mb()
+#define __smp_rmb()dma_rmb()
+#define __smp_wmb()barrier()
+#define __smp_store_mb(var, value) do { (void)xchg(&var, value); } while (0)
 
 #if defined(CONFIG_X86_PPRO_FENCE)
 
@@ -50,31 +43,31 @@
  * model and we should fall back to full barriers.
  */
 
-#define smp_store_release(p, v)    \
+#define __smp_store_release(p, v)  \
 do {   \
compiletime_assert_atomic_type(*p); \
-   smp_mb();   \
+   __smp_mb(); \
WRITE_ONCE(*p, v);  \
 } while (0)
 
-#define smp_load_acquire(p)\
+#define __smp_load_acquire(p)  \
 ({ \
typeof(*p) ___p1 = READ_ONCE(*p);   \
compiletime_assert_atomic_type(*p); \
-   smp_mb();   \
+   __smp_mb(); \
___p1;  \
 })
 
 #else /* regular x86 TSO memory ordering */
 
-#define smp_store_release(p, v)    \
+#define __smp_store_release(p, v)  \
 do {   \
compiletime_assert_atomic_type(*p); \
barrier();  \
WRITE_ONCE(*p, v);  \
 } while (0)
 
-#define smp_load_acquire(p)\
+#define __smp_load_acquire(p)  \
 ({ \
typeof(*p) ___p1 = READ_ONCE(*p);   \
compiletime_assert_atomic_type(*p); \
@@ -85,8 +78,8 @@ do { \
 #endif
 
 /* Atomic operations are already serializing on x86 */
-#define smp_mb__before_atomic()barrier()
-#define smp_mb__after_atomic() barrier()
+#define __smp_mb__before_atomic()  barrier()
+#define __smp_mb__after_atomic()   barrier()
 
 #include 
 
-- 
MST




[Xen-devel] [PATCH v2 26/32] xtensa: define __smp_xxx

2015-12-31 Thread Michael S. Tsirkin
This defines __smp_xxx barriers for xtensa,
for use by virtualization.

smp_xxx barriers are removed as they are
defined correctly by asm-generic/barriers.h

Signed-off-by: Michael S. Tsirkin 
Acked-by: Arnd Bergmann 
---
 arch/xtensa/include/asm/barrier.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/xtensa/include/asm/barrier.h b/arch/xtensa/include/asm/barrier.h
index 5b88774..956596e 100644
--- a/arch/xtensa/include/asm/barrier.h
+++ b/arch/xtensa/include/asm/barrier.h
@@ -13,8 +13,8 @@
 #define rmb() barrier()
 #define wmb() mb()
 
-#define smp_mb__before_atomic()barrier()
-#define smp_mb__after_atomic() barrier()
+#define __smp_mb__before_atomic()  barrier()
+#define __smp_mb__after_atomic()   barrier()
 
 #include 
 
-- 
MST




[Xen-devel] [PATCH v2 13/32] x86: reuse asm-generic/barrier.h

2015-12-31 Thread Michael S. Tsirkin
As on most architectures, on x86 read_barrier_depends and
smp_read_barrier_depends are empty.  Drop the local definitions and pull
the generic ones from asm-generic/barrier.h instead: they are identical.

This is in preparation to refactoring this code area.

Signed-off-by: Michael S. Tsirkin 
Acked-by: Arnd Bergmann 
---
 arch/x86/include/asm/barrier.h | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/barrier.h b/arch/x86/include/asm/barrier.h
index 0681d25..cc4c2a7 100644
--- a/arch/x86/include/asm/barrier.h
+++ b/arch/x86/include/asm/barrier.h
@@ -43,9 +43,6 @@
#define smp_store_mb(var, value) do { WRITE_ONCE(var, value); barrier(); } while (0)
 #endif /* SMP */
 
-#define read_barrier_depends() do { } while (0)
-#define smp_read_barrier_depends() do { } while (0)
-
 #if defined(CONFIG_X86_PPRO_FENCE)
 
 /*
@@ -91,4 +88,6 @@ do { \
 #define smp_mb__before_atomic()barrier()
 #define smp_mb__after_atomic() barrier()
 
+#include 
+
 #endif /* _ASM_X86_BARRIER_H */
-- 
MST




[Xen-devel] [PATCH v2 12/32] x86/um: reuse asm-generic/barrier.h

2015-12-31 Thread Michael S. Tsirkin
On x86/um CONFIG_SMP is never defined.  As a result, several macros
match the asm-generic variant exactly. Drop the local definitions and
pull in asm-generic/barrier.h instead.

This is in preparation to refactoring this code area.

Signed-off-by: Michael S. Tsirkin 
Acked-by: Arnd Bergmann 
---
 arch/x86/um/asm/barrier.h | 9 +
 1 file changed, 1 insertion(+), 8 deletions(-)

diff --git a/arch/x86/um/asm/barrier.h b/arch/x86/um/asm/barrier.h
index 755481f..174781a 100644
--- a/arch/x86/um/asm/barrier.h
+++ b/arch/x86/um/asm/barrier.h
@@ -36,13 +36,6 @@
 #endif /* CONFIG_X86_PPRO_FENCE */
 #define dma_wmb()  barrier()
 
-#define smp_mb()   barrier()
-#define smp_rmb()  barrier()
-#define smp_wmb()  barrier()
-
-#define smp_store_mb(var, value) do { WRITE_ONCE(var, value); barrier(); } while (0)
-
-#define read_barrier_depends() do { } while (0)
-#define smp_read_barrier_depends() do { } while (0)
+#include 
 
 #endif
-- 
MST




[Xen-devel] [PATCH v2 17/32] arm: define __smp_xxx

2015-12-31 Thread Michael S. Tsirkin
This defines __smp_xxx barriers for arm,
for use by virtualization.

smp_xxx barriers are removed as they are
defined correctly by asm-generic/barriers.h

This reduces the amount of arch-specific boiler-plate code.

Signed-off-by: Michael S. Tsirkin 
Acked-by: Arnd Bergmann 
---
 arch/arm/include/asm/barrier.h | 12 +++-
 1 file changed, 3 insertions(+), 9 deletions(-)

diff --git a/arch/arm/include/asm/barrier.h b/arch/arm/include/asm/barrier.h
index 31152e8..112cc1a 100644
--- a/arch/arm/include/asm/barrier.h
+++ b/arch/arm/include/asm/barrier.h
@@ -60,15 +60,9 @@ extern void arm_heavy_mb(void);
 #define dma_wmb()  barrier()
 #endif
 
-#ifndef CONFIG_SMP
-#define smp_mb()   barrier()
-#define smp_rmb()  barrier()
-#define smp_wmb()  barrier()
-#else
-#define smp_mb()   dmb(ish)
-#define smp_rmb()  smp_mb()
-#define smp_wmb()  dmb(ishst)
-#endif
+#define __smp_mb() dmb(ish)
+#define __smp_rmb()__smp_mb()
+#define __smp_wmb()dmb(ishst)
 
 #include 
 
-- 
MST




[Xen-devel] [PATCH v2 10/32] metag: reuse asm-generic/barrier.h

2015-12-31 Thread Michael S. Tsirkin
On metag dma_rmb, dma_wmb, smp_store_mb, read_barrier_depends,
smp_read_barrier_depends, smp_store_release and smp_load_acquire  match
the asm-generic variants exactly. Drop the local definitions and pull in
asm-generic/barrier.h instead.

This is in preparation to refactoring this code area.

Signed-off-by: Michael S. Tsirkin 
Acked-by: Arnd Bergmann 
---
 arch/metag/include/asm/barrier.h | 25 ++---
 1 file changed, 2 insertions(+), 23 deletions(-)

diff --git a/arch/metag/include/asm/barrier.h b/arch/metag/include/asm/barrier.h
index 172b7e5..b5b778b 100644
--- a/arch/metag/include/asm/barrier.h
+++ b/arch/metag/include/asm/barrier.h
@@ -44,9 +44,6 @@ static inline void wr_fence(void)
 #define rmb()  barrier()
 #define wmb()  mb()
 
-#define dma_rmb()  rmb()
-#define dma_wmb()  wmb()
-
 #ifndef CONFIG_SMP
 #define fence()do { } while (0)
 #define smp_mb()barrier()
@@ -81,27 +78,9 @@ static inline void fence(void)
 #endif
 #endif
 
-#define read_barrier_depends() do { } while (0)
-#define smp_read_barrier_depends() do { } while (0)
-
-#define smp_store_mb(var, value) do { WRITE_ONCE(var, value); smp_mb(); } 
while (0)
-
-#define smp_store_release(p, v)
\
-do {   \
-   compiletime_assert_atomic_type(*p); \
-   smp_mb();   \
-   WRITE_ONCE(*p, v);  \
-} while (0)
-
-#define smp_load_acquire(p)\
-({ \
-   typeof(*p) ___p1 = READ_ONCE(*p);   \
-   compiletime_assert_atomic_type(*p); \
-   smp_mb();   \
-   ___p1;  \
-})
-
 #define smp_mb__before_atomic()barrier()
 #define smp_mb__after_atomic() barrier()
 
+#include <asm-generic/barrier.h>
+
 #endif /* _ASM_METAG_BARRIER_H */
-- 
MST


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH v2 18/32] blackfin: define __smp_xxx

2015-12-31 Thread Michael S. Tsirkin
This defines __smp_xxx barriers for blackfin,
for use by virtualization.

smp_xxx barriers are removed as they are
defined correctly by asm-generic/barrier.h

Signed-off-by: Michael S. Tsirkin 
Acked-by: Arnd Bergmann 
---
 arch/blackfin/include/asm/barrier.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/blackfin/include/asm/barrier.h 
b/arch/blackfin/include/asm/barrier.h
index dfb66fe..7cca51c 100644
--- a/arch/blackfin/include/asm/barrier.h
+++ b/arch/blackfin/include/asm/barrier.h
@@ -78,8 +78,8 @@
 
 #endif /* !CONFIG_SMP */
 
-#define smp_mb__before_atomic()barrier()
-#define smp_mb__after_atomic() barrier()
+#define __smp_mb__before_atomic()  barrier()
+#define __smp_mb__after_atomic()   barrier()
 
 #include <asm-generic/barrier.h>
 
-- 
MST


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH v2 19/32] ia64: define __smp_xxx

2015-12-31 Thread Michael S. Tsirkin
This defines __smp_xxx barriers for ia64,
for use by virtualization.

smp_xxx barriers are removed as they are
defined correctly by asm-generic/barrier.h

This reduces the amount of arch-specific boiler-plate code.

Signed-off-by: Michael S. Tsirkin 
Acked-by: Tony Luck 
Acked-by: Arnd Bergmann 
---
 arch/ia64/include/asm/barrier.h | 14 +-
 1 file changed, 5 insertions(+), 9 deletions(-)

diff --git a/arch/ia64/include/asm/barrier.h b/arch/ia64/include/asm/barrier.h
index 2f93348..588f161 100644
--- a/arch/ia64/include/asm/barrier.h
+++ b/arch/ia64/include/asm/barrier.h
@@ -42,28 +42,24 @@
 #define dma_rmb()  mb()
 #define dma_wmb()  mb()
 
-#ifdef CONFIG_SMP
-# define smp_mb()  mb()
-#else
-# define smp_mb()  barrier()
-#endif
+# define __smp_mb()mb()
 
-#define smp_mb__before_atomic()barrier()
-#define smp_mb__after_atomic() barrier()
+#define __smp_mb__before_atomic()  barrier()
+#define __smp_mb__after_atomic()   barrier()
 
 /*
  * IA64 GCC turns volatile stores into st.rel and volatile loads into ld.acq no
  * need for asm trickery!
  */
 
-#define smp_store_release(p, v)
\
+#define __smp_store_release(p, v)  
\
 do {   \
compiletime_assert_atomic_type(*p); \
barrier();  \
WRITE_ONCE(*p, v);  \
 } while (0)
 
-#define smp_load_acquire(p)\
+#define __smp_load_acquire(p)  \
 ({ \
typeof(*p) ___p1 = READ_ONCE(*p);   \
compiletime_assert_atomic_type(*p); \
-- 
MST


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH v2 20/32] metag: define __smp_xxx

2015-12-31 Thread Michael S. Tsirkin
This defines __smp_xxx barriers for metag,
for use by virtualization.

smp_xxx barriers are removed as they are
defined correctly by asm-generic/barrier.h

Note: as __smp_XX macros should not depend on CONFIG_SMP, they can not
use the existing fence() macro since that is defined differently between
SMP and !SMP.  For this reason, this patch introduces a wrapper
metag_fence() that doesn't depend on CONFIG_SMP.
fence() is then defined using that, depending on CONFIG_SMP.

Signed-off-by: Michael S. Tsirkin 
Acked-by: Arnd Bergmann 
---
 arch/metag/include/asm/barrier.h | 32 +++-
 1 file changed, 15 insertions(+), 17 deletions(-)

diff --git a/arch/metag/include/asm/barrier.h b/arch/metag/include/asm/barrier.h
index b5b778b..84880c9 100644
--- a/arch/metag/include/asm/barrier.h
+++ b/arch/metag/include/asm/barrier.h
@@ -44,13 +44,6 @@ static inline void wr_fence(void)
 #define rmb()  barrier()
 #define wmb()  mb()
 
-#ifndef CONFIG_SMP
-#define fence()do { } while (0)
-#define smp_mb()barrier()
-#define smp_rmb()   barrier()
-#define smp_wmb()   barrier()
-#else
-
 #ifdef CONFIG_METAG_SMP_WRITE_REORDERING
 /*
  * Write to the atomic memory unlock system event register (command 0). This is
@@ -60,26 +53,31 @@ static inline void wr_fence(void)
  * incoherence). It is therefore ineffective if used after and on the same
  * thread as a write.
  */
-static inline void fence(void)
+static inline void metag_fence(void)
 {
volatile int *flushptr = (volatile int *) LINSYSEVENT_WR_ATOMIC_UNLOCK;
barrier();
*flushptr = 0;
barrier();
 }
-#define smp_mb()fence()
-#define smp_rmb()   fence()
-#define smp_wmb()   barrier()
+#define __smp_mb()metag_fence()
+#define __smp_rmb()   metag_fence()
+#define __smp_wmb()   barrier()
 #else
-#define fence()do { } while (0)
-#define smp_mb()barrier()
-#define smp_rmb()   barrier()
-#define smp_wmb()   barrier()
+#define metag_fence()  do { } while (0)
+#define __smp_mb()barrier()
+#define __smp_rmb()   barrier()
+#define __smp_wmb()   barrier()
 #endif
+
+#ifdef CONFIG_SMP
+#define fence() metag_fence()
+#else
+#define fence()do { } while (0)
 #endif
 
-#define smp_mb__before_atomic()barrier()
-#define smp_mb__after_atomic() barrier()
+#define __smp_mb__before_atomic()  barrier()
+#define __smp_mb__after_atomic()   barrier()
 
 #include <asm-generic/barrier.h>
 
-- 
MST


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH v2 23/32] sh: define __smp_xxx, fix smp_store_mb for !SMP

2015-12-31 Thread Michael S. Tsirkin
sh variant of smp_store_mb() calls xchg() on !SMP which is stronger than
implied by both the name and the documentation.

define __smp_store_mb instead: code in asm-generic/barrier.h
will then define smp_store_mb correctly depending on
CONFIG_SMP.
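
Once an arch provides only __smp_store_mb(), the generic header resolves
smp_store_mb() depending on CONFIG_SMP, roughly as follows (a sketch of
the asm-generic/barrier.h logic added in patch 14, shown here for
illustration only):

#ifdef CONFIG_SMP
#define smp_store_mb(var, value) __smp_store_mb(var, value) /* xchg() */
#else
#define smp_store_mb(var, value) \
	do { WRITE_ONCE(var, value); barrier(); } while (0) /* no xchg() */
#endif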

Signed-off-by: Michael S. Tsirkin 
Acked-by: Arnd Bergmann 
---
 arch/sh/include/asm/barrier.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/sh/include/asm/barrier.h b/arch/sh/include/asm/barrier.h
index bf91037..f887c64 100644
--- a/arch/sh/include/asm/barrier.h
+++ b/arch/sh/include/asm/barrier.h
@@ -32,7 +32,8 @@
 #define ctrl_barrier() __asm__ __volatile__ ("nop;nop;nop;nop;nop;nop;nop;nop")
 #endif
 
-#define smp_store_mb(var, value) do { (void)xchg(&var, value); } while (0)
+#define __smp_store_mb(var, value) do { (void)xchg(&var, value); } while (0)
+#define smp_store_mb(var, value) __smp_store_mb(var, value)
 
 #include <asm-generic/barrier.h>
 
-- 
MST


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH v2 25/32] tile: define __smp_xxx

2015-12-31 Thread Michael S. Tsirkin
This defines __smp_xxx barriers for tile,
for use by virtualization.

Some smp_xxx barriers are removed as they are
defined correctly by asm-generic/barrier.h

Note: for 32 bit, keep smp_mb__after_atomic around since it's faster
than the generic implementation.

Signed-off-by: Michael S. Tsirkin 
Acked-by: Arnd Bergmann 
---
 arch/tile/include/asm/barrier.h | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/arch/tile/include/asm/barrier.h b/arch/tile/include/asm/barrier.h
index 96a42ae..d552228 100644
--- a/arch/tile/include/asm/barrier.h
+++ b/arch/tile/include/asm/barrier.h
@@ -79,11 +79,12 @@ mb_incoherent(void)
  * But after the word is updated, the routine issues an "mf" before returning,
  * and since it's a function call, we don't even need a compiler barrier.
  */
-#define smp_mb__before_atomic()smp_mb()
-#define smp_mb__after_atomic() do { } while (0)
+#define __smp_mb__before_atomic()  __smp_mb()
+#define __smp_mb__after_atomic()   do { } while (0)
+#define smp_mb__after_atomic() __smp_mb__after_atomic()
 #else /* 64 bit */
-#define smp_mb__before_atomic()smp_mb()
-#define smp_mb__after_atomic() smp_mb()
+#define __smp_mb__before_atomic()  __smp_mb()
+#define __smp_mb__after_atomic()   __smp_mb()
 #endif
 
 #include <asm-generic/barrier.h>
-- 
MST


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH v2 08/32] arm: reuse asm-generic/barrier.h

2015-12-31 Thread Michael S. Tsirkin
On arm smp_store_mb, read_barrier_depends, smp_read_barrier_depends,
smp_store_release, smp_load_acquire, smp_mb__before_atomic and
smp_mb__after_atomic match the asm-generic variants exactly. Drop the
local definitions and pull in asm-generic/barrier.h instead.

This is in preparation to refactoring this code area.

Signed-off-by: Michael S. Tsirkin 
Acked-by: Arnd Bergmann 
---
 arch/arm/include/asm/barrier.h | 23 +--
 1 file changed, 1 insertion(+), 22 deletions(-)

diff --git a/arch/arm/include/asm/barrier.h b/arch/arm/include/asm/barrier.h
index 3ff5642..31152e8 100644
--- a/arch/arm/include/asm/barrier.h
+++ b/arch/arm/include/asm/barrier.h
@@ -70,28 +70,7 @@ extern void arm_heavy_mb(void);
 #define smp_wmb()  dmb(ishst)
 #endif
 
-#define smp_store_release(p, v)
\
-do {   \
-   compiletime_assert_atomic_type(*p); \
-   smp_mb();   \
-   WRITE_ONCE(*p, v);  \
-} while (0)
-
-#define smp_load_acquire(p)\
-({ \
-   typeof(*p) ___p1 = READ_ONCE(*p);   \
-   compiletime_assert_atomic_type(*p); \
-   smp_mb();   \
-   ___p1;  \
-})
-
-#define read_barrier_depends() do { } while(0)
-#define smp_read_barrier_depends() do { } while(0)
-
-#define smp_store_mb(var, value)   do { WRITE_ONCE(var, value); smp_mb(); 
} while (0)
-
-#define smp_mb__before_atomic()smp_mb()
-#define smp_mb__after_atomic() smp_mb()
+#include <asm-generic/barrier.h>
 
 #endif /* !__ASSEMBLY__ */
 #endif /* __ASM_BARRIER_H */
-- 
MST


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH v2 11/32] mips: reuse asm-generic/barrier.h

2015-12-31 Thread Michael S. Tsirkin
On mips dma_rmb, dma_wmb, smp_store_mb, read_barrier_depends,
smp_read_barrier_depends, smp_store_release and smp_load_acquire  match
the asm-generic variants exactly. Drop the local definitions and pull in
asm-generic/barrier.h instead.

This is in preparation to refactoring this code area.

Signed-off-by: Michael S. Tsirkin 
Acked-by: Arnd Bergmann 
---
 arch/mips/include/asm/barrier.h | 25 ++---
 1 file changed, 2 insertions(+), 23 deletions(-)

diff --git a/arch/mips/include/asm/barrier.h b/arch/mips/include/asm/barrier.h
index 752e0b8..3eac4b9 100644
--- a/arch/mips/include/asm/barrier.h
+++ b/arch/mips/include/asm/barrier.h
@@ -10,9 +10,6 @@
 
 #include <asm/addrspace.h>
 
-#define read_barrier_depends() do { } while(0)
-#define smp_read_barrier_depends() do { } while(0)
-
 #ifdef CONFIG_CPU_HAS_SYNC
 #define __sync()   \
__asm__ __volatile__(   \
@@ -87,8 +84,6 @@
 
 #define wmb()  fast_wmb()
 #define rmb()  fast_rmb()
-#define dma_wmb()  fast_wmb()
-#define dma_rmb()  fast_rmb()
 
 #if defined(CONFIG_WEAK_ORDERING) && defined(CONFIG_SMP)
 # ifdef CONFIG_CPU_CAVIUM_OCTEON
@@ -112,9 +107,6 @@
 #define __WEAK_LLSC_MB "   \n"
 #endif
 
-#define smp_store_mb(var, value) \
-   do { WRITE_ONCE(var, value); smp_mb(); } while (0)
-
 #define smp_llsc_mb()  __asm__ __volatile__(__WEAK_LLSC_MB : : :"memory")
 
 #ifdef CONFIG_CPU_CAVIUM_OCTEON
@@ -129,22 +121,9 @@
 #define nudge_writes() mb()
 #endif
 
-#define smp_store_release(p, v)
\
-do {   \
-   compiletime_assert_atomic_type(*p); \
-   smp_mb();   \
-   WRITE_ONCE(*p, v);  \
-} while (0)
-
-#define smp_load_acquire(p)\
-({ \
-   typeof(*p) ___p1 = READ_ONCE(*p);   \
-   compiletime_assert_atomic_type(*p); \
-   smp_mb();   \
-   ___p1;  \
-})
-
 #define smp_mb__before_atomic()smp_mb__before_llsc()
 #define smp_mb__after_atomic() smp_llsc_mb()
 
+#include <asm-generic/barrier.h>
+
 #endif /* __ASM_BARRIER_H */
-- 
MST


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH v2 06/32] s390: reuse asm-generic/barrier.h

2015-12-31 Thread Michael S. Tsirkin
On s390 read_barrier_depends, smp_read_barrier_depends
smp_store_mb(), smp_mb__before_atomic and smp_mb__after_atomic match the
asm-generic variants exactly. Drop the local definitions and pull in
asm-generic/barrier.h instead.

This is in preparation to refactoring this code area.

Signed-off-by: Michael S. Tsirkin 
Acked-by: Arnd Bergmann 
---
 arch/s390/include/asm/barrier.h | 10 ++
 1 file changed, 2 insertions(+), 8 deletions(-)

diff --git a/arch/s390/include/asm/barrier.h b/arch/s390/include/asm/barrier.h
index 7ffd0b1..c358c31 100644
--- a/arch/s390/include/asm/barrier.h
+++ b/arch/s390/include/asm/barrier.h
@@ -30,14 +30,6 @@
 #define smp_rmb()  rmb()
 #define smp_wmb()  wmb()
 
-#define read_barrier_depends() do { } while (0)
-#define smp_read_barrier_depends() do { } while (0)
-
-#define smp_mb__before_atomic()smp_mb()
-#define smp_mb__after_atomic() smp_mb()
-
-#define smp_store_mb(var, value)   do { WRITE_ONCE(var, value); smp_mb(); 
} while (0)
-
 #define smp_store_release(p, v)
\
 do {   \
compiletime_assert_atomic_type(*p); \
@@ -53,4 +45,6 @@ do {  
\
___p1;  \
 })
 
+#include <asm-generic/barrier.h>
+
 #endif /* __ASM_BARRIER_H */
-- 
MST


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH v2 01/32] locking/barriers, arch: Use smp barriers in smp_store_release()

2015-12-31 Thread Michael S. Tsirkin
From: Davidlohr Bueso 

With commit b92b8b35a2e ("locking/arch: Rename set_mb() to smp_store_mb()")
it was made clear that the context of this call (and thus set_mb)
is strictly for CPU ordering, as opposed to IO. As such all archs
should use the smp variant of mb(), respecting the semantics and
saving a mandatory barrier on UP.
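
Concretely, on an ia64 kernel built without CONFIG_SMP (an illustrative
sketch, not part of the patch):

	smp_store_mb(flag, 1);

used to expand to "WRITE_ONCE(flag, 1); mb();" - a mandatory fence - and
now expands to "WRITE_ONCE(flag, 1); smp_mb();", where smp_mb() is just
barrier() on UP.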

Signed-off-by: Davidlohr Bueso 
Signed-off-by: Peter Zijlstra (Intel) 
Cc: 
Cc: Andrew Morton 
Cc: Benjamin Herrenschmidt 
Cc: Heiko Carstens 
Cc: Linus Torvalds 
Cc: Paul E. McKenney 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Cc: Tony Luck 
Cc: d...@stgolabs.net
Link: 
http://lkml.kernel.org/r/1445975631-17047-3-git-send-email-d...@stgolabs.net
Signed-off-by: Ingo Molnar 
---
 arch/ia64/include/asm/barrier.h| 2 +-
 arch/powerpc/include/asm/barrier.h | 2 +-
 arch/s390/include/asm/barrier.h| 2 +-
 include/asm-generic/barrier.h  | 2 +-
 4 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/ia64/include/asm/barrier.h b/arch/ia64/include/asm/barrier.h
index df896a1..209c4b8 100644
--- a/arch/ia64/include/asm/barrier.h
+++ b/arch/ia64/include/asm/barrier.h
@@ -77,7 +77,7 @@ do {  
\
___p1;  \
 })
 
-#define smp_store_mb(var, value)   do { WRITE_ONCE(var, value); mb(); } 
while (0)
+#define smp_store_mb(var, value) do { WRITE_ONCE(var, value); smp_mb(); } 
while (0)
 
 /*
  * The group barrier in front of the rsm & ssm are necessary to ensure
diff --git a/arch/powerpc/include/asm/barrier.h 
b/arch/powerpc/include/asm/barrier.h
index 0eca6ef..a7af5fb 100644
--- a/arch/powerpc/include/asm/barrier.h
+++ b/arch/powerpc/include/asm/barrier.h
@@ -34,7 +34,7 @@
 #define rmb()  __asm__ __volatile__ ("sync" : : : "memory")
 #define wmb()  __asm__ __volatile__ ("sync" : : : "memory")
 
-#define smp_store_mb(var, value)   do { WRITE_ONCE(var, value); mb(); } 
while (0)
+#define smp_store_mb(var, value) do { WRITE_ONCE(var, value); smp_mb(); } 
while (0)
 
 #ifdef __SUBARCH_HAS_LWSYNC
 #define SMPWMB  LWSYNC
diff --git a/arch/s390/include/asm/barrier.h b/arch/s390/include/asm/barrier.h
index d68e11e..7ffd0b1 100644
--- a/arch/s390/include/asm/barrier.h
+++ b/arch/s390/include/asm/barrier.h
@@ -36,7 +36,7 @@
 #define smp_mb__before_atomic()smp_mb()
 #define smp_mb__after_atomic() smp_mb()
 
-#define smp_store_mb(var, value)   do { WRITE_ONCE(var, value); 
mb(); } while (0)
+#define smp_store_mb(var, value)   do { WRITE_ONCE(var, value); smp_mb(); 
} while (0)
 
 #define smp_store_release(p, v)
\
 do {   \
diff --git a/include/asm-generic/barrier.h b/include/asm-generic/barrier.h
index b42afad..0f45f93 100644
--- a/include/asm-generic/barrier.h
+++ b/include/asm-generic/barrier.h
@@ -93,7 +93,7 @@
 #endif /* CONFIG_SMP */
 
 #ifndef smp_store_mb
-#define smp_store_mb(var, value)  do { WRITE_ONCE(var, value); mb(); } while 
(0)
+#define smp_store_mb(var, value)  do { WRITE_ONCE(var, value); smp_mb(); } 
while (0)
 #endif
 
 #ifndef smp_mb__before_atomic
-- 
MST


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH v2 14/32] asm-generic: add __smp_xxx wrappers

2015-12-31 Thread Michael S. Tsirkin
On !SMP, most architectures define their
barriers as compiler barriers.
On SMP, most need an actual barrier.

Make it possible to remove the code duplication for
!SMP by defining low-level __smp_xxx barriers
which do not depend on the value of SMP, then
use them from asm-generic conditionally.

Besides reducing code duplication, these low level APIs will also be
useful for virtualization, where a barrier is sometimes needed even if
!SMP since we might be talking to another kernel on the same SMP system.

Both virtio and Xen drivers will benefit.

The smp_xxx variants should use __smp_XXX ones or barrier() depending on
SMP, identically for all architectures.

We keep ifndef guards around them for now - once/if all
architectures are converted to use the generic
code, we'll be able to remove these.
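
As an illustration of the virtualization case (a hypothetical sketch, not
code from this patch), a UP guest sharing a ring with an SMP host can do:

	ring->prod = prod;	/* "ring" is a made-up shared structure */
	__smp_wmb();		/* real store fence even when !CONFIG_SMP,
				 * since the host side runs on other CPUs */

where plain smp_wmb() would compile down to barrier() and be insufficient.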

Suggested-by: Peter Zijlstra 
Signed-off-by: Michael S. Tsirkin 
Acked-by: Arnd Bergmann 
---
 include/asm-generic/barrier.h | 91 ++-
 1 file changed, 82 insertions(+), 9 deletions(-)

diff --git a/include/asm-generic/barrier.h b/include/asm-generic/barrier.h
index 987b2e0..8752964 100644
--- a/include/asm-generic/barrier.h
+++ b/include/asm-generic/barrier.h
@@ -54,22 +54,38 @@
 #define read_barrier_depends() do { } while (0)
 #endif
 
+#ifndef __smp_mb
+#define __smp_mb() mb()
+#endif
+
+#ifndef __smp_rmb
+#define __smp_rmb()rmb()
+#endif
+
+#ifndef __smp_wmb
+#define __smp_wmb()wmb()
+#endif
+
+#ifndef __smp_read_barrier_depends
+#define __smp_read_barrier_depends()   read_barrier_depends()
+#endif
+
 #ifdef CONFIG_SMP
 
 #ifndef smp_mb
-#define smp_mb()   mb()
+#define smp_mb()   __smp_mb()
 #endif
 
 #ifndef smp_rmb
-#define smp_rmb()  rmb()
+#define smp_rmb()  __smp_rmb()
 #endif
 
 #ifndef smp_wmb
-#define smp_wmb()  wmb()
+#define smp_wmb()  __smp_wmb()
 #endif
 
 #ifndef smp_read_barrier_depends
-#define smp_read_barrier_depends() read_barrier_depends()
+#define smp_read_barrier_depends() __smp_read_barrier_depends()
 #endif
 
 #else  /* !CONFIG_SMP */
@@ -92,23 +108,78 @@
 
 #endif /* CONFIG_SMP */
 
+#ifndef __smp_store_mb
+#define __smp_store_mb(var, value)  do { WRITE_ONCE(var, value); __smp_mb(); } 
while (0)
+#endif
+
+#ifndef __smp_mb__before_atomic
+#define __smp_mb__before_atomic()  __smp_mb()
+#endif
+
+#ifndef __smp_mb__after_atomic
+#define __smp_mb__after_atomic()   __smp_mb()
+#endif
+
+#ifndef __smp_store_release
+#define __smp_store_release(p, v)  \
+do {   \
+   compiletime_assert_atomic_type(*p); \
+   __smp_mb(); \
+   WRITE_ONCE(*p, v);  \
+} while (0)
+#endif
+
+#ifndef __smp_load_acquire
+#define __smp_load_acquire(p)  \
+({ \
+   typeof(*p) ___p1 = READ_ONCE(*p);   \
+   compiletime_assert_atomic_type(*p); \
+   __smp_mb(); \
+   ___p1;  \
+})
+#endif
+
+#ifdef CONFIG_SMP
+
+#ifndef smp_store_mb
+#define smp_store_mb(var, value)  __smp_store_mb(var, value)
+#endif
+
+#ifndef smp_mb__before_atomic
+#define smp_mb__before_atomic()__smp_mb__before_atomic()
+#endif
+
+#ifndef smp_mb__after_atomic
+#define smp_mb__after_atomic() __smp_mb__after_atomic()
+#endif
+
+#ifndef smp_store_release
+#define smp_store_release(p, v) __smp_store_release(p, v)
+#endif
+
+#ifndef smp_load_acquire
+#define smp_load_acquire(p) __smp_load_acquire(p)
+#endif
+
+#else  /* !CONFIG_SMP */
+
 #ifndef smp_store_mb
-#define smp_store_mb(var, value)  do { WRITE_ONCE(var, value); smp_mb(); } 
while (0)
+#define smp_store_mb(var, value)  do { WRITE_ONCE(var, value); barrier(); } 
while (0)
 #endif
 
 #ifndef smp_mb__before_atomic
-#define smp_mb__before_atomic()smp_mb()
+#define smp_mb__before_atomic()barrier()
 #endif
 
 #ifndef smp_mb__after_atomic
-#define smp_mb__after_atomic() smp_mb()
+#define smp_mb__after_atomic() barrier()
 #endif
 
 #ifndef smp_store_release
 #define smp_store_release(p, v)
\
 do {   \
compiletime_assert_atomic_type(*p); \
-   smp_mb();   \
+   barrier();  \
WRITE_ONCE(*p, v);  \
 } while (0)
 #endif
@@ -118,10 +189,12 @@ do {  
 

[Xen-devel] [PATCH v2 16/32] arm64: define __smp_xxx

2015-12-31 Thread Michael S. Tsirkin
This defines __smp_xxx barriers for arm64,
for use by virtualization.

smp_xxx barriers are removed as they are
defined correctly by asm-generic/barrier.h

Note: arm64 does not support !SMP config,
so smp_xxx and __smp_xxx are always equivalent.

Signed-off-by: Michael S. Tsirkin 
Acked-by: Arnd Bergmann 
---
 arch/arm64/include/asm/barrier.h | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/include/asm/barrier.h b/arch/arm64/include/asm/barrier.h
index 91a43f4..dae5c49 100644
--- a/arch/arm64/include/asm/barrier.h
+++ b/arch/arm64/include/asm/barrier.h
@@ -35,11 +35,11 @@
 #define dma_rmb()  dmb(oshld)
 #define dma_wmb()  dmb(oshst)
 
-#define smp_mb()   dmb(ish)
-#define smp_rmb()  dmb(ishld)
-#define smp_wmb()  dmb(ishst)
+#define __smp_mb() dmb(ish)
+#define __smp_rmb()dmb(ishld)
+#define __smp_wmb()dmb(ishst)
 
-#define smp_store_release(p, v)
\
+#define __smp_store_release(p, v)  
\
 do {   \
compiletime_assert_atomic_type(*p); \
switch (sizeof(*p)) {   \
@@ -62,7 +62,7 @@ do {  
\
}   \
 } while (0)
 
-#define smp_load_acquire(p)\
+#define __smp_load_acquire(p)  \
 ({ \
union { typeof(*p) __val; char __c[1]; } __u;   \
compiletime_assert_atomic_type(*p); \
-- 
MST


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH v2 07/32] sparc: reuse asm-generic/barrier.h

2015-12-31 Thread Michael S. Tsirkin
On sparc 64 bit dma_rmb, dma_wmb, smp_store_mb, smp_mb, smp_rmb,
smp_wmb, read_barrier_depends and smp_read_barrier_depends match the
asm-generic variants exactly. Drop the local definitions and pull in
asm-generic/barrier.h instead.

nop uses __asm__ __volatile but is otherwise identical to
the generic version, drop that as well.

This is in preparation to refactoring this code area.

Note: nop() was in processor.h and not in barrier.h as on other
architectures. Nothing seems to depend on it being there though.

Signed-off-by: Michael S. Tsirkin 
Acked-by: Arnd Bergmann 
---
 arch/sparc/include/asm/barrier_32.h |  1 -
 arch/sparc/include/asm/barrier_64.h | 21 ++---
 arch/sparc/include/asm/processor.h  |  3 ---
 3 files changed, 2 insertions(+), 23 deletions(-)

diff --git a/arch/sparc/include/asm/barrier_32.h 
b/arch/sparc/include/asm/barrier_32.h
index ae69eda..8059130 100644
--- a/arch/sparc/include/asm/barrier_32.h
+++ b/arch/sparc/include/asm/barrier_32.h
@@ -1,7 +1,6 @@
 #ifndef __SPARC_BARRIER_H
 #define __SPARC_BARRIER_H
 
-#include <asm/processor.h> /* for nop() */
 #include <asm-generic/barrier.h>
 
 #endif /* !(__SPARC_BARRIER_H) */
diff --git a/arch/sparc/include/asm/barrier_64.h 
b/arch/sparc/include/asm/barrier_64.h
index 14a9286..26c3f72 100644
--- a/arch/sparc/include/asm/barrier_64.h
+++ b/arch/sparc/include/asm/barrier_64.h
@@ -37,25 +37,6 @@ do { __asm__ __volatile__("ba,pt %%xcc, 1f\n\t" \
 #define rmb()  __asm__ __volatile__("":::"memory")
 #define wmb()  __asm__ __volatile__("":::"memory")
 
-#define dma_rmb()  rmb()
-#define dma_wmb()  wmb()
-
-#define smp_store_mb(__var, __value) \
-   do { WRITE_ONCE(__var, __value); membar_safe("#StoreLoad"); } while(0)
-
-#ifdef CONFIG_SMP
-#define smp_mb()   mb()
-#define smp_rmb()  rmb()
-#define smp_wmb()  wmb()
-#else
-#define smp_mb()   __asm__ __volatile__("":::"memory")
-#define smp_rmb()  __asm__ __volatile__("":::"memory")
-#define smp_wmb()  __asm__ __volatile__("":::"memory")
-#endif
-
-#define read_barrier_depends() do { } while (0)
-#define smp_read_barrier_depends() do { } while (0)
-
 #define smp_store_release(p, v)
\
 do {   \
compiletime_assert_atomic_type(*p); \
@@ -74,4 +55,6 @@ do {  
\
 #define smp_mb__before_atomic()barrier()
 #define smp_mb__after_atomic() barrier()
 
+#include <asm-generic/barrier.h>
+
 #endif /* !(__SPARC64_BARRIER_H) */
diff --git a/arch/sparc/include/asm/processor.h 
b/arch/sparc/include/asm/processor.h
index 2fe99e6..9da9646 100644
--- a/arch/sparc/include/asm/processor.h
+++ b/arch/sparc/include/asm/processor.h
@@ -5,7 +5,4 @@
 #else
 #include <asm/processor_32.h>
 #endif
-
-#define nop()  __asm__ __volatile__ ("nop")
-
 #endif
-- 
MST


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH v2 09/32] arm64: reuse asm-generic/barrier.h

2015-12-31 Thread Michael S. Tsirkin
On arm64 nop, read_barrier_depends, smp_read_barrier_depends
smp_store_mb(), smp_mb__before_atomic and smp_mb__after_atomic match the
asm-generic variants exactly. Drop the local definitions and pull in
asm-generic/barrier.h instead.

This is in preparation to refactoring this code area.

Signed-off-by: Michael S. Tsirkin 
Acked-by: Arnd Bergmann 
---
 arch/arm64/include/asm/barrier.h | 9 +
 1 file changed, 1 insertion(+), 8 deletions(-)

diff --git a/arch/arm64/include/asm/barrier.h b/arch/arm64/include/asm/barrier.h
index 9622eb4..91a43f4 100644
--- a/arch/arm64/include/asm/barrier.h
+++ b/arch/arm64/include/asm/barrier.h
@@ -91,14 +91,7 @@ do { 
\
__u.__val;  \
 })
 
-#define read_barrier_depends() do { } while(0)
-#define smp_read_barrier_depends() do { } while(0)
-
-#define smp_store_mb(var, value)   do { WRITE_ONCE(var, value); smp_mb(); 
} while (0)
-#define nop()  asm volatile("nop");
-
-#define smp_mb__before_atomic()smp_mb()
-#define smp_mb__after_atomic() smp_mb()
+#include <asm-generic/barrier.h>
 
 #endif /* __ASSEMBLY__ */
 
-- 
MST


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH v2 05/32] powerpc: reuse asm-generic/barrier.h

2015-12-31 Thread Michael S. Tsirkin
On powerpc read_barrier_depends, smp_read_barrier_depends
smp_store_mb(), smp_mb__before_atomic and smp_mb__after_atomic match the
asm-generic variants exactly. Drop the local definitions and pull in
asm-generic/barrier.h instead.

This is in preparation to refactoring this code area.

Signed-off-by: Michael S. Tsirkin 
Acked-by: Arnd Bergmann 
---
 arch/powerpc/include/asm/barrier.h | 9 ++---
 1 file changed, 2 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/include/asm/barrier.h 
b/arch/powerpc/include/asm/barrier.h
index a7af5fb..980ad0c 100644
--- a/arch/powerpc/include/asm/barrier.h
+++ b/arch/powerpc/include/asm/barrier.h
@@ -34,8 +34,6 @@
 #define rmb()  __asm__ __volatile__ ("sync" : : : "memory")
 #define wmb()  __asm__ __volatile__ ("sync" : : : "memory")
 
-#define smp_store_mb(var, value) do { WRITE_ONCE(var, value); smp_mb(); } 
while (0)
-
 #ifdef __SUBARCH_HAS_LWSYNC
 #define SMPWMB  LWSYNC
 #else
@@ -60,9 +58,6 @@
 #define smp_wmb()  barrier()
 #endif /* CONFIG_SMP */
 
-#define read_barrier_depends() do { } while (0)
-#define smp_read_barrier_depends() do { } while (0)
-
 /*
  * This is a barrier which prevents following instructions from being
  * started until the value of the argument x is known.  For example, if
@@ -87,8 +82,8 @@ do {  
\
___p1;  \
 })
 
-#define smp_mb__before_atomic() smp_mb()
-#define smp_mb__after_atomic()  smp_mb()
 #define smp_mb__before_spinlock()   smp_mb()
 
+#include <asm-generic/barrier.h>
+
 #endif /* _ASM_POWERPC_BARRIER_H */
-- 
MST


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH v2 15/32] powerpc: define __smp_xxx

2015-12-31 Thread Michael S. Tsirkin
This defines __smp_xxx barriers for powerpc
for use by virtualization.

smp_xxx barriers are removed as they are
defined correctly by asm-generic/barrier.h

This reduces the amount of arch-specific boiler-plate code.

Signed-off-by: Michael S. Tsirkin 
Acked-by: Arnd Bergmann 
---
 arch/powerpc/include/asm/barrier.h | 24 
 1 file changed, 8 insertions(+), 16 deletions(-)

diff --git a/arch/powerpc/include/asm/barrier.h 
b/arch/powerpc/include/asm/barrier.h
index 980ad0c..c0deafc 100644
--- a/arch/powerpc/include/asm/barrier.h
+++ b/arch/powerpc/include/asm/barrier.h
@@ -44,19 +44,11 @@
 #define dma_rmb()  __lwsync()
 #define dma_wmb()  __asm__ __volatile__ (stringify_in_c(SMPWMB) : : 
:"memory")
 
-#ifdef CONFIG_SMP
-#define smp_lwsync()   __lwsync()
+#define __smp_lwsync() __lwsync()
 
-#define smp_mb()   mb()
-#define smp_rmb()  __lwsync()
-#define smp_wmb()  __asm__ __volatile__ (stringify_in_c(SMPWMB) : : 
:"memory")
-#else
-#define smp_lwsync()   barrier()
-
-#define smp_mb()   barrier()
-#define smp_rmb()  barrier()
-#define smp_wmb()  barrier()
-#endif /* CONFIG_SMP */
+#define __smp_mb() mb()
+#define __smp_rmb()__lwsync()
+#define __smp_wmb()__asm__ __volatile__ (stringify_in_c(SMPWMB) : : 
:"memory")
 
 /*
  * This is a barrier which prevents following instructions from being
@@ -67,18 +59,18 @@
 #define data_barrier(x)\
asm volatile("twi 0,%0,0; isync" : : "r" (x) : "memory");
 
-#define smp_store_release(p, v)
\
+#define __smp_store_release(p, v)  
\
 do {   \
compiletime_assert_atomic_type(*p); \
-   smp_lwsync();   \
+   __smp_lwsync(); \
WRITE_ONCE(*p, v);  \
 } while (0)
 
-#define smp_load_acquire(p)\
+#define __smp_load_acquire(p)  \
 ({ \
typeof(*p) ___p1 = READ_ONCE(*p);   \
compiletime_assert_atomic_type(*p); \
-   smp_lwsync();   \
+   __smp_lwsync(); \
___p1;  \
 })
 
-- 
MST


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 07/32] sparc: reuse asm-generic/barrier.h

2015-12-31 Thread David Miller
From: "Michael S. Tsirkin" 
Date: Thu, 31 Dec 2015 21:06:38 +0200

> On sparc 64 bit dma_rmb, dma_wmb, smp_store_mb, smp_mb, smp_rmb,
> smp_wmb, read_barrier_depends and smp_read_barrier_depends match the
> asm-generic variants exactly. Drop the local definitions and pull in
> asm-generic/barrier.h instead.
> 
> nop uses __asm__ __volatile but is otherwise identical to
> the generic version, drop that as well.
> 
> This is in preparation to refactoring this code area.
> 
> Note: nop() was in processor.h and not in barrier.h as on other
> architectures. Nothing seems to depend on it being there though.
> 
> Signed-off-by: Michael S. Tsirkin 
> Acked-by: Arnd Bergmann 

Acked-by: David S. Miller 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH v2 33/34] xenbus: use virt_xxx barriers

2015-12-31 Thread Michael S. Tsirkin
drivers/xen/xenbus/xenbus_comms.c uses
full memory barriers to communicate with the other side.

For guests compiled with CONFIG_SMP, smp_wmb and smp_mb
would be sufficient, so mb() and wmb() here are only needed if
a non-SMP guest runs on an SMP host.

Switch to virt_xxx barriers which serve this exact purpose.
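
The difference only shows up on a !CONFIG_SMP guest (illustrative sketch):

	mb();		/* full fence incl. IO ordering - correct but heavy  */
	smp_mb();	/* compiles to barrier() on !SMP - insufficient here */
	virt_mb();	/* __smp_mb(): a real CPU fence - the right strength */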

Signed-off-by: Michael S. Tsirkin 
---
 drivers/xen/xenbus/xenbus_comms.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/xen/xenbus/xenbus_comms.c 
b/drivers/xen/xenbus/xenbus_comms.c
index fdb0f33..ecdecce 100644
--- a/drivers/xen/xenbus/xenbus_comms.c
+++ b/drivers/xen/xenbus/xenbus_comms.c
@@ -123,14 +123,14 @@ int xb_write(const void *data, unsigned len)
avail = len;
 
/* Must write data /after/ reading the consumer index. */
-   mb();
+   virt_mb();
 
memcpy(dst, data, avail);
data += avail;
len -= avail;
 
/* Other side must not see new producer until data is there. */
-   wmb();
+   virt_wmb();
intf->req_prod += avail;
 
/* Implies mb(): other side will see the updated producer. */
@@ -180,14 +180,14 @@ int xb_read(void *data, unsigned len)
avail = len;
 
/* Must read data /after/ reading the producer index. */
-   rmb();
+   virt_rmb();
 
memcpy(data, src, avail);
data += avail;
len -= avail;
 
/* Other side must not see free space until we've copied out */
-   mb();
+   virt_mb();
intf->rsp_cons += avail;
 
pr_debug("Finished read of %i bytes (%i to go)\n", avail, len);
-- 
MST


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH v2 34/34] xen/io: use virt_xxx barriers

2015-12-31 Thread Michael S. Tsirkin
include/xen/interface/io/ring.h uses
full memory barriers to communicate with the other side.

For guests compiled with CONFIG_SMP, smp_wmb and smp_mb
would be sufficient, so mb() and wmb() here are only needed if
a non-SMP guest runs on an SMP host.

Switch to virt_xxx barriers which serve this exact purpose.
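
Callers are unaffected; a typical frontend producer path still looks like
this (a hypothetical usage sketch, not part of this patch):

	int notify;	/* "ring" and "irq" are made-up local state */

	ring.req_prod_pvt++;
	RING_PUSH_REQUESTS_AND_CHECK_NOTIFY(&ring, notify);
	if (notify)
		notify_remote_via_irq(irq);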

Signed-off-by: Michael S. Tsirkin 
---
 include/xen/interface/io/ring.h | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/include/xen/interface/io/ring.h b/include/xen/interface/io/ring.h
index 7dc685b..21f4fbd 100644
--- a/include/xen/interface/io/ring.h
+++ b/include/xen/interface/io/ring.h
@@ -208,12 +208,12 @@ struct __name##_back_ring {   
\
 
 
 #define RING_PUSH_REQUESTS(_r) do {\
-wmb(); /* back sees requests /before/ updated producer index */\
+virt_wmb(); /* back sees requests /before/ updated producer index */   
\
 (_r)->sring->req_prod = (_r)->req_prod_pvt;
\
 } while (0)
 
 #define RING_PUSH_RESPONSES(_r) do {   \
-wmb(); /* front sees responses /before/ updated producer index */  \
+virt_wmb(); /* front sees responses /before/ updated producer index */ 
\
 (_r)->sring->rsp_prod = (_r)->rsp_prod_pvt;
\
 } while (0)
 
@@ -250,9 +250,9 @@ struct __name##_back_ring { 
\
 #define RING_PUSH_REQUESTS_AND_CHECK_NOTIFY(_r, _notify) do {  \
 RING_IDX __old = (_r)->sring->req_prod;\
 RING_IDX __new = (_r)->req_prod_pvt;   \
-wmb(); /* back sees requests /before/ updated producer index */\
+virt_wmb(); /* back sees requests /before/ updated producer index */   
\
 (_r)->sring->req_prod = __new; \
-mb(); /* back sees new requests /before/ we check req_event */ \
+virt_mb(); /* back sees new requests /before/ we check req_event */
\
 (_notify) = ((RING_IDX)(__new - (_r)->sring->req_event) <  \
 (RING_IDX)(__new - __old));\
 } while (0)
@@ -260,9 +260,9 @@ struct __name##_back_ring { 
\
 #define RING_PUSH_RESPONSES_AND_CHECK_NOTIFY(_r, _notify) do { \
 RING_IDX __old = (_r)->sring->rsp_prod;\
 RING_IDX __new = (_r)->rsp_prod_pvt;   \
-wmb(); /* front sees responses /before/ updated producer index */  \
+virt_wmb(); /* front sees responses /before/ updated producer index */ 
\
 (_r)->sring->rsp_prod = __new; \
-mb(); /* front sees new responses /before/ we check rsp_event */   \
+virt_mb(); /* front sees new responses /before/ we check rsp_event */  
\
 (_notify) = ((RING_IDX)(__new - (_r)->sring->rsp_event) <  \
 (RING_IDX)(__new - __old));\
 } while (0)
@@ -271,7 +271,7 @@ struct __name##_back_ring { 
\
 (_work_to_do) = RING_HAS_UNCONSUMED_REQUESTS(_r);  \
 if (_work_to_do) break;\
 (_r)->sring->req_event = (_r)->req_cons + 1;   \
-mb();  \
+virt_mb(); \
 (_work_to_do) = RING_HAS_UNCONSUMED_REQUESTS(_r);  \
 } while (0)
 
@@ -279,7 +279,7 @@ struct __name##_back_ring { 
\
 (_work_to_do) = RING_HAS_UNCONSUMED_RESPONSES(_r); \
 if (_work_to_do) break;\
 (_r)->sring->rsp_event = (_r)->rsp_cons + 1;   \
-mb();  \
+virt_mb(); \
 (_work_to_do) = RING_HAS_UNCONSUMED_RESPONSES(_r); \
 } while (0)
 
-- 
MST


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 24/32] sparc: define __smp_xxx

2015-12-31 Thread David Miller
From: "Michael S. Tsirkin" 
Date: Thu, 31 Dec 2015 21:08:53 +0200

> This defines __smp_xxx barriers for sparc,
> for use by virtualization.
> 
> smp_xxx barriers are removed as they are
> defined correctly by asm-generic/barrier.h
> 
> Signed-off-by: Michael S. Tsirkin 
> Acked-by: Arnd Bergmann 

Acked-by: David S. Miller 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH v2 03/32] ia64: rename nop->iosapic_nop

2015-12-31 Thread Michael S. Tsirkin
asm-generic/barrier.h defines a nop() macro.
To be able to use this header on ia64, we shouldn't
call local functions/variables nop().

There's one instance where this breaks on ia64:
rename the function to iosapic_nop to avoid the conflict.

Signed-off-by: Michael S. Tsirkin 
Acked-by: Tony Luck 
Acked-by: Arnd Bergmann 
---
 arch/ia64/kernel/iosapic.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/ia64/kernel/iosapic.c b/arch/ia64/kernel/iosapic.c
index d2fae05..90fde5b 100644
--- a/arch/ia64/kernel/iosapic.c
+++ b/arch/ia64/kernel/iosapic.c
@@ -256,7 +256,7 @@ set_rte (unsigned int gsi, unsigned int irq, unsigned int 
dest, int mask)
 }
 
 static void
-nop (struct irq_data *data)
+iosapic_nop (struct irq_data *data)
 {
/* do nothing... */
 }
@@ -415,7 +415,7 @@ iosapic_unmask_level_irq (struct irq_data *data)
 #define iosapic_shutdown_level_irq mask_irq
 #define iosapic_enable_level_irq   unmask_irq
 #define iosapic_disable_level_irq  mask_irq
-#define iosapic_ack_level_irq  nop
+#define iosapic_ack_level_irq  iosapic_nop
 
 static struct irq_chip irq_type_iosapic_level = {
.name = "IO-SAPIC-level",
@@ -453,7 +453,7 @@ iosapic_ack_edge_irq (struct irq_data *data)
 }
 
 #define iosapic_enable_edge_irqunmask_irq
-#define iosapic_disable_edge_irq   nop
+#define iosapic_disable_edge_irq   iosapic_nop
 
 static struct irq_chip irq_type_iosapic_edge = {
.name = "IO-SAPIC-edge",
-- 
MST


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH v2 02/32] asm-generic: guard smp_store_release/load_acquire

2015-12-31 Thread Michael S. Tsirkin
Allow architectures to override smp_store_release
and smp_load_acquire by guarding the defines
in asm-generic/barrier.h with ifndef directives.

This is in preparation to reusing asm-generic/barrier.h
on architectures which have their own definition
of these macros.

Signed-off-by: Michael S. Tsirkin 
Acked-by: Arnd Bergmann 
---
 include/asm-generic/barrier.h | 4 
 1 file changed, 4 insertions(+)

diff --git a/include/asm-generic/barrier.h b/include/asm-generic/barrier.h
index 0f45f93..987b2e0 100644
--- a/include/asm-generic/barrier.h
+++ b/include/asm-generic/barrier.h
@@ -104,13 +104,16 @@
 #define smp_mb__after_atomic() smp_mb()
 #endif
 
+#ifndef smp_store_release
 #define smp_store_release(p, v)
\
 do {   \
compiletime_assert_atomic_type(*p); \
smp_mb();   \
WRITE_ONCE(*p, v);  \
 } while (0)
+#endif
 
+#ifndef smp_load_acquire
 #define smp_load_acquire(p)\
 ({ \
typeof(*p) ___p1 = READ_ONCE(*p);   \
@@ -118,6 +121,7 @@ do {
\
smp_mb();   \
___p1;  \
 })
+#endif
 
 #endif /* !__ASSEMBLY__ */
 #endif /* __ASM_GENERIC_BARRIER_H */
-- 
MST


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH v2 04/32] ia64: reuse asm-generic/barrier.h

2015-12-31 Thread Michael S. Tsirkin
On ia64 smp_rmb, smp_wmb, read_barrier_depends, smp_read_barrier_depends
and smp_store_mb() match the asm-generic variants exactly. Drop the
local definitions and pull in asm-generic/barrier.h instead.

This is in preparation to refactoring this code area.

Signed-off-by: Michael S. Tsirkin 
Acked-by: Tony Luck 
Acked-by: Arnd Bergmann 
---
 arch/ia64/include/asm/barrier.h | 10 ++
 1 file changed, 2 insertions(+), 8 deletions(-)

diff --git a/arch/ia64/include/asm/barrier.h b/arch/ia64/include/asm/barrier.h
index 209c4b8..2f93348 100644
--- a/arch/ia64/include/asm/barrier.h
+++ b/arch/ia64/include/asm/barrier.h
@@ -48,12 +48,6 @@
 # define smp_mb()  barrier()
 #endif
 
-#define smp_rmb()  smp_mb()
-#define smp_wmb()  smp_mb()
-
-#define read_barrier_depends() do { } while (0)
-#define smp_read_barrier_depends() do { } while (0)
-
 #define smp_mb__before_atomic()barrier()
 #define smp_mb__after_atomic() barrier()
 
@@ -77,12 +71,12 @@ do {
\
___p1;  \
 })
 
-#define smp_store_mb(var, value) do { WRITE_ONCE(var, value); smp_mb(); } 
while (0)
-
 /*
  * The group barrier in front of the rsm & ssm are necessary to ensure
  * that none of the previous instructions in the same group are
  * affected by the rsm/ssm.
  */
 
+#include <asm-generic/barrier.h>
+
 #endif /* _ASM_IA64_BARRIER_H */
-- 
MST


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH v2 00/34] arch: barrier cleanup + barriers for virt

2015-12-31 Thread Michael S. Tsirkin
Changes since v1:
- replaced my asm-generic patch with an equivalent patch already in tip
- add wrappers with virt_ prefix for better code annotation,
  as suggested by David Miller
- dropped XXX in patch names as this makes vger choke, Cc all relevant
  mailing lists on all patches (not personal email, as the list becomes
  too long then)

I parked this in vhost tree for now, but the inclusion of patch 1 from tip
creates a merge conflict (even though it's easy to resolve).
Would tip maintainers prefer merging it through tip tree instead
(including the virtio patches)?
Or should I just merge it all through my tree, including the
duplicate patch, and assume conflict will be resolved?
If the second, acks will be appreciated.

Thanks!

This is really trying to clean up some virt code, as suggested by Peter, who
said
> You could of course go fix that instead of mutilating things into
> sort-of functional state.

This work is needed for virtio, so it's probably easiest to
merge it through my tree - is this fine by everyone?
Arnd, if you agree, could you ack this please?

Note to arch maintainers: please don't cherry-pick patches out of this patchset
as it's been structured in this order to avoid breaking bisect.
Please send acks instead!

Sometimes, virtualization is weird. For example, virtio does this 
(conceptually):

#ifdef CONFIG_SMP
smp_mb();
#else
mb();
#endif

Similarly, Xen calls mb() when it's not doing any MMIO at all.

Of course it's wrong in the sense that it's suboptimal. What we would really
like is to have, on UP, exactly the same barrier as on SMP.  This is because a
UP guest can run on an SMP host.

But Linux doesn't provide this ability: if CONFIG_SMP is not defined, it
optimizes most barriers out to a compiler barrier.

Consider for example x86: what we want is xchg (NOT mfence - there's no real IO
going on here - just switching out of the VM - more like a function call
really) but if built without CONFIG_SMP smp_store_mb does not include this.

Virt in general is probably the only use-case, because this really is an
artifact of interfacing with an SMP host while running a UP kernel,
but since we have (at least) two users, it seems to make sense to
put these APIs in a central place.

In fact, smp_ barriers are stubs on !SMP, so they can be defined as follows:

arch/XXX/include/asm/barrier.h:

#define __smp_mb() DOSOMETHING

include/asm-generic/barrier.h:

#ifdef CONFIG_SMP
#define smp_mb() __smp_mb()
#else
#define smp_mb() barrier()
#endif

This has the benefit of cleaning out a bunch of duplicated
ifdefs on a bunch of architectures - this patchset brings
about a net reduction in LOC, even with new barriers and extra documentation :)

Then virt can use __smp_XXX when talking to an SMP host.
To make those users explicit, this patchset adds virt_xxx wrappers
for them.
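
In asm-generic/barrier.h the wrappers end up as a thin layer over the new
low-level macros, roughly (a sketch; see patch 28 for the actual
definitions):

#define virt_mb()  __smp_mb()
#define virt_rmb() __smp_rmb()
#define virt_wmb() __smp_wmb()
#define virt_store_mb(var, value) __smp_store_mb(var, value)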

Touching all archs is a tad tedious, but it's fairly straightforward.

The rest of the patchset is structured as follows:


-. Patch 1 fixes a bug in asm-generic.
   It is already in tip, included here for completeness.

-. Patches 2-12 make sure barrier.h on all remaining
   architectures includes asm-generic/barrier.h:
   after the change in Patch 1, code there matches
   asm-generic/barrier.h almost verbatim.
   Minor code tweaks were required in a couple of places.
   Macros duplicated from asm-generic/barrier.h are dropped
   in the process.

After all that preparatory work, we are getting to the actual change.

-. Patches 13 adds generic smp_XXX wrappers in asm-generic
   these select __smp_XXX or barrier() depending on CONFIG_SMP

-. Patches 14-27 change all architectures to
   define __smp_XXX macros; the generic code in asm-generic/barrier.h
   then defines smp_XXX macros

   I compiled the affected arches before and after the changes,
   dumped the .text section (using objdump -O binary) and
   made sure that the object code is exactly identical
   before and after the change.
   I couldn't fully build sh,tile,xtensa but I did this test
   kernel/rcu/tree.o kernel/sched/wait.o and
   kernel/futex.o and tested these instead.

Unfortunately, I don't have a metag cross-build toolset ready.
Hoping for some acks on this architecture.

Finally, the following patches put the __smp_xxx APIs to work for virt:

-. Patch 28 adds virt_ wrappers for __smp_, and documents them.
   After all this work, this requires very few lines of code in
   the generic header.

-. Patches 29,30,33,34 convert virtio xen drivers to use the virt_xxx APIs

   xen patches are untested
   virtio ones have been tested on x86

-. Patches 31-32 teach virtio to use virt_store_mb
   sh architecture was missing a 2-byte smp_store_mb,
   the fix is trivial although my code is not optimal:
   if anyone cares, pls send me a patch to apply on top.
   I didn't build this architecture, but intel's 0-day
   infrastructure builds it.

   tested on x86


Davidlohr Bueso (1):
  

[Xen-devel] What is Ganeti?

2015-12-31 Thread Jason Long
Hello.
Can anyone tell me about "Ganeti" ? Is it a replacement for Xen?

Cheers.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH] svm: rephrase local variable use for Coverity.

2015-12-31 Thread Joshua Otto
Coverity CID 1343310

No functional changes.

Signed-off-by: Joshua Otto 
---
On Mon, Dec 28, 2015 at 09:34:28AM +, Andrew Cooper wrote:
> The error message isn't fantastic, but the complaint that Coverity
> has is that we store intr here, then unilaterally store it again
> slightly lower in the function, no matter what value it had (with
> the early return presumably not being taken into account).
>
> The error would probably be resolved if lines 95 and 96 turned into
> "if ( vmcb_get_vintr(gvmcb).fields.irq )"

This patch implements that change - as a general rule, do maintainers
prefer to resolve false positives like this by suppressing them in the
tool, or through code changes like this one?

 xen/arch/x86/hvm/svm/intr.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/xen/arch/x86/hvm/svm/intr.c b/xen/arch/x86/hvm/svm/intr.c
index bd94731..240eb35 100644
--- a/xen/arch/x86/hvm/svm/intr.c
+++ b/xen/arch/x86/hvm/svm/intr.c
@@ -92,8 +92,7 @@ static void svm_enable_intr_window(struct vcpu *v, struct 
hvm_intack intack)
  * return here or l2 guest looses interrupts, otherwise.
  */
 ASSERT(gvmcb != NULL);
-intr = vmcb_get_vintr(gvmcb);
-if ( intr.fields.irq )
+if ( vmcb_get_vintr(gvmcb).fields.irq )
 return;
 }
 }
-- 
1.9.1


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel