Re: [Xen-devel] [Xen-Devel] Enabling IRQ Crossbar (Secondary Interrupt Controller) Support

2015-07-21 Thread Brandon Perez

On 07/21/2015 05:41 AM, Julien Grall wrote:



On 21/07/2015 00:17, Brandon Perez wrote:

Hello All,


Hi Brandon,

We use to send one mail by patch rather than sending them as an
attachment of a single email. It's easier for reviewing the patches.
You also need to add you Signed-off-by on each patch and CC all the
relevant maintainers. Please see [1] for all the guidelines to submit a
patch to Xen.

A couple of comments I about this series:
 - Patch #2: You are allowing any guest to do smc which, unless you
trust all the guest, is unsecure. There was some discussion about
different solution to handle SMC back in 2013 [2]. So far I didn't see
any more update on it. It may be worth to send a separate thread about
how to handle SMC.
 - Patch #3, I can't find any documentation or implementation of the
property default-mapping in Linux. Can you provide a link about it?

I will comment more when you will resend the patches inline.

Regards,

[1] http://wiki.xenproject.org/wiki/Submitting_Xen_Project_Patches
[2] http://lists.xen.org/archives/html/xen-devel/2013-07/msg02779.html



Hi Julien,

I'm not sure that these patches are quite ready yet to be put into 
the Xen repo. For one, these patches don't solve the problem of the 
crossbar as outlined in the thread this came from [1]. Also, I haven't 
had a chance to clean up these patches yet. I've provided these patches 
at the request of Ian, so that you guys could see the changes I have 
made so far, and we could discuss what changes would be needed for the 
crossbar.


I agree about Patch #2, that makes sense, this was a workaround 
I've been using for now. Perhaps a check could be added to see if the 
domain is the privileged domain?


That's correct, the default-mapping property is not standard. 
It's another workaround that I'm working with for now. The interrupts 
property is going to contain the crossbar input number, not the actual 
SPI GIC line, so I needed a way to convey this to Xen.


The patches are inlined below:

Patch #1:

From f2bf190255c8f872d15063d7f8a6382c279e312d Mon Sep 17 00:00:00 2001
From: Brandon Perez bpere...@ti.com
Date: Mon, 20 Jul 2015 17:56:49 -0400
Subject: DRA7: Add specific mappings for devices/regions not in the 
device tree.


The DRA7 chip, which is similar to the OMAP5 chip, also requires 
specific mappings. These are MMIO mappings which are not explicitly 
stated in the device tree, so Xen does not know to map them. This patch 
adds these regions required by the DRA7 to be mapped.


Signed-off-by: Brandon Perez bpere...@ti.com

---
 xen/arch/arm/platforms/omap5.c|   27 +++
 xen/include/asm-arm/platforms/omap5.h |3 +++
 2 files changed, 30 insertions(+)

diff --git a/xen/arch/arm/platforms/omap5.c b/xen/arch/arm/platforms/omap5.c
index e7bf30d..3c6495a 100644
--- a/xen/arch/arm/platforms/omap5.c
+++ b/xen/arch/arm/platforms/omap5.c
@@ -120,6 +120,32 @@ static int omap5_specific_mapping(struct domain *d)
 return 0;
 }

+/* Additional mappings for dom0 (not in the DTS) */
+static int dra7_specific_mapping(struct domain *d)
+{
+/* Map the PRM module */
+map_mmio_regions(d, paddr_to_pfn(OMAP5_PRM_BASE), 2,
+ paddr_to_pfn(OMAP5_PRM_BASE));
+
+/* Map the PRM_MPU */
+map_mmio_regions(d, paddr_to_pfn(OMAP5_PRCM_MPU_BASE), 1,
+ paddr_to_pfn(OMAP5_PRCM_MPU_BASE));
+
+/* Map the Wakeup Gen */
+map_mmio_regions(d, paddr_to_pfn(OMAP5_WKUPGEN_BASE), 1,
+ paddr_to_pfn(OMAP5_WKUPGEN_BASE));
+
+/* Map the on-chip SRAM */
+map_mmio_regions(d, paddr_to_pfn(OMAP5_SRAM_PA), 32,
+ paddr_to_pfn(OMAP5_SRAM_PA));
+
+/* Map GPMC address space for NAND flash. */
+map_mmio_regions(d, paddr_to_pfn(OMAP5_GPMC_PA), 65536,
+ paddr_to_pfn(OMAP5_GPMC_PA));
+
+return 0;
+}
+
 static int __init omap5_smp_init(void)
 {
 void __iomem *wugen_base;
@@ -171,6 +197,7 @@ PLATFORM_START(dra7, TI DRA7)
 .init_time = omap5_init_time,
 .cpu_up = cpu_up_send_sgi,
 .smp_init = omap5_smp_init,
+.specific_mapping = dra7_specific_mapping,

 .dom0_gnttab_start = 0x4b00,
 .dom0_gnttab_size = 0x2,
diff --git a/xen/include/asm-arm/platforms/omap5.h 
b/xen/include/asm-arm/platforms/omap5.h

index c559c84..d87e7d2 100644
--- a/xen/include/asm-arm/platforms/omap5.h
+++ b/xen/include/asm-arm/platforms/omap5.h
@@ -20,6 +20,9 @@
 #define OMAP_AUX_CORE_BOOT_0_OFFSET 0x800
 #define OMAP_AUX_CORE_BOOT_1_OFFSET 0x804

+#define OMAP5_GPMC_PA   0x0100
+#define OMAP5_TILER_PA  0x6000
+
 #endif /* __ASM_ARM_PLATFORMS_OMAP5_H */

 /*
--
1.7.9.5

Patch #2:

From e53fdc1ea750dd3143e2d7cd62a5d38eb446afde Mon Sep 17 00:00:00 2001
From: Brandon Perez bpere...@ti.com
Date: Mon, 20 Jul 2015 17:58:24 -0400
Subject: Traps: Enable pass-through SMC call support for guest OS's.


Re: [Xen-devel] [PATCH OSSTEST v2 06/13] toolstack/libvirt: guest migrate, save and restore support

2015-07-21 Thread Wei Liu
On Mon, Jul 13, 2015 at 12:23:52PM +0100, Ian Campbell wrote:
 On Sun, 2015-07-12 at 17:20 +0100, Wei Liu wrote:
 
 Perhaps the libvirt part of the check_for_command stuff ought to be
 moved here? Otherwise we are claiming support before the code is
 actually willing to try to do so.

I move this patch before toolstack: distinguish local and remote
migration support (the patch that claims support). It should be fine
now.

Wei.

 
  Signed-off-by: Wei Liu wei.l...@citrix.com
  Cc: Ian Campbell ian.campb...@citrix.com
  Cc: Ian Jackson ian.jack...@eu.citrix.com
  Acked-by: Ian Campbell ian.campb...@citrix.com
  ---
   Osstest/Toolstack/libvirt.pm | 11 ---
   1 file changed, 8 insertions(+), 3 deletions(-)
  
  diff --git a/Osstest/Toolstack/libvirt.pm b/Osstest/Toolstack/libvirt.pm
  index ddf84df..3dc1856 100644
  --- a/Osstest/Toolstack/libvirt.pm
  +++ b/Osstest/Toolstack/libvirt.pm
  @@ -105,17 +105,22 @@ sub saverestore_check ($) {
   
   sub migrate ($) {
   my ($self,$gho,$dst,$timeout) = @_;
  -die Migration is not yet supported on libvirt.;
  +my $ho = $self-{Host};
  +my $gn = $gho-{Name};
  +target_cmd_root($ho, virsh migrate $gn $dst, $timeout);
   }
   
   sub save () {
   my ($self,$gho,$f,$timeout) = @_;
  -die Save is not yet supported on libvirt.;
  +my $ho = $self-{Host};
  +my $gn = $gho-{Name};
  +target_cmd_root($ho, virsh save $gn $f, $timeout);
   }
   
   sub restore () {
   my ($self,$gho,$f,$timeout) = @_;
  -die Restore is not yet supported on libvirt.;
  +my $ho = $self-{Host};
  +target_cmd_root($ho, virsh restore $f, $timeout);
   }
   
   1;
 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH for-4.6 3/3] hotplug/FreeBSD: fix xendriverdomain rc.d script

2015-07-21 Thread Ian Campbell
On Tue, 2015-07-21 at 12:03 +0100, Wei Liu wrote:
 On Mon, Jul 20, 2015 at 04:55:02PM +0200, Roger Pau Monne wrote:
  hotplugpath.sh by default is located in /usr/local/etc/xen/scripts 
  on
  FreeBSD. Instead of hardcoding it's location use the XEN_SCRIPT_DIR 
  variable
  like it's used on the xencommons rc.d script.
  
  Signed-off-by: Roger Pau Monné roger@citrix.com
  Cc: Ian Jackson ian.jack...@eu.citrix.com
  Cc: Ian Campbell ian.campb...@citrix.com
  Cc: Wei Liu wei.l...@citrix.com
 
 Acked-by: Wei Liu wei.l...@citrix.com

Me too, plus applied.

Thanks Roger, sorry for the breakage.


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH for-4.6 1/3] libxl: include sys/endian.h for FreeBSD

2015-07-21 Thread Ian Campbell
On Tue, 2015-07-21 at 10:36 +0100, Wei Liu wrote:
 On Mon, Jul 20, 2015 at 04:55:00PM +0200, Roger Pau Monne wrote:
  be64toh and friends are declared in sys/endian.h on FreeBSD, so 
  include it
  as part of libxl_osdeps.h.
  
  Signed-off-by: Roger Pau Monné roger@citrix.com
  Cc: Ian Jackson ian.jack...@eu.citrix.com
  Cc: Ian Campbell ian.campb...@citrix.com
  Cc: Wei Liu wei.l...@citrix.com
 
 Acked-by: Wei Liu wei.l...@citrix.com

Applied.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [v10][PATCH 11/16] tools/libxl: detect and avoid conflicts with RDM

2015-07-21 Thread Chen, Tiejun

Just to your example,

libxl_domain_config cfg;
cfg.stuff = blah;
cfg.rdm.strategy = HOST;

libxl_domain_create_new(cfg, domid);
libxl_domain_destroy(domid);

Here looks you mean d_config-rdms would be changed, right? Currently
this shouldn't be allowed. But I think we need to further discussion
make this case clear after feature freeze since we didn't have this kind
of assumption in our previous design.

libxl_domain_create_new(cfg, domid);


This response of yours does not lead me to think you have understood
what I am saying, but I agree that this can be dealt with later (if


Indeed, I'm not a fan to Xen tools so I can't picture what this real 
scenario would happen. So if I'm misunderstanding what you mean, just 
please correct me. Or if you still think its hard to explain this to me, 
just tell me what I should do. I think this make your life easy.


Thanks
Tiejun


indeed it needs to be dealt with at all).

Ian.



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH v3 2/2] xen: sched/cpupool: properly update affinity when removing a cpu from a cpupool

2015-07-21 Thread Dario Faggioli
And this time, do it right. In fact, a similar change was
attempted in 93be8285a79c6 (cpupools: update domU's node-affinity
on the cpupool_unassign_cpu() path). But that was buggy, and got
reverted with 8395b67ab0b8a86.

However, even though reverting was the right thing to do, it
remains true that:
 - calling the function is better done in the cpupool cpu removal
   code, even if just for simmetry with the cpupool cpu adding path;
 - it is not necessary to call it during cpu teardown (for suspend
   or shutdown) code as we either are going down and will never
   come up (shutdown) or, when coming up, we want everything to be
   as before the tearing down process started, and so we would just
   undo any update made during the process.
 - calling it from the teardown path is not only unnecessary, but
   it can trigger an ASSERT(), in case we get, during the process,
   to remove the last online pcpu of a domain's node affinity:

  (XEN) Assertion '!cpumask_empty(dom_cpumask)' failed at domain.c:466
  (XEN) [ Xen-4.6-unstable  x86_64  debug=y  Tainted:C ]
  ... ... ...
  (XEN) Xen call trace:
  (XEN)[82d0801055b9] domain_update_node_affinity+0x113/0x240
  (XEN)[82d08012e676] cpu_disable_scheduler+0x334/0x3f2
  (XEN)[82d08018bb8d] __cpu_disable+0x313/0x36e
  (XEN)[82d080101424] take_cpu_down+0x34/0x3b
  (XEN)[82d080130ad9] stopmachine_action+0x70/0x99
  (XEN)[82d08013274f] do_tasklet_work+0x78/0xab
  (XEN)[82d080132a85] do_tasklet+0x5e/0x8a
  (XEN)[82d08016478c] idle_loop+0x56/0x6b
  (XEN)
  (XEN)
  (XEN) 
  (XEN) Panic on CPU 12:
  (XEN) Assertion '!cpumask_empty(dom_cpumask)' failed at domain.c:466
  (XEN) 

Therefore, for all these reasons, move the call from
cpu_disable_schedule() to cpupool_unassign_cpu_helper().

While there, add some sanity checking (in the latter function), and
make sure that scanning the domain list is done with domlist_read_lock
held, at least when the system is 'live'.

I re-tested the scenario described in here:
 http://permalink.gmane.org/gmane.comp.emulators.xen.devel/235310

which is what led to the revert of 93be8285a79c6, and that is
working ok after this commit.

Signed-off-by: dario.faggi...@citrix.com
Acked-by: George Dunlap george.dun...@eu.citrix.com
Acked-by: Juergen Gross jgr...@suse.com
---
 xen/common/cpupool.c  |   18 ++
 xen/common/schedule.c |7 ++-
 2 files changed, 24 insertions(+), 1 deletion(-)

diff --git a/xen/common/cpupool.c b/xen/common/cpupool.c
index b48ae17..69b984c 100644
--- a/xen/common/cpupool.c
+++ b/xen/common/cpupool.c
@@ -297,12 +297,25 @@ static int cpupool_assign_cpu_locked(struct cpupool *c, 
unsigned int cpu)
 static long cpupool_unassign_cpu_helper(void *info)
 {
 int cpu = cpupool_moving_cpu;
+struct cpupool *c = info;
+struct domain *d;
 long ret;
 
 cpupool_dprintk(cpupool_unassign_cpu(pool=%d,cpu=%d)\n,
 cpupool_cpu_moving-cpupool_id, cpu);
 
 spin_lock(cpupool_lock);
+if ( c != cpupool_cpu_moving )
+{
+ret = -EBUSY;
+goto out;
+}
+
+/*
+ * We need this for scanning the domain list, both in
+ * cpu_disable_scheduler(), and at the bottom of this function.
+ */
+rcu_read_lock(domlist_read_lock);
 ret = cpu_disable_scheduler(cpu);
 cpumask_set_cpu(cpu, cpupool_free_cpus);
 if ( !ret )
@@ -319,6 +332,11 @@ static long cpupool_unassign_cpu_helper(void *info)
 cpupool_cpu_moving = NULL;
 }
 
+for_each_domain_in_cpupool(d, c)
+{
+domain_update_node_affinity(d);
+}
+rcu_read_unlock(domlist_read_lock);
 out:
 spin_unlock(cpupool_lock);
 cpupool_dprintk(cpupool_unassign_cpu ret=%ld\n, ret);
diff --git a/xen/common/schedule.c b/xen/common/schedule.c
index 89fc10a..1419064 100644
--- a/xen/common/schedule.c
+++ b/xen/common/schedule.c
@@ -650,6 +650,12 @@ int cpu_disable_scheduler(unsigned int cpu)
 if ( c == NULL )
 return ret;
 
+/*
+ * We'd need the domain RCU lock, but:
+ *  - when we are called from cpupool code, it's acquired there already;
+ *  - when we are called for CPU teardown, we're in stop-machine context,
+ *so that's not be a problem.
+ */
 for_each_domain_in_cpupool ( d, c )
 {
 for_each_vcpu ( d, v )
@@ -735,7 +741,6 @@ int cpu_disable_scheduler(unsigned int cpu)
 ret = -EAGAIN;
 }
 }
-domain_update_node_affinity(d);
 }
 
 return ret;


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH v3 1/2] xen: sched: reorganize cpu_disable_scheduler()

2015-07-21 Thread Dario Faggioli
The function is called both when we want to remove a cpu
from a cpupool, and during cpu teardown, for suspend or
shutdown. If, however, the boot cpu (cpu 0, most of the
times) is not present in the default cpupool, during
suspend or shutdown, Xen crashes like this:

  root@Zhaman:~# xl cpupool-cpu-remove Pool-0 0
  root@Zhaman:~# shutdown -h now
  (XEN) [ Xen-4.6-unstable  x86_64  debug=y  Tainted:C ]
  ...
  (XEN) Xen call trace:
  (XEN)[82d0801238de] _csched_cpu_pick+0x156/0x61f
  (XEN)[82d080123db5] csched_cpu_pick+0xe/0x10
  (XEN)[82d08012de3c] vcpu_migrate+0x18e/0x321
  (XEN)[82d08012e4f8] cpu_disable_scheduler+0x1cf/0x2ac
  (XEN)[82d08018bb8d] __cpu_disable+0x313/0x36e
  (XEN)[82d080101424] take_cpu_down+0x34/0x3b
  (XEN)[82d08013097a] stopmachine_action+0x70/0x99
  (XEN)[82d0801325f0] do_tasklet_work+0x78/0xab
  (XEN)[82d080132926] do_tasklet+0x5e/0x8a
  (XEN)[82d08016478c] idle_loop+0x56/0x6b
  (XEN)
  (XEN)
  (XEN) 
  (XEN) Panic on CPU 15:
  (XEN) Assertion 'cpu  nr_cpu_ids' failed at 
...URCES/xen/xen/xen.git/xen/include/xen/cpumask.h:97
  (XEN) 

There also are problems when we try to suspend or shutdown
with a cpupool configured with just one cpu (no matter, in
this case, whether that is the boot cpu or not):

  root@Zhaman:~# xl create /etc/xen/test.cfg
  root@Zhaman:~# xl cpupool-migrate test Pool-1
  root@Zhaman:~# xl cpupool-list -c
  Name   CPU list
  Pool-0 0,1,2,3,4,5,6,7,8,9,10,11,13,14,15
  Pool-1 12
  root@Zhaman:~# shutdown -h now
  (XEN) [ Xen-4.6-unstable  x86_64  debug=y  Tainted:C ]
  (XEN) CPU:12
  ...
  (XEN) Xen call trace:
  (XEN)[82d08018bb91] __cpu_disable+0x317/0x36e
  (XEN)[82d080101424] take_cpu_down+0x34/0x3b
  (XEN)[82d08013097a] stopmachine_action+0x70/0x99
  (XEN)[82d0801325f0] do_tasklet_work+0x78/0xab
  (XEN)[82d080132926] do_tasklet+0x5e/0x8a
  (XEN)[82d08016478c] idle_loop+0x56/0x6b
  (XEN)
  (XEN)
  (XEN) 
  (XEN) Panic on CPU 12:
  (XEN) Xen BUG at smpboot.c:895
  (XEN) 

In both cases, the problem is the scheduler not being able
to:
 - move all the vcpus to the boot cpu (as the boot cpu is
   not in the cpupool), in the former;
 - move the vcpus away from a cpu at all (as that is the
   only one cpu in the cpupool), in the latter.

Solution is to distinguish, inside cpu_disable_scheduler(),
the two cases of cpupool manipulation and teardown. For
cpupool manipulation, it is correct to ask the scheduler to
take an action, as pathological situation (like there not
being any cpu in the pool where to send vcpus) are taken
care of (i.e., forbidden!) already. For suspend and shutdown,
we don't want the scheduler to be involved at all, as the
final goal is pretty simple: send all the vcpus to the
boot cpu ASAP, so we just go for it.

Signed-off-by: Dario Faggioli dario.faggi...@citrix.com
---
Changes from v2:
 * add a missing spin_unlock, most likely eaten by a
   forgotten `stg refresh' (sorry!)
 * fix a typo

Changes from v1:
 * BUG_ON() if, in the suspend/shutdown case, the mask of
   online pCPUs will ever get empty, as suggested during
   review;
 * reorganize and improve comments inside cpu_disable_scheduler()
   as suggested during review;
 * make it more clear that vcpu_move_nosched() (name
   changed, as suggested during review), should only be
   called from quite contextes, such us, during suspend
   or shutdown. Do that via both comments and asserts,
   as requested during review;
 * reorganize cpu_disable_scheduler() and vcpu_move_nosched()
   so that calling to sleep and wakeup functions are only
   called when necessary (i.e., *not* in case we are
   suspending/shutting down, as requested during review.
---
Cc: George Dunlap george.dun...@eu.citrix.com
Cc: Juergen Gross jgr...@suse.com
---
 xen/common/schedule.c |  104 +
 1 file changed, 88 insertions(+), 16 deletions(-)

diff --git a/xen/common/schedule.c b/xen/common/schedule.c
index df8c1d0..89fc10a 100644
--- a/xen/common/schedule.c
+++ b/xen/common/schedule.c
@@ -454,9 +454,10 @@ void vcpu_unblock(struct vcpu *v)
  * Do the actual movement of a vcpu from old to new CPU. Locks for *both*
  * CPUs needs to have been taken already when calling this!
  */
-static void vcpu_move(struct vcpu *v, unsigned int old_cpu,
-  unsigned int new_cpu)
+static void vcpu_move_locked(struct vcpu *v, unsigned int new_cpu)
 {
+unsigned int old_cpu = v-processor;
+
 /*
  * Transfer urgency status to new CPU before switching CPUs, as
  * once the switch occurs, v-is_urgent is no longer protected by
@@ -478,6 +479,33 @@ static void vcpu_move(struct vcpu *v, unsigned int old_cpu,
 

Re: [Xen-devel] [PATCH v4 3/3] libxl: call libxl_cpupoolinfo_{init, dispose} in numa_place_domain

2015-07-21 Thread Ian Campbell
On Fri, 2015-07-17 at 18:03 +0100, Ian Jackson wrote:
 Wei Liu writes ([PATCH v4 3/3] libxl: call 
 libxl_cpupoolinfo_{init,dispose} in numa_place_domain):
  Call libxl_cpupoolinfo_init at the beginning.  Change two returns 
  to
  goto out so that libxl_cpupoolinfo_dispose is called in failure 
  path.
 
 Acked-by: Ian Jackson ian.jack...@eu.citrix.com

Applied.



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] xen: arm: Avoid reading beyond the last module

2015-07-21 Thread Ian Campbell
On Mon, 2015-07-20 at 13:32 +0100, Julien Grall wrote:
 Hi Chris,
 
 On 17/07/15 21:48, Chris (Christopher) Brand wrote:
  nr_mods is set in add_boot_module() to the number of module
  array elements used. This function also ensures that nr_mods
  never exceeds MAX_MODULES (the size of the array). When looping
  through the array, the correct maximum index is nr_mods-1,
  not nr_mods. If the array is full, using the latter will in
  fact access beyond the end of the array.
  This was done correctly in boot_module_find_by_kind() and
  consider_modules() but incorrectly in discard_initial_modules()
  and next_module().
  
  Signed-off-by: Chris Brand chris.br...@broadcom.com
 
 Reviewed-by: Julien Grall julien.gr...@citrix.com

Acked + applied.

Care should be taken when backporting since I think this off-by-one was
the result of us previously not including Xen in nr_mods despite it
being in the array or something like that (i..e the off-by-one used to
be correct).

Ian.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 0/4] Fix to libxl migration v2 issue blocking OSSTest

2015-07-21 Thread Ian Campbell
On Tue, 2015-07-21 at 15:51 +0100, Ian Campbell wrote:
 On Tue, 2015-07-21 at 13:49 +0100, Wei Liu wrote:
  On Mon, Jul 20, 2015 at 11:11:28AM +0100, Andrew Cooper wrote:
   On 20/07/15 10:56, Wei Liu wrote:
On Fri, Jul 17, 2015 at 05:51:14PM +0100, Andrew Cooper wrote:
 And three improvements to debugging.
 
 Note that there is still a bug in libxl__toolstack_save() 
 which
 valgrind identified, but I do not wish to block this bugfix 
 on 
 that
 
 Andrew Cooper (4):
   tools/libxc: Identify the path of the kernel image which 
 cannot be
 found
   tools/libxl: Log the subject fd in datacopier messages
   tools/libxl: Identify copywhat in stream v2 datacopiers
I think all three patches should wait until next development 
window
opens unless we have nothing else in our queue (which doesn't 
seem to be
the case at the moment).
   
   You mean delay until 4.7? I disagree.  Without these fixes, 
   debugging
   issues is substantially harder than they need to be.
   
   They literally are only adding extra information into existing 
   error
   messages.
   
  
  Well I am expecting two to three big series getting applied soon, 
  any
  changes that gets applied now has the chance of forcing those 
  series 
  to
  be rebased.
 
 Wei and I discussed this IRL, the concern was the outstanding colopre
 patches.
 
 However I did a test apply on top of  
 https://github.com/macrosheep/xen.git#colo/colo-v9 (the latest 
 colopre)
 and there were no rejects due to the remus refactoring.
 
 There were rejects because I already applied 4/4 on Friday, i.e. they
 were the inverse of what I fixed up then.
 
 So given the lack of interaction with colopre Wei gave me permission 
 to
 go ahead, so I have applied patches 1..3.
 
 4 was applied already, of course.

In doing this I managed to revert part of #4, thanks to Andy for
noticing so promptly.

I've pushed the following:

From 1287ac109c44ca9b99eb642316d7af83b4081b52 Mon Sep 17 00:00:00 2001
From: Ian Campbell ian.campb...@citrix.com
Date: Tue, 21 Jul 2015 16:00:19 +0100
Subject: [PATCH] tools: libxl: Refix Initialise the fd of the unused half of
 a datacopier

Applying the series out of order led to d72befc35f31 tools/libxl:
Identify copywhat in stream v2 datacopiers unintentionally reverting
part of 21d9b079e538 tools/libxl: Initialise the fd of the unused
half of a datacopier.

Put this back.

Reported-by: Andrew Cooper andrew.coop...@citrix.com
Signed-off-by: Ian Campbell ian.campb...@citrix.com
---
 tools/libxl/libxl_stream_read.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tools/libxl/libxl_stream_read.c b/tools/libxl/libxl_stream_read.c
index 3e1cd2a..32a3551 100644
--- a/tools/libxl/libxl_stream_read.c
+++ b/tools/libxl/libxl_stream_read.c
@@ -611,6 +611,7 @@ static void write_emulator_blob(libxl__egc *egc,
 dc-writewhat  = qemu save file;
 dc-copywhat   = restore v2 stream;
 dc-writefd= writefd;
+dc-readfd = -1;
 dc-maxsz  = -1;
 dc-callback   = write_emulator_done;
 
-- 
2.1.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [v10][PATCH 11/16] tools/libxl: detect and avoid conflicts with RDM

2015-07-21 Thread Ian Jackson
Chen, Tiejun writes (Re: [v10][PATCH 11/16] tools/libxl: detect and avoid 
conflicts with RDM):
 +static void
 +add_rdm_entry(libxl__gc *gc, libxl_domain_config *d_config,
 +  uint64_t rdm_start, uint64_t rdm_size, int rdm_policy)
 +{
 +d_config-num_rdms++;
 +d_config-rdms = libxl__realloc(NOGC, d_config-rdms,
 +d_config-num_rdms * sizeof(libxl_device_rdm));
 +
 +d_config-rdms[d_config-num_rdms - 1].start = rdm_start;
 +d_config-rdms[d_config-num_rdms - 1].size = rdm_size;
 +d_config-rdms[d_config-num_rdms - 1].policy = rdm_policy;
 +}


But, I wrote:

   Can I suggest a function

  void add_rdm_entry(libxl__gc *gc, libxl_domain_config *d_config,
uint64_t rdm_start, uint64_t rdm_size, int rdm_policy)

   which assumes that d_config-num_rdms is set correctly, and increments
   it ?

   (Please put the increment at the end so that the assignments are to
   -rdms[d_config-num_rdms], or perhaps make a convenience alias.)

Note the last paragraph.

This is now the third time I have posted that text.  It is the fifth
request or clarification I have had to make about this very small
area.  I have to say that I'm finding this rather frustrating.

Ian.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 0/4] Fix to libxl migration v2 issue blocking OSSTest

2015-07-21 Thread Ian Campbell
On Tue, 2015-07-21 at 13:49 +0100, Wei Liu wrote:
 On Mon, Jul 20, 2015 at 11:11:28AM +0100, Andrew Cooper wrote:
  On 20/07/15 10:56, Wei Liu wrote:
   On Fri, Jul 17, 2015 at 05:51:14PM +0100, Andrew Cooper wrote:
And three improvements to debugging.

Note that there is still a bug in libxl__toolstack_save() which
valgrind identified, but I do not wish to block this bugfix on 
that

Andrew Cooper (4):
  tools/libxc: Identify the path of the kernel image which 
cannot be
found
  tools/libxl: Log the subject fd in datacopier messages
  tools/libxl: Identify copywhat in stream v2 datacopiers
   I think all three patches should wait until next development 
   window
   opens unless we have nothing else in our queue (which doesn't 
   seem to be
   the case at the moment).
  
  You mean delay until 4.7? I disagree.  Without these fixes, 
  debugging
  issues is substantially harder than they need to be.
  
  They literally are only adding extra information into existing 
  error
  messages.
  
 
 Well I am expecting two to three big series getting applied soon, any
 changes that gets applied now has the chance of forcing those series 
 to
 be rebased.

Wei and I discussed this IRL, the concern was the outstanding colopre
patches.

However I did a test apply on top of
https://github.com/macrosheep/xen.git#colo/colo-v9 (the latest colopre)
and there were no rejects due to the remus refactoring.

There were rejects because I already applied 4/4 on Friday, i.e. they
were the inverse of what I fixed up then.

So given the lack of interaction with colopre Wei gave me permission to
go ahead, so I have applied patches 1..3.

4 was applied already, of course.

Ian.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [v10][PATCH 11/16] tools/libxl: detect and avoid conflicts with RDM

2015-07-21 Thread Chen, Tiejun

On 2015/7/21 23:09, Ian Jackson wrote:

Chen, Tiejun writes (Re: [v10][PATCH 11/16] tools/libxl: detect and avoid conflicts 
with RDM):

+static void
+add_rdm_entry(libxl__gc *gc, libxl_domain_config *d_config,
+  uint64_t rdm_start, uint64_t rdm_size, int rdm_policy)
+{
+d_config-num_rdms++;
+d_config-rdms = libxl__realloc(NOGC, d_config-rdms,
+d_config-num_rdms * sizeof(libxl_device_rdm));
+
+d_config-rdms[d_config-num_rdms - 1].start = rdm_start;
+d_config-rdms[d_config-num_rdms - 1].size = rdm_size;
+d_config-rdms[d_config-num_rdms - 1].policy = rdm_policy;
+}



But, I wrote:

Can I suggest a function

   void add_rdm_entry(libxl__gc *gc, libxl_domain_config *d_config,
 uint64_t rdm_start, uint64_t rdm_size, int rdm_policy)

which assumes that d_config-num_rdms is set correctly, and increments
it ?

(Please put the increment at the end so that the assignments are to
-rdms[d_config-num_rdms], or perhaps make a convenience alias.)

Note the last paragraph.

This is now the third time I have posted that text.  It is the fifth
request or clarification I have had to make about this very small
area.  I have to say that I'm finding this rather frustrating.


Sorry, I just ignore the line in brackets since I always think this kind 
of thing is often not a big deal, and next time I should pay more 
attention to the (). But indeed, before I post this whole patch online I 
also picked up this chunk of code to ask you to take a look that. This 
manner means I'm not very sure if I'm addressing this properly. But I 
didn't get a further response, so I guess that should work for you and 
then I posted the whole online.


Now back on our problem,

static void
add_rdm_entry(libxl__gc *gc, libxl_domain_config *d_config,
  uint64_t rdm_start, uint64_t rdm_size, int rdm_policy)
{
d_config-rdms = libxl__realloc(NOGC, d_config-rdms,
(d_config-num_rdms+1) * sizeof(libxl_device_rdm));

d_config-rdms[d_config-num_rdms].start = rdm_start;
d_config-rdms[d_config-num_rdms].size = rdm_size;
d_config-rdms[d_config-num_rdms].policy = rdm_policy;
d_config-num_rdms++;
}

Does this work for you? If I'm still wrong, please correct this function 
directly to cost you less.


Thanks
Tiejun

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH v3 0/2] xen: sched/cpupool: more fixing of (corner?) cases

2015-07-21 Thread Dario Faggioli
v2, which is here:
 http://lists.xen.org/archives/html/xen-devel/2015-07/msg03540.html

suffered from a missing stg refresh issue (or, at least,
that's my best guess).

Thanks Juergen for noticing and pointing that out, and
sorry for that.

This posting should be fine.

Regards,
Dario
---
Dario Faggioli (2):
  xen: sched: reorganize cpu_disable_scheduler()
  xen: sched/cpupool: properly update affinity when removing a cpu from a 
cpupool

 xen/common/cpupool.c  |   18 
 xen/common/schedule.c |  111 +
 2 files changed, 112 insertions(+), 17 deletions(-)
--
This happens because I choose it to happen! (Raistlin Majere)
-
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems RD Ltd., Cambridge (UK)

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH OSSTEST v2 08/13] ts-libvirt-build: run libvirt test suite

2015-07-21 Thread Ian Campbell
On Tue, 2015-07-21 at 17:22 +0100, Wei Liu wrote:
 On Mon, Jul 13, 2015 at 12:25:35PM +0100, Ian Campbell wrote:
  On Sun, 2015-07-12 at 17:20 +0100, Wei Liu wrote:
   We're interested in xlconfigtest.
   
   Signed-off-by: Wei Liu wei.l...@citrix.com
   Cc: Ian Campbell ian.campb...@citrix.com
   Cc: Ian Jackson ian.jack...@eu.citrix.com
   [...]
   + | tee ../libvirt-test-suite-log
  
  Should something be collecting/stashing that log file?
  
 
 Not sure.
 
  I think since it is logged to the output of the command it's 
  probably
  not needed, but I suppose you added the tee for a reason?
  
 
 I followed suite. There is tee log in build command. I can't seem 
 to
 find that log stashed anywhere though.
 
 What should I do about this?

Given the precedent, I think nothing.

Acked-by: Ian Campbell ian.campb...@citrix.com



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 1/5] xl: Command line: Adjust Fix segfaults from `xl psr-cat-cbm-set`...

2015-07-21 Thread Ian Campbell
On Tue, 2015-07-21 at 15:36 +0100, Wei Liu wrote:
 On Tue, Jul 21, 2015 at 03:27:26PM +0100, Ian Jackson wrote:
  Ian Campbell writes (Re: [PATCH 1/5] xl: Command line: Adjust Fix 
  segfaults from `xl psr-cat-cbm-set`...):
   On Fri, 2015-07-17 at 18:00 +0100, Ian Jackson wrote:
   
   Replying here in lieu of a 0/N:
   
   Is any subset of this aimed at 4.6?
  
  Yes, ideally, all of them.  I think they are bugfixes or minor
  cleanups.
  
  If Wei wants only a subset, I can probably tease them out.
  
 
 I think this is part of the audit of toolstack code.  They look
 obviously correct.  I'm fine with them going in.
 
 Currently we have three series with freeze exception. Only RMRR has
 significant number of changes for xl, however that series doens't
 introduce a new command, so we're probably safe to apply this series
 now.

I applied #1..3, #4 had a good suggestion for an improvement and #5
relies on that change so I left them in this pass.

Ian.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH for-4.6 3/3] xen: remove non-POSIX error codes

2015-07-21 Thread Roger Pau Monne
Xen was using some non-POSIX error codes that are removed in this patch. For
future reference, the list of POSIX error codes has been obtained from:

http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/errno.h.html

The error codes already present and defined as optional (XSR), have been
left in place.

Signed-off-by: Roger Pau Monné roger@citrix.com
Cc: Ian Campbell ian.campb...@citrix.com
Cc: Ian Jackson ian.jack...@eu.citrix.com
Cc: Jan Beulich jbeul...@suse.com
Cc: Tim Deegan t...@xen.org
---
 xen/include/public/errno.h | 5 -
 1 file changed, 5 deletions(-)

diff --git a/xen/include/public/errno.h b/xen/include/public/errno.h
index 9a7e411..3498727 100644
--- a/xen/include/public/errno.h
+++ b/xen/include/public/errno.h
@@ -54,8 +54,6 @@ XEN_ERRNO(EDEADLK,35) /* Resource deadlock would 
occur */
 XEN_ERRNO(ENAMETOOLONG,36) /* File name too long */
 XEN_ERRNO(ENOLCK,  37) /* No record locks available */
 XEN_ERRNO(ENOSYS,  38) /* Function not implemented */
-XEN_ERRNO(EBADRQC, 56) /* Invalid request code */
-XEN_ERRNO(EBADSLT, 57) /* Invalid slot */
 XEN_ERRNO(ENODATA, 61) /* No data available */
 XEN_ERRNO(ETIME,   62) /* Timer expired */
 XEN_ERRNO(EBADMSG, 74) /* Not a data message */
@@ -64,15 +62,12 @@ XEN_ERRNO(EILSEQ,   84) /* Illegal byte sequence */
 #ifdef __XEN__ /* Internal only, should never be exposed to the guest. */
 XEN_ERRNO(ERESTART,85) /* Interrupted system call should be restarted 
*/
 #endif
-XEN_ERRNO(EUSERS,  87) /* Too many users */
 XEN_ERRNO(EOPNOTSUPP,  95) /* Operation not supported on transport 
endpoint */
 XEN_ERRNO(EADDRINUSE,  98) /* Address already in use */
 XEN_ERRNO(EADDRNOTAVAIL, 99)   /* Cannot assign requested address */
 XEN_ERRNO(ENOBUFS, 105)/* No buffer space available */
 XEN_ERRNO(EISCONN, 106)/* Transport endpoint is already connected */
 XEN_ERRNO(ENOTCONN,107)/* Transport endpoint is not connected */
-XEN_ERRNO(ESHUTDOWN,   108)/* Cannot send after transport endpoint 
shutdown */
-XEN_ERRNO(ETOOMANYREFS,109)/* Too many references: cannot splice */
 XEN_ERRNO(ETIMEDOUT,   110)/* Connection timed out */
 
 #undef XEN_ERRNO
-- 
1.9.5 (Apple Git-50.3)


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [v10][PATCH 11/16] tools/libxl: detect and avoid conflicts with RDM

2015-07-21 Thread Ian Jackson
Ian Jackson writes (Re: [v10][PATCH 11/16] tools/libxl: detect and avoid 
conflicts with RDM):
 Chen, Tiejun writes (Re: [v10][PATCH 11/16] tools/libxl: detect and avoid 
 conflicts with RDM):
  Sorry, I just ignore the line in brackets since I always think this kind 
  of thing is often not a big deal, and next time I should pay more 
  attention to the (). But indeed, before I post this whole patch online I 
  also picked up this chunk of code to ask you to take a look that. This 
  manner means I'm not very sure if I'm addressing this properly. But I 
  didn't get a further response, so I guess that should work for you and 
  then I posted the whole online.
 
 You are talking about 55ae2bb1.9030...@intel.com I guess.  I replied
 to that with several comments about your prose and about the
 computation of the new set of rdms.
 
 It's true that I didn't comment on the frat that you had half-done one
 fact
 of the things I had requested.  It is of course a waste of my time to
 be constantly re-reviewing half-done changes.

Ian.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [Xen-Devel] Enabling IRQ Crossbar (Secondary Interrupt Controller) Support

2015-07-21 Thread Ian Campbell
On Tue, 2015-07-21 at 10:07 -0400, Brandon Perez wrote:
 
  I'm not sure that these patches are quite ready yet to be put 
 into 
 the Xen repo.

That's ok, but even for an RFC (Request For Comments) please post them
one patch per email in the manner of git send-email. You can use -
-subject-prefix='PATCH RFC' to tag them as such.

Thanks,
Ian.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH for-4.6 1/3] xen/x86/libxl: replace non-POSIX error codes used by PSR code

2015-07-21 Thread Jan Beulich
 On 21.07.15 at 17:56, roger@citrix.com wrote:
 PSR was using EBADSLT and EUSERS which are not POSIX error codes, replace
 them with EINVAL and EOVERFLOW respectively.

Considering that we use EINVAL for almost everything (well beyond
parameter checking I'm afraid), I don't think using this value for
something intended to yield a specific user mode error message is
really a good choice. Looking at the two respective hypervisor side
changes - how about e.g. using EDOM, EBADF, or ENXIO instead?

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH OSSTEST v2 09/13] ts-debian-hvm-install: stub out libvirt + ovmf / rombios

2015-07-21 Thread Wei Liu
On Mon, Jul 13, 2015 at 12:27:30PM +0100, Ian Campbell wrote:
 On Sun, 2015-07-12 at 17:20 +0100, Wei Liu wrote:
  Libvirt's configuration converter doesn't know how to deal with BIOS
  selection. The end result is it always use the default one (seabios).
  Stub out ovmf and rombios to avoid false positive results.
 
 It's worth mentioning here whether or not we expect to currently see
 such configurations in osstest today.
 

I don't expect to see those configurations any time soon. I will explain
this in commit message.

 If we do expect to see them then it would be good to filter them in
 make-flight to avoid wasting lots of test time.
 

The filtering is done in later patch.

This change is more like another level of safety in case we change
something in make-flight by mistake.

Wei.

  
  This restriction will be removed once libvirt's converter knows how to
  deal with BIOS selection.
  
  Signed-off-by: Wei Liu wei.l...@citrix.com
  Cc: Ian Campbell ian.campb...@citrix.com
  Cc: Ian Jackson ian.jack...@eu.citrix.com
  ---
   ts-debian-hvm-install | 7 +++
   1 file changed, 7 insertions(+)
  
  diff --git a/ts-debian-hvm-install b/ts-debian-hvm-install
  index f05b1a7..bd16506 100755
  --- a/ts-debian-hvm-install
  +++ b/ts-debian-hvm-install
  @@ -28,6 +28,13 @@ if (@ARGV  $ARGV[0] =~ m/^--stage(\d+)$/) { $stage=$1; 
  shift @ARGV; }
   
   defined($r{bios}) or die Need to define which bios to use;
   
  +# Libvirt doesn't know anything about bios. It will always use the
  +# default one (seabios). Stub out rombios and ovmf to avoid false
  +# positive results.
  +if ($r{bios} =~ m/ovmf|rombios/  $r{toolstack} eq 'libvirt') {
  +die libvirt + $r{bios} is not supported yet.;
  +}
  +
   our ($whhost,$gn) = @ARGV;
   $whhost ||= 'host';
   $gn ||= 'debianhvm';
 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH OSSTEST v2 08/13] ts-libvirt-build: run libvirt test suite

2015-07-21 Thread Wei Liu
On Mon, Jul 13, 2015 at 12:25:35PM +0100, Ian Campbell wrote:
 On Sun, 2015-07-12 at 17:20 +0100, Wei Liu wrote:
  We're interested in xlconfigtest.
  
  Signed-off-by: Wei Liu wei.l...@citrix.com
  Cc: Ian Campbell ian.campb...@citrix.com
  Cc: Ian Jackson ian.jack...@eu.citrix.com
  [...]
  + | tee ../libvirt-test-suite-log
 
 Should something be collecting/stashing that log file?
 

Not sure.

 I think since it is logged to the output of the command it's probably
 not needed, but I suppose you added the tee for a reason?
 

I followed suite. There is tee log in build command. I can't seem to
find that log stashed anywhere though.

What should I do about this?

Wei.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [v10][PATCH 11/16] tools/libxl: detect and avoid conflicts with RDM

2015-07-21 Thread Chen, Tiejun

On 2015/7/21 23:57, Ian Jackson wrote:

Chen, Tiejun writes (Re: [v10][PATCH 11/16] tools/libxl: detect and avoid conflicts 
with RDM):

Sorry, I just ignore the line in brackets since I always think this kind
of thing is often not a big deal, and next time I should pay more
attention to the (). But indeed, before I post this whole patch online I
also picked up this chunk of code to ask you to take a look that. This
manner means I'm not very sure if I'm addressing this properly. But I
didn't get a further response, so I guess that should work for you and
then I posted the whole online.


You are talking about 55ae2bb1.9030...@intel.com I guess.  I replied
to that with several comments about your prose and about the
computation of the new set of rdms.

It's true that I didn't comment on the frat that you had half-done one
of the things I had requested.  It is of course a waste of my time to
be constantly re-reviewing half-done changes.


Next time I should see each line of all comments carefully. Maybe its 
good way to use IRC to take your quick advice in advance, and I hope 
this make you feel better. Anyway, sorry to bring this kind of 
inconvenience.





Now back on our problem,

static void
add_rdm_entry(libxl__gc *gc, libxl_domain_config *d_config,
uint64_t rdm_start, uint64_t rdm_size, int rdm_policy)
{
  d_config-rdms = libxl__realloc(NOGC, d_config-rdms,
  (d_config-num_rdms+1) * sizeof(libxl_device_rdm));

  d_config-rdms[d_config-num_rdms].start = rdm_start;
  d_config-rdms[d_config-num_rdms].size = rdm_size;
  d_config-rdms[d_config-num_rdms].policy = rdm_policy;
  d_config-num_rdms++;
}

Does this work for you? If I'm still wrong, please correct this function
directly to cost you less.


Yes, that is what I meant.



Good to know.

Thanks
Tiejun

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [xen-4.5-testing test] 59792: regressions - FAIL

2015-07-21 Thread osstest service owner
flight 59792 xen-4.5-testing real [real]
http://logs.test-lab.xenproject.org/osstest/logs/59792/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-armhf-armhf-xl-multivcpu  6 xen-boot fail REGR. vs. 59527
 test-amd64-i386-xl-qemuu-ovmf-amd64 18 guest-start/debianhvm.repeat fail REGR. 
vs. 59527

Regressions which are regarded as allowable (not blocking):
 test-amd64-amd64-rumpuserxen-amd64 15 
rumpuserxen-demo-xenstorels/xenstorels.repeat fail REGR. vs. 59527
 test-amd64-i386-xl-qemuu-win7-amd64 15 guest-localmigrate/x10 fail blocked in 
59527
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop fail like 59508
 test-amd64-i386-xl-qemut-win7-amd64 16 guest-stop  fail like 59527
 test-armhf-armhf-xl-rtds 11 guest-start  fail   like 59527

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-pvh-amd  11 guest-start  fail   never pass
 test-amd64-amd64-xl-pvh-intel 11 guest-start  fail  never pass
 test-amd64-i386-libvirt  12 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  12 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 12 migrate-support-checkfail never pass
 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stop fail never pass
 test-armhf-armhf-xl  12 migrate-support-checkfail   never pass

version targeted for testing:
 xen  666b80f239c566283cb1b3435180d99a329d0156
baseline version:
 xen  36a7c54a698db7d087873b312087cfa64de33175

Last test of basis59527  2015-07-14 01:40:53 Z7 days
Testing same since59792  2015-07-21 09:35:47 Z0 days1 attempts


People who touched revisions under test:
  Andrew Cooper andrew.coop...@citrix.com
  Dario Faggioli dario.faggi...@citrix.com
  Elena Ufimtseva elena.ufimts...@oracle.com
  Ian Campbell ian.campb...@citrix.com
  Jan Beulich jbeul...@suse.com
  Juergen Gross jgr...@suse.com
  Liang Li liang.z...@intel.com
  Yang Zhang yang.z.zh...@intel.com

jobs:
 build-amd64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-armhf-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-armhf-pvopspass
 build-i386-pvops pass
 build-amd64-rumpuserxen  pass
 build-i386-rumpuserxen   pass
 test-amd64-amd64-xl  pass
 test-armhf-armhf-xl  pass
 test-amd64-i386-xl   pass
 test-amd64-amd64-xl-pvh-amd  fail
 test-amd64-i386-qemut-rhel6hvm-amd   pass
 test-amd64-i386-qemuu-rhel6hvm-amd   pass
 test-amd64-amd64-xl-qemut-debianhvm-amd64pass
 test-amd64-i386-xl-qemut-debianhvm-amd64 pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64pass
 test-amd64-i386-xl-qemuu-debianhvm-amd64 pass
 test-amd64-i386-freebsd10-amd64  pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64 pass
 test-amd64-i386-xl-qemuu-ovmf-amd64  fail
 test-amd64-amd64-rumpuserxen-amd64   fail
 test-amd64-amd64-xl-qemut-win7-amd64 fail
 test-amd64-i386-xl-qemut-win7-amd64  fail
 test-amd64-amd64-xl-qemuu-win7-amd64 fail
 test-amd64-i386-xl-qemuu-win7-amd64  fail
 test-armhf-armhf-xl-arndale  pass
 test-amd64-amd64-xl-credit2  pass
 test-armhf-armhf-xl-credit2  pass
 test-armhf-armhf-xl-cubietruck   pass
 test-amd64-i386-freebsd10-i386   pass
 test-amd64-i386-rumpuserxen-i386 pass
 test-amd64-amd64-xl-pvh-intelfail
 

Re: [Xen-devel] [PATCH v6 10/15] x86/altp2m: add remaining support routines.

2015-07-21 Thread Sahita, Ravi
From: George Dunlap [mailto:george.dun...@citrix.com]
Sent: Tuesday, July 21, 2015 6:19 AM

On 07/21/2015 12:58 AM, Ed White wrote:
 Add the remaining routines required to support enabling the alternate
 p2m functionality.

 Signed-off-by: Ed White edmund.h.wh...@intel.com

 Reviewed-by: Andrew Cooper andrew.coop...@citrix.com
 ---
  xen/arch/x86/hvm/hvm.c   |  58 +-
  xen/arch/x86/mm/hap/Makefile |   1 +
  xen/arch/x86/mm/hap/altp2m_hap.c |  98 ++
  xen/arch/x86/mm/p2m-ept.c|   3 +
  xen/arch/x86/mm/p2m.c| 385
+++
  xen/include/asm-x86/hvm/altp2m.h |   4 +
  xen/include/asm-x86/p2m.h|  33 
  7 files changed, 576 insertions(+), 6 deletions(-)  create mode
 100644 xen/arch/x86/mm/hap/altp2m_hap.c

 diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c index
 f0ab4d4..38cf0c6 100644
 --- a/xen/arch/x86/hvm/hvm.c
 +++ b/xen/arch/x86/hvm/hvm.c
 @@ -2856,10 +2856,11 @@ int hvm_hap_nested_page_fault(paddr_t gpa,
unsigned long gla,
  mfn_t mfn;
  struct vcpu *curr = current;
  struct domain *currd = curr-domain;
 -struct p2m_domain *p2m;
 +struct p2m_domain *p2m, *hostp2m;
  int rc, fall_through = 0, paged = 0;
  int sharing_enomem = 0;
  vm_event_request_t *req_ptr = NULL;
 +bool_t ap2m_active = 0;

  /* On Nested Virtualization, walk the guest page table.
   * If this succeeds, all is fine.
 @@ -2919,11 +2920,31 @@ int hvm_hap_nested_page_fault(paddr_t gpa,
unsigned long gla,
  goto out;
  }

 -p2m = p2m_get_hostp2m(currd);
 -mfn = get_gfn_type_access(p2m, gfn, p2mt, p2ma,
 +ap2m_active = altp2m_active(currd);
 +
 +/* Take a lock on the host p2m speculatively, to avoid potential
 + * locking order problems later and to handle unshare etc.
 + */
 +hostp2m = p2m_get_hostp2m(currd);
 +mfn = get_gfn_type_access(hostp2m, gfn, p2mt, p2ma,
P2M_ALLOC | (npfec.write_access ? P2M_UNSHARE 
 : 0),
NULL);

 +if ( ap2m_active )
 +{
 +if ( altp2m_hap_nested_page_fault(curr, gpa, gla, npfec, p2m) == 1 
 )
 +{
 +/* entry was lazily copied from host -- retry */
 +__put_gfn(hostp2m, gfn);
 +rc = 1;
 +goto out;
 +}
 +
 +mfn = get_gfn_type_access(p2m, gfn, p2mt, p2ma, 0, NULL);
 +}
 +else
 +p2m = hostp2m;
 +
  /* Check access permissions first, then handle faults */
  if ( mfn_x(mfn) != INVALID_MFN )
  {
 @@ -2963,6 +2984,20 @@ int hvm_hap_nested_page_fault(paddr_t gpa,
 unsigned long gla,

  if ( violation )
  {
 +/* Should #VE be emulated for this fault? */
 +if ( p2m_is_altp2m(p2m)  !cpu_has_vmx_virt_exceptions )
 +{
 +bool_t sve;
 +
 +p2m-get_entry(p2m, gfn, p2mt, p2ma, 0, NULL,
 + sve);
 +
 +if ( !sve  altp2m_vcpu_emulate_ve(curr) )
 +{
 +rc = 1;
 +goto out_put_gfn;
 +}
 +}
 +
  if ( p2m_mem_access_check(gpa, gla, npfec, req_ptr) )
  {
  fall_through = 1;
 @@ -2982,7 +3017,9 @@ int hvm_hap_nested_page_fault(paddr_t gpa,
unsigned long gla,
   (npfec.write_access 
(p2m_is_discard_write(p2mt) || (p2mt == p2m_mmio_write_dm))) )
  {
 -put_gfn(currd, gfn);
 +__put_gfn(p2m, gfn);
 +if ( ap2m_active )
 +__put_gfn(hostp2m, gfn);

  rc = 0;
  if ( unlikely(is_pvh_domain(currd)) ) @@ -3011,6 +3048,7 @@
 int hvm_hap_nested_page_fault(paddr_t gpa, unsigned long gla,
  /* Spurious fault? PoD and log-dirty also take this path. */
  if ( p2m_is_ram(p2mt) )
  {
 +rc = 1;
  /*
   * Page log dirty is always done with order 0. If this mfn resides 
 in
   * a large page, we do not change other pages type within
 that large @@ -3019,9 +3057,15 @@ int
hvm_hap_nested_page_fault(paddr_t gpa, unsigned long gla,
  if ( npfec.write_access )
  {
  paging_mark_dirty(currd, mfn_x(mfn));
 +/* If p2m is really an altp2m, unlock here to avoid lock 
 ordering
 + * violation when the change below is propagated from host p2m 
 */
 +if ( ap2m_active )
 +__put_gfn(p2m, gfn);
  p2m_change_type_one(currd, gfn, p2m_ram_logdirty,
 p2m_ram_rw);
 +__put_gfn(ap2m_active ? hostp2m : p2m, gfn);
 +
 +goto out;
  }
 -rc = 1;
  goto out_put_gfn;
  }

 @@ -3031,7 +3075,9 @@ int hvm_hap_nested_page_fault(paddr_t gpa,
unsigned long gla,
  rc = fall_through;

  out_put_gfn:
 -put_gfn(currd, gfn);
 +__put_gfn(p2m, gfn);
 +if ( ap2m_active )
 +__put_gfn(hostp2m, gfn);
  out:
  /* All of these are delayed until we 

Re: [Xen-devel] [PATCH v2 03/23] x86: zero BSS using stosl instead of stosb

2015-07-21 Thread Jan Beulich
 Daniel Kiper daniel.ki...@oracle.com 07/21/15 8:23 PM 
On Tue, Jul 21, 2015 at 03:37:48AM -0600, Jan Beulich wrote:
  On 20.07.15 at 16:28, daniel.ki...@oracle.com wrote:

 ... because of ??? Nowadays - with X86_FEATURE_ERMS - rep stosb
 is expected to be faster than rep stosl.

OK, I did not know about that. However, as I know this feature
was introduced in 2012 with Ivy Bridge. So, I suppose that there
are still a lot of machines in the wild which does not support it.
Anyway, because this code is not performance critical I am not going
to insist on one or another solution. However, Andrew suggested that
thing, so, please agree with him in which direction we should go.
I will do what you agree.

ISTR having seen a similar patch from him(?), maybe in another area
of code, before (or was it v1 of this one?), which I responded to with the
same as above.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] xen: arm: gic-v3: avoid \n in the middle of log messages

2015-07-21 Thread Ian Campbell
On Fri, 2015-07-17 at 15:10 +0100, Julien Grall wrote:
 Hi Ian,
 
 On 17/07/15 14:21, Ian Campbell wrote:
  These result in log messages such as:
  
  (XEN) d0v0: vGICD: RAZ on reserved register offset 0x0c
  (XEN) d0v0: vGICR: write r2 offset 0x000180
  (XEN)  not foundG3traps.c:2417:d0v0 HSR=0x93820046 
  pc=0xffc000322bfc gva=0xff890180 gpa=0x008d110180
  
  Fix this by rewording without a \n in the middle. Also add one at 
  the end.
  
  Signed-off-by: Ian Campbell ian.campb...@citrix.com
 
 Reviewed-by: Julien Grall julien.gr...@citrix.com

Applied thanks.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [v10][PATCH 11/16] tools/libxl: detect and avoid conflicts with RDM

2015-07-21 Thread Ian Jackson
Chen, Tiejun writes (Re: [v10][PATCH 11/16] tools/libxl: detect and avoid 
conflicts with RDM):
 [Ian Jackson:]
  That is, an address would be reserved if it was reserved in any of the
  rdm regions implied by the config.
 
 Are you saying this point?
 
 The union of two sets A and B is the set of elements which are in A, in 
 B, or in both A and B.

Yes.

  The explicitly specified regions might overlap with the computed ones,
  without being identical. Computing the union would not be entirely
  trivial.
 
 Just to your example,
 
libxl_domain_config cfg;
cfg.stuff = blah;
cfg.rdm.strategy = HOST;
 
libxl_domain_create_new(cfg, domid);
libxl_domain_destroy(domid);
 
 Here looks you mean d_config-rdms would be changed, right? Currently 
 this shouldn't be allowed. But I think we need to further discussion 
 make this case clear after feature freeze since we didn't have this kind 
 of assumption in our previous design.
 
libxl_domain_create_new(cfg, domid);

This response of yours does not lead me to think you have understood
what I am saying, but I agree that this can be dealt with later (if
indeed it needs to be dealt with at all).

Ian.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [v10][PATCH 11/16] tools/libxl: detect and avoid conflicts with RDM

2015-07-21 Thread Ian Jackson
Chen, Tiejun writes (Re: [v10][PATCH 11/16] tools/libxl: detect and avoid 
conflicts with RDM):
 Indeed, I'm not a fan to Xen tools so I can't picture what this real 
 scenario would happen. So if I'm misunderstanding what you mean, just 
 please correct me. Or if you still think its hard to explain this to me, 
 just tell me what I should do. I think this make your life easy.

Please ignore this line of discussion.

Instead, please simply make it so that if there are any rdms specified
in the domain config, they are used instead of the automatically
gathered information (from strategy and devices).

Thanks,
Ian.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH for-4.6 1/3] xen/x86/libxl: replace non-POSIX error codes used by PSR code

2015-07-21 Thread Roger Pau Monne
PSR was using EBADSLT and EUSERS which are not POSIX error codes, replace
them with EINVAL and EOVERFLOW respectively.

Signed-off-by: Roger Pau Monné roger@citrix.com
Cc: Ian Jackson ian.jack...@eu.citrix.com
Cc: Stefano Stabellini stefano.stabell...@eu.citrix.com
Cc: Ian Campbell ian.campb...@citrix.com
Cc: Wei Liu wei.l...@citrix.com
Cc: Jan Beulich jbeul...@suse.com
Cc: Andrew Cooper andrew.coop...@citrix.com
---
 tools/libxl/libxl_psr.c | 6 +++---
 xen/arch/x86/psr.c  | 8 
 2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/tools/libxl/libxl_psr.c b/tools/libxl/libxl_psr.c
index 2a0..f64406a 100644
--- a/tools/libxl/libxl_psr.c
+++ b/tools/libxl/libxl_psr.c
@@ -31,7 +31,7 @@ static void libxl__psr_log_err_msg(libxl__gc *gc, int err)
 case ESRCH:
 msg = invalid domain ID;
 break;
-case EBADSLT:
+case EINVAL:
 msg = socket is not supported;
 break;
 case EFAULT:
@@ -59,7 +59,7 @@ static void libxl__psr_cmt_log_err_msg(libxl__gc *gc, int err)
 case ENOENT:
 msg = CMT is not attached to this domain;
 break;
-case EUSERS:
+case EOVERFLOW:
 msg = no free RMID available;
 break;
 default:
@@ -81,7 +81,7 @@ static void libxl__psr_cat_log_err_msg(libxl__gc *gc, int err)
 case ENOENT:
 msg = CAT is not enabled on the socket;
 break;
-case EUSERS:
+case EOVERFLOW:
 msg = no free COS available;
 break;
 case EEXIST:
diff --git a/xen/arch/x86/psr.c b/xen/arch/x86/psr.c
index 861683f..0185c45 100644
--- a/xen/arch/x86/psr.c
+++ b/xen/arch/x86/psr.c
@@ -176,7 +176,7 @@ int psr_alloc_rmid(struct domain *d)
 if ( rmid  psr_cmt-rmid_max )
 {
 d-arch.psr_rmid = 0;
-return -EUSERS;
+return -EOVERFLOW;
 }
 
 d-arch.psr_rmid = rmid;
@@ -251,7 +251,7 @@ static struct psr_cat_socket_info 
*get_cat_socket_info(unsigned int socket)
 return ERR_PTR(-ENODEV);
 
 if ( socket = nr_sockets )
-return ERR_PTR(-EBADSLT);
+return ERR_PTR(-EINVAL);
 
 if ( !test_bit(socket, cat_socket_enable) )
 return ERR_PTR(-ENOENT);
@@ -332,7 +332,7 @@ static int write_l3_cbm(unsigned int socket, unsigned int 
cos, uint64_t cbm)
 unsigned int cpu = get_socket_cpu(socket);
 
 if ( cpu = nr_cpu_ids )
-return -EBADSLT;
+return -EINVAL;
 on_selected_cpus(cpumask_of(cpu), do_write_l3_cbm, info, 1);
 }
 
@@ -381,7 +381,7 @@ int psr_set_l3_cbm(struct domain *d, unsigned int socket, 
uint64_t cbm)
 if ( !found )
 {
 spin_unlock(info-cbm_lock);
-return -EUSERS;
+return -EOVERFLOW;
 }
 
 cos = found - map;
-- 
1.9.5 (Apple Git-50.3)


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH OSSTEST v2 03/13] osstest migrate support check catch - variables

2015-07-21 Thread Ian Jackson
Ian Campbell writes (Re: [PATCH OSSTEST v2 03/13] osstest migrate support 
check catch - variables):
 On Sun, 2015-07-12 at 17:20 +0100, Wei Liu wrote:
  @@ -300,7 +300,9 @@ proc run-job/test-pair {} {
   proc test-guest-migr {g} {
  -if {[catch { run-ts . = ts-migrate-support-check + host $g }]} return
  +set to_reap [spawn-ts . = ts-migrate-support-check + host $g]
 
 Most other uses of spawn-ts use [eval spawn-ts stuff]. I think those
 are just trying to expand a $args into multiple arguments to spawn-ts,
 and hence that isn't needed here (because $g is a singleton argument
 already). But TBH I don't know...

Yes, the effect of the
set reap [eval spawn-ts $args]
is to expand the list in $args as arguments to spawn-ts.  $g is a
singleton as you say, not a list.

spawn-ts has the same argument convention as run-ts.

Ian.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH OSSTEST v2 03/13] osstest migrate support check catch - variables

2015-07-21 Thread Ian Jackson
Wei Liu writes (Re: [PATCH OSSTEST v2 03/13] osstest migrate support check 
catch - variables):
 Do I need to change anything in this patch? I guess not? It's not very
 clear to me.

Ian C was asking whether the patch (which I wrote) was right, in a
particular respect.  I answered that it is correct.  So no, you don't
need to do anything to this patch.

Ian.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v5 05/15] x86/altp2m: basic data structures and support routines.

2015-07-21 Thread Sahita, Ravi
From: George Dunlap [mailto:george.dun...@eu.citrix.com]
Sent: Tuesday, July 14, 2015 8:57 AM

On 07/14/2015 01:14 AM, Ed White wrote:
 Add the basic data structures needed to support alternate p2m's and
 the functions to initialise them and tear them down.

 Although Intel hardware can handle 512 EPTP's per hardware thread
 concurrently, only 10 per domain are supported in this patch for
 performance reasons.

 The iterator in hap_enable() does need to handle 512, so that is now
 uint16_t.

 This change also splits the p2m lock into one lock type for altp2m's
 and another type for all other p2m's. The purpose of this is to place
 the altp2m list lock between the types, so the list lock can be
 acquired whilst holding the host p2m lock.

 Signed-off-by: Ed White edmund.h.wh...@intel.com

 Reviewed-by: Andrew Cooper andrew.coop...@citrix.com

With the number of major changes you made here, you definitely should
have dropped this reviewed-by.

 ---
  xen/arch/x86/hvm/Makefile|   1 +
  xen/arch/x86/hvm/altp2m.c|  77 +
  xen/arch/x86/hvm/hvm.c   |  21 
  xen/arch/x86/mm/hap/hap.c|  38 ++-
  xen/arch/x86/mm/mm-locks.h   |  46 +-
  xen/arch/x86/mm/p2m.c| 102
+++
  xen/include/asm-x86/domain.h |  10 
  xen/include/asm-x86/hvm/altp2m.h |  38 +++
  xen/include/asm-x86/hvm/hvm.h|  14 ++
  xen/include/asm-x86/hvm/vcpu.h   |   9 
  xen/include/asm-x86/p2m.h|  32 +++-
  11 files changed, 384 insertions(+), 4 deletions(-)  create mode
 100644 xen/arch/x86/hvm/altp2m.c  create mode 100644
 xen/include/asm-x86/hvm/altp2m.h

 diff --git a/xen/arch/x86/hvm/Makefile b/xen/arch/x86/hvm/Makefile
 index 69af47f..eb1a37b 100644
 --- a/xen/arch/x86/hvm/Makefile
 +++ b/xen/arch/x86/hvm/Makefile
 @@ -1,6 +1,7 @@
  subdir-y += svm
  subdir-y += vmx

 +obj-y += altp2m.o
  obj-y += asid.o
  obj-y += emulate.o
  obj-y += event.o
 diff --git a/xen/arch/x86/hvm/altp2m.c b/xen/arch/x86/hvm/altp2m.c new
 file mode 100644 index 000..a10f347
 --- /dev/null
 +++ b/xen/arch/x86/hvm/altp2m.c
 @@ -0,0 +1,77 @@
 +/*
 + * Alternate p2m HVM
 + * Copyright (c) 2014, Intel Corporation.
 + *
 + * This program is free software; you can redistribute it and/or
 +modify it
 + * under the terms and conditions of the GNU General Public License,
 + * version 2, as published by the Free Software Foundation.
 + *
 + * This program is distributed in the hope it will be useful, but
 +WITHOUT
 + * ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY
 +or
 + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
 +License for
 + * more details.
 + *
 + * You should have received a copy of the GNU General Public License
 +along with
 + * this program; if not, write to the Free Software Foundation, Inc.,
 +59 Temple
 + * Place - Suite 330, Boston, MA 02111-1307 USA.
 + */
 +
 +#include asm/hvm/support.h
 +#include asm/hvm/hvm.h
 +#include asm/p2m.h
 +#include asm/hvm/altp2m.h
 +
 +void
 +altp2m_vcpu_reset(struct vcpu *v)

OK, so it looks like at the end of this patch series:
* altp2m_vcpu_reset() isn't called outside this file
* altp2m_vcpu_initialise() is only called from hvm.c when the guest enables
the altp2m functionality
* altp2m_vcpu_destroy() is called when the guest disables altp2m
funcitonality, or when the vcpu is destroyed

Looking at the vcpu_destroy case, it's hard to tell exactly how much on that
path is actually useful; but it looks like the only thing that's critical is 
decreasing
the active_vcpu count of the p2m that's being used.

Also, it looks like these functions don't do anything specifically with the HVM
side of things.

So on the whole, it seems like these would better go along with the other
altp2m functions inside p2m.c.

Thoughts?

George, apologies on this one - I completely missed this email from you -

We could move these functions into p2m.c, except destroy, the VMCS updates are 
critical.
We will try to get this into what will be our final rev (at least for 4.6 
candidate). 
Again sorry for the snafu on this email.

Ravi


 -George



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 02/23] x86/boot: copy only text section from *.lnk file to *.bin file

2015-07-21 Thread Daniel Kiper
On Tue, Jul 21, 2015 at 03:35:07AM -0600, Jan Beulich wrote:
  On 20.07.15 at 16:28, daniel.ki...@oracle.com wrote:

 Without any explanation (description) I'm inclined to say this makes
 things more fragile instead of improving the situation. As it looks
 like we indeed pointlessly copy .eh_frame, but I think this would
 better be avoided by suppressing its generation (i.e. add
 -fno-asynchronous-unwind-tables just like Rules.mk has).

Make sense, however, there is still place for two small optimizations.

First of all ld generates .got.plt section and objcopy copy it to binary file.
It is not needed because we do not link our stuff here with shared libraries.
So, we can use -R objcopy option to remove it (if you do not like -j .text).
This way we could save 15 bytes (at least on my machines).

We could also save another 3 bytes (per one xen/arch/x86/boot C input file)
in final Xen binary in worst case :-))). We just need generate output assembly
files as string of .byte instead of .long.

Where should I stop?

Daniel

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [v11][PATCH 02/16] xen/vtd: create RMRR mapping

2015-07-21 Thread Tiejun Chen
RMRR reserved regions must be setup in the pfn space with an identity
mapping to reported mfn. However existing code has problem to setup
correct mapping when VT-d shares EPT page table, so lead to problem
when assigning devices (e.g GPU) with RMRR reported. So instead, this
patch aims to setup identity mapping in p2m layer, regardless of
whether EPT is shared or not. And we still keep creating VT-d table.

And we also need to introduce a pair of helper to create/clear this
sort of identity mapping as follows:

set_identity_p2m_entry():

If the gfn space is unoccupied, we just set the mapping. If space
is already occupied by desired identity mapping, do nothing.
Otherwise, failure is returned.

clear_identity_p2m_entry():

We just define macro to wrapper guest_physmap_remove_page() with
a returning value as necessary.

CC: Tim Deegan t...@xen.org
CC: Keir Fraser k...@xen.org
CC: Jan Beulich jbeul...@suse.com
CC: Andrew Cooper andrew.coop...@citrix.com
CC: Yang Zhang yang.z.zh...@intel.com
CC: Kevin Tian kevin.t...@intel.com
Reviewed-by: Kevin Tian kevin.t...@intel.com
Reviewed-by: Tim Deegan t...@xen.org
Acked-by: George Dunlap george.dun...@eu.citrix.com
Signed-off-by: Tiejun Chen tiejun.c...@intel.com
---
v6 ~ v11:

* Nothing is changed.

v5:

* Fold our original patch #2 and #3 as this new

* Introduce a new, clear_identity_p2m_entry, which can wrapper
  guest_physmap_remove_page(). And we use this to clean our
  identity mapping. 

v4:

* Change that orginal condition,

  if ( p2mt == p2m_invalid || p2mt == p2m_mmio_dm )
  
  to make sure we catch those invalid mfn mapping as we expected.

* To have

  if ( !paging_mode_translate(p2m-domain) )
return 0;

  at the start, instead of indenting the whole body of the function
  in an inner scope. 

* extend guest_physmap_remove_page() to return a value as a proper
  unmapping helper

* Instead of intel_iommu_unmap_page(), we should use
  guest_physmap_remove_page() to unmap rmrr mapping correctly. 

* Drop iommu_map_page() since actually ept_set_entry() can do this
  internally.

 xen/arch/x86/mm/p2m.c   | 40 +++--
 xen/drivers/passthrough/vtd/iommu.c |  5 ++---
 xen/include/asm-x86/p2m.h   | 13 +---
 3 files changed, 50 insertions(+), 8 deletions(-)

diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index 6fe6387..1e763dc 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -584,14 +584,16 @@ p2m_remove_page(struct p2m_domain *p2m, unsigned long 
gfn, unsigned long mfn,
  p2m-default_access);
 }
 
-void
+int
 guest_physmap_remove_page(struct domain *d, unsigned long gfn,
   unsigned long mfn, unsigned int page_order)
 {
 struct p2m_domain *p2m = p2m_get_hostp2m(d);
+int rc;
 gfn_lock(p2m, gfn, page_order);
-p2m_remove_page(p2m, gfn, mfn, page_order);
+rc = p2m_remove_page(p2m, gfn, mfn, page_order);
 gfn_unlock(p2m, gfn, page_order);
+return rc;
 }
 
 int
@@ -898,6 +900,40 @@ int set_mmio_p2m_entry(struct domain *d, unsigned long 
gfn, mfn_t mfn,
 return set_typed_p2m_entry(d, gfn, mfn, p2m_mmio_direct, access);
 }
 
+int set_identity_p2m_entry(struct domain *d, unsigned long gfn,
+   p2m_access_t p2ma)
+{
+p2m_type_t p2mt;
+p2m_access_t a;
+mfn_t mfn;
+struct p2m_domain *p2m = p2m_get_hostp2m(d);
+int ret;
+
+if ( !paging_mode_translate(p2m-domain) )
+return 0;
+
+gfn_lock(p2m, gfn, 0);
+
+mfn = p2m-get_entry(p2m, gfn, p2mt, a, 0, NULL);
+
+if ( p2mt == p2m_invalid || p2mt == p2m_mmio_dm )
+ret = p2m_set_entry(p2m, gfn, _mfn(gfn), PAGE_ORDER_4K,
+p2m_mmio_direct, p2ma);
+else if ( mfn_x(mfn) == gfn  p2mt == p2m_mmio_direct  a == p2ma )
+ret = 0;
+else
+{
+ret = -EBUSY;
+printk(XENLOG_G_WARNING
+   Cannot setup identity map d%d:%lx,
+gfn already mapped to %lx.\n,
+   d-domain_id, gfn, mfn_x(mfn));
+}
+
+gfn_unlock(p2m, gfn, 0);
+return ret;
+}
+
 /* Returns: 0 for success, -errno for failure */
 int clear_mmio_p2m_entry(struct domain *d, unsigned long gfn, mfn_t mfn)
 {
diff --git a/xen/drivers/passthrough/vtd/iommu.c 
b/xen/drivers/passthrough/vtd/iommu.c
index 9849d0e..5aa482f 100644
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -1839,7 +1839,7 @@ static int rmrr_identity_mapping(struct domain *d, bool_t 
map,
 
 while ( base_pfn  end_pfn )
 {
-if ( intel_iommu_unmap_page(d, base_pfn) )
+if ( clear_identity_p2m_entry(d, base_pfn, 0) )
 ret = -ENXIO;
 base_pfn++;
 }
@@ -1855,8 +1855,7 @@ static int rmrr_identity_mapping(struct domain *d, bool_t 
map,
 
 while ( base_pfn  end_pfn )
 {
-int err = intel_iommu_map_page(d, base_pfn, base_pfn,
-   

[Xen-devel] [v11][PATCH 10/16] tools: introduce some new parameters to set rdm policy

2015-07-21 Thread Tiejun Chen
This patch introduces user configurable parameters to specify RDM
resource and according policies,

Global RDM parameter:
rdm = strategy=host,policy=strict/relaxed
Per-device RDM parameter:
pci = [ 'sbdf, rdm_policy=strict/relaxed' ]

Global RDM parameter, strategy, allows user to specify reserved regions
explicitly, Currently, using 'host' to include all reserved regions reported
on this platform which is good to handle hotplug scenario. In the future
this parameter may be further extended to allow specifying random regions,
e.g. even those belonging to another platform as a preparation for live
migration with passthrough devices. By default this isn't set so we don't
check all rdms. Instead, we just check rdm specific to a given device if
you're assigning this kind of device. Note this option is not recommended
unless you can make sure any conflict does exist.

'strict/relaxed' policy decides how to handle conflict when reserving RDM
regions in pfn space. If conflict exists, 'strict' means an immediate error
so VM can't keep running, while 'relaxed' allows moving forward with a
warning message thrown out.

Default per-device RDM policy is same as default global RDM policy as being
'relaxed'. And the per-device policy would override the global policy like
others.

CC: Ian Jackson ian.jack...@eu.citrix.com
CC: Stefano Stabellini stefano.stabell...@eu.citrix.com
CC: Ian Campbell ian.campb...@citrix.com
CC: Wei Liu wei.l...@citrix.com
Acked-by: Wei Liu wei.l...@citrix.com
Acked-by: Ian Jackson ian.jack...@eu.citrix.com
Signed-off-by: Tiejun Chen tiejun.c...@intel.com
---
v9 ~ v11:

* Nothing is changed.

v8:

* One minimal code style change

v7:

* Need to rename some parameters:
  In the xl rdm config parsing, `reserve=' should be `policy='.
  In the xl pci config parsing, `rdm_reserve=' should be `rdm_policy='.
  The type `libxl_rdm_reserve_flag' should be `libxl_rdm_policy'.
  The field name `reserve' in `libxl_rdm_reserve' should be `policy'.

v6:

* Some rename to make our policy reasonable
  type - strategy
  none - ignore
* Don't expose ignore in xl level and just keep that as a default.
  And then sync docs and the patch head description

v5:

* Just make sure the per-device plicy always override the global policy,
  and so cleanup some associated comments and the patch head description.
* A little change to follow one bit, XEN_DOMCTL_DEV_RDM_RELAXED.
* Improve all descriptions in doc.
* Make all rdm variables specific to .hvm

v4:

* No need to define init_val for libxl_rdm_reserve_type since its just zero
* Grab those changes to xl/libxlu to as a final patch

 docs/man/xl.cfg.pod.5| 81 
 docs/misc/vtd.txt| 24 +
 tools/libxl/libxl_create.c   |  7 
 tools/libxl/libxl_internal.h |  2 ++
 tools/libxl/libxl_pci.c  |  9 +
 tools/libxl/libxl_types.idl  | 18 ++
 6 files changed, 141 insertions(+)

diff --git a/docs/man/xl.cfg.pod.5 b/docs/man/xl.cfg.pod.5
index 382f30b..e6e0f70 100644
--- a/docs/man/xl.cfg.pod.5
+++ b/docs/man/xl.cfg.pod.5
@@ -633,6 +633,79 @@ assigned slave device.
 
 =back
 
+=item Brdm=RDM_RESERVATION_STRING
+
+(HVM/x86 only) Specifies information about Reserved Device Memory (RDM),
+which is necessary to enable robust device passthrough. One example of RDM
+is reported through ACPI Reserved Memory Region Reporting (RMRR) structure
+on x86 platform.
+
+BRDM_RESERVE_STRING has the form C[KEY=VALUE,KEY=VALUE,... where:
+
+=over 4
+
+=item BKEY=VALUE
+
+Possible BKEYs are:
+
+=over 4
+
+=item Bstrategy=STRING
+
+Currently there is only one valid type:
+
+host means all reserved device memory on this platform should be checked to
+reserve regions in this VM's guest address space. This global rdm parameter
+allows user to specify reserved regions explicitly, and using host includes
+all reserved regions reported on this platform, which is useful when doing
+hotplug.
+
+By default this isn't set so we don't check all rdms. Instead, we just check
+rdm specific to a given device if you're assigning this kind of device. Note
+this option is not recommended unless you can make sure any conflict does 
exist.
+
+For example, you're trying to set memory = 2800 to allocate memory to one
+given VM but the platform owns two RDM regions like,
+
+Device A [sbdf_A]: RMRR region_A: base_addr ac6d3000 end_address ac6e6fff
+Device B [sbdf_B]: RMRR region_B: base_addr ad80 end_address afff
+
+In this conflict case,
+
+#1. If Bstrategy is set to host, for example,
+
+rdm = strategy=host,policy=strict or rdm = strategy=host,policy=relaxed
+
+It means all conflicts will be handled according to the policy
+introduced by Bpolicy as described below.
+
+#2. If Bstrategy is not set at all, but
+
+pci = [ 'sbdf_A, rdm_policy=x' ]
+
+It means only one conflict of region_A will be handled according to the policy
+introduced by Brdm_policy=STRING as described inside pci options.
+
+=item Bpolicy=STRING
+

[Xen-devel] [v11][PATCH 04/16] xen: enable XENMEM_memory_map in hvm

2015-07-21 Thread Tiejun Chen
This patch enables XENMEM_memory_map in hvm. So hvmloader can
use it to setup the e820 mappings.

CC: Keir Fraser k...@xen.org
CC: Jan Beulich jbeul...@suse.com
CC: Andrew Cooper andrew.coop...@citrix.com
Signed-off-by: Tiejun Chen tiejun.c...@intel.com
Reviewed-by: Tim Deegan t...@xen.org
Reviewed-by: Kevin Tian kevin.t...@intel.com
Acked-by: Jan Beulich jbeul...@suse.com
Acked-by: George Dunlap george.dun...@eu.citrix.com
---
v5 ~ v11:

* Nothing is changed.

v4:

* Just refine the patch head description as Jan commented.

 xen/arch/x86/hvm/hvm.c | 2 --
 xen/arch/x86/mm.c  | 6 --
 2 files changed, 8 deletions(-)

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index c07e3ef..d860579 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -4855,7 +4855,6 @@ static long hvm_memory_op(int cmd, 
XEN_GUEST_HANDLE_PARAM(void) arg)
 
 switch ( cmd  MEMOP_CMD_MASK )
 {
-case XENMEM_memory_map:
 case XENMEM_machine_memory_map:
 case XENMEM_machphys_mapping:
 return -ENOSYS;
@@ -4931,7 +4930,6 @@ static long hvm_memory_op_compat32(int cmd, 
XEN_GUEST_HANDLE_PARAM(void) arg)
 
 switch ( cmd  MEMOP_CMD_MASK )
 {
-case XENMEM_memory_map:
 case XENMEM_machine_memory_map:
 case XENMEM_machphys_mapping:
 return -ENOSYS;
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 342414f..8c887d8 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -4717,12 +4717,6 @@ long arch_memory_op(unsigned long cmd, 
XEN_GUEST_HANDLE_PARAM(void) arg)
 return rc;
 }
 
-if ( is_hvm_domain(d) )
-{
-rcu_unlock_domain(d);
-return -EPERM;
-}
-
 e820 = xmalloc_array(e820entry_t, fmap.map.nr_entries);
 if ( e820 == NULL )
 {
-- 
1.9.1


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [v11][PATCH 00/16] Fix RMRR

2015-07-21 Thread Tiejun Chen
v11:

* Rebase on staging

* Patch #6: hvmloader/pci: Try to avoid placing BARs in RMRRs
  To find the lowest RMRR the _end_ of which is higher than base;
  Refine some code implementations; 

* Patch #7: hvmloader/e820: construct guest e820 table
  To check/sync memory_map.map[] before copy them into e820 since
  ultimately this can make sure hvm_info, memory_map.map[] and e820
  are on the same page;
  Refine some code implementations;

* Patch #11: tools/libxl: detect and avoid conflicts with RDM
  Use GCNEW_ARRAY to replace libxl__malloc();
  #define pfn_to_paddrk is missing safety () around x, and
  move this into libxl_internal.h;
  Rename set_rdm_entries() to add_rdm_entry() and put the
  increment at the end so that the assignments are
  to -rdms[d_config-num_rdms];
  Simply make it so that if there are any rdms specified
  in the domain config, they are used instead of the
  automatically gathered information (from strategy and
  devices). So just return if d_config-rmds is valid;
  Shorten some code comments.

v10:

* Patch #6: hvmloader/pci: Try to avoid placing BARs in RMRRs
  This is from George' draft patch which implements an acceptable
  solution in current cycle. Here I just implemented check_overlap_all() and
  some cleanups.

* Patch #7: hvmloader/e820: construct guest e820 table
  Instead of correcting e820, I'd like to correct memory_map.map[]
  and then copy them into e820 directly. I think this can make sure
  hvm_info, memory_map.map[] and e820 are on the same page.

v9:

* Patch #3: xen/passthrough: extend hypercall to support rdm reservation policy
  Correct one check condition of XEN_DOMCTL_DEV_RDM_RELAXED

* Patch #5: hvmloader: get guest memory map into memory_map[]
  Correct the patch head description:
  [RESERVED_MEMORY_DYNAMIC_START, RESERVED_MEMORY_DYNAMIC_END]
- [RESERVED_MEMORY_DYNAMIC_START, RESERVED_MEMORY_DYNAMIC_END);
  Merge two if{} as one if{};

* Patch #6: hvmloader/pci: disable all pci devices conflicting with rdm
  A little improvement to code implementation but again, its still argued
  about this solution. Myself prefer to take a look at v7 if possible.

* Patch #7: hvmloader/e820: construct guest e820 table
  Refine that chunk of codes to check/modify highmem

* Patch #15: xen/vtd: prevent from assign the device with shared rmrr
  Correct one indentation issue

v8:

* Patch #3: xen/passthrough: extend hypercall to support rdm reservation policy
  Force to pass 0(strict) when add or move a device in hardware domain,
  and improve some associated code comments.

* Patch #5: hvmloader: get guest memory map into memory_map[]
  Actually we should check this range started from
  RESERVED_MEMORY_DYNAMIC_START, not RESERVED_MEMORY_DYNAMIC_START - 1.
  So correct this and sync the patch head description.

* Patch #6: hvmloader/pci: disable all pci devices conflicting
  We have a big change to this patch:

  Based on current discussion its hard to reshape the original mmio
  allocation mechanism but we haven't a good and simple way to in short term.
  So instead, we don't bring more complicated to intervene that process but
  still check any conflicts to disable all associated devices.

  I know this is still argumented but I'd like to discuss this based on this
  revision and thanks for your time.

* Patch #7: hvmloader/e820: construct guest e820 table
  define low_mem_end as uint32_t;
  Correct those two wrong loops, memory_map.nr_map - nr
  when we're trying to revise low/high memory e820 entries;
  Improve code comments and the patch head description;
  Add one check if highmem is just populated by hvmloader itself

* Patch #11: tools/libxl: detect and avoid conflicts with RDM
  Introduce pfn_to_paddr(x) - ((uint64_t)x  XC_PAGE_SHIFT)
  and set_rdm_entries() to factor out current codes.

* Patch #13: libxl: construct e820 map with RDM information for HVM guest
  make that core construction function as arch-specific to make sure
  we don't break ARM at this point.

* Patch #15:  xen/vtd: prevent from assign the device with shared rmrr
  Merge two if{} as one if{};
  Add to print RMRR range info when stop assign a group device

* Some minimal code style changes

v7:

* Need to rename some parameters:
  In the xl rdm config parsing, `reserve=' should be `policy='.
  In the xl pci config parsing, `rdm_reserve=' should be `rdm_policy='.
  The type `libxl_rdm_reserve_flag' should be `libxl_rdm_policy'.
  The field name `reserve' in `libxl_rdm_reserve' should be `policy'.

* Just sync with the fallout of renaming parameters above.

Note I also mask patch #10 Acked by Wei Liu, Ian Jackson and Ian
Campbell. ( If I'm wrong just let me know at this point. ) And
as we discussed I'd further improve something as next step after
this round of review.

v6:

* Inside patch #01, add a comments to the nr_entries field inside
  xen_reserved_device_memory_map. Note this is from Jan.

* Inside patch #10,  we need rename something to make our policy reasonable
  type - strategy
  

[Xen-devel] [v11][PATCH 15/16] xen/vtd: prevent from assign the device with shared rmrr

2015-07-21 Thread Tiejun Chen
Currently we're intending to cover this kind of devices
with shared RMRR simply since the case of shared RMRR is
a rare case according to our previous experiences. But
late we can group these devices which shared rmrr, and
then allow all devices within a group to be assigned to
same domain.

CC: Yang Zhang yang.z.zh...@intel.com
CC: Kevin Tian kevin.t...@intel.com
Signed-off-by: Tiejun Chen tiejun.c...@intel.com
Acked-by: Kevin Tian kevin.t...@intel.com
---
v10 ~ v11:

* Noting is changed.

v9:

* Correct one indentation issue

v8:

* Merge two if{} as one if{}

* Add to print RMRR range info when stop assign a group device

v5 ~ v7:

* Nothing is changed.

v4:

* Refine one code comment.

 xen/drivers/passthrough/vtd/iommu.c | 30 +++---
 1 file changed, 27 insertions(+), 3 deletions(-)

diff --git a/xen/drivers/passthrough/vtd/iommu.c 
b/xen/drivers/passthrough/vtd/iommu.c
index 8a8d763..ce5c295 100644
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -2293,13 +2293,37 @@ static int intel_iommu_assign_device(
 if ( list_empty(acpi_drhd_units) )
 return -ENODEV;
 
+seg = pdev-seg;
+bus = pdev-bus;
+/*
+ * In rare cases one given rmrr is shared by multiple devices but
+ * obviously this would put the security of a system at risk. So
+ * we should prevent from this sort of device assignment.
+ *
+ * TODO: in the future we can introduce group device assignment
+ * interface to make sure devices sharing RMRR are assigned to the
+ * same domain together.
+ */
+for_each_rmrr_device( rmrr, bdf, i )
+{
+if ( rmrr-segment == seg 
+ PCI_BUS(bdf) == bus 
+ PCI_DEVFN2(bdf) == devfn 
+ rmrr-scope.devices_cnt  1 )
+{
+printk(XENLOG_G_ERR VTDPREFIX
+cannot assign %04x:%02x:%02x.%u
+with shared RMRR at %PRIx64 for Dom%d.\n,
+   seg, bus, PCI_SLOT(devfn), PCI_FUNC(devfn),
+   rmrr-base_address, d-domain_id);
+return -EPERM;
+}
+}
+
 ret = reassign_device_ownership(hardware_domain, d, devfn, pdev);
 if ( ret )
 return ret;
 
-seg = pdev-seg;
-bus = pdev-bus;
-
 /* Setup rmrr identity mapping */
 for_each_rmrr_device( rmrr, bdf, i )
 {
-- 
1.9.1


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [v11][PATCH 06/16] hvmloader/pci: Try to avoid placing BARs in RMRRs

2015-07-21 Thread Tiejun Chen
Try to avoid placing PCI BARs over RMRRs:

- If mmio_hole_size is not specified, and the existing MMIO range has
  RMRRs in it, and there is space to expand the hole in lowmem without
  moving more memory, then make the MMIO hole as large as possible.

- When placing RMRRs, find the next RMRR higher than the current base
  in the lowmem mmio hole.  If it overlaps, skip ahead of it and find
  the next one.

This certainly won't work in all cases, but it should work in a
significant number of cases.  Additionally, users should be able to
work around problems by setting mmio_hole_size larger in the guest
config.

CC: Keir Fraser k...@xen.org
CC: Jan Beulich jbeul...@suse.com
CC: Andrew Cooper andrew.coop...@citrix.com
CC: Ian Jackson ian.jack...@eu.citrix.com
CC: Stefano Stabellini stefano.stabell...@eu.citrix.com
CC: Ian Campbell ian.campb...@citrix.com
CC: Wei Liu wei.l...@citrix.com
Reviewed-by: Jan Beulich jbeul...@suse.com
Signed-off-by: George Dunlap george.dun...@eu.citrix.com
Signed-off-by: Tiejun Chen tiejun.c...@intel.com
---
v11:

* To find the lowest RMRR the _end_ of which is higher than base.

* Refine some code implementations 

v10:

* This is from George' draft patch which implements an acceptable solution in
  current cycle. Here I just implemented check_overlap_all() and some cleanups.

v9:

* A little improvement to code implementation but again, its still argued about
  this solution.

v8:

* Based on current discussion its hard to reshape the original mmio
  allocation mechanism but we haven't a good and simple way to in short term.
  So instead, we don't bring more complicated to intervene that process but
  still check any conflicts to disable all associated devices.

v6 ~ v7:

* Nothing is changed.

v5:

* Rename that field, is_64bar, inside struct bars with flag, and
  then extend to also indicate if this bar is already allocated.

v4:

* We have to re-design this as follows:

  #1. Goal

  MMIO region should exclude all reserved device memory

  #2. Requirements

  #2.1 Still need to make sure MMIO region is fit all pci devices as before

  #2.2 Accommodate the not aligned reserved memory regions

  If I'm missing something let me know.

  #3. How to

  #3.1 Address #2.1

  We need to either of populating more RAM, or of expanding more highmem. But
  we should know just 64bit-bar can work with highmem, and as you mentioned we
  also should avoid expanding highmem as possible. So my implementation is to 
  allocate 32bit-bar and 64bit-bar orderly.

  1. The first allocation round just to 32bit-bar

  If we can finish allocating all 32bit-bar, we just go to allocate 64bit-bar
  with all remaining resources including low pci memory.

  If not, we need to calculate how much RAM should be populated to allocate the 
  remaining 32bit-bars, then populate sufficient RAM as exp_mem_resource to go
  to the second allocation round 2.

  2. The second allocation round to the remaining 32bit-bar

  We should can finish allocating all 32bit-bar in theory, then go to the third
  allocation round 3.

  3. The third allocation round to 64bit-bar

  We'll try to first allocate from the remaining low memory resource. If that
  isn't enough, we try to expand highmem to allocate for 64bit-bar. This process
  should be same as the original.

  #3.2 Address #2.2

  I'm trying to accommodate the not aligned reserved memory regions:

  We should skip all reserved device memory, but we also need to check if other
  smaller bars can be allocated if a mmio hole exists between resource-base and
  reserved device memory. If a hole exists between base and reserved device
  memory, lets go out simply to try allocate for next bar since all bars are in
  descending order of size. If not, we need to move resource-base to 
reserved_end
  just to reallocate this bar.

 tools/firmware/hvmloader/pci.c | 65 ++
 1 file changed, 65 insertions(+)

diff --git a/tools/firmware/hvmloader/pci.c b/tools/firmware/hvmloader/pci.c
index 5ff87a7..74fc080 100644
--- a/tools/firmware/hvmloader/pci.c
+++ b/tools/firmware/hvmloader/pci.c
@@ -38,6 +38,46 @@ uint64_t pci_hi_mem_start = 0, pci_hi_mem_end = 0;
 enum virtual_vga virtual_vga = VGA_none;
 unsigned long igd_opregion_pgbase = 0;
 
+/* Check if the specified range conflicts with any reserved device memory. */
+static bool check_overlap_all(uint64_t start, uint64_t size)
+{
+unsigned int i;
+
+for ( i = 0; i  memory_map.nr_map; i++ )
+{
+if ( memory_map.map[i].type == E820_RESERVED 
+ check_overlap(start, size,
+   memory_map.map[i].addr,
+   memory_map.map[i].size) )
+return true;
+}
+
+return false;
+}
+
+/* Find the lowest RMRR higher than base. */
+static int find_next_rmrr(uint32_t base)
+{
+unsigned int i;
+int next_rmrr = -1;
+uint64_t end, min_end = (1ull  32);
+
+for ( i = 0; i  memory_map.nr_map ; i++ )
+{
+   

[Xen-devel] [v11][PATCH 05/16] hvmloader: get guest memory map into memory_map[]

2015-07-21 Thread Tiejun Chen
Now we get this map layout by call XENMEM_memory_map then
save them into one global variable memory_map[]. It should
include lowmem range, rdm range and highmem range. Note
rdm range and highmem range may not exist in some cases.

And here we need to check if any reserved memory conflicts with
[RESERVED_MEMORY_DYNAMIC_START, RESERVED_MEMORY_DYNAMIC_END).
This range is used to allocate memory in hvmloder level, and
we would lead hvmloader failed in case of conflict since its
another rare possibility in real world.

CC: Keir Fraser k...@xen.org
CC: Jan Beulich jbeul...@suse.com
CC: Andrew Cooper andrew.coop...@citrix.com
CC: Ian Jackson ian.jack...@eu.citrix.com
CC: Stefano Stabellini stefano.stabell...@eu.citrix.com
CC: Ian Campbell ian.campb...@citrix.com
CC: Wei Liu wei.l...@citrix.com
Signed-off-by: Tiejun Chen tiejun.c...@intel.com
Reviewed-by: Kevin Tian kevin.t...@intel.com
Reviewed-by: George Dunlap george.dun...@eu.citrix.com
Acked-by: Jan Beulich jbeul...@suse.com
---
v10 ~ v11:

* Nothing is changed.

v9:

* Correct [RESERVED_MEMORY_DYNAMIC_START, RESERVED_MEMORY_DYNAMIC_END]
- [RESERVED_MEMORY_DYNAMIC_START, RESERVED_MEMORY_DYNAMIC_END) in
  the patch head description;
  Merge two if{} as one if{};

v8:

* Actually we should check this range started from
  RESERVED_MEMORY_DYNAMIC_START, not RESERVED_MEMORY_DYNAMIC_START - 1.
  So correct this and sync the patch head description.

v5 ~ v7:

* Nothing is changed.

v4:

* Move some codes related to e820 to that specific file, e820.c.

* Consolidate printf()+BUG() and BUG_ON()

* Avoid another fixed width type for the parameter of get_mem_mapping_layout()

 tools/firmware/hvmloader/e820.c  | 32 
 tools/firmware/hvmloader/e820.h  |  7 +++
 tools/firmware/hvmloader/hvmloader.c |  2 ++
 tools/firmware/hvmloader/util.c  | 26 ++
 tools/firmware/hvmloader/util.h  | 12 
 5 files changed, 79 insertions(+)

diff --git a/tools/firmware/hvmloader/e820.c b/tools/firmware/hvmloader/e820.c
index 2e05e93..7a414ab 100644
--- a/tools/firmware/hvmloader/e820.c
+++ b/tools/firmware/hvmloader/e820.c
@@ -23,6 +23,38 @@
 #include config.h
 #include util.h
 
+struct e820map memory_map;
+
+void memory_map_setup(void)
+{
+unsigned int nr_entries = E820MAX, i;
+int rc;
+uint64_t alloc_addr = RESERVED_MEMORY_DYNAMIC_START;
+uint64_t alloc_size = RESERVED_MEMORY_DYNAMIC_END - alloc_addr;
+
+rc = get_mem_mapping_layout(memory_map.map, nr_entries);
+
+if ( rc || !nr_entries )
+{
+printf(Get guest memory maps[%d] failed. (%d)\n, nr_entries, rc);
+BUG();
+}
+
+memory_map.nr_map = nr_entries;
+
+for ( i = 0; i  nr_entries; i++ )
+{
+if ( memory_map.map[i].type == E820_RESERVED 
+ check_overlap(alloc_addr, alloc_size,
+   memory_map.map[i].addr, memory_map.map[i].size) )
+{
+printf(Fail to setup memory map due to conflict);
+printf( on dynamic reserved memory range.\n);
+BUG();
+}
+}
+}
+
 void dump_e820_table(struct e820entry *e820, unsigned int nr)
 {
 uint64_t last_end = 0, start, end;
diff --git a/tools/firmware/hvmloader/e820.h b/tools/firmware/hvmloader/e820.h
index b2ead7f..8b5a9e0 100644
--- a/tools/firmware/hvmloader/e820.h
+++ b/tools/firmware/hvmloader/e820.h
@@ -15,6 +15,13 @@ struct e820entry {
 uint32_t type;
 } __attribute__((packed));
 
+#define E820MAX128
+
+struct e820map {
+unsigned int nr_map;
+struct e820entry map[E820MAX];
+};
+
 #endif /* __HVMLOADER_E820_H__ */
 
 /*
diff --git a/tools/firmware/hvmloader/hvmloader.c 
b/tools/firmware/hvmloader/hvmloader.c
index 25b7f08..84c588c 100644
--- a/tools/firmware/hvmloader/hvmloader.c
+++ b/tools/firmware/hvmloader/hvmloader.c
@@ -262,6 +262,8 @@ int main(void)
 
 init_hypercalls();
 
+memory_map_setup();
+
 xenbus_setup();
 
 bios = detect_bios();
diff --git a/tools/firmware/hvmloader/util.c b/tools/firmware/hvmloader/util.c
index 80d822f..122e3fa 100644
--- a/tools/firmware/hvmloader/util.c
+++ b/tools/firmware/hvmloader/util.c
@@ -27,6 +27,17 @@
 #include xen/memory.h
 #include xen/sched.h
 
+/*
+ * Check whether there exists overlap in the specified memory range.
+ * Returns true if exists, else returns false.
+ */
+bool check_overlap(uint64_t start, uint64_t size,
+   uint64_t reserved_start, uint64_t reserved_size)
+{
+return (start + size  reserved_start) 
+(start  reserved_start + reserved_size);
+}
+
 void wrmsr(uint32_t idx, uint64_t v)
 {
 asm volatile (
@@ -368,6 +379,21 @@ uuid_to_string(char *dest, uint8_t *uuid)
 *p = '\0';
 }
 
+int get_mem_mapping_layout(struct e820entry entries[], uint32_t *max_entries)
+{
+int rc;
+struct xen_memory_map memmap = {
+.nr_entries = *max_entries
+};
+
+set_xen_guest_handle(memmap.buffer, entries);
+
+rc = 

[Xen-devel] [v11][PATCH 14/16] xen/vtd: enable USB device assignment

2015-07-21 Thread Tiejun Chen
USB RMRR may conflict with guest BIOS region. In such case, identity
mapping setup is simply skipped in previous implementation. Now we
can handle this scenario cleanly with new policy mechanism so previous
hack code can be removed now.

CC: Yang Zhang yang.z.zh...@intel.com
CC: Kevin Tian kevin.t...@intel.com
Signed-off-by: Tiejun Chen tiejun.c...@intel.com
Acked-by: Kevin Tian kevin.t...@intel.com
---
v5 ~ v11:

* Nothing is changed.

v4:

* Refine the patch head description

 xen/drivers/passthrough/vtd/dmar.h  |  1 -
 xen/drivers/passthrough/vtd/iommu.c | 11 ++-
 xen/drivers/passthrough/vtd/utils.c |  7 ---
 3 files changed, 2 insertions(+), 17 deletions(-)

diff --git a/xen/drivers/passthrough/vtd/dmar.h 
b/xen/drivers/passthrough/vtd/dmar.h
index af1feef..af205f5 100644
--- a/xen/drivers/passthrough/vtd/dmar.h
+++ b/xen/drivers/passthrough/vtd/dmar.h
@@ -129,7 +129,6 @@ do {\
 
 int vtd_hw_check(void);
 void disable_pmr(struct iommu *iommu);
-int is_usb_device(u16 seg, u8 bus, u8 devfn);
 int is_igd_drhd(struct acpi_drhd_unit *drhd);
 
 #endif /* _DMAR_H_ */
diff --git a/xen/drivers/passthrough/vtd/iommu.c 
b/xen/drivers/passthrough/vtd/iommu.c
index a2f3a66..8a8d763 100644
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -2242,11 +2242,9 @@ static int reassign_device_ownership(
 /*
  * If the device belongs to the hardware domain, and it has RMRR, don't
  * remove it from the hardware domain, because BIOS may use RMRR at
- * booting time. Also account for the special casing of USB below (in
- * intel_iommu_assign_device()).
+ * booting time.
  */
-if ( !is_hardware_domain(source) 
- !is_usb_device(pdev-seg, pdev-bus, pdev-devfn) )
+if ( !is_hardware_domain(source) )
 {
 const struct acpi_rmrr_unit *rmrr;
 u16 bdf;
@@ -2299,13 +2297,8 @@ static int intel_iommu_assign_device(
 if ( ret )
 return ret;
 
-/* FIXME: Because USB RMRR conflicts with guest bios region,
- * ignore USB RMRR temporarily.
- */
 seg = pdev-seg;
 bus = pdev-bus;
-if ( is_usb_device(seg, bus, pdev-devfn) )
-return 0;
 
 /* Setup rmrr identity mapping */
 for_each_rmrr_device( rmrr, bdf, i )
diff --git a/xen/drivers/passthrough/vtd/utils.c 
b/xen/drivers/passthrough/vtd/utils.c
index bd14c02..b8a077f 100644
--- a/xen/drivers/passthrough/vtd/utils.c
+++ b/xen/drivers/passthrough/vtd/utils.c
@@ -29,13 +29,6 @@
 #include extern.h
 #include asm/io_apic.h
 
-int is_usb_device(u16 seg, u8 bus, u8 devfn)
-{
-u16 class = pci_conf_read16(seg, bus, PCI_SLOT(devfn), PCI_FUNC(devfn),
-PCI_CLASS_DEVICE);
-return (class == 0xc03);
-}
-
 /* Disable vt-d protected memory registers. */
 void disable_pmr(struct iommu *iommu)
 {
-- 
1.9.1


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [v11][PATCH 03/16] xen/passthrough: extend hypercall to support rdm reservation policy

2015-07-21 Thread Tiejun Chen
This patch extends the existing hypercall to support rdm reservation policy.
We return error or just throw out a warning message depending on whether
the policy is strict or relaxed when reserving RDM regions in pfn space.
Note in some special cases, e.g. add a device to hwdomain, and remove a
device from user domain, 'relaxed' is fine enough since this is always safe
to hwdomain.

CC: Tim Deegan t...@xen.org
CC: Keir Fraser k...@xen.org
CC: Jan Beulich jbeul...@suse.com
CC: Andrew Cooper andrew.coop...@citrix.com
CC: Suravee Suthikulpanit suravee.suthikulpa...@amd.com
CC: Aravind Gopalakrishnan aravind.gopalakrish...@amd.com
CC: Ian Campbell ian.campb...@citrix.com
CC: Stefano Stabellini stefano.stabell...@citrix.com
CC: Yang Zhang yang.z.zh...@intel.com
CC: Kevin Tian kevin.t...@intel.com
Signed-off-by: Tiejun Chen tiejun.c...@intel.com
Reviewed-by: George Dunlap george.dun...@eu.citrix.com
Acked-by: Jan Beulich jbeul...@suse.com
---
v10 ~ v11:

* Nothing is changed.

v9:

* Correct one check condition of XEN_DOMCTL_DEV_RDM_RELAXED

v8:

* Force to pass 0(strict) when add or move a device in hardware domain,
  and improve some associated code comments.

v6 ~ v7:

* Nothing is changed.

v5:

* Just leave one bit XEN_DOMCTL_DEV_RDM_RELAXED as our flag, so
  0 means strict and 1 means relaxed.

* So make DT device ignore the flag field

* Improve the code comments

v4:

* Add code comments to describer why we fix to set a policy flag in some
  cases like adding a device to hwdomain, and removing a device from user 
domain.

* Avoid using fixed width types for the parameter of set_identity_p2m_entry()

* Fix one judging condition
  domctl-u.assign_device.flag == XEN_DOMCTL_DEV_NO_RDM
  - domctl-u.assign_device.flag != XEN_DOMCTL_DEV_NO_RDM

* Add to range check the flag passed to make future extensions possible
  (and to avoid ambiguity on what out of range values would mean).

 xen/arch/x86/mm/p2m.c   |  7 --
 xen/drivers/passthrough/amd/pci_amd_iommu.c |  3 ++-
 xen/drivers/passthrough/arm/smmu.c  |  2 +-
 xen/drivers/passthrough/device_tree.c   |  3 ++-
 xen/drivers/passthrough/pci.c   | 15 
 xen/drivers/passthrough/vtd/iommu.c | 37 ++---
 xen/include/asm-x86/p2m.h   |  2 +-
 xen/include/public/domctl.h |  3 +++
 xen/include/xen/iommu.h |  2 +-
 9 files changed, 55 insertions(+), 19 deletions(-)

diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index 1e763dc..89616b7 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -901,7 +901,7 @@ int set_mmio_p2m_entry(struct domain *d, unsigned long gfn, 
mfn_t mfn,
 }
 
 int set_identity_p2m_entry(struct domain *d, unsigned long gfn,
-   p2m_access_t p2ma)
+   p2m_access_t p2ma, unsigned int flag)
 {
 p2m_type_t p2mt;
 p2m_access_t a;
@@ -923,7 +923,10 @@ int set_identity_p2m_entry(struct domain *d, unsigned long 
gfn,
 ret = 0;
 else
 {
-ret = -EBUSY;
+if ( flag  XEN_DOMCTL_DEV_RDM_RELAXED )
+ret = 0;
+else
+ret = -EBUSY;
 printk(XENLOG_G_WARNING
Cannot setup identity map d%d:%lx,
 gfn already mapped to %lx.\n,
diff --git a/xen/drivers/passthrough/amd/pci_amd_iommu.c 
b/xen/drivers/passthrough/amd/pci_amd_iommu.c
index e83bb35..920b35a 100644
--- a/xen/drivers/passthrough/amd/pci_amd_iommu.c
+++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c
@@ -394,7 +394,8 @@ static int reassign_device(struct domain *source, struct 
domain *target,
 }
 
 static int amd_iommu_assign_device(struct domain *d, u8 devfn,
-   struct pci_dev *pdev)
+   struct pci_dev *pdev,
+   u32 flag)
 {
 struct ivrs_mappings *ivrs_mappings = get_ivrs_mappings(pdev-seg);
 int bdf = PCI_BDF2(pdev-bus, devfn);
diff --git a/xen/drivers/passthrough/arm/smmu.c 
b/xen/drivers/passthrough/arm/smmu.c
index 6cc4394..9a667e9 100644
--- a/xen/drivers/passthrough/arm/smmu.c
+++ b/xen/drivers/passthrough/arm/smmu.c
@@ -2605,7 +2605,7 @@ static void arm_smmu_destroy_iommu_domain(struct 
iommu_domain *domain)
 }
 
 static int arm_smmu_assign_dev(struct domain *d, u8 devfn,
-  struct device *dev)
+  struct device *dev, u32 flag)
 {
struct iommu_domain *domain;
struct arm_smmu_xen_domain *xen_domain;
diff --git a/xen/drivers/passthrough/device_tree.c 
b/xen/drivers/passthrough/device_tree.c
index 5d3842a..7ff79f8 100644
--- a/xen/drivers/passthrough/device_tree.c
+++ b/xen/drivers/passthrough/device_tree.c
@@ -52,7 +52,8 @@ int iommu_assign_dt_device(struct domain *d, struct 
dt_device_node *dev)
 goto fail;
 }
 
-rc = hd-platform_ops-assign_device(d, 0, dt_to_dev(dev));
+/* The flag field doesn't 

[Xen-devel] [v11][PATCH 11/16] tools/libxl: detect and avoid conflicts with RDM

2015-07-21 Thread Tiejun Chen
While building a VM, HVM domain builder provides struct hvm_info_table{}
to help hvmloader. Currently it includes two fields to construct guest
e820 table by hvmloader, low_mem_pgend and high_mem_pgend. So we should
check them to fix any conflict with RDM.

RMRR can reside in address space beyond 4G theoretically, but we never
see this in real world. So in order to avoid breaking highmem layout
we don't solve highmem conflict. Note this means highmem rmrr could still
be supported if no conflict.

But in the case of lowmem, RMRR probably scatter the whole RAM space.
Especially multiple RMRR entries would worsen this to lead a complicated
memory layout. And then its hard to extend hvm_info_table{} to work
hvmloader out. So here we're trying to figure out a simple solution to
avoid breaking existing layout. So when a conflict occurs,

#1. Above a predefined boundary (2G)
- move lowmem_end below reserved region to solve conflict;

#2. Below a predefined boundary (2G)
- Check strict/relaxed policy.
strict policy leads to fail libxl. Note when both policies
are specified on a given region, 'strict' is always preferred.
relaxed policy issue a warning message and also mask this entry 
INVALID
to indicate we shouldn't expose this entry to hvmloader.

Note later we need to provide a parameter to set that predefined boundary
dynamically.

CC: Ian Jackson ian.jack...@eu.citrix.com
CC: Stefano Stabellini stefano.stabell...@eu.citrix.com
CC: Ian Campbell ian.campb...@citrix.com
CC: Wei Liu wei.l...@citrix.com
Acked-by: Wei Liu wei.l...@citrix.com
Signed-off-by: Tiejun Chen tiejun.c...@intel.com
Reviewed-by: Kevin Tian kevin.t...@intel.com
---
v11:

* Use GCNEW_ARRAY to replace libxl__malloc()

* #define pfn_to_paddrk is missing safety () around x, and
  move this into libxl_internal.h

* Rename set_rdm_entries() to add_rdm_entry() and put the
  increment at the end so that the assignments are
  to -rdms[d_config-num_rdms].

* Simply make it so that if there are any rdms specified
  in the domain config, they are used instead of the
  automatically gathered information (from strategy and
  devices). So just return if d_config-rmds is valid.

* Shorten some code comments.

v9 ~ v10:

* Nothing is changed.

v8:

* Introduce pfn_to_paddr(x) - ((uint64_t)x  XC_PAGE_SHIFT)
  and set_rdm_entries() to factor out current codes.

v7:

* Just sync with the fallout of renaming parameters from patch #10.

v6:

* fix some code stypes
* Refine libxl__xc_device_get_rdm()

v5:

* A little change to make sure the per-device policy always override the global
  policy and correct its associated code comments.
* Fix one typo in the patch head description
* Rename xc_device_get_rdm() with libxl__xc_device_get_rdm(), and then replace
  malloc() with libxl__malloc(), and finally cleanup this fallout.
* libxl__xc_device_get_rdm() should return proper libxl error code, ERROR_FAIL.
  Then instead, the allocated RDM entries would be returned with an out 
parameter.

v4:

* Consistent to use term RDM.
* Unconditionally set *nr_entries to 0
* Grab to all sutffs to provide a parameter to set our predefined boundary
  dynamically to as a separated patch later

 tools/libxl/libxl_create.c   |   2 +-
 tools/libxl/libxl_dm.c   | 274 +++
 tools/libxl/libxl_dom.c  |  17 ++-
 tools/libxl/libxl_internal.h |  14 ++-
 tools/libxl/libxl_types.idl  |   7 ++
 5 files changed, 311 insertions(+), 3 deletions(-)

diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index 7c884c4..5b57062 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -407,7 +407,7 @@ int libxl__domain_build(libxl__gc *gc,
 
 switch (info-type) {
 case LIBXL_DOMAIN_TYPE_HVM:
-ret = libxl__build_hvm(gc, domid, info, state);
+ret = libxl__build_hvm(gc, domid, d_config, state);
 if (ret)
 goto out;
 
diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c
index 634b8d2..29476fc 100644
--- a/tools/libxl/libxl_dm.c
+++ b/tools/libxl/libxl_dm.c
@@ -92,6 +92,280 @@ const char *libxl__domain_device_model(libxl__gc *gc,
 return dm;
 }
 
+static int
+libxl__xc_device_get_rdm(libxl__gc *gc,
+ uint32_t flag,
+ uint16_t seg,
+ uint8_t bus,
+ uint8_t devfn,
+ unsigned int *nr_entries,
+ struct xen_reserved_device_memory **xrdm)
+{
+int rc = 0, r;
+
+/*
+ * We really can't presume how many entries we can get in advance.
+ */
+*nr_entries = 0;
+r = xc_reserved_device_memory_map(CTX-xch, flag, seg, bus, devfn,
+  NULL, nr_entries);
+assert(r = 0);
+/* 0 means we have no any rdm entry. */
+if (!r) goto out;
+
+if (errno != ENOBUFS) {
+rc = ERROR_FAIL;
+goto out;
+}
+
+

[Xen-devel] [v11][PATCH 07/16] hvmloader/e820: construct guest e820 table

2015-07-21 Thread Tiejun Chen
Now use the hypervisor-supplied memory map to build our final e820 table:
* Add regions for BIOS ranges and other special mappings not in the
  hypervisor map
* Add in the hypervisor supplied regions
* Adjust the lowmem and highmem regions if we've had to relocate
  memory (adding a highmem region if necessary)
* Sort all the ranges so that they appear in memory order.

CC: Keir Fraser k...@xen.org
CC: Jan Beulich jbeul...@suse.com
CC: Andrew Cooper andrew.coop...@citrix.com
CC: Ian Jackson ian.jack...@eu.citrix.com
CC: Stefano Stabellini stefano.stabell...@eu.citrix.com
CC: Ian Campbell ian.campb...@citrix.com
CC: Wei Liu wei.l...@citrix.com
Reviewed-by: George Dunlap george.dun...@eu.citrix.com
Reviewed-by: Jan Beulich jbeul...@suse.com
Signed-off-by: Tiejun Chen tiejun.c...@intel.com
---
v11:

* To check/sync memory_map.map[] before copy them into e820 since
  ultimately this can make sure hvm_info, memory_map.map[] and e820
  are on the same page. 

* Refine some code implementations

v10:

* Instead of correcting e820, I'd like to correct memory_map.map[]
  and then copy them into e820 directly. I think this can make sure
  hvm_info, memory_map.map[] and e820 are on the same page.

v9:

* Refine that chunk of codes to check/modify highmem

v8:

* define low_mem_end as uint32_t

* Correct those two wrong loops, memory_map.nr_map - nr
  when we're trying to revise low/high memory e820 entries.

* Improve code comments and the patch head description

* Add one check if highmem is just populated by hvmloader itself

v5 ~ v7:

* Nothing is changed.

v4:

* Rename local variable, low_mem_pgend, to low_mem_end.

* Improve some code comments

* Adjust highmem after lowmem is changed.
 
 tools/firmware/hvmloader/e820.c | 109 +++-
 1 file changed, 96 insertions(+), 13 deletions(-)

diff --git a/tools/firmware/hvmloader/e820.c b/tools/firmware/hvmloader/e820.c
index 7a414ab..a6cacdf 100644
--- a/tools/firmware/hvmloader/e820.c
+++ b/tools/firmware/hvmloader/e820.c
@@ -105,7 +105,11 @@ int build_e820_table(struct e820entry *e820,
  unsigned int lowmem_reserved_base,
  unsigned int bios_image_base)
 {
-unsigned int nr = 0;
+unsigned int nr = 0, i, j;
+uint32_t low_mem_end = hvm_info-low_mem_pgend  PAGE_SHIFT;
+uint32_t add_high_mem = 0;
+uint64_t high_mem_end = (uint64_t)hvm_info-high_mem_pgend  PAGE_SHIFT;
+uint64_t map_start, map_size, map_end;
 
 if ( !lowmem_reserved_base )
 lowmem_reserved_base = 0xA;
@@ -149,13 +153,6 @@ int build_e820_table(struct e820entry *e820,
 e820[nr].type = E820_RESERVED;
 nr++;
 
-/* Low RAM goes here. Reserve space for special pages. */
-BUG_ON((hvm_info-low_mem_pgend  PAGE_SHIFT)  (2u  20));
-e820[nr].addr = 0x10;
-e820[nr].size = (hvm_info-low_mem_pgend  PAGE_SHIFT) - e820[nr].addr;
-e820[nr].type = E820_RAM;
-nr++;
-
 /*
  * Explicitly reserve space for special pages.
  * This space starts at RESERVED_MEMBASE an extends to cover various
@@ -191,16 +188,102 @@ int build_e820_table(struct e820entry *e820,
 nr++;
 }
 
+/* Low RAM goes here. Reserve space for special pages. */
+BUG_ON(low_mem_end  (2u  20));
 
-if ( hvm_info-high_mem_pgend )
+/*
+ * Construct E820 table according to recorded memory map.
+ *
+ * The memory map created by toolstack may include,
+ *
+ * #1. Low memory region
+ *
+ * Low RAM starts at least from 1M to make sure all standard regions
+ * of the PC memory map, like BIOS, VGA memory-mapped I/O and vgabios,
+ * have enough space.
+ *
+ * #2. Reserved regions if they exist
+ *
+ * #3. High memory region if it exists
+ *
+ * Note we just have one low memory entry and one high mmeory entry if
+ * exists.
+ *
+ * But we may have relocated RAM to allocate sufficient MMIO previously
+ * so low_mem_pgend would be changed over there. And here memory_map[]
+ * records the original low/high memory, so if low_mem_end is less than
+ * the original we need to revise low/high memory range firstly.
+ */
+for ( i = 0; i  memory_map.nr_map; i++ )
 {
-e820[nr].addr = ((uint64_t)1  32);
-e820[nr].size =
-((uint64_t)hvm_info-high_mem_pgend  PAGE_SHIFT) - e820[nr].addr;
-e820[nr].type = E820_RAM;
+map_start = memory_map.map[i].addr;
+map_size = memory_map.map[i].size;
+map_end = map_start + map_size;
+
+/* If we need to adjust lowmem. */
+if ( memory_map.map[i].type == E820_RAM 
+ low_mem_end  map_start  low_mem_end  map_end )
+{
+add_high_mem = map_end - low_mem_end;
+memory_map.map[i].size = low_mem_end - map_start;
+break;
+}
+}
+
+/* If we need to adjust highmem. */
+if ( add_high_mem )
+{
+/* Modify the existing highmem 

[Xen-devel] [v11][PATCH 09/16] tools: extend xc_assign_device() to support rdm reservation policy

2015-07-21 Thread Tiejun Chen
This patch passes rdm reservation policy to xc_assign_device() so the policy
is checked when assigning devices to a VM.

Note this also bring some fallout to python usage of xc_assign_device().

CC: Ian Jackson ian.jack...@eu.citrix.com
CC: Stefano Stabellini stefano.stabell...@eu.citrix.com
CC: Ian Campbell ian.campb...@citrix.com
CC: Wei Liu wei.l...@citrix.com
CC: David Scott dave.sc...@eu.citrix.com
Acked-by: Wei Liu wei.l...@citrix.com
Signed-off-by: Tiejun Chen tiejun.c...@intel.com
---
v6 ~ v11:

* Nothing is changed.

v5:

* Fix the flag field as 0 to DT device

v4:

* In the patch head description, I add to explain why we need to sync
  the xc.c file

 tools/libxc/include/xenctrl.h   |  3 ++-
 tools/libxc/xc_domain.c |  9 -
 tools/libxl/libxl_pci.c |  3 ++-
 tools/ocaml/libs/xc/xenctrl_stubs.c | 16 
 tools/python/xen/lowlevel/xc/xc.c   | 30 --
 5 files changed, 44 insertions(+), 17 deletions(-)

diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
index 2991333..5c535c4 100644
--- a/tools/libxc/include/xenctrl.h
+++ b/tools/libxc/include/xenctrl.h
@@ -2067,7 +2067,8 @@ int xc_hvm_destroy_ioreq_server(xc_interface *xch,
 /* HVM guest pass-through */
 int xc_assign_device(xc_interface *xch,
  uint32_t domid,
- uint32_t machine_sbdf);
+ uint32_t machine_sbdf,
+ uint32_t flag);
 
 int xc_get_device_group(xc_interface *xch,
  uint32_t domid,
diff --git a/tools/libxc/xc_domain.c b/tools/libxc/xc_domain.c
index 298b3b5..69e6d8f 100644
--- a/tools/libxc/xc_domain.c
+++ b/tools/libxc/xc_domain.c
@@ -1697,7 +1697,8 @@ int xc_domain_setdebugging(xc_interface *xch,
 int xc_assign_device(
 xc_interface *xch,
 uint32_t domid,
-uint32_t machine_sbdf)
+uint32_t machine_sbdf,
+uint32_t flag)
 {
 DECLARE_DOMCTL;
 
@@ -1705,6 +1706,7 @@ int xc_assign_device(
 domctl.domain = domid;
 domctl.u.assign_device.dev = XEN_DOMCTL_DEV_PCI;
 domctl.u.assign_device.u.pci.machine_sbdf = machine_sbdf;
+domctl.u.assign_device.flag = flag;
 
 return do_domctl(xch, domctl);
 }
@@ -1792,6 +1794,11 @@ int xc_assign_dt_device(
 
 domctl.u.assign_device.dev = XEN_DOMCTL_DEV_DT;
 domctl.u.assign_device.u.dt.size = size;
+/*
+ * DT doesn't own any RDM so actually DT has nothing to do
+ * for any flag and here just fix that as 0.
+ */
+domctl.u.assign_device.flag = 0;
 set_xen_guest_handle(domctl.u.assign_device.u.dt.path, path);
 
 rc = do_domctl(xch, domctl);
diff --git a/tools/libxl/libxl_pci.c b/tools/libxl/libxl_pci.c
index e0743f8..632c15e 100644
--- a/tools/libxl/libxl_pci.c
+++ b/tools/libxl/libxl_pci.c
@@ -894,6 +894,7 @@ static int do_pci_add(libxl__gc *gc, uint32_t domid, 
libxl_device_pci *pcidev, i
 FILE *f;
 unsigned long long start, end, flags, size;
 int irq, i, rc, hvm = 0;
+uint32_t flag = XEN_DOMCTL_DEV_RDM_RELAXED;
 
 if (type == LIBXL_DOMAIN_TYPE_INVALID)
 return ERROR_FAIL;
@@ -987,7 +988,7 @@ static int do_pci_add(libxl__gc *gc, uint32_t domid, 
libxl_device_pci *pcidev, i
 
 out:
 if (!libxl_is_stubdom(ctx, domid, NULL)) {
-rc = xc_assign_device(ctx-xch, domid, pcidev_encode_bdf(pcidev));
+rc = xc_assign_device(ctx-xch, domid, pcidev_encode_bdf(pcidev), 
flag);
 if (rc  0  (hvm || errno != ENOSYS)) {
 LIBXL__LOG_ERRNO(ctx, LIBXL__LOG_ERROR, xc_assign_device failed);
 return ERROR_FAIL;
diff --git a/tools/ocaml/libs/xc/xenctrl_stubs.c 
b/tools/ocaml/libs/xc/xenctrl_stubs.c
index 64f1137..b7de615 100644
--- a/tools/ocaml/libs/xc/xenctrl_stubs.c
+++ b/tools/ocaml/libs/xc/xenctrl_stubs.c
@@ -1172,12 +1172,17 @@ CAMLprim value stub_xc_domain_test_assign_device(value 
xch, value domid, value d
CAMLreturn(Val_bool(ret == 0));
 }
 
-CAMLprim value stub_xc_domain_assign_device(value xch, value domid, value desc)
+static int domain_assign_device_rdm_flag_table[] = {
+XEN_DOMCTL_DEV_RDM_RELAXED,
+};
+
+CAMLprim value stub_xc_domain_assign_device(value xch, value domid, value desc,
+value rflag)
 {
-   CAMLparam3(xch, domid, desc);
+   CAMLparam4(xch, domid, desc, rflag);
int ret;
int domain, bus, dev, func;
-   uint32_t sbdf;
+   uint32_t sbdf, flag;
 
domain = Int_val(Field(desc, 0));
bus = Int_val(Field(desc, 1));
@@ -1185,7 +1190,10 @@ CAMLprim value stub_xc_domain_assign_device(value xch, 
value domid, value desc)
func = Int_val(Field(desc, 3));
sbdf = encode_sbdf(domain, bus, dev, func);
 
-   ret = xc_assign_device(_H(xch), _D(domid), sbdf);
+   ret = Int_val(Field(rflag, 0));
+   flag = domain_assign_device_rdm_flag_table[ret];
+
+   ret = xc_assign_device(_H(xch), _D(domid), sbdf, flag);
 
if (ret  0)

[Xen-devel] [v11][PATCH 01/16] xen: introduce XENMEM_reserved_device_memory_map

2015-07-21 Thread Tiejun Chen
From: Jan Beulich jbeul...@suse.com

This is a prerequisite for punching holes into HVM and PVH guests' P2M
to allow passing through devices that are associated with (on VT-d)
RMRRs.

CC: Jan Beulich jbeul...@suse.com
CC: Yang Zhang yang.z.zh...@intel.com
CC: Kevin Tian kevin.t...@intel.com
Signed-off-by: Jan Beulich jbeul...@suse.com
Signed-off-by: Tiejun Chen tiejun.c...@intel.com
Acked-by: Kevin Tian kevin.t...@intel.com
---
v7 ~ v11:

* Nothing is changed.

v6:

* Add a comments to the nr_entries field inside xen_reserved_device_memory_map

v5 ~ v4:

* Nothing is changed.

 xen/common/compat/memory.c   | 66 
 xen/common/memory.c  | 64 ++
 xen/drivers/passthrough/iommu.c  | 10 ++
 xen/drivers/passthrough/vtd/dmar.c   | 32 +
 xen/drivers/passthrough/vtd/extern.h |  1 +
 xen/drivers/passthrough/vtd/iommu.c  |  1 +
 xen/include/public/memory.h  | 37 +++-
 xen/include/xen/iommu.h  | 10 ++
 xen/include/xen/pci.h|  2 ++
 xen/include/xlat.lst |  3 +-
 10 files changed, 224 insertions(+), 2 deletions(-)

diff --git a/xen/common/compat/memory.c b/xen/common/compat/memory.c
index b258138..b608496 100644
--- a/xen/common/compat/memory.c
+++ b/xen/common/compat/memory.c
@@ -17,6 +17,45 @@ CHECK_TYPE(domid);
 CHECK_mem_access_op;
 CHECK_vmemrange;
 
+#ifdef HAS_PASSTHROUGH
+struct get_reserved_device_memory {
+struct compat_reserved_device_memory_map map;
+unsigned int used_entries;
+};
+
+static int get_reserved_device_memory(xen_pfn_t start, xen_ulong_t nr,
+  u32 id, void *ctxt)
+{
+struct get_reserved_device_memory *grdm = ctxt;
+u32 sbdf;
+struct compat_reserved_device_memory rdm = {
+.start_pfn = start, .nr_pages = nr
+};
+
+sbdf = PCI_SBDF2(grdm-map.seg, grdm-map.bus, grdm-map.devfn);
+if ( (grdm-map.flag  PCI_DEV_RDM_ALL) || (sbdf == id) )
+{
+if ( grdm-used_entries  grdm-map.nr_entries )
+{
+if ( rdm.start_pfn != start || rdm.nr_pages != nr )
+return -ERANGE;
+
+if ( __copy_to_compat_offset(grdm-map.buffer,
+ grdm-used_entries,
+ rdm,
+ 1) )
+{
+return -EFAULT;
+}
+}
+++grdm-used_entries;
+return 1;
+}
+
+return 0;
+}
+#endif
+
 int compat_memory_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) compat)
 {
 int split, op = cmd  MEMOP_CMD_MASK;
@@ -303,6 +342,33 @@ int compat_memory_op(unsigned int cmd, 
XEN_GUEST_HANDLE_PARAM(void) compat)
 break;
 }
 
+#ifdef HAS_PASSTHROUGH
+case XENMEM_reserved_device_memory_map:
+{
+struct get_reserved_device_memory grdm;
+
+if ( copy_from_guest(grdm.map, compat, 1) ||
+ !compat_handle_okay(grdm.map.buffer, grdm.map.nr_entries) )
+return -EFAULT;
+
+grdm.used_entries = 0;
+rc = iommu_get_reserved_device_memory(get_reserved_device_memory,
+  grdm);
+
+if ( !rc  grdm.map.nr_entries  grdm.used_entries )
+rc = -ENOBUFS;
+
+grdm.map.nr_entries = grdm.used_entries;
+if ( grdm.map.nr_entries )
+{
+if ( __copy_to_guest(compat, grdm.map, 1) )
+rc = -EFAULT;
+}
+
+return rc;
+}
+#endif
+
 default:
 return compat_arch_memory_op(cmd, compat);
 }
diff --git a/xen/common/memory.c b/xen/common/memory.c
index e5d49d8..2fa45d0 100644
--- a/xen/common/memory.c
+++ b/xen/common/memory.c
@@ -748,6 +748,43 @@ static int construct_memop_from_reservation(
 return 0;
 }
 
+#ifdef HAS_PASSTHROUGH
+struct get_reserved_device_memory {
+struct xen_reserved_device_memory_map map;
+unsigned int used_entries;
+};
+
+static int get_reserved_device_memory(xen_pfn_t start, xen_ulong_t nr,
+  u32 id, void *ctxt)
+{
+struct get_reserved_device_memory *grdm = ctxt;
+u32 sbdf;
+
+sbdf = PCI_SBDF2(grdm-map.seg, grdm-map.bus, grdm-map.devfn);
+if ( (grdm-map.flag  PCI_DEV_RDM_ALL) || (sbdf == id) )
+{
+if ( grdm-used_entries  grdm-map.nr_entries )
+{
+struct xen_reserved_device_memory rdm = {
+.start_pfn = start, .nr_pages = nr
+};
+
+if ( __copy_to_guest_offset(grdm-map.buffer,
+grdm-used_entries,
+rdm,
+1) )
+{
+return -EFAULT;
+}
+}
+++grdm-used_entries;
+return 

Re: [Xen-devel] [PATCH v2 1/3] x86/ldt: Make modify_ldt synchronous

2015-07-21 Thread Andrew Cooper
On 22/07/2015 01:28, Andy Lutomirski wrote:
 On Tue, Jul 21, 2015 at 5:21 PM, Andrew Cooper
 andrew.coop...@citrix.com wrote:
 On 22/07/2015 01:07, Andy Lutomirski wrote:
 On Tue, Jul 21, 2015 at 4:38 PM, Andrew Cooper
 andrew.coop...@citrix.com wrote:
 On 21/07/2015 22:53, Boris Ostrovsky wrote:
 On 07/21/2015 03:59 PM, Andy Lutomirski wrote:
 --- a/arch/x86/include/asm/mmu_context.h
 +++ b/arch/x86/include/asm/mmu_context.h
 @@ -34,6 +34,44 @@ static inline void load_mm_cr4(struct mm_struct
 *mm) {}
   #endif
 /*
 + * ldt_structs can be allocated, used, and freed, but they are never
 + * modified while live.
 + */
 +struct ldt_struct {
 +int size;
 +int __pad;/* keep the descriptors naturally aligned. */
 +struct desc_struct entries[];
 +};

 This breaks Xen which expects LDT to be page-aligned. Not sure why.

 Jan, Andrew?
 PV guests are not permitted to have writeable mappings to the frames
 making up the GDT and LDT, so it cannot make unaudited changes to
 loadable descriptors.  In particular, for a 32bit PV guest, it is only
 the segment limit which protects Xen from the ring1 guest kernel.

 A lot of this code hasn't been touched in years, and it certainly
 predates me.  The alignment requirement appears to come from the virtual
 region Xen uses to map the guests GDT and LDT.  Strict alignment is
 required for the GDT so Xen's descriptors starting at 0xe0xx are
 correct, but the LDT alignment seems to be a side effect of similar
 codepaths.

 For an LDT smaller than 8192 entries, I can't see any specific reason
 for enforcing alignment, other than that's the way it has always been.

 However, the guest would still have to relinquish write access to all
 frames which make up the LDT, which looks to be a bit of an issue given
 the snippet above.
 Does the LDT itself need to be aligned or just the address passed to
 paravirt_alloc_ldt?
 The address which Xen receives needs to be aligned.

 It looks like xen_alloc_ldt() blindly assumes that the desc_struct *ldt
 it is passed is page aligned, and passes it straight through.
 xen_alloc_ldt is just fiddling with protection though, I think.  Isn't
 it xen_set_ldt that's the meat?  We could easily pass xen_alloc_ldt a
 pointer to the ldt_struct.

So it is.  It is the linear_addr in xen_set_ldt() which Xen currently
audits to be page aligned.

 This will allow ldt_struct itself to be page aligned, and for the size
 field to sit across the base/limit field of what would logically be
 selector 0x0008  There would be some issues accessing size.  To load
 frames as an LDT, a guest must drop all refs to the page so that its
 type may be changed from writeable to segdesc.  After that, an
 update_descriptor hypercall can be used to change size, and I believe
 the guest may subsequently recreate read-only mappings to the frames in
 question (although frankly it is getting late so you will want to double
 check all of this).

 Anyhow, this looks like an issue which should be fixed up with slightly
 more PVOps, rather than enforcing a Xen view of the world on native Linux.

 I could presumably make the allocation the other way around so the
 size is at the end.  I could even use two separate allocations if
 needed.
 I suspect two separate allocations would be the better solution, as it
 means that the size field doesn't need to be subject to funny page
 permissions.
 True.  OTOH we never write to the size field after allocating the thing.

Right, but even reading it is going to cause problems if one of the
paravirt ops can't re-establish ro mappings.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 2/3] xen-blkfront: rm BUG_ON(info-feature_persistent) in blkif_free

2015-07-21 Thread Bob Liu


On 07/22/2015 12:43 PM, Bob Liu wrote:
 
 On 07/21/2015 05:25 PM, Roger Pau Monné wrote:
 El 21/07/15 a les 5.30, Bob Liu ha escrit:
 This BUG_ON() in blkif_free() is incorrect, because indirect page can be 
 added
 to list info-indirect_pages in blkif_completion() no matter 
 feature_persistent
 is true or false.

 Signed-off-by: Bob Liu bob@oracle.com

 Acked-by: Roger Pau Monné roger@citrix.com

 This was probably an oversight from when blkif_completion was changed to
 check for gnttab_query_foreign_access. It should be backported to stable
 trees.

 
 Sorry, this patch is buggy and I haven't figure out why.
 
 general protection fault:  [#1] SMP 
 Modules linked in:
 CPU: 0 PID: 39 Comm: xenwatch Tainted: GW   
 4.1.0-rc3-3-g718cf80-dirty #67
 Hardware name: Xen HVM domU, BIOS 4.5.0-rc 11/23/2014
 task: 880283f4eca0 ti: 880283fb4000 task.ti: 880283fb4000
 RIP: 0010:[813d577b]  [813d577b] blkif_free+0x162/0x5a9
 RSP: 0018:880283fb7c48  EFLAGS: 00010087
 RAX: dead00200200 RBX: 88014140 RCX: 
 RDX: dead00100100 RSI: dead00100100 RDI: 88028f418bb8
 RBP: 880283fb7ca8 R08: dead00200200 R09: 0001
 R10: 0001 R11:  R12: 8801414481c8
 R13: dead001000e0 R14: 8801414481b8 R15: ea00
 FS:  () GS:88028f40() knlGS:
 CS:  0010 DS:  ES:  CR0: 80050033
 CR2: 01582e08 CR3: 00013345b000 CR4: 001406f0
 Stack:
  880023aa8420 0286 880283fb7cb7 880023aa8420
  8800363fe240 81862c50 880283fb7ce8 880023aa8440
  8187 880023aa8400 88014140 88014148
 Call Trace:
  [813d8e76] blkfront_remove+0x4c/0xff
  [813772fa] xenbus_dev_remove+0x76/0xb0
  [813bd611] __device_release_driver+0x84/0xf8
  [813bd6a3] device_release_driver+0x1e/0x2b
  [813bd1ef] bus_remove_device+0x12c/0x141
  [813ba51d] device_del+0x161/0x1e5
  [81375ef3] ? xenbus_thread+0x239/0x239
  [813ba5e4] device_unregister+0x43/0x4f
  [81377853] xenbus_dev_changed+0x82/0x17f
  [81377566] ? xenbus_otherend_changed+0xf0/0xff
  [81378d8d] frontend_changed+0x43/0x48
  [81375fec] xenwatch_thread+0xf9/0x127
  [81078fc4] ? add_wait_queue+0x44/0x44
  [8106195b] kthread+0xcd/0xd5
  [8106] ? alloc_pid+0xe8/0x492
  [8106188e] ? kthread_freezable_should_stop+0x48/0x48
  [81533ee2] ret_from_fork+0x42/0x70
  [8106188e] ? kthread_freezable_should_stop+0x48/0x48
 Code: 04 00 4c 8b 28 48 8d 78 e0 49 83 ed 20 eb 3d 48 8b 47 28 48 8b 57 20 48 
 be 00 01 10 00 00 00 ad de 49 b8 00 02 20 00 00 00 ad de 48 89 42 08 48 89 
 10 48 89 77 20 4c 89 47 28 31 f6 e8 26 7d cf 
 RIP  [813d577b] blkif_free+0x162/0x5a9
  RSP 880283fb7c48
 ---[ end trace 5321d7f1ef8414d0 ]---
 

The right fix should be:

--- a/drivers/block/xen-blkfront.c
+++ b/drivers/block/xen-blkfront.c
@@ -1124,8 +1124,10 @@ static void blkif_completion(struct blk_shadow *s, 
struct blkfront_info *info,
 * Add the used indirect page back to the list 
of
 * available pages for indirect grefs.
 */
-   indirect_page = 
pfn_to_page(s-indirect_grants[i]-pfn);
-   list_add(indirect_page-lru, 
info-indirect_pages);
+   if (!info-feature_persistent) {
+   indirect_page = 
pfn_to_page(s-indirect_grants[i]-pfn);
+   list_add(indirect_page-lru, 
info-indirect_pages);
+   }
s-indirect_grants[i]-gref = GRANT_INVALID_REF;
list_add_tail(s-indirect_grants[i]-node, 
info-grants);
}

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 1/3] x86/ldt: Make modify_ldt synchronous

2015-07-21 Thread Andy Lutomirski
On Tue, Jul 21, 2015 at 5:21 PM, Andrew Cooper
andrew.coop...@citrix.com wrote:
 On 22/07/2015 01:07, Andy Lutomirski wrote:
 On Tue, Jul 21, 2015 at 4:38 PM, Andrew Cooper
 andrew.coop...@citrix.com wrote:
 On 21/07/2015 22:53, Boris Ostrovsky wrote:
 On 07/21/2015 03:59 PM, Andy Lutomirski wrote:
 --- a/arch/x86/include/asm/mmu_context.h
 +++ b/arch/x86/include/asm/mmu_context.h
 @@ -34,6 +34,44 @@ static inline void load_mm_cr4(struct mm_struct
 *mm) {}
   #endif
 /*
 + * ldt_structs can be allocated, used, and freed, but they are never
 + * modified while live.
 + */
 +struct ldt_struct {
 +int size;
 +int __pad;/* keep the descriptors naturally aligned. */
 +struct desc_struct entries[];
 +};


 This breaks Xen which expects LDT to be page-aligned. Not sure why.

 Jan, Andrew?
 PV guests are not permitted to have writeable mappings to the frames
 making up the GDT and LDT, so it cannot make unaudited changes to
 loadable descriptors.  In particular, for a 32bit PV guest, it is only
 the segment limit which protects Xen from the ring1 guest kernel.

 A lot of this code hasn't been touched in years, and it certainly
 predates me.  The alignment requirement appears to come from the virtual
 region Xen uses to map the guests GDT and LDT.  Strict alignment is
 required for the GDT so Xen's descriptors starting at 0xe0xx are
 correct, but the LDT alignment seems to be a side effect of similar
 codepaths.

 For an LDT smaller than 8192 entries, I can't see any specific reason
 for enforcing alignment, other than that's the way it has always been.

 However, the guest would still have to relinquish write access to all
 frames which make up the LDT, which looks to be a bit of an issue given
 the snippet above.
 Does the LDT itself need to be aligned or just the address passed to
 paravirt_alloc_ldt?

 The address which Xen receives needs to be aligned.

 It looks like xen_alloc_ldt() blindly assumes that the desc_struct *ldt
 it is passed is page aligned, and passes it straight through.

xen_alloc_ldt is just fiddling with protection though, I think.  Isn't
it xen_set_ldt that's the meat?  We could easily pass xen_alloc_ldt a
pointer to the ldt_struct.



 I think I have a solution, but I doubt it is going to be very popular.

 * Make a new paravirt hook for allocation of ldt_struct, so the paravirt
 backend can choose an alignment if needed
 * Make absolutely certain that __pad has the value 0 (so size and __pad
 combined don't look like a present descriptor)
 * Never hand selector 0x0008 to unsuspecting users.
 Yuck.

 I actually meant 0x0004, but yes.  Very much yuck.


 This will allow ldt_struct itself to be page aligned, and for the size
 field to sit across the base/limit field of what would logically be
 selector 0x0008  There would be some issues accessing size.  To load
 frames as an LDT, a guest must drop all refs to the page so that its
 type may be changed from writeable to segdesc.  After that, an
 update_descriptor hypercall can be used to change size, and I believe
 the guest may subsequently recreate read-only mappings to the frames in
 question (although frankly it is getting late so you will want to double
 check all of this).

 Anyhow, this looks like an issue which should be fixed up with slightly
 more PVOps, rather than enforcing a Xen view of the world on native Linux.

 I could presumably make the allocation the other way around so the
 size is at the end.  I could even use two separate allocations if
 needed.

 I suspect two separate allocations would be the better solution, as it
 means that the size field doesn't need to be subject to funny page
 permissions.

True.  OTOH we never write to the size field after allocating the thing.

--Andy

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [qemu-upstream-4.5-testing test] 59793: tolerable FAIL - PUSHED

2015-07-21 Thread osstest service owner
flight 59793 qemu-upstream-4.5-testing real [real]
http://logs.test-lab.xenproject.org/osstest/logs/59793/

Failures :-/ but no regressions.

Regressions which are regarded as allowable (not blocking):
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stopfail REGR. vs. 58384
 test-amd64-i386-libvirt  11 guest-start  fail   like 58413
 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop  fail like 58413

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-pvh-intel 11 guest-start  fail  never pass
 test-amd64-amd64-xl-pvh-amd  11 guest-start  fail   never pass
 test-armhf-armhf-xl-arndale  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 12 migrate-support-checkfail  never pass
 test-amd64-amd64-libvirt 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  12 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 12 migrate-support-checkfail never pass
 test-armhf-armhf-xl-rtds 11 guest-start  fail   never pass

version targeted for testing:
 qemuueb75549f69ca0f3eab26ed39d4ad0fcb6613f64a
baseline version:
 qemuud9552b0af21c27535cd3c8549bb31d26bbecd506

Last test of basis58413  2015-06-11 17:02:58 Z   40 days
Testing same since59774  2015-07-20 12:44:09 Z1 days2 attempts


People who touched revisions under test:
  Gerd Hoffmann kra...@redhat.com
  Marc-André Lureau marcandre.lur...@gmail.com
  Stefano Stabellini stefano.stabell...@eu.citrix.com

jobs:
 build-amd64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-armhf-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-armhf-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-xl  pass
 test-armhf-armhf-xl  pass
 test-amd64-i386-xl   pass
 test-amd64-amd64-xl-pvh-amd  fail
 test-amd64-i386-qemuu-rhel6hvm-amd   pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64pass
 test-amd64-i386-xl-qemuu-debianhvm-amd64 pass
 test-amd64-i386-freebsd10-amd64  pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64 pass
 test-amd64-i386-xl-qemuu-ovmf-amd64  pass
 test-amd64-amd64-xl-qemuu-win7-amd64 fail
 test-amd64-i386-xl-qemuu-win7-amd64  fail
 test-armhf-armhf-xl-arndale  pass
 test-amd64-amd64-xl-credit2  pass
 test-armhf-armhf-xl-credit2  pass
 test-armhf-armhf-xl-cubietruck   pass
 test-amd64-i386-freebsd10-i386   pass
 test-amd64-amd64-xl-pvh-intelfail
 test-amd64-i386-qemuu-rhel6hvm-intel pass
 test-amd64-amd64-libvirt pass
 test-armhf-armhf-libvirt pass
 test-amd64-i386-libvirt  fail
 test-amd64-amd64-xl-multivcpupass
 test-armhf-armhf-xl-multivcpupass
 test-amd64-amd64-pairpass
 test-amd64-i386-pair pass
 test-amd64-amd64-xl-rtds pass
 test-armhf-armhf-xl-rtds fail
 test-amd64-i386-xl-qemuu-winxpsp3-vcpus1 pass
 test-amd64-amd64-xl-qemuu-winxpsp3   pass
 test-amd64-i386-xl-qemuu-winxpsp3pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master

[Xen-devel] [xen-4.4-testing test] 59794: tolerable FAIL - PUSHED

2015-07-21 Thread osstest service owner
flight 59794 xen-4.4-testing real [real]
http://logs.test-lab.xenproject.org/osstest/logs/59794/

Failures :-/ but no regressions.

Regressions which are regarded as allowable (not blocking):
 test-amd64-amd64-xl-qemuu-winxpsp3 15 guest-localmigrate/x10   fail like 59510
 test-amd64-i386-libvirt  11 guest-start  fail   like 59538
 test-amd64-amd64-libvirt 11 guest-start  fail   like 59538
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop fail like 59538
 test-armhf-armhf-xl-multivcpu 15 guest-start/debian.repeatfail  like 59538

Tests which did not succeed, but are not blocking:
 test-amd64-i386-rumpuserxen-i386  1 build-check(1)   blocked  n/a
 test-amd64-amd64-rumpuserxen-amd64  1 build-check(1)   blocked n/a
 build-amd64-rumpuserxen   6 xen-buildfail   never pass
 build-i386-rumpuserxen6 xen-buildfail   never pass
 test-armhf-armhf-libvirt 11 guest-start  fail   never pass
 test-armhf-armhf-xl  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 12 migrate-support-checkfail never pass
 test-armhf-armhf-xl-arndale  12 migrate-support-checkfail   never pass
 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop  fail never pass
 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stop fail never pass
 test-amd64-i386-xl-qemut-win7-amd64 16 guest-stop  fail never pass
 test-armhf-armhf-xl-multivcpu 12 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-credit2  12 migrate-support-checkfail   never pass
 test-amd64-i386-xend-qemut-winxpsp3 20 leak-check/checkfail never pass

version targeted for testing:
 xen  d273ce76c9d79415000d316ce7a3cad03ddb2865
baseline version:
 xen  33eba764618669b9c394c7d9cd2e335b426862ab

Last test of basis59538  2015-07-14 08:07:23 Z7 days
Testing same since59794  2015-07-21 09:55:06 Z0 days1 attempts


People who touched revisions under test:
  Andrew Cooper andrew.coop...@citrix.com
  Dario Faggioli dario.faggi...@citrix.com
  Elena Ufimtseva elena.ufimts...@oracle.com
  Ian Campbell ian.campb...@citrix.com
  Jan Beulich jbeul...@suse.com
  Juergen Gross jgr...@suse.com
  Liang Li liang.z...@intel.com
  Yang Zhang yang.z.zh...@intel.com

jobs:
 build-amd64-xend pass
 build-i386-xend  pass
 build-amd64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-armhf-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-armhf-pvopspass
 build-i386-pvops pass
 build-amd64-rumpuserxen  fail
 build-i386-rumpuserxen   fail
 test-amd64-amd64-xl  pass
 test-armhf-armhf-xl  pass
 test-amd64-i386-xl   pass
 test-amd64-i386-qemut-rhel6hvm-amd   pass
 test-amd64-i386-qemuu-rhel6hvm-amd   pass
 test-amd64-amd64-xl-qemut-debianhvm-amd64pass
 test-amd64-i386-xl-qemut-debianhvm-amd64 pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64pass
 test-amd64-i386-xl-qemuu-debianhvm-amd64 pass
 test-amd64-i386-freebsd10-amd64  pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64 pass
 test-amd64-i386-xl-qemuu-ovmf-amd64  pass
 test-amd64-amd64-rumpuserxen-amd64   blocked 
 test-amd64-amd64-xl-qemut-win7-amd64 fail
 test-amd64-i386-xl-qemut-win7-amd64  fail
 test-amd64-amd64-xl-qemuu-win7-amd64 fail
 test-amd64-i386-xl-qemuu-win7-amd64  fail
 test-armhf-armhf-xl-arndale  pass
 test-amd64-amd64-xl-credit2  pass
 test-armhf-armhf-xl-credit2  pass
 test-armhf-armhf-xl-cubietruck   pass
 test-amd64-i386-freebsd10-i386   pass
 

[Xen-devel] [PATCH] x86/psr: remove invalid cpu_to_socket call

2015-07-21 Thread Chao Peng
cpu_to_socket() can't give correct socket value in CPU_PREPARE notifier
as at that time phys_proc_id has not yet been initialized (the value is
its default 0 in this case) which is incorrect for sockets other than
socket 0.

cos_to_cbm now is pre-allocated in CPU_PREPARE notifier and then consumed
in CPU_STARTING notifier.

Signed-off-by: Chao Peng chao.p.p...@linux.intel.com
---
 xen/arch/x86/psr.c | 21 +
 1 file changed, 9 insertions(+), 12 deletions(-)

diff --git a/xen/arch/x86/psr.c b/xen/arch/x86/psr.c
index 861683f..ed59803 100644
--- a/xen/arch/x86/psr.c
+++ b/xen/arch/x86/psr.c
@@ -50,6 +50,8 @@ static unsigned int __read_mostly opt_cos_max = 255;
 static uint64_t rmid_mask;
 static DEFINE_PER_CPU(struct psr_assoc, psr_assoc);
 
+static struct psr_cat_cbm *temp_cos_to_cbm;
+
 static unsigned int get_socket_cpu(unsigned int socket)
 {
 if ( likely(socket  nr_sockets) )
@@ -451,22 +453,15 @@ void psr_domain_free(struct domain *d)
 
 static int cat_cpu_prepare(unsigned int cpu)
 {
-struct psr_cat_socket_info *info;
-unsigned int socket;
-
 if ( !cat_socket_info )
 return 0;
 
-socket = cpu_to_socket(cpu);
-if ( socket = nr_sockets )
-return -ENOSPC;
-
-info = cat_socket_info + socket;
-if ( info-cos_to_cbm )
-return 0;
+if ( temp_cos_to_cbm == NULL 
+ (temp_cos_to_cbm = xzalloc_array(struct psr_cat_cbm,
+  opt_cos_max + 1UL)) == NULL )
+return -ENOMEM;
 
-info-cos_to_cbm = xzalloc_array(struct psr_cat_cbm, opt_cos_max + 1UL);
-return info-cos_to_cbm ? 0 : -ENOMEM;
+return 0;
 }
 
 static void cat_cpu_init(void)
@@ -492,6 +487,8 @@ static void cat_cpu_init(void)
 info-cbm_len = (eax  0x1f) + 1;
 info-cos_max = min(opt_cos_max, edx  0x);
 
+info-cos_to_cbm = temp_cos_to_cbm;
+temp_cos_to_cbm = NULL;
 /* cos=0 is reserved as default cbm(all ones). */
 info-cos_to_cbm[0].cbm = (1ull  info-cbm_len) - 1;
 
-- 
1.9.1


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 02/23] x86/boot: copy only text section from *.lnk file to *.bin file

2015-07-21 Thread Jan Beulich
 Daniel Kiper daniel.ki...@oracle.com 07/21/15 7:24 PM 
On Tue, Jul 21, 2015 at 03:35:07AM -0600, Jan Beulich wrote:
  On 20.07.15 at 16:28, daniel.ki...@oracle.com wrote:

 Without any explanation (description) I'm inclined to say this makes
 things more fragile instead of improving the situation. As it looks
 like we indeed pointlessly copy .eh_frame, but I think this would
 better be avoided by suppressing its generation (i.e. add
 -fno-asynchronous-unwind-tables just like Rules.mk has).

Make sense, however, there is still place for two small optimizations.

First of all ld generates .got.plt section and objcopy copy it to binary file.
It is not needed because we do not link our stuff here with shared libraries.
So, we can use -R objcopy option to remove it (if you do not like -j .text).
This way we could save 15 bytes (at least on my machines).

.got.plt shouldn't be generated in the first place (and I don't recall having
seen one here - I'll double check once in the office), i.e. perhaps another
missing compiler option or a linker quirk?

We could also save another 3 bytes (per one xen/arch/x86/boot C input file)
in final Xen binary in worst case :-))). We just need generate output assembly
files as string of .byte instead of .long.

If you feel it worth your time, go ahead.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 1/3] x86/ldt: Make modify_ldt synchronous

2015-07-21 Thread Andy Lutomirski
On Tue, Jul 21, 2015 at 5:49 PM, Andrew Cooper
andrew.coop...@citrix.com wrote:
 On 22/07/2015 01:28, Andy Lutomirski wrote:
 On Tue, Jul 21, 2015 at 5:21 PM, Andrew Cooper
 andrew.coop...@citrix.com wrote:
 On 22/07/2015 01:07, Andy Lutomirski wrote:
 On Tue, Jul 21, 2015 at 4:38 PM, Andrew Cooper
 andrew.coop...@citrix.com wrote:
 On 21/07/2015 22:53, Boris Ostrovsky wrote:
 On 07/21/2015 03:59 PM, Andy Lutomirski wrote:
 --- a/arch/x86/include/asm/mmu_context.h
 +++ b/arch/x86/include/asm/mmu_context.h
 @@ -34,6 +34,44 @@ static inline void load_mm_cr4(struct mm_struct
 *mm) {}
   #endif
 /*
 + * ldt_structs can be allocated, used, and freed, but they are never
 + * modified while live.
 + */
 +struct ldt_struct {
 +int size;
 +int __pad;/* keep the descriptors naturally aligned. */
 +struct desc_struct entries[];
 +};

 This breaks Xen which expects LDT to be page-aligned. Not sure why.

 Jan, Andrew?
 PV guests are not permitted to have writeable mappings to the frames
 making up the GDT and LDT, so it cannot make unaudited changes to
 loadable descriptors.  In particular, for a 32bit PV guest, it is only
 the segment limit which protects Xen from the ring1 guest kernel.

 A lot of this code hasn't been touched in years, and it certainly
 predates me.  The alignment requirement appears to come from the virtual
 region Xen uses to map the guests GDT and LDT.  Strict alignment is
 required for the GDT so Xen's descriptors starting at 0xe0xx are
 correct, but the LDT alignment seems to be a side effect of similar
 codepaths.

 For an LDT smaller than 8192 entries, I can't see any specific reason
 for enforcing alignment, other than that's the way it has always been.

 However, the guest would still have to relinquish write access to all
 frames which make up the LDT, which looks to be a bit of an issue given
 the snippet above.
 Does the LDT itself need to be aligned or just the address passed to
 paravirt_alloc_ldt?
 The address which Xen receives needs to be aligned.

 It looks like xen_alloc_ldt() blindly assumes that the desc_struct *ldt
 it is passed is page aligned, and passes it straight through.
 xen_alloc_ldt is just fiddling with protection though, I think.  Isn't
 it xen_set_ldt that's the meat?  We could easily pass xen_alloc_ldt a
 pointer to the ldt_struct.

 So it is.  It is the linear_addr in xen_set_ldt() which Xen currently
 audits to be page aligned.

 This will allow ldt_struct itself to be page aligned, and for the size
 field to sit across the base/limit field of what would logically be
 selector 0x0008  There would be some issues accessing size.  To load
 frames as an LDT, a guest must drop all refs to the page so that its
 type may be changed from writeable to segdesc.  After that, an
 update_descriptor hypercall can be used to change size, and I believe
 the guest may subsequently recreate read-only mappings to the frames in
 question (although frankly it is getting late so you will want to double
 check all of this).

 Anyhow, this looks like an issue which should be fixed up with slightly
 more PVOps, rather than enforcing a Xen view of the world on native Linux.

 I could presumably make the allocation the other way around so the
 size is at the end.  I could even use two separate allocations if
 needed.
 I suspect two separate allocations would be the better solution, as it
 means that the size field doesn't need to be subject to funny page
 permissions.
 True.  OTOH we never write to the size field after allocating the thing.

 Right, but even reading it is going to cause problems if one of the
 paravirt ops can't re-establish ro mappings.

Does paravirt_alloc_ldt completely deny access or does it just set it RO?

--Andy


 ~Andrew



-- 
Andy Lutomirski
AMA Capital Management, LLC

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 1/3] x86/ldt: Make modify_ldt synchronous

2015-07-21 Thread Boris Ostrovsky



On 07/21/2015 08:49 PM, Andrew Cooper wrote:

On 22/07/2015 01:28, Andy Lutomirski wrote:

On Tue, Jul 21, 2015 at 5:21 PM, Andrew Cooper
andrew.coop...@citrix.com wrote:

On 22/07/2015 01:07, Andy Lutomirski wrote:

On Tue, Jul 21, 2015 at 4:38 PM, Andrew Cooper
andrew.coop...@citrix.com wrote:

On 21/07/2015 22:53, Boris Ostrovsky wrote:

On 07/21/2015 03:59 PM, Andy Lutomirski wrote:

--- a/arch/x86/include/asm/mmu_context.h
+++ b/arch/x86/include/asm/mmu_context.h
@@ -34,6 +34,44 @@ static inline void load_mm_cr4(struct mm_struct
*mm) {}
   #endif
 /*
+ * ldt_structs can be allocated, used, and freed, but they are never
+ * modified while live.
+ */
+struct ldt_struct {
+int size;
+int __pad;/* keep the descriptors naturally aligned. */
+struct desc_struct entries[];
+};

This breaks Xen which expects LDT to be page-aligned. Not sure why.

Jan, Andrew?

PV guests are not permitted to have writeable mappings to the frames
making up the GDT and LDT, so it cannot make unaudited changes to
loadable descriptors.  In particular, for a 32bit PV guest, it is only
the segment limit which protects Xen from the ring1 guest kernel.

A lot of this code hasn't been touched in years, and it certainly
predates me.  The alignment requirement appears to come from the virtual
region Xen uses to map the guests GDT and LDT.  Strict alignment is
required for the GDT so Xen's descriptors starting at 0xe0xx are
correct, but the LDT alignment seems to be a side effect of similar
codepaths.

For an LDT smaller than 8192 entries, I can't see any specific reason
for enforcing alignment, other than that's the way it has always been.

However, the guest would still have to relinquish write access to all
frames which make up the LDT, which looks to be a bit of an issue given
the snippet above.

Does the LDT itself need to be aligned or just the address passed to
paravirt_alloc_ldt?

The address which Xen receives needs to be aligned.

It looks like xen_alloc_ldt() blindly assumes that the desc_struct *ldt
it is passed is page aligned, and passes it straight through.

xen_alloc_ldt is just fiddling with protection though, I think.  Isn't
it xen_set_ldt that's the meat?  We could easily pass xen_alloc_ldt a
pointer to the ldt_struct.

So it is.  It is the linear_addr in xen_set_ldt() which Xen currently
audits to be page aligned.


This will allow ldt_struct itself to be page aligned, and for the size
field to sit across the base/limit field of what would logically be
selector 0x0008  There would be some issues accessing size.  To load
frames as an LDT, a guest must drop all refs to the page so that its
type may be changed from writeable to segdesc.  After that, an
update_descriptor hypercall can be used to change size, and I believe
the guest may subsequently recreate read-only mappings to the frames in
question (although frankly it is getting late so you will want to double
check all of this).

Anyhow, this looks like an issue which should be fixed up with slightly
more PVOps, rather than enforcing a Xen view of the world on native Linux.


I could presumably make the allocation the other way around so the
size is at the end.  I could even use two separate allocations if
needed.


Why not wrap mm_context_t's ldt and size into a struct (just like 
ldt_struct but without __pad) and have a single allocation of ldt?


I.e.

struct ldt_struct {
int size;
struct desc_struct *entries;
}

--- a/arch/x86/include/asm/mmu.h
+++ b/arch/x86/include/asm/mmu.h
@@ -9,8 +9,7 @@
   * we put the segment information here.
   */
  typedef struct {
-void *ldt;
-int size;
+struct ldt_struct ldt;
#ifdef CONFIG_X86_64
  /* True if mm supports a task running in 32 bit compatibility 
mode. */



-boris


I suspect two separate allocations would be the better solution, as it
means that the size field doesn't need to be subject to funny page
permissions.

True.  OTOH we never write to the size field after allocating the thing.

Right, but even reading it is going to cause problems if one of the
paravirt ops can't re-establish ro mappings.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 2/3] xen-blkfront: rm BUG_ON(info-feature_persistent) in blkif_free

2015-07-21 Thread Bob Liu

On 07/21/2015 05:25 PM, Roger Pau Monné wrote:
 El 21/07/15 a les 5.30, Bob Liu ha escrit:
 This BUG_ON() in blkif_free() is incorrect, because indirect page can be 
 added
 to list info-indirect_pages in blkif_completion() no matter 
 feature_persistent
 is true or false.

 Signed-off-by: Bob Liu bob@oracle.com
 
 Acked-by: Roger Pau Monné roger@citrix.com
 
 This was probably an oversight from when blkif_completion was changed to
 check for gnttab_query_foreign_access. It should be backported to stable
 trees.
 

Sorry, this patch is buggy and I haven't figure out why.

general protection fault:  [#1] SMP 
Modules linked in:
CPU: 0 PID: 39 Comm: xenwatch Tainted: GW   
4.1.0-rc3-3-g718cf80-dirty #67
Hardware name: Xen HVM domU, BIOS 4.5.0-rc 11/23/2014
task: 880283f4eca0 ti: 880283fb4000 task.ti: 880283fb4000
RIP: 0010:[813d577b]  [813d577b] blkif_free+0x162/0x5a9
RSP: 0018:880283fb7c48  EFLAGS: 00010087
RAX: dead00200200 RBX: 88014140 RCX: 
RDX: dead00100100 RSI: dead00100100 RDI: 88028f418bb8
RBP: 880283fb7ca8 R08: dead00200200 R09: 0001
R10: 0001 R11:  R12: 8801414481c8
R13: dead001000e0 R14: 8801414481b8 R15: ea00
FS:  () GS:88028f40() knlGS:
CS:  0010 DS:  ES:  CR0: 80050033
CR2: 01582e08 CR3: 00013345b000 CR4: 001406f0
Stack:
 880023aa8420 0286 880283fb7cb7 880023aa8420
 8800363fe240 81862c50 880283fb7ce8 880023aa8440
 8187 880023aa8400 88014140 88014148
Call Trace:
 [813d8e76] blkfront_remove+0x4c/0xff
 [813772fa] xenbus_dev_remove+0x76/0xb0
 [813bd611] __device_release_driver+0x84/0xf8
 [813bd6a3] device_release_driver+0x1e/0x2b
 [813bd1ef] bus_remove_device+0x12c/0x141
 [813ba51d] device_del+0x161/0x1e5
 [81375ef3] ? xenbus_thread+0x239/0x239
 [813ba5e4] device_unregister+0x43/0x4f
 [81377853] xenbus_dev_changed+0x82/0x17f
 [81377566] ? xenbus_otherend_changed+0xf0/0xff
 [81378d8d] frontend_changed+0x43/0x48
 [81375fec] xenwatch_thread+0xf9/0x127
 [81078fc4] ? add_wait_queue+0x44/0x44
 [8106195b] kthread+0xcd/0xd5
 [8106] ? alloc_pid+0xe8/0x492
 [8106188e] ? kthread_freezable_should_stop+0x48/0x48
 [81533ee2] ret_from_fork+0x42/0x70
 [8106188e] ? kthread_freezable_should_stop+0x48/0x48
Code: 04 00 4c 8b 28 48 8d 78 e0 49 83 ed 20 eb 3d 48 8b 47 28 48 8b 57 20 48 
be 00 01 10 00 00 00 ad de 49 b8 00 02 20 00 00 00 ad de 48 89 42 08 48 89 10 
48 89 77 20 4c 89 47 28 31 f6 e8 26 7d cf 
RIP  [813d577b] blkif_free+0x162/0x5a9
 RSP 880283fb7c48
---[ end trace 5321d7f1ef8414d0 ]---

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] PCI Pass-through in Xen ARM - Draft 2.

2015-07-21 Thread Manish Jaggi



On Tuesday 14 July 2015 11:31 PM, Stefano Stabellini wrote:

On Tue, 14 Jul 2015, Julien Grall wrote:

Hi Stefano,

On 14/07/2015 18:46, Stefano Stabellini wrote:

Linux provides a function (pci_for_each_dma_alias) which will return a
requester ID for a given PCI device. It appears that the BDF (the 's' of
sBDF
is only internal to Linux and not part of the hardware) is equal to the
requester ID on your platform but we can't assume it for anyone else.

The PCI Express Base Specification states that the requester ID is The
combination of a Requester's Bus Number, Device Number, and Function
Number that uniquely identifies the Requester.

I think it is safe to assume BDF = requester ID on all platforms.

With the catch that in case of ARI devices
(http://pcisig.com/sites/default/files/specification_documents/ECN-alt-rid-interpretation-070604.pdf),
BDF is actually BF because the device number is always 0 and the
function number is 8 bits.

And some other problem such as broken PCI device...
Both Xen x86 (domain_context_mapping in drivers/passthrough/vtd/iommu.c) and
Linux (pci_dma_for_each_alias) use a code more complex than requesterID = BDF.

So I don't think we can use requesterID = BDF in physdev op unless we are
*stricly* sure this is valid.

The spec is quite clear about it, but I guess there might be hardware quirks.
Can we keep this open and for now till there is agreement make 
requesterid = bdf.

If you are ok, I will update and send Draft 3.




Although, based on the x86 code, Xen should be able to translate the BDF into
the requester ID...

Yes, that is a good point.



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 1/3] x86/ldt: Make modify_ldt synchronous

2015-07-21 Thread Andy Lutomirski
On Tue, Jul 21, 2015 at 7:04 PM, Boris Ostrovsky
boris.ostrov...@oracle.com wrote:


 On 07/21/2015 08:49 PM, Andrew Cooper wrote:

 On 22/07/2015 01:28, Andy Lutomirski wrote:

 On Tue, Jul 21, 2015 at 5:21 PM, Andrew Cooper
 andrew.coop...@citrix.com wrote:

 On 22/07/2015 01:07, Andy Lutomirski wrote:

 On Tue, Jul 21, 2015 at 4:38 PM, Andrew Cooper
 andrew.coop...@citrix.com wrote:

 On 21/07/2015 22:53, Boris Ostrovsky wrote:

 On 07/21/2015 03:59 PM, Andy Lutomirski wrote:

 --- a/arch/x86/include/asm/mmu_context.h
 +++ b/arch/x86/include/asm/mmu_context.h
 @@ -34,6 +34,44 @@ static inline void load_mm_cr4(struct mm_struct
 *mm) {}
#endif
  /*
 + * ldt_structs can be allocated, used, and freed, but they are
 never
 + * modified while live.
 + */
 +struct ldt_struct {
 +int size;
 +int __pad;/* keep the descriptors naturally aligned. */
 +struct desc_struct entries[];
 +};

 This breaks Xen which expects LDT to be page-aligned. Not sure why.

 Jan, Andrew?

 PV guests are not permitted to have writeable mappings to the frames
 making up the GDT and LDT, so it cannot make unaudited changes to
 loadable descriptors.  In particular, for a 32bit PV guest, it is only
 the segment limit which protects Xen from the ring1 guest kernel.

 A lot of this code hasn't been touched in years, and it certainly
 predates me.  The alignment requirement appears to come from the
 virtual
 region Xen uses to map the guests GDT and LDT.  Strict alignment is
 required for the GDT so Xen's descriptors starting at 0xe0xx are
 correct, but the LDT alignment seems to be a side effect of similar
 codepaths.

 For an LDT smaller than 8192 entries, I can't see any specific reason
 for enforcing alignment, other than that's the way it has always
 been.

 However, the guest would still have to relinquish write access to all
 frames which make up the LDT, which looks to be a bit of an issue
 given
 the snippet above.

 Does the LDT itself need to be aligned or just the address passed to
 paravirt_alloc_ldt?

 The address which Xen receives needs to be aligned.

 It looks like xen_alloc_ldt() blindly assumes that the desc_struct *ldt
 it is passed is page aligned, and passes it straight through.

 xen_alloc_ldt is just fiddling with protection though, I think.  Isn't
 it xen_set_ldt that's the meat?  We could easily pass xen_alloc_ldt a
 pointer to the ldt_struct.

 So it is.  It is the linear_addr in xen_set_ldt() which Xen currently
 audits to be page aligned.

 This will allow ldt_struct itself to be page aligned, and for the size
 field to sit across the base/limit field of what would logically be
 selector 0x0008  There would be some issues accessing size.  To load
 frames as an LDT, a guest must drop all refs to the page so that its
 type may be changed from writeable to segdesc.  After that, an
 update_descriptor hypercall can be used to change size, and I believe
 the guest may subsequently recreate read-only mappings to the frames
 in
 question (although frankly it is getting late so you will want to
 double
 check all of this).

 Anyhow, this looks like an issue which should be fixed up with
 slightly
 more PVOps, rather than enforcing a Xen view of the world on native
 Linux.

 I could presumably make the allocation the other way around so the
 size is at the end.  I could even use two separate allocations if
 needed.


 Why not wrap mm_context_t's ldt and size into a struct (just like ldt_struct
 but without __pad) and have a single allocation of ldt?

 I.e.

 struct ldt_struct {
 int size;
 struct desc_struct *entries;
 }

 --- a/arch/x86/include/asm/mmu.h
 +++ b/arch/x86/include/asm/mmu.h
 @@ -9,8 +9,7 @@
* we put the segment information here.
*/
   typedef struct {
 -void *ldt;
 -int size;
 +struct ldt_struct ldt;
 #ifdef CONFIG_X86_64
   /* True if mm supports a task running in 32 bit compatibility mode. */

I want the atomic read of both of them.  The current code make
interesting assumptions about ordering that may or may not be correct
but are certainly not obviously correct.

--Andy

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [xen-unstable test] 59795: tolerable FAIL

2015-07-21 Thread osstest service owner
flight 59795 xen-unstable real [real]
http://logs.test-lab.xenproject.org/osstest/logs/59795/

Failures :-/ but no regressions.

Tests which are failing intermittently (not blocking):
 test-amd64-i386-xl-qemuu-ovmf-amd64 9 debian-hvm-install fail in 59772 pass in 
59795
 test-amd64-i386-xl-qemut-win7-amd64 9 windows-install fail in 59772 pass in 
59795
 test-armhf-armhf-xl-arndale   6 xen-bootfail pass in 59772
 test-amd64-i386-xl-qemut-debianhvm-amd64 11 guest-saverestore fail pass in 
59772
 test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm 6 xen-boot fail pass in 
59772

Regressions which are regarded as allowable (not blocking):
 test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm 12 guest-localmigrate 
fail in 59772 like 59699
 test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm 9 debian-hvm-install fail 
like 59772
 test-armhf-armhf-xl-rtds 11 guest-start  fail   like 59772
 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stop fail like 59772
 test-amd64-i386-xl-qemuu-win7-amd64  9 windows-install fail like 59772

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-xl-arndale  12 migrate-support-check fail in 59772 never pass
 test-amd64-i386-libvirt  12 migrate-support-checkfail   never pass
 test-amd64-amd64-xl-pvh-amd  11 guest-start  fail   never pass
 test-amd64-i386-libvirt-xsm  12 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-xsm  12 migrate-support-checkfail   never pass
 test-amd64-amd64-xl-pvh-intel 11 guest-start  fail  never pass
 test-armhf-armhf-xl  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  12 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt 12 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 12 migrate-support-checkfail  never pass
 test-armhf-armhf-libvirt-xsm 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 12 migrate-support-checkfail never pass
 test-amd64-i386-xl-qemut-win7-amd64 16 guest-stop  fail never pass
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop fail never pass

version targeted for testing:
 xen  21d9b079e53805b68047d60d28cde224d09bbb40
baseline version:
 xen  21d9b079e53805b68047d60d28cde224d09bbb40

Last test of basis59795  2015-07-21 09:56:30 Z0 days
Testing same since0  1970-01-01 00:00:00 Z 16638 days0 attempts

jobs:
 build-amd64-xsm  pass
 build-armhf-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-armhf-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-oldkern  pass
 build-i386-oldkern   pass
 build-amd64-pvopspass
 build-armhf-pvopspass
 build-i386-pvops pass
 build-amd64-rumpuserxen  pass
 build-i386-rumpuserxen   pass
 test-amd64-amd64-xl  pass
 test-armhf-armhf-xl  pass
 test-amd64-i386-xl   pass
 test-amd64-amd64-xl-qemut-debianhvm-amd64-xsmpass
 test-amd64-i386-xl-qemut-debianhvm-amd64-xsm pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsmpass
 test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm pass
 test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsmfail
 test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm fail
 test-amd64-amd64-libvirt-xsm pass
 test-armhf-armhf-libvirt-xsm pass
 test-amd64-i386-libvirt-xsm  pass
 test-amd64-amd64-xl-xsm  pass
 test-armhf-armhf-xl-xsm  pass
 test-amd64-i386-xl-xsm   pass
 test-amd64-amd64-xl-pvh-amd  

Re: [Xen-devel] [PATCH v6 05/15] x86/altp2m: basic data structures and support routines.

2015-07-21 Thread Sahita, Ravi
From: George Dunlap [mailto:george.dun...@citrix.com]
Sent: Tuesday, July 21, 2015 5:47 AM

On 07/21/2015 11:13 AM, Jan Beulich wrote:
 On 21.07.15 at 01:58, edmund.h.wh...@intel.com wrote:
 Add the basic data structures needed to support alternate p2m's and
 the functions to initialise them and tear them down.

 Although Intel hardware can handle 512 EPTP's per hardware thread
 concurrently, only 10 per domain are supported in this patch for
 performance reasons.

 The iterator in hap_enable() does need to handle 512, so that is now
 uint16_t.

 Sigh - this one is still here (and the respective code unchanged).
 I'm not going to NAK the patch just because of this, but it really
 looks like you aren't trying to address comments even when they're
 trivial (and quick) to carry out and testing of the change comes as a
 side effect of you needing to test all the other changes as well.

 --- a/xen/arch/x86/hvm/Makefile
 +++ b/xen/arch/x86/hvm/Makefile
 @@ -1,6 +1,7 @@
  subdir-y += svm
  subdir-y += vmx

 +obj-y += altp2m.o

 Wasn't the outcome of the earlier discussion to put this in x86/mm, or
 possibly not even introduce a new file?

That was my recommendation [1] in response to v5, but there was no
response.  The mail seems to have been seen, however, since Andrew's
Reviewed-by was dropped.  (Perhaps they didn't notice the additional
comment further down.)

[1] marc.info/?i=55a53159.4010...@eu.citrix.com


We got these and hopefully all the straightforward ones in the next rev.

 Overall the situation didn't really change from v5 - the code from a
 pure functionality pov looks okay, but I don't see myself giving in on
 all the minor issues the patch introduces. If some were left
 adjusting of which really takes time to or risks breaking the code,
 I'd (reluctantly) give my ack, but not this way, I'm afraid.

This is a bit puzzling, and somewhat frustrating too -- to have carefully 
combed
through past versions, seen the comments and discussion, and then carefully
combed through this one to find that nearly none of them have been
addressed, even minor ones.

Sorry we did the ABI and some minor ones first in v6, the next rev should have 
these and patch 10 sorted out.
Thanks to you both for the reviews.

Ravi


 -George


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [linux-linus test] 59789: regressions - FAIL

2015-07-21 Thread osstest service owner
flight 59789 linux-linus real [real]
http://logs.test-lab.xenproject.org/osstest/logs/59789/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-i386-pvops  5 kernel-build  fail REGR. vs. 59254
 build-amd64-pvops 5 kernel-build  fail REGR. vs. 59254

Regressions which are regarded as allowable (not blocking):
 test-armhf-armhf-xl-rtds 11 guest-start   fail REGR. vs. 59254

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-pvh-amd   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-pvh-intel  1 build-check(1)   blocked  n/a
 test-amd64-amd64-rumpuserxen-amd64  1 build-check(1)   blocked n/a
 test-amd64-amd64-xl-xsm   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-credit2   1 build-check(1)   blocked  n/a
 test-amd64-i386-rumpuserxen-i386  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-amd64-i386-qemut-rhel6hvm-intel  1 build-check(1) blocked n/a
 test-amd64-i386-xl-xsm1 build-check(1)   blocked  n/a
 test-amd64-i386-xl1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-multivcpu  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemut-debianhvm-amd64  1 build-check(1)blocked n/a
 test-amd64-i386-qemuu-rhel6hvm-intel  1 build-check(1) blocked n/a
 test-amd64-i386-pair  1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-xsm   1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt   1 build-check(1)   blocked  n/a
 test-amd64-i386-qemut-rhel6hvm-amd  1 build-check(1)   blocked n/a
 test-amd64-i386-xl-qemut-debianhvm-amd64  1 build-check(1) blocked n/a
 test-amd64-amd64-pair 1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemut-debianhvm-amd64-xsm  1 build-check(1)blocked n/a
 test-amd64-amd64-xl-qemuu-debianhvm-amd64  1 build-check(1)blocked n/a
 test-amd64-i386-xl-qemut-debianhvm-amd64-xsm  1 build-check(1) blocked n/a
 test-amd64-i386-xl-qemuu-debianhvm-amd64  1 build-check(1) blocked n/a
 test-amd64-amd64-libvirt  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-rtds  1 build-check(1)   blocked  n/a
 test-amd64-i386-freebsd10-amd64  1 build-check(1)   blocked  n/a
 test-amd64-i386-qemuu-rhel6hvm-amd  1 build-check(1)   blocked n/a
 test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm  1 build-check(1) blocked n/a
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsm  1 build-check(1)blocked n/a
 test-amd64-amd64-xl-qemuu-win7-amd64  1 build-check(1) blocked n/a
 test-amd64-amd64-xl-qemuu-ovmf-amd64  1 build-check(1) blocked n/a
 test-amd64-i386-xl-qemuu-win7-amd64  1 build-check(1)  blocked n/a
 test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm 1 build-check(1) blocked 
n/a
 test-amd64-i386-xl-qemuu-ovmf-amd64  1 build-check(1)  blocked n/a
 test-amd64-amd64-xl-qemut-win7-amd64  1 build-check(1) blocked n/a
 test-amd64-i386-xl-qemut-win7-amd64  1 build-check(1)  blocked n/a
 test-amd64-i386-freebsd10-i386  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm 1 build-check(1) blocked 
n/a
 test-amd64-amd64-xl-qemut-winxpsp3  1 build-check(1)   blocked n/a
 test-amd64-i386-xl-qemut-winxpsp3-vcpus1  1 build-check(1) blocked n/a
 test-amd64-amd64-xl-qemuu-winxpsp3  1 build-check(1)   blocked n/a
 test-amd64-i386-xl-qemuu-winxpsp3-vcpus1  1 build-check(1) blocked n/a
 test-amd64-i386-xl-qemut-winxpsp3  1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemuu-winxpsp3  1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-arndale  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-xsm  12 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt 12 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-xsm 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 12 migrate-support-checkfail never pass
 test-armhf-armhf-xl-credit2  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 12 migrate-support-checkfail  never pass
 test-armhf-armhf-xl  12 migrate-support-checkfail   never pass

version targeted for testing:
 linux9d634c410b07be7bf637ea03362d3ff132088fe3
baseline version:
 linux45820c294fe1b1a9df495d57f40585ef2d069a39

Last test of basis59254  2015-07-09 04:20:48 Z   12 days
Failing since 59348  2015-07-10 

Re: [Xen-devel] [PATCH v2 03/23] x86: zero BSS using stosl instead of stosb

2015-07-21 Thread Daniel Kiper
On Tue, Jul 21, 2015 at 03:37:48AM -0600, Jan Beulich wrote:
  On 20.07.15 at 16:28, daniel.ki...@oracle.com wrote:

 ... because of ??? Nowadays - with X86_FEATURE_ERMS - rep stosb
 is expected to be faster than rep stosl.

OK, I did not know about that. However, as I know this feature
was introduced in 2012 with Ivy Bridge. So, I suppose that there
are still a lot of machines in the wild which does not support it.
Anyway, because this code is not performance critical I am not going
to insist on one or another solution. However, Andrew suggested that
thing, so, please agree with him in which direction we should go.
I will do what you agree.

Daniel

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] PV-vNUMA issue: topology is misinterpreted by the guest

2015-07-21 Thread Boris Ostrovsky

On 07/20/2015 10:43 AM, Boris Ostrovsky wrote:

On 07/20/2015 10:09 AM, Dario Faggioli wrote:

On Fri, 2015-07-17 at 14:17 -0400, Boris Ostrovsky wrote:

On 07/17/2015 03:27 AM, Dario Faggioli wrote:

In the meanwhile, what should we do? Document this? How? don't use
vNUMA with PV guest in SMT enabled systems seems a bit harsh... Is
there a workaround we can put in place/suggest?

I haven't been able to reproduce this on my Intel box because I think I
have different core enumeration.


Yes, most likely, that's highly topology dependant. :-(


Can you try adding
cpuid=['0x1:ebx=0001']
to your config file?


Done (sorry for the delay, the testbox was busy doing other stuff).

Still no joy (.101 is the IP address of the guest, domain id 3):

root@Zhaman:~# ssh root@192.168.1.101 yes  /dev/null 21 
root@Zhaman:~# ssh root@192.168.1.101 yes  /dev/null 21 
root@Zhaman:~# ssh root@192.168.1.101 yes  /dev/null 21 
root@Zhaman:~# ssh root@192.168.1.101 yes  /dev/null 21 
root@Zhaman:~# xl vcpu-list 3
NameID  VCPU   CPU State Time(s) 
Affinity (Hard / Soft)

test 3 04   r-- 23.6  all / 0-7
test 3 19   r-- 19.8  all / 0-7
test 3 28   -b- 0.4  all / 8-15
test 3 34   -b- 0.2  all / 8-15

*HOWEVER* it seems to have an effect. In fact, now, topology as it is
shown in /sys/... is different:

root@test:~# cat 
/sys/devices/system/cpu/cpu0/topology/thread_siblings_list

0
(it was 0-1)

This, OTOH, is still the same:
root@test:~# cat 
/sys/devices/system/cpu/cpu0/topology/core_siblings_list

0-3

Also, I now see this:

[0.150560] [ cut here ]
[0.150560] WARNING: CPU: 2 PID: 0 at 
../arch/x86/kernel/smpboot.c:317 topology_sane.isra.2+0x74/0x88()
[0.150560] sched: CPU #2's llc-sibling CPU #0 is not on the same 
node! [node: 1 != 0]. Ignoring dependency.

[0.150560] Modules linked in:
[0.150560] CPU: 2 PID: 0 Comm: swapper/2 Not tainted 3.19.0+ #1
[0.150560]  0009 88001ee2fdd0 81657c7b 
810bbd2c
[0.150560]  88001ee2fe20 88001ee2fe10 81081510 
88001ee2fea0
[0.150560]  8103aa02 88003ea0a001  
88001f20a040

[0.150560] Call Trace:
[0.150560]  [81657c7b] dump_stack+0x4f/0x7b
[0.150560]  [810bbd2c] ? up+0x39/0x3e
[0.150560]  [81081510] warn_slowpath_common+0xa1/0xbb
[0.150560]  [8103aa02] ? topology_sane.isra.2+0x74/0x88
[0.150560]  [81081570] warn_slowpath_fmt+0x46/0x48
[0.150560]  [8101eeb1] ? __cpuid.constprop.0+0x15/0x19
[0.150560]  [8103aa02] topology_sane.isra.2+0x74/0x88
[0.150560]  [8103acd0] set_cpu_sibling_map+0x27a/0x444
[0.150560]  [81056ac3] ? numa_add_cpu+0x98/0x9f
[0.150560]  [8100b8f2] cpu_bringup+0x63/0xa8
[0.150560]  [8100b945] cpu_bringup_and_idle+0xe/0x1a
[0.150560] ---[ end trace 63d204896cce9f68 ]---

Notice that it now says 'llc-sibling', while, before, it was saying
'smt-sibling'.


Exactly. You are now passing the first topology test which was to see 
that threads are on the same node. And since each processor has only 
one thread (as evidenced by thread_siblings_list) we are good.


The second test checks that cores (i.e. things that share last level 
cache) are on the same node. And they are not.






On AMD, BTW, we fail a different test so some other bits probably need
to be tweaked. You may fail it too (the LLC sanity check).


Yep, that's the one I guess. Should I try something more/else?



I'll need to see how LLC IDs are calculated, probably also from some 
CPUID bits.



No, can't do this: LLC is calculated from CPUID leaf 4 (on Intel) which 
use indexes in ECX register and xl syntax doesn't allow you to override 
CPUIDs for such leaves.


-boris

The question though will be --- what do we do with how cache sizes 
(and TLB sizes for that matter) are presented to the guests. Do we 
scale them down per thread?


-boris



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] altp2m: patch 07/15 and 11/15

2015-07-21 Thread Sahita, Ravi
Hi maintainers,

While we are sorting through the last 2 to 3 patches that have the main open 
comments (patch 5,6,10/15), could you please ack patch 07/15 and 11/15 if you 
are ok with it - all previous comments on those have been addressed (this will 
allow us to focus on remaining opens). Sorry for missing some emails 
(unintentional) reg patch 5 and 10 ;  some changes were skipped in rev 6 so 
that we could focus on the Category 1 (ABI ones). 

Thanks,
Ravi


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 1/3] x86/ldt: Make modify_ldt synchronous

2015-07-21 Thread Boris Ostrovsky

On 07/21/2015 03:59 PM, Andy Lutomirski wrote:

modify_ldt has questionable locking and does not synchronize
threads.  Improve it: redesign the locking and synchronize all
threads' LDTs using an IPI on all modifications.

This will dramatically slow down modify_ldt in multithreaded
programs, but there shouldn't be any multithreaded programs that
care about modify_ldt's performance in the first place.

Cc: sta...@vger.kernel.org
Signed-off-by: Andy Lutomirski l...@kernel.org
---
  arch/x86/include/asm/desc.h|  15 ---
  arch/x86/include/asm/mmu.h |   3 +-
  arch/x86/include/asm/mmu_context.h |  48 ++-
  arch/x86/kernel/cpu/common.c   |   4 +-
  arch/x86/kernel/cpu/perf_event.c   |  12 +-
  arch/x86/kernel/ldt.c  | 247 +++--
  arch/x86/kernel/process_64.c   |   4 +-
  arch/x86/kernel/step.c |   6 +-
  arch/x86/power/cpu.c   |   3 +-
  9 files changed, 192 insertions(+), 150 deletions(-)

diff --git a/arch/x86/include/asm/desc.h b/arch/x86/include/asm/desc.h
index a0bf89fd2647..4e10d73cf018 100644
--- a/arch/x86/include/asm/desc.h
+++ b/arch/x86/include/asm/desc.h
@@ -280,21 +280,6 @@ static inline void clear_LDT(void)
set_ldt(NULL, 0);
  }
  
-/*

- * load one particular LDT into the current CPU
- */
-static inline void load_LDT_nolock(mm_context_t *pc)
-{
-   set_ldt(pc-ldt, pc-size);
-}
-
-static inline void load_LDT(mm_context_t *pc)
-{
-   preempt_disable();
-   load_LDT_nolock(pc);
-   preempt_enable();
-}
-
  static inline unsigned long get_desc_base(const struct desc_struct *desc)
  {
return (unsigned)(desc-base0 | ((desc-base1)  16) | ((desc-base2) 
 24));
diff --git a/arch/x86/include/asm/mmu.h b/arch/x86/include/asm/mmu.h
index 09b9620a73b4..364d27481a52 100644
--- a/arch/x86/include/asm/mmu.h
+++ b/arch/x86/include/asm/mmu.h
@@ -9,8 +9,7 @@
   * we put the segment information here.
   */
  typedef struct {
-   void *ldt;
-   int size;
+   struct ldt_struct *ldt;
  
  #ifdef CONFIG_X86_64

/* True if mm supports a task running in 32 bit compatibility mode. */
diff --git a/arch/x86/include/asm/mmu_context.h 
b/arch/x86/include/asm/mmu_context.h
index 5e8daee7c5c9..1ff121fbf366 100644
--- a/arch/x86/include/asm/mmu_context.h
+++ b/arch/x86/include/asm/mmu_context.h
@@ -34,6 +34,44 @@ static inline void load_mm_cr4(struct mm_struct *mm) {}
  #endif
  
  /*

+ * ldt_structs can be allocated, used, and freed, but they are never
+ * modified while live.
+ */
+struct ldt_struct {
+   int size;
+   int __pad;  /* keep the descriptors naturally aligned. */
+   struct desc_struct entries[];
+};




This breaks Xen which expects LDT to be page-aligned. Not sure why.

Jan, Andrew?

-boris



+
+static inline void load_mm_ldt(struct mm_struct *mm)
+{
+   struct ldt_struct *ldt;
+   DEBUG_LOCKS_WARN_ON(!irqs_disabled());
+
+   /* lockless_dereference synchronizes with smp_store_release */
+   ldt = lockless_dereference(mm-context.ldt);
+
+   /*
+* Any change to mm-context.ldt is followed by an IPI to all
+* CPUs with the mm active.  The LDT will not be freed until
+* after the IPI is handled by all such CPUs.  This means that,
+* if the ldt_struct changes before we return, the values we see
+* will be safe, and the new values will be loaded before we run
+* any user code.
+*
+* NB: don't try to convert this to use RCU without extreme care.
+* We would still need IRQs off, because we don't want to change
+* the local LDT after an IPI loaded a newer value than the one
+* that we can see.
+*/
+
+   if (unlikely(ldt))
+   set_ldt(ldt-entries, ldt-size);
+   else
+   clear_LDT();
+}
+
+/*
   * Used for LDT copy/destruction.
   */
  int init_new_context(struct task_struct *tsk, struct mm_struct *mm);
@@ -78,12 +116,12 @@ static inline void switch_mm(struct mm_struct *prev, 
struct mm_struct *next,
 * was called and then modify_ldt changed
 * prev-context.ldt but suppressed an IPI to this CPU.
 * In this case, prev-context.ldt != NULL, because we
-* never free an LDT while the mm still exists.  That
-* means that next-context.ldt != prev-context.ldt,
-* because mms never share an LDT.
+* never set context.ldt to NULL while the mm still
+* exists.  That means that next-context.ldt !=
+* prev-context.ldt, because mms never share an LDT.
 */
if (unlikely(prev-context.ldt != next-context.ldt))
-   load_LDT_nolock(next-context);
+   load_mm_ldt(next);
}
  #ifdef CONFIG_SMP
  else {
@@ -106,7 +144,7 @@ static inline void switch_mm(struct mm_struct *prev, struct 
mm_struct *next,

[Xen-devel] [linux-3.18 test] 59785: regressions - FAIL

2015-07-21 Thread osstest service owner
flight 59785 linux-3.18 real [real]
http://logs.test-lab.xenproject.org/osstest/logs/59785/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-xl-pvh-intel 11 guest-start  fail REGR. vs. 58581

Tests which are failing intermittently (not blocking):
 test-amd64-i386-xl-qemuu-ovmf-amd64 18 guest-start/debianhvm.repeat fail in 
59766 pass in 59785
 test-amd64-i386-xl-qemut-debianhvm-amd64-xsm 18 guest-start/debianhvm.repeat 
fail pass in 59766

Regressions which are regarded as allowable (not blocking):
 test-armhf-armhf-libvirt  6 xen-boot  fail REGR. vs. 58581
 test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm 11 guest-saverestore fail 
baseline untested
 test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm 12 guest-localmigrate 
fail baseline untested
 test-armhf-armhf-xl-rtds 11 guest-start fail baseline untested
 test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm 9 debian-hvm-install fail 
in 59766 baseline untested
 test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm 14 guest-localmigrate.2 
fail in 59766 baseline untested
 test-armhf-armhf-xl-rtds 14 guest-start.2  fail in 59766 baseline untested
 test-amd64-i386-libvirt-xsm  11 guest-start   fail in 59766 like 58558
 test-amd64-amd64-libvirt 11 guest-start   fail in 59766 like 58558
 test-amd64-amd64-libvirt-xsm 11 guest-start   fail in 59766 like 58558
 test-amd64-i386-libvirt  11 guest-start   fail in 59766 like 58581
 test-armhf-armhf-xl-credit2   6 xen-boot fail   like 58581
 test-armhf-armhf-xl   6 xen-boot fail   like 58581
 test-armhf-armhf-xl-multivcpu  6 xen-boot fail  like 58581
 test-armhf-armhf-xl-xsm   6 xen-boot fail   like 58581
 test-armhf-armhf-libvirt-xsm  6 xen-boot fail   like 58581
 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stop fail like 58581
 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop  fail like 58581
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop fail like 58581

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-xl-rtds 12 migrate-support-check fail in 59766 never pass
 test-amd64-i386-freebsd10-i386  9 freebsd-install  fail never pass
 test-amd64-i386-freebsd10-amd64  9 freebsd-install fail never pass
 test-amd64-amd64-xl-pvh-amd  11 guest-start  fail   never pass
 test-amd64-i386-libvirt  12 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  12 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  12 migrate-support-checkfail   never pass
 test-amd64-i386-xl-qemut-win7-amd64 16 guest-stop  fail never pass
 test-armhf-armhf-xl-cubietruck  6 xen-boot fail never pass
 test-amd64-amd64-libvirt-xsm 12 migrate-support-checkfail   never pass

version targeted for testing:
 linux866cebe251f4fb2b435f4ecfe6d3bb4025938533
baseline version:
 linuxd048c068d00da7d4cfa5ea7651933b99026958cf

Last test of basis58581  2015-06-15 09:42:22 Z   36 days
Failing since 58976  2015-06-29 19:43:23 Z   22 days   31 attempts
Testing same since59412  2015-07-11 00:18:42 Z   10 days   19 attempts


308 people touched revisions under test,
not listing them all

jobs:
 build-amd64-xsm  pass
 build-armhf-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-armhf-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-armhf-pvopspass
 build-i386-pvops pass
 build-amd64-rumpuserxen  pass
 build-i386-rumpuserxen   pass
 test-amd64-amd64-xl  pass
 test-armhf-armhf-xl  fail
 test-amd64-i386-xl   pass
 test-amd64-amd64-xl-qemut-debianhvm-amd64-xsmpass
 test-amd64-i386-xl-qemut-debianhvm-amd64-xsm 

Re: [Xen-devel] [PATCH v2 1/3] x86/ldt: Make modify_ldt synchronous

2015-07-21 Thread Andrew Cooper
On 22/07/2015 01:07, Andy Lutomirski wrote:
 On Tue, Jul 21, 2015 at 4:38 PM, Andrew Cooper
 andrew.coop...@citrix.com wrote:
 On 21/07/2015 22:53, Boris Ostrovsky wrote:
 On 07/21/2015 03:59 PM, Andy Lutomirski wrote:
 --- a/arch/x86/include/asm/mmu_context.h
 +++ b/arch/x86/include/asm/mmu_context.h
 @@ -34,6 +34,44 @@ static inline void load_mm_cr4(struct mm_struct
 *mm) {}
   #endif
 /*
 + * ldt_structs can be allocated, used, and freed, but they are never
 + * modified while live.
 + */
 +struct ldt_struct {
 +int size;
 +int __pad;/* keep the descriptors naturally aligned. */
 +struct desc_struct entries[];
 +};


 This breaks Xen which expects LDT to be page-aligned. Not sure why.

 Jan, Andrew?
 PV guests are not permitted to have writeable mappings to the frames
 making up the GDT and LDT, so it cannot make unaudited changes to
 loadable descriptors.  In particular, for a 32bit PV guest, it is only
 the segment limit which protects Xen from the ring1 guest kernel.

 A lot of this code hasn't been touched in years, and it certainly
 predates me.  The alignment requirement appears to come from the virtual
 region Xen uses to map the guests GDT and LDT.  Strict alignment is
 required for the GDT so Xen's descriptors starting at 0xe0xx are
 correct, but the LDT alignment seems to be a side effect of similar
 codepaths.

 For an LDT smaller than 8192 entries, I can't see any specific reason
 for enforcing alignment, other than that's the way it has always been.

 However, the guest would still have to relinquish write access to all
 frames which make up the LDT, which looks to be a bit of an issue given
 the snippet above.
 Does the LDT itself need to be aligned or just the address passed to
 paravirt_alloc_ldt?

The address which Xen receives needs to be aligned.

It looks like xen_alloc_ldt() blindly assumes that the desc_struct *ldt
it is passed is page aligned, and passes it straight through.


 I think I have a solution, but I doubt it is going to be very popular.

 * Make a new paravirt hook for allocation of ldt_struct, so the paravirt
 backend can choose an alignment if needed
 * Make absolutely certain that __pad has the value 0 (so size and __pad
 combined don't look like a present descriptor)
 * Never hand selector 0x0008 to unsuspecting users.
 Yuck.

I actually meant 0x0004, but yes.  Very much yuck.


 This will allow ldt_struct itself to be page aligned, and for the size
 field to sit across the base/limit field of what would logically be
 selector 0x0008  There would be some issues accessing size.  To load
 frames as an LDT, a guest must drop all refs to the page so that its
 type may be changed from writeable to segdesc.  After that, an
 update_descriptor hypercall can be used to change size, and I believe
 the guest may subsequently recreate read-only mappings to the frames in
 question (although frankly it is getting late so you will want to double
 check all of this).

 Anyhow, this looks like an issue which should be fixed up with slightly
 more PVOps, rather than enforcing a Xen view of the world on native Linux.

 I could presumably make the allocation the other way around so the
 size is at the end.  I could even use two separate allocations if
 needed.

I suspect two separate allocations would be the better solution, as it
means that the size field doesn't need to be subject to funny page
permissions.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 1/3] x86/ldt: Make modify_ldt synchronous

2015-07-21 Thread Andrew Cooper
On 21/07/2015 22:53, Boris Ostrovsky wrote:
 On 07/21/2015 03:59 PM, Andy Lutomirski wrote:
 --- a/arch/x86/include/asm/mmu_context.h
 +++ b/arch/x86/include/asm/mmu_context.h
 @@ -34,6 +34,44 @@ static inline void load_mm_cr4(struct mm_struct
 *mm) {}
   #endif
 /*
 + * ldt_structs can be allocated, used, and freed, but they are never
 + * modified while live.
 + */
 +struct ldt_struct {
 +int size;
 +int __pad;/* keep the descriptors naturally aligned. */
 +struct desc_struct entries[];
 +};



 This breaks Xen which expects LDT to be page-aligned. Not sure why.

 Jan, Andrew?

PV guests are not permitted to have writeable mappings to the frames
making up the GDT and LDT, so it cannot make unaudited changes to
loadable descriptors.  In particular, for a 32bit PV guest, it is only
the segment limit which protects Xen from the ring1 guest kernel.

A lot of this code hasn't been touched in years, and it certainly
predates me.  The alignment requirement appears to come from the virtual
region Xen uses to map the guests GDT and LDT.  Strict alignment is
required for the GDT so Xen's descriptors starting at 0xe0xx are
correct, but the LDT alignment seems to be a side effect of similar
codepaths.

For an LDT smaller than 8192 entries, I can't see any specific reason
for enforcing alignment, other than that's the way it has always been.

However, the guest would still have to relinquish write access to all
frames which make up the LDT, which looks to be a bit of an issue given
the snippet above.

I think I have a solution, but I doubt it is going to be very popular.

* Make a new paravirt hook for allocation of ldt_struct, so the paravirt
backend can choose an alignment if needed
* Make absolutely certain that __pad has the value 0 (so size and __pad
combined don't look like a present descriptor)
* Never hand selector 0x0008 to unsuspecting users.

This will allow ldt_struct itself to be page aligned, and for the size
field to sit across the base/limit field of what would logically be
selector 0x0008  There would be some issues accessing size.  To load
frames as an LDT, a guest must drop all refs to the page so that its
type may be changed from writeable to segdesc.  After that, an
update_descriptor hypercall can be used to change size, and I believe
the guest may subsequently recreate read-only mappings to the frames in
question (although frankly it is getting late so you will want to double
check all of this).

Anyhow, this looks like an issue which should be fixed up with slightly
more PVOps, rather than enforcing a Xen view of the world on native Linux.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 1/3] x86/ldt: Make modify_ldt synchronous

2015-07-21 Thread Andy Lutomirski
On Tue, Jul 21, 2015 at 4:38 PM, Andrew Cooper
andrew.coop...@citrix.com wrote:
 On 21/07/2015 22:53, Boris Ostrovsky wrote:
 On 07/21/2015 03:59 PM, Andy Lutomirski wrote:
 --- a/arch/x86/include/asm/mmu_context.h
 +++ b/arch/x86/include/asm/mmu_context.h
 @@ -34,6 +34,44 @@ static inline void load_mm_cr4(struct mm_struct
 *mm) {}
   #endif
 /*
 + * ldt_structs can be allocated, used, and freed, but they are never
 + * modified while live.
 + */
 +struct ldt_struct {
 +int size;
 +int __pad;/* keep the descriptors naturally aligned. */
 +struct desc_struct entries[];
 +};



 This breaks Xen which expects LDT to be page-aligned. Not sure why.

 Jan, Andrew?

 PV guests are not permitted to have writeable mappings to the frames
 making up the GDT and LDT, so it cannot make unaudited changes to
 loadable descriptors.  In particular, for a 32bit PV guest, it is only
 the segment limit which protects Xen from the ring1 guest kernel.

 A lot of this code hasn't been touched in years, and it certainly
 predates me.  The alignment requirement appears to come from the virtual
 region Xen uses to map the guests GDT and LDT.  Strict alignment is
 required for the GDT so Xen's descriptors starting at 0xe0xx are
 correct, but the LDT alignment seems to be a side effect of similar
 codepaths.

 For an LDT smaller than 8192 entries, I can't see any specific reason
 for enforcing alignment, other than that's the way it has always been.

 However, the guest would still have to relinquish write access to all
 frames which make up the LDT, which looks to be a bit of an issue given
 the snippet above.

Does the LDT itself need to be aligned or just the address passed to
paravirt_alloc_ldt?


 I think I have a solution, but I doubt it is going to be very popular.

 * Make a new paravirt hook for allocation of ldt_struct, so the paravirt
 backend can choose an alignment if needed
 * Make absolutely certain that __pad has the value 0 (so size and __pad
 combined don't look like a present descriptor)
 * Never hand selector 0x0008 to unsuspecting users.

Yuck.


 This will allow ldt_struct itself to be page aligned, and for the size
 field to sit across the base/limit field of what would logically be
 selector 0x0008  There would be some issues accessing size.  To load
 frames as an LDT, a guest must drop all refs to the page so that its
 type may be changed from writeable to segdesc.  After that, an
 update_descriptor hypercall can be used to change size, and I believe
 the guest may subsequently recreate read-only mappings to the frames in
 question (although frankly it is getting late so you will want to double
 check all of this).

 Anyhow, this looks like an issue which should be fixed up with slightly
 more PVOps, rather than enforcing a Xen view of the world on native Linux.


I could presumably make the allocation the other way around so the
size is at the end.  I could even use two separate allocations if
needed.

--Andy

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] unmodified-drivers: tolerate IRQF_DISABLED being undefined

2015-07-21 Thread Olaf Hering
On Mon, Jun 01, Jan Beulich wrote:

  On 01.06.15 at 15:56, ian.campb...@citrix.com wrote:
  On Mon, 2015-06-01 at 14:50 +0100, Jan Beulich wrote:
  It's being removed in Linux 4.1.
  
  Signed-off-by: Jan Beulich jbeul...@suse.com
  
  Not sure who should, but:
 
 I guess no-one really needs to for that old code. But thanks!

Any chance to backport this to all 4.x branches?

Olaf

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] VT-d: add iommu=igfx_off option to workaround graphics issues

2015-07-21 Thread Jan Beulich
 On 21.07.15 at 09:23, kevin.t...@intel.com wrote:
  From: Jan Beulich [mailto:jbeul...@suse.com]
 Sent: Tuesday, July 21, 2015 3:17 PM
  
  On 21.07.15 at 09:05, kevin.t...@intel.com wrote:
   From: Jan Beulich [mailto:jbeul...@suse.com]
  Sent: Tuesday, July 21, 2015 2:57 PM
   On 21.07.15 at 02:57, kevin.t...@intel.com wrote:
From: Andrew Cooper [mailto:am...@hermes.cam.ac.uk] On Behalf Of 
   Andrew
   Cooper
   Sent: Monday, July 20, 2015 4:21 PM
   This is the part which I don't quite understand. WC is essentially an UC
   attribute with write buffer to accelerate the write efficiency. There
   should be no correctness problem to use either WC or UC if i915 driver
   wants WC.
 
  Should is too weak a term here: Using WC on the wrong piece of
  memory or without the necessary fencing can imo very well cause
  correctness problems (which would be hidden by WC - UC
  conversion behind the driver's back).
 
 
  My point is about when i915 wants WC, then either UC (I suppose is
  the case before that Linux commit) and WC (by that commit) has
  no correctness problem. UC is more strict than WC. It's just performance
  difference. It's not about using WC in wrong place when it's not desired.
 
 In this you assume there are no misguided attempts to request
 WC in the driver, which would have gone unnoticed as long as WC
 didn't become the effective attribute. And misguided here is
 meant to include cases where hardware errata may need taking
 care of.
 
 I don't understand this point. If it's misguided attempts it'd be
 same on bare metal. 

Not if bare metal has a workaround in place depending on (or even
implemented in) IOMMU code. Also iirc the reported said that a
similar problem existed (exists?) on native too.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [Xen-Devel] Enabling IRQ Crossbar (Secondary Interrupt Controller) Support

2015-07-21 Thread Brandon Perez

Hello All,

You can find the relevant thread that outlines the issue at [1].

In short, the issue is as follows. On the TI DRA72 chip, there is a 
piece of hardware called the IRQ crossbar. Due to the large number of 
peripheral devices, not all devices can fit onto the SPI lines on the 
GIC. The crossbar maps peripheral devices to interrupt lines on the GIC.


The Linux kernel handles this by performing an on-demand mapping. 
Whenever a driver requires an interrupt, it requests it from the kernel. 
The Kernel keeps an internal map of the SPI lines, and checks for the 
next free line. It then sets up the appropriate crossbar register to map 
the given device to the SPI line, and marks that line as no longer free.


In the device tree, the interrupts property no longer contains 
the IRQ line corresponding to the device, but rather the crossbar input 
line corresponding to the device. This causes issues in Xen since it is 
expecting an IRQ line.


Ideally (in my opinion), the solution would be to mark these device 
tree entries (possibly with a different interrupt-parent property), so 
that Xen only maps the memory-mapped registers, and passes the interrupt 
on to Dom0. Dom0 would then handle the on-demand mapping of the devices. 
That way, Xen wouldn't need to repeat some fairly hardware-specific code 
that's already in the Kernel.


Below I've attached the patches of the changes I have made to Xen 
and the Linux kernel so far. For reference, here is my setup:

- Hardware: TI DRA72 Chip, Arm Cortex A15
- Xen:
- Version: 4.6-unstable
- Compiled from source, with some local changes
- Branch: master
- Commit: ecdae1cfaa7f6123decaa1b9d7205c3ff726b941
- Repo URL: git://xenbits.xen.org/xen.git
- Linux Kernel:
- Version: 3.14
- Compiled From source, with some local changes
- Branch: android-3.14-6AL.1.0
- Commit: 7b2f1133857414b96927c06f08ed6c440f5472e7
- Repo URL: git://git.omapzoom.org/kernel/omap.git

I have additional patches for U-Boot which I can provide if needed, 
but they aren't directly relevant to this issue. I also have patches for 
a static crossbar mapping with Xen, which is the temporary work-around 
I'm currently working with.


[1] http://www.gossamer-threads.com/lists/xen/users/389509

Brandon Perez

From f2bf190255c8f872d15063d7f8a6382c279e312d Mon Sep 17 00:00:00 2001
From: Brandon Perez bpere...@ti.com
Date: Mon, 20 Jul 2015 17:56:49 -0400
Subject: [PATCH 1/3] This patch adds in IO mappings for several devices which
 are not explicitly laid out in the device tree, and
 mappings for DRA72x specific regions that are not
 devices.

In the ARM virtualization setup, the virtual memory uses a 2-stage
translation. The guest OS VM performs a virtual address translation to
an intermediate physical address, which is still not a true physical
memory address. The Xen hypervisor VM then translates this IPA to an
actual physical address.

Thus, in order for a guest OS (even Dom0) to access a physical address,
Xen must explicitly setup a mapping in its VM for the guest OS to access
this location. Several devices were missing from the device tree, so Xen
was unware of the need to map their MMIO registers. Thus, the platform
has specific_mapping() function, which patches up these holes. The OMAP5
specific mapping function is missing a few items that DRA72x chips need,
so these were added in.
---
 xen/arch/arm/platforms/omap5.c|   27 +++
 xen/include/asm-arm/platforms/omap5.h |3 +++
 2 files changed, 30 insertions(+)

diff --git a/xen/arch/arm/platforms/omap5.c b/xen/arch/arm/platforms/omap5.c
index e7bf30d..3c6495a 100644
--- a/xen/arch/arm/platforms/omap5.c
+++ b/xen/arch/arm/platforms/omap5.c
@@ -120,6 +120,32 @@ static int omap5_specific_mapping(struct domain *d)
 return 0;
 }
 
+/* Additional mappings for dom0 (not in the DTS) */
+static int dra7_specific_mapping(struct domain *d)
+{
+/* Map the PRM module */
+map_mmio_regions(d, paddr_to_pfn(OMAP5_PRM_BASE), 2,
+ paddr_to_pfn(OMAP5_PRM_BASE));
+
+/* Map the PRM_MPU */
+map_mmio_regions(d, paddr_to_pfn(OMAP5_PRCM_MPU_BASE), 1,
+ paddr_to_pfn(OMAP5_PRCM_MPU_BASE));
+
+/* Map the Wakeup Gen */
+map_mmio_regions(d, paddr_to_pfn(OMAP5_WKUPGEN_BASE), 1,
+ paddr_to_pfn(OMAP5_WKUPGEN_BASE));
+
+/* Map the on-chip SRAM */
+map_mmio_regions(d, paddr_to_pfn(OMAP5_SRAM_PA), 32,
+ paddr_to_pfn(OMAP5_SRAM_PA));
+
+/* Map GPMC address space for NAND flash. */
+map_mmio_regions(d, paddr_to_pfn(OMAP5_GPMC_PA), 65536,
+ paddr_to_pfn(OMAP5_GPMC_PA));
+
+return 0;
+}
+
 static int __init omap5_smp_init(void)
 {
 void __iomem *wugen_base;
@@ -171,6 +197,7 @@ PLATFORM_START(dra7, TI DRA7)
 .init_time = omap5_init_time,
 .cpu_up = cpu_up_send_sgi,
 .smp_init = 

Re: [Xen-devel] [v10][PATCH 06/16] hvmloader/pci: Try to avoid placing BARs in RMRRs

2015-07-21 Thread Jan Beulich
 On 21.07.15 at 02:53, tiejun.c...@intel.com wrote:
  Okay, I regenerate this patch online. And I just hope its good to be
 acked here:

 Provided it also works,
 Reviewed-by: Jan Beulich jbeul...@suse.com
 
 Why is this marked as Acked-by this time? The same question is raised to 
 another hvmloader patch as well.
 
 This really makes me confused since you're the key maintainer associated 
 to this, and I remember you also gave me Acked-by to the first hvmloader 
 patch. I know this solution is always argued, so does this mean you 
 still don't think this is good to go in the tree in your perspective, so 
 you want to leave this Acked-by to other maintainers, right?

Keeping in mind that others (and other projects) may consider this
differently, at least on the hypervisor side we've been mostly
treating (or trying to treat) Reviewed-by as a superset of Acked-by.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] blktap2: update connection handling to fix build with gcc5

2015-07-21 Thread Olaf Hering
On Mon, Jul 20, Wei Liu wrote:

 On Mon, Jul 20, 2015 at 11:16:34AM +0200, Olaf Hering wrote:
  On Mon, Jul 20, Jan Beulich wrote:
  
On 19.07.15 at 11:33, o...@aepfle.de wrote:
[  198s] block-log.c:549:23: error: array subscript is above array 
bounds [-Werror=array-bounds]
[  198s]  if (s-connections[i].id == id)
[  198s]^
   
   So what makes the compiler right with that complaint? I.e. how does
   it know i  0 here? After all - afaict - s-connected can only be 0 or
  
  It has to assume that -connected can get any value because the input
  comes from outside the unit.
  
  To reduce the patch size  i  MAX_CONNECTIONS could be added.
  
 
 A smaller patch would be preferable at this stage.

I disagree with that.

What is the longterm goal of that binary?
Will it ever be able to handle more than one connection? If so the loops
have to handle holes in -connections[], which increases patch size.

If it will ever handle only a single conection, please review and spot
possible errors in my patch.

Olaf

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] blktap2: update connection handling to fix build with gcc5

2015-07-21 Thread Jan Beulich
 On 21.07.15 at 08:32, o...@aepfle.de wrote:
 On Mon, Jul 20, Wei Liu wrote:
 
 On Mon, Jul 20, 2015 at 11:16:34AM +0200, Olaf Hering wrote:
  On Mon, Jul 20, Jan Beulich wrote:
  
On 19.07.15 at 11:33, o...@aepfle.de wrote:
[  198s] block-log.c:549:23: error: array subscript is above array 
bounds 
 [-Werror=array-bounds]
[  198s]  if (s-connections[i].id == id)
[  198s]^
   
   So what makes the compiler right with that complaint? I.e. how does
   it know i  0 here? After all - afaict - s-connected can only be 0 or
  
  It has to assume that -connected can get any value because the input
  comes from outside the unit.
  
  To reduce the patch size  i  MAX_CONNECTIONS could be added.
  
 
 A smaller patch would be preferable at this stage.
 
 I disagree with that.

Please consider that we're in code freeze right now.

 What is the longterm goal of that binary?

It being mostly unmaintained, it's probably a candidate for removal
not too far in the future. Which would be another reason again
extensive changes.

But anyway, the primary question remains - isn't what you're seeing
a compiler bug? If yes, that's imo _yet another_ reason for doing a
minimal workaround (if any at all).

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 05/20] block/xen-blkfront: Split blkif_queue_request in 2

2015-07-21 Thread Roger Pau Monné
El 09/07/15 a les 22.42, Julien Grall ha escrit:
 Currently, blkif_queue_request has 2 distinct execution path:
 - Send a discard request
 - Send a read/write request
 
 The function is also allocating grants to use for generating the
 request. Although, this is only used for read/write request.
 
 Rather than having a function with 2 distinct execution path, separate
 the function in 2. This will also remove one level of tabulation.
 
 Signed-off-by: Julien Grall julien.gr...@citrix.com
 Cc: Konrad Rzeszutek Wilk konrad.w...@oracle.com
 Cc: Roger Pau Monné roger@citrix.com
 Cc: Boris Ostrovsky boris.ostrov...@oracle.com
 Cc: David Vrabel david.vra...@citrix.com

Patch looks fine, although with so much indentation changes it's kind of
hard to review.

Acked-by: Roger Pau Monné roger@citrix.com

Just one minor change below.

[...]

 @@ -595,6 +603,24 @@ static int blkif_queue_request(struct request *req)
   return 0;
  }
  
 +/*
 + * Generate a Xen blkfront IO request from a blk layer request.  Reads
 + * and writes are handled as expected.
 + *
 + * @req: a request struct
 + */
 +static int blkif_queue_request(struct request *req)
 +{
 + struct blkfront_info *info = req-rq_disk-private_data;
 +
 + if (unlikely(info-connected != BLKIF_STATE_CONNECTED))
 + return 1;
 +
 + if (unlikely(req-cmd_flags  (REQ_DISCARD | REQ_SECURE)))
 + return blkif_queue_discard_req(req);
 + else
 + return blkif_queue_rw_req(req);

There's no need for the else clause.

Roger.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v6 01/15] common/domain: Helpers to pause a domain while in context

2015-07-21 Thread George Dunlap
On Tue, Jul 21, 2015 at 12:57 AM, Ed White edmund.h.wh...@intel.com wrote:
 From: Andrew Cooper andrew.coop...@citrix.com

 For use on codepaths which would need to use domain_pause() but might be in
 the target domain's context.  In the case that the target domain is in
 context, all other vcpus are paused.

 Signed-off-by: Andrew Cooper andrew.coop...@citrix.com

FWIW:

Reviewed-by: George Dunlap george.dun...@eu.citrix.com

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v6 05/15] x86/altp2m: basic data structures and support routines.

2015-07-21 Thread Jan Beulich
 On 21.07.15 at 01:58, edmund.h.wh...@intel.com wrote:
 Add the basic data structures needed to support alternate p2m's and
 the functions to initialise them and tear them down.
 
 Although Intel hardware can handle 512 EPTP's per hardware thread
 concurrently, only 10 per domain are supported in this patch for
 performance reasons.
 
 The iterator in hap_enable() does need to handle 512, so that is now
 uint16_t.

Sigh - this one is still here (and the respective code unchanged).
I'm not going to NAK the patch just because of this, but it really
looks like you aren't trying to address comments even when
they're trivial (and quick) to carry out and testing of the change
comes as a side effect of you needing to test all the other changes
as well.

 --- a/xen/arch/x86/hvm/Makefile
 +++ b/xen/arch/x86/hvm/Makefile
 @@ -1,6 +1,7 @@
  subdir-y += svm
  subdir-y += vmx
  
 +obj-y += altp2m.o

Wasn't the outcome of the earlier discussion to put this in x86/mm,
or possibly not even introduce a new file?

 +void
 +altp2m_vcpu_initialise(struct vcpu *v)
 +{
 +if ( v != current )
 +vcpu_pause(v);
 +
 +altp2m_vcpu_reset(v);
 +vcpu_altp2m(v).p2midx = 0;
 +atomic_inc(p2m_get_altp2m(v)-active_vcpus);
 +
 +altp2m_vcpu_update_eptp(v);

Ahem - I'm not going to repeat the earlier comment yet another time,
the more that I see that the same issue didn't get addressed further
down either. Did you do _any_ of the requested changes at all? The
overview mail's list of changes seems to indicate you didn't. That also
reminds me to ask you yet another time to put the change info into
the individual patch mails; whether you _also_ put them in the
overview mail is up to you.

 @@ -196,7 +234,18 @@ int p2m_init(struct domain *d)
   * (p2m_init runs too early for HVM_PARAM_* options) */
  rc = p2m_init_nestedp2m(d);
  if ( rc )
 +{
  p2m_teardown_hostp2m(d);
 +return rc;
 +}
 +
 +rc = p2m_init_altp2m(d);
 +if ( rc )
 +{
 +p2m_teardown_hostp2m(d);
 +p2m_teardown_nestedp2m(d);
 +p2m_teardown_altp2m(d);

I can see (and remember having asked for) the first two, but
p2m_init_altp2m() would better clean up after itself in case of
failure.

Overall the situation didn't really change from v5 - the code from
a pure functionality pov looks okay, but I don't see myself giving
in on all the minor issues the patch introduces. If some were
left adjusting of which really takes time to or risks breaking the
code, I'd (reluctantly) give my ack, but not this way, I'm afraid.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 06/20] block/xen-blkfront: Store a page rather a pfn in the grant structure

2015-07-21 Thread Roger Pau Monné
El 09/07/15 a les 22.42, Julien Grall ha escrit:
 All the usage of the field pfn are done using the same idiom:
 
 pfn_to_page(grant-pfn)
 
 This will  return always the same page. Store directly the page in the
 grant to clean up the code.
 
 Signed-off-by: Julien Grall julien.gr...@citrix.com
 Cc: Konrad Rzeszutek Wilk konrad.w...@oracle.com
 Cc: Roger Pau Monné roger@citrix.com
 Cc: Boris Ostrovsky boris.ostrov...@oracle.com
 Cc: David Vrabel david.vra...@citrix.com

Acked-by: Roger Pau Monné roger@citrix.com

With one style fix.

[...]

  static struct grant *get_grant(grant_ref_t *gref_head,
 -   unsigned long pfn,
 +struct page *page,

Indentation.

Roger.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [linux-3.4 test] 59771: regressions - FAIL

2015-07-21 Thread osstest service owner
flight 59771 linux-3.4 real [real]
http://logs.test-lab.xenproject.org/osstest/logs/59771/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-xl-qemut-win7-amd64  6 xen-boot  fail REGR. vs. 30511

Tests which are failing intermittently (not blocking):
 test-amd64-amd64-xl-sedf-pin  6 xen-boot   fail in 58831 pass in 58798
 test-amd64-i386-xl-qemuu-win7-amd64 9 windows-install fail in 58831 pass in 
59771
 test-amd64-amd64-pair10 xen-boot/dst_host   fail pass in 58798
 test-amd64-i386-pair 10 xen-boot/dst_host   fail pass in 58831
 test-amd64-i386-pair  9 xen-boot/src_host   fail pass in 58831
 test-amd64-amd64-pair 9 xen-boot/src_host   fail pass in 59550
 test-amd64-amd64-xl   6 xen-bootfail pass in 59576

Regressions which are regarded as allowable (not blocking):
 test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm 6 xen-boot fail baseline untested
 test-amd64-i386-xl-xsm6 xen-bootfail baseline untested
 test-amd64-i386-libvirt-xsm   6 xen-bootfail baseline untested
 test-amd64-amd64-xl-multivcpu  6 xen-boot   fail baseline untested
 test-amd64-amd64-libvirt-xsm  6 xen-bootfail baseline untested
 test-amd64-i386-xl-qemut-debianhvm-amd64-xsm 6 xen-boot fail baseline untested
 test-amd64-amd64-xl-credit2   6 xen-bootfail baseline untested
 test-amd64-amd64-xl-qemut-debianhvm-amd64-xsm 6 xen-boot fail baseline untested
 test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm 6 xen-boot fail baseline 
untested
 test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm 11 guest-saverestore fail 
baseline untested
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsm 6 xen-boot fail baseline untested
 test-amd64-amd64-xl-rtds  6 xen-bootfail baseline untested
 test-amd64-amd64-xl-xsm   6 xen-bootfail baseline untested
 test-amd64-amd64-xl-sedf  6 xen-boot  fail in 58831 like 30406
 test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm 12 guest-localmigrate 
fail in 59576 baseline untested
 test-amd64-i386-libvirt  11 guest-start   fail in 59576 like 30511
 test-amd64-i386-xl-qemut-win7-amd64 16 guest-stop fail in 59576 like 30511
 test-amd64-amd64-libvirt 11 guest-start   fail in 59576 like 30511
 test-amd64-i386-xl-qemut-win7-amd64 15 guest-localmigrate/x10  fail like 30394
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop fail like 30511
 test-amd64-amd64-xl-qemuu-ovmf-amd64  6 xen-bootfail like 53709-bisect
 test-amd64-i386-xl6 xen-bootfail like 53725-bisect
 test-amd64-i386-freebsd10-amd64  6 xen-boot fail like 58780-bisect
 test-amd64-i386-xl-qemuu-winxpsp3  6 xen-boot   fail like 58786-bisect
 test-amd64-i386-qemut-rhel6hvm-intel  6 xen-bootfail like 58788-bisect
 test-amd64-i386-rumpuserxen-i386  6 xen-bootfail like 58799-bisect
 test-amd64-i386-xl-qemuu-winxpsp3-vcpus1  6 xen-bootfail like 58801-bisect
 test-amd64-amd64-xl-qemuu-debianhvm-amd64  6 xen-boot   fail like 58803-bisect
 test-amd64-amd64-xl-qemut-winxpsp3  6 xen-boot  fail like 58804-bisect
 test-amd64-i386-freebsd10-i386  6 xen-boot  fail like 58805-bisect
 test-amd64-i386-xl-qemuu-ovmf-amd64  6 xen-boot fail like 58806-bisect
 test-amd64-amd64-xl-qemuu-winxpsp3  6 xen-boot  fail like 58807-bisect
 test-amd64-i386-xl-qemut-winxpsp3  6 xen-boot   fail like 58808-bisect
 test-amd64-i386-xl-qemut-winxpsp3-vcpus1  6 xen-bootfail like 58809-bisect
 test-amd64-amd64-rumpuserxen-amd64  6 xen-boot  fail like 58810-bisect
 test-amd64-i386-xl-qemuu-debianhvm-amd64  6 xen-bootfail like 58811-bisect
 test-amd64-amd64-xl-qemut-debianhvm-amd64  6 xen-boot   fail like 58813-bisect
 test-amd64-i386-qemuu-rhel6hvm-intel  6 xen-bootfail like 58814-bisect
 test-amd64-i386-xl-qemut-debianhvm-amd64  6 xen-bootfail like 58815-bisect

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt-xsm 12 migrate-support-check fail in 58831 never pass
 test-amd64-amd64-xl-pvh-intel 11 guest-start  fail  never pass
 test-amd64-amd64-xl-pvh-amd  11 guest-start  fail   never pass
 test-amd64-i386-libvirt  12 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt 12 migrate-support-checkfail   never pass
 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop  fail never pass

version targeted for testing:
 linuxcf1b3dad6c5699b977273276bada8597636ef3e2
baseline version:
 linuxbb4a05a0400ed6d2f1e13d1f82f289ff74300a70

Last test of basis30511  2014-09-29 16:37:46 Z  294 days
Failing since 32004  2014-12-02 04:10:03 Z  231 days  187 attempts
Testing same since 

Re: [Xen-devel] [PATCH v2 07/20] block/xen-blkfront: split get_grant in 2

2015-07-21 Thread Roger Pau Monné
El 09/07/15 a les 22.42, Julien Grall ha escrit:
 Prepare the code to support 64KB page granularity. The first
 implementation will use a full Linux page per indirect and persistent
 grant. When non-persistent grant is used, each page of a bio request
 may be split in multiple grant.
 
 Furthermore, the field page of the grant structure is only used to copy
 data from persistent grant or indirect grant. Avoid to set it for other
 use case as it will have no meaning given the page will be split in
 multiple grant.
 
 Provide 2 functions, to setup indirect grant, the other for bio page.
 
 Signed-off-by: Julien Grall julien.gr...@citrix.com
 Cc: Konrad Rzeszutek Wilk konrad.w...@oracle.com
 Cc: Roger Pau Monné roger@citrix.com
 Cc: Boris Ostrovsky boris.ostrov...@oracle.com
 Cc: David Vrabel david.vra...@citrix.com
 ---
 Changes in v2:
 - Patch added
 ---
  drivers/block/xen-blkfront.c | 85 
 ++--
  1 file changed, 59 insertions(+), 26 deletions(-)
 
 diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c
 index 7b81d23..95fd067 100644
 --- a/drivers/block/xen-blkfront.c
 +++ b/drivers/block/xen-blkfront.c
 @@ -242,34 +242,77 @@ out_of_memory:
   return -ENOMEM;
  }
  
 -static struct grant *get_grant(grant_ref_t *gref_head,
 -struct page *page,
 -   struct blkfront_info *info)
 +static struct grant *get_free_grant(struct blkfront_info *info)
  {
   struct grant *gnt_list_entry;
 - unsigned long buffer_mfn;
  
   BUG_ON(list_empty(info-grants));
   gnt_list_entry = list_first_entry(info-grants, struct grant,
 -   node);
 +   node);

Stray change?

   list_del(gnt_list_entry-node);
  
 - if (gnt_list_entry-gref != GRANT_INVALID_REF) {
 + if (gnt_list_entry-gref != GRANT_INVALID_REF)
   info-persistent_gnts_c--;
 +
 + return gnt_list_entry;
 +}
 +
 +static void grant_foreign_access(const struct grant *gnt_list_entry,
 +  const struct blkfront_info *info)

Given that this is just a wrapper I would make it an inline function, or
even consider removing it and just call
gnttab_page_grant_foreign_access_ref directly.

 +{
 + gnttab_page_grant_foreign_access_ref(gnt_list_entry-gref,
 +  info-xbdev-otherend_id,
 +  gnt_list_entry-page,
 +  0);
 +}
 +
 +static struct grant *get_grant(grant_ref_t *gref_head,
 +unsigned long mfn,
 +struct blkfront_info *info)

Indentation.

 +{
 + struct grant *gnt_list_entry = get_free_grant(info);
 +
 + if (gnt_list_entry-gref != GRANT_INVALID_REF)
   return gnt_list_entry;
 +
 + /* Assign a gref to this page */
 + gnt_list_entry-gref = gnttab_claim_grant_reference(gref_head);
 + BUG_ON(gnt_list_entry-gref == -ENOSPC);
 + if (info-feature_persistent)
 + grant_foreign_access(gnt_list_entry, info);
 + else {
 + /* Grant access to the MFN passed by the caller */
 + gnttab_grant_foreign_access_ref(gnt_list_entry-gref,
 + info-xbdev-otherend_id,
 + mfn, 0);
   }
  
 + return gnt_list_entry;
 +}
 +
 +static struct grant *get_indirect_grant(grant_ref_t *gref_head,
 + struct blkfront_info *info)

Indentation.

Roger.


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [v10][PATCH 11/16] tools/libxl: detect and avoid conflicts with RDM

2015-07-21 Thread Ian Jackson
I wrote:
 If the domain configuration has rdms and num_rdms already set, then
 the strategy should presumably be ignored.  (Passing the same domain
 configuration struct to libxl_domain_create_new, after destroying the
 domain, ought to work, even after the first call has modified it.)

But even your latest patch adds the host rdm's (from the strategy
algorithm) to the array, unconditionally.

I think you need to think more clearly about what happens when the
caller *both* supplies some rdms, and sets strategy=host.

Certainly if this happens and the set of rdms supplied is the same as
that which would be generated by the strategy, the set should not be
duplicated.

Ian.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 3/3] xen-blkback: rm BUG_ON() in purge_persistent_gnt()

2015-07-21 Thread Bob Liu

On 07/21/2015 05:13 PM, Roger Pau Monné wrote:
 El 21/07/15 a les 5.30, Bob Liu ha escrit:
 This BUG_ON() will be triggered when previous purge work haven't finished.
 It's reasonable under pretty extreme load and should not panic the system.

 Signed-off-by: Bob Liu bob@oracle.com
 ---
  drivers/block/xen-blkback/blkback.c |4 +++-
  1 file changed, 3 insertions(+), 1 deletion(-)

 diff --git a/drivers/block/xen-blkback/blkback.c 
 b/drivers/block/xen-blkback/blkback.c
 index ced9677..b90ac8e 100644
 --- a/drivers/block/xen-blkback/blkback.c
 +++ b/drivers/block/xen-blkback/blkback.c
 @@ -394,7 +394,9 @@ static void purge_persistent_gnt(struct xen_blkif *blkif)
  
  pr_debug(Going to purge %u persistent grants\n, num_clean);
  
 -BUG_ON(!list_empty(blkif-persistent_purge_list));
 +if (!list_empty(blkif-persistent_purge_list))
 +return;
 +
 
 I see the problem with this, there's a check for work_pending before
 this BUG_ON, but it doesn't account if the work is currently running. I

Exactly.

 would rather prefer to replace the work_pending check with work_busy
 instead, which will also take into account if the work is still running.
 The comment on work_busy however makes me nervous:
 
 * Test whether @work is currently pending or running.  There is no
 * synchronization around this function and the test result is
 * unreliable and only useful as advisory hints or for debugging.
 
 AFAICT I think it should be safe because we don't have concurrent
 purge_persistent_gnt calls, but I'm no expert on Linux workqueues. It

Me neither, that's why I just replace this BUG_ON() with a simple return.

 also makes me wonder why we have such a half-baked function in the Linux
 kernel.
 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [v10][PATCH 11/16] tools/libxl: detect and avoid conflicts with RDM

2015-07-21 Thread Chen, Tiejun

I hope the following can address all comments below:


diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index 2f8e590..a4bd2a1 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -407,7 +407,7 @@ int libxl__domain_build(libxl__gc *gc,

 switch (info-type) {
 case LIBXL_DOMAIN_TYPE_HVM:
-ret = libxl__build_hvm(gc, domid, info, state);
+ret = libxl__build_hvm(gc, domid, d_config, state);
 if (ret)
 goto out;

diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c
index 634b8d2..ba852fe 100644
--- a/tools/libxl/libxl_dm.c
+++ b/tools/libxl/libxl_dm.c
@@ -92,6 +92,276 @@ const char *libxl__domain_device_model(libxl__gc *gc,
 return dm;
 }

+static int
+libxl__xc_device_get_rdm(libxl__gc *gc,
+ uint32_t flag,
+ uint16_t seg,
+ uint8_t bus,
+ uint8_t devfn,
+ unsigned int *nr_entries,
+ struct xen_reserved_device_memory **xrdm)
+{
+int rc = 0, r;
+
+/*
+ * We really can't presume how many entries we can get in advance.
+ */
+*nr_entries = 0;
+r = xc_reserved_device_memory_map(CTX-xch, flag, seg, bus, devfn,
+  NULL, nr_entries);
+assert(r = 0);
+/* 0 means we have no any rdm entry. */
+if (!r) goto out;
+
+if (errno != ENOBUFS) {
+rc = ERROR_FAIL;
+goto out;
+}
+
+GCNEW_ARRAY(*xrdm, *nr_entries);
+r = xc_reserved_device_memory_map(CTX-xch, flag, seg, bus, devfn,
+  *xrdm, nr_entries);
+if (r)
+rc = ERROR_FAIL;
+
+ out:
+if (rc) {
+*nr_entries = 0;
+*xrdm = NULL;
+LOG(ERROR, Could not get reserved device memory maps.\n);
+}
+return rc;
+}
+
+/*
+ * Check whether there exists rdm hole in the specified memory range.
+ * Returns true if exists, else returns false.
+ */
+static bool overlaps_rdm(uint64_t start, uint64_t memsize,
+ uint64_t rdm_start, uint64_t rdm_size)
+{
+return (start + memsize  rdm_start)  (start  rdm_start + rdm_size);
+}
+
+static void
+add_rdm_entry(libxl__gc *gc, libxl_domain_config *d_config,
+  uint64_t rdm_start, uint64_t rdm_size, int rdm_policy)
+{
+assert(d_config-num_rdms);
+
+d_config-rdms = libxl__realloc(NOGC, d_config-rdms,
+d_config-num_rdms * sizeof(libxl_device_rdm));
+
+d_config-rdms[d_config-num_rdms - 1].start = rdm_start;
+d_config-rdms[d_config-num_rdms - 1].size = rdm_size;
+d_config-rdms[d_config-num_rdms - 1].policy = rdm_policy;
+}
+
+/*
+ * Check reported RDM regions and handle potential gfn conflicts according
+ * to user preferred policy.
+ *
+ * RDM can reside in address space beyond 4G theoretically, but we never
+ * see this in real world. So in order to avoid breaking highmem layout
+ * we don't solve highmem conflict. Note this means highmem rmrr could
+ * still be supported if no conflict.
+ *
+ * But in the case of lowmem, RDM probably scatter the whole RAM space.
+ * Especially multiple RDM entries would worsen this to lead a complicated
+ * memory layout. And then its hard to extend hvm_info_table{} to work
+ * hvmloader out. So here we're trying to figure out a simple solution to
+ * avoid breaking existing layout. So when a conflict occurs,
+ *
+ * #1. Above a predefined boundary (default 2G)
+ * - Move lowmem_end below reserved region to solve conflict;
+ *
+ * #2. Below a predefined boundary (default 2G)
+ * - Check strict/relaxed policy.
+ * strict policy leads to fail libxl.
+ * relaxed policy issue a warning message and also mask this entry
+ * INVALID to indicate we shouldn't expose this entry to hvmloader.
+ * Note when both policies are specified on a given region, the per-device
+ * policy should override the global policy.
+ */
+int libxl__domain_device_construct_rdm(libxl__gc *gc,
+   libxl_domain_config *d_config,
+   uint64_t rdm_mem_boundary,
+   struct xc_hvm_build_args *args)
+{
+int i, j, conflict, rc;
+struct xen_reserved_device_memory *xrdm = NULL;
+uint32_t strategy = d_config-b_info.u.hvm.rdm.strategy;
+uint16_t seg;
+uint8_t bus, devfn;
+uint64_t rdm_start, rdm_size;
+uint64_t highmem_end = args-highmem_end ? args-highmem_end : 
(1ull32);

+
+/* Might not expose rdm. */
+if (strategy == LIBXL_RDM_RESERVE_STRATEGY_IGNORE 
+!d_config-num_pcidevs)
+return 0;
+
+/* Query all RDM entries in this platform */
+if (strategy == LIBXL_RDM_RESERVE_STRATEGY_HOST) {
+unsigned int nr_entries;
+
+/* Collect all rdm info if exist. */
+rc = libxl__xc_device_get_rdm(gc, PCI_DEV_RDM_ALL,
+  0, 0, 0, nr_entries, xrdm);
+  

Re: [Xen-devel] [PATCH v5 10/15] x86/altp2m: add remaining support routines.

2015-07-21 Thread Jan Beulich
 On 21.07.15 at 07:46, ravi.sah...@intel.com wrote:
 From: Jan Beulich [mailto:jbeul...@suse.com]
Sent: Sunday, July 19, 2015 11:53 PM
 On 18.07.15 at 00:32, ravi.sah...@intel.com wrote:
 From: Jan Beulich [mailto:jbeul...@suse.com]
Sent: Thursday, July 16, 2015 2:34 AM

 On 16.07.15 at 11:16, ravi.sah...@intel.com wrote:
 From: Jan Beulich [mailto:jbeul...@suse.com]
Sent: Tuesday, July 14, 2015 7:32 AM
 On 14.07.15 at 02:14, edmund.h.wh...@intel.com wrote:
 @@ -2965,9 +3003,15 @@ int hvm_hap_nested_page_fault(paddr_t gpa,
unsigned long gla,
  if ( npfec.write_access )
  {
  paging_mark_dirty(currd, mfn_x(mfn));
 +/* If p2m is really an altp2m, unlock here to avoid
 + lock
 ordering
 + * violation when the change below is propagated from
 + host p2m
 */
 +if ( ap2m_active )
 +__put_gfn(p2m, gfn);
  p2m_change_type_one(currd, gfn, p2m_ram_logdirty,
 p2m_ram_rw);

And this won't result in any races?

 No

To be honest I expected a little more than just no here. Now I have
to ask - why?


 Yes, I should have described it more than that :-)  so this part of
 the code is handling the log dirty transition of the page, and this
 page permission transition happens always on the hostp2m. Given the
 way the locking order is setup (hostp2m-altp2m-list-lock-altp2m and
 there was a separate writeup and discussion with George on that), at
 this point in this sequence there is a p2m lock (whether it's a
 hostp2m or altp2m lock depends on the mode of the
 domain) - the reason we have to drop the lock here first is due to
 what happens next; the permission changes in hostp2m will be serially
 propagated to altp2ms and not dropping the lock here would cause a
 locking order violation. Hope that clarifies.

Sadly it doesn't at all: You re-explain why you need to drop the lock, while 
 you
fail to say anything on why this won't cause a race.

 
 It only drops the lock on the altp2m, which is no longer required in this 
 function anyway. The important aspect is that there is still a lock held on 
 the host p2m, and that is dropped after the log-dirty updates, as it would be 
 in the non-altp2m case (maybe that was the part I should have explained 
 clearly in the para above). Does that clarify or do you see a particular race 
 condition here? (We don't ).

Sounds okay then.

 +long p2m_init_altp2m_by_id(struct domain *d, uint16_t idx) {
 +long rc = -EINVAL;

Why long (for both variable and function return type)? (More of
these in functions below.)

 Because the error variable in the code that calls these (in hvm.c)
 is a long, and you had given feedback earlier to propagate the
 returns from these functions through that calling code.

I don't see the connection. The function only returns zero or -E...
values, so why would its return type be long?


 do_hvm_op declares a rc that is of type long and hence this returns
 a long

What type your caller(s) return is of no interest at all here: What would you 
 do
if you had multiple callers with differing return types?
A function's return type should be chosen based on the range of values it may
return, and the result possibly widened to not yield inefficient code (like 
 in
some of the uint16_t cases elsewhere in the series would be necessary).

 
 What do you suggest the return type be?

For the case here - int (quite obviously I would say).

For the uint16_t ones - unsigned int.

 +void p2m_altp2m_propagate_change(struct domain *d, gfn_t gfn,
 + mfn_t mfn, unsigned int page_order,
 + p2m_type_t p2mt, p2m_access_t
 +p2ma) {
 +struct p2m_domain *p2m;
 +p2m_access_t a;
 +p2m_type_t t;
 +mfn_t m;
 +uint16_t i;
 +bool_t reset_p2m;
 +unsigned int reset_count = 0;
 +uint16_t last_reset_idx = ~0;
 +
 +if ( !altp2m_active(d) )
 +return;
 +
 +altp2m_list_lock(d);
 +
 +for ( i = 0; i  MAX_ALTP2M; i++ )
 +{
 +if ( d-arch.altp2m_eptp[i] == INVALID_MFN )
 +continue;
 +
 +p2m = d-arch.altp2m_p2m[i];
 +m = get_gfn_type_access(p2m, gfn_x(gfn), t, a, 0,
 + NULL);
 +
 +reset_p2m = 0;
 +
 +/* Check for a dropped page that may impact this altp2m */
 +if ( mfn_x(mfn) == INVALID_MFN 
 + gfn_x(gfn) = p2m-min_remapped_gfn 
 + gfn_x(gfn) = p2m-max_remapped_gfn )
 +reset_p2m = 1;

Considering that this looks like an optimization, what's the
downside of possibly having min=0 and max=end-of-address-space?
I.e.
can there a long latency operation result that's this way a guest can
effect?


 ... A p2m is a gfn-mfn map, amongst other things. There is a
 reverse
 mfn-gfn map, but that is only valid for the host p2m. Unless the
 remap altp2m hypercall is used, the gfn-mfn map in every altp2m
 mirrors the gfn-mfn map in the host p2m (or a subset thereof, due
 to lazy-copy), so handling removal of an mfn from 

Re: [Xen-devel] [PATCH] unmodified-drivers: tolerate IRQF_DISABLED being undefined

2015-07-21 Thread Olaf Hering
On Tue, Jul 21, Jan Beulich wrote:

 Taking it to mean all maintained, that's possible, but since I can't
 immediately see the point I'd like to ask for some justification. Us -
 SUSE - apparently being the only consumer of that code, and us
 not using Xen 4.4 together with Linux 4.1, and our 4.5 trees
 already having the backport, I'm having some difficulty
 understanding the reason for the request.

I run compile tests of all xen-4.x branches against all SUSE releases to
spot new errors. But since all the required changes for gcc wont be
backported anyway I will just patch also this issues out of my copy.

Olaf

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH for-4.6 3/3] hotplug/FreeBSD: fix xendriverdomain rc.d script

2015-07-21 Thread Wei Liu
On Mon, Jul 20, 2015 at 04:55:02PM +0200, Roger Pau Monne wrote:
 hotplugpath.sh by default is located in /usr/local/etc/xen/scripts on
 FreeBSD. Instead of hardcoding it's location use the XEN_SCRIPT_DIR variable
 like it's used on the xencommons rc.d script.
 
 Signed-off-by: Roger Pau Monné roger@citrix.com
 Cc: Ian Jackson ian.jack...@eu.citrix.com
 Cc: Ian Campbell ian.campb...@citrix.com
 Cc: Wei Liu wei.l...@citrix.com

Acked-by: Wei Liu wei.l...@citrix.com

 ---
  tools/hotplug/FreeBSD/rc.d/xendriverdomain.in | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)
 
 diff --git a/tools/hotplug/FreeBSD/rc.d/xendriverdomain.in 
 b/tools/hotplug/FreeBSD/rc.d/xendriverdomain.in
 index 25e3edd..4063c06 100644
 --- a/tools/hotplug/FreeBSD/rc.d/xendriverdomain.in
 +++ b/tools/hotplug/FreeBSD/rc.d/xendriverdomain.in
 @@ -7,7 +7,7 @@
  
  . /etc/rc.subr
  
 -. /etc/xen/scripts/hotplugpath.sh
 +. @XEN_SCRIPT_DIR@/hotplugpath.sh
  
  LD_LIBRARY_PATH=${libdir}
  export LD_LIBRARY_PATH
 -- 
 1.9.5 (Apple Git-50.3)

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [v10][PATCH 11/16] tools/libxl: detect and avoid conflicts with RDM

2015-07-21 Thread Ian Jackson
Chen, Tiejun writes (Re: [v10][PATCH 11/16] tools/libxl: detect and avoid 
conflicts with RDM):
 I would add this check at the beginning of this function:
 
 assert(!d_config-num_rdms  !d_config-rdms).

As Ian Campbell and I have explained, that is not OK.

The caller may choose to pass both some rdms and strategy=host.

The most obvious case is something like the following code

  libxl_domain_config cfg;
  cfg.stuff = blah;
  cfg.rdm.strategy = HOST;

  libxl_domain_create_new(cfg, domid);
  libxl_domain_destroy(domid);
  libxl_domain_create_new(cfg, domid);

which must work (and the second domain should have identical
configuration to the first).

Ian.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 05/20] block/xen-blkfront: Split blkif_queue_request in 2

2015-07-21 Thread Julien Grall
Hi Roger,

On 21/07/15 10:54, Roger Pau Monné wrote:
 El 09/07/15 a les 22.42, Julien Grall ha escrit:
 Currently, blkif_queue_request has 2 distinct execution path:
 - Send a discard request
 - Send a read/write request

 The function is also allocating grants to use for generating the
 request. Although, this is only used for read/write request.

 Rather than having a function with 2 distinct execution path, separate
 the function in 2. This will also remove one level of tabulation.

 Signed-off-by: Julien Grall julien.gr...@citrix.com
 Cc: Konrad Rzeszutek Wilk konrad.w...@oracle.com
 Cc: Roger Pau Monné roger@citrix.com
 Cc: Boris Ostrovsky boris.ostrov...@oracle.com
 Cc: David Vrabel david.vra...@citrix.com
 
 Patch looks fine, although with so much indentation changes it's kind of
 hard to review.

I wasn't sure how to make this patch more easy to review and it seems
like diff is getting confused.

It's mostly removing one indentation layer (the if (req-cmd_flags ...))
and move the discard code in a separate function.

 Acked-by: Roger Pau Monné roger@citrix.com

Thank you.

 Just one minor change below.
 
 [...]
 
 @@ -595,6 +603,24 @@ static int blkif_queue_request(struct request *req)
  return 0;
  }
  
 +/*
 + * Generate a Xen blkfront IO request from a blk layer request.  Reads
 + * and writes are handled as expected.
 + *
 + * @req: a request struct
 + */
 +static int blkif_queue_request(struct request *req)
 +{
 +struct blkfront_info *info = req-rq_disk-private_data;
 +
 +if (unlikely(info-connected != BLKIF_STATE_CONNECTED))
 +return 1;
 +
 +if (unlikely(req-cmd_flags  (REQ_DISCARD | REQ_SECURE)))
 +return blkif_queue_discard_req(req);
 +else
 +return blkif_queue_rw_req(req);
 
 There's no need for the else clause.

I find it more readable and obvious to understand than:

if ( ... )
  return
return;

when there is only one line in the else. IIRC, the resulting assembly
will be the same.

Anyway, I can drop the else if you really want.

Regards,

-- 
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [v10][PATCH 11/16] tools/libxl: detect and avoid conflicts with RDM

2015-07-21 Thread Chen, Tiejun

The domain destroy would not change cfg at all.


Okay.




If not, I should double check this duplication so what about a return in
the case of duplicating one entry?


What I am looking for is a *decision* about what the right behaviour
is, backed up by a *rationale*.

The most obvious answer to me would be that if an rdms array is
specified, the strategy should be ignored.  That is how the automatic
numa placement API works.


I'm not familiar with this.

Do you mean this?

if (d_config-num_rdms)
strategy = LIBXL_RDM_RESERVE_STRATEGY_IGNORE;

But except this global strategy, we also have per-device setting so 
maybe something like this?


if (d_config-num_rdms)
return 0;



But another answer would be to take the union of the specified
regions.  That would be more complicated, because the union would have
to be computed.


  if (d_config-rdms[i].start == rdm_start)
  return;


That doesn't, of course, compute the union.



Sorry I don't understand what the take the union of the specified 
regions is here.


Thanks
Tiejun


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 15/20] block/xen-blkfront: Make it running on 64KB page granularity

2015-07-21 Thread Roger Pau Monné
El 09/07/15 a les 22.42, Julien Grall ha escrit:
 From: Julien Grall julien.gr...@linaro.org
 
 The PV block protocol is using 4KB page granularity. The goal of this
 patch is to allow a Linux using 64KB page granularity using block
 device on a non-modified Xen.
 
 The block API is using segment which should at least be the size of a
 ^ segments
 Linux page. Therefore, the driver will have to break the page in chunk
chunks ^
 of 4K before giving the page to the backend.
 
 Breaking a 64KB segment in 4KB chunk will result to have some chunk with
 no data.

I would rewrite this as:

Breaking a 64KB page into 4KB chunks can result in chunks with no data.

 As the PV protocol always require to have data in the chunk, we
 have to count the number of Xen page which will be in use and avoid to
  ^ pages remove the to ^
 sent empty chunk.
  ^ sending empty chunks
 
 Note that, a pre-defined number of grant is reserved before preparing
   ^ no coma ^ grants are
 the request. This pre-defined number is based on the number and the
 maximum size of the segments. If each segment contain a very small
^ contains
 amount of data, the driver may reserve too much grant (16 grant is
 ^ many grants? ^ grants are
 reserved per segment with 64KB page granularity).
 
 Futhermore, in the case of persistent grant we allocate one Linux page
  ^ Furthermore  ^ case of using persistent grants
 per grant although only the 4KB of the page will be effectively use.
 ^ initial^ used.
 This could be improved by share the page with multiple grants.
^ sharing the page between
 
 Signed-off-by: Julien Grall julien.gr...@citrix.com
 Cc: Konrad Rzeszutek Wilk konrad.w...@oracle.com
 Cc: Roger Pau Monné roger@citrix.com
 Cc: Boris Ostrovsky boris.ostrov...@oracle.com
 Cc: David Vrabel david.vra...@citrix.com

This looks much better now, thanks.

Acked-by: Roger Pau Monné roger@citrix.com

 ---
 
 Improvement such as support 64KB grant is not taken into consideration in
 this patch because we have the requirement to run a Linux using 64KB page
 on a non-modified Xen.
 
 Changes in v2:
 - Use gnttab_foreach_grant to split a Linux page into grant
 ---
  drivers/block/xen-blkfront.c | 304 
 ---
  1 file changed, 198 insertions(+), 106 deletions(-)
 
 diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c
 index 95fd067..644ba76 100644
 --- a/drivers/block/xen-blkfront.c
 +++ b/drivers/block/xen-blkfront.c
 @@ -77,6 +77,7 @@ struct blk_shadow {
   struct grant **grants_used;
   struct grant **indirect_grants;
   struct scatterlist *sg;
 + unsigned int num_sg;
  };
  
  struct split_bio {
 @@ -106,8 +107,8 @@ static unsigned int xen_blkif_max_ring_order;
  module_param_named(max_ring_page_order, xen_blkif_max_ring_order, int, 
 S_IRUGO);
  MODULE_PARM_DESC(max_ring_page_order, Maximum order of pages to be used for 
 the shared ring);
  
 -#define BLK_RING_SIZE(info) __CONST_RING_SIZE(blkif, PAGE_SIZE * 
 (info)-nr_ring_pages)
 -#define BLK_MAX_RING_SIZE __CONST_RING_SIZE(blkif, PAGE_SIZE * 
 XENBUS_MAX_RING_PAGES)
 +#define BLK_RING_SIZE(info) __CONST_RING_SIZE(blkif, XEN_PAGE_SIZE * 
 (info)-nr_ring_pages)
 +#define BLK_MAX_RING_SIZE __CONST_RING_SIZE(blkif, XEN_PAGE_SIZE * 
 XENBUS_MAX_RING_PAGES)
  /*
   * ring-ref%i i=(-1UL) would take 11 characters + 'ring-ref' is 8, so 19
   * characters are enough. Define to 20 to keep consist with backend.
 @@ -146,6 +147,7 @@ struct blkfront_info
   unsigned int discard_granularity;
   unsigned int discard_alignment;
   unsigned int feature_persistent:1;
 + /* Number of 4K segment handled */
^ segments
   unsigned int max_indirect_segments;
   int is_ready;
  };
 @@ -173,10 +175,19 @@ static DEFINE_SPINLOCK(minor_lock);
  
  #define DEV_NAME xvd   /* name in /dev */
  
 -#define SEGS_PER_INDIRECT_FRAME \
 - (PAGE_SIZE/sizeof(struct blkif_request_segment))
 -#define INDIRECT_GREFS(_segs) \
 - ((_segs + SEGS_PER_INDIRECT_FRAME - 1)/SEGS_PER_INDIRECT_FRAME)
 +/*
 + * Xen use 4K pages. The guest may use different page size (4K or 64K)
 + * Number of Xen pages per segment
 + */
 +#define XEN_PAGES_PER_SEGMENT   (PAGE_SIZE / XEN_PAGE_SIZE)
 +
 +#define SEGS_PER_INDIRECT_FRAME  \
 + (XEN_PAGE_SIZE/sizeof(struct blkif_request_segment) / 
 XEN_PAGES_PER_SEGMENT)
 +#define XEN_PAGES_PER_INDIRECT_FRAME \
 + (XEN_PAGE_SIZE/sizeof(struct blkif_request_segment))
 +
 +#define INDIRECT_GREFS(_pages) \
 + ((_pages + XEN_PAGES_PER_INDIRECT_FRAME - 
 1)/XEN_PAGES_PER_INDIRECT_FRAME)
  
  static int blkfront_setup_indirect(struct blkfront_info 

Re: [Xen-devel] [v10][PATCH 11/16] tools/libxl: detect and avoid conflicts with RDM

2015-07-21 Thread Chen, Tiejun

On 2015/7/21 19:11, Ian Jackson wrote:

Chen, Tiejun writes (Re: [v10][PATCH 11/16] tools/libxl: detect and avoid conflicts 
with RDM):

I would add this check at the beginning of this function:

assert(!d_config-num_rdms  !d_config-rdms).


As Ian Campbell and I have explained, that is not OK.

The caller may choose to pass both some rdms and strategy=host.

The most obvious case is something like the following code

   libxl_domain_config cfg;
   cfg.stuff = blah;
   cfg.rdm.strategy = HOST;

   libxl_domain_create_new(cfg, domid);
   libxl_domain_destroy(domid);


I'm not sure about this procedure, so do you mean this doesn't free 
d_config-num_rdms/d_config-rdms?


If not, I should double check this duplication so what about a return in 
the case of duplicating one entry?


static void
add_rdm_entry(libxl__gc *gc, libxl_domain_config *d_config,
  uint64_t rdm_start, uint64_t rdm_size, int rdm_policy)
{
if (d_config-rdms)
{
unsigned int i;

for (i = 0; i  d_config-num_rdms; i++)
{
if (d_config-rdms[i].start == rdm_start)
return;
}
}

d_config-num_rdms++;
d_config-rdms = libxl__realloc(NOGC, d_config-rdms,
d_config-num_rdms * sizeof(libxl_device_rdm));

d_config-rdms[d_config-num_rdms - 1].start = rdm_start;
d_config-rdms[d_config-num_rdms - 1].size = rdm_size;
d_config-rdms[d_config-num_rdms - 1].policy = rdm_policy;
}

Thanks
Tiejun


   libxl_domain_create_new(cfg, domid);

which must work (and the second domain should have identical
configuration to the first).

Ian.



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [v10][PATCH 11/16] tools/libxl: detect and avoid conflicts with RDM

2015-07-21 Thread Ian Jackson
Chen, Tiejun writes (Re: [v10][PATCH 11/16] tools/libxl: detect and avoid 
conflicts with RDM):
 On 2015/7/21 19:11, Ian Jackson wrote:
  The most obvious case is something like the following code
 
 libxl_domain_config cfg;
 cfg.stuff = blah;
 cfg.rdm.strategy = HOST;
 
 libxl_domain_create_new(cfg, domid);
 libxl_domain_destroy(domid);
 
 I'm not sure about this procedure, so do you mean this doesn't free 
 d_config-num_rdms/d_config-rdms?

The domain destroy would not change cfg at all.

 If not, I should double check this duplication so what about a return in 
 the case of duplicating one entry?

What I am looking for is a *decision* about what the right behaviour
is, backed up by a *rationale*.

The most obvious answer to me would be that if an rdms array is
specified, the strategy should be ignored.  That is how the automatic
numa placement API works.

But another answer would be to take the union of the specified
regions.  That would be more complicated, because the union would have
to be computed.

  if (d_config-rdms[i].start == rdm_start)
  return;

That doesn't, of course, compute the union.

Ian.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH for-4.6 2/3] libxl/psr: use Xen error codes when checking hypercall return values

2015-07-21 Thread Ian Jackson
Roger Pau Monne writes ([PATCH for-4.6 2/3] libxl/psr: use Xen error codes 
when checking hypercall return values):
 We cannot use the systems errno values when checking return values from Xen,
 because some OSes don't have the same set of errno definitions. Instead
 use the definitions present in Xen public errno.h header.
 
 Signed-off-by: Roger Pau Monné roger@citrix.com
 Cc: Ian Jackson ian.jack...@eu.citrix.com
 Cc: Ian Campbell ian.campb...@citrix.com
 Cc: Wei Liu wei.l...@citrix.com

Acked-by: Ian Jackson ian.jack...@eu.citrix.com

 I have not checked if there are other places in libxl that need similar
 treatment, I just came around this because FreeBSD doesn't have EBADSLT
 defined.

Right.  There is probably more of this, but that's not a blocker for
this patch.

Maybe it would be worth mentioning this issue in CODING_STYLE.

Ian.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [v10][PATCH 11/16] tools/libxl: detect and avoid conflicts with RDM

2015-07-21 Thread Chen, Tiejun

On 2015/7/21 19:11, Ian Jackson wrote:

Chen, Tiejun writes (Re: [v10][PATCH 11/16] tools/libxl: detect and avoid conflicts 
with RDM):

I would add this check at the beginning of this function:

assert(!d_config-num_rdms  !d_config-rdms).


As Ian Campbell and I have explained, that is not OK.

The caller may choose to pass both some rdms and strategy=host.


If you're talking to create the domain twice I think this should be 
reasonable,


if (d_config-num_rdms) return 0;

Because RDMs are always associated to the platform so we should keep the 
same array. I mean same configuration shouldn't change this point.


Thanks
Tiejun



The most obvious case is something like the following code

   libxl_domain_config cfg;
   cfg.stuff = blah;
   cfg.rdm.strategy = HOST;

   libxl_domain_create_new(cfg, domid);
   libxl_domain_destroy(domid);
   libxl_domain_create_new(cfg, domid);

which must work (and the second domain should have identical
configuration to the first).

Ian.



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH v2] blktap2: update connection handling to fix build with gcc5

2015-07-21 Thread Olaf Hering
blktap2 fails to build with gcc5 because it fails to recognize that
there can be just one active connection (enforced in ctl_accept).

Add a hack to use MAX_CONNECTIONS instead of ARRAY_SIZE:
gives a smaller patch, and the code does not handle holes in connections[] 
anyway.

[  198s] block-log.c: In function 'ctl_close_sock':
[  198s] block-log.c:363:23: error: array subscript is above array bounds 
[-Werror=array-bounds]
[  198s]  if (s-connections[i].fd == fd) {
[  198s]^
[  198s] block-log.c: In function 'ctl_request':
[  198s] block-log.c:549:23: error: array subscript is above array bounds 
[-Werror=array-bounds]
[  198s]  if (s-connections[i].id == id)
[  198s]^
[  198s] cc1: all warnings being treated as errors
[  198s] 
/home/abuild/rpmbuild/BUILD/xen-4.6.31382/non-dbg/tools/blktap2/drivers/../../../tools/Rules.mk:107:
 recipe for target 'block-log.o' failed
[  198s] make[5]: *** [block-log.o] Error 1

Signed-off-by: Olaf Hering o...@aepfle.de
Cc: Ian Jackson ian.jack...@eu.citrix.com
Cc: Stefano Stabellini stefano.stabell...@eu.citrix.com
Cc: Ian Campbell ian.campb...@citrix.com
Cc: Wei Liu wei.l...@citrix.com
---
 tools/blktap2/drivers/block-log.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/blktap2/drivers/block-log.c 
b/tools/blktap2/drivers/block-log.c
index 5330cdc..1934227 100644
--- a/tools/blktap2/drivers/block-log.c
+++ b/tools/blktap2/drivers/block-log.c
@@ -359,7 +359,7 @@ static int ctl_close_sock(struct tdlog_state* s, int fd)
 {
   int i;
 
-  for (i = 0; i  s-connected; i++) {
+  for (i = 0; i  s-connected  i  MAX_CONNECTIONS; i++) {
 if (s-connections[i].fd == fd) {
   tapdisk_server_unregister_event(s-connections[i].id);
   close(s-connections[i].fd);
@@ -545,7 +545,7 @@ static inline int ctl_find_connection(struct tdlog_state 
*s, event_id_t id)
 {
   int i;
 
-  for (i = 0; i  s-connected; i++)
+  for (i = 0; i  s-connected  i  MAX_CONNECTIONS; i++)
 if (s-connections[i].id == id)
   return s-connections[i].fd;
 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 0/6] add xsaves/xrstors support

2015-07-21 Thread Ruan, Shuai
Hi Andrew:
Thanks for your review.
In V2 , I have test migrate VM between two different servers, one which is 
xsaves capable and one which is not. There is 2 bugs related with my code in 
migrating between two servers( in v1 I just test migrate using localhost) and I 
fix it by add change the code.
I forgot to mention it in v2 0/6.

Thanks

shuai

-Original Message-
From: Andrew Cooper [mailto:andrew.coop...@citrix.com] 
Sent: Saturday, July 18, 2015 4:15 AM
To: Ruan, Shuai; xen-devel@lists.xen.org
Cc: ian.jack...@eu.citrix.com; ian.campb...@citrix.com; 
stefano.stabell...@eu.citrix.com; wei.l...@citrix.com; jbeul...@suse.com; 
Nakajima, Jun; k...@xen.org; Dong, Eddie; Tian, Kevin
Subject: Re: [PATCH v2 0/6] add xsaves/xrstors support

On 17/07/15 08:26, Shuai Ruan wrote:
 Changes in v2:
 * Address comments from Andrew/chao/Jan, mainly:
 * Add details information of xsaves/xrstors feature.
 * Remove XGETBV1/XSAVEC/XSAVEOPT out of 'else' in patch 3.
 * Change macro name XGETBV to XGETBV1 in patch 4.

 This patchset enable xsaves/xrstors feature which will be available on 
 Intel Skylake and later platform. Like xsave/xrstor, xsaves/xrstors 
 feature will save and load processor state from a region of memory 
 called XSAVE area. While unlike xsave/xrstor, xsaves/xrstors:

 a) use the compacted format only for the extended region 
of the XSAVE area which saves memory for you;
 b) can operate on supervisor state components so the feature
is prerequisite to support new supervisor state components;
 c) execute only if CPL=0. 

 Detail hardware spec can be found in chapter 13 (section 13.11 13.12) 
 of the Intel SDM [1].

 patch1: add xsaves/xrstors support for pv guest
 patch2: add xsaves/xrstors support for xen
 patch3-5: add xsaves/xrstors support for hvm guest
 patch6: swtich on detection of xsaves/xrstors/xgetbv in xen

 [1] Intel SDM 
 (http://www.intel.com/content/dam/www/public/us/en/documents/manuals/6
 4-ia-32-architectures-software-developer-manual-325462.pdf)

Thankyou for this - it is far more useful 0/$N now.

However, looking at the series, you clearly have not tried migrating a VM 
between two different servers, one which is xsavec capable and one which is not.

The reason why Xen does not currently use compressed xsave areas is that it has 
an ABI to maintain which predates compression.  In some cases there is no 
question; the uncompressed format must be used.  In other cases such as 
migration itself, compressed format could be used, but may not assume that the 
far side of the migration has hardware capable of processing the compressed 
format.

~Andrew
___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [qemu-upstream-4.5-testing test] 59774: regressions - trouble: broken/fail/pass

2015-07-21 Thread osstest service owner
flight 59774 qemu-upstream-4.5-testing real [real]
http://logs.test-lab.xenproject.org/osstest/logs/59774/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-armhf-armhf-xl-arndale   3 host-install(3) broken REGR. vs. 58413
 test-amd64-i386-xl-qemuu-ovmf-amd64 18 guest-start/debianhvm.repeat fail REGR. 
vs. 58413

Regressions which are regarded as allowable (not blocking):
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stopfail REGR. vs. 58384
 test-amd64-i386-libvirt  11 guest-start  fail   like 58413
 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop  fail like 58413
 test-amd64-i386-xl-qemuu-winxpsp3-vcpus1 15 guest-localmigrate/x10 fail like 
58413

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-pvh-intel 11 guest-start  fail  never pass
 test-amd64-amd64-xl-pvh-amd  11 guest-start  fail   never pass
 test-armhf-armhf-xl  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 12 migrate-support-checkfail  never pass
 test-armhf-armhf-libvirt 12 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 12 migrate-support-checkfail never pass
 test-armhf-armhf-xl-rtds 11 guest-start  fail   never pass

version targeted for testing:
 qemuueb75549f69ca0f3eab26ed39d4ad0fcb6613f64a
baseline version:
 qemuud9552b0af21c27535cd3c8549bb31d26bbecd506

Last test of basis58413  2015-06-11 17:02:58 Z   39 days
Testing same since59774  2015-07-20 12:44:09 Z0 days1 attempts


People who touched revisions under test:
  Gerd Hoffmann kra...@redhat.com
  Marc-André Lureau marcandre.lur...@gmail.com
  Stefano Stabellini stefano.stabell...@eu.citrix.com

jobs:
 build-amd64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-armhf-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-armhf-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-xl  pass
 test-armhf-armhf-xl  pass
 test-amd64-i386-xl   pass
 test-amd64-amd64-xl-pvh-amd  fail
 test-amd64-i386-qemuu-rhel6hvm-amd   pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64pass
 test-amd64-i386-xl-qemuu-debianhvm-amd64 pass
 test-amd64-i386-freebsd10-amd64  pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64 pass
 test-amd64-i386-xl-qemuu-ovmf-amd64  fail
 test-amd64-amd64-xl-qemuu-win7-amd64 fail
 test-amd64-i386-xl-qemuu-win7-amd64  fail
 test-armhf-armhf-xl-arndale  broken
 test-amd64-amd64-xl-credit2  pass
 test-armhf-armhf-xl-credit2  pass
 test-armhf-armhf-xl-cubietruck   pass
 test-amd64-i386-freebsd10-i386   pass
 test-amd64-amd64-xl-pvh-intelfail
 test-amd64-i386-qemuu-rhel6hvm-intel pass
 test-amd64-amd64-libvirt pass
 test-armhf-armhf-libvirt pass
 test-amd64-i386-libvirt  fail
 test-amd64-amd64-xl-multivcpupass
 test-armhf-armhf-xl-multivcpupass
 test-amd64-amd64-pairpass
 test-amd64-i386-pair pass
 test-amd64-amd64-xl-rtds pass
 test-armhf-armhf-xl-rtds fail
 test-amd64-i386-xl-qemuu-winxpsp3-vcpus1 fail
 test-amd64-amd64-xl-qemuu-winxpsp3   pass
 test-amd64-i386-xl-qemuu-winxpsp3pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at

Re: [Xen-devel] [PATCH v2 02/23] x86/boot: copy only text section from *.lnk file to *.bin file

2015-07-21 Thread Jan Beulich
 On 20.07.15 at 16:28, daniel.ki...@oracle.com wrote:

Without any explanation (description) I'm inclined to say this makes
things more fragile instead of improving the situation. As it looks
like we indeed pointlessly copy .eh_frame, but I think this would
better be avoided by suppressing its generation (i.e. add
-fno-asynchronous-unwind-tables just like Rules.mk has).

Jan

 Signed-off-by: Daniel Kiper daniel.ki...@oracle.com
 ---
  xen/arch/x86/boot/build32.mk |2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)
 
 diff --git a/xen/arch/x86/boot/build32.mk b/xen/arch/x86/boot/build32.mk
 index c208249..c83effe 100644
 --- a/xen/arch/x86/boot/build32.mk
 +++ b/xen/arch/x86/boot/build32.mk
 @@ -13,7 +13,7 @@ CFLAGS := $(filter-out -flto,$(CFLAGS))
   sed 's/ /,0x/g' | sed 's/,0x$$//' | sed 's/^[0-9]*,/ .long /') $@
  
  %.bin: %.lnk
 - $(OBJCOPY) -O binary $ $@
 + $(OBJCOPY) -O binary -j .text $ $@
  
  %.lnk: %.o
   $(LD) $(LDFLAGS_DIRECT) -N -Ttext 0 -o $@ $
 -- 
 1.7.10.4




___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


  1   2   >