[Xen-devel] [PATCH] oxenstored: fix del_watches and del_transactions

2015-06-17 Thread Wei Liu
The statement to reset nb_watches should be in del_watches, not
del_transactions.

Signed-off-by: Wei Liu wei.l...@citrix.com
Cc: David Scott dave.sc...@citrix.com
---
It was only until Ian applied previous patch that I discovered this
copy-n-paste  error. Sorry about this.
---
 tools/ocaml/xenstored/connection.ml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/ocaml/xenstored/connection.ml 
b/tools/ocaml/xenstored/connection.ml
index 9de4978..935939a 100644
--- a/tools/ocaml/xenstored/connection.ml
+++ b/tools/ocaml/xenstored/connection.ml
@@ -188,10 +188,10 @@ let del_watch con path token =
 
 let del_watches con =
   Hashtbl.clear con.watches
+  con.nb_watches - 0
 
 let del_transactions con =
   Hashtbl.clear con.transactions;
-  con.nb_watches - 0
 
 let list_watches con =
let ll = Hashtbl.fold 
-- 
1.9.1


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] race condition in xen-gntdev (was: Re: gntdev/gntalloc and fork? - crash in gntdev)

2015-06-17 Thread Marek Marczykowski-Górecki
On Thu, May 28, 2015 at 01:45:08AM +0200, Marek Marczykowski-Górecki wrote:
 On Thu, Apr 30, 2015 at 04:47:44PM +0200, Marek Marczykowski-Górecki wrote:
  Hi,
  
  What is the proper way to handle shared pages (either side - using
  gntdev or gntalloc) regarding fork and possible exec later? The child
  process do not need to access those pages in any way, but will map
  different one(s), using newly opened FD to the gntdev/gntalloc device.
  Should it unmap them and close FD to the device manually just after the
  fork? Or the process using gntdev or gntalloc should prevent using fork
  at all?
  
  I'm asking because I get kernel oops[1] in context of such process. This
  process uses both gntdev and gntalloc. The PID reported there is a
  child, which maps additional pages (using newly opened FD to
  /dev/xen/gnt*), but I'm not sure if the crash happens before, after or
  at this second mapping (actually vchan connection), or maybe even at
  cleanup of this second mapping. The parent process keeps its mappings
  for the whole lifetime of its child. I don't have a 100% reliable way
  to reproduce this problem, but it happens quite often when I run such
  operations in a loop.
 
 Any ideas?

I've done some further debugging, below is what I get.

  The kernel is vanilla 3.19.3, running on Xen 4.4.2.
  
  The kernel message:
  [74376.073464] general protection fault:  [#1] SMP 
  [74376.073475] Modules linked in: fuse xt_conntrack ipt_MASQUERADE 
  nf_nat_masquerade_ipv4 iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 
  nf_nat_ipv4 nf_nat nf_conntrack ip6table_filter ip6_tables intel_rapl 
  iosf_mbi x86_pkg_temp_thermal coretemp crct10dif_pclmul crc32_pclmul 
  crc32c_intel pcspkr xen_netfront ghash_clmulni_intel nfsd auth_rpcgss 
  nfs_acl lockd grace xenfs xen_privcmd dummy_hcd udc_core xen_gntdev 
  xen_gntalloc xen_blkback sunrpc u2mfn(O) xen_evtchn xen_blkfront
  [74376.073522] CPU: 1 PID: 9377 Comm: qrexec-agent Tainted: G   O   
  3.19.3-4.pvops.qubes.x86_64 #1
  [74376.073528] task: 880002442e40 ti: 8832c000 task.ti: 
  8832c000
  [74376.073532] RIP: e030:[a00952c5]  [a00952c5] 
  unmap_if_in_range+0x15/0xd0 [xen_gntdev]

static void unmap_if_in_range(struct grant_map *map,
  unsigned long start, unsigned long end)
{   
unsigned long mstart, mend;
int err;

if (!map-vma)
return;

The above crash is at first access to map...

  [74376.073543] RSP: e02b:8832fc08  EFLAGS: 00010292
  [74376.073546] RAX:  RBX: dead00100100 RCX: 
  7fd8616ea000
  [74376.073550] RDX: 7fd8616ea000 RSI: 7fd8616e9000 RDI: 
  dead00100100

... which is 0xdead00100100 (LIST_POISON1).


  [74376.073554] RBP: 8832fc48 R08:  R09: 
  
  [74376.073557] R10: ea21bb00 R11:  R12: 
  7fd8616e9000
  [74376.073561] R13: 7fd8616ea000 R14: 880012702e40 R15: 
  880012702e70
  [74376.073569] FS:  7fd8616ca700() GS:880013c8() 
  knlGS:
  [74376.073574] CS:  e033 DS:  ES:  CR0: 80050033
  [74376.073577] CR2: 7fd8616e9458 CR3: e7af5000 CR4: 
  00042660
  [74376.073582] Stack:
  [74376.073584]  8800188356c0 00d0 8832fc68 
  c64ef797
  [74376.073590]  0220 dead00100100 7fd8616e9000 
  7fd8616ea000
  [74376.073596]  8832fc88 a00953c6 8832fcc8 
  880012702e70
  [74376.073603] Call Trace:
  [74376.073610]  [a00953c6] mn_invl_range_start+0x46/0x90 
  [xen_gntdev]
  [74376.073620]  [811e88fb]
  __mmu_notifier_invalidate_range_start+0x5b/0x90
  [74376.073627]  [811c2a59] do_wp_page+0x769/0x820
  [74376.074031]  [811c4f5c] handle_mm_fault+0x7fc/0x10c0
  [74376.074031]  [813864cd] ? radix_tree_lookup+0xd/0x10
  [74376.074031]  [81061e1c] __do_page_fault+0x1dc/0x5a0
  [74376.074031]  [817560a6] ? mutex_lock+0x16/0x37
  [74376.074031]  [a0008928] ? evtchn_ioctl+0x118/0x3c0
  [xen_evtchn]
  [74376.074031]  [812209d8] ? do_vfs_ioctl+0x2f8/0x4f0
  [74376.074031]  [811cafdf] ? do_munmap+0x29f/0x3b0
  [74376.074031]  [81062211] do_page_fault+0x31/0x70
  [74376.074031]  [81759e28] page_fault+0x28/0x30


static void mn_invl_range_start(struct mmu_notifier *mn,
struct mm_struct *mm,
unsigned long start, unsigned long end)
{
struct gntdev_priv *priv = container_of(mn, struct gntdev_priv, mn);
struct grant_map *map;

spin_lock(priv-lock);
list_for_each_entry(map, priv-maps, next) {
unmap_if_in_range(map, start, end);
}

mn_invl_range_start+0x46 is the first call to unmap_if_in_range, so
something is wrong with 

Re: [Xen-devel] [PATCH] oxenstored: fix del_watches and del_transactions

2015-06-17 Thread Dave Scott

 On 17 Jun 2015, at 20:39, Wei Liu wei.l...@citrix.com wrote:
 
 The statement to reset nb_watches should be in del_watches, not
 del_transactions.
 
 Signed-off-by: Wei Liu wei.l...@citrix.com
 Cc: David Scott dave.sc...@citrix.com
 ---
 It was only until Ian applied previous patch that I discovered this
 copy-n-paste  error. Sorry about this.

Oh yes I see what you mean.

Acked-by: David Scott dave.sc...@citrix.com

Cheers,
Dave

 ---
 tools/ocaml/xenstored/connection.ml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
 
 diff --git a/tools/ocaml/xenstored/connection.ml 
 b/tools/ocaml/xenstored/connection.ml
 index 9de4978..935939a 100644
 --- a/tools/ocaml/xenstored/connection.ml
 +++ b/tools/ocaml/xenstored/connection.ml
 @@ -188,10 +188,10 @@ let del_watch con path token =
 
 let del_watches con =
   Hashtbl.clear con.watches
 +  con.nb_watches - 0
 
 let del_transactions con =
   Hashtbl.clear con.transactions;
 -  con.nb_watches - 0
 
 let list_watches con =
   let ll = Hashtbl.fold 
 -- 
 1.9.1
 


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [xen-unstable test] 58663: tolerable FAIL - PUSHED

2015-06-17 Thread osstest service user
flight 58663 xen-unstable real [real]
http://logs.test-lab.xenproject.org/osstest/logs/58663/

Failures :-/ but no regressions.

Tests which are failing intermittently (not blocking):
 test-armhf-armhf-xl-credit2   6 xen-boot   fail in 58618 pass in 58663
 test-amd64-i386-xl-qemuu-win7-amd64  9 windows-install  fail pass in 58618

Regressions which are regarded as allowable (not blocking):
 test-armhf-armhf-libvirt-xsm 11 guest-start   fail REGR. vs. 58392
 test-armhf-armhf-libvirt 11 guest-start  fail blocked in 58392
 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop fail in 58618 like 58392
 test-amd64-i386-libvirt-xsm  11 guest-start  fail   like 58392
 test-amd64-i386-libvirt  11 guest-start  fail   like 58392
 test-amd64-amd64-libvirt-xsm 11 guest-start  fail   like 58392
 test-amd64-amd64-libvirt 11 guest-start  fail   like 58392
 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stop fail like 58392

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-pvh-intel 11 guest-start  fail  never pass
 test-amd64-amd64-xl-pvh-amd  11 guest-start  fail   never pass
 test-armhf-armhf-xl-arndale  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-xsm  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 12 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-credit2  12 migrate-support-checkfail   never pass
 test-amd64-i386-xl-qemut-win7-amd64 16 guest-stop  fail never pass
 test-armhf-armhf-xl-cubietruck 12 migrate-support-checkfail never pass
 test-armhf-armhf-xl  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-sedf-pin 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-sedf 12 migrate-support-checkfail   never pass
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop fail never pass

version targeted for testing:
 xen  fcbfaf9d260adbdb9352d6300b9f63c4ed443d49
baseline version:
 xen  12e817e281034f5881f46e0e4f1d127820101a78


People who touched revisions under test:
  Andrew Cooper andrew.coop...@citrix.com
  Dario Faggioli dario.faggi...@citrix.com
  David Vrabel david.vra...@citrix.com
  Don Dugger donald.d.dug...@intel.com
  Don Slutz dsl...@verizon.com
  George Dunlap george.dun...@eu.citrix.com
  Ian Campbell ian.campb...@citrix.com
  Ian Jackson ian.jack...@eu.citrix.com
  Jan Beulich jbeul...@suse.com
  Juergen Gross jgr...@suse.com
  Kevin Tian kevin.t...@intel.com
  Olaf Hering o...@aepfle.de
  Roger Pau Monné roger@citrix.com
  Ross Lagerwall ross.lagerw...@citrix.com
  Wei Liu wei.l...@citrix.com


jobs:
 build-amd64-xsm  pass
 build-armhf-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-armhf-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-oldkern  pass
 build-i386-oldkern   pass
 build-amd64-pvopspass
 build-armhf-pvopspass
 build-i386-pvops pass
 build-amd64-rumpuserxen  pass
 build-i386-rumpuserxen   pass
 test-amd64-amd64-xl  pass
 test-armhf-armhf-xl  pass
 test-amd64-i386-xl   pass
 test-amd64-amd64-xl-qemut-debianhvm-amd64-xsmpass
 test-amd64-i386-xl-qemut-debianhvm-amd64-xsm pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsmpass
 test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm pass
 test-amd64-amd64-libvirt-xsm fail
 test-armhf-armhf-libvirt-xsm fail
 test-amd64-i386-libvirt-xsm  fail
 test-amd64-amd64-xl-xsm  pass
 test-armhf-armhf-xl-xsm  pass
 test-amd64-i386-xl-xsm   pass
 test-amd64-amd64-xl-pvh-amd  fail
 

[Xen-devel] [linux-3.4 test] 58676: regressions - FAIL

2015-06-17 Thread osstest service user
flight 58676 linux-3.4 real [real]
http://logs.test-lab.xenproject.org/osstest/logs/58676/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-xl   9 debian-install fail REGR. vs. 52209-bisect
 test-amd64-amd64-pair   15 debian-install/dst_host fail REGR. vs. 52715-bisect
 test-amd64-i386-pair15 debian-install/dst_host fail REGR. vs. 56366-bisect

Regressions which are regarded as allowable (not blocking):
 test-amd64-i386-libvirt   9 debian-installfail blocked in 56366-bisect
 test-amd64-i386-rumpuserxen-i386  6 xen-boot  fail blocked in 56366-bisect
 test-amd64-amd64-libvirt  9 debian-installfail blocked in 56366-bisect
 test-amd64-i386-xl-qemuu-winxpsp3-vcpus1 6 xen-boot fail blocked in 
56366-bisect
 test-amd64-i386-freebsd10-amd64  6 xen-boot   fail blocked in 56366-bisect
 test-amd64-amd64-xl-qemuu-debianhvm-amd64 6 xen-boot fail blocked in 
56366-bisect
 test-amd64-i386-xl-qemuu-winxpsp3  6 xen-boot fail blocked in 56366-bisect
 test-amd64-amd64-rumpuserxen-amd64  6 xen-bootfail blocked in 56366-bisect
 test-amd64-i386-qemut-rhel6hvm-intel  6 xen-boot  fail blocked in 56366-bisect
 test-amd64-amd64-xl-multivcpu  6 xen-boot fail blocked in 56366-bisect
 test-amd64-amd64-xl-qemut-winxpsp3  6 xen-bootfail blocked in 56366-bisect
 test-amd64-i386-xl-qemuu-ovmf-amd64  6 xen-boot   fail blocked in 56366-bisect
 test-amd64-i386-xl9 debian-installfail blocked in 56366-bisect
 test-amd64-amd64-xl-qemuu-winxpsp3  6 xen-bootfail blocked in 56366-bisect
 test-amd64-i386-libvirt-xsm   6 xen-boot  fail blocked in 56366-bisect
 test-amd64-i386-xl-qemut-winxpsp3  6 xen-boot fail blocked in 56366-bisect
 test-amd64-i386-xl-qemut-winxpsp3-vcpus1 6 xen-boot fail blocked in 
56366-bisect
 test-amd64-amd64-xl-sedf  9 debian-installfail blocked in 56366-bisect
 test-amd64-i386-xl-qemuu-debianhvm-amd64 6 xen-boot fail blocked in 
56366-bisect
 test-amd64-i386-freebsd10-i386  6 xen-bootfail blocked in 56366-bisect
 test-amd64-amd64-xl-qemut-debianhvm-amd64 6 xen-boot fail blocked in 
56366-bisect
 test-amd64-amd64-xl-sedf-pin  9 debian-installfail blocked in 56366-bisect
 test-amd64-i386-qemuu-rhel6hvm-intel  6 xen-boot  fail blocked in 56366-bisect
 test-amd64-i386-xl-qemut-debianhvm-amd64 6 xen-boot fail blocked in 
56366-bisect
 test-amd64-i386-xl-qemut-debianhvm-amd64-xsm 6 xen-boot fail blocked in 
56366-bisect
 test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm 6 xen-boot fail blocked in 
56366-bisect
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop fail blocked in 56366-bisect
 test-amd64-amd64-xl-qemut-win7-amd64  6 xen-boot  fail blocked in 56366-bisect
 test-amd64-i386-xl-qemut-win7-amd64 16 guest-stop fail blocked in 56366-bisect
 test-amd64-amd64-xl-qemut-debianhvm-amd64-xsm 6 xen-boot fail blocked in 
56366-bisect
 test-amd64-i386-xl-qemuu-win7-amd64 9 windows-install fail blocked in 
56366-bisect
 test-amd64-amd64-xl-qemuu-ovmf-amd64  6 xen-bootfail like 53709-bisect

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-xsm   9 debian-install   fail   never pass
 test-amd64-amd64-libvirt-xsm  9 debian-install   fail   never pass
 test-amd64-amd64-xl-pvh-amd   9 debian-install   fail   never pass
 test-amd64-i386-xl-xsm9 debian-install   fail   never pass
 test-amd64-amd64-xl-credit2   9 debian-install   fail   never pass
 test-amd64-amd64-xl-pvh-intel  9 debian-install   fail  never pass
 test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm 20 leak-check/check fail 
never pass
 test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm 20 leak-check/check fail 
never pass

version targeted for testing:
 linux56b48fcda5076d4070ab00df32ff5ff834e0be86
baseline version:
 linuxbb4a05a0400ed6d2f1e13d1f82f289ff74300a70


370 people touched revisions under test,
not listing them all


jobs:
 build-amd64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-i386-pvops pass
 build-amd64-rumpuserxen  pass
 build-i386-rumpuserxen   pass
 test-amd64-amd64-xl  fail
 test-amd64-i386-xl 

[Xen-devel] [qemu-mainline test] 58672: regressions - trouble: blocked/broken/fail/pass

2015-06-17 Thread osstest service user
flight 58672 qemu-mainline real [real]
http://logs.test-lab.xenproject.org/osstest/logs/58672/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-armhf-libvirt   4 host-build-prep   fail REGR. vs. 58551
 test-amd64-i386-xl-qemuu-win7-amd64  9 windows-installfail REGR. vs. 58551

Regressions which are regarded as allowable (not blocking):
 test-amd64-i386-libvirt  11 guest-start  fail   like 58551
 test-amd64-i386-libvirt-xsm  11 guest-start  fail   like 58551
 test-amd64-amd64-libvirt 11 guest-start  fail   like 58551

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-libvirt  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-pvh-amd  11 guest-start  fail   never pass
 test-amd64-amd64-xl-pvh-intel 11 guest-start  fail  never pass
 test-amd64-amd64-libvirt-xsm 11 guest-start  fail   never pass
 test-armhf-armhf-xl  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-sedf 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 12 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-cubietruck 12 migrate-support-checkfail never pass
 test-armhf-armhf-xl-arndale  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-sedf-pin 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-xsm  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  12 migrate-support-checkfail   never pass
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop fail never pass

version targeted for testing:
 qemuua09f4a9d19c500ea5cbcdc0bd7f0d540cf54f9f5
baseline version:
 qemuu0a2df857a7038c75379cc575de5d4be4c0ac629e


People who touched revisions under test:
  Alberto Garcia be...@igalia.com
  Alex Bennée alex.ben...@linaro.org
  Alexander Graf ag...@suse.de
  Andrew Jones drjo...@redhat.com
  Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com
  Aurelien Jarno aurel...@aurel32.net
  Aurelio C. Remonda aurelioremo...@gmail.com
  Benoit Canet benoit.ca...@nodalink.com
  Benoît Canet benoit.ca...@nodalink.com
  Christian Borntraeger borntrae...@de.ibm.com
  Cole Robinson crobi...@redhat.com
  Dave Airlie airl...@redhat.com
  Eric Auger eric.au...@linaro.org
  Fam Zheng f...@redhat.com
  Gerd Hoffmann kra...@redhat.com
  Johan Karlsson johan.karls...@enea.com
  John Snow js...@redhat.com
  Kevin Wolf kw...@redhat.com
  Kővágó, Zoltán dirty.ice...@gmail.com
  Laurent Vivier laur...@vivier.eu
  Mao Chuan Li maoch...@linux.vnet.ibm.com
  Max Reitz mre...@redhat.com
  Michael S. Tsirkin m...@redhat.com
  Pavel Fedin p.fe...@samsung.com
  Peter Crosthwaite peter.crosthwa...@xilinx.com
  Peter Maydell peter.mayd...@linaro.org
  Riku Voipio riku.voi...@linaro.org
  Sergey Fedorov serge.f...@gmail.com
  Shannon Zhao shannon.z...@linaro.org
  Shannon Zhao zhaoshengl...@huawei.com
  Shlomo Pongratz shlomo.pongr...@huawei.com
  Stefan Hajnoczi stefa...@redhat.com
  Thierry Bultel thierry.bul...@basystemes.fr
  Xu Wang gesa...@linux.vnet.ibm.com
  Yongbok Kim yongbok@imgtec.com


jobs:
 build-amd64-xsm  pass
 build-armhf-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-armhf-libvirt  broken
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-armhf-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-xl  pass
 test-armhf-armhf-xl  pass
 test-amd64-i386-xl   pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsmpass
 test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm pass
 test-amd64-amd64-libvirt-xsm fail
 test-armhf-armhf-libvirt-xsm blocked
 test-amd64-i386-libvirt-xsm  fail
 test-amd64-amd64-xl-xsm  pass
 test-armhf-armhf-xl-xsm  pass
 

[Xen-devel] [linux-next test] 58666: regressions - FAIL

2015-06-17 Thread osstest service user
flight 58666 linux-next real [real]
http://logs.test-lab.xenproject.org/osstest/logs/58666/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-armhf-armhf-xl-credit2  15 guest-start/debian.repeat fail REGR. vs. 58620

Regressions which are regarded as allowable (not blocking):
 test-amd64-i386-freebsd10-amd64  9 freebsd-install fail like 58620
 test-amd64-i386-freebsd10-i386  9 freebsd-install  fail like 58620
 test-amd64-i386-libvirt  11 guest-start  fail   like 58620
 test-amd64-amd64-libvirt 11 guest-start  fail   like 58620
 test-amd64-i386-libvirt-xsm  11 guest-start  fail   like 58620
 test-armhf-armhf-libvirt-xsm 11 guest-start  fail   like 58620
 test-armhf-armhf-libvirt 11 guest-start  fail   like 58620
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop fail like 58620
 test-amd64-i386-xl-qemut-win7-amd64 16 guest-stop  fail like 58620
 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop  fail like 58620

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-pvh-amd  11 guest-start  fail   never pass
 test-amd64-amd64-xl-pvh-intel 13 guest-saverestorefail  never pass
 test-armhf-armhf-xl-arndale  12 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-sedf-pin 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-xsm  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 12 migrate-support-checkfail  never pass
 test-armhf-armhf-xl  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 12 migrate-support-checkfail never pass
 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stop fail never pass
 test-armhf-armhf-xl-sedf 12 migrate-support-checkfail   never pass

version targeted for testing:
 linuxca3cfaa9e7db530b14e4a2d98a9310130152fdb8
baseline version:
 linux0f57d86787d8b1076ea8f9cba2a46d534a27

jobs:
 build-amd64-xsm  pass
 build-armhf-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-armhf-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-armhf-pvopspass
 build-i386-pvops pass
 build-amd64-rumpuserxen  pass
 build-i386-rumpuserxen   pass
 test-amd64-amd64-xl  pass
 test-armhf-armhf-xl  pass
 test-amd64-i386-xl   pass
 test-amd64-amd64-xl-qemut-debianhvm-amd64-xsmpass
 test-amd64-i386-xl-qemut-debianhvm-amd64-xsm pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsmpass
 test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm pass
 test-amd64-amd64-libvirt-xsm pass
 test-armhf-armhf-libvirt-xsm fail
 test-amd64-i386-libvirt-xsm  fail
 test-amd64-amd64-xl-xsm  pass
 test-armhf-armhf-xl-xsm  pass
 test-amd64-i386-xl-xsm   pass
 test-amd64-amd64-xl-pvh-amd  fail
 test-amd64-i386-qemut-rhel6hvm-amd   pass
 test-amd64-i386-qemuu-rhel6hvm-amd   pass
 test-amd64-amd64-xl-qemut-debianhvm-amd64pass
 test-amd64-i386-xl-qemut-debianhvm-amd64 pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64pass
 test-amd64-i386-xl-qemuu-debianhvm-amd64 pass
 test-amd64-i386-freebsd10-amd64  fail
 test-amd64-amd64-xl-qemuu-ovmf-amd64 pass
 test-amd64-i386-xl-qemuu-ovmf-amd64  pass
 test-amd64-amd64-rumpuserxen-amd64  

Re: [Xen-devel] [PATCH 1/2 linux-next] Revert ufs: fix deadlocks introduced by sb mutex merge

2015-06-17 Thread Al Viro
On Wed, Jun 17, 2015 at 10:57:15AM +0200, Jan Kara wrote:
  Joy...  Folks, is anybody actively maintaining fs/ufs these days?
 
 Looking into the changelog there wasn't anyone seriously looking into UFS
 for at least 5-6 years... Fabian did some cleanups recently but they were
 mostly cosmetic. So I don't think it's really maintained. OTOH clearly
 there are some users since occasionally someone comes with a bug report.
 Welcome to the world where we have ~50 on disk / network filesystems :-|

FWIW, I've a partial rewrite of that area (completely untested, etc.)
in vfs.git#ufs; it's still in progress, but hopefully it'll be done
by tomorrow.  Basically, it switches the sucker to locking scheme
similar to ext2 (modulo seqlock instead of rwlock for protection of
block pointers, having to be more careful due to 64bit assignments not
being atomic and being unable to fall back to -truncate_mutex in
case of race on the read side) and starts cleaning the hell out of
the truncate (and eventually create side of get_block as well).

As far as I can see, the root of the problems with that sucker is the
misplaced handling of tail-packing logics.  That had ballooned into
a lot of convoluted codepaths ;-/  I'm not blaming Evgeniy - it *is*
painful to massage into saner shape and the damn thing kept missing
cleanups that were done to similar filesystems due to that...

One thing I'm somewhat unhappy about in what Evgeniy had done is dealing with
UFS vs. UFS2 differences at such a low level.  Sure, the way we did endianness
handling in there provoked exactly that, but it might have been better to
go the other way round; see e.g. fs/minix/itree*.c for opposite approach.

All this stuff is currently completely untested; I'll be using generic
parts of xfstests for beating it up, but any assistance would be welcome.
Note: all testing I'll be realistically able to do will be for FreeBSD
UFS variants - I don't have Solaris left on any boxen, to start with...

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [ovmf test] 58681: tolerable FAIL - PUSHED

2015-06-17 Thread osstest service user
flight 58681 ovmf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/58681/

Failures :-/ but no regressions.

Regressions which are regarded as allowable (not blocking):
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop fail like 58640
 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop  fail like 58640

version targeted for testing:
 ovmf cc63add853d4b5c16b74b4d4ceb59c015a2adcc4
baseline version:
 ovmf ffbb1b25402c38d25cbbfb059139e76900bdc843


People who touched revisions under test:
  Dandan Bi dandan...@intel.com
  Qiu Shumin shumin@intel.com
  Star Zeng star.z...@intel.com


jobs:
 build-amd64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsmpass
 test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm pass
 test-amd64-i386-qemuu-rhel6hvm-amd   pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64pass
 test-amd64-i386-xl-qemuu-debianhvm-amd64 pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64 pass
 test-amd64-i386-xl-qemuu-ovmf-amd64  pass
 test-amd64-amd64-xl-qemuu-win7-amd64 fail
 test-amd64-i386-xl-qemuu-win7-amd64  fail
 test-amd64-i386-qemuu-rhel6hvm-intel pass
 test-amd64-i386-xl-qemuu-winxpsp3-vcpus1 pass
 test-amd64-amd64-xl-qemuu-winxpsp3   pass
 test-amd64-i386-xl-qemuu-winxpsp3pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

+ branch=ovmf
+ revision=cc63add853d4b5c16b74b4d4ceb59c015a2adcc4
+ . cri-lock-repos
++ . cri-common
+++ . cri-getconfig
+++ umask 002
+++ getconfig Repos
+++ perl -e '
use Osstest;
readglobalconfig();
print $c{Repos} or die $!;
'
++ repos=/home/osstest/repos
++ repos_lock=/home/osstest/repos/lock
++ '[' x '!=' x/home/osstest/repos/lock ']'
++ OSSTEST_REPOS_LOCK_LOCKED=/home/osstest/repos/lock
++ exec with-lock-ex -w /home/osstest/repos/lock ./ap-push ovmf 
cc63add853d4b5c16b74b4d4ceb59c015a2adcc4
+ branch=ovmf
+ revision=cc63add853d4b5c16b74b4d4ceb59c015a2adcc4
+ . cri-lock-repos
++ . cri-common
+++ . cri-getconfig
+++ umask 002
+++ getconfig Repos
+++ perl -e '
use Osstest;
readglobalconfig();
print $c{Repos} or die $!;
'
++ repos=/home/osstest/repos
++ repos_lock=/home/osstest/repos/lock
++ '[' x/home/osstest/repos/lock '!=' x/home/osstest/repos/lock ']'
+ . cri-common
++ . cri-getconfig
++ umask 002
+ select_xenbranch
+ case $branch in
+ tree=ovmf
+ xenbranch=xen-unstable
+ '[' xovmf = xlinux ']'
+ linuxbranch=
+ '[' x = x ']'
+ qemuubranch=qemu-upstream-unstable
+ : tested/2.6.39.x
+ . ap-common
++ : osst...@xenbits.xen.org
+++ getconfig OsstestUpstream
+++ perl -e '
use Osstest;
readglobalconfig();
print $c{OsstestUpstream} or die $!;
'
++ :
++ : git://xenbits.xen.org/xen.git
++ : osst...@xenbits.xen.org:/home/xen/git/xen.git
++ : git://xenbits.xen.org/staging/qemu-xen-unstable.git
++ : git://git.kernel.org
++ : git://git.kernel.org/pub/scm/linux/kernel/git
++ : git
++ : git://xenbits.xen.org/libvirt.git
++ : osst...@xenbits.xen.org:/home/xen/git/libvirt.git
++ : git://xenbits.xen.org/libvirt.git
++ : git://xenbits.xen.org/rumpuser-xen.git
++ : git
++ : git://xenbits.xen.org/rumpuser-xen.git
++ : osst...@xenbits.xen.org:/home/xen/git/rumpuser-xen.git
+++ besteffort_repo https://github.com/rumpkernel/rumpkernel-netbsd-src
+++ local repo=https://github.com/rumpkernel/rumpkernel-netbsd-src
+++ cached_repo https://github.com/rumpkernel/rumpkernel-netbsd-src 
'[fetch=try]'
+++ local repo=https://github.com/rumpkernel/rumpkernel-netbsd-src
+++ local 'options=[fetch=try]'
 getconfig GitCacheProxy
 perl -e '
use Osstest;

Re: [Xen-devel] [PATCH RFC v1] libxl: Introduce a template for devices with a controller

2015-06-17 Thread Chun Yan Liu


 On 6/17/2015 at 07:06 PM, in message 
 1434539195.13744.321.ca...@citrix.com,
Ian Campbell ian.campb...@citrix.com wrote: 
 On Thu, 2015-05-21 at 18:07 +0100, George Dunlap wrote: 
  We have several outstanding patch series which add devices that have 
  two levels: a controller and individual devices attached to that 
  controller. 
   
  In the interest of consistency, this patch introduces a section that 
  sketches out a template for interfaces for such devices. 
  
 Chun Yan and Jeurgen: 
  
 I was hoping we could come to some sort of agreement on this such that 
 it can be used as the basis for both the pvusb and pvscsi interfaces. As 
 such your feedback here is important... 
  
   
  Signed-off-by: George Dunlap george.dun...@eu.citrix.com 
  --- 
  CC: Ian Campbell ian.campb...@citrix.com 
  CC: Ian Jackson ian.jack...@citrix.com 
  CC: Wei Liu wei.l...@citrix.com 
  CC: Juergen Gross jgr...@suse.com 
  CC: Chun Yan Liu cy...@suse.com 
  CC: Olaf Hering o...@aepfle.de 
   
  So, this is definitely RFC -- I tried to spec things out in a way that 
  made sense, but I often just chose something that I thought would be a 
  sensible starting point for discussion. 
   
  This spec looks a lot more like the PVUSB spec than the PVSCSI spec, 
  in part because I think the PVUSB spec has already had a lot more 
  thought that's gone into it. 
   
  A couple of random points to discuss: 
   
  * Calling things controllers, using typectrl for the device name, 
and using ctrl as the field name for the devid of the controller 
in the individual devices. 
   
  * I've said that having an index (port, lun, whatever) is optional. 
Do we want to make that requred?  Do we want it to have a consistent 
name?  In the case of emulated USB, we can't really specify to qemu 
what port the device gets attached to, so I'm tempted to say it's 
not required; but even there we could always give it a port number 
just for name's sake. 
   
  * Naming sub-devices.  We need to have a way to uniquely name both 
controllers and subdevices.  Here I've said that we will have both 
typectrl and type devid namespaces, mainly because in the 
previous point I opted not to require an index.  Another option 
would be not to have another devid namespace, but to use 
ctrl,index as the unique identifier.  (This would mean requiring 
the index/port/lun specification above.) 
   
  * libxl_device_type_list listing devices across all controllers.  I 
think this is the most practical thing to do, but one could imagine 
wanting to query by controller ID instead. 
   
  Feedback welcome. 
  --- 
   tools/libxl/libxl.h | 46 ++ 
   1 file changed, 46 insertions(+) 
   
  diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h 
  index 2ed7194..d757845 100644 
  --- a/tools/libxl/libxl.h 
  +++ b/tools/libxl/libxl.h 
  @@ -1234,6 +1234,52 @@ void libxl_vtpminfo_list_free(libxl_vtpminfo *, int  
 nr_vtpms); 
* 
*   This function does not interact with the guest and therefore 
*   cannot block on the guest. 
  + * 
  + * Controllers 
  + * --- 
  + * 
  + * Most devices are treated individually.  Some devices however, like 
  + * USB or SCSI, inherently have the need to have busses or 
  + * controllers to which individual devices can be attached. 
  + * 
  + * In that case, for each type, there will be two sets of the 
  + * functions, types, and devid namespaces outlined above: one based on 
  + * 'type', and one based on 'typectrl'. 
  + * 
  + * In those cases, libxl_device_typectrl_function will act more or 
  + * less like top-level non-bus devices: they will either create or 
  + * accept a libxl_devid which will be unique within the 
  + * typectrl libxl_devid namespace. 
  + * 
  + * Second-level devices which will be attached to a controller will 
  + * include in their libxl_device_type a field called ctrl, which 
  + * will be the libxl_devid of the corresponding controller.  It may also 
  + * include an index onto that bus, that represents (for example) a USB 
  + * port or a SCSI LUN. 
  + * 
  + * These second-level devices will also have their own devid which 
  + * will be unique within the type devid namespace, and will be used 
  + * for queries or removals.

All other description is agreed except here:
For pvusb, currently we uses ctrl, port instead of devid. It seems to be
more straightforward. To add a USB device info to xenstore, it only writes
the USB busid to controller/port. To remove a USB device, just remove
USB busid from xenstore controller/port.

- Chunyan
 
  + * 
  + * In the case where there are multiple different ways to implement a 
  + * given device -- for instance, one which is fully PV and one which 
  + * uses an emulator -- the controller will contain a field which 
  + * specifies what type of implementation is used.  The implementations 
  + * of individual devices will be known by 

Re: [Xen-devel] [PATCH 23/27] tools/libxl: [RFC] Write checkpoint records into the stream

2015-06-17 Thread Wen Congyang
On 06/15/2015 09:44 PM, Andrew Cooper wrote:
 when signalled to do so by libxl__remus_domain_checkpoint_callback()
 
 Signed-off-by: Andrew Cooper andrew.coop...@citrix.com
 CC: Ian Campbell ian.campb...@citrix.com
 CC: Ian Jackson ian.jack...@eu.citrix.com
 CC: Wei Liu wei.l...@citrix.com
 ---
  tools/libxl/libxl_dom.c  |   16 +++---
  tools/libxl/libxl_internal.h |7 +++
  tools/libxl/libxl_stream_write.c |  111 
 --
  3 files changed, 121 insertions(+), 13 deletions(-)
 
 diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
 index 06bfaab..3597a91 100644
 --- a/tools/libxl/libxl_dom.c
 +++ b/tools/libxl/libxl_dom.c
 @@ -1867,8 +1867,8 @@ static void remus_devices_preresume_cb(libxl__egc *egc,
  
  /*- remus asynchronous checkpoint callback -*/
  
 -static void remus_checkpoint_dm_saved(libxl__egc *egc,
 -  libxl__domain_suspend_state *dss, int 
 rc);
 +static void remus_checkpoint_stream_written(
 +libxl__egc *egc, libxl__domain_suspend_state *dss, int rc);
  static void remus_devices_commit_cb(libxl__egc *egc,
  libxl__remus_devices_state *rds,
  int rc);
 @@ -1882,16 +1882,11 @@ static void 
 libxl__remus_domain_checkpoint_callback(void *data)
  libxl__egc *egc = dss-shs.egc;
  STATE_AO_GC(dss-ao);
  
 -/* This would go into tailbuf. */
 -if (dss-hvm) {
 -libxl__domain_save_device_model(egc, dss, remus_checkpoint_dm_saved);
 -} else {
 -remus_checkpoint_dm_saved(egc, dss, 0);
 -}
 +libxl__stream_write_start_checkpoint(egc, dss-sws);
  }
  
 -static void remus_checkpoint_dm_saved(libxl__egc *egc,
 -  libxl__domain_suspend_state *dss, int 
 rc)
 +static void remus_checkpoint_stream_written(
 +libxl__egc *egc, libxl__domain_suspend_state *dss, int rc)
  {
  /* Convenience aliases */
  libxl__remus_devices_state *const rds = dss-rds;
 @@ -2036,6 +2031,7 @@ void libxl__domain_suspend(libxl__egc *egc, 
 libxl__domain_suspend_state *dss)
  callbacks-suspend = libxl__remus_domain_suspend_callback;
  callbacks-postcopy = libxl__remus_domain_resume_callback;
  callbacks-checkpoint = libxl__remus_domain_checkpoint_callback;
 +dss-sws.checkpoint_callback = remus_checkpoint_stream_written;
  } else
  callbacks-suspend = libxl__domain_suspend_callback;
  
 diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
 index 82cd792..bf1c377 100644
 --- a/tools/libxl/libxl_internal.h
 +++ b/tools/libxl/libxl_internal.h
 @@ -2879,17 +2879,24 @@ struct libxl__stream_write_state {
  void (*completion_callback)(libxl__egc *egc,
  libxl__domain_suspend_state *dss,
  int rc);
 +void (*checkpoint_callback)(libxl__egc *egc,
 +libxl__domain_suspend_state *dss,
 +int rc);
  /* Private */
  int rc;
  int joined_rc;
  size_t padding;
  bool running;
 +bool in_checkpoint;
  libxl__datacopier_state dc;
  };
  
  _hidden void libxl__stream_write_start(libxl__egc *egc,
 libxl__stream_write_state *stream);
  
 +_hidden void libxl__stream_write_start_checkpoint(
 +libxl__egc *egc, libxl__stream_write_state *stream);
 +
  _hidden void libxl__stream_write_abort(libxl__egc *egc,
 libxl__stream_write_state *stream,
 int rc);
 diff --git a/tools/libxl/libxl_stream_write.c 
 b/tools/libxl/libxl_stream_write.c
 index d28a8a5..40f2cb7 100644
 --- a/tools/libxl/libxl_stream_write.c
 +++ b/tools/libxl/libxl_stream_write.c
 @@ -23,6 +23,9 @@
   *  - libxl__stream_write_start()
   * - Start writing a stream from the start.
   *
 + *  - libxl__stream_write_start()
 + * - Write the records which form a checkpoint into a stream.
 + *
   * In normal operation, there are two tasks running at once; this stream
   * processing, and the the libxl-save-helper.  check_stream_finished() is 
 used
   * to join all the tasks in both success and error cases.
 @@ -39,6 +42,12 @@
   *  - Toolstack record
   *  - if (hvm), Qemu record
   *  - End record
 + *
 + * For checkpointed stream, there is a second loop which is triggered by a
 + * save-helper checkpoint callback.  It writes:
 + *  - Toolstack record
 + *  - if (hvm), Qemu record
 + *  - Checkpoint end record
   */
  
  static const uint8_t zero_padding[1U  REC_ALIGN_ORDER] = { 0 };
 @@ -81,6 +90,16 @@ static void end_record_done(libxl__egc *egc,
  libxl__datacopier_state *dc,
  int onwrite, int errnoval);
  
 +/* Event callbacks unique to checkpointed streams. */
 +static void checkpoint_done(libxl__egc *egc,
 +  

Re: [Xen-devel] [PATCH 03/27] tools/libxl: Stash all restore parameters in domain_create_state

2015-06-17 Thread Yang Hongyang



On 06/15/2015 09:44 PM, Andrew Cooper wrote:

Shortly more parameters will appear, and this saves unboxing each one.

No functional change.

Signed-off-by: Andrew Cooper andrew.coop...@citrix.com
CC: Ian Campbell ian.campb...@citrix.com
CC: Ian Jackson ian.jack...@eu.citrix.com
CC: Wei Liu wei.l...@citrix.com


Reviewed-by: Yang Hongyang yan...@cn.fujitsu.com


---
  tools/libxl/libxl_create.c   |   12 ++--
  tools/libxl/libxl_internal.h |2 +-
  tools/libxl/libxl_save_callout.c |2 +-
  3 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index 86384d2..385891c 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -1577,8 +1577,8 @@ static void domain_create_cb(libxl__egc *egc,
   int rc, uint32_t domid);

  static int do_domain_create(libxl_ctx *ctx, libxl_domain_config *d_config,
-uint32_t *domid,
-int restore_fd, int checkpointed_stream,
+uint32_t *domid, int restore_fd,
+const libxl_domain_restore_params *params,
  const libxl_asyncop_how *ao_how,
  const libxl_asyncprogress_how *aop_console_how)
  {
@@ -1591,8 +1591,8 @@ static int do_domain_create(libxl_ctx *ctx, 
libxl_domain_config *d_config,
  libxl_domain_config_init(cdcs-dcs.guest_config_saved);
  libxl_domain_config_copy(ctx, cdcs-dcs.guest_config_saved, d_config);
  cdcs-dcs.restore_fd = restore_fd;
+if (params) cdcs-dcs.restore_params = *params;
  cdcs-dcs.callback = domain_create_cb;
-cdcs-dcs.checkpointed_stream = checkpointed_stream;
  libxl__ao_progress_gethow(cdcs-dcs.aop_console_how, aop_console_how);
  cdcs-domid_out = domid;

@@ -1619,7 +1619,7 @@ int libxl_domain_create_new(libxl_ctx *ctx, 
libxl_domain_config *d_config,
  const libxl_asyncop_how *ao_how,
  const libxl_asyncprogress_how *aop_console_how)
  {
-return do_domain_create(ctx, d_config, domid, -1, 0,
+return do_domain_create(ctx, d_config, domid, -1, NULL,
  ao_how, aop_console_how);
  }

@@ -1629,8 +1629,8 @@ int libxl_domain_create_restore(libxl_ctx *ctx, 
libxl_domain_config *d_config,
  const libxl_asyncop_how *ao_how,
  const libxl_asyncprogress_how 
*aop_console_how)
  {
-return do_domain_create(ctx, d_config, domid, restore_fd,
-params-checkpointed_stream, ao_how, 
aop_console_how);
+return do_domain_create(ctx, d_config, domid, restore_fd, params,
+ao_how, aop_console_how);
  }

  /*
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 6226c18..796bd21 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -3122,11 +3122,11 @@ struct libxl__domain_create_state {
  libxl_domain_config *guest_config;
  libxl_domain_config guest_config_saved; /* vanilla config */
  int restore_fd;
+libxl_domain_restore_params restore_params;
  libxl__domain_create_cb *callback;
  libxl_asyncprogress_how aop_console_how;
  /* private to domain_create */
  int guest_domid;
-int checkpointed_stream;
  libxl__domain_build_state build_state;
  libxl__bootloader_state bl;
  libxl__stub_dm_spawn_state dmss;
diff --git a/tools/libxl/libxl_save_callout.c b/tools/libxl/libxl_save_callout.c
index 40b25e4..3585a84 100644
--- a/tools/libxl/libxl_save_callout.c
+++ b/tools/libxl/libxl_save_callout.c
@@ -59,7 +59,7 @@ void libxl__xc_domain_restore(libxl__egc *egc, 
libxl__domain_create_state *dcs,
  state-store_domid, state-console_port,
  state-console_domid,
  hvm, pae, superpages,
-cbflags, dcs-checkpointed_stream,
+cbflags, dcs-restore_params.checkpointed_stream,
  };

  dcs-shs.ao = ao;



--
Thanks,
Yang.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [xen-4.2-testing test] 58678: FAIL

2015-06-17 Thread osstest service user
flight 58678 xen-4.2-testing real [real]
http://logs.test-lab.xenproject.org/osstest/logs/58678/

Failures and problems with tests :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-amd64-libvirt  3 host-install(3) broken in 58584 REGR. vs. 58411

Tests which are failing intermittently (not blocking):
 test-i386-i386-libvirt3 host-install(3)  broken in 58584 pass in 58678
 test-amd64-i386-pair  4 host-install/dst_host(4) broken in 58584 pass in 58678
 test-amd64-i386-qemuu-freebsd10-amd64 3 host-install(3) broken in 58584 pass 
in 58678
 test-amd64-amd64-pair 3 host-install/src_host(3) broken in 58584 pass in 58678
 test-amd64-i386-xl-win7-amd64  3 host-install(3) broken in 58584 pass in 58678
 test-amd64-i386-xl-winxpsp3-vcpus1 3 host-install(3) broken in 58584 pass in 
58678
 test-amd64-amd64-xl-qemut-winxpsp3 3 host-install(3) broken in 58584 pass in 
58678
 test-i386-i386-xl-qemut-winxpsp3 3 host-install(3) broken in 58616 pass in 
58678
 test-amd64-amd64-xl-qemuu-ovmf-amd64  6 xen-boot   fail in 58584 pass in 58678
 test-amd64-amd64-xl-qemuu-win7-amd64 14 guest-localmigrate.2 fail in 58584 
pass in 58678
 test-amd64-i386-xl-qemuu-win7-amd64 15 guest-localmigrate/x10 fail in 58631 
pass in 58616
 test-i386-i386-libvirt9 debian-install fail in 58631 pass in 58678
 test-amd64-amd64-xl-win7-amd64 16 guest-stopfail pass in 58584
 test-amd64-i386-xl-qemuu-win7-amd64 12 guest-localmigrate   fail pass in 58631

Regressions which are regarded as allowable (not blocking):
 test-amd64-i386-xl-qemut-win7-amd64 16 guest-stop  fail like 58411
 test-amd64-i386-xl-win7-amd64 16 guest-stop   fail  like 58411
 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stop fail like 58411

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt  1 build-check(1)blocked in 58584 n/a
 test-amd64-amd64-rumpuserxen-amd64  1 build-check(1)   blocked n/a
 test-i386-i386-rumpuserxen-i386  1 build-check(1)   blocked  n/a
 test-amd64-i386-rumpuserxen-i386  1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop fail in 58616 never pass
 test-amd64-i386-xl-qemuu-ovmf-amd64  9 debian-hvm-install  fail never pass
 build-amd64-rumpuserxen   5 rumpuserxen-buildfail   never pass
 build-i386-rumpuserxen5 rumpuserxen-buildfail   never pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64  9 debian-hvm-install fail never pass
 test-amd64-amd64-libvirt 12 migrate-support-checkfail   never pass
 test-i386-i386-libvirt   12 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  12 migrate-support-checkfail   never pass
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop fail never pass
 test-amd64-i386-xend-winxpsp3 20 leak-check/check fail  never pass
 test-amd64-i386-xend-qemut-winxpsp3 20 leak-check/checkfail never pass

version targeted for testing:
 xen  97134c441d6d81ba0d7cdcfdc4d8315115b99dce
baseline version:
 xen  21a8344ca38a2797a13b4bf57031b6f49ae12ccb


People who touched revisions under test:
  Ian Campbell ian.campb...@citrix.com
  Ian Jackson ian.jack...@eu.citrix.com
  Jan Beulich jbeul...@suse.com
  Stefano Stabellini stefano.stabell...@eu.citrix.com


jobs:
 build-amd64  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-i386-pvops pass
 build-amd64-rumpuserxen  fail
 build-i386-rumpuserxen   fail
 test-amd64-amd64-xl  pass
 test-amd64-i386-xl   pass
 test-i386-i386-xlpass
 test-amd64-i386-rhel6hvm-amd pass
 test-amd64-i386-qemut-rhel6hvm-amd   pass
 test-amd64-i386-qemuu-rhel6hvm-amd   pass
 test-amd64-amd64-xl-qemut-debianhvm-amd64pass
 test-amd64-i386-xl-qemut-debianhvm-amd64 pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64pass
 test-amd64-i386-xl-qemuu-debianhvm-amd64 pass
 test-amd64-i386-qemuu-freebsd10-amd64pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64 fail 

Re: [Xen-devel] [PATCH 01/44] kernel: Add support for poweroff handler call chain

2015-06-17 Thread Stephen Boyd
On 10/06/2014 10:28 PM, Guenter Roeck wrote:
 Various drivers implement architecture and/or device specific means to
 remove power from the system.  For the most part, those drivers set the
 global variable pm_power_off to point to a function within the driver.

 This mechanism has a number of drawbacks.  Typically only one scheme
 to remove power is supported (at least if pm_power_off is used).
 At least in theory there can be multiple means remove power, some of
 which may be less desirable. For example, some mechanisms may only
 power off the CPU or the CPU card, while another may power off the
 entire system.  Others may really just execute a restart sequence
 or drop into the ROM monitor. Using pm_power_off can also be racy
 if the function pointer is set from a driver built as module, as the
 driver may be in the process of being unloaded when pm_power_off is
 called. If there are multiple poweroff handlers in the system, removing
 a module with such a handler may inadvertently reset the pointer to
 pm_power_off to NULL, leaving the system with no means to remove power.

 Introduce a system poweroff handler call chain to solve the described
 problems.  This call chain is expected to be executed from the
 architecture specific machine_power_off() function.  Drivers providing
 system poweroff functionality are expected to register with this call chain.
 By using the priority field in the notifier block, callers can control
 poweroff handler execution sequence and thus ensure that the poweroff
 handler with the optimal capabilities to remove power for a given system
 is called first.

What happened to this series? I want to add shutdown support to my
platform and I need to write a register on the PMIC in one driver to
configure it for shutdown instead of restart and then write an MMIO
register to tell the PMIC to actually do the shutdown in another driver.
It seems that the notifier solves this case for me, albeit with the
slight complication that I need to order the two with some priority.

I'm also considering putting the PMIC configuration part into the reboot
notifier chain, because it only does things to change the configuration
and not actually any shutdown/restart itself. That removes any
requirement to get the priority of notifiers right. This series will
still be useful for the MMIO register that needs to be toggled though.
Right now I have to assign pm_power_off or hook the reboot notifier with
a different priority to make this work.

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 3.4 000/172] 3.4.108-rc1 review

2015-06-17 Thread Zefan Li
On 2015/6/16 23:13, Ian Campbell wrote:
 On Tue, 2015-06-16 at 16:33 +0800, l...@kernel.org wrote:
 From: Zefan Li lize...@huawei.com

 This is the start of the stable review cycle for the 3.4.108 release.
 There are 172 patches in this series, all will be posted as a response
 to this one.  If anyone has any issues with these being applied, please
 let me know.

 Responses should be made by Thu Jun 18 08:30:58 UTC 2015.
 Anything received after that time might be too late.
 
 Would it be possible to also include 31a418986a58 xen: netback: read
 hotplug script once at start of day. which has started trickling into
 other stable branches already, please.
 
 If not now then for 109.
 

Queued up for 3.4.108. :)

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH RFC v1] libxl: Introduce a template for devices with a controller

2015-06-17 Thread Olaf Hering
On Wed, Jun 17, Ian Campbell wrote:

 So, Olaf, ping...

I will return to pvscsi work next week.

Olaf

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v3] tools/libxc: Batch memory allocations for PV guests

2015-06-17 Thread Ian Campbell
On Wed, 2015-06-17 at 12:22 +0100, Ross Lagerwall wrote:
 On 06/17/2015 12:14 PM, Ian Campbell wrote:
  On Mon, 2015-06-15 at 17:08 +0100, Wei Liu wrote:
  On Mon, Jun 15, 2015 at 11:12:07AM +0100, Ross Lagerwall wrote:
  The current code for allocating memory for PV guests batches the
  hypercalls to allocate memory by allocating 1024*1024 extents of order 0
  at a time. To make this faster, first try allocating extents of order 9
  (2 MiB) before falling back to the order 0 allocating if the order 9
  allocation fails.
 
  On my test machine this reduced the time to start a 128 GiB PV guest by
  about 60 seconds.
 
   From 61s or from 600s?
 
 It reduced the total time _by_ 60 seconds.

I know, my question was whether 60s was a significant or insignificant
fraction of the time take to start a guest. Seems like it was pretty
significant.

  I think it takes about 10 
 seconds to start the guest now, but it is not related to the size of 
 guest memory.
 
 Regards



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCHv3 0/6] evtchn: Improve scalebility

2015-06-17 Thread David Vrabel
The per-domain event channel lock limits scalability when many VCPUs
are trying to send interdomain events.  A per-channel lock is
introduced eliminating any lock contention when sending an event.

See this graph for the performance improvements:

  http://xenbits.xen.org/people/dvrabel/evtchn-scalability.png

A different test (using Linux's evtchn device which masks/unmasks
event channels) showed the following lock profile improvements:

Per-domain lock:
(XEN)   lock:69267976(0004:19830041), block:
2407(0002:3C7C5C96)

Per-event channel lock 
(XEN)   lock:  686530(:076AF5F6), block:
1787(:000B4D22)

Locking removed from evtchn_unmask():
(XEN)   lock:   10769(:00512999), block:  
99(:9491)

v3:
- Clear xen_consumer when clearing state.
- Defer freeing struct evtchn's until evtchn_destroy_final().
- Remove redundant d-evtchn test from port_is_valid().
- Use port_is_valid() again.
- Drop event lock from notify_via_xen_event_channel().

v2:
- Use unsigned int for d-valid_evtchns.
- Compare channel pointers in double_evtchn_lock().

David


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 for Xen 4.6 3/4] libxl: enabling XL to set per-VCPU parameters of a domain for RTDS scheduler

2015-06-17 Thread Ian Campbell
On Mon, 2015-06-15 at 12:12 +0200, Dario Faggioli wrote:
 A parameter is either per-domain or per-vcpu, no matter how the user try
 to set or get it. In RTDS, all parameters are per-domain now and, with
 your work, all of them are becoming per-vcpu, and that's ok. But then,
 per-dom parameters should just no longer exist.

Are you saying there is going to be no domain wide default for a given
per-vcpu parameter? 

if that is the case then what happens if I hotplug a new VCPU without
settings its per-vcpu properties?

I expected something like.

domain_params_set (weight=10)
add vcpu0 (weight will be 10)
add vcpu1 (weight will be 10)
vcpu_params_set(0, weight=20)
add vcpu2 (weight will be 10, from domain wide default)

If not that then what weight should vcpu2 have at this point?

Ian.


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 for Xen 4.6 3/4] libxl: enabling XL to set per-VCPU parameters of a domain for RTDS scheduler

2015-06-17 Thread Dario Faggioli
On Wed, 2015-06-17 at 13:14 +0100, Ian Campbell wrote:
 On Mon, 2015-06-15 at 12:12 +0200, Dario Faggioli wrote:
  A parameter is either per-domain or per-vcpu, no matter how the user try
  to set or get it. In RTDS, all parameters are per-domain now and, with
  your work, all of them are becoming per-vcpu, and that's ok. But then,
  per-dom parameters should just no longer exist.
 
 Are you saying there is going to be no domain wide default for a given
 per-vcpu parameter? 
 
No, no, there sure is a default, and I certainly would make it domain
wide.

 if that is the case then what happens if I hotplug a new VCPU without
 settings its per-vcpu properties?
 
Exactly.

 I expected something like.
 
 domain_params_set (weight=10)
 add vcpu0 (weight will be 10)
 add vcpu1 (weight will be 10)
 vcpu_params_set(0, weight=20)
 add vcpu2 (weight will be 10, from domain wide default)
 
 If not that then what weight should vcpu2 have at this point?
 
Absolutely.

What I'm aiming at here is the per-domain API to have a well defined and
valuable behavior, and the one you suggest in your other email (i.e.,
let's return the default) is exactly that! :-)

Thanks and Regards,
Dario

-- 
This happens because I choose it to happen! (Raistlin Majere)
-
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems RD Ltd., Cambridge (UK)


signature.asc
Description: This is a digitally signed message part
___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 for Xen 4.6 3/4] libxl: enabling XL to set per-VCPU parameters of a domain for RTDS scheduler

2015-06-17 Thread Dario Faggioli
On Wed, 2015-06-17 at 13:08 +0100, Ian Campbell wrote:
 On Tue, 2015-06-09 at 18:18 +0200, Dario Faggioli wrote:
  On Mon, 2015-06-08 at 15:55 -0500, Chong Li wrote:
   On Mon, Jun 8, 2015 at 10:56 AM, Dario Faggioli
  
So, Thoughts? What do you think the best way forward could be?
   
   I like option 2 more. But I think we may also need a 'vcpuid' field in
   libxl_sched_params.
   
  For sparse array support, yes. At which point, I would flip the names as
  well, i.e., something like this:
  
  libxl_vcpu_sched_params = Struct(vcpu_sched_params,[
  (vcpuid,   integer, { xxx some init val xxx}),
  (weight,   integer, {'init_val': 'LIBXL_PARAM_WEIGHT_DEFAULT'}),
  (cap,  integer, {'init_val': 'LIBXL_PARAM_CAP_DEFAULT'}),
  (period,   integer, {'init_val': 'LIBXL_PARAM_PERIOD_DEFAULT'}),
  (slice,integer, {'init_val': 'LIBXL_PARAM_SLICE_DEFAULT'}),
  (latency,  integer, {'init_val': 'LIBXL_PARAM_LATENCY_DEFAULT'}),
  (extratime,integer, {'init_val': 
  'LIBXL_PARAM_EXTRATIME_DEFAULT'}),
  (budget,   integer, {'init_val': 'LIBXL_PARAM_BUDGET_DEFAULT'}),
 
 Is a full new set of defaults really necessary? 

No, it's not... I was 'sketching' the various alternatives and just went
too far. :-)

 I don't think it would
 be semantically all that strange to say that an unspecified per-vcpu
 value will default to the domain default, at which point having
 LIBXL_DOMAIN_SCHED_PARAM_FOO_DEFAULT mentioned doesn't seem all that
 strange.
 
Indeed.

  libxl_sched_params = Struct(sched_params,[
  (sched,libxl_scheduler),
  (vcpus,Array(libxl_sched_params, num_vcpus)),
  ])
  
  With the possibility of naming the latter 'libxl_vcpus_sched_params',
  which is more descriptive, but perhaps is too similar to
  libxl_vcpu_sched_params.
 
 I think I'd go the other way and name the thing with all the values in
 it libxl_sched_params and the thing with the per-vcpu array in it
 libxl_vcpu_sched_params.
 
Fine with me.

  While we're here, another thing we would appreciate some feedback on is
  what should happen to libxl_domain_sched_params_get(). This occurred to
  my mind while reviewing patch 4 of this series. Actually, I think we've
  discussed this before, but can't find the reference now.
  
  Anyway, my view is that, for a scheduler that uses per-vcpu parameters,
  libxl_domain_sched_params_set() should set the same parameters for all
  the vcpus.
  When it comes to _get(), however, I'm not sure. To match the _set()
  case, we'd need to return the parameters of all the vcpus, but we can't,
  because the function takes a libxl_domain_sched_params argument, which
  just holds 1 tuple.
 
 Wouldn't domain_get/set be manipulating the domain's default values for
 things, i.e. what a vcpu gets if nothing more specific is specified? Or
 is it too late to say that?
 
I actually like this. A lot.

 For set overall I don't really think it matters too much if it sets
 everything (i..e all vcpus and the defaults) so long as is a documented
 effect of the API -- anyone who adds per-vcpu support would then be
 aware of this and can adjust their code accordingly.
 
Exactly. Personally, I think I'd make per-domain set behave
consistently, and hence use it for set the default.

 For get I think saying it returns the defaults used for vcpus which
 don't have something more explicit set is perfectly fine and doesn't
 pose an upgrade wrinkle, since only those who are aware of the vcpu
 settings would care about the distinction.
 
As said, I like this, great idea.

Chong, what do you think? Can you make this happen?

Regards,
Dario
-- 
This happens because I choose it to happen! (Raistlin Majere)
-
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems RD Ltd., Cambridge (UK)


signature.asc
Description: This is a digitally signed message part
___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [OSSTEST Nested PATCH v11 6/7] Compose the main recipe of nested test job

2015-06-17 Thread Ian Jackson
Ian Campbell writes (Re: [OSSTEST Nested PATCH v11 6/7] Compose the main 
recipe of nested test job):
 I'll leave it to Ian to suggest a solution since it will no doubt
 involve some tcl plumbing (I'd be inclined to record 'hosts which are
 actually guests' somewhere and have the infra clean them up
 automatically after doing leak check and log collection).

Thinking aloud:

The things we want to happen at the end of the job are:
  * leak check on the l1
 !* logs capture on the l1 (might involve power cycle)
  * destruction of the l1 on l0
  * leak check on l0
 !* logs capture on l0

The items marked ! should happen even in case of failure.

This is going to be quite fiddly to do in a generic way.  But on the
other hand, doing it in a generic way would avoid writing an explicit
error handler in the l1 test case code, and replicating the per-host
operations.

 I was thinking more along the lines of creating Osstest/PDU/guest.pm
 with the appropriate methods calling out to toolstack($l0)-foo, setting
 $ho-{Power} = 'guest $l1guestname' somewhere and allowing
 power_cycle_host_setup to do it's thing.

Yes, that's what I meant.

Ian.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] PCI Passthrough ARM Design : Draft1

2015-06-17 Thread Ian Campbell
On Wed, 2015-06-17 at 13:11 +0100, Julien Grall wrote:
 Hi Ian,
 
 On 17/06/15 11:08, Ian Campbell wrote:
  On Tue, 2015-06-16 at 18:16 +0100, Stefano Stabellini wrote:
  I wrote this before reading the rest of the thread with your alternative
  suggestion.  Both approaches can work, maybe it is true that having the
  guest requesting mappings is better. And we should certainly do the same
  thing for PVH and ARM guests.
 
  If we have the guest requesting the mapping, obviously the hypercall
  implementation should check whether mapping the memory region has been
  previously allowed by the toolstack, that should still call
  xc_domain_iomem_permission.
  
  Whoever makes the initial decision/setup we still need to decide what to
  do about reads and writes to the device's BAR registers. Who deals with
  these and how, and if writes are allowed and if so how is this then
  reflected in the guest p2m.
  
  I would very much prefer ARM and x86 PVH to use the same approach to
  this.
  
  For x86 HVM I believe QEMU takes care of this, by emulating the
  reads/writes to CFG space and making hypercalls (domctls?) to update the
  p2m. There is no QEMU to do this in the ARM or x86 PVH cases.
  
  For x86 PV I believe pciback deals with the register reads/writes but it
  doesn't need to do anything wrt the p2m because that is a guest internal
  construct. This obviously doesn't apply to ARM or x86 PVH either since
  the p2m is managed by Xen in both those cases.
 
 By default, the OS doesn't setup the BARs, it expects the firmware to do
 it.

_By_default_, yes. But it can decide to rewrite the BARs if it wants,
Linux for example has command line options to do so.

Maybe we have the option to exclude this possibility. It would certainly
simplify things. (Which makes most of the rest of your reply moot).

Ian.


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCHv2 3/5] evtchn: use a per-event channel lock for sending events

2015-06-17 Thread Jan Beulich
 On 16.06.15 at 18:39, david.vra...@citrix.com wrote:
 On 16/06/15 17:19, Jan Beulich wrote:
 On 16.06.15 at 17:58, david.vra...@citrix.com wrote:
 On 16/06/15 16:19, David Vrabel wrote:
 @@ -1221,6 +1277,8 @@ void notify_via_xen_event_channel(struct domain 
 *ld, 
 int lport)
  evtchn_port_set_pending(rd, rchn-notify_vcpu_id, rchn);
  }
  
 +spin_unlock(lchn-lock);
 +
  spin_unlock(ld-event_lock);
  }

 Again I think the event lock can be dropped earlier now.

 Ditto.

 Uh, no. This is notify.  I've kept the locking like this because of the
 ld-is_dying check.  I think we need the ld-event_lock in case
 d-is_dying is set and evtchn_destroy(ld) is called.
 
 Right, but if evtchn_destroy() was a concern, then this wouldn't
 apply just here, but also in the sending path you are relaxing.
 Afaict due to the channel lock being taken in __evtchn_close()
 you can drop the even lock here as the latest after you acquired
 the channel one (I haven't been able to convince myself yet that
 dropping it even before that would be okay).
 
 But in the evtchn_send() case, we're in a hypercall so we know
 ld-is_dying is false and thus cannot be racing with evtchn_destroy(ld).

That's only the common code path; there's an evtchn_send() call
from __domain_finalise_shutdown() which afaict doesn't make such
guarantees (in particular there clearly are d != current-domain
cases here).

And then - how is being in a hypercall a guard against is_dying not
getting set?

 It would be good to remove event_lock from notify_xen_event_channel() as
 well since this is heavily used for ioreqs and vm events.  Let me have a
 more careful look.

Indeed, thanks.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 25/27] tools/libxl: [RFC] Handle checkpoint records in a libxl migration v2 stream

2015-06-17 Thread Wen Congyang
On 06/15/2015 09:44 PM, Andrew Cooper wrote:
 This is the final bit of untangling for Remus.
 
 Signed-off-by: Andrew Cooper andrew.coop...@citrix.com
 CC: Ian Campbell ian.campb...@citrix.com
 CC: Ian Jackson ian.jack...@eu.citrix.com
 CC: Wei Liu wei.l...@citrix.com
 ---
  tools/libxl/libxl_create.c  |   25 
  tools/libxl/libxl_internal.h|6 
  tools/libxl/libxl_stream_read.c |   62 
 +++
  3 files changed, 93 insertions(+)
 
 diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
 index 7dd7130..ac918bd 100644
 --- a/tools/libxl/libxl_create.c
 +++ b/tools/libxl/libxl_create.c
 @@ -747,6 +747,27 @@ static int store_libxl_entry(libxl__gc *gc, uint32_t 
 domid,
  libxl_device_model_version_to_string(b_info-device_model_version));
  }
  
 +/*- remus asynchronous checkpoint callback -*/
 +
 +static void remus_checkpoint_stream_done(
 +libxl__egc *egc, libxl__domain_create_state *dcs, int rc);
 +
 +static void libxl__remus_domain_checkpoint_callback(void *data)
 +{
 +libxl__save_helper_state *shs = data;
 +libxl__domain_create_state *dcs = CONTAINER_OF(shs, *dcs, shs);
 +libxl__egc *egc = dcs-shs.egc;
 +STATE_AO_GC(dcs-ao);
 +
 +libxl__stream_read_start_checkpoint(egc, dcs-srs);
 +}
 +
 +static void remus_checkpoint_stream_done(
 +libxl__egc *egc, libxl__domain_create_state *dcs, int rc)
 +{
 +libxl__xc_domain_saverestore_async_callback_done(egc, dcs-shs, rc);
 +}
 +
  /*- main domain creation -*/
  
  /* We have a linear control flow; only one event callback is
 @@ -1008,6 +1029,8 @@ static void domcreate_bootloader_done(libxl__egc *egc,
  libxl_domain_config *const d_config = dcs-guest_config;
  const int restore_fd = dcs-restore_fd;
  libxl__domain_build_state *const state = dcs-build_state;
 +libxl__srm_restore_autogen_callbacks *const callbacks =
 +dcs-shs.callbacks.restore.a;
  
  if (rc) {
  domcreate_rebuild_done(egc, dcs, rc);
 @@ -1035,6 +1058,7 @@ static void domcreate_bootloader_done(libxl__egc *egc,
  }
  
  /* Restore */
 +callbacks-checkpoint = libxl__remus_domain_checkpoint_callback;
  
  rc = libxl__build_pre(gc, domid, d_config, state);
  if (rc)
 @@ -1044,6 +1068,7 @@ static void domcreate_bootloader_done(libxl__egc *egc,
  dcs-srs.fd = restore_fd;
  dcs-srs.legacy = (dcs-restore_params.stream_version == 1);
  dcs-srs.completion_callback = domcreate_stream_done;
 +dcs-srs.checkpoint_callback = remus_checkpoint_stream_done;
  
  libxl__stream_read_start(egc, dcs-srs);
  return;
 diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
 index bf1c377..e271a0b 100644
 --- a/tools/libxl/libxl_internal.h
 +++ b/tools/libxl/libxl_internal.h
 @@ -3205,11 +3205,15 @@ struct libxl__stream_read_state {
  void (*completion_callback)(libxl__egc *egc,
  libxl__domain_create_state *dcs,
  int rc);
 +void (*checkpoint_callback)(libxl__egc *egc,
 +libxl__domain_create_state *dcs,
 +int rc);
  /* Private */
  libxl__carefd *v2_carefd;
  int rc;
  int joined_rc;
  bool running;
 +bool in_checkpoint;
  libxl__datacopier_state dc;
  size_t expected_len;
  libxl_sr_hdr hdr;
 @@ -3222,6 +3226,8 @@ _hidden void libxl__stream_read_start(libxl__egc *egc,
  
  _hidden void libxl__stream_read_continue(libxl__egc *egc,
   libxl__stream_read_state *stream);
 +_hidden void libxl__stream_read_start_checkpoint(
 +libxl__egc *egc, libxl__stream_read_state *stream);
  
  _hidden void libxl__stream_read_abort(libxl__egc *egc,
libxl__stream_read_state *stream, int 
 rc);
 diff --git a/tools/libxl/libxl_stream_read.c b/tools/libxl/libxl_stream_read.c
 index a8cd2c3..09ef0aa 100644
 --- a/tools/libxl/libxl_stream_read.c
 +++ b/tools/libxl/libxl_stream_read.c
 @@ -80,6 +80,10 @@ static void emulator_padding_done(libxl__egc *egc,
libxl__datacopier_state *dc,
int onwrite, int errnoval);
  
 +/* Error handling for checkpoint mini-loop. */
 +static void checkpoint_done(libxl__egc *egc,
 +libxl__stream_read_state *stream, int rc);
 +
  void libxl__stream_read_start(libxl__egc *egc,
libxl__stream_read_state *stream)
  {
 @@ -162,6 +166,35 @@ void libxl__stream_read_continue(libxl__egc *egc,
  stream_failed(egc, stream, ret);
  }
  
 +void libxl__stream_read_start_checkpoint(libxl__egc *egc,
 + libxl__stream_read_state *stream)
 +{
 +libxl__datacopier_state *dc = stream-dc;
 +int ret = 0;
 +
 +assert(stream-running);
 +assert(!stream-in_checkpoint);
 +

Re: [Xen-devel] [PATCH v3] tools/libxc: Batch memory allocations for PV guests

2015-06-17 Thread Ross Lagerwall

On 06/17/2015 12:14 PM, Ian Campbell wrote:

On Mon, 2015-06-15 at 17:08 +0100, Wei Liu wrote:

On Mon, Jun 15, 2015 at 11:12:07AM +0100, Ross Lagerwall wrote:

The current code for allocating memory for PV guests batches the
hypercalls to allocate memory by allocating 1024*1024 extents of order 0
at a time. To make this faster, first try allocating extents of order 9
(2 MiB) before falling back to the order 0 allocating if the order 9
allocation fails.

On my test machine this reduced the time to start a 128 GiB PV guest by
about 60 seconds.


 From 61s or from 600s?


It reduced the total time _by_ 60 seconds. I think it takes about 10 
seconds to start the guest now, but it is not related to the size of 
guest memory.


Regards
--
Ross Lagerwall

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH RFC v1] libxl: Introduce a template for devices with a controller

2015-06-17 Thread Juergen Gross

On 06/17/2015 01:06 PM, Ian Campbell wrote:

On Thu, 2015-05-21 at 18:07 +0100, George Dunlap wrote:

We have several outstanding patch series which add devices that have
two levels: a controller and individual devices attached to that
controller.

In the interest of consistency, this patch introduces a section that
sketches out a template for interfaces for such devices.


Chun Yan and Jeurgen:

I was hoping we could come to some sort of agreement on this such that
it can be used as the basis for both the pvusb and pvscsi interfaces. As
such your feedback here is important...


I already gave some feedback. As everything regarding this feedback
has been discussed:

Acked-by: Juergen Gross jgr...@suse.com





Signed-off-by: George Dunlap george.dun...@eu.citrix.com
---
CC: Ian Campbell ian.campb...@citrix.com
CC: Ian Jackson ian.jack...@citrix.com
CC: Wei Liu wei.l...@citrix.com
CC: Juergen Gross jgr...@suse.com
CC: Chun Yan Liu cy...@suse.com
CC: Olaf Hering o...@aepfle.de

So, this is definitely RFC -- I tried to spec things out in a way that
made sense, but I often just chose something that I thought would be a
sensible starting point for discussion.

This spec looks a lot more like the PVUSB spec than the PVSCSI spec,
in part because I think the PVUSB spec has already had a lot more
thought that's gone into it.

A couple of random points to discuss:

* Calling things controllers, using typectrl for the device name,
   and using ctrl as the field name for the devid of the controller
   in the individual devices.

* I've said that having an index (port, lun, whatever) is optional.
   Do we want to make that requred?  Do we want it to have a consistent
   name?  In the case of emulated USB, we can't really specify to qemu
   what port the device gets attached to, so I'm tempted to say it's
   not required; but even there we could always give it a port number
   just for name's sake.

* Naming sub-devices.  We need to have a way to uniquely name both
   controllers and subdevices.  Here I've said that we will have both
   typectrl and type devid namespaces, mainly because in the
   previous point I opted not to require an index.  Another option
   would be not to have another devid namespace, but to use
   ctrl,index as the unique identifier.  (This would mean requiring
   the index/port/lun specification above.)

* libxl_device_type_list listing devices across all controllers.  I
   think this is the most practical thing to do, but one could imagine
   wanting to query by controller ID instead.

Feedback welcome.
---
  tools/libxl/libxl.h | 46 ++
  1 file changed, 46 insertions(+)

diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
index 2ed7194..d757845 100644
--- a/tools/libxl/libxl.h
+++ b/tools/libxl/libxl.h
@@ -1234,6 +1234,52 @@ void libxl_vtpminfo_list_free(libxl_vtpminfo *, int 
nr_vtpms);
   *
   *   This function does not interact with the guest and therefore
   *   cannot block on the guest.
+ *
+ * Controllers
+ * ---
+ *
+ * Most devices are treated individually.  Some devices however, like
+ * USB or SCSI, inherently have the need to have busses or
+ * controllers to which individual devices can be attached.
+ *
+ * In that case, for each type, there will be two sets of the
+ * functions, types, and devid namespaces outlined above: one based on
+ * 'type', and one based on 'typectrl'.
+ *
+ * In those cases, libxl_device_typectrl_function will act more or
+ * less like top-level non-bus devices: they will either create or
+ * accept a libxl_devid which will be unique within the
+ * typectrl libxl_devid namespace.
+ *
+ * Second-level devices which will be attached to a controller will
+ * include in their libxl_device_type a field called ctrl, which
+ * will be the libxl_devid of the corresponding controller.  It may also
+ * include an index onto that bus, that represents (for example) a USB
+ * port or a SCSI LUN.
+ *
+ * These second-level devices will also have their own devid which
+ * will be unique within the type devid namespace, and will be used
+ * for queries or removals.
+ *
+ * In the case where there are multiple different ways to implement a
+ * given device -- for instance, one which is fully PV and one which
+ * uses an emulator -- the controller will contain a field which
+ * specifies what type of implementation is used.  The implementations
+ * of individual devices will be known by the controller to which they are
+ * attached.
+ *
+ * If libxl_device_type_add receives an uninitialized ctrl devid, it
+ * may return an error.  Or it may (but is not required to) choose to
+ * automatically choose a suitable controller to which to attach the
+ * new device.  It may also (but is not required to) automatically
+ * create a new controller if no suitable controllers exist.
+ * Individual devices should document their behavior.
+ *
+ * libxl_device_typectrl_list will list all controllers for the domain.
+ *
+ * 

[Xen-devel] [qemu-mainline test] 58664: regressions - FAIL

2015-06-17 Thread osstest service user
flight 58664 qemu-mainline real [real]
http://logs.test-lab.xenproject.org/osstest/logs/58664/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-amd64   5 xen-build fail REGR. vs. 58551
 build-i3865 xen-build fail REGR. vs. 58551
 build-amd64-xsm   5 xen-build fail REGR. vs. 58551
 build-i386-xsm5 xen-build fail REGR. vs. 58551
 build-armhf   5 xen-build fail REGR. vs. 58551
 build-armhf-xsm   5 xen-build fail REGR. vs. 58551

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-pvh-amd   1 build-check(1)   blocked  n/a
 build-amd64-libvirt   1 build-check(1)   blocked  n/a
 build-i386-libvirt1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-pvh-intel  1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt   1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-xsm   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-credit2   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-xsm   1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-multivcpu  1 build-check(1)   blocked  n/a
 test-amd64-amd64-pair 1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-xsm1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-arndale   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-sedf  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-sedf-pin  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt  1 build-check(1)   blocked  n/a
 test-amd64-i386-xl1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl   1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-sedf  1 build-check(1)   blocked  n/a
 test-amd64-i386-qemuu-rhel6hvm-intel  1 build-check(1) blocked n/a
 test-armhf-armhf-xl-multivcpu  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemuu-debianhvm-amd64  1 build-check(1)blocked n/a
 test-amd64-i386-qemuu-rhel6hvm-amd  1 build-check(1)   blocked n/a
 build-armhf-libvirt   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsm  1 build-check(1)blocked n/a
 test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm  1 build-check(1) blocked n/a
 test-amd64-i386-xl-qemuu-debianhvm-amd64  1 build-check(1) blocked n/a
 test-amd64-i386-freebsd10-i386  1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-cubietruck  1 build-check(1)   blocked  n/a
 test-amd64-i386-pair  1 build-check(1)   blocked  n/a
 test-amd64-i386-freebsd10-amd64  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-sedf-pin  1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-xsm   1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-credit2   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemuu-ovmf-amd64  1 build-check(1) blocked n/a
 test-amd64-i386-xl-qemuu-ovmf-amd64  1 build-check(1)  blocked n/a
 test-amd64-amd64-xl-qemuu-winxpsp3  1 build-check(1)   blocked n/a
 test-amd64-i386-xl-qemuu-winxpsp3-vcpus1  1 build-check(1) blocked n/a
 test-amd64-amd64-xl-qemuu-win7-amd64  1 build-check(1) blocked n/a
 test-amd64-i386-xl-qemuu-win7-amd64  1 build-check(1)  blocked n/a
 test-amd64-i386-xl-qemuu-winxpsp3  1 build-check(1)   blocked  n/a

version targeted for testing:
 qemuu93f6d1c16036aaf34055d16f54ea770fb8d6d280
baseline version:
 qemuu0a2df857a7038c75379cc575de5d4be4c0ac629e


People who touched revisions under test:
  Alberto Garcia be...@igalia.com
  Alex Bennée alex.ben...@linaro.org
  Alexander Graf ag...@suse.de
  Andrew Jones drjo...@redhat.com
  Aurelien Jarno aurel...@aurel32.net
  Aurelio C. Remonda aurelioremo...@gmail.com
  Benoit Canet benoit.ca...@nodalink.com
  Benoît Canet benoit.ca...@nodalink.com
  Christian Borntraeger borntrae...@de.ibm.com
  Cole Robinson crobi...@redhat.com
  Dave Airlie airl...@redhat.com
  Eric Auger eric.au...@linaro.org
  Fam Zheng f...@redhat.com
  Gerd Hoffmann kra...@redhat.com
  Johan Karlsson johan.karls...@enea.com
  John Snow js...@redhat.com
  Kevin Wolf 

Re: [Xen-devel] [PATCH RFC v1] libxl: Introduce a template for devices with a controller

2015-06-17 Thread Juergen Gross

On 05/27/2015 03:57 PM, George Dunlap wrote:

On 05/27/2015 11:09 AM, Juergen Gross wrote:

George,

I'm on vacation this and the next week with only limited email access.
So please don't expect fast reaction on any further questions during
this time. :-)


Then quit reading your work e-mail and get back to the important stuff! :-)


OK, so I looked it up[1] and the full address seems to be:
* adapter number / host
* channel number / bus
* id number / target
* LUN

In which case, controller would correspond to adapter / host, right?

In the vscsi world, what levels of what can you make?  I know you
mentioned before that some devices have multiple LUNs, and those need
to be grouped together, with the same LUNs as they do on real
hardware, to work properly -- is that right?


Not all of the devices have this requirement, but some.


The USB case actually has something somewhat similar:
* USB controller
* USB bus
* USB device
* USB function

So far, there's not really a controller/bus distinction: each
controller has exactly one bus.  When we assign a USB device to a bus,
we automatically go through and assign each function fo that device
individually.

Would it make sense to treat vscsi the same way -- i.e., to make a
bus, and then attach targetss to it, and have the LUNs for any
given target automatically assigned when the target is assigned?


As long as it is still possible to assign individual LUNs as well.
If dom0 is controlling e.g. a RAID system you might want to assign
one LUN of a target to domU A and one LUN of the same target to domU B.


OK, so it sounds like in the vscsi case, it would be useful to assign
either an entire target, or an individual LUN.

In the case of assigning a target, you'll want to assign all the LUNs as
well, such that the virtual LUNs mirror the real LUNs.

In the case of assigning a LUN, I assume you'll still need a virtual
target.  Will you be wanting an interface for creating virtual targets,
so that you can assign several real LUNs to the same target?  Or will
you just want one virtual target per LUN if you're not assigning an
entire target?


Nearly missed this question.

Hmm, I think the first option would be better. Otherwise it could be
difficult to assign a just created LUN of a target to the same virtual
target while the system is running.


Juergen


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCHv3 1/6] evtchn: clear xen_consumer when clearing state

2015-06-17 Thread David Vrabel
Freeing a xen event channel would clear xen_consumer before clearing
the channel state, leaving a window where the channel is in a funny
state (still bound but no consumer).

Move the clear of xen_consumer into free_evtchn() where the state is
also cleared.

Signed-off-by: David Vrabel david.vra...@citrix.com
---
 xen/common/event_channel.c |5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c
index 947880f..90e3121 100644
--- a/xen/common/event_channel.c
+++ b/xen/common/event_channel.c
@@ -200,6 +200,7 @@ static void free_evtchn(struct domain *d, struct evtchn 
*chn)
 /* Reset binding to vcpu0 when the channel is freed. */
 chn-state  = ECS_FREE;
 chn-notify_vcpu_id = 0;
+chn-xen_consumer   = 0;
 
 xsm_evtchn_close_post(chn);
 }
@@ -1187,7 +1188,6 @@ void free_xen_event_channel(struct domain *d, int port)
 BUG_ON(!port_is_valid(d, port));
 chn = evtchn_from_port(d, port);
 BUG_ON(!consumer_is_xen(chn));
-chn-xen_consumer = 0;
 
 spin_unlock(d-event_lock);
 
@@ -1287,10 +1287,7 @@ void evtchn_destroy(struct domain *d)
 
 /* Close all existing event channels. */
 for ( i = 0; port_is_valid(d, i); i++ )
-{
-evtchn_from_port(d, i)-xen_consumer = 0;
 (void)__evtchn_close(d, i);
-}
 
 /* Free all event-channel buckets. */
 spin_lock(d-event_lock);
-- 
1.7.10.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCHv3 5/6] evtchn: remove the locking when unmasking an event channel

2015-06-17 Thread David Vrabel
The event channel lock is no longer required to check if the port is
valid.

Signed-off-by: David Vrabel david.vra...@citrix.com
---
v3:
- Use port_is_valid() again.
---
 xen/common/event_channel.c |4 
 1 file changed, 4 deletions(-)

diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c
index e7c4445..eabb4c8 100644
--- a/xen/common/event_channel.c
+++ b/xen/common/event_channel.c
@@ -978,8 +978,6 @@ int evtchn_unmask(unsigned int port)
 struct domain *d = current-domain;
 struct evtchn *evtchn;
 
-ASSERT(spin_is_locked(d-event_lock));
-
 if ( unlikely(!port_is_valid(d, port)) )
 return -EINVAL;
 
@@ -1146,9 +1144,7 @@ long do_event_channel_op(int cmd, 
XEN_GUEST_HANDLE_PARAM(void) arg)
 struct evtchn_unmask unmask;
 if ( copy_from_guest(unmask, arg, 1) != 0 )
 return -EFAULT;
-spin_lock(current-domain-event_lock);
 rc = evtchn_unmask(unmask.port);
-spin_unlock(current-domain-event_lock);
 break;
 }
 
-- 
1.7.10.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCHv3 2/6] evtchn: defer freeing struct evtchn's until evtchn_destroy_final()

2015-06-17 Thread David Vrabel
notify_via_xen_event_channel() and free_xen_event_channel() had to
check if the domain was dying because they may be called while the
domain is being destroyed and the struct evtchn's are being freed.

By deferring the freeing of the struct evtchn's until all references
to the domain are dropped, these functions can rely on the channel
state being present and valid.

Signed-off-by: David Vrabel david.vra...@citrix.com
---
 xen/common/event_channel.c |   48 +++-
 1 file changed, 12 insertions(+), 36 deletions(-)

diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c
index 90e3121..ab3b48e 100644
--- a/xen/common/event_channel.c
+++ b/xen/common/event_channel.c
@@ -1175,22 +1175,6 @@ int alloc_unbound_xen_event_channel(
 
 void free_xen_event_channel(struct domain *d, int port)
 {
-struct evtchn *chn;
-
-spin_lock(d-event_lock);
-
-if ( unlikely(d-is_dying) )
-{
-spin_unlock(d-event_lock);
-return;
-}
-
-BUG_ON(!port_is_valid(d, port));
-chn = evtchn_from_port(d, port);
-BUG_ON(!consumer_is_xen(chn));
-
-spin_unlock(d-event_lock);
-
 (void)__evtchn_close(d, port);
 }
 
@@ -1202,18 +1186,12 @@ void notify_via_xen_event_channel(struct domain *ld, 
int lport)
 
 spin_lock(ld-event_lock);
 
-if ( unlikely(ld-is_dying) )
-{
-spin_unlock(ld-event_lock);
-return;
-}
-
 ASSERT(port_is_valid(ld, lport));
 lchn = evtchn_from_port(ld, lport);
-ASSERT(consumer_is_xen(lchn));
 
 if ( likely(lchn-state == ECS_INTERDOMAIN) )
 {
+ASSERT(consumer_is_xen(lchn));
 rd= lchn-u.interdomain.remote_dom;
 rchn  = evtchn_from_port(rd, lchn-u.interdomain.remote_port);
 evtchn_port_set_pending(rd, rchn-notify_vcpu_id, rchn);
@@ -1279,7 +1257,7 @@ int evtchn_init(struct domain *d)
 
 void evtchn_destroy(struct domain *d)
 {
-unsigned int i, j;
+unsigned int i;
 
 /* After this barrier no new event-channel allocations can occur. */
 BUG_ON(!d-is_dying);
@@ -1289,8 +1267,17 @@ void evtchn_destroy(struct domain *d)
 for ( i = 0; port_is_valid(d, i); i++ )
 (void)__evtchn_close(d, i);
 
+clear_global_virq_handlers(d);
+
+evtchn_fifo_destroy(d);
+}
+
+
+void evtchn_destroy_final(struct domain *d)
+{
+unsigned int i, j;
+
 /* Free all event-channel buckets. */
-spin_lock(d-event_lock);
 for ( i = 0; i  NR_EVTCHN_GROUPS; i++ )
 {
 if ( !d-evtchn_group[i] )
@@ -1298,20 +1285,9 @@ void evtchn_destroy(struct domain *d)
 for ( j = 0; j  BUCKETS_PER_GROUP; j++ )
 free_evtchn_bucket(d, d-evtchn_group[i][j]);
 xfree(d-evtchn_group[i]);
-d-evtchn_group[i] = NULL;
 }
 free_evtchn_bucket(d, d-evtchn);
-d-evtchn = NULL;
-spin_unlock(d-event_lock);
-
-clear_global_virq_handlers(d);
-
-evtchn_fifo_destroy(d);
-}
-
 
-void evtchn_destroy_final(struct domain *d)
-{
 #if MAX_VIRT_CPUS  BITS_PER_LONG
 xfree(d-poll_mask);
 d-poll_mask = NULL;
-- 
1.7.10.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCHv3 3/6] evtchn: simplify port_is_valid()

2015-06-17 Thread David Vrabel
By keeping a count of the number of currently valid event channels,
port_is_valid() can be simplified.

d-valid_evtchns is only increased (while holding d-event_lock), so
port_is_valid() may be safely called without taking the lock (this
will be useful later).

Signed-off-by: David Vrabel david.vra...@citrix.com
---
v3:
- Remove redundant d-evtchn test.

v2:
- Used unsigned int for d-valid_evtchns.
---
 xen/common/event_channel.c |3 +++
 xen/include/xen/event.h|6 +-
 xen/include/xen/sched.h|5 +++--
 3 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c
index ab3b48e..4924796 100644
--- a/xen/common/event_channel.c
+++ b/xen/common/event_channel.c
@@ -189,6 +189,8 @@ static int get_free_port(struct domain *d)
 return -ENOMEM;
 bucket_from_port(d, port) = chn;
 
+write_atomic(d-valid_evtchns, d-valid_evtchns + EVTCHNS_PER_BUCKET);
+
 return port;
 }
 
@@ -1232,6 +1234,7 @@ int evtchn_init(struct domain *d)
 d-evtchn = alloc_evtchn_bucket(d, 0);
 if ( !d-evtchn )
 return -ENOMEM;
+d-valid_evtchns = EVTCHNS_PER_BUCKET;
 
 spin_lock_init_prof(d, event_lock);
 if ( get_free_port(d) != 0 )
diff --git a/xen/include/xen/event.h b/xen/include/xen/event.h
index 690f865..af923d1 100644
--- a/xen/include/xen/event.h
+++ b/xen/include/xen/event.h
@@ -89,11 +89,7 @@ static inline bool_t port_is_valid(struct domain *d, 
unsigned int p)
 {
 if ( p = d-max_evtchns )
 return 0;
-if ( !d-evtchn )
-return 0;
-if ( p  EVTCHNS_PER_BUCKET )
-return 1;
-return group_from_port(d, p) != NULL  bucket_from_port(d, p) != NULL;
+return p  read_atomic(d-valid_evtchns);
 }
 
 static inline struct evtchn *evtchn_from_port(struct domain *d, unsigned int p)
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index 80c6f62..604d047 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -336,8 +336,9 @@ struct domain
 /* Event channel information. */
 struct evtchn   *evtchn; /* first bucket only */
 struct evtchn  **evtchn_group[NR_EVTCHN_GROUPS]; /* all other buckets */
-unsigned int max_evtchns;
-unsigned int max_evtchn_port;
+unsigned int max_evtchns; /* number supported by ABI */
+unsigned int max_evtchn_port; /* max permitted port number */
+unsigned int valid_evtchns;   /* number of allocated event channels */
 spinlock_t   event_lock;
 const struct evtchn_port_ops *evtchn_port_ops;
 struct evtchn_fifo_domain *evtchn_fifo;
-- 
1.7.10.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCHv3 6/6] evtchn: pad struct evtchn to 64 bytes

2015-06-17 Thread David Vrabel
The number of struct evtchn in a page must be a power of two.  Under
some workloads performance is improved slightly by padding struct
evtchn to 64 bytes (a typical cache line size), thus putting the fewer
per-channel locks into each cache line.

This does not decrease the number of struct evtchn's per-page.

Signed-off-by: David Vrabel david.vra...@citrix.com
---
v2:
- Use __attribute__((aligned(64))) for the padding.
---
 xen/include/xen/sched.h |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index 44ea92d..a0ff9d2 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -129,7 +129,7 @@ struct evtchn
 #endif
 } ssid;
 #endif
-};
+} __attribute__((aligned(64)));
 
 int  evtchn_init(struct domain *d); /* from domain_create */
 void evtchn_destroy(struct domain *d); /* from domain_kill */
-- 
1.7.10.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCHv3 4/6] evtchn: use a per-event channel lock for sending events

2015-06-17 Thread David Vrabel
When sending an event, use a new per-event channel lock to safely
validate the event channel state.

This new lock must be held when changing event channel state.  Note
that the event channel lock must also be held when changing state from
ECS_FREE or it will race with a concurrent get_free_port() call.

To avoid having to take the remote event channel locks when sending to
an interdomain event channel, the local and remote channel locks are
both held when binding or closing an interdomain event channel.

This significantly  increases the  number of events  that can  be sent
from multiple  VCPUs.  But struct  evtchn increases in  size, reducing
the number that fit into a single page to 64 (instead of 128).

Signed-off-by: David Vrabel david.vra...@citrix.com
---
v3:
- Use port_is_valid in evtchn_send().
- Drop event lock from notify_via_xen_event_channel().

v2:
- Compare channel pointers in double_evtchn_lock().
---
 xen/common/event_channel.c |   82 
 xen/include/xen/sched.h|1 +
 2 files changed, 69 insertions(+), 14 deletions(-)

diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c
index 4924796..e7c4445 100644
--- a/xen/common/event_channel.c
+++ b/xen/common/event_channel.c
@@ -139,6 +139,7 @@ static struct evtchn *alloc_evtchn_bucket(struct domain *d, 
unsigned int port)
 return NULL;
 }
 chn[i].port = port + i;
+spin_lock_init(chn[i].lock);
 }
 return chn;
 }
@@ -229,11 +230,15 @@ static long evtchn_alloc_unbound(evtchn_alloc_unbound_t 
*alloc)
 if ( rc )
 goto out;
 
+spin_lock(chn-lock);
+
 chn-state = ECS_UNBOUND;
 if ( (chn-u.unbound.remote_domid = alloc-remote_dom) == DOMID_SELF )
 chn-u.unbound.remote_domid = current-domain-domain_id;
 evtchn_port_init(d, chn);
 
+spin_unlock(chn-lock);
+
 alloc-port = port;
 
  out:
@@ -244,6 +249,28 @@ static long evtchn_alloc_unbound(evtchn_alloc_unbound_t 
*alloc)
 }
 
 
+static void double_evtchn_lock(struct evtchn *lchn, struct evtchn *rchn)
+{
+if ( lchn  rchn )
+{
+spin_lock(lchn-lock);
+spin_lock(rchn-lock);
+}
+else
+{
+if ( lchn != rchn )
+spin_lock(rchn-lock);
+spin_lock(lchn-lock);
+}
+}
+
+static void double_evtchn_unlock(struct evtchn *lchn, struct evtchn *rchn)
+{
+spin_unlock(lchn-lock);
+if ( lchn != rchn )
+spin_unlock(rchn-lock);
+}
+
 static long evtchn_bind_interdomain(evtchn_bind_interdomain_t *bind)
 {
 struct evtchn *lchn, *rchn;
@@ -286,6 +313,8 @@ static long 
evtchn_bind_interdomain(evtchn_bind_interdomain_t *bind)
 if ( rc )
 goto out;
 
+double_evtchn_lock(lchn, rchn);
+
 lchn-u.interdomain.remote_dom  = rd;
 lchn-u.interdomain.remote_port = rport;
 lchn-state = ECS_INTERDOMAIN;
@@ -301,6 +330,8 @@ static long 
evtchn_bind_interdomain(evtchn_bind_interdomain_t *bind)
  */
 evtchn_port_set_pending(ld, lchn-notify_vcpu_id, lchn);
 
+double_evtchn_unlock(lchn, rchn);
+
 bind-local_port = lport;
 
  out:
@@ -341,11 +372,16 @@ static long evtchn_bind_virq(evtchn_bind_virq_t *bind)
 ERROR_EXIT(port);
 
 chn = evtchn_from_port(d, port);
+
+spin_lock(chn-lock);
+
 chn-state  = ECS_VIRQ;
 chn-notify_vcpu_id = vcpu;
 chn-u.virq = virq;
 evtchn_port_init(d, chn);
 
+spin_unlock(chn-lock);
+
 v-virq_to_evtchn[virq] = bind-port = port;
 
  out:
@@ -372,10 +408,15 @@ static long evtchn_bind_ipi(evtchn_bind_ipi_t *bind)
 ERROR_EXIT(port);
 
 chn = evtchn_from_port(d, port);
+
+spin_lock(chn-lock);
+
 chn-state  = ECS_IPI;
 chn-notify_vcpu_id = vcpu;
 evtchn_port_init(d, chn);
 
+spin_unlock(chn-lock);
+
 bind-port = port;
 
  out:
@@ -450,11 +491,15 @@ static long evtchn_bind_pirq(evtchn_bind_pirq_t *bind)
 goto out;
 }
 
+spin_lock(chn-lock);
+
 chn-state  = ECS_PIRQ;
 chn-u.pirq.irq = pirq;
 link_pirq_port(port, chn, v);
 evtchn_port_init(d, chn);
 
+spin_unlock(chn-lock);
+
 bind-port = port;
 
 #ifdef CONFIG_X86
@@ -468,7 +513,6 @@ static long evtchn_bind_pirq(evtchn_bind_pirq_t *bind)
 return rc;
 }
 
-
 static long __evtchn_close(struct domain *d1, int port1)
 {
 struct domain *d2 = NULL;
@@ -575,15 +619,24 @@ static long __evtchn_close(struct domain *d1, int port1)
 BUG_ON(chn2-state != ECS_INTERDOMAIN);
 BUG_ON(chn2-u.interdomain.remote_dom != d1);
 
+double_evtchn_lock(chn1, chn2);
+
+free_evtchn(d1, chn1);
+
 chn2-state = ECS_UNBOUND;
 chn2-u.unbound.remote_domid = d1-domain_id;
-break;
+
+double_evtchn_unlock(chn1, chn2);
+
+goto out;
 
 default:
 BUG();
 }
 
+spin_lock(chn1-lock);
 free_evtchn(d1, chn1);
+spin_unlock(chn1-lock);
 
  out:
 if ( d2 != NULL )
@@ -610,21 +663,18 @@ int 

Re: [Xen-devel] [OSSTEST Nested PATCH v11 6/7] Compose the main recipe of nested test job

2015-06-17 Thread Ian Campbell
On Wed, 2015-06-17 at 11:00 +, Pang, LongtaoX wrote:
   2015-06-17 08:10:26 Z executing ssh ... root@192.168.199.25 find /tmp 
   /var/run
  /var/tmp /var/lib/xen /var/core ! -type d -print0 -ls -printf '\0'
   find: `/var/core': No such file or directory
  
  /var/core is created by ts-host-install. I think the tail end of the
  function sub in there which does that and populates /etc/sysctl.conf
  and /etc/security/limits.d/coredumps.conf should be refactored probably
  to be alongside the osstest-confirm-booted thing which IIRC you are
  already going to refactor in the next version.
  
 I reviewed related code in 'ts-host-install'. But I am not very clear, the 
 below code will be executed 
 in 'ts-debian-hvm-install' too? Or refactor 'osstest-confirm-booted' and the 
 below action should be finished inside this shell script? 
 Since, it need '/var/core' directory in L1, please correct me.

I think you've already arranged for the osstest-confirm-booted stuff to
happen for the L1 host too, but that Ian J has requested it be done
differently such that it is not duplicated in two places.

The code to setup /var/core and the associated sysctl.conf and limits.d
changes should be done in the same manner as osstest-confirm-booted
eventually is based on Ian's suggestion, whatever that was.

Ian.


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 for Xen 4.6 3/4] libxl: enabling XL to set per-VCPU parameters of a domain for RTDS scheduler

2015-06-17 Thread Ian Campbell
On Tue, 2015-06-09 at 18:18 +0200, Dario Faggioli wrote:
 On Mon, 2015-06-08 at 15:55 -0500, Chong Li wrote:
  On Mon, Jun 8, 2015 at 10:56 AM, Dario Faggioli
 
   So, Thoughts? What do you think the best way forward could be?
  
  I like option 2 more. But I think we may also need a 'vcpuid' field in
  libxl_sched_params.
  
 For sparse array support, yes. At which point, I would flip the names as
 well, i.e., something like this:
 
 libxl_vcpu_sched_params = Struct(vcpu_sched_params,[
 (vcpuid,   integer, { xxx some init val xxx}),
 (weight,   integer, {'init_val': 'LIBXL_PARAM_WEIGHT_DEFAULT'}),
 (cap,  integer, {'init_val': 'LIBXL_PARAM_CAP_DEFAULT'}),
 (period,   integer, {'init_val': 'LIBXL_PARAM_PERIOD_DEFAULT'}),
 (slice,integer, {'init_val': 'LIBXL_PARAM_SLICE_DEFAULT'}),
 (latency,  integer, {'init_val': 'LIBXL_PARAM_LATENCY_DEFAULT'}),
 (extratime,integer, {'init_val': 'LIBXL_PARAM_EXTRATIME_DEFAULT'}),
 (budget,   integer, {'init_val': 'LIBXL_PARAM_BUDGET_DEFAULT'}),

Is a full new set of defaults really necessary? I don't think it would
be semantically all that strange to say that an unspecified per-vcpu
value will default to the domain default, at which point having
LIBXL_DOMAIN_SCHED_PARAM_FOO_DEFAULT mentioned doesn't seem all that
strange.

 ])
 
 libxl_sched_params = Struct(sched_params,[
 (sched,libxl_scheduler),
 (vcpus,Array(libxl_sched_params, num_vcpus)),
 ])
 
 With the possibility of naming the latter 'libxl_vcpus_sched_params',
 which is more descriptive, but perhaps is too similar to
 libxl_vcpu_sched_params.

I think I'd go the other way and name the thing with all the values in
it libxl_sched_params and the thing with the per-vcpu array in it
libxl_vcpu_sched_params.

 Ian, George, what do you think?
 
 While we're here, another thing we would appreciate some feedback on is
 what should happen to libxl_domain_sched_params_get(). This occurred to
 my mind while reviewing patch 4 of this series. Actually, I think we've
 discussed this before, but can't find the reference now.
 
 Anyway, my view is that, for a scheduler that uses per-vcpu parameters,
 libxl_domain_sched_params_set() should set the same parameters for all
 the vcpus.
 When it comes to _get(), however, I'm not sure. To match the _set()
 case, we'd need to return the parameters of all the vcpus, but we can't,
 because the function takes a libxl_domain_sched_params argument, which
 just holds 1 tuple.

Wouldn't domain_get/set be manipulating the domain's default values for
things, i.e. what a vcpu gets if nothing more specific is specified? Or
is it too late to say that?

For set overall I don't really think it matters too much if it sets
everything (i..e all vcpus and the defaults) so long as is a documented
effect of the API -- anyone who adds per-vcpu support would then be
aware of this and can adjust their code accordingly.

For get I think saying it returns the defaults used for vcpus which
don't have something more explicit set is perfectly fine and doesn't
pose an upgrade wrinkle, since only those who are aware of the vcpu
settings would care about the distinction.

 Should we just WARN and ask, when on that specific scheduler, to use the
 per-vcpu variant being introduced in this patch
 (libxl_vcpu_sched_params_get())?
 
 This does not look ideal, but without changing the prototype of
 libxl_domain_sched_params_get(), I don't see what else sensible we could
 do... :-/
 
 Should we change it, and do the LIBXL_API_VERSION trick?
 
 So, again, thoughts?
 
 Regards,
 Dario
 



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] PCI Passthrough ARM Design : Draft1

2015-06-17 Thread Julien Grall
Hi Ian,

On 17/06/15 11:08, Ian Campbell wrote:
 On Tue, 2015-06-16 at 18:16 +0100, Stefano Stabellini wrote:
 I wrote this before reading the rest of the thread with your alternative
 suggestion.  Both approaches can work, maybe it is true that having the
 guest requesting mappings is better. And we should certainly do the same
 thing for PVH and ARM guests.

 If we have the guest requesting the mapping, obviously the hypercall
 implementation should check whether mapping the memory region has been
 previously allowed by the toolstack, that should still call
 xc_domain_iomem_permission.
 
 Whoever makes the initial decision/setup we still need to decide what to
 do about reads and writes to the device's BAR registers. Who deals with
 these and how, and if writes are allowed and if so how is this then
 reflected in the guest p2m.
 
 I would very much prefer ARM and x86 PVH to use the same approach to
 this.
 
 For x86 HVM I believe QEMU takes care of this, by emulating the
 reads/writes to CFG space and making hypercalls (domctls?) to update the
 p2m. There is no QEMU to do this in the ARM or x86 PVH cases.
 
 For x86 PV I believe pciback deals with the register reads/writes but it
 doesn't need to do anything wrt the p2m because that is a guest internal
 construct. This obviously doesn't apply to ARM or x86 PVH either since
 the p2m is managed by Xen in both those cases.

By default, the OS doesn't setup the BARs, it expects the firmware to do
it. If I'm not mistaken the PV guest is using the same memory layout as
the host, so the BAR is mapped 1:1 in the guest.  The interesting part
is bar_* in drivers/xen/xenpci-back/config_space_header.c

 So it seems that neither of the existing solutions are a good match for
 either ARM or x86 PVH.
 
 _If_ it were permissible to implement all BAR registers as r/o then this
 might simplify things, however I suspect this is not something we can
 get away with in the general case (i.e. a driver for a given PCI device
 _knows_ that BARN is writeable...).

The guest has to write in the BAR in order to get the size of the BAR [1].

 
 So I think we do need to allow guest writes to the BAR registers. I
 think it would be a bit strange (or even open potentially subtle
 (security?) issues) to require the guest to write the BAR register and
 to update the p2m to match as well.

I don't see any possible security issue to let the guest map the BAR
itself. If the guest doesn't have the right to some physical region it
won't be able to map it (see iomem permission).

Actually, we already do similar things for interrupt.

 
 So I think updates to the BAR need to be reflected in the p2m too by
 whichever entity is emulating those reads/writes.
 
 QEMU (or another daemon) is not really an option IMHO.

Well, by default the OS doesn't setup the BAR and expect the firmware to
initialize the BAR. In our case, there is no firmware and Xen will act
as it. It would be OK, to do it in the toolstack at boot time.

 Which leaves us with either Xen doing trap and emulate or pciback
 dealing with this via the pciif.h cfg space access stuff. I've a slight
 preference for the latter since I think both ARM and x86 PVH are
 planning to use pcifront/pciback already.

pcifront/pciback is already emulate the access to the config space for us...

It would be mad to use Xen for that as we would have to reinvent our own
config space access way (ARM doesn't have ioport, hence cf8 is not present).

 
 Which leaves us with a requirement for the backend to be able to update
 the p2m, which in turn leads to a need for an interface available to the
 dom0 kernel (which the existing domctl is not).
 
 There will also need to be a mechanism to expose a suitable MMIO hole to
 pcifront. On x86 this would be via the machine memory map, I think, but
 on ARM we may need something else, perhaps a xenstore key associated
 with the pci bus entry in xenstore?
 
 That's for guests, for dom0 Xen also need to be aware of changes to the
 BAR registers of the real physical devices since it needs to know the
 real hardware values in order to point guest p2m mappings at them.

I don't understand why dom0 Xen needs to be aware of changes in physical
BAR.

PCIback is already able to deal with the value of physical BAR, you have
nothing else to do (see drivers/xen/xenpci-back/config_space_header.c).

Regards,

[1] http://wiki.osdev.org/PCI#Base_Address_Registers

Regards,

-- 
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v3] tools/libxc: Batch memory allocations for PV guests

2015-06-17 Thread Ian Campbell
On Mon, 2015-06-15 at 17:08 +0100, Wei Liu wrote:
 On Mon, Jun 15, 2015 at 11:12:07AM +0100, Ross Lagerwall wrote:
  The current code for allocating memory for PV guests batches the
  hypercalls to allocate memory by allocating 1024*1024 extents of order 0
  at a time. To make this faster, first try allocating extents of order 9
  (2 MiB) before falling back to the order 0 allocating if the order 9
  allocation fails.
  
  On my test machine this reduced the time to start a 128 GiB PV guest by
  about 60 seconds.

From 61s or from 600s?

  
  Signed-off-by: Ross Lagerwall ross.lagerw...@citrix.com
 
 Acked-by: Wei Liu wei.l...@citrix.com

Applied. There were some rejects due to libxc: unify handling of vNUMA
layout having gone in first, I think they were just contextual and they
seemed trivial to resolve, but please do (both) check.

Ian.


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] xen/arm: Propagate clock-frequency to DOMU if present in the DT timer node

2015-06-17 Thread Ian Campbell
On Wed, 2015-06-03 at 15:48 +0100, Julien Grall wrote:
 When the property clock-frequency is present in the DT timer node, it
 means that the bootloader/firmware didn't correctly configured the

configure

 CNTFRQ/CNTFRQ_EL0 on each processor.
 
 The best solution would be to fix the offending firmware/bootloader,
 although it may not always be possible to modify and re-flash it.
 
 As it's not possible to trap the register CNTFRQ/CNTFRQ_EL0, we have
 to extend xen_arch_domainconfig to provide the timer frequency to the
 toolstack when the property clock-frequency is present to the host DT
 timer node. Then, a property clock-frequency will be created in the guest
 DT timer node if the value is not 0.
 
 We could have set the property in the guest DT no matter if the property
 is present in the host DT. Although, we still want to let the guest
 using CNTFRQ in normal case. After all, the property clock-frequency
 is just a workaround for buggy firmware.
 
 Signed-off-by: Julien Grall julien.gr...@citrix.com
 Cc: Chris Brand chris.br...@broadcom.com

I was about to apply but:

libxl_arm.c: In function ‘make_timer_node’:
libxl_arm.c:468:9: error: implicit declaration of function ‘fdt_property_u32’ 
[-Werror=implicit-function-declaration]
cc1: all warnings being treated as errors

That happen on both my 32 bit build (Debian Wheezy) and my 64 bit on
(Ubuntu Saucy).

Ian.



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH v2] libxl: refactor toolstack save restore code

2015-06-17 Thread Wei Liu
This patch does following things:
1. Document v1 format.
2. Factor out function to handle QEMU restore data and function to
   handle v1 blob for restore path.
3. Refactor save function to generate different blobs in the order
   specified in format specification.
4. Change functions to use goto out idiom.

No functional changes introduced.

Signed-off-by: Wei Liu wei.l...@citrix.com
---
Changes in v2:
1. Use ret instead of rc.
2. Use ptr instead of buf to reduce patch size.
---
 tools/libxl/libxl_dom.c | 232 +---
 1 file changed, 142 insertions(+), 90 deletions(-)

diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index 867172a..600393d 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -1032,6 +1032,15 @@ struct libxl__physmap_info {
 char name[];
 };
 
+/* Bump version every time when toolstack saved data changes.
+ * Different types of data are arranged in the specified order.
+ *
+ * Version 1:
+ *   uint32_t version
+ *   QEMU physmap data:
+ * uint32_t count
+ * libxl__physmap_info * count
+ */
 #define TOOLSTACK_SAVE_VERSION 1
 
 static inline char *restore_helper(libxl__gc *gc, uint32_t dm_domid,
@@ -1043,41 +1052,28 @@ static inline char *restore_helper(libxl__gc *gc, 
uint32_t dm_domid,
phys_offset, node);
 }
 
-int libxl__toolstack_restore(uint32_t domid, const uint8_t *buf,
- uint32_t size, void *user)
+static int libxl__toolstack_restore_qemu(libxl__gc *gc, uint32_t domid,
+ const uint8_t *ptr, uint32_t size)
 {
-libxl__save_helper_state *shs = user;
-libxl__domain_create_state *dcs = CONTAINER_OF(shs, *dcs, shs);
-STATE_AO_GC(dcs-ao);
-int i, ret;
-const uint8_t *ptr = buf;
-uint32_t count = 0, version = 0;
-struct libxl__physmap_info* pi;
+int ret, i;
+uint32_t count;
 char *xs_path;
 uint32_t dm_domid;
+struct libxl__physmap_info *pi;
 
-LOG(DEBUG,domain=%PRIu32 toolstack data size=%PRIu32, domid, size);
-
-if (size  sizeof(version) + sizeof(count)) {
+if (size  sizeof(count)) {
 LOG(ERROR, wrong size);
-return -1;
-}
-
-memcpy(version, ptr, sizeof(version));
-ptr += sizeof(version);
-
-if (version != TOOLSTACK_SAVE_VERSION) {
-LOG(ERROR, wrong version);
-return -1;
+ret = -1;
+goto out;
 }
 
 memcpy(count, ptr, sizeof(count));
 ptr += sizeof(count);
 
-if (size  sizeof(version) + sizeof(count) +
-count * (sizeof(struct libxl__physmap_info))) {
+if (size  sizeof(count) + count*(sizeof(struct libxl__physmap_info))) {
 LOG(ERROR, wrong size);
-return -1;
+ret = -1;
+goto out;
 }
 
 dm_domid = libxl_get_stubdom_id(CTX, domid);
@@ -1088,21 +1084,64 @@ int libxl__toolstack_restore(uint32_t domid, const 
uint8_t *buf,
 xs_path = restore_helper(gc, dm_domid, domid,
  pi-phys_offset, start_addr);
 ret = libxl__xs_write(gc, 0, xs_path, %PRIx64, pi-start_addr);
-if (ret)
-return -1;
+if (ret) goto out;
+
 xs_path = restore_helper(gc, dm_domid, domid, pi-phys_offset, size);
 ret = libxl__xs_write(gc, 0, xs_path, %PRIx64, pi-size);
-if (ret)
-return -1;
+if (ret) goto out;
+
 if (pi-namelen  0) {
 xs_path = restore_helper(gc, dm_domid, domid,
  pi-phys_offset, name);
 ret = libxl__xs_write(gc, 0, xs_path, %s, pi-name);
-if (ret)
-return -1;
+if (ret) goto out;
 }
 }
-return 0;
+
+ret = 0;
+out:
+return ret;
+
+}
+
+static int libxl__toolstack_restore_v1(libxl__gc *gc, uint32_t domid,
+   const uint8_t *ptr, uint32_t size)
+{
+return libxl__toolstack_restore_qemu(gc, domid, ptr, size);
+}
+
+int libxl__toolstack_restore(uint32_t domid, const uint8_t *ptr,
+ uint32_t size, void *user)
+{
+libxl__save_helper_state *shs = user;
+libxl__domain_create_state *dcs = CONTAINER_OF(shs, *dcs, shs);
+STATE_AO_GC(dcs-ao);
+int ret;
+uint32_t version = 0, bufsize;
+
+LOG(DEBUG,domain=%PRIu32 toolstack data size=%PRIu32, domid, size);
+
+if (size  sizeof(version)) {
+LOG(ERROR, wrong size);
+ret = -1;
+goto out;
+}
+
+memcpy(version, ptr, sizeof(version));
+ptr += sizeof(version);
+bufsize = size - sizeof(version);
+
+switch (version) {
+case 1:
+ret = libxl__toolstack_restore_v1(gc, domid, ptr, bufsize);
+break;
+default:
+LOG(ERROR, wrong version);
+ret = -1;
+}
+
+out:
+return ret;
 }
 
 /* Domain suspend (save) */
@@ -1685,81 +1724,94 @@ int 

Re: [Xen-devel] [PATCH] libxc: unify handling of vNUMA layout

2015-06-17 Thread Ian Campbell
On Thu, 2015-06-04 at 12:52 -0400, Boris Ostrovsky wrote:
 On 06/04/2015 06:23 AM, Wei Liu wrote:
  This patch does the following:
  1. Use local variables for dummy vNUMA layout in PV case.
  2. Avoid leaking dummy layout back to caller in PV case.
  3. Use local variables to reference vNUMA layout (whether it is dummy
  or provided by caller) for both PV and HVM.
 
  Signed-off-by: Wei Liu wei.l...@citrix.com
 
 
   Reviewed-by: Boris Ostrovsky boris.ostrov...@oracle.com

Acked + applied, thanks.



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH RFC v1] libxl: Introduce a template for devices with a controller

2015-06-17 Thread Ian Campbell
On Thu, 2015-05-21 at 18:07 +0100, George Dunlap wrote:
 We have several outstanding patch series which add devices that have
 two levels: a controller and individual devices attached to that
 controller.
 
 In the interest of consistency, this patch introduces a section that
 sketches out a template for interfaces for such devices.

Chun Yan and Jeurgen:

I was hoping we could come to some sort of agreement on this such that
it can be used as the basis for both the pvusb and pvscsi interfaces. As
such your feedback here is important...

 
 Signed-off-by: George Dunlap george.dun...@eu.citrix.com
 ---
 CC: Ian Campbell ian.campb...@citrix.com
 CC: Ian Jackson ian.jack...@citrix.com
 CC: Wei Liu wei.l...@citrix.com
 CC: Juergen Gross jgr...@suse.com
 CC: Chun Yan Liu cy...@suse.com
 CC: Olaf Hering o...@aepfle.de
 
 So, this is definitely RFC -- I tried to spec things out in a way that
 made sense, but I often just chose something that I thought would be a
 sensible starting point for discussion.
 
 This spec looks a lot more like the PVUSB spec than the PVSCSI spec,
 in part because I think the PVUSB spec has already had a lot more
 thought that's gone into it.
 
 A couple of random points to discuss:
 
 * Calling things controllers, using typectrl for the device name,
   and using ctrl as the field name for the devid of the controller
   in the individual devices.
 
 * I've said that having an index (port, lun, whatever) is optional.
   Do we want to make that requred?  Do we want it to have a consistent
   name?  In the case of emulated USB, we can't really specify to qemu
   what port the device gets attached to, so I'm tempted to say it's
   not required; but even there we could always give it a port number
   just for name's sake.
 
 * Naming sub-devices.  We need to have a way to uniquely name both
   controllers and subdevices.  Here I've said that we will have both
   typectrl and type devid namespaces, mainly because in the
   previous point I opted not to require an index.  Another option
   would be not to have another devid namespace, but to use
   ctrl,index as the unique identifier.  (This would mean requiring
   the index/port/lun specification above.)
 
 * libxl_device_type_list listing devices across all controllers.  I
   think this is the most practical thing to do, but one could imagine
   wanting to query by controller ID instead.
 
 Feedback welcome.
 ---
  tools/libxl/libxl.h | 46 ++
  1 file changed, 46 insertions(+)
 
 diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
 index 2ed7194..d757845 100644
 --- a/tools/libxl/libxl.h
 +++ b/tools/libxl/libxl.h
 @@ -1234,6 +1234,52 @@ void libxl_vtpminfo_list_free(libxl_vtpminfo *, int 
 nr_vtpms);
   *
   *   This function does not interact with the guest and therefore
   *   cannot block on the guest.
 + *
 + * Controllers
 + * ---
 + *
 + * Most devices are treated individually.  Some devices however, like
 + * USB or SCSI, inherently have the need to have busses or
 + * controllers to which individual devices can be attached.
 + *
 + * In that case, for each type, there will be two sets of the
 + * functions, types, and devid namespaces outlined above: one based on
 + * 'type', and one based on 'typectrl'.
 + *
 + * In those cases, libxl_device_typectrl_function will act more or
 + * less like top-level non-bus devices: they will either create or
 + * accept a libxl_devid which will be unique within the
 + * typectrl libxl_devid namespace.
 + *
 + * Second-level devices which will be attached to a controller will
 + * include in their libxl_device_type a field called ctrl, which
 + * will be the libxl_devid of the corresponding controller.  It may also
 + * include an index onto that bus, that represents (for example) a USB
 + * port or a SCSI LUN.
 + *
 + * These second-level devices will also have their own devid which
 + * will be unique within the type devid namespace, and will be used
 + * for queries or removals.
 + *
 + * In the case where there are multiple different ways to implement a
 + * given device -- for instance, one which is fully PV and one which
 + * uses an emulator -- the controller will contain a field which
 + * specifies what type of implementation is used.  The implementations
 + * of individual devices will be known by the controller to which they are
 + * attached.
 + *
 + * If libxl_device_type_add receives an uninitialized ctrl devid, it
 + * may return an error.  Or it may (but is not required to) choose to
 + * automatically choose a suitable controller to which to attach the
 + * new device.  It may also (but is not required to) automatically
 + * create a new controller if no suitable controllers exist.
 + * Individual devices should document their behavior.
 + *
 + * libxl_device_typectrl_list will list all controllers for the domain.
 + *
 + * libxl_device_type_list will list all devices for all controllers
 + * for the domain.  The individual 

Re: [Xen-devel] [PATCH] oxenstored: implement XS_RESET_WATCHES

2015-06-17 Thread Ian Campbell
On Tue, 2015-06-09 at 11:08 +0100, Wei Liu wrote:
 Signed-off-by: Wei Liu wei.l...@citrix.com
 Acked-by: David Scott dave.sc...@citrix.com

Applied, thanks.



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] xen-netback:Make the function xenvif_schedulable have a return type of bool

2015-06-17 Thread Wei Liu
On Tue, Jun 16, 2015 at 11:03:30PM -0400, Nicholas Krause wrote:
 This makes the function xenvif_sechedulable have a return type of
 bool now due to this particular function's return statement only
 ever evaluating to have a value of one or zero.
 
 Signed-off-by: Nicholas Krause xerofo...@gmail.com

Acked-by: Wei Liu wei.l...@citrix.com

 ---
  drivers/net/xen-netback/common.h| 2 +-
  drivers/net/xen-netback/interface.c | 2 +-
  2 files changed, 2 insertions(+), 2 deletions(-)
 
 diff --git a/drivers/net/xen-netback/common.h 
 b/drivers/net/xen-netback/common.h
 index 8a495b3..c02cafb 100644
 --- a/drivers/net/xen-netback/common.h
 +++ b/drivers/net/xen-netback/common.h
 @@ -280,7 +280,7 @@ void xenvif_free(struct xenvif *vif);
  int xenvif_xenbus_init(void);
  void xenvif_xenbus_fini(void);
  
 -int xenvif_schedulable(struct xenvif *vif);
 +bool xenvif_schedulable(struct xenvif *vif);
  
  int xenvif_queue_stopped(struct xenvif_queue *queue);
  void xenvif_wake_queue(struct xenvif_queue *queue);
 diff --git a/drivers/net/xen-netback/interface.c 
 b/drivers/net/xen-netback/interface.c
 index 1a83e19..b5fcb52 100644
 --- a/drivers/net/xen-netback/interface.c
 +++ b/drivers/net/xen-netback/interface.c
 @@ -63,7 +63,7 @@ void xenvif_skb_zerocopy_complete(struct xenvif_queue 
 *queue)
   atomic_dec(queue-inflight_packets);
  }
  
 -int xenvif_schedulable(struct xenvif *vif)
 +bool xenvif_schedulable(struct xenvif *vif)
  {
   return netif_running(vif-dev) 
   test_bit(VIF_STATUS_CONNECTED, vif-status) 
 -- 
 2.1.4

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2] libxl: clean up qemu-save and qemu-resume files

2015-06-17 Thread Ian Campbell
On Wed, 2015-06-03 at 11:44 +0100, Wei Liu wrote:
 These files are leaked when using qemu-trad stubdom.  They are
 intermediate files created by libxc. Unfortunately they don't fit well
 in our userdata scheme. Clean them up after we destroy all userdata,
 we're sure they are not useful anymore at that point.
 
 Signed-off-by: Wei Liu wei.l...@citrix.com

Acked + applied.



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 2/2] libxc: fix xc_dom_load_elf_symtab

2015-06-17 Thread Ian Campbell
On Wed, 2015-06-17 at 10:56 +0100, Wei Liu wrote:
 On Thu, Jun 11, 2015 at 06:05:20PM +0200, Roger Pau Monne wrote:
  xc_dom_load_elf_symtab was incorrectly trying to perform the same
  calculations already done in elf_parse_bsdsyms when load == 0 is used.
  Instead of trying to repeat the calculations, just trust what
  elf_parse_bsdsyms has already accounted for.
  
  This also simplifies the code by allowing the non-load case to return
  earlier.
  
  Signed-off-by: Roger Pau Monné roger@citrix.com
  Cc: Ian Jackson ian.jack...@eu.citrix.com
  Cc: Stefano Stabellini stefano.stabell...@eu.citrix.com
  Cc: Ian Campbell ian.campb...@citrix.com
  Cc: Wei Liu wei.l...@citrix.com
 
 Acked-by: Wei Liu wei.l...@citrix.com

Applied, thanks (#1 was committed by Jan already).




___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH V4 3/7] libxl: add pvusb API

2015-06-17 Thread Ian Campbell
On Tue, 2015-06-16 at 12:44 +0100, Ian Jackson wrote:
 George Dunlap writes (Re: [Xen-devel] [PATCH V4 3/7] libxl: add pvusb API):
  On Tue, Jun 16, 2015 at 12:19 PM, Juergen Gross jgr...@suse.com wrote:
   I'm pretty sure this is just a matter of timing during boot: the busses
   are all (or at least some of them) queried at the same time and the
   first answering gets number 1, the next 2, and so on. And timing seems
   to be different enough to result in unstable bus numbers.
  
  Right -- I meant stable in topology within one boot, but at least
  two of you have now understood me to mean across reboots by default,
  so that's obviously a detail that needs to be specified. :-)
 
 I think stable in topology within one boot is nearly vacuous (or are
 we expecting buses or devices to spontaneously renumber themselves for
 no reason) and also nearly useless.
 
 We definitely need _some_ way for a user to identify a specific device
 uniquely in a way that is reliable from one boot to the next.  vid:pid
 is one way to do this, but not always sufficient.

We should perhaps separate out the end user from the libxl
user (i.e. the toolstack application) in our deliberations here.

We have the option of settling on a single way of describing a device to
libxl in the libxl API but allowing the toolstack to choose to present
the user with multiple different ways to specify things by providing a
suitable set of helper functions (in either libxlu or libxl proper) from
a variety of schemes to the one true way of describing a device.

That sounds better to me than having the libxl API consist of the union
of all potentially useful ways to refer to a device.

I'm not sure whether that also gives us the freedom to use something
which is only stable for a given boot at the libxl API layer though.

Ian.


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v3] tools/libxc: Batch memory allocations for PV guests

2015-06-17 Thread Wei Liu
On Wed, Jun 17, 2015 at 12:14:03PM +0100, Ian Campbell wrote:
 On Mon, 2015-06-15 at 17:08 +0100, Wei Liu wrote:
  On Mon, Jun 15, 2015 at 11:12:07AM +0100, Ross Lagerwall wrote:
   The current code for allocating memory for PV guests batches the
   hypercalls to allocate memory by allocating 1024*1024 extents of order 0
   at a time. To make this faster, first try allocating extents of order 9
   (2 MiB) before falling back to the order 0 allocating if the order 9
   allocation fails.
   
   On my test machine this reduced the time to start a 128 GiB PV guest by
   about 60 seconds.
 
 From 61s or from 600s?
 
   
   Signed-off-by: Ross Lagerwall ross.lagerw...@citrix.com
  
  Acked-by: Wei Liu wei.l...@citrix.com
 
 Applied. There were some rejects due to libxc: unify handling of vNUMA
 layout having gone in first, I think they were just contextual and they
 seemed trivial to resolve, but please do (both) check.
 

The patch looks good to me.

Wei.

 Ian.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH V4 3/7] libxl: add pvusb API

2015-06-17 Thread Ian Campbell
On Tue, 2015-06-16 at 16:20 +0200, Juergen Gross wrote:
 My point was to avoid the sysfs accesses in libxl in order to support
 BSD as well and to reduce the complexity.

As a slight aside to this, can't libxl use libusb for a lot of this
stuff and therefore avoid being Linux specific?

http://libusb.info/ claims to support Linux, OS X, Windows, Windows CE,
Android, OpenBSD/NetBSD, Haiku.. Interestingly FreeBSD is missing there
but I don't think that need to be a blocker.

I don't see a problem with adding libusb to our set of dependencies, and
it's certainly got to be better than (re)implementing a bunch of sysfs
stuff (which I presume is what libusb does under the hood anyway).

Ian.


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH V4 3/7] libxl: add pvusb API

2015-06-17 Thread Juergen Gross

On 06/17/2015 01:34 PM, Ian Campbell wrote:

On Tue, 2015-06-16 at 16:20 +0200, Juergen Gross wrote:

My point was to avoid the sysfs accesses in libxl in order to support
BSD as well and to reduce the complexity.


As a slight aside to this, can't libxl use libusb for a lot of this
stuff and therefore avoid being Linux specific?

http://libusb.info/ claims to support Linux, OS X, Windows, Windows CE,
Android, OpenBSD/NetBSD, Haiku.. Interestingly FreeBSD is missing there
but I don't think that need to be a blocker.

I don't see a problem with adding libusb to our set of dependencies, and
it's certainly got to be better than (re)implementing a bunch of sysfs
stuff (which I presume is what libusb does under the hood anyway).


+1


Juergen


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH v5 0/3] net/xen: Clean up

2015-06-17 Thread Julien Grall
Hi,

The first 2 patches were originally part of the Xen 64KB series [1]. Although
I think they can go without waiting the rest of the 64KB series.

The third patch has been added in the v4.

Regards,

Cc: Wei Liu wei.l...@citrix.com
Cc: Ian Campbell ian.campb...@citrix.com
Cc: David Vrabel david.vra...@citrix.com
Cc: Konrad Rzeszutek Wilk konrad.w...@oracle.com
Cc: Boris Ostrovsky boris.ostrov...@oracle.com
Cc: net...@vger.kernel.org

[1] http://lkml.org/lkml/2015/5/14/533

Julien Grall (3):
  net/xen-netfront: Correct printf format in xennet_get_responses
  net/xen-netback: Remove unused code in xenvif_rx_action
  net/xen-netback: Don't mix hexa and decimal with 0x in the printf
format

 drivers/net/xen-netback/netback.c | 19 +++
 drivers/net/xen-netfront.c|  2 +-
 2 files changed, 8 insertions(+), 13 deletions(-)

-- 
2.1.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH v5 3/3] net/xen-netback: Don't mix hexa and decimal with 0x in the printf format

2015-06-17 Thread Julien Grall
Prepend 0x to all %x in order to avoid confusion while reading when there is
other decimal value in the log.

Also replace some of the hexadecimal print to decimal to uniformize the
format with netfront.

Signed-off-by: Julien Grall julien.gr...@citrix.com
Acked-by: Ian Campbell ian.campb...@citrix.com
Cc: Wei Liu wei.l...@citrix.com
Cc: net...@vger.kernel.org

---
Changes in v5:
- Fix commit message
- Add Ian's ack.

Changes in v4:
- Patch added
---
 drivers/net/xen-netback/netback.c | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/net/xen-netback/netback.c 
b/drivers/net/xen-netback/netback.c
index ba3ae30..11bd9d8 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -748,7 +748,7 @@ static int xenvif_count_requests(struct xenvif_queue *queue,
slots++;
 
if (unlikely((txp-offset + txp-size)  PAGE_SIZE)) {
-   netdev_err(queue-vif-dev, Cross page boundary, 
txp-offset: %x, size: %u\n,
+   netdev_err(queue-vif-dev, Cross page boundary, 
txp-offset: %u, size: %u\n,
 txp-offset, txp-size);
xenvif_fatal_tx_err(queue-vif);
return -EINVAL;
@@ -874,7 +874,7 @@ static inline void xenvif_grant_handle_set(struct 
xenvif_queue *queue,
if (unlikely(queue-grant_tx_handle[pending_idx] !=
 NETBACK_INVALID_HANDLE)) {
netdev_err(queue-vif-dev,
-  Trying to overwrite active handle! pending_idx: 
%x\n,
+  Trying to overwrite active handle! pending_idx: 
0x%x\n,
   pending_idx);
BUG();
}
@@ -887,7 +887,7 @@ static inline void xenvif_grant_handle_reset(struct 
xenvif_queue *queue,
if (unlikely(queue-grant_tx_handle[pending_idx] ==
 NETBACK_INVALID_HANDLE)) {
netdev_err(queue-vif-dev,
-  Trying to unmap invalid handle! pending_idx: %x\n,
+  Trying to unmap invalid handle! pending_idx: 
0x%x\n,
   pending_idx);
BUG();
}
@@ -1243,7 +1243,7 @@ static void xenvif_tx_build_gops(struct xenvif_queue 
*queue,
/* No crossing a page as the payload mustn't fragment. */
if (unlikely((txreq.offset + txreq.size)  PAGE_SIZE)) {
netdev_err(queue-vif-dev,
-  txreq.offset: %x, size: %u, end: %lu\n,
+  txreq.offset: %u, size: %u, end: %lu\n,
   txreq.offset, txreq.size,
   (unsigned long)(txreq.offset~PAGE_MASK) + 
txreq.size);
xenvif_fatal_tx_err(queue-vif);
@@ -1593,12 +1593,12 @@ static inline void xenvif_tx_dealloc_action(struct 
xenvif_queue *queue)
queue-pages_to_unmap,
gop - queue-tx_unmap_ops);
if (ret) {
-   netdev_err(queue-vif-dev, Unmap fail: nr_ops %tx ret 
%d\n,
+   netdev_err(queue-vif-dev, Unmap fail: nr_ops %tu ret 
%d\n,
   gop - queue-tx_unmap_ops, ret);
for (i = 0; i  gop - queue-tx_unmap_ops; ++i) {
if (gop[i].status != GNTST_okay)
netdev_err(queue-vif-dev,
-   host_addr: %llx handle: %x 
status: %d\n,
+   host_addr: 0x%llx handle: 
0x%x status: %d\n,
   gop[i].host_addr,
   gop[i].handle,
   gop[i].status);
@@ -1731,7 +1731,7 @@ void xenvif_idx_unmap(struct xenvif_queue *queue, u16 
pending_idx)
queue-mmap_pages[pending_idx], 1);
if (ret) {
netdev_err(queue-vif-dev,
-  Unmap fail: ret: %d pending_idx: %d host_addr: %llx 
handle: %x status: %d\n,
+  Unmap fail: ret: %d pending_idx: %d host_addr: %llx 
handle: 0x%x status: %d\n,
   ret,
   pending_idx,
   tx_unmap_op.host_addr,
-- 
2.1.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH v5 1/3] net/xen-netfront: Correct printf format in xennet_get_responses

2015-06-17 Thread Julien Grall
rx-status is an int16_t, print it using %d rather than %u in order to
have a meaningful value when the field is negative.

Also use %u rather than %x for rx-offset.

Signed-off-by: Julien Grall julien.gr...@citrix.com
Reviewed-by: David Vrabel david.vra...@citrix.com
Cc: Konrad Rzeszutek Wilk konrad.w...@oracle.com
Cc: Boris Ostrovsky boris.ostrov...@oracle.com
Cc: net...@vger.kernel.org

---
Changes in v4:
- Use %u for the rx-offset because offset is unsigned

Changes in v3:
- Use %d for the rx-offset too.

Changes in v2:
- Add David's Reviewed-by
---
 drivers/net/xen-netfront.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
index e031c94..281720f 100644
--- a/drivers/net/xen-netfront.c
+++ b/drivers/net/xen-netfront.c
@@ -733,7 +733,7 @@ static int xennet_get_responses(struct netfront_queue 
*queue,
if (unlikely(rx-status  0 ||
 rx-offset + rx-status  PAGE_SIZE)) {
if (net_ratelimit())
-   dev_warn(dev, rx-offset: %x, size: %u\n,
+   dev_warn(dev, rx-offset: %u, size: %d\n,
 rx-offset, rx-status);
xennet_move_rx_slot(queue, skb, ref);
err = -EINVAL;
-- 
2.1.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH v5 2/3] net/xen-netback: Remove unused code in xenvif_rx_action

2015-06-17 Thread Julien Grall
The variables old_req_cons and ring_slots_used are assigned but never
used since commit 1650d5455bd2dc6b5ee134bd6fc1a3236c266b5b xen-netback:
always fully coalesce guest Rx packets.

Signed-off-by: Julien Grall julien.gr...@citrix.com
Acked-by: Wei Liu wei.l...@citrix.com
Cc: Ian Campbell ian.campb...@citrix.com
Cc: net...@vger.kernel.org

---
Changes in v2:
- Add Wei's Acked-by
---
 drivers/net/xen-netback/netback.c | 5 -
 1 file changed, 5 deletions(-)

diff --git a/drivers/net/xen-netback/netback.c 
b/drivers/net/xen-netback/netback.c
index 0d25943..ba3ae30 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -515,14 +515,9 @@ static void xenvif_rx_action(struct xenvif_queue *queue)
 
while (xenvif_rx_ring_slots_available(queue, XEN_NETBK_RX_SLOTS_MAX)
(skb = xenvif_rx_dequeue(queue)) != NULL) {
-   RING_IDX old_req_cons;
-   RING_IDX ring_slots_used;
-
queue-last_rx_time = jiffies;
 
-   old_req_cons = queue-rx.req_cons;
XENVIF_RX_CB(skb)-meta_slots_used = xenvif_gop_skb(skb, npo, 
queue);
-   ring_slots_used = queue-rx.req_cons - old_req_cons;
 
__skb_queue_tail(rxq, skb);
}
-- 
2.1.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] Modified RTDS scheduler to use an event-driven model instead of polling.

2015-06-17 Thread Meng Xu
Hi Dagaen,

First, please try not top post in the ML. :-) As the one you just
sent, it is better to reply to the comment instead of top reply. :-)

2015-06-16 23:03 GMT-07:00 Dagaen Golomb dgol...@seas.upenn.edu:
 Thanks for the reply, budget enforcement in the scheduler timer makes sense.
 I think I have an idea of what he wants done now.

Great! It is better to summarize your understanding of the design and
send to the ML to check if your understanding is correct as Chong did
for the improved toolstack of RTDS scheduler. This will save rounds of
patches.

Thanks,

Meng



---
Meng Xu
PhD Student in Computer and Information Science
University of Pennsylvania
http://www.cis.upenn.edu/~mengxu/

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC v2 13/15] Update Posted-Interrupts Descriptor during vCPU scheduling

2015-06-17 Thread Wu, Feng


 -Original Message-
 From: Jan Beulich [mailto:jbeul...@suse.com]
 Sent: Wednesday, June 10, 2015 2:44 PM
 To: Wu, Feng
 Cc: andrew.coop...@citrix.com; george.dun...@eu.citrix.com; Tian, Kevin;
 Zhang, Yang Z; xen-devel@lists.xen.org; k...@xen.org
 Subject: Re: [RFC v2 13/15] Update Posted-Interrupts Descriptor during vCPU
 scheduling
 
  On 08.05.15 at 11:07, feng...@intel.com wrote:
  --- a/xen/arch/x86/hvm/vmx/vmx.c
  +++ b/xen/arch/x86/hvm/vmx/vmx.c
  @@ -1711,6 +1711,131 @@ static void vmx_handle_eoi(u8 vector)
   __vmwrite(GUEST_INTR_STATUS, status);
   }
 
  +static void vmx_pi_desc_update(struct vcpu *v, int old_state)
  +{
  +struct pi_desc *pi_desc = v-arch.hvm_vmx.pi_desc;
  +struct pi_desc old, new;
  +unsigned long flags;
  +
  +if ( !iommu_intpost )
  +return;
 
 Considering how the adjustment to start_vmx(), this at best should
 be an ASSERT() (if a check is needed at all).
 
  +switch ( v-runstate.state )
  +{
  +case RUNSTATE_runnable:
  +case RUNSTATE_offline:
  +/*
  + * We don't need to send notification event to a non-running
  + * vcpu, the interrupt information will be delivered to it before
  + * VM-ENTRY when the vcpu is scheduled to run next time.
  + */
  +pi_desc-sn = 1;
  +
  +/*
  + * If the state is transferred from RUNSTATE_blocked,
  + * we should set 'NV' feild back to posted_intr_vector,
  + * so the Posted-Interrupts can be delivered to the vCPU
  + * by VT-d HW after it is scheduled to run.
  + */
  +if ( old_state == RUNSTATE_blocked )
  +{
  +do
  +{
  +old.control = new.control = pi_desc-control;
  +new.nv = posted_intr_vector;
  +}
  +while ( cmpxchg(pi_desc-control, old.control, new.control)
  +!= old.control );
 
 So why is it okay to do the SN update non-atomically when the
 vector update is done atomically?
 
 Also the curly braces do _not_ go on separate lines for do/while
 constructs.
 
 And then do you really need to atomically update the whole 64 bits
 here, rather than just the nv field? By not making it a bit field
 you could perhaps just write_atomic() it?


Thinking more about this, maybe write_atomic() is not good for this, we need use
the LOCK prefix when updating the posted-interrupt descriptor.

Thanks,
Feng


  +
  +   /*
  +* Delete the vCPU from the related block list
  +* if we are resuming from blocked state
  +*/
  +   spin_lock_irqsave(per_cpu(blocked_vcpu_lock,
  + v-pre_pcpu), flags);
  +   list_del(v-blocked_vcpu_list);
  +   spin_unlock_irqrestore(per_cpu(blocked_vcpu_lock,
  +  v-pre_pcpu), flags);
  +}
  +break;
  +
  +case RUNSTATE_blocked:
  +/*
  + * The vCPU is blocked on the block list.
  + * Add the blocked vCPU on the list of the
  + * vcpu-pre_pcpu, which is the destination
  + * of the wake-up notification event.
  + */
  +v-pre_pcpu = v-processor;
 
 Is latching this upon runstate change really enough? I.e. what about
 the v-processor changes that sched_move_domain() or individual
 schedulers do? Or if it really just matters on which CPU's blocked list
 the vCPU is (while its -processor changing subsequently doesn't
 matter) I'd like to see the field named more after its purpose (e.g.
 pi_block_cpu; list and lock should btw also have a connection to PI
 in their names).
 
 In the end, if the placement on a list followed v-processor, you
 would likely get away without the extra new field. Are there
 synchronization constraints speaking against such a model?
 
  +spin_lock_irqsave(per_cpu(blocked_vcpu_lock,
  +  v-pre_pcpu), flags);
  +list_add_tail(v-blocked_vcpu_list,
  +  per_cpu(blocked_vcpu, v-pre_pcpu));
  +spin_unlock_irqrestore(per_cpu(blocked_vcpu_lock,
  +   v-pre_pcpu), flags);
  +
  +do
  +{
  +old.control = new.control = pi_desc-control;
  +
  +/*
  + * We should not block the vCPU if
  + * an interrupt was posted for it.
  + */
  +
  +if ( old.on == 1 )
 
 I think I said so elsewhere, but just in case I didn't: Please don't
 compare boolean fields with 1 - e.g. in the case here if ( old.on )
 suffices.
 
  +{
  +/*
  + * The vCPU will be removed from the block list
  + * during its state transferring from RUNSTATE_blocked
  + * to RUNSTATE_runnable after the following tasklet
  + * is scheduled to run.
  + */
 
 Precise comments please - I don't think the scheduling 

Re: [Xen-devel] [RFC v2 13/15] Update Posted-Interrupts Descriptor during vCPU scheduling

2015-06-17 Thread Wu, Feng


 -Original Message-
 From: Jan Beulich [mailto:jbeul...@suse.com]
 Sent: Tuesday, June 16, 2015 5:25 PM
 To: Wu, Feng
 Cc: andrew.coop...@citrix.com; george.dun...@eu.citrix.com; Tian, Kevin;
 Zhang, Yang Z; xen-devel@lists.xen.org; k...@xen.org
 Subject: RE: [RFC v2 13/15] Update Posted-Interrupts Descriptor during vCPU
 scheduling
 
  On 16.06.15 at 10:56, feng...@intel.com wrote:
 
 
  -Original Message-
  From: Jan Beulich [mailto:jbeul...@suse.com]
  Sent: Tuesday, June 16, 2015 4:29 PM
  To: Wu, Feng
  Cc: andrew.coop...@citrix.com; george.dun...@eu.citrix.com; Tian, Kevin;
  Zhang, Yang Z; xen-devel@lists.xen.org; k...@xen.org
  Subject: RE: [RFC v2 13/15] Update Posted-Interrupts Descriptor during
 vCPU
  scheduling
 
   On 16.06.15 at 10:13, feng...@intel.com wrote:
   From: Jan Beulich [mailto:jbeul...@suse.com]
   Sent: Tuesday, June 16, 2015 3:53 PM
On 16.06.15 at 02:17, feng...@intel.com wrote:
From: Jan Beulich [mailto:jbeul...@suse.com]
Sent: Wednesday, June 10, 2015 2:44 PM
 On 08.05.15 at 11:07, feng...@intel.com wrote:
 +
 +   /*
 +* Delete the vCPU from the related block list
 +* if we are resuming from blocked state
 +*/
 +   spin_lock_irqsave(per_cpu(blocked_vcpu_lock,
 + v-pre_pcpu), flags);
 +   list_del(v-blocked_vcpu_list);
 +   spin_unlock_irqrestore(per_cpu(blocked_vcpu_lock,
 +  v-pre_pcpu), flags);
 +}
 +break;
 +
 +case RUNSTATE_blocked:
 +/*
 + * The vCPU is blocked on the block list.
 + * Add the blocked vCPU on the list of the
 + * vcpu-pre_pcpu, which is the destination
 + * of the wake-up notification event.
 + */
 +v-pre_pcpu = v-processor;
   
Is latching this upon runstate change really enough? I.e. what about
the v-processor changes that sched_move_domain() or individual
schedulers do? Or if it really just matters on which CPU's blocked 
list
the vCPU is (while its -processor changing subsequently doesn't
matter) I'd like to see the field named more after its purpose (e.g.
pi_block_cpu; list and lock should btw also have a connection to PI
in their names).
   
Yes, It doesn't matter if vCPU changes. The key point is that we put
the vCPU on a pCPU list and we change the NDST field to this pCPU,
then the wakeup notification event will get there. You are right, I
need to rename them to reflect the real purpose of it.
   
   
In the end, if the placement on a list followed v-processor, you
would likely get away without the extra new field. Are there
synchronization constraints speaking against such a model?
   
I don't understand this quit well. Do you mean using 'v-processor'
as the pCPU list for the blocked vCPUs? Then what about 'v-processor'
changes, seems we cannot handle this case.
  
   That was the question - does anything speak against such a model?
  
   Do you mean still using v-processor as the pCPU to store the blocked
 vCPU?
 
  Not to store, but to indicate, but yes, the question still is regarding
  the need for the new field. I'm fine with adding the field if there is a
  proper explanation for it being needed; I'm not going to agree to add
  new fields when existing ones can serve the purpose.
 
  Fair enough. The basic reason for adding this new field is 'v-processor' 
  can
  be changed behind, so we lost control of it. This is straightforward way I
  can think of right now.
 
 This is the obvious part you state. What needs to be explained is
 why it would be impossible (or a t least a bad idea) to move the
 vCPU between blocked lists when its v-processor changes.

The key point here is that we put the blocked vCPU in the pCPU list and
update the 'NDST' field in posted-interrupt descriptor for the wakeup
notification event. Here are some reasons why using 'v-processor' as
the pCPU to store the blocked vCPUs is not a good idea:

There may be many places where 'v-processor' gets changed, we need
to find all of them and in each place, see if the vCPU is currently blocked, if
it is, remove it from the old pCPU list and insert to the new pCPU list (need
spinlock operations), we need also update the 'NDST' filed for the wakeup
notification event. Besides that, the individual scheduler can change
'v-processor', which means we may need to add some code specific to PI
purpose to the scheduler code, I don't think this is a good way. On the other
hand, if you use the current solution, we can get the following benefit:
- The logic is clear and simple.
- We only need to update the posted-interrupt descriptor before the vCPU
is blocked, i.e. once the 'NDST' filed is updated before blocking the vCPU,
we don't need to change it even 'v-processor' is changed.
- We have a central 

Re: [Xen-devel] [v3][PATCH 07/16] hvmloader/pci: skip reserved ranges

2015-06-17 Thread Chen, Tiejun

On 2015/6/17 16:05, Jan Beulich wrote:

On 17.06.15 at 09:54, tiejun.c...@intel.com wrote:

On 2015/6/17 15:19, Jan Beulich wrote:

On 17.06.15 at 09:10, tiejun.c...@intel.com wrote:

Yeah, this may waste some spaces in this worst case but I this think
this can guarantee our change don't impact on the original expectation,
right?


Some space may be multiple Gb (e.g. the frame buffer of a graphics


Sure.


card), which is totally unacceptable.



But then I don't understand what's your way. How can we fit all pci
devices just with the smallest power-of-2 region enclosing the reserved
device memory?

For example, the whole pci memory is sitting at
[0xa000, 0xa200]. And there are two PCI devices, A and B. Note
each device needs to be allocated with 0x100. So if without
concerning RMRR,

A. [0xa000,0xa100]
B. [0xa100,0xa200]

But if one RMRR resides at [0xa0f0, 0xa1f0] which obviously
generate its own alignment with 0x100. So the pci memory is expended
as [0xa000, 0xa300], right?

Then actually the whole pci memory can be separated three segments like,

#1. [0xa000, 0xa0f0]
#2. [0xa0f0, 0xa1f0] - RMRR would occupy
#3. [0xa1f0, 0xa300]

So just #3 can suffice to allocate but just for one device, right?


Right, i.e. this isn't even sufficient - you need [a000,a3ff]
to fit everything (but of course you can put smaller BARs into the
unused ranges [a000,a0ef] and [a1f0,a1ff]).


Yes, I knew there's this sort of hole that we should use efficiently as 
you said. And I also thought about this way previously but current pci 
allocation framework isn't friend to implement this easily,


/* Assign iomem and ioport resources in descending order of size. */
for ( i = 0; i  nr_bars; i++ )
{

I mean it isn't easy to calculate what's the most sufficient size in 
advance, and its also difficult to find to fit a appropriate pci device 
into those holes, so see below,



That's why I said it's not going to be tricky to get all corner cases
right _and_ not use up more space than needed.


ought to work out the smallest power-of-2 region enclosing the


Okay. I remember the smallest size of a given PCI I/O space is 8 bytes,
and the smallest size of a PCI memory space is 16 bytes. So

/* At least 16 bytes to align a PCI BAR size. */
uint64_t align = 16;

reserved_start = memory_map.map[j].addr;
reserved_size = memory_map.map[j].size;

reserved_start = (reserved_star + align)  ~(align - 1);
reserved_size = (reserved_size + align)  ~(align - 1);

Is this correct?


Simply aligning the region doesn't help afaict. You need to fit it
with the other MMIO allocations.


I guess you're saying just those mmio allocations conflicting with RMRR?
But we don't know these exact addresses until we finalize to allocate
them, right?


That's the point - you need to allocate them _around_ the reserved
regions.



Something hits me to generate another idea,

#1. Still allocate all devices as before.
#2. Lookup all actual bars to check if they're conflicting RMRR

We can skip these bars to keep zero. Then later it would make lookup easily.

#3. Need to reallocate these conflicting bars.
#3.1 Trying to reallocate them with the remaining resources
#3.2 If the remaining resources aren't enough, we need to allocate them 
from high_mem_resource.


I just feel this way may be easy and better. And even, this way also can 
help terminate the preexisting allocation failures, right?


Thanks
Tiejun

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [qemu-mainline test] 58648: regressions - FAIL

2015-06-17 Thread osstest service user
flight 58648 qemu-mainline real [real]
http://logs.test-lab.xenproject.org/osstest/logs/58648/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-amd64   5 xen-build fail REGR. vs. 58551
 build-i3865 xen-build fail REGR. vs. 58551
 build-amd64-xsm   5 xen-build fail REGR. vs. 58551
 build-i386-xsm5 xen-build fail REGR. vs. 58551
 build-armhf   5 xen-build fail REGR. vs. 58551
 build-armhf-xsm   5 xen-build fail REGR. vs. 58551

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-pvh-amd   1 build-check(1)   blocked  n/a
 build-amd64-libvirt   1 build-check(1)   blocked  n/a
 build-i386-libvirt1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-pvh-intel  1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt   1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-xsm   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-credit2   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-xsm   1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-multivcpu  1 build-check(1)   blocked  n/a
 test-amd64-amd64-pair 1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-xsm1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-arndale   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-sedf  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-sedf-pin  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt  1 build-check(1)   blocked  n/a
 test-amd64-i386-xl1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl   1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-sedf  1 build-check(1)   blocked  n/a
 test-amd64-i386-qemuu-rhel6hvm-intel  1 build-check(1) blocked n/a
 test-armhf-armhf-xl-multivcpu  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemuu-debianhvm-amd64  1 build-check(1)blocked n/a
 test-amd64-i386-qemuu-rhel6hvm-amd  1 build-check(1)   blocked n/a
 build-armhf-libvirt   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsm  1 build-check(1)blocked n/a
 test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm  1 build-check(1) blocked n/a
 test-amd64-i386-xl-qemuu-debianhvm-amd64  1 build-check(1) blocked n/a
 test-amd64-i386-freebsd10-i386  1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-cubietruck  1 build-check(1)   blocked  n/a
 test-amd64-i386-pair  1 build-check(1)   blocked  n/a
 test-amd64-i386-freebsd10-amd64  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-sedf-pin  1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-xsm   1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-credit2   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemuu-ovmf-amd64  1 build-check(1) blocked n/a
 test-amd64-i386-xl-qemuu-ovmf-amd64  1 build-check(1)  blocked n/a
 test-amd64-amd64-xl-qemuu-winxpsp3  1 build-check(1)   blocked n/a
 test-amd64-i386-xl-qemuu-winxpsp3-vcpus1  1 build-check(1) blocked n/a
 test-amd64-amd64-xl-qemuu-win7-amd64  1 build-check(1) blocked n/a
 test-amd64-i386-xl-qemuu-win7-amd64  1 build-check(1)  blocked n/a
 test-amd64-i386-xl-qemuu-winxpsp3  1 build-check(1)   blocked  n/a

version targeted for testing:
 qemuu93f6d1c16036aaf34055d16f54ea770fb8d6d280
baseline version:
 qemuu0a2df857a7038c75379cc575de5d4be4c0ac629e


People who touched revisions under test:
  Alberto Garcia be...@igalia.com
  Alex Bennée alex.ben...@linaro.org
  Alexander Graf ag...@suse.de
  Andrew Jones drjo...@redhat.com
  Aurelien Jarno aurel...@aurel32.net
  Aurelio C. Remonda aurelioremo...@gmail.com
  Benoit Canet benoit.ca...@nodalink.com
  Benoît Canet benoit.ca...@nodalink.com
  Christian Borntraeger borntrae...@de.ibm.com
  Cole Robinson crobi...@redhat.com
  Dave Airlie airl...@redhat.com
  Eric Auger eric.au...@linaro.org
  Fam Zheng f...@redhat.com
  Gerd Hoffmann kra...@redhat.com
  Johan Karlsson johan.karls...@enea.com
  John Snow js...@redhat.com
  Kevin Wolf 

Re: [Xen-devel] [PATCH v2] x86: synchronize PCI config space access decoding

2015-06-17 Thread Jan Beulich
 On 16.06.15 at 20:26, andrew.coop...@citrix.com wrote:
 On 16/06/15 09:09, Jan Beulich wrote:
 On 15.06.15 at 17:32, andrew.coop...@citrix.com wrote:
 On 15/06/15 15:30, Jan Beulich wrote:
 @@ -2439,9 +2434,19 @@ struct hvm_ioreq_server *hvm_select_iore
  
  type = IOREQ_TYPE_PCI_CONFIG;
  addr = ((uint64_t)sbdf  32) |
 -   CF8_ADDR_HI(cf8) |
 CF8_ADDR_LO(cf8) |
 (p-addr  3);
 +/* AMD extended configuration space access? */
 +if ( CF8_ADDR_HI(cf8) 
 + d-arch.x86_vendor == X86_VENDOR_AMD 
 + d-arch.x86 = 0x10  d-arch.x86 = 0x17 )
 +{
 +uint64_t msr_val;
 +
 +if ( !rdmsr_safe(MSR_AMD64_NB_CFG, msr_val) 
 We now have several common paths which read this MSR looking for CF8_EXT.

 I think it would make sense to probe this on boot and have a
 cpu_has_amd_cf8_ext rather than repeatedly sampling an off-cpu MSR,
 although this would require synchronising it across all northbridges in
 emulate privileged op.

 Alternatively, it might just be better to unconditionally enable it
 during startup (as Linux does) and prevent dom0 from playing, which
 would avoid the need to synchronise updates to it.
 You just repeat what you said for v1, without taking into
 consideration my reply thereto: Us not using this method
 ourselves, we should honor and play by what Dom0 does.
 
 Sorry - I had completely forgotten that this was a v2, and had already
 asked this question.
 
 However, hvm_select_ioreq_server() it not a rare function to call, and I
 am still concerned with the overhead.

And I can only repeat that the MSR isn't being accessed unless an
apparent extended access is being seen.

 It turns out that MSR_AMD64_NB_CFG is unconditionally RAZ and has all
 writes discarded, so no HVM guest will ever be in a position to
 legitimately use AMD extended configuration access.

Where have you found that? The register (named NB_CFG1 in
newer families' BKGDs) is clearly r/w.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC v2 13/15] Update Posted-Interrupts Descriptor during vCPU scheduling

2015-06-17 Thread Wu, Feng


 -Original Message-
 From: Jan Beulich [mailto:jbeul...@suse.com]
 Sent: Wednesday, June 10, 2015 2:44 PM
 To: Wu, Feng
 Cc: andrew.coop...@citrix.com; george.dun...@eu.citrix.com; Tian, Kevin;
 Zhang, Yang Z; xen-devel@lists.xen.org; k...@xen.org
 Subject: Re: [RFC v2 13/15] Update Posted-Interrupts Descriptor during vCPU
 scheduling
 
  On 08.05.15 at 11:07, feng...@intel.com wrote:
  --- a/xen/arch/x86/hvm/vmx/vmx.c
  +++ b/xen/arch/x86/hvm/vmx/vmx.c
  @@ -1711,6 +1711,131 @@ static void vmx_handle_eoi(u8 vector)
   __vmwrite(GUEST_INTR_STATUS, status);
   }
 
  +static void vmx_pi_desc_update(struct vcpu *v, int old_state)
  +{
  +struct pi_desc *pi_desc = v-arch.hvm_vmx.pi_desc;
  +struct pi_desc old, new;
  +unsigned long flags;
  +
  +if ( !iommu_intpost )
  +return;
 
 Considering how the adjustment to start_vmx(), this at best should
 be an ASSERT() (if a check is needed at all).
 
  +switch ( v-runstate.state )
  +{
  +case RUNSTATE_runnable:
  +case RUNSTATE_offline:
  +/*
  + * We don't need to send notification event to a non-running
  + * vcpu, the interrupt information will be delivered to it before
  + * VM-ENTRY when the vcpu is scheduled to run next time.
  + */
  +pi_desc-sn = 1;
  +
  +/*
  + * If the state is transferred from RUNSTATE_blocked,
  + * we should set 'NV' feild back to posted_intr_vector,
  + * so the Posted-Interrupts can be delivered to the vCPU
  + * by VT-d HW after it is scheduled to run.
  + */
  +if ( old_state == RUNSTATE_blocked )
  +{
  +do
  +{
  +old.control = new.control = pi_desc-control;
  +new.nv = posted_intr_vector;
  +}
  +while ( cmpxchg(pi_desc-control, old.control, new.control)
  +!= old.control );
 
 So why is it okay to do the SN update non-atomically when the
 vector update is done atomically?
 
 Also the curly braces do _not_ go on separate lines for do/while
 constructs.
 
 And then do you really need to atomically update the whole 64 bits
 here, rather than just the nv field? By not making it a bit field
 you could perhaps just write_atomic() it?
 
  +
  +   /*
  +* Delete the vCPU from the related block list
  +* if we are resuming from blocked state
  +*/
  +   spin_lock_irqsave(per_cpu(blocked_vcpu_lock,
  + v-pre_pcpu), flags);
  +   list_del(v-blocked_vcpu_list);
  +   spin_unlock_irqrestore(per_cpu(blocked_vcpu_lock,
  +  v-pre_pcpu), flags);
  +}
  +break;
  +
  +case RUNSTATE_blocked:
  +/*
  + * The vCPU is blocked on the block list.
  + * Add the blocked vCPU on the list of the
  + * vcpu-pre_pcpu, which is the destination
  + * of the wake-up notification event.
  + */
  +v-pre_pcpu = v-processor;
 
 Is latching this upon runstate change really enough? I.e. what about
 the v-processor changes that sched_move_domain() or individual
 schedulers do? Or if it really just matters on which CPU's blocked list
 the vCPU is (while its -processor changing subsequently doesn't
 matter) I'd like to see the field named more after its purpose (e.g.
 pi_block_cpu; list and lock should btw also have a connection to PI
 in their names).
 
 In the end, if the placement on a list followed v-processor, you
 would likely get away without the extra new field. Are there
 synchronization constraints speaking against such a model?
 
  +spin_lock_irqsave(per_cpu(blocked_vcpu_lock,
  +  v-pre_pcpu), flags);
  +list_add_tail(v-blocked_vcpu_list,
  +  per_cpu(blocked_vcpu, v-pre_pcpu));
  +spin_unlock_irqrestore(per_cpu(blocked_vcpu_lock,
  +   v-pre_pcpu), flags);
  +
  +do
  +{
  +old.control = new.control = pi_desc-control;
  +
  +/*
  + * We should not block the vCPU if
  + * an interrupt was posted for it.
  + */
  +
  +if ( old.on == 1 )
 
 I think I said so elsewhere, but just in case I didn't: Please don't
 compare boolean fields with 1 - e.g. in the case here if ( old.on )
 suffices.
 
  +{
  +/*
  + * The vCPU will be removed from the block list
  + * during its state transferring from RUNSTATE_blocked
  + * to RUNSTATE_runnable after the following tasklet
  + * is scheduled to run.
  + */
 
 Precise comments please - I don't think the scheduling of the
 tasklet makes any difference; I suppose it's the execution of it
 that does.
 
  +tasklet_schedule(v-vcpu_wakeup_tasklet);
  +

Re: [Xen-devel] [v3][PATCH 07/16] hvmloader/pci: skip reserved ranges

2015-06-17 Thread Jan Beulich
 On 17.06.15 at 09:54, tiejun.c...@intel.com wrote:
 On 2015/6/17 15:19, Jan Beulich wrote:
 On 17.06.15 at 09:10, tiejun.c...@intel.com wrote:
 Yeah, this may waste some spaces in this worst case but I this think
 this can guarantee our change don't impact on the original expectation,
 right?

 Some space may be multiple Gb (e.g. the frame buffer of a graphics
 
 Sure.
 
 card), which is totally unacceptable.

 
 But then I don't understand what's your way. How can we fit all pci 
 devices just with the smallest power-of-2 region enclosing the reserved 
 device memory?
 
 For example, the whole pci memory is sitting at
 [0xa000, 0xa200]. And there are two PCI devices, A and B. Note 
 each device needs to be allocated with 0x100. So if without 
 concerning RMRR,
 
 A. [0xa000,0xa100]
 B. [0xa100,0xa200]
 
 But if one RMRR resides at [0xa0f0, 0xa1f0] which obviously 
 generate its own alignment with 0x100. So the pci memory is expended 
 as [0xa000, 0xa300], right?
 
 Then actually the whole pci memory can be separated three segments like,
 
 #1. [0xa000, 0xa0f0]
 #2. [0xa0f0, 0xa1f0] - RMRR would occupy
 #3. [0xa1f0, 0xa300]
 
 So just #3 can suffice to allocate but just for one device, right?

Right, i.e. this isn't even sufficient - you need [a000,a3ff]
to fit everything (but of course you can put smaller BARs into the
unused ranges [a000,a0ef] and [a1f0,a1ff]).
That's why I said it's not going to be tricky to get all corner cases
right _and_ not use up more space than needed.

 ought to work out the smallest power-of-2 region enclosing the

 Okay. I remember the smallest size of a given PCI I/O space is 8 bytes,
 and the smallest size of a PCI memory space is 16 bytes. So

 /* At least 16 bytes to align a PCI BAR size. */
 uint64_t align = 16;

 reserved_start = memory_map.map[j].addr;
 reserved_size = memory_map.map[j].size;

 reserved_start = (reserved_star + align)  ~(align - 1);
 reserved_size = (reserved_size + align)  ~(align - 1);

 Is this correct?

 Simply aligning the region doesn't help afaict. You need to fit it
 with the other MMIO allocations.
 
 I guess you're saying just those mmio allocations conflicting with RMRR? 
 But we don't know these exact addresses until we finalize to allocate 
 them, right?

That's the point - you need to allocate them _around_ the reserved
regions.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] Modified RTDS scheduler to use an event-driven model instead of polling.

2015-06-17 Thread Dario Faggioli
[Let me just reply to this... Then I'll jump to the end of the thread]

On Wed, 2015-06-17 at 01:24 -0400, Dagaen Golomb wrote:

  If replenishments were done by themsevles, then the scheduler may
  decide to wait for some period of time, and in this same time period a
  replenishment
  could change what should be executing.
 
  Well, the scheduler then would better *not* decide to wait for any
  period of time. It better act like this: as soon as a replenishment
  happens, and a new deadline is assigned to a vCPU, put the vCPU in the
  runqueue, in the proper position; and if such a newly replenished vCPU
  should now preempt any other, currently running, vCPU, 'tickle' the pCPU
  in question and let it reschedule, and pick up the newly replenished
  vCPU.
 
 Yes, this is an option. However, I thought this would actually be an
 option you would
 not like. 

How so... I've been arguing for this the whole time?!?! :-O

I'm sure I've put down a sketch of what I think the replenishment
function should do in my first or second email in the thread, and I'm
sure that involved a call to runq_tickle()

 For the replenishment routine to know when to call the
 scheduler and when
 not to, it basically has to perform the the scheduler's logic, at
 which point if there is a
 change... why doesn't it just do it (i.e., be the scheduler) instead
 of adding logic and
 possibly overhead to call it.
 
It has to call runq_tickle(), which is not the scheduler's logic, is
the function that decides whether a pCPU should reschedule.

runq_tickle(), right now, is called from within rt_vcpu_wake(), which
makes perfect sense, and from within rt_context_saved(), which makes no
sense. It should be called from within rt_vcpu_wake() and from the
replenishment handling routine.

Scheduling, to me, is __runq_pick() + the if() in rt_schedule() that
checks whether scurr-cur_deadline=snext-cur_deadline.

Anyway, I've zero interest in turning this into a fight over
terminology... If you want to call runq_tickle() the scheduler, go
ahead, it would just make communication a bit more difficult, but I'm up
for the challenge! :-)

Oh, BTW, while we're here, __runq_pick() being called from a bunch of
places, outside of rt_schedule(), is also something I never liked (you
can go check that in the archives), and I also blame the confusion
between scheduling and replenishmets, for the fact that such a thing is
necessary. I seriously hope this can be fixed too.

 Also, I'd like to point out that when I said things like make more 
 intelligent
 decision handling, I meant exactly things like the above: using as much
 information as possible to prevent scheduler invocations. The current patch
 doesn't reschedule for any replenishment.

How does it not? you're arming s_timer to fire at either next
replenishment or budget enforcement point in time...

  Firstly, it doesn't even look at
 the run queue: replenishments here can only happen on miss.
 
... Oh, wait, I know! I know! I know! You mean it doesn't tickle at any
replenishment, or it doesn't context switch at any replenishment,
don't you?

(See, I'm good at adapting to the new terminology! :-D)

  So, see? Transient  violation of EDF is just there, no matter the
  approach!
 
 Of course, violation during the scheduler is possible for any implementation 
 :(
 
Yeah, well, it was you that said that which should stick to your design
to prevent violations, which would have crept in if going with my
solution. Bha, anyway, it's probably best to let go...

  I really don't see what you mean here. There won't be anything like that
  to take care of. Xen offers timers as an abstraction, and deals with
  them itself. You only need to take care of properly serializing
  rt_schedule() and the timer handling routine, for the code sections that
  require that.
 
 This is good to know, thanks. The question would then change to how does
 Xen handle such an occurrence, but I think this is irrelevant now that I know
 which side you had in mind - that is, that the replenishment handler would
 decide when the scheduler should be invoked.
 
It should call runq_tickle(), yes.

  It would be a hard pitch for a gEDF scheduler with a
  similar accounting period where the gEDF policy could be violated.
 
  There's no accounting period of any sort. It's called overhead!
 
 This part was referring to if we had two timers without extensive 
 communication,
 cooperation, or information/logic sharing.
 
I'm in the dark about what this extensive communication, cooperation,
or information/logic sharing could be, I have to admit.

If it means that the replenishment handler should call runq_tickle(),
then yes, let me state it once more: replenishment handling routine
should call runq_tickle() (other places, with the only other exception
of rt_vcpu_wake(), shouldn't).

  You also may know that there is an EDF implementation in mainline Linux,
  and you may would like to go have a look there too. I'm not including
  any example from there 

Re: [Xen-devel] [PATCH 16/27] tools/libxl: Infrastructure for reading a libxl migration v2 stream

2015-06-17 Thread Wen Congyang
On 06/15/2015 09:44 PM, Andrew Cooper wrote:
 From: Ross Lagerwall ross.lagerw...@citrix.com
 
 Signed-off-by: Ross Lagerwall ross.lagerw...@citrix.com
 Signed-off-by: Andrew Cooper andrew.coop...@citrix.com
 CC: Ian Campbell ian.campb...@citrix.com
 CC: Ian Jackson ian.jack...@eu.citrix.com
 CC: Wei Liu wei.l...@citrix.com
 ---
  tools/libxl/Makefile|1 +
  tools/libxl/libxl_internal.h|   39 
  tools/libxl/libxl_stream_read.c |  485 
 +++
  3 files changed, 525 insertions(+)
  create mode 100644 tools/libxl/libxl_stream_read.c
 
 diff --git a/tools/libxl/Makefile b/tools/libxl/Makefile
 index cc9c152..c71c5fe 100644
 --- a/tools/libxl/Makefile
 +++ b/tools/libxl/Makefile
 @@ -94,6 +94,7 @@ LIBXL_OBJS = flexarray.o libxl.o libxl_create.o libxl_dm.o 
 libxl_pci.o \
   libxl_dom.o libxl_exec.o libxl_xshelp.o libxl_device.o \
   libxl_internal.o libxl_utils.o libxl_uuid.o \
   libxl_json.o libxl_aoutils.o libxl_numa.o libxl_vnuma.o 
 \
 + libxl_stream_read.o \
   libxl_save_callout.o _libxl_save_msgs_callout.o \
   libxl_qmp.o libxl_event.o libxl_fork.o $(LIBXL_OBJS-y)
  LIBXL_OBJS += libxl_genid.o
 diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
 index 101994f..4f33cb8 100644
 --- a/tools/libxl/libxl_internal.h
 +++ b/tools/libxl/libxl_internal.h
 @@ -19,6 +19,8 @@
  
  #include libxl_osdeps.h /* must come before any other headers */
  
 +#include libxl_sr_stream_format.h
 +
  #include assert.h
  #include dirent.h
  #include errno.h
 @@ -3121,6 +3123,42 @@ typedef void libxl__domain_create_cb(libxl__egc *egc,
   libxl__domain_create_state*,
   int rc, uint32_t domid);
  
 +/* State for manipulating a libxl migration v2 stream */
 +typedef struct libxl__stream_read_state libxl__stream_read_state;
 +
 +struct libxl__stream_read_state {
 +/* filled by the user */
 +libxl__ao *ao;
 +int fd;
 +void (*completion_callback)(libxl__egc *egc,
 +libxl__domain_create_state *dcs,
 +int rc);
 +/* Private */
 +int rc;
 +bool running;
 +libxl__datacopier_state dc;
 +size_t expected_len;
 +libxl_sr_hdr hdr;
 +libxl_sr_rec_hdr rec_hdr;
 +void *rec_body;
 +};
 +
 +_hidden void libxl__stream_read_start(libxl__egc *egc,
 +  libxl__stream_read_state *stream);
 +
 +_hidden void libxl__stream_read_continue(libxl__egc *egc,
 + libxl__stream_read_state *stream);
 +
 +_hidden void libxl__stream_read_abort(libxl__egc *egc,
 +  libxl__stream_read_state *stream, int 
 rc);
 +
 +static inline bool libxl__stream_read_inuse(
 +const libxl__stream_read_state *stream)
 +{
 +return stream-running;
 +}
 +
 +
  struct libxl__domain_create_state {
  /* filled in by user */
  libxl__ao *ao;
 @@ -3137,6 +3175,7 @@ struct libxl__domain_create_state {
  libxl__stub_dm_spawn_state dmss;
  /* If we're not doing stubdom, we use only dmss.dm,
   * for the non-stubdom device model. */
 +libxl__stream_read_state srs;
  libxl__save_helper_state shs;
  /* necessary if the domain creation failed and we have to destroy it */
  libxl__domain_destroy_state dds;
 diff --git a/tools/libxl/libxl_stream_read.c b/tools/libxl/libxl_stream_read.c
 new file mode 100644
 index 000..9cdaadf
 --- /dev/null
 +++ b/tools/libxl/libxl_stream_read.c
 @@ -0,0 +1,485 @@
 +/*
 + * Copyright (C) 2015  Citrix Ltd.
 + *
 + * This program is free software; you can redistribute it and/or modify
 + * it under the terms of the GNU Lesser General Public License as published
 + * by the Free Software Foundation; version 2.1 only. with the special
 + * exception on linking described in file LICENSE.
 + *
 + * This program is distributed in the hope that it will be useful,
 + * but WITHOUT ANY WARRANTY; without even the implied warranty of
 + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 + * GNU Lesser General Public License for more details.
 + */
 +
 +#include libxl_osdeps.h /* must come before any other headers */
 +
 +#include libxl_internal.h
 +
 +/*
 + * Infrastructure for reading and acting on the contents of a libxl migration
 + * stream. There are a lot of moving parts here.
 + *
 + * Entry points from outside:
 + *  - libxl__stream_read_start()
 + * - Set up reading a stream from the start.
 + *
 + *  - libxl__stream_read_continue()
 + * - Set up reading the next record from a started stream.
 + *
 + * The principle loop functionality involves reading the stream header, then
 + * reading a record at time and acting upon it.  It follows the callbacks:
 + *
 + *  - stream_header_done()
 + *  - 

Re: [Xen-devel] [RFC v2 13/15] Update Posted-Interrupts Descriptor during vCPU scheduling

2015-06-17 Thread Jan Beulich
 On 17.06.15 at 09:00, feng...@intel.com wrote:
 From: Jan Beulich [mailto:jbeul...@suse.com]
 Sent: Wednesday, June 10, 2015 2:44 PM
  On 08.05.15 at 11:07, feng...@intel.com wrote:
  --- a/xen/include/xen/sched.h
  +++ b/xen/include/xen/sched.h
  @@ -142,6 +142,8 @@ struct vcpu
 
   int  processor;
 
  +int  pre_pcpu;
 
 This again doesn't belong into the common structure (and should
 be accompanied by a comment, and should be unsigned int).
 
 I added three new member to 'struct vcpu' in this series:
   - struct tasklet   vcpu_wakeup_tasklet;
   - struct list_head blocked_vcpu_list;
   - int  pre_pcpu;
 
 I am trying to find a place specific to vmx to put them in, but the only one
 I find is ' struct arch_vmx_struct ', however, I don't think this is a good
 place, since it contains all vt-x stuff defined in the SDM. Do you think
 'struct hvm_vcpu' is the right place?

No, arch_vmx_struct is. Taking active_list and active_cpu as
example, the structure clearly isn't limited to fields defined by
the SDM.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [v3][PATCH 07/16] hvmloader/pci: skip reserved ranges

2015-06-17 Thread Chen, Tiejun

On 2015/6/16 17:40, Jan Beulich wrote:

On 16.06.15 at 11:29, tiejun.c...@intel.com wrote:

I'm trying to walk into this direction:

  /*
   * We'll skip all space overlapping with reserved memory later,
   * so we need to increase mmio_total to compensate them.
   */
  for ( j = 0; j  memory_map.nr_map ; j++ )
  {
  uint64_t conflict_size = 0;
  if ( memory_map.map[j].type != E820_RAM )
  {
  reserved_start = memory_map.map[j].addr;
  reserved_size = memory_map.map[j].size;
  reserved_end = reserved_start + reserved_size;
  if ( check_overlap(pci_mem_start, pci_mem_end - pci_mem_start,
 reserved_start, reserved_size) )
  {
  /*
   * Calculate how much mmio range conflict with
   * reserved device memory.
   */
  conflict_size += reserved_size;

  /*
   * But we may need to subtract those sizes beyond the
   * pci memory, [pci_mem_start, pci_mem_end].
   */
  if ( reserved_start  pci_mem_start )
  conflict_size -= (pci_mem_start - reserved_start);
  if ( reserved_end  pci_mem_end )
  conflict_size -= (reserved_end - pci_mem_end);
  }
  }

  if ( conflict_size )
  {
  uint64_t conflict_size = max_t(
  uint64_t, conflict_size, max_bar_sz);
  conflict_size = ~(conflict_size - 1);
  mmio_total += conflict_size;
  }
  }


This last thing goes in the right direction, but is complete overkill
when you have a small reserved region and a huge BAR. You


Yeah, this may waste some spaces in this worst case but I this think 
this can guarantee our change don't impact on the original expectation, 
right?



ought to work out the smallest power-of-2 region enclosing the


Okay. I remember the smallest size of a given PCI I/O space is 8 bytes, 
and the smallest size of a PCI memory space is 16 bytes. So


/* At least 16 bytes to align a PCI BAR size. */
uint64_t align = 16;

reserved_start = memory_map.map[j].addr;
reserved_size = memory_map.map[j].size;

reserved_start = (reserved_star + align)  ~(align - 1);
reserved_size = (reserved_size + align)  ~(align - 1);

Is this correct?


reserved range (albeit there are tricky corner cases to consider).



Yeah, its a little tricky since RMRR always owns a fixed start address, 
so we can't reorder them with all pci bars. I just think at least we 
should provide a correct solution now, then further look into what can 
be optimized. So I think we'd better get conflict_size with 
max(conflict_size, max_bar_sz), right?


Thanks
Tiejun

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 16/27] tools/libxl: Infrastructure for reading a libxl migration v2 stream

2015-06-17 Thread Wen Congyang
On 06/15/2015 09:44 PM, Andrew Cooper wrote:
 From: Ross Lagerwall ross.lagerw...@citrix.com
 
 Signed-off-by: Ross Lagerwall ross.lagerw...@citrix.com
 Signed-off-by: Andrew Cooper andrew.coop...@citrix.com
 CC: Ian Campbell ian.campb...@citrix.com
 CC: Ian Jackson ian.jack...@eu.citrix.com
 CC: Wei Liu wei.l...@citrix.com
 ---
  tools/libxl/Makefile|1 +
  tools/libxl/libxl_internal.h|   39 
  tools/libxl/libxl_stream_read.c |  485 
 +++
  3 files changed, 525 insertions(+)
  create mode 100644 tools/libxl/libxl_stream_read.c
 
 diff --git a/tools/libxl/Makefile b/tools/libxl/Makefile
 index cc9c152..c71c5fe 100644
 --- a/tools/libxl/Makefile
 +++ b/tools/libxl/Makefile
 @@ -94,6 +94,7 @@ LIBXL_OBJS = flexarray.o libxl.o libxl_create.o libxl_dm.o 
 libxl_pci.o \
   libxl_dom.o libxl_exec.o libxl_xshelp.o libxl_device.o \
   libxl_internal.o libxl_utils.o libxl_uuid.o \
   libxl_json.o libxl_aoutils.o libxl_numa.o libxl_vnuma.o 
 \
 + libxl_stream_read.o \
   libxl_save_callout.o _libxl_save_msgs_callout.o \
   libxl_qmp.o libxl_event.o libxl_fork.o $(LIBXL_OBJS-y)
  LIBXL_OBJS += libxl_genid.o
 diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
 index 101994f..4f33cb8 100644
 --- a/tools/libxl/libxl_internal.h
 +++ b/tools/libxl/libxl_internal.h
 @@ -19,6 +19,8 @@
  
  #include libxl_osdeps.h /* must come before any other headers */
  
 +#include libxl_sr_stream_format.h
 +
  #include assert.h
  #include dirent.h
  #include errno.h
 @@ -3121,6 +3123,42 @@ typedef void libxl__domain_create_cb(libxl__egc *egc,
   libxl__domain_create_state*,
   int rc, uint32_t domid);
  
 +/* State for manipulating a libxl migration v2 stream */
 +typedef struct libxl__stream_read_state libxl__stream_read_state;
 +
 +struct libxl__stream_read_state {
 +/* filled by the user */
 +libxl__ao *ao;
 +int fd;
 +void (*completion_callback)(libxl__egc *egc,
 +libxl__domain_create_state *dcs,
 +int rc);
 +/* Private */
 +int rc;
 +bool running;
 +libxl__datacopier_state dc;
 +size_t expected_len;
 +libxl_sr_hdr hdr;
 +libxl_sr_rec_hdr rec_hdr;
 +void *rec_body;
 +};
 +
 +_hidden void libxl__stream_read_start(libxl__egc *egc,
 +  libxl__stream_read_state *stream);
 +
 +_hidden void libxl__stream_read_continue(libxl__egc *egc,
 + libxl__stream_read_state *stream);
 +
 +_hidden void libxl__stream_read_abort(libxl__egc *egc,
 +  libxl__stream_read_state *stream, int 
 rc);
 +
 +static inline bool libxl__stream_read_inuse(
 +const libxl__stream_read_state *stream)
 +{
 +return stream-running;
 +}
 +
 +
  struct libxl__domain_create_state {
  /* filled in by user */
  libxl__ao *ao;
 @@ -3137,6 +3175,7 @@ struct libxl__domain_create_state {
  libxl__stub_dm_spawn_state dmss;
  /* If we're not doing stubdom, we use only dmss.dm,
   * for the non-stubdom device model. */
 +libxl__stream_read_state srs;
  libxl__save_helper_state shs;
  /* necessary if the domain creation failed and we have to destroy it */
  libxl__domain_destroy_state dds;
 diff --git a/tools/libxl/libxl_stream_read.c b/tools/libxl/libxl_stream_read.c
 new file mode 100644
 index 000..9cdaadf
 --- /dev/null
 +++ b/tools/libxl/libxl_stream_read.c
 @@ -0,0 +1,485 @@
 +/*
 + * Copyright (C) 2015  Citrix Ltd.
 + *
 + * This program is free software; you can redistribute it and/or modify
 + * it under the terms of the GNU Lesser General Public License as published
 + * by the Free Software Foundation; version 2.1 only. with the special
 + * exception on linking described in file LICENSE.
 + *
 + * This program is distributed in the hope that it will be useful,
 + * but WITHOUT ANY WARRANTY; without even the implied warranty of
 + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 + * GNU Lesser General Public License for more details.
 + */
 +
 +#include libxl_osdeps.h /* must come before any other headers */
 +
 +#include libxl_internal.h
 +
 +/*
 + * Infrastructure for reading and acting on the contents of a libxl migration
 + * stream. There are a lot of moving parts here.
 + *
 + * Entry points from outside:
 + *  - libxl__stream_read_start()
 + * - Set up reading a stream from the start.
 + *
 + *  - libxl__stream_read_continue()
 + * - Set up reading the next record from a started stream.
 + *
 + * The principle loop functionality involves reading the stream header, then
 + * reading a record at time and acting upon it.  It follows the callbacks:
 + *
 + *  - stream_header_done()
 + *  - 

Re: [Xen-devel] [v3][PATCH 07/16] hvmloader/pci: skip reserved ranges

2015-06-17 Thread Chen, Tiejun

On 2015/6/17 15:19, Jan Beulich wrote:

On 17.06.15 at 09:10, tiejun.c...@intel.com wrote:

On 2015/6/16 17:40, Jan Beulich wrote:

On 16.06.15 at 11:29, tiejun.c...@intel.com wrote:

I'm trying to walk into this direction:

   /*
* We'll skip all space overlapping with reserved memory later,
* so we need to increase mmio_total to compensate them.
*/
   for ( j = 0; j  memory_map.nr_map ; j++ )
   {
   uint64_t conflict_size = 0;
   if ( memory_map.map[j].type != E820_RAM )
   {
   reserved_start = memory_map.map[j].addr;
   reserved_size = memory_map.map[j].size;
   reserved_end = reserved_start + reserved_size;
   if ( check_overlap(pci_mem_start, pci_mem_end - pci_mem_start,
  reserved_start, reserved_size) )
   {
   /*
* Calculate how much mmio range conflict with
* reserved device memory.
*/
   conflict_size += reserved_size;

   /*
* But we may need to subtract those sizes beyond the
* pci memory, [pci_mem_start, pci_mem_end].
*/
   if ( reserved_start  pci_mem_start )
   conflict_size -= (pci_mem_start - reserved_start);
   if ( reserved_end  pci_mem_end )
   conflict_size -= (reserved_end - pci_mem_end);
   }
   }

   if ( conflict_size )
   {
   uint64_t conflict_size = max_t(
   uint64_t, conflict_size, max_bar_sz);
   conflict_size = ~(conflict_size - 1);
   mmio_total += conflict_size;
   }
   }


This last thing goes in the right direction, but is complete overkill
when you have a small reserved region and a huge BAR. You


Yeah, this may waste some spaces in this worst case but I this think
this can guarantee our change don't impact on the original expectation,
right?


Some space may be multiple Gb (e.g. the frame buffer of a graphics


Sure.


card), which is totally unacceptable.



But then I don't understand what's your way. How can we fit all pci 
devices just with the smallest power-of-2 region enclosing the reserved 
device memory?


For example, the whole pci memory is sitting at
[0xa000, 0xa200]. And there are two PCI devices, A and B. Note 
each device needs to be allocated with 0x100. So if without 
concerning RMRR,


A. [0xa000,0xa100]
B. [0xa100,0xa200]

But if one RMRR resides at [0xa0f0, 0xa1f0] which obviously 
generate its own alignment with 0x100. So the pci memory is expended 
as [0xa000, 0xa300], right?


Then actually the whole pci memory can be separated three segments like,

#1. [0xa000, 0xa0f0]
#2. [0xa0f0, 0xa1f0] - RMRR would occupy
#3. [0xa1f0, 0xa300]

So just #3 can suffice to allocate but just for one device, right?

If I'm wrong please correct me.


ought to work out the smallest power-of-2 region enclosing the


Okay. I remember the smallest size of a given PCI I/O space is 8 bytes,
and the smallest size of a PCI memory space is 16 bytes. So

/* At least 16 bytes to align a PCI BAR size. */
uint64_t align = 16;

reserved_start = memory_map.map[j].addr;
reserved_size = memory_map.map[j].size;

reserved_start = (reserved_star + align)  ~(align - 1);
reserved_size = (reserved_size + align)  ~(align - 1);

Is this correct?


Simply aligning the region doesn't help afaict. You need to fit it
with the other MMIO allocations.


I guess you're saying just those mmio allocations conflicting with RMRR? 
But we don't know these exact addresses until we finalize to allocate 
them, right?


Thanks
Tiejun




reserved range (albeit there are tricky corner cases to consider).



Yeah, its a little tricky since RMRR always owns a fixed start address,
so we can't reorder them with all pci bars. I just think at least we
should provide a correct solution now, then further look into what can
be optimized. So I think we'd better get conflict_size with
max(conflict_size, max_bar_sz), right?


As per above - no, this is not an option.

Jan





___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [xen-4.2-testing test] 58616: trouble: blocked/broken/fail/pass

2015-06-17 Thread Jan Beulich
 On 17.06.15 at 03:13, osst...@xenbits.xen.org wrote:
 flight 58616 xen-4.2-testing real [real]
 http://logs.test-lab.xenproject.org/osstest/logs/58616/ 
 
 Failures and problems with tests :-(
 
 Tests which did not succeed and are blocking,
 including tests which could not be run:
  build-amd64-libvirt  3 host-install(3) broken in 58584 REGR. vs. 
 58411
 
 Tests which are failing intermittently (not blocking):
  test-i386-i386-libvirt3 host-install(3)  broken in 58584 pass in 
 58616
  test-amd64-amd64-pair 3 host-install/src_host(3) broken in 58584 pass in 
 58616
  test-amd64-i386-pair  4 host-install/dst_host(4) broken in 58584 pass in 
 58616
  test-amd64-i386-qemuu-freebsd10-amd64 3 host-install(3) broken in 58584 pass 
 in 58616
  test-amd64-i386-xl-win7-amd64  3 host-install(3) broken in 58584 pass in 
 58616
  test-amd64-i386-xl-winxpsp3-vcpus1 3 host-install(3) broken in 58584 pass in 
 58616
  test-amd64-amd64-xl-qemut-winxpsp3 3 host-install(3) broken in 58584 pass in 
 58616
  test-amd64-amd64-xl-multivcpu  3 host-install(3) broken in 58604 pass in 
 58616
  test-i386-i386-xl 3 host-install(3)  broken in 58604 pass in 
 58616
  test-i386-i386-xl-sedf3 host-install(3)  broken in 58604 pass in 
 58616
  test-amd64-i386-xl-qemut-debianhvm-amd64 3 host-install(3) broken in 58604 
 pass in 58616
  test-amd64-amd64-xl-qemut-debianhvm-amd64 3 host-install(3) broken in 58604 
 pass in 58616
  test-i386-i386-xl-qemuu-winxpsp3 3 host-install(3) broken in 58604 pass in 
 58616
  test-i386-i386-xl-qemut-winxpsp3  3 host-install(3)   broken pass in 
 58604

Is there an explanation for all these recurring host install failures?

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [v3][PATCH 07/16] hvmloader/pci: skip reserved ranges

2015-06-17 Thread Jan Beulich
 On 17.06.15 at 09:10, tiejun.c...@intel.com wrote:
 On 2015/6/16 17:40, Jan Beulich wrote:
 On 16.06.15 at 11:29, tiejun.c...@intel.com wrote:
 I'm trying to walk into this direction:

   /*
* We'll skip all space overlapping with reserved memory later,
* so we need to increase mmio_total to compensate them.
*/
   for ( j = 0; j  memory_map.nr_map ; j++ )
   {
   uint64_t conflict_size = 0;
   if ( memory_map.map[j].type != E820_RAM )
   {
   reserved_start = memory_map.map[j].addr;
   reserved_size = memory_map.map[j].size;
   reserved_end = reserved_start + reserved_size;
   if ( check_overlap(pci_mem_start, pci_mem_end - pci_mem_start,
  reserved_start, reserved_size) )
   {
   /*
* Calculate how much mmio range conflict with
* reserved device memory.
*/
   conflict_size += reserved_size;

   /*
* But we may need to subtract those sizes beyond the
* pci memory, [pci_mem_start, pci_mem_end].
*/
   if ( reserved_start  pci_mem_start )
   conflict_size -= (pci_mem_start - reserved_start);
   if ( reserved_end  pci_mem_end )
   conflict_size -= (reserved_end - pci_mem_end);
   }
   }

   if ( conflict_size )
   {
   uint64_t conflict_size = max_t(
   uint64_t, conflict_size, max_bar_sz);
   conflict_size = ~(conflict_size - 1);
   mmio_total += conflict_size;
   }
   }

 This last thing goes in the right direction, but is complete overkill
 when you have a small reserved region and a huge BAR. You
 
 Yeah, this may waste some spaces in this worst case but I this think 
 this can guarantee our change don't impact on the original expectation, 
 right?

Some space may be multiple Gb (e.g. the frame buffer of a graphics
card), which is totally unacceptable.

 ought to work out the smallest power-of-2 region enclosing the
 
 Okay. I remember the smallest size of a given PCI I/O space is 8 bytes, 
 and the smallest size of a PCI memory space is 16 bytes. So
 
 /* At least 16 bytes to align a PCI BAR size. */
 uint64_t align = 16;
 
 reserved_start = memory_map.map[j].addr;
 reserved_size = memory_map.map[j].size;
 
 reserved_start = (reserved_star + align)  ~(align - 1);
 reserved_size = (reserved_size + align)  ~(align - 1);
 
 Is this correct?

Simply aligning the region doesn't help afaict. You need to fit it
with the other MMIO allocations.

 reserved range (albeit there are tricky corner cases to consider).

 
 Yeah, its a little tricky since RMRR always owns a fixed start address, 
 so we can't reorder them with all pci bars. I just think at least we 
 should provide a correct solution now, then further look into what can 
 be optimized. So I think we'd better get conflict_size with 
 max(conflict_size, max_bar_sz), right?

As per above - no, this is not an option.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 23/27] tools/libxl: [RFC] Write checkpoint records into the stream

2015-06-17 Thread Ian Campbell
On Tue, 2015-06-16 at 16:53 +0100, Andrew Cooper wrote:
 On 16/06/15 16:03, Ian Campbell wrote:
  On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote:
  when signalled to do so by libxl__remus_domain_checkpoint_callback()
  I think I saw that Remus wasn't currently working, so I'll let you and
  Hongyang thrash something out before I spend too much effort reviewing
  these last few RFC bits. Unless you think it is worth my having a look
  now?
 
 
 
 Remus was broken by patch 19 in the series, and this patch forms part of
 fixing it again.
 
 I can't find a way of fixing the layering violation in both plain
 migration and Remus, in a readable, bisectable way.
 
 Remus requires identical source and destination toolstacks, and the
 Remus maintainers are happy enough with the break it and fix it up in
 the same series approach.
 
 Now that the series is comeplete, there is some shuffling room to reduce
 the window of breakage, but short of folding patches 19, 21, 23-25
 together, Remus will break.

The report I was referring to thinking I'd seen was that Remus was still
broken even after the complete series was applied i.e. there was still
more to be done.

I'm happy with the transient breakage in this series on this occasion,
but I was proposing not to review until Remus was thought to be working
OK at the end.

Ian.


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 20/27] tools/libxl: Infrastructure for writing a v2 stream

2015-06-17 Thread Yang Hongyang



On 06/15/2015 09:44 PM, Andrew Cooper wrote:
[...]

+
+static void stream_success(libxl__egc *egc,
+   libxl__stream_write_state *stream);
+static void stream_failed(libxl__egc *egc,
+  libxl__stream_write_state *stream, int ret);
+static void stream_done(libxl__egc *egc,
+libxl__stream_write_state *stream);
+
+static void check_stream_finished(libxl__egc *egc,
+  libxl__domain_suspend_state *dcs,


s/dcs/dss/


+  int rc, const char *what);
+
+/* Event callbacks for plain VM. */
+static void stream_header_done(libxl__egc *egc,
+   libxl__datacopier_state *dc,
+   int onwrite, int errnoval);
+static void libxc_header_done(libxl__egc *egc,
+  libxl__datacopier_state *dc,
+  int onwrite, int errnoval);
+/* libxl__xc_domain_save_done() lives here, event-order wise. */
+static void write_toolstack_record(libxl__egc *egc,
+   libxl__stream_write_state *stream);
+static void toolstack_record_done(libxl__egc *egc,
+  libxl__datacopier_state *dc,
+  int onwrite, int errnoval);
+static void write_emulator_record(libxl__egc *egc,
+  libxl__stream_write_state *stream);
+static void emulator_body_done(libxl__egc *egc,
+   libxl__datacopier_state *dc,
+   int onwrite, int errnoval);
+static void emulator_padding_done(libxl__egc *egc,
+  libxl__datacopier_state *dc,
+  int onwrite, int errnoval);
+static void write_end_record(libxl__egc *egc,
+ libxl__stream_write_state *stream);
+static void end_record_done(libxl__egc *egc,
+libxl__datacopier_state *dc,
+int onwrite, int errnoval);
+
+void libxl__stream_write_start(libxl__egc *egc,
+   libxl__stream_write_state *stream)
+{
+libxl__datacopier_state *dc = stream-dc;
+STATE_AO_GC(stream-ao);
+struct libxl_sr_hdr hdr = { 0 };
+int ret = 0;
+
+assert(!stream-running);
+stream-running = true;
+
+memset(dc, 0, sizeof(*dc));
+dc-readwhat = ;
+dc-copywhat = suspend header;
+dc-writewhat = save/migration stream;
+dc-ao = ao;
+dc-readfd = -1;
+dc-writefd = stream-fd;
+dc-maxsz = INT_MAX;
+dc-bytes_to_read = INT_MAX;
+dc-callback = stream_header_done;
+
+ret = libxl__datacopier_start(dc);
+if (ret)
+goto err;
+
+hdr.ident   = htobe64(RESTORE_STREAM_IDENT);
+hdr.version = htobe32(RESTORE_STREAM_VERSION);
+hdr.options = htobe32(0);
+
+libxl__datacopier_prefixdata(egc, dc, hdr, sizeof(hdr));
+return;
+
+ err:
+assert(ret);
+stream_failed(egc, stream, ret);
+}
+
+void libxl__stream_write_abort(libxl__egc *egc,
+   libxl__stream_write_state *stream, int rc)
+{
+stream_failed(egc, stream, rc);
+}
+
+static void stream_success(libxl__egc *egc, libxl__stream_write_state *stream)
+{
+stream-rc = 0;
+stream-running = false;
+
+stream_done(egc, stream);
+}
+
+static void stream_failed(libxl__egc *egc,
+  libxl__stream_write_state *stream, int rc)
+{
+assert(rc);
+stream-rc = rc;
+
+if (stream-running) {
+stream-running = false;
+stream_done(egc, stream);
+}
+}
+
+static void stream_done(libxl__egc *egc,
+libxl__stream_write_state *stream)
+{
+libxl__domain_suspend_state *dss = CONTAINER_OF(stream, *dss, sws);
+
+assert(!stream-running);
+
+check_stream_finished(egc, dss, stream-rc, stream);
+}
+
+static void check_stream_finished(libxl__egc *egc,
+  libxl__domain_suspend_state *dss,
+  int rc, const char *what)
+{
+libxl__stream_write_state *stream = dss-sws;
+STATE_AO_GC(dss-ao);
+
+LOG(INFO, Task '%s' joining (rc %d), what, rc);
+
+if (rc  !stream-joined_rc) {
+bool skip = false;
+/* First reported failure from joining tasks.  Tear everything down */
+stream-joined_rc = rc;
+
+if (libxl__stream_write_inuse(dss-sws)) {
+skip = true;
+libxl__stream_write_abort(egc, dss-sws, rc);
+}
+
+if (libxl__save_helper_inuse(dss-shs)) {
+skip = true;
+libxl__save_helper_abort(egc, dss-shs);
+}
+
+/* There is at least one more active task to join - wait for its
+   callback */
+if ( skip )
+return;
+}
+
+if (libxl__stream_write_inuse(dss-sws))
+LOG(DEBUG, stream still in use);
+else if (libxl__save_helper_inuse(dss-shs))
+

Re: [Xen-devel] [RFC v2 13/15] Update Posted-Interrupts Descriptor during vCPU scheduling

2015-06-17 Thread Jan Beulich
 On 17.06.15 at 08:54, feng...@intel.com wrote:
 From: Jan Beulich [mailto:jbeul...@suse.com]
 Sent: Wednesday, June 10, 2015 2:44 PM
  On 08.05.15 at 11:07, feng...@intel.com wrote:
  +switch ( v-runstate.state )
  +{
  +case RUNSTATE_runnable:
  +case RUNSTATE_offline:
  +/*
  + * We don't need to send notification event to a non-running
  + * vcpu, the interrupt information will be delivered to it before
  + * VM-ENTRY when the vcpu is scheduled to run next time.
  + */
  +pi_desc-sn = 1;
  +
  +/*
  + * If the state is transferred from RUNSTATE_blocked,
  + * we should set 'NV' feild back to posted_intr_vector,
  + * so the Posted-Interrupts can be delivered to the vCPU
  + * by VT-d HW after it is scheduled to run.
  + */
  +if ( old_state == RUNSTATE_blocked )
  +{
  +do
  +{
  +old.control = new.control = pi_desc-control;
  +new.nv = posted_intr_vector;
  +}
  +while ( cmpxchg(pi_desc-control, old.control, new.control)
  +!= old.control );
 
 So why is it okay to do the SN update non-atomically when the
 vector update is done atomically?
 
 Also the curly braces do _not_ go on separate lines for do/while
 constructs.
 
 And then do you really need to atomically update the whole 64 bits
 here, rather than just the nv field? By not making it a bit field
 you could perhaps just write_atomic() it?

 
 Thinking more about this, maybe write_atomic() is not good for this, we need 
 use
 the LOCK prefix when updating the posted-interrupt descriptor.

The LOCK prefix can't be applied to simple loads and stores; they're
implicitly atomic.

Also - please remember to trim your replies.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 24/27] tools/libx{c, l}: [RFC] Introduce restore_callbacks.checkpoint()

2015-06-17 Thread Yang Hongyang



On 06/15/2015 09:44 PM, Andrew Cooper wrote:

And call it when a checkpoint record is found in the libxc stream.

Signed-off-by: Andrew Cooper andrew.coop...@citrix.com
CC: Ian Campbell ian.campb...@citrix.com
CC: Ian Jackson ian.jack...@eu.citrix.com
CC: Wei Liu wei.l...@citrix.com
---
  tools/libxc/include/xenguest.h |3 +++
  tools/libxc/xc_sr_restore.c|   15 ++-
  tools/libxl/libxl_save_msgs_gen.pl |2 +-
  3 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/tools/libxc/include/xenguest.h b/tools/libxc/include/xenguest.h
index 7581263..b0d27ed 100644
--- a/tools/libxc/include/xenguest.h
+++ b/tools/libxc/include/xenguest.h
@@ -102,6 +102,9 @@ struct restore_callbacks {
  int (*toolstack_restore)(uint32_t domid, const uint8_t *buf,
  uint32_t size, void* data);

+/* A checkpoint record has been found in the stream */
+int (*checkpoint)(void* data);
+
  /* to be provided as the last argument to each callback function */
  void* data;
  };
diff --git a/tools/libxc/xc_sr_restore.c b/tools/libxc/xc_sr_restore.c
index 9e27dba..5e0f817 100644
--- a/tools/libxc/xc_sr_restore.c
+++ b/tools/libxc/xc_sr_restore.c
@@ -1,5 +1,7 @@
  #include arpa/inet.h

+#include assert.h
+
  #include xc_sr_common.h

  /*
@@ -472,7 +474,7 @@ static int handle_page_data(struct xc_sr_context *ctx, 
struct xc_sr_record *rec)
  static int handle_checkpoint(struct xc_sr_context *ctx)
  {
  xc_interface *xch = ctx-xch;
-int rc = 0;
+int rc = 0, ret;
  unsigned i;

  if ( !ctx-restore.checkpointed )
@@ -482,6 +484,13 @@ static int handle_checkpoint(struct xc_sr_context *ctx)
  goto err;
  }

+ret = ctx-restore.callbacks-checkpoint(ctx-restore.callbacks-data);
+if ( ret )
+{
+rc = -1;
+goto err;
+}
+
  if ( ctx-restore.buffer_all_records )
  {
  IPRINTF(All records buffered);
@@ -735,6 +744,10 @@ int xc_domain_restore2(xc_interface *xch, int io_fd, 
uint32_t dom,
  ctx.restore.checkpointed = checkpointed_stream;
  ctx.restore.callbacks = callbacks;

+/* Sanity checks for callbacks. */
+if (checkpointed_stream)


coding style


+assert(callbacks-checkpoint);
+
  IPRINTF(In experimental %s, __func__);
  DPRINTF(fd %d, dom %u, hvm %u, pae %u, superpages %d
  , checkpointed_stream %d, io_fd, dom, hvm, pae,
diff --git a/tools/libxl/libxl_save_msgs_gen.pl 
b/tools/libxl/libxl_save_msgs_gen.pl
index 6b4b65e..36b279e 100755
--- a/tools/libxl/libxl_save_msgs_gen.pl
+++ b/tools/libxl/libxl_save_msgs_gen.pl
@@ -25,7 +25,7 @@ our @msgs = (
  'unsigned long', 'total'] ],
  [  3, 'scxA',   suspend, [] ],
  [  4, 'scxA',   postcopy, [] ],
-[  5, 'scxA',   checkpoint, [] ],
+[  5, 'srcxA',   checkpoint, [] ],
  [  6, 'scxA',   switch_qemu_logdirty,  [qw(int domid
unsigned enable)] ],
  #toolstack_save  done entirely `by hand'



--
Thanks,
Yang.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [v3][PATCH 07/16] hvmloader/pci: skip reserved ranges

2015-06-17 Thread Jan Beulich
 On 17.06.15 at 10:26, tiejun.c...@intel.com wrote:
 Something hits me to generate another idea,
 
 #1. Still allocate all devices as before.
 #2. Lookup all actual bars to check if they're conflicting RMRR
 
 We can skip these bars to keep zero. Then later it would make lookup easily.
 
 #3. Need to reallocate these conflicting bars.
 #3.1 Trying to reallocate them with the remaining resources
 #3.2 If the remaining resources aren't enough, we need to allocate them 
 from high_mem_resource.

That's possible onyl for 64-bit BARs.

 I just feel this way may be easy and better. And even, this way also can 
 help terminate the preexisting allocation failures, right?

I think this complicates things rather than simplifying them: The
more passes (and adjustments to previous settings) you do, the
more error prone the whole logic will become. It may well be that
you need to basically re-write what is there right now...

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 COLOPre 07/13] tools/libxl: Update libxl__domain_unpause() to support qemu-xen

2015-06-17 Thread Yang Hongyang



On 06/16/2015 12:22 AM, Wei Liu wrote:

On Mon, Jun 15, 2015 at 09:29:55AM +0800, Yang Hongyang wrote:



On 06/12/2015 08:33 PM, Wei Liu wrote:

On Mon, Jun 08, 2015 at 11:43:11AM +0800, Yang Hongyang wrote:

Currently, libxl__domain_unpause() only supports
qemu-xen-traditional. Update it to support qemu-xen.

Signed-off-by: Wen Congyang we...@cn.fujitsu.com
Signed-off-by: Yang Hongyang yan...@cn.fujitsu.com


This looks very similar to an existing function called
libxl__domain_resume_device_model. Maybe you don't need to invent a new
function.


---
  tools/libxl/libxl.c | 42 +-
  1 file changed, 33 insertions(+), 9 deletions(-)

diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index d5691dc..5c843c2 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -933,10 +933,37 @@ out:
  return AO_INPROGRESS;
  }

-int libxl__domain_unpause(libxl__gc *gc, uint32_t domid)
+static int libxl__domain_unpause_device_model(libxl__gc *gc, uint32_t domid)
  {
  char *path;
  char *state;
+
+switch (libxl__device_model_version_running(gc, domid)) {
+case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL: {
+uint32_t dm_domid = libxl_get_stubdom_id(CTX, domid);
+
+path = libxl__device_model_xs_path(gc, dm_domid, domid, /state);
+state = libxl__xs_read(gc, XBT_NULL, path);
+if (state != NULL  !strcmp(state, paused)) {


The only difference between your function and
libxl__domain_unpause_device_model is the check for state node. I
think you can just add the check to libxl__domain_resume_device_model
and use that function.


I'm not sure if we change the existing function's behavior will affect the
existing callers, if there's no problem to do so, I will do as what you
said in the next version.



Qemu-dm currently has several states. libxl__domain_resume_device_model
doesn't check the state and writes unconditionally. I think checking
before writing would be an improvement.


fixed, thank you!



Wei.



Wei.
.



--
Thanks,
Yang.

.



--
Thanks,
Yang.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [OSSTEST Nested PATCH v11 6/7] Compose the main recipe of nested test job

2015-06-17 Thread Pang, LongtaoX
  
   Thinking about this leads me to ask another question.  Suppose that a
   bug causes the l1 to lock up completely.  ts-logs-capture will attempt
   to hard reboot a locked-up host.  If it can't fetch any logs, it calls
   target_reboot_hard($ho);
  
   What will that do if $ho refers to the l1 ?  It relies on the power
   method.  Does your nested l1 host have a power method ?
  I'm afraid l1 won't like normal hosts has power cycle operations. Maybe
  we need to simulate it?
 
 Perhaps arrange for an appropriate PowerMethod for hosts which are
 actually guests?
 
Update.
I think maybe we need to refactor 'power_cycle' function in TestSupport.pm. I 
have not try it, something like below?
sub power_cycle ($) {
my ($ho) = @_;
+if (guest_var($ho,enable_nestedhvm,'') =~ m/true/) {
+guest_destroy($ho);
+guest_create($ho);
+guest_await_dhcp_tcp($ho,300);
+guest_check_up($ho);
+} else {
$mjobdb-host_check_allocated($ho);
die refusing to set power state for host $ho-{Name}.
 possibly shared with other jobs\n
if $ho-{SharedMaybeOthers};
power_state($ho, 0);
power_cycle_sleep($ho);
power_state($ho, 1);
+}
}
 Ian.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v4 3/3] net/xen-netback: Don't mix hexa and decimal with 0x in the printf format

2015-06-17 Thread Jan Beulich
 On 16.06.15 at 21:10, julien.gr...@citrix.com wrote:
 Append 0x to all %x in order to avoid while reading when there is other
 decimal value in the log.

To save on the space taken by the format strings you should prefer
%#x over 0x%x (like we do in the hypervisor).

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] Modified RTDS scheduler to use an event-driven model instead of polling.

2015-06-17 Thread Dario Faggioli
On Tue, 2015-06-16 at 22:45 -0700, Meng Xu wrote:

  So here's my proposal, lets see if it fits what you want:

 I will leave this to Dario to answer if it fits what he wants. :-P

  1.) The scheduler does not do any timing,

 Not really. The rt_schedule does the budget enforcement. When a
 current VCPU runs out of budget, the rt_schedule will be invoked by a
 timer (you can refer to the schedule function in xen/sched/schedule.c
 to have a look how the timer is armed/disarmed.). When the rt_schedule
 is invoked, it needs to:
 a) update the budget of the current running VCPU and move it from runq
 to depleted queue;
 b) pick the current highest VCPU from runq and return it as the snext VCPU.
 
 So scheduling logic is still  involved in the rt_schedule function.
 
Correct, thanks Meng.

  2.) replenishments are scheduled via timer at each [next]
  replenishment period. Each
  replenishment resorts the replenished vCPUs, and if any of the first
  #CPUs in the
  runq change, we tickle a pCPU for each change

 This is right.
 
Yeah, sort of. I mean, I'm not sure what Each replenishment resorts the
replenished vCPUs means. If it means that a replenished vCPU is given a
new deadline, and hence needs to move from depleted queue to runqueue,
in a position within runqueue that reflects its new deadline... then, I
agree.

This requires holding the runqueue lock, but we need it anyway, for now.

This may or may not require a pCPU to be tickled, and it's
runq_tickle()'s job to decide whether that is the case and, if yes,
which one.

Also, about this we tickle a pCPU for each change: first of all, as I
just said, that's not _always_ the case, and we should avoid tickling
when it's not necessary. Also, when it's necessary in principle, it
makes few sense to tickle a pCPU twice in a very tight loop (i.e.,
without giving it the chance to reschedule).

So if, for instance, two different replenishments (no matter whether
being serviced together, or in different invocations of the
replenishment timer handler) both determine that pCPU Y is the one that
should be tickled (because it's the one running the vCPU with the latest
deadline, and both the two replenished vCPUs have more imminent deadline
than what's running on Y, but later than what's running everywhere
else), there's no point in tickling twice. Not that doing it would be
harmful, it just won't have any effect. What I mean is much rather that
you can use this information to keep runq_tickle() simple/quick.

Note that this is pretty much what's happening already in runq_tickle(),
thanks to the use of the prv-tickled bitmap, so the point here is
making sure that we tickle a pCPU for each change doesn't mean
removing this, because I think that would be wrong. It's probably
unlikely that this was what Dagaen meant, but I though I better state
things clearly. :-)

  In this case, we can use one timer.

 We actually have two: one for budget enforcement and the other for
 budget replenishment.
 
EXACTLY!! :-)

  We could use the current one as a
  replenishment
  timer, and make it so rt_schedule isn't the default invoked method.
 
  Does this sound similar to what you are suggesting?

 I don't think so, but I will leave it for Dario's for his opinion.
 
use current one as replenishment timer, and make it so rt_schedule
isn't the default invoked method

Please, please, please, don't do that! :-P

 In Dario's suggestion, you just simply remove the update_budget
 function out of rt_schedule. This is because budget enforcement, which
 is the work of rt_schedule, does not naturelly involves budget
 replenishment.
 
Yes. Or at least, that's the core of my suggestion, yes. In more
details, my suggestion includes getting rid of a bunch of other
invocations of scheduling/picking and replenishment related functions
from many other places, as I tried to make clear in my last email.

Perhaps, the fact that I haven't stated that clearly enough since now,
was what was making it a bit harder for you to see what I meant with
'making things simple', etc. I hope I explained myself better now.

 In order to achieve budget replenishment, we need another timer to
 invoke another function (budget_replenish function), that is dedicated
 to that.
 
Yep, indeed.

  I have to ask
  because I thought
  you wouldn't like the idea,

 I guess Dario won't like this idea. :-P (I'm kidding, but it should be
 the case.)
 
You're right, it's not. :-)

 
  and its not really *that* far off from
  what we have now, Its
  a little restructuring so that replenishments occur before any
  scheduling activity and
  the handler checks if switching is needed (basically acting as the
  scheduler) and then
  calls tickle. Sounds like what you had in mind?
 
This need for an strict ordering between replenishments and scheduling
is something that you're (Dagaen) insisting a lot on, while I really
don't see why it would be a good thing.

I've stated countless time that they're independent, which also mean
there's no 

Re: [Xen-devel] [PATCH v4 3/3] net/xen-netback: Don't mix hexa and decimal with 0x in the printf format

2015-06-17 Thread Ian Campbell
On Wed, 2015-06-17 at 10:30 +0100, Julien Grall wrote:
 I see different opinion on whether using 0x% or %#. As I plan to resend 
 a version with the commit message update, shall I use %#?

I think it's mostly pointless bike-shedding over a saving measured in
single digit bytes, use whichever you like.

Ian.


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 2/2] libxc: fix xc_dom_load_elf_symtab

2015-06-17 Thread Wei Liu
On Thu, Jun 11, 2015 at 06:05:20PM +0200, Roger Pau Monne wrote:
 xc_dom_load_elf_symtab was incorrectly trying to perform the same
 calculations already done in elf_parse_bsdsyms when load == 0 is used.
 Instead of trying to repeat the calculations, just trust what
 elf_parse_bsdsyms has already accounted for.
 
 This also simplifies the code by allowing the non-load case to return
 earlier.
 
 Signed-off-by: Roger Pau Monné roger@citrix.com
 Cc: Ian Jackson ian.jack...@eu.citrix.com
 Cc: Stefano Stabellini stefano.stabell...@eu.citrix.com
 Cc: Ian Campbell ian.campb...@citrix.com
 Cc: Wei Liu wei.l...@citrix.com

Acked-by: Wei Liu wei.l...@citrix.com

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [linux-linus test] 58620: tolerable FAIL - PUSHED

2015-06-17 Thread osstest service user
flight 58620 linux-linus real [real]
http://logs.test-lab.xenproject.org/osstest/logs/58620/

Failures :-/ but no regressions.

Tests which are failing intermittently (not blocking):
 test-amd64-amd64-libvirt  3 host-install(3)  broken in 58568 pass in 58620
 test-amd64-i386-xl-qemuu-winxpsp3 3 host-install(3) broken in 58568 pass in 
58620
 test-amd64-i386-rumpuserxen-i386 15 
rumpuserxen-demo-xenstorels/xenstorels.repeat fail pass in 58568

Regressions which are regarded as allowable (not blocking):
 test-amd64-i386-libvirt  11 guest-start  fail   like 58428
 test-amd64-amd64-libvirt 11 guest-start  fail   like 58428
 test-amd64-i386-freebsd10-amd64  9 freebsd-install fail like 58428
 test-amd64-i386-freebsd10-i386  9 freebsd-install  fail like 58428
 test-amd64-i386-libvirt-xsm  11 guest-start  fail   like 58428
 test-armhf-armhf-libvirt-xsm 11 guest-start  fail   like 58428
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop fail like 58428
 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop  fail like 58428
 test-armhf-armhf-libvirt 11 guest-start  fail   like 58428

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-pvh-intel 13 guest-saverestorefail  never pass
 test-amd64-amd64-xl-pvh-amd  11 guest-start  fail   never pass
 test-amd64-amd64-libvirt-xsm 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-sedf-pin 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-xsm  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-sedf 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 12 migrate-support-checkfail never pass
 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stop fail never pass
 test-amd64-i386-xl-qemut-win7-amd64 16 guest-stop  fail never pass
 test-armhf-armhf-xl-multivcpu 12 migrate-support-checkfail  never pass

version targeted for testing:
 linux0f57d86787d8b1076ea8f9cba2a46d534a27
baseline version:
 linuxdf5f4158415b6fc4a2d683c6fdc806be6da176bc


People who touched revisions under test:
  Axel Lin axel@ingics.com
  Dan Williams dan.j.willi...@intel.com
  Daniel Verkamp daniel.verk...@intel.com
  Dave Airlie airl...@redhat.com
  Dave Jiang dave.ji...@intel.com
  David S. Miller da...@davemloft.net
  David Woodhouse david.woodho...@intel.com
  Eric Dumazet eduma...@google.com
  Erik Hugne erik.hu...@ericsson.com
  Govindarajulu Varadarajan _gov...@gmx.com
  Hauke Mehrtens ha...@hauke-m.de
  Herbert Xu herb...@gondor.apana.org.au
  Ingo Molnar mi...@kernel.org
  Jaedon Shin jaedon.s...@gmail.com
  Jeff Kirsher jeffrey.t.kirs...@intel.com
  Jeff Layton jeff.lay...@primarydata.com
  Jens Axboe ax...@fb.com
  Jerome Marchand jmarc...@redhat.com
  Johannes Berg johannes.b...@intel.com
  Jon Mason jdma...@kudzu.us
  Jurgen Kramer gtmkra...@xs4all.nl
  Kan Liang kan.li...@intel.com
  Krzysztof Kozlowski k.kozlow...@samsung.com
  Linus Torvalds torva...@linux-foundation.org
  Ludovic Desroches ludovic.desroc...@atmel.com
  Marcelo Ricardo Leitner marcelo.leit...@gmail.com
  Markos Chandras markos.chand...@imgtec.com
  Masanari Iida standby2...@gmail.com
  Maxime Ripard maxime.rip...@free-electrons.com
  Mel Gorman mgor...@suse.de
  Michael Buesch m...@bues.ch
  Ming Lei tom.leim...@gmail.com
  Neil Horman nhor...@tuxdriver.com
  NeilBrown  ne...@suse.de
  NeilBrown ne...@suse.de
  Nikolay Aleksandrov ra...@blackwall.org
  Peter Zijlstra (Intel) pet...@infradead.org
  Peter Zijlstra pet...@infradead.org
  Rabin Vincent rabin.vinc...@axis.com
  Ralf Baechle r...@linux-mips.org
  Richard Cochran richardcoch...@gmail.com
  Richard Weinberger rich...@nod.at
  Robert Shearman rshea...@brocade.com
  Shaohua Li s...@fb.com
  Takashi Iwai ti...@suse.de
  Thomas Gleixner t...@linutronix.de
  Vinod Koul vinod.k...@intel.com
  Vlad Yasevich vyasev...@gmail.com


jobs:
 build-amd64-xsm  pass
 build-armhf-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-armhf-libvirt  

Re: [Xen-devel] [PATCH 27/27] tools/libxl: Drop all knowledge of toolstack callbacks

2015-06-17 Thread Andrew Cooper
On 17/06/15 11:14, Ian Campbell wrote:
 On Tue, 2015-06-16 at 16:06 +0100, Andrew Cooper wrote:
 On 16/06/15 16:04, Ian Campbell wrote:
 On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote:
 Libxl has now been fully adjusted not to need them.

 Signed-off-by: Andrew Cooper andrew.coop...@citrix.com
 Acked-by: Ian Campbell ian.campb...@citrix.com

 /me looks mournfully at the #28 shaped hole in this series which would
 nuke all the migration v1 code from libxc :-)
 I was going to slip that into v2.  I didn't want to delay posting v1 for
 review, given the proximity of the 4.6 freeze.

 I think I will transcribe the description of the legacy protocol from
 xg_save_restore.h
 That would be good, thanks.

  and code up the legacy protocol in python.
 That would be above and beyond, but don't let me stop you ;-)

Well - it is currently coded up with magic numbers in the conversion
script.  All I was planning to do was make some numbers less magic.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 1/2 linux-next] Revert ufs: fix deadlocks introduced by sb mutex merge

2015-06-17 Thread Jan Kara
On Fri 05-06-15 23:03:48, Al Viro wrote:
 On Fri, Jun 05, 2015 at 07:50:18PM +0100, Al Viro wrote:
  Basically, we have
  i_mutex: file size changes, contents-affecting syscalls.  Per-inode.
  truncate_mutex: block pointers changes.  Per-inode.
  s_lock: block and inode bitmaps changes.  Per-filesystem.
  
  For UFS it's slightly more complicated due to tail packing they are doing 
  for
  short files, but most of that complexity is due to having that stuff handled
  way too deep in call chain.
 
 Oh, lovely... commit 10e5dc
 Author: Evgeniy Dushistov dushis...@mail.ru
 Date:   Sat Jul 1 04:36:24 2006 -0700
 
 [PATCH] ufs: truncate should allocate block for last byte
 
 had removed -truncate() method and missed the fact that vmtrucate() had
 callers outside of -setattr(), such as handling of -prepare_write() partial
 failures and short copies on write(2) in general.
 
 Then we had a long and convoluted series of conversions that ended up with
 vmtruncate() lifted into ufs_write_failed() and replaced with
 truncate_pagecache() in there.
 
 Through all that, everybody (me included) had not noticed that we *still*
 do not free blocks allocated by ufs_write_begin() failing halfway through.
 While we are at it, ufs_write_end() ought to call ufs_write_failed() in
 case when we'd been called after a short copy (and do the same block freeing).
 
 Joy...  Folks, is anybody actively maintaining fs/ufs these days?

Looking into the changelog there wasn't anyone seriously looking into UFS
for at least 5-6 years... Fabian did some cleanups recently but they were
mostly cosmetic. So I don't think it's really maintained. OTOH clearly
there are some users since occasionally someone comes with a bug report.
Welcome to the world where we have ~50 on disk / network filesystems :-|

Honza
-- 
Jan Kara j...@suse.cz
SUSE Labs, CR

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] building error

2015-06-17 Thread Wen Congyang
When I try to build xen on fc21, I meet the following errors:
drivers/net/ath/ath9k/ath9k_ar9003_phy.c: In function ‘ar9003_hw_ani_control’:
drivers/net/ath/ath9k/ath9k_ar9003_phy.c:862:11: error: logical not is only 
applied to the left hand side of comparison [-Werror=logical-not-parentheses]
   if (!on != aniState-ofdmWeakSigDetectOff) {
   ^
drivers/net/ath/ath9k/ath9k_ar9003_phy.c:1016:14: error: logical not is only 
applied to the left hand side of comparison [-Werror=logical-not-parentheses]
   if (!is_on != aniState-mrcCCKOff) {
  ^
cc1: all warnings being treated as errors
bin/rules/drivers/net/ath/ath9k/ath9k_ar9003_phy.c.r:3: recipe for target 
'bin/ath9k_ar9003_phy.o' failed
make[7]: *** [bin/ath9k_ar9003_phy.o] Error 1
make[7]: *** Waiting for unfinished jobs
  [BUILD] bin/nvsvpd.o
drivers/net/ath/ath9k/ath9k_ar5008_phy.c: In function 
‘ar5008_hw_ani_control_old’:
drivers/net/ath/ath9k/ath9k_ar5008_phy.c:1144:11: error: logical not is only 
applied to the left hand side of comparison [-Werror=logical-not-parentheses]
   if (!on != aniState-ofdmWeakSigDetectOff) {
   ^
drivers/net/ath/ath9k/ath9k_ar5008_phy.c: In function 
‘ar5008_hw_ani_control_new’:
drivers/net/ath/ath9k/ath9k_ar5008_phy.c:1310:11: error: logical not is only 
applied to the left hand side of comparison [-Werror=logical-not-parentheses]
   if (!on != aniState-ofdmWeakSigDetectOff) {
   ^
cc1: all warnings being treated as errors
bin/rules/drivers/net/ath/ath9k/ath9k_ar5008_phy.c.r:3: recipe for target 
'bin/ath9k_ar5008_phy.o' failed
make[7]: *** [bin/ath9k_ar5008_phy.o] Error 1
make[7]: Leaving directory '/work/src/xen/tools/firmware/etherboot/ipxe/src'
Makefile:28: recipe for target 'ipxe/src/bin/rtl8139.rom' failed
make[6]: *** [ipxe/src/bin/rtl8139.rom] Error 2

I know my gcc has some problems, and will generate wrong bin file(We have 
discussed
it earlier).
But this building error is new problem. Do we have any plan to use newer ipxe?

I update ipxe to avoid the building error, and I meet the following error:
hw/virtio/virtio-rng.c: In function ‘virtio_rng_device_realize’:
hw/virtio/virtio-rng.c:152:31: error: logical not is only applied to the left 
hand side of comparison [-Werror=logical-not-parentheses]
 if (!vrng-conf.period_ms  0) {
   ^
cc1: all warnings being treated as errors

It seems that qemu have some problems.

Thanks
Wen Congyang

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v4 3/3] net/xen-netback: Don't mix hexa and decimal with 0x in the printf format

2015-06-17 Thread Julien Grall

Hi Ian,

On 17/06/2015 10:25, Ian Campbell wrote:

On Tue, 2015-06-16 at 20:10 +0100, Julien Grall wrote:

Append 0x to all %x in order to avoid while reading when there is other
decimal value in the log.

Also replace some of the hexadecimal print to decimal to uniformize the
format with netfront.

Signed-off-by: Julien Grall julien.gr...@citrix.com
Cc: Wei Liu wei.l...@citrix.com
Cc: Ian Campbell ian.campb...@citrix.com
Cc: net...@vger.kernel.org


You meant s/Append/Prepend/, nonetheless:


I noticed a missing word after avoid in the commit message too. I will 
update to:


Prepend 0x to all %x in order to avoid confusion while reading when 
there is other decimal value in the log.


[...].



Acked-by: Ian Campbell ian.campb...@citrix.com


I see different opinion on whether using 0x% or %#. As I plan to resend 
a version with the commit message update, shall I use %#?


Regards,

--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 20/27] tools/libxl: Infrastructure for writing a v2 stream

2015-06-17 Thread Andrew Cooper
On 17/06/15 02:31, Yang Hongyang wrote:
 +default:
 +ret = ERROR_FAIL;
 +goto err;
 +}
 +
 +ret = libxl__domain_suspend_device_model(gc, dss);

 This is no longer needed, the suspend callback already called
 this function and the emulator context already saved to a file.

 This call will cause Primary's emulator stop under Remus.
 postcopy callback will resume primary. then in checkpoint
 callback, we shouldn't suspend device model. 

It is the result of copying how everything was done previously.  I will
drop it.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2] x86: synchronize PCI config space access decoding

2015-06-17 Thread Jan Beulich
 On 17.06.15 at 11:36, andrew.coop...@citrix.com wrote:
 On 17/06/15 07:29, Jan Beulich wrote:
 On 16.06.15 at 20:26, andrew.coop...@citrix.com wrote:
 It turns out that MSR_AMD64_NB_CFG is unconditionally RAZ and has all
 writes discarded, so no HVM guest will ever be in a position to
 legitimately use AMD extended configuration access.
 Where have you found that? The register (named NB_CFG1 in
 newer families' BKGDs) is clearly r/w.
 
 It is implemented as RAZ/write discard in the hvm msr intercept code,
 and appears to exist only to prevent the guest blowing up in a
 cross-vendor case.

Ah, that's something _we_ do. But the MSR write being discarded
doesn't mean a guest can't legitimately use extended accesses: I
don't think you expect OSes to read back what they wrote? I.e. an
OS enabling this functionality would read zero, write back the value
modified to have the bit set, and go on assuming extended
accesses are okay now.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 16/27] tools/libxl: Infrastructure for reading a libxl migration v2 stream

2015-06-17 Thread Wen Congyang
On 06/17/2015 05:50 PM, Andrew Cooper wrote:
 On 17/06/15 08:57, Wen Congyang wrote:
 +/* Queue up reading the body. */
 +size_t bytes_to_read;
 +
 +switch (rec_hdr-type) {
 +/*
 + * Emulator records want to retain the blob in the pipe, for a 
 further
 + * datacopier call to move elsewhere.  Just read the emulator 
 header.
 + */
 In this case, we should not call ROUNDUP().

 +case REC_TYPE_EMULATOR_CONTEXT:
 +bytes_to_read = sizeof(struct libxl_sr_emulator_hdr);
 +break;
 +
 +default:
 +bytes_to_read = rec_hdr-length;
 +break;
 +}
 +
 +bytes_to_read = ROUNDUP(bytes_to_read, REC_ALIGN_ORDER);
 So, I think it is better to move ROUNDUP to default case.

 Thanks
 Wen Congyang

 
 sizeof(struct libxl_sr_emulator_hdr) is cunningly of the appropriate
 order already.

Yes

 
 I suppose it is probably better to move the roundup into the default
 case and assert() appropriate alignment after the switch()

Do you mean the sub-header must be aligned

Thanks
Wen Congyang

 
 ~Andrew
 .
 


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 16/27] tools/libxl: Infrastructure for reading a libxl migration v2 stream

2015-06-17 Thread Ian Campbell
On Wed, 2015-06-17 at 18:49 +0800, Wen Congyang wrote:
 On 06/17/2015 06:15 PM, Ian Campbell wrote:
  On Wed, 2015-06-17 at 11:09 +0800, Wen Congyang wrote:
  +if (hdr-options  RESTORE_OPT_BIG_ENDIAN) {
  +ret = ERROR_FAIL;
  +LOG(ERROR, Unable to handle big endian streams);
  +goto err;
 
  I think it is better to check if the host is big endian or not.
  
  It's not, there are no big endian ports of Xen today. I think encoding
  that here is fine.
 
 IIRC, arm supports big endian. Do we only use arm+little endian?

Currently, yes. There are plans afoot to support big endian _guests_ but
not big endian hypervisor (and by extension IMHO tools or at least the
migration stream should remain LE). That would be a lot of faff for very
little benefit IMHO.

Ian.


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 27/27] tools/libxl: Drop all knowledge of toolstack callbacks

2015-06-17 Thread Ian Campbell
On Wed, 2015-06-17 at 11:43 +0100, Andrew Cooper wrote:
 On 17/06/15 11:14, Ian Campbell wrote:
  On Tue, 2015-06-16 at 16:06 +0100, Andrew Cooper wrote:
  On 16/06/15 16:04, Ian Campbell wrote:
  On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote:
  Libxl has now been fully adjusted not to need them.
 
  Signed-off-by: Andrew Cooper andrew.coop...@citrix.com
  Acked-by: Ian Campbell ian.campb...@citrix.com
 
  /me looks mournfully at the #28 shaped hole in this series which would
  nuke all the migration v1 code from libxc :-)
  I was going to slip that into v2.  I didn't want to delay posting v1 for
  review, given the proximity of the 4.6 freeze.
 
  I think I will transcribe the description of the legacy protocol from
  xg_save_restore.h
  That would be good, thanks.
 
   and code up the legacy protocol in python.
  That would be above and beyond, but don't let me stop you ;-)
 
 Well - it is currently coded up with magic numbers in the conversion
 script.  All I was planning to do was make some numbers less magic.

That does sound nice, thanks.

Ian.



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 06/17] x86/hvm: revert 82ed8716b fix direct PCI port I/O emulation retry...

2015-06-17 Thread Paul Durrant
 -Original Message-
 From: Paul Durrant [mailto:paul.durr...@citrix.com]
 Sent: 11 June 2015 16:43
 To: xen-de...@lists.xenproject.org
 Cc: Paul Durrant; Keir (Xen.org); Jan Beulich; Andrew Cooper
 Subject: [PATCH v2 06/17] x86/hvm: revert 82ed8716b fix direct PCI port I/O
 emulation retry...
 
 ...and error handling
 
 NOTE: A straight reversion was not possible because of subsequent changes
   in the code so this at least partially a manual reversion.
 
 By limiting hvmemul_do_io_addr() to reps falling within the pages on which
 a reference has already been taken, we can guarantee that calls to
 hvm_copy_to/from_guest_phys() will not hit the HVMCOPY_gfn_paged_out
 or HVMCOPY_gfn_shared cases. Thus we can remove the retry logic from
 the intercept code and simplify it significantly.
 
 Normally hvmemul_do_io_addr() will only reference single page at a time.
 It will, however, take an extra page reference for I/O spanning a page
 boundary.
 
 It is still important to know, upon returning from x86_emulate(), whether
 the number of reps was reduced so the mmio_retry flag is retained for that
 purpose.
 
 Signed-off-by: Paul Durrant paul.durr...@citrix.com

Unfortunately, during further testing with XenServer, I found an issue with 
this patch and also subsequent patches in the series. I will therefore be 
posting a v3 of the series.

  Paul

 Cc: Keir Fraser k...@xen.org
 Cc: Jan Beulich jbeul...@suse.com
 Cc: Andrew Cooper andrew.coop...@citrix.com
 ---
  xen/arch/x86/hvm/emulate.c |   83 --
 --
  xen/arch/x86/hvm/intercept.c   |   52 +
  xen/include/asm-x86/hvm/vcpu.h |2 +-
  3 files changed, 69 insertions(+), 68 deletions(-)
 
 diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
 index 4af70b0..38b8956 100644
 --- a/xen/arch/x86/hvm/emulate.c
 +++ b/xen/arch/x86/hvm/emulate.c
 @@ -60,6 +60,7 @@ static int hvmemul_do_io(
  .type = is_mmio ? IOREQ_TYPE_COPY : IOREQ_TYPE_PIO,
  .addr = addr,
  .size = size,
 +.count = *reps,
  .dir = dir,
  .df = df,
  .data = data,
 @@ -131,15 +132,6 @@ static int hvmemul_do_io(
  HVMIO_dispatched : HVMIO_awaiting_completion;
  vio-io_size = size;
 
 -/*
 - * When retrying a repeated string instruction, force exit to guest after
 - * completion of the retried iteration to allow handling of interrupts.
 - */
 -if ( vio-mmio_retrying )
 -*reps = 1;
 -
 -p.count = *reps;
 -
  if ( dir == IOREQ_WRITE )
  {
  if ( !data_is_addr )
 @@ -153,17 +145,9 @@ static int hvmemul_do_io(
  switch ( rc )
  {
  case X86EMUL_OKAY:
 -case X86EMUL_RETRY:
 -*reps = p.count;
  p.state = STATE_IORESP_READY;
 -if ( !vio-mmio_retry )
 -{
 -hvm_io_assist(p);
 -vio-io_state = HVMIO_none;
 -}
 -else
 -/* Defer hvm_io_assist() invocation to hvm_do_resume(). */
 -vio-io_state = HVMIO_handle_mmio_awaiting_completion;
 +hvm_io_assist(p);
 +vio-io_state = HVMIO_none;
  break;
  case X86EMUL_UNHANDLEABLE:
  {
 @@ -294,17 +278,67 @@ int hvmemul_do_io_addr(
  bool_t is_mmio, paddr_t addr, unsigned long *reps,
  unsigned int size, uint8_t dir, bool_t df, paddr_t ram_gpa)
  {
 -struct page_info *ram_page;
 +struct vcpu *v = current;
 +unsigned long ram_gmfn = paddr_to_pfn(ram_gpa);
 +struct page_info *ram_page[2];
 +int nr_pages = 0;
 +unsigned long count;
  int rc;
 
 -rc = hvmemul_acquire_page(paddr_to_pfn(ram_gpa), ram_page);
 +
 +rc = hvmemul_acquire_page(ram_gmfn, ram_page[nr_pages]);
  if ( rc != X86EMUL_OKAY )
 -return rc;
 +goto out;
 +
 +nr_pages++;
 +
 +/* Detemine how many reps will fit within this page */
 +for ( count = 0; count  *reps; count++ )
 +{
 +paddr_t start, end;
 +
 +if ( df )
 +{
 +start = ram_gpa - count * size;
 +end = ram_gpa + size - 1;
 +}
 +else
 +{
 +start = ram_gpa;
 +end = ram_gpa + (count + 1) * size - 1;
 +}
 +
 +if ( paddr_to_pfn(start) != ram_gmfn ||
 + paddr_to_pfn(end) != ram_gmfn )
 +break;
 +}
 +
 +if ( count == 0 )
 +{
 +/*
 + * This access must span two pages, so grab a reference to
 + * the next page and do a single rep.
 + */
 +rc = hvmemul_acquire_page(df ? ram_gmfn - 1 : ram_gmfn + 1,
 +  ram_page[nr_pages]);
 +if ( rc != X86EMUL_OKAY )
 +goto out;
 
 -rc = hvmemul_do_io(is_mmio, addr, reps, size, dir, df, 1,
 +nr_pages++;
 +count = 1;
 +}
 +
 +rc = hvmemul_do_io(is_mmio, addr, count, size, dir, df, 1,
 (uint64_t)ram_gpa);
 +if ( rc == 

Re: [Xen-devel] [xen-4.2-testing test] 58616: trouble: blocked/broken/fail/pass

2015-06-17 Thread Ian Campbell
On Wed, 2015-06-17 at 09:17 +0100, Jan Beulich wrote:
  On 17.06.15 at 03:13, osst...@xenbits.xen.org wrote:
  flight 58616 xen-4.2-testing real [real]
  http://logs.test-lab.xenproject.org/osstest/logs/58616/ 
  
  Failures and problems with tests :-(
  
  Tests which did not succeed and are blocking,
  including tests which could not be run:
   build-amd64-libvirt  3 host-install(3) broken in 58584 REGR. vs. 
  58411
  
  Tests which are failing intermittently (not blocking):
   test-i386-i386-libvirt3 host-install(3)  broken in 58584 pass in 
  58616
   test-amd64-amd64-pair 3 host-install/src_host(3) broken in 58584 pass in 
  58616
   test-amd64-i386-pair  4 host-install/dst_host(4) broken in 58584 pass in 
  58616
   test-amd64-i386-qemuu-freebsd10-amd64 3 host-install(3) broken in 58584 
  pass in 58616
   test-amd64-i386-xl-win7-amd64  3 host-install(3) broken in 58584 pass in 
  58616
   test-amd64-i386-xl-winxpsp3-vcpus1 3 host-install(3) broken in 58584 pass 
  in 58616
   test-amd64-amd64-xl-qemut-winxpsp3 3 host-install(3) broken in 58584 pass 
  in 58616
   test-amd64-amd64-xl-multivcpu  3 host-install(3) broken in 58604 pass in 
  58616
   test-i386-i386-xl 3 host-install(3)  broken in 58604 pass in 
  58616
   test-i386-i386-xl-sedf3 host-install(3)  broken in 58604 pass in 
  58616
   test-amd64-i386-xl-qemut-debianhvm-amd64 3 host-install(3) broken in 58604 
  pass in 58616
   test-amd64-amd64-xl-qemut-debianhvm-amd64 3 host-install(3) broken in 
  58604 pass in 58616
   test-i386-i386-xl-qemuu-winxpsp3 3 host-install(3) broken in 58604 pass in 
  58616
   test-i386-i386-xl-qemut-winxpsp3  3 host-install(3)   broken pass in 
  58604
 
 Is there an explanation for all these recurring host install failures?

merlot0 had forgotten its BIOS settings, which is now fixed but seems to
have taken out most flights in the last 24hours.

Ian.


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC v2 13/15] Update Posted-Interrupts Descriptor during vCPU scheduling

2015-06-17 Thread Wu, Feng


 -Original Message-
 From: Jan Beulich [mailto:jbeul...@suse.com]
 Sent: Wednesday, June 17, 2015 3:56 PM
 To: Wu, Feng
 Cc: andrew.coop...@citrix.com; george.dun...@eu.citrix.com; Tian, Kevin;
 Zhang, Yang Z; xen-devel@lists.xen.org; k...@xen.org
 Subject: RE: [RFC v2 13/15] Update Posted-Interrupts Descriptor during vCPU
 scheduling
 
  On 17.06.15 at 08:54, feng...@intel.com wrote:
  From: Jan Beulich [mailto:jbeul...@suse.com]
  Sent: Wednesday, June 10, 2015 2:44 PM
   On 08.05.15 at 11:07, feng...@intel.com wrote:
   +switch ( v-runstate.state )
   +{
   +case RUNSTATE_runnable:
   +case RUNSTATE_offline:
   +/*
   + * We don't need to send notification event to a non-running
   + * vcpu, the interrupt information will be delivered to it 
   before
   + * VM-ENTRY when the vcpu is scheduled to run next time.
   + */
   +pi_desc-sn = 1;
   +
   +/*
   + * If the state is transferred from RUNSTATE_blocked,
   + * we should set 'NV' feild back to posted_intr_vector,
   + * so the Posted-Interrupts can be delivered to the vCPU
   + * by VT-d HW after it is scheduled to run.
   + */
   +if ( old_state == RUNSTATE_blocked )
   +{
   +do
   +{
   +old.control = new.control = pi_desc-control;
   +new.nv = posted_intr_vector;
   +}
   +while ( cmpxchg(pi_desc-control, old.control,
 new.control)
   +!= old.control );
 
  So why is it okay to do the SN update non-atomically when the
  vector update is done atomically?
 
  Also the curly braces do _not_ go on separate lines for do/while
  constructs.
 
  And then do you really need to atomically update the whole 64 bits
  here, rather than just the nv field? By not making it a bit field
  you could perhaps just write_atomic() it?
 
 
  Thinking more about this, maybe write_atomic() is not good for this, we need
  use
  the LOCK prefix when updating the posted-interrupt descriptor.
 
 The LOCK prefix can't be applied to simple loads and stores; they're
 implicitly atomic.
 

Yes, I know LOCK prefix cannot be used together with 'mov', but
my concern is without LOCK, does it work well when this
write-atomic() operation and hardware setting 'ON' occur at the
same time?

Hardware updates the 'ON' filed like this:
(which is listed in the Spec.)

Hardware performs a coherent atomic read-modify-write
operation of the posted-interrupt descriptor as follows:
- Read contents of the Posted Interrupt Descriptor, claiming
exclusive ownership of its hosting cache-line. If invalid
programming of Posted Interrupt Descriptor is detected,
release ownership of the cache-line, and block the interrupt
request.
- If above checks succeed, retrieve current values of Posted
Interrupt Requests (PIR bits 255:0), Outstanding Notification (ON),
Suppress Notification (SN), Notification Vector (NV), and
Notification Destination (NDST) fields in the Posted Interrupt
Descriptor.
- Modify the following descriptor field values atomically:
* Set bit in PIR corresponding to the Vector field value from
the IRTE
* Compute X = ((ON == 0)  (URG | (SN == 0)))
* If (X == 1), Set ON field.
- Promote the cache-line to be globally observable, so that
the modifications are visible to other caching agents. Hardware
may write-back the cache-line any time after this step.

If we don't use the LOCK prefix, 'NV' filed may be changed between
reading these filed and writing back by hardware, right? But for
'NV', maybe this is not an issue, since hardware doesn't change it,
It only check 'ON', 'SN' and updates 'ON' if needed. And that's
why we need use LOCK prefix to update 'SN', right? What is you
thoughts about this?

Sorry for the long description above, since the hardware guys
advise me to use LOCK prefix when updating the posted-interrupt
descriptor, I just don't want to make any mistake here.

Thanks,
Feng

 Also - please remember to trim your replies.
 
 Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [qemu-mainline test] 58648: regressions - FAIL

2015-06-17 Thread Ian Campbell
On Wed, 2015-06-17 at 07:42 +, osstest service user wrote:
 flight 58648 qemu-mainline real [real]
 http://logs.test-lab.xenproject.org/osstest/logs/58648/
 
 Regressions :-(
 
 Tests which did not succeed and are blocking,
 including tests which could not be run:
  build-amd64   5 xen-build fail REGR. vs. 
 58551
  build-i3865 xen-build fail REGR. vs. 
 58551
  build-amd64-xsm   5 xen-build fail REGR. vs. 
 58551
  build-i386-xsm5 xen-build fail REGR. vs. 
 58551
  build-armhf   5 xen-build fail REGR. vs. 
 58551
  build-armhf-xsm   5 xen-build fail REGR. vs. 
 58551

Seems to be a real upstream issue:

install: cannot stat 
`/home/osstest/build.58648.build-amd64-xsm/xen/tools/qemu-xen-dir/pc-bios/vgabios-virtio.bin':
 No such file or directory




___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [xen-unstable test] 58618: regressions - FAIL

2015-06-17 Thread osstest service user
flight 58618 xen-unstable real [real]
http://logs.test-lab.xenproject.org/osstest/logs/58618/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-armhf-armhf-xl-credit2   6 xen-boot  fail REGR. vs. 58392

Regressions which are regarded as allowable (not blocking):
 test-armhf-armhf-libvirt-xsm 11 guest-start   fail REGR. vs. 58392
 test-armhf-armhf-libvirt 11 guest-start  fail blocked in 58392
 test-amd64-i386-libvirt-xsm  11 guest-start  fail   like 58392
 test-amd64-i386-libvirt  11 guest-start  fail   like 58392
 test-amd64-amd64-libvirt-xsm 11 guest-start  fail   like 58392
 test-amd64-amd64-libvirt 11 guest-start  fail   like 58392
 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stop fail like 58392
 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop  fail like 58392

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-pvh-intel 11 guest-start  fail  never pass
 test-amd64-amd64-xl-pvh-amd  11 guest-start  fail   never pass
 test-armhf-armhf-xl-arndale  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-xsm  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 12 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-cubietruck 12 migrate-support-checkfail never pass
 test-amd64-i386-xl-qemut-win7-amd64 16 guest-stop  fail never pass
 test-armhf-armhf-xl  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-sedf-pin 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-sedf 12 migrate-support-checkfail   never pass
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop fail never pass

version targeted for testing:
 xen  fcbfaf9d260adbdb9352d6300b9f63c4ed443d49
baseline version:
 xen  12e817e281034f5881f46e0e4f1d127820101a78


People who touched revisions under test:
  Andrew Cooper andrew.coop...@citrix.com
  Dario Faggioli dario.faggi...@citrix.com
  David Vrabel david.vra...@citrix.com
  Don Dugger donald.d.dug...@intel.com
  Don Slutz dsl...@verizon.com
  George Dunlap george.dun...@eu.citrix.com
  Ian Campbell ian.campb...@citrix.com
  Ian Jackson ian.jack...@eu.citrix.com
  Jan Beulich jbeul...@suse.com
  Juergen Gross jgr...@suse.com
  Kevin Tian kevin.t...@intel.com
  Olaf Hering o...@aepfle.de
  Roger Pau Monné roger@citrix.com
  Ross Lagerwall ross.lagerw...@citrix.com
  Wei Liu wei.l...@citrix.com


jobs:
 build-amd64-xsm  pass
 build-armhf-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-armhf-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-oldkern  pass
 build-i386-oldkern   pass
 build-amd64-pvopspass
 build-armhf-pvopspass
 build-i386-pvops pass
 build-amd64-rumpuserxen  pass
 build-i386-rumpuserxen   pass
 test-amd64-amd64-xl  pass
 test-armhf-armhf-xl  pass
 test-amd64-i386-xl   pass
 test-amd64-amd64-xl-qemut-debianhvm-amd64-xsmpass
 test-amd64-i386-xl-qemut-debianhvm-amd64-xsm pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsmpass
 test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm pass
 test-amd64-amd64-libvirt-xsm fail
 test-armhf-armhf-libvirt-xsm fail
 test-amd64-i386-libvirt-xsm  fail
 test-amd64-amd64-xl-xsm  pass
 test-armhf-armhf-xl-xsm  pass
 test-amd64-i386-xl-xsm   pass
 test-amd64-amd64-xl-pvh-amd  fail
 test-amd64-i386-qemut-rhel6hvm-amd   pass
 test-amd64-i386-qemuu-rhel6hvm-amd   pass
 

Re: [Xen-devel] [RFC v2 13/15] Update Posted-Interrupts Descriptor during vCPU scheduling

2015-06-17 Thread Wu, Feng


 -Original Message-
 From: Jan Beulich [mailto:jbeul...@suse.com]
 Sent: Wednesday, June 17, 2015 4:59 PM
 To: Wu, Feng
 Cc: andrew.coop...@citrix.com; george.dun...@eu.citrix.com; Tian, Kevin;
 Zhang, Yang Z; xen-devel@lists.xen.org; k...@xen.org
 Subject: RE: [RFC v2 13/15] Update Posted-Interrupts Descriptor during vCPU
 scheduling
 
  On 17.06.15 at 10:46, feng...@intel.com wrote:
  From: Jan Beulich [mailto:jbeul...@suse.com]
  Sent: Wednesday, June 17, 2015 3:56 PM
   On 17.06.15 at 08:54, feng...@intel.com wrote:
   From: Jan Beulich [mailto:jbeul...@suse.com]
   Sent: Wednesday, June 10, 2015 2:44 PM
On 08.05.15 at 11:07, feng...@intel.com wrote:
+switch ( v-runstate.state )
+{
+case RUNSTATE_runnable:
+case RUNSTATE_offline:
+/*
+ * We don't need to send notification event to a non-running
+ * vcpu, the interrupt information will be delivered to it
 before
+ * VM-ENTRY when the vcpu is scheduled to run next time.
+ */
+pi_desc-sn = 1;
+
+/*
+ * If the state is transferred from RUNSTATE_blocked,
+ * we should set 'NV' feild back to posted_intr_vector,
+ * so the Posted-Interrupts can be delivered to the vCPU
+ * by VT-d HW after it is scheduled to run.
+ */
+if ( old_state == RUNSTATE_blocked )
+{
+do
+{
+old.control = new.control = pi_desc-control;
+new.nv = posted_intr_vector;
+}
+while ( cmpxchg(pi_desc-control, old.control,
  new.control)
+!= old.control );
  
   So why is it okay to do the SN update non-atomically when the
   vector update is done atomically?
  
   Also the curly braces do _not_ go on separate lines for do/while
   constructs.
  
   And then do you really need to atomically update the whole 64 bits
   here, rather than just the nv field? By not making it a bit field
   you could perhaps just write_atomic() it?
  
  
   Thinking more about this, maybe write_atomic() is not good for this, we
 need
   use
   the LOCK prefix when updating the posted-interrupt descriptor.
 
  The LOCK prefix can't be applied to simple loads and stores; they're
  implicitly atomic.
 
 
  Yes, I know LOCK prefix cannot be used together with 'mov', but
  my concern is without LOCK, does it work well when this
  write-atomic() operation and hardware setting 'ON' occur at the
  same time?
 
  Hardware updates the 'ON' filed like this:
  (which is listed in the Spec.)
 
  Hardware performs a coherent atomic read-modify-write
  operation of the posted-interrupt descriptor as follows:
  - Read contents of the Posted Interrupt Descriptor, claiming
  exclusive ownership of its hosting cache-line.
 
 The cache line part here is the really important aspect. The CPU
 will do the same for LOCKed transactions as well as simple stores.
 But of course - I repeat this just to avoid any misunderstanding -
 this works only if the store covers _only_ nv (recall that I brought
 this up in context of trying to avoid using bitfields where possible).

Your clarification makes thing clear. Thank you very much!

Thanks,
Feng

 
  Sorry for the long description above, since the hardware guys
  advise me to use LOCK prefix when updating the posted-interrupt
  descriptor, I just don't want to make any mistake here.
 
 And that advice is correct if you fiddle with the descriptor in
 quantities covering more than the precise byte(s) you intend
 to modify.
 
 Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [libvirt test] 58119: regressions - FAIL

2015-06-17 Thread Ian Jackson
Ian Campbell writes (Re: [Xen-devel] [libvirt test] 58119: regressions - 
FAIL):
 merlot0 has just lost its BIOS settings and failed to reboot -- which
 might also imply some other interesting misconfiguration, Ian, do you
 think that is plausible?

Not particularly plausible, I'm afraid, no.

.oO{ I should get that host history report patch series finished. }

Ian.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] PCI Passthrough ARM Design : Draft1

2015-06-17 Thread Ian Campbell
On Tue, 2015-06-16 at 18:16 +0100, Stefano Stabellini wrote:
 I wrote this before reading the rest of the thread with your alternative
 suggestion.  Both approaches can work, maybe it is true that having the
 guest requesting mappings is better. And we should certainly do the same
 thing for PVH and ARM guests.
 
 If we have the guest requesting the mapping, obviously the hypercall
 implementation should check whether mapping the memory region has been
 previously allowed by the toolstack, that should still call
 xc_domain_iomem_permission.

Whoever makes the initial decision/setup we still need to decide what to
do about reads and writes to the device's BAR registers. Who deals with
these and how, and if writes are allowed and if so how is this then
reflected in the guest p2m.

I would very much prefer ARM and x86 PVH to use the same approach to
this.

For x86 HVM I believe QEMU takes care of this, by emulating the
reads/writes to CFG space and making hypercalls (domctls?) to update the
p2m. There is no QEMU to do this in the ARM or x86 PVH cases.

For x86 PV I believe pciback deals with the register reads/writes but it
doesn't need to do anything wrt the p2m because that is a guest internal
construct. This obviously doesn't apply to ARM or x86 PVH either since
the p2m is managed by Xen in both those cases.

So it seems that neither of the existing solutions are a good match for
either ARM or x86 PVH.

_If_ it were permissible to implement all BAR registers as r/o then this
might simplify things, however I suspect this is not something we can
get away with in the general case (i.e. a driver for a given PCI device
_knows_ that BARN is writeable...).

So I think we do need to allow guest writes to the BAR registers. I
think it would be a bit strange (or even open potentially subtle
(security?) issues) to require the guest to write the BAR register and
to update the p2m to match as well.

So I think updates to the BAR need to be reflected in the p2m too by
whichever entity is emulating those reads/writes.

QEMU (or another daemon) is not really an option IMHO.

Which leaves us with either Xen doing trap and emulate or pciback
dealing with this via the pciif.h cfg space access stuff. I've a slight
preference for the latter since I think both ARM and x86 PVH are
planning to use pcifront/pciback already.

Which leaves us with a requirement for the backend to be able to update
the p2m, which in turn leads to a need for an interface available to the
dom0 kernel (which the existing domctl is not).

There will also need to be a mechanism to expose a suitable MMIO hole to
pcifront. On x86 this would be via the machine memory map, I think, but
on ARM we may need something else, perhaps a xenstore key associated
with the pci bus entry in xenstore?

That's for guests, for dom0 Xen also need to be aware of changes to the
BAR registers of the real physical devices since it needs to know the
real hardware values in order to point guest p2m mappings at them.

Wow, this is far more complicated than I imagined :-(

Ian.



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


  1   2   3   >