Re: [Xen-devel] netfront/netback multiqueue exhausting grants
On Thu, Jan 21, 2016 at 10:37:51AM +, Ian Campbell wrote: > On Thu, 2016-01-21 at 10:25 +, Wei Liu wrote: > > On Thu, Jan 21, 2016 at 10:12:27AM +, Ian Campbell wrote: > > [...] > > > > I've asked the reporter to send logs for the 4.4 case to xen-devel. > > > > > > User confirmed[0] that 4.4 is actually OK. > > > > > > Did someone request stable backports yet, or shall I do so? > > > > > > > I vaguely remember we requested backport for relevant patches long time > > ago, but I admit I have lost track. So it wouldn't hurt if you do it > > again. > > So I think we'd be looking for: > > 32a8440 xen-netfront: respect user provided max_queues > 4c82ac3 xen-netback: respect user provided max_queues > ca88ea1 xen-netfront: update num_queues to real created > > which certainly resolves things such that the workarounds work, and I think > will also fix the default case such that it works with up to 32 vcpus > (although it will consume all the grants and only get 31/32 queues). > > Does that sound correct? > Yes, it does. > As Annie said, we may still want to consider what a sensible default max > queues would be. > Maybe we should set a cap to 8 or 16 by default. Wei. > Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v3 5/5] libxl: add options to enable/disable emulated devices
El 21/01/16 a les 11.29, Ian Campbell ha escrit: > On Thu, 2016-01-21 at 11:01 +0100, Roger Pau Monné wrote: >> El 21/01/16 a les 10.39, Ian Campbell ha escrit: It also means that HVMlite guests created with current Xen will be capable of migrating to newer version of Xen, that might have a different default policy. For example in the future we might want to enable the lapic by default, so if a guest is created with the current Xen version it doesn't get a lapic at all, and then when migrated to newer versions a lapic would magically appear after the migration, which is not desired. >>> >>> ... and the reason these details can't be propagated via the Xen blob is >>> that this emul stuff needs to be set exactly once at domain create time I >>> suppose? Changing it to be later binding is considered to be too hard/too >>> big a yak? >> >> Right, emulated devices are initialised as part of the >> XEN_DOMCTL_createdomain hypercall. Allowing them to be added later on >> and introducing a kind of intermediate domain building phase where only >> a certain set of hypercalls are valid is a possibility that Andrew >> already pointed out, however this seems like a very big project. > > This seems like the right approach to me, but I appreciate you not wanting > to tackle this here and now. > > Would it be possible to set the set of "potential" emulated devices at > create time and only "commit" to it after the save state is loaded? That > would essentially mean init-all, load state, de-init those which didn't get > any state loaded? Not as nice as doing it with the split of hypercall > availability, but might be more achievable, since you already have the de- > init code for domain teardown time. Sadly there are devices that AFAICT don't restore any state (like the VGA), which means a more complex mechanism is needed and we cannot rely exclusively on restores in order to decide if a device is present or not. IMHO the current approach is not that bad, and I think we should be able to migrate to a better one without problems. If in the future we want to do something like what you describe above (setting the set of emulated devices based on the Xen context restored), the extra information in the libxl JSON stream is certainly not going to hurt us. At worst we could always check that the libxl JSON information matches with what the hypervisor context contains for extra safety. > In any case, whatever is chosen as the solution the commit message needs to > go into a fair amount of detail as to why we picked that way of doing > things, particularly if it is a compromise vs doing it properly. > > This is important so we can answer "why did we do it this way" in 2 years > time. Right, thanks, I will update the commit message with the outcome of this discussion. Roger. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [distros-debian-wheezy test] 38678: trouble: blocked/broken/pass
flight 38678 distros-debian-wheezy real [real] http://osstest.xs.citrite.net/~osstest/testlogs/logs/38678/ Failures and problems with tests :-( Tests which did not succeed and are blocking, including tests which could not be run: build-i3863 host-install(3) broken REGR. vs. 38635 build-amd64 3 host-install(3) broken REGR. vs. 38635 build-i386-pvops 3 host-install(3) broken REGR. vs. 38635 build-amd64-pvops 3 host-install(3) broken REGR. vs. 38635 Tests which did not succeed, but are not blocking: test-amd64-i386-i386-wheezy-netboot-pvgrub 1 build-check(1) blocked n/a test-amd64-amd64-i386-wheezy-netboot-pygrub 1 build-check(1) blocked n/a test-amd64-i386-amd64-wheezy-netboot-pygrub 1 build-check(1) blocked n/a test-amd64-amd64-amd64-wheezy-netboot-pvgrub 1 build-check(1) blocked n/a baseline version: flight 38635 jobs: build-amd64 broken build-armhf pass build-i386 broken build-amd64-pvopsbroken build-armhf-pvopspass build-i386-pvops broken test-amd64-amd64-amd64-wheezy-netboot-pvgrub blocked test-amd64-i386-i386-wheezy-netboot-pvgrub blocked test-amd64-i386-amd64-wheezy-netboot-pygrub blocked test-amd64-amd64-i386-wheezy-netboot-pygrub blocked sg-report-flight on osstest.xs.citrite.net logs: /home/osstest/logs images: /home/osstest/images Logs, config files, etc. are available at http://osstest.xs.citrite.net/~osstest/testlogs/logs Test harness code can be found at http://xenbits.xensource.com/gitweb?p=osstest.git;a=summary Push not applicable. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] public/io/netif.h: change semantics of "request-multicast-control" flag
> -Original Message- > From: xen-devel-boun...@lists.xen.org [mailto:xen-devel- > boun...@lists.xen.org] On Behalf Of Paul Durrant > Sent: 20 January 2016 13:14 > To: Ian Campbell; xen-de...@lists.xenproject.org > Cc: Ian Jackson; Keir (Xen.org); Jan Beulich; Tim (Xen.org) > Subject: Re: [Xen-devel] [PATCH] public/io/netif.h: change semantics of > "request-multicast-control" flag > > > -Original Message- > > From: Ian Campbell [mailto:ian.campb...@citrix.com] > > Sent: 20 January 2016 13:06 > > To: Paul Durrant; xen-de...@lists.xenproject.org > > Cc: Ian Jackson; Jan Beulich; Keir (Xen.org); Tim (Xen.org) > > Subject: Re: [PATCH] public/io/netif.h: change semantics of "request- > > multicast-control" flag > > > > On Wed, 2016-01-20 at 12:50 +, Paul Durrant wrote: > > > My patch b2700877 "move and amend multicast control documentation" > > > clarified use of the multicast control protocol between frontend and > > > backend. However, it transpires that the restrictions that documentation > > > placed on the "request-multicast-control" flag make it hard for a > > > frontend to enable 'all multicast' promiscuous mode, in that to do so > > > would require the frontend and backend to disconnect and re-connect. > > > > Do we therefore think that this document reflected reality, i.e. might this > > not be "just" a documentation bug? > > > > (Or maybe we can't tell because the only previous implementation was > years > > ago in Solaris or something) > > That's my concern. I hope it's just a documentation bug, but I don't know. > Also I've already done an implementation in Linux netback according to the > restricted semantics. > > > > > > This patch adds a new "feature-dynamic-multicast-control" flag to allow > > > a backend to advertise that it will watch "request-multicast-control" > hence > > > allowing it to be meaningfully modified by the frontend at any time rather > > > than only when the frontend and backend are disconnected. > > > > Would allowing XEN_NETIF_EXTRA_TYPE_MCAST_{ADD,DEL} to take a > bcast > > address > > be easier on the backend, in that it would just need to be a static feature > > rather than watching stuff on the fly? > > The documented semantics of the list are 'exact match' so sending a bcast > address doesn't do much good with a backend that doesn't know to treat is > specially hence a frontend can't tell whether 'all multicast' mode is going to > work without the extra feature flag. As for watching "request-multcast- > control" vs. add/remove of bcast, the complexity of implementation is > cheaper for the latter but I think the former is 'nicer'. > Are you ok with the xenstore watch approach (and leaving the patch as is) or would you prefer to spec. the bcast address as a wildcard and submit a new patch? Paul > Paul > ___ > Xen-devel mailing list > Xen-devel@lists.xen.org > http://lists.xen.org/xen-devel ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [RFC V2] xen: interface: introduce pvclk interface
Hi Ian, On Thu, Jan 21, 2016 at 10:19:32AM +, Ian Campbell wrote: >On Thu, 2016-01-21 at 16:59 +0800, Peng Fan wrote: >> >> To platform device of ARM, hypervisor is responsible for the mapping >> between machine address and guest physical address, also responsible >> for the irq mapping. >> >> But to embedded ARM SoC, the hardware clk IP is handled in Dom0. > >Arguably Xen ought to be the one to do this, but we have determined >(rightly, I think) that doing so for the entirely clk tree of an SoC would >involve importing an unmanageable amount of code into Xen[0]. > >In the meantime we defer this to Dom0. > >I wonder though if it would be possible to manage the clocks for a >passthrough device from the toolstack, i.e. is there a sysfs node where one >can say "keep the clock for this device enabled (at xMhz) even though you >think the device is unused"? I am afraid not. The linux device driver without xen can work well, because there is clk tree and "clk_get,clk_prepare,clk_set_rate" work well in the driver. I do not want to remove the clk apis usage from the device driver for xen, because the driver also serves when no xen support. The pvclk interface is mainly to let the clk api can work as no xen. Introduce pvclk may introduce more stuff, I have no more idea for now, if want to let the device driver unchanged and work well for passed through device. (: > >If so (or if it can be easily added) then the toolstack would just need to >manage that value as part of the passthrough process rather than having the >frontend do it via a PV protocol. > >Ian. > >[0] I'd like at some point for Xen to gain sufficient knowledge of clock IP >to minimally control things like the CPU and DRAM clocks etc, but that >needn't imply full clock tree support and would hopefully be a manageable >amount of code, which would (hopefully) mostly be about trapping and >emulating writes to one or two clock controllers per platform and >arbitrating a small set of bits (while allowing dom0 to control the >others). This is at least a medium if not long term idea though, and for >all I know it might turn out to be unworkable in practice. I am happy to see that if one day, xen takes part of the clock management to minimize power usage. These few days, I also look into the power management of xen. seems it mainly serves for x86. Let xen hypervisor handle cpu and dram clocks may be better than let dom0. Thanks, Peng. > ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] Xen PV PTE ABI (or lack thereof)
On 21/01/16 11:07, Jan Beulich wrote: On 20.01.16 at 21:10,wrote: >> First of all, SMEP and SMAP. 32bit PV guests are subject to Xen's >> SMEP/SMAP choices, because of running in ring 1. >> >> SMAP in particular is problematic because older Linux guests do fall >> foul of it; they don't understand what a SMAP pagefault is, and enter an >> infinite loop of pagefaults. SMEP is also problematic because it breaks >> any guest wishing to use a shared address space between kernel and >> user. (I had some fun getting the test framework to function until I >> twigged what was happening). >> >> Both of these are regressions; older guests relying on existing >> behaviour cease to function on newer hardware/Xen despite identical >> settings. > And for both of them there simply should be a way for the guest to > state whether it's compatible (which should be the case for anything > we can't deal with completely transparently to guests). > >> For the PTE bits, _PAGE_GNTTAB (bit 62) is used exclusively in debug >> build (so there is a guest observable difference between running on a >> debug and a non-debug Xen), and the comment beside it even identifies >> that it breaks BSD guests. PTE bits 62:59 used by hardware if CR4.PKE >> is set. Currently this means that we are not able to support Protection >> Key for PV guests (although this restriction technically only applies to >> debug builds of the hypervisor). >> >> The other PTE bit used by Xen is _PAGE_GUEST_KERNEL (bit 52). This bit >> is used to notice when a 64bit PV guest attempts to override the fixup >> Xen applies to its PTEs. Xen unilaterally sets _PAGE_GLOBAL for user >> pages, and clears _PAGE_GLOBAL for supervisor mappings, setting >> _PAGE_USER in both cases as the PV kernel runs in ring3. The only thing >> _PAGE_GUEST_KERNEL is used for is to notice when the kernel deliberately >> tries to create a _PAGE_GUEST_KERNEL|_PAGE_GLOBAL, at which point a >> warning is logged and the kernel overridden. >> >> >> Neither of the used PTE bits exist in the Xen public ABI. Neither of >> them serve a purpose other than a debugging aid. >> >> I propose hiding them behind CONFIG_PV_PTE_DEBUG and declaring an ABI of >> "all bits available for guest use". > And a kernel using any of the conflicting bits would then become > unusable on a hypervisor with that debug option enabled? I'd > rather see us document the state things are in... _PAGE_GNTMAP is already states: /* * Debug option: Ensure that granted mappings are not implicitly unmapped. * WARNING: This will need to be disabled to run OSes that use the spare PTE * bits themselves (e.g., *BSD). */ I was intending to have CONFIG_PV_PTE_DEBUG as an EXPERT option, disabled by default even in debug builds. There should not be an ABI difference between release and "normal" debug builds. ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH V13 5/5] xl: add pvusb commands
On Thu, 2016-01-21 at 12:12 +, Wei Liu wrote: > On Tue, Jan 19, 2016 at 08:10:07PM -0700, Jim Fehlig wrote: > > On 01/19/2016 11:11 AM, Ian Jackson wrote: > > > Chunyan Liu writes ("[PATCH V13 5/5] xl: add pvusb commands"): > > > > Add pvusb commands: usbctrl-attach, usbctrl-detach, usb-list, > > > > usbdev-attach and usbdev-detach. > > > Thanks for swapping this with the other patch. It is better now. > > > > > > > +=item B I I > > > However, I think you need to explictly state that the user may (and > > > indeed, must) pass multiple settings as separate arguments. AFAICT > > > the parser here doesn't do the ,-splitting. > > > > I just noticed this is the case with network devices as well. E.g. > > > > #xl network-attach hvm-domU mac=00:16:3e:xx:yy:zz,bridge=br0 > > libxl: error: libxl_device.c:1095:device_hotplug_child_death_cb: > > script: Could > > not find bridge device xenbr0 > > > > main_networkattach() in tools/libxl/xl_cmdimpl.c doesn't split on the > > ',', so > > everything after mac=00:16:3e:xx:yy:zz is ignored. I'd need advice on > > how to fix > > this though. Based on xl-network-configuration doc and Xen tool's long > > history > > of network-attach supporting that syntax, I'd say main_networkattach() > > should be > > changed to split on ','. I could also change the docs. Do tools > > maintainers have > > a preference, or alternative option? > > > > It is splitting on " " at the moment which is not very desirable. > > I think this is a bug. I've added it to my list and will look into > fixing it. If you are feeling particularly highly motivated then xl_cmdimpl.c contains quite a lot of adhoc parsing of various comma a space separated lists (often dupped for a given instance into cmdline vs cfg file, which leads to unintentional behavioural differences like above) which could usefully be consolidated into a series of helpers. Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] public/io/netif.h: change semantics of "request-multicast-control" flag
On Thu, 2016-01-21 at 11:48 +, Paul Durrant wrote: > > -Original Message- > > From: xen-devel-boun...@lists.xen.org [mailto:xen-devel- > > boun...@lists.xen.org] On Behalf Of Paul Durrant > > Sent: 20 January 2016 13:14 > > To: Ian Campbell; xen-de...@lists.xenproject.org > > Cc: Ian Jackson; Keir (Xen.org); Jan Beulich; Tim (Xen.org) > > Subject: Re: [Xen-devel] [PATCH] public/io/netif.h: change semantics of > > "request-multicast-control" flag > > > > > -Original Message- > > > From: Ian Campbell [mailto:ian.campb...@citrix.com] > > > Sent: 20 January 2016 13:06 > > > To: Paul Durrant; xen-de...@lists.xenproject.org > > > Cc: Ian Jackson; Jan Beulich; Keir (Xen.org); Tim (Xen.org) > > > Subject: Re: [PATCH] public/io/netif.h: change semantics of "request- > > > multicast-control" flag > > > > > > On Wed, 2016-01-20 at 12:50 +, Paul Durrant wrote: > > > > My patch b2700877 "move and amend multicast control documentation" > > > > clarified use of the multicast control protocol between frontend > > > > and > > > > backend. However, it transpires that the restrictions that > > > > documentation > > > > placed on the "request-multicast-control" flag make it hard for a > > > > frontend to enable 'all multicast' promiscuous mode, in that to do > > > > so > > > > would require the frontend and backend to disconnect and re- > > > > connect. > > > > > > Do we therefore think that this document reflected reality, i.e. > > > might this > > > not be "just" a documentation bug? > > > > > > (Or maybe we can't tell because the only previous implementation was > > years > > > ago in Solaris or something) > > > > That's my concern. I hope it's just a documentation bug, but I don't > > know. > > Also I've already done an implementation in Linux netback according to > > the > > restricted semantics. > > > > > > > > > This patch adds a new "feature-dynamic-multicast-control" flag to > > > > allow > > > > a backend to advertise that it will watch "request-multicast- > > > > control" > > hence > > > > allowing it to be meaningfully modified by the frontend at any time > > > > rather > > > > than only when the frontend and backend are disconnected. > > > > > > Would allowing XEN_NETIF_EXTRA_TYPE_MCAST_{ADD,DEL} to take a > > bcast > > > address > > > be easier on the backend, in that it would just need to be a static > > > feature > > > rather than watching stuff on the fly? > > > > The documented semantics of the list are 'exact match' so sending a bcast > > address doesn't do much good with a backend that doesn't know to treat is > > specially hence a frontend can't tell whether 'all multicast' mode is going > > to > > work without the extra feature flag. As for watching "request-multcast- > > control" vs. add/remove of bcast, the complexity of implementation is > > cheaper for the latter but I think the former is 'nicer'. > > > > Are you ok with the xenstore watch approach (and leaving the patch as is) > or would you prefer to spec. the bcast address as a wildcard and submit a > new patch? I'm fine with the watch approach, was just suggesting the alternative in case it turned out to be much easier. Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] public/io/netif.h: change semantics of "request-multicast-control" flag
> -Original Message- > From: Ian Campbell [mailto:ian.campb...@citrix.com] > Sent: 21 January 2016 11:59 > To: Paul Durrant; xen-de...@lists.xenproject.org > Cc: Ian Jackson; Keir (Xen.org); Jan Beulich; Tim (Xen.org) > Subject: Re: [Xen-devel] [PATCH] public/io/netif.h: change semantics of > "request-multicast-control" flag > > On Thu, 2016-01-21 at 11:48 +, Paul Durrant wrote: > > > -Original Message- > > > From: xen-devel-boun...@lists.xen.org [mailto:xen-devel- > > > boun...@lists.xen.org] On Behalf Of Paul Durrant > > > Sent: 20 January 2016 13:14 > > > To: Ian Campbell; xen-de...@lists.xenproject.org > > > Cc: Ian Jackson; Keir (Xen.org); Jan Beulich; Tim (Xen.org) > > > Subject: Re: [Xen-devel] [PATCH] public/io/netif.h: change semantics of > > > "request-multicast-control" flag > > > > > > > -Original Message- > > > > From: Ian Campbell [mailto:ian.campb...@citrix.com] > > > > Sent: 20 January 2016 13:06 > > > > To: Paul Durrant; xen-de...@lists.xenproject.org > > > > Cc: Ian Jackson; Jan Beulich; Keir (Xen.org); Tim (Xen.org) > > > > Subject: Re: [PATCH] public/io/netif.h: change semantics of "request- > > > > multicast-control" flag > > > > > > > > On Wed, 2016-01-20 at 12:50 +, Paul Durrant wrote: > > > > > My patch b2700877 "move and amend multicast control > documentation" > > > > > clarified use of the multicast control protocol between frontend > > > > > and > > > > > backend. However, it transpires that the restrictions that > > > > > documentation > > > > > placed on the "request-multicast-control" flag make it hard for a > > > > > frontend to enable 'all multicast' promiscuous mode, in that to do > > > > > so > > > > > would require the frontend and backend to disconnect and re- > > > > > connect. > > > > > > > > Do we therefore think that this document reflected reality, i.e. > > > > might this > > > > not be "just" a documentation bug? > > > > > > > > (Or maybe we can't tell because the only previous implementation was > > > years > > > > ago in Solaris or something) > > > > > > That's my concern. I hope it's just a documentation bug, but I don't > > > know. > > > Also I've already done an implementation in Linux netback according to > > > the > > > restricted semantics. > > > > > > > > > > > > This patch adds a new "feature-dynamic-multicast-control" flag to > > > > > allow > > > > > a backend to advertise that it will watch "request-multicast- > > > > > control" > > > hence > > > > > allowing it to be meaningfully modified by the frontend at any time > > > > > rather > > > > > than only when the frontend and backend are disconnected. > > > > > > > > Would allowing XEN_NETIF_EXTRA_TYPE_MCAST_{ADD,DEL} to take a > > > bcast > > > > address > > > > be easier on the backend, in that it would just need to be a static > feature > > > > rather than watching stuff on the fly? > > > > > > The documented semantics of the list are 'exact match' so sending a bcast > > > address doesn't do much good with a backend that doesn't know to treat > is > > > specially hence a frontend can't tell whether 'all multicast' mode is > > > going > to > > > work without the extra feature flag. As for watching "request-multcast- > > > control" vs. add/remove of bcast, the complexity of implementation is > > > cheaper for the latter but I think the former is 'nicer'. > > > > > > > Are you ok with the xenstore watch approach (and leaving the patch as is) > > or would you prefer to spec. the bcast address as a wildcard and submit a > > new patch? > > I'm fine with the watch approach, was just suggesting the alternative in > case it turned out to be much easier. > I already have an implementation of the watch approach which is now allowing Windows logo testing to pass :-) Paul > Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] stable backport of 3 xen multiqueue network dev fixes
Hi Dave, Please could you queue these three: 32a8440 xen-netfront: respect user provided max_queues 4c82ac3 xen-netback: respect user provided max_queues ca88ea1 xen-netfront: update num_queues to real created for stable backports to at least 4.1. Multiqueue was added in v3.16 so I think they could all go to LTS that far back. Users are tripping over issues in the field with exhausting the number of Xen grant entries available to guests when creating large numbers of queues (which happens with guests with >=32 VCPUs). ca88ea1 is a fix which causes the failure to allocate grants for all of the desired queues gracefully, rather than crashing or failing to pass any traffic. 32a8440 and 4c82ac3 fix the handling of the module parameters on both ends so that users can choose to have fewer queues than p/vcpus if they like, which would have been an effective workaround for the issue solved by ca88ea1 if that hadn't been broken. Thanks, Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] netfront/netback multiqueue exhausting grants
On Thu, 2016-01-21 at 10:56 +, David Vrabel wrote: > On 20/01/16 12:23, Ian Campbell wrote: > > There have been a few reports recently[0] which relate to a failure of > > netfront to allocate sufficient grant refs for all the queues: > > > > [0.533589] xen_netfront: can't alloc rx grant refs > > [0.533612] net eth0: only created 31 queues > > > > Which can be worked around by increasing the number of grants on the > > hypervisor command line or by limiting the number of queues permitted > > by > > either back or front using a module param (which was broken but is now > > fixed on both sides, but I'm not sure it has been backported everywhere > > such that it is a reliable thing to always tell users as a workaround). > > > > Is there any plan to do anything about the default/out of the box > > experience? Either limiting the number of queues or making both ends > > cope > > more gracefully with failure to create some queues (or both) might be > > sufficient? > > > > I think the crash after the above in the first link at [0] is fixed? I > > think that was the purpose of ca88ea1247df "xen-netfront: update > > num_queues > > to real created" which was in 4.3. > > I think the correct solution is to increase the default maximum grant > table size. That could well make sense, but then there will just be another higher limit, so we should perhaps do both. i.e. factoring in: * performance i.e. ability for N queues to saturate whatever sort of link contemporary Linux can saturate these days, plus some headroom, or whatever other ceiling seems sensible) * grant table resource consumption i.e. (sensible max number of blks * nr gnts per blk + sensible max number of vifs * nr gnts per vif + other devs needs) < per guest grant limit) to pick both the default gnttab size and the default max queuers. (or s/sensible/supportable/g etc). > Although, unless you're using the not-yet-applied per-cpu rwlock patches > multiqueue is terrible on many (multisocket) systems and the number of > queue should be limited in netback to 4 or even just 2. Presumably the guest can't tell, so it can't do this. I think when you say "terrible" you don't mean "worse than without mq" but rather "not realising the expected gains from a larger nunber of queues", right?. Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [RFC] VirtFS support on Xen
On Thu, Jan 21, 2016 at 10:50:08AM +, David Vrabel wrote: > On 21/01/16 10:28, Wei Liu wrote: > > [RFC] VirtFS support on Xen > > > > # Introduction > > > > QEMU/KVM supports file system passthrough via an interface called > > VirtFS [0]. VirtFS is in turn implemented with 9pfs protocol [1] and > > VirtIO transport. > > > > Xen used to have its own implementation of file system passthrough > > called XenFS, but that has been inactive for a few years. The latest > > update was in 2009 [2]. > > > > This project aims to add VirtFS support on Xen. This is more > > sustainable than inventing our own wheel.# > > What's the use case for this? Who wants this feature? > Anyone who wants file system passthrough. More specifically, VM-based container solutions can share files from host file system. http://xendevsummit2015.sched.org/event/3WDg/xen-containers-better-way-to-run-docker-containers-sainath-grandhi-intel?iframe=no==yes=no http://xendevsummit2015.sched.org/event/3X1L/hyper-make-vm-run-like-containers-xu-wang-hyper?iframe=no==yes=no Wei. > David ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] Xen PV PTE ABI (or lack thereof)
On 21/01/16 12:59, Jan Beulich wrote: >> For the PTE bits, _PAGE_GNTTAB (bit 62) is used exclusively in debug build (so there is a guest observable difference between running on a debug and a non-debug Xen), and the comment beside it even identifies that it breaks BSD guests. PTE bits 62:59 used by hardware if CR4.PKE is set. Currently this means that we are not able to support Protection Key for PV guests (although this restriction technically only applies to debug builds of the hypervisor). The other PTE bit used by Xen is _PAGE_GUEST_KERNEL (bit 52). This bit is used to notice when a 64bit PV guest attempts to override the fixup Xen applies to its PTEs. Xen unilaterally sets _PAGE_GLOBAL for user pages, and clears _PAGE_GLOBAL for supervisor mappings, setting _PAGE_USER in both cases as the PV kernel runs in ring3. The only thing _PAGE_GUEST_KERNEL is used for is to notice when the kernel deliberately tries to create a _PAGE_GUEST_KERNEL|_PAGE_GLOBAL, at which point a warning is logged and the kernel overridden. Neither of the used PTE bits exist in the Xen public ABI. Neither of them serve a purpose other than a debugging aid. I propose hiding them behind CONFIG_PV_PTE_DEBUG and declaring an ABI of "all bits available for guest use". >>> And a kernel using any of the conflicting bits would then become >>> unusable on a hypervisor with that debug option enabled? I'd >>> rather see us document the state things are in... >> _PAGE_GNTMAP is already states: >> >> /* >> * Debug option: Ensure that granted mappings are not implicitly unmapped. >> * WARNING: This will need to be disabled to run OSes that use the spare PTE >> * bits themselves (e.g., *BSD). >> */ > But that's (assuming the use of the two bits were spelled out) a > guest OS not fully playing by the spec. To me, "available" PTE bits > being shared implies that some of them may be claimed by Xen, > while others may be claimed by guests. You're right that this needs > to be written down, but I don't think we need to go as far as > forbidding Xen to use any of them. And even less so should we > preclude their use for any purpose going forward. I do not want to make an ABI which mandates the use of certain bits by Xen. As we see with the Protection Key feature, newer hardware feature start using bits which were previously software available, and we absolutely don't want to be in a position where our ABI prevents us from ever supporting a new feature. > In the end, with (as it seems) not much effort this could even be > made dynamic: A guest could advertise which of the bits it doesn't > use, and then Xen could pick two of them for the two purposes it > currently needs them for. Should a guest leave no or just one bit > available, the debugging aid could then be disabled. This is unnecessarily complicated. It only helps going forward, and leaves Xen with substantially more complicated PTE handling. What happens if at some point later, we try to boot a PV guest which has a different PTE layout to what Xen has chosen? We definitely can't update every PTE with a newly-chosen layout. > >> I was intending to have CONFIG_PV_PTE_DEBUG as an EXPERT option, >> disabled by default even in debug builds. >> >> There should not be an ABI difference between release and "normal" debug >> builds. > Well, I see your point, but as said above I'm not convinced > disabling all that code is the right solution. In fact, what you > propose is not far away from removing that code altogether. The two bits are only used for specialised debugging. They should be relegated to people doing specific debugging, and not interfere with the overwhelming majority of cases where Xen doesn't need to use any software available PTE bits. ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] arm: clean up build variables
>>> On 20.01.16 at 22:47,wrote: > This consolidates some of the different variables used for the ARM > builds. This change was prompted by the Kconfig changes but looking back > in time the CONFIG_ARM_{32,64} variables existed before Kconfig so this > should just be a generic cleanup. > > Signed-off-by: Doug Goldstein > --- > xen/arch/arm/Makefile| 8 > xen/arch/arm/Rules.mk| 18 -- > xen/drivers/passthrough/Makefile | 2 +- > 3 files changed, 9 insertions(+), 19 deletions(-) Trivial IOMMU part Acked-by: Jan Beulich ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 12/16] ARM: Xen: Document UEFI support on Xen ARM virtual platforms
On Tue, Jan 19, 2016 at 09:43:59PM +0800, Shannon Zhao wrote: > > On 2016/1/19 21:13, Mark Rutland wrote: > >On Tue, Jan 19, 2016 at 12:23:17PM +, Stefano Stabellini wrote: > >>>On Tue, 19 Jan 2016, Mark Rutland wrote: > >On Tue, Jan 19, 2016 at 06:25:25PM +0800, Shannon Zhao wrote: > >>> > > >>We don't do this in Documentation/arm/uefi.txt, and I don't see > >>> > > >>why we > >>> > > >>should do so here. > >>> > > >> > >>> > > >>Does Xen handle arbitrary size memory map descriptors? I'm not > >>> > > >>sure what > >>> > > >>new information might be passed in future additions to the > >>> > > >>descriptor > >>> > > >>format, and I'm not sure what should happen in the Dom0 case. > >> > > > > >> > > >Xen passes to Dom0 the memory map in the same format as the native > >> > > >memory map. > > > >Does Xen parse or modify the EFI memory map in any way? > >>> > >>>Xen: > >>>- calls EFI_BOOT_SERVICES.GetMemoryMap() > >>>- takes note of the memory regions for its own usage > >>>- create the fdt notes, including efi-mmap-start, with a pointer to it > >>> > >>> > >Does it pass the raw values returned by EFI_BOOT_SERVICES.GetMemoryMap() > >through to the xen,uefi-* properties, or does is make any static > >assumptions about what the values will be? > >>> > >>>It just passes the raw values. > >I take it that means that any memory carved out for Xen itself is > >described/discovered via a separate mechanism? How does that work > > For Xen hypervisor booting on UEFI, it get the EFI memory map > through the similar way like Linux, e.g. call > EFI_BOOT_SERVICES.GetMemoryMap(). > For Dom0, Xen will create a new EFI memory map for Dom0. > > See [PATCH v3 52/62] arm/acpi: Prepare EFI memory descriptor for Dom0 > http://lists.xen.org/archives/html/xen-devel/2015-11/msg01884.html Ok. So Xen will parse the EFI memory map, and will create a new memory map to pass to the Dom0 kernel, presumably with some memory having been carved out for Xen itself, and never described to Dom0. So if there's any extension to that in future, Dom0 may see problems. There's not much that can be done about that, however, and extensions to the descriptors seem unlikely at present. Thanks, Mark. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [RFC V2] xen: interface: introduce pvclk interface
On Thu, 2016-01-21 at 12:55 +, Stefano Stabellini wrote: > On Thu, 21 Jan 2016, Peng Fan wrote: > > Hi Ian, > > > > On Thu, Jan 21, 2016 at 12:26:04PM +, Ian Campbell wrote: > > > On Thu, 2016-01-21 at 19:55 +0800, Peng Fan wrote: > > > > Hi Ian, > > > > > > > > On Thu, Jan 21, 2016 at 10:19:32AM +, Ian Campbell wrote: > > > > > On Thu, 2016-01-21 at 16:59 +0800, Peng Fan wrote: > > > > > > > > > > > > To platform device of ARM, hypervisor is responsible for the > > > > > > mapping > > > > > > between machine address and guest physical address, also > > > > > > responsible > > > > > > for the irq mapping. > > > > > > > > > > > > But to embedded ARM SoC, the hardware clk IP is handled in > > > > > > Dom0. > > > > > > > > > > Arguably Xen ought to be the one to do this, but we have > > > > > determined > > > > > (rightly, I think) that doing so for the entirely clk tree of an > > > > > SoC > > > > > would > > > > > involve importing an unmanageable amount of code into Xen[0]. > > > > > > > > > > In the meantime we defer this to Dom0. > > > > > > > > > > I wonder though if it would be possible to manage the clocks for > > > > > a > > > > > passthrough device from the toolstack, i.e. is there a sysfs node > > > > > where > > > > > one > > > > > can say "keep the clock for this device enabled (at xMhz) even > > > > > though > > > > > you > > > > > think the device is unused"? > > > > > > > > I am afraid not. The linux device driver without xen can work well, > > > > because > > > > there is clk tree and "clk_get,clk_prepare,clk_set_rate" work well > > > > in the > > > > driver. > > > > I do not want to remove the clk apis usage from the device driver > > > > for > > > > xen, because the driver > > > > also serves when no xen support. The pvclk interface is mainly to > > > > let the > > > > clk api can work as no xen. > > > > > > Would adding a dummy fixed-clock[0] (or several of them) to the guest > > > passthrough DT satisfy the driver's requirements? They would be > > > hardcoded > > > to whatever rate dom0 and/or the tools has decided upon (and had set > > > in the > > > real h/w). > > > > If using this way, we have the assumption that DomU device driver would > > not > > change the rate of the clock driving the device. I am not sure whether > > this is > > ok for so many platform devices based ARM core. > > > > In /sys/kernel/debug/clk/, there are clk tree info, but > > clk api are not exposed to userspace as far as I know, so > > if using sysfs interface to set a known fixed rate or enable/disable > > the clock, > > we need to expose the clk info to userspace. > > Keeping in mind that we might now want to let DomU change the actual s/now/not/? > clock frequency anyway, exposing a dummy clock to keep the DomU driver > happy might be a good option. > > However whether we set the frequency from the Dom0 kernel or from the > toolstack, how do we find out the right clock rate? From a grep in the > kernel sources, it looks like some drivers still have hardcoded values. > > Asking the user to provide the clock rate seems a bit too much to me. Remember that for this use case they already need to provide the host DT path to the device, the list of iomem resources, the list of irqs and a suitable device tree snippet. platform device passthrough is very much an "expert" option (intended for folks building embedded device, not really for normal end users), which already needs a fair bit of familiarity with the system in question, I don't think providing some info about the clocks goes too much further than this. Seamless & easy passthrough is something we are aiming for for PCI, but not for the general case of platform passthrough. Making platform device passthrough as easy as PCI passthrough is much harder and far reaching than avoiding the need to specify some clock frequencies (remember we had very long discussions about the design and couldn't come up with anything especially feasible). Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [RFC v1 4/8] x86/init: add linker table support
On 01/21/2016 03:38 AM, Roger Pau Monné wrote: El 20/01/16 a les 22.33, Luis R. Rodriguez ha escrit: On Wed, Jan 20, 2016 at 1:00 PM, Konrad Rzeszutek Wilkwrote: +static bool x86_init_fn_supports_subarch(struct x86_init_fn *fn) +{ + if (!fn->supp_hardware_subarch) { + pr_err("Init sequence fails to declares any supported subarchs: %pF\n", fn->early_init); + WARN_ON(1); + } + if (BIT(boot_params.hdr.hardware_subarch) & fn->supp_hardware_subarch) + return true; + return false; +} So the logic for this working is that boot_params.hdr.hardware_subarch And the macros define two: BIT(X86_SUBARCH_PC) or BIT(X86_SUBARCH_XEN). But hardware_subarch by default is set to zero. Which means if GRUB2, PXELinux, Xen multiboot1 don't set it - then the X86_SUBARCH_PC is choosen right? 1 << 0 & 1 << X86_SUBARCH_PC (which is zero). For this to nicely work with Xen it ought to do this: diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c index 993b7a7..6cf9afd 100644 --- a/arch/x86/xen/enlighten.c +++ b/arch/x86/xen/enlighten.c @@ -1676,6 +1676,7 @@ asmlinkage __visible void __init xen_start_kernel(void) boot_params.hdr.ramdisk_image = initrd_start; boot_params.hdr.ramdisk_size = xen_start_info->mod_len; boot_params.hdr.cmd_line_ptr = __pa(xen_start_info->cmd_line); + boot_params.hdr.hardware_subarch = X86_SUBARCH_XEN; if (!xen_initial_domain()) { add_preferred_console("xenboot", 0, NULL); ? That's correct for PV and PVH, likewise when qemu is required for HVM qemu could set it. I have the qemu change done but that should only cover HVM. A common place to set this as well could be the hypervisor, but currently the hypervisor doesn't set any boot_params, instead a generic struct is passed and the kernel code (for any OS) is expected to interpret this and then set the required values for the OS in the init path. Long term though if we wanted to merge init further one way could be to have the hypervisor just set the zero page cleanly for the different modes. If we needed more data other than the hardware_subarch we also have the hardware_subarch_data, that's a u64 , and how that is used would be up to the subarch. In Xen's case it could do what it wants with it. That would still mean perhaps defining as part of a Xen boot protocol a place where xen specific code can count on finding more Xen data passed by the hypervisor, the xen_start_info. That is, if we wanted to merge init paths this is something to consider. One thing I considered on the question of who should set the zero page for Xen with the prospect of merging inits, or at least this subarch for both short term and long term are the obvious implications in terms of hypervisor / kernel / qemu combination requirements if the subarch is needed. Having it set in the kernel is an obvious immediate choice for PV / PVH but it means we can't merge init paths completely (down to asm inits), we'd still be able to merge some C init paths though, the first entry would still be different. Having the zero page set on the hypervisor would go long ways but it would mean a hypervisor change required. I don't think the hypervisor should be setting Linux specific boot related parameters, the boot ABI should be OS agnostic. IMHO, a small shim should be added to Linux in order to set what Linux requires when entering from a Xen entry point. And that's exactly what HVMlite does. Most of this shim layer is setting up boot_params, after which we jump to standard x86 boot path (i.e. startup_{32|64}). With hardware_subarch set to zero. -boris ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [RFC V2] xen: interface: introduce pvclk interface
Hi Jan, On Thu, Jan 21, 2016 at 03:21:38AM -0700, Jan Beulich wrote: On 21.01.16 at 09:59,wrote: >> uart2 needs clock IMX7D_UART2_ROOT_CLK from the ccm. >> passthrough uart2, hypervisor handles the reg and interrupts, that is >> because >> hypervisor handles the memory map and the interrupt controller(GIC). But >> here >> CCM is not handled by hypervisor, it is handled by Dom0. > >This, I take it, describes the current state, which doesn't make clear >whether this is intentionally that way (and can't be changed), or >just happens to be that way because so far it didn't matter. > >> For ARMV8 server products, I am not sure whether hypercall is better; but to >> my case, it's not feasible to use hypercall. > >Because of ...? I guess you mean DomU issues hypercall and Xen forwards the request to Dom0, then Dom0 reply the response? If you experts think pvclk is not a good way to handle the clock for passed through device, I can try hypercall way. > >> Dom0 handles all the clocks, DomU just send request to Dom0 and ask Dom0 to >> enable/disable/set rate for a clock for the device. So I think it's okay >> for multipile parties, the clk subsystem in Dom0 can handle mutiple requests >> even if the clock is shared. > >I.e. if one party sets one rate, and later another party sets >a different rate, things are going to work fine? If so, why would >the different parties need control over the rate in the first place? oh. thanks for teaching me. If two IPs share one clock, and IP1 for Dom0, IP2 for DomU, Dom0 needs clock work at 20Hz, but DomU want clock work at 40Hz. pvclk can not avoid this. If not using pvclk, we develop a new method to handle clock. I guest we can also not avoid this? Thanks, Peng. > >Jan > ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [RFC V2] xen: interface: introduce pvclk interface
On Thu, 2016-01-21 at 19:55 +0800, Peng Fan wrote: > Hi Ian, > > On Thu, Jan 21, 2016 at 10:19:32AM +, Ian Campbell wrote: > > On Thu, 2016-01-21 at 16:59 +0800, Peng Fan wrote: > > > > > > To platform device of ARM, hypervisor is responsible for the mapping > > > between machine address and guest physical address, also responsible > > > for the irq mapping. > > > > > > But to embedded ARM SoC, the hardware clk IP is handled in Dom0. > > > > Arguably Xen ought to be the one to do this, but we have determined > > (rightly, I think) that doing so for the entirely clk tree of an SoC > > would > > involve importing an unmanageable amount of code into Xen[0]. > > > > In the meantime we defer this to Dom0. > > > > I wonder though if it would be possible to manage the clocks for a > > passthrough device from the toolstack, i.e. is there a sysfs node where > > one > > can say "keep the clock for this device enabled (at xMhz) even though > > you > > think the device is unused"? > > I am afraid not. The linux device driver without xen can work well, > because > there is clk tree and "clk_get,clk_prepare,clk_set_rate" work well in the > driver. > I do not want to remove the clk apis usage from the device driver for > xen, because the driver > also serves when no xen support. The pvclk interface is mainly to let the > clk api can work as no xen. Would adding a dummy fixed-clock[0] (or several of them) to the guest passthrough DT satisfy the driver's requirements? They would be hardcoded to whatever rate dom0 and/or the tools has decided upon (and had set in the real h/w). Ian. [0] Documentation/devicetree/bindings/clock/fixed-clock.txt ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [RFC V2] xen: interface: introduce pvclk interface
Hi Ian, On Thu, Jan 21, 2016 at 12:26:04PM +, Ian Campbell wrote: >On Thu, 2016-01-21 at 19:55 +0800, Peng Fan wrote: >> Hi Ian, >> >> On Thu, Jan 21, 2016 at 10:19:32AM +, Ian Campbell wrote: >> > On Thu, 2016-01-21 at 16:59 +0800, Peng Fan wrote: >> > > >> > > To platform device of ARM, hypervisor is responsible for the mapping >> > > between machine address and guest physical address, also responsible >> > > for the irq mapping. >> > > >> > > But to embedded ARM SoC, the hardware clk IP is handled in Dom0. >> > >> > Arguably Xen ought to be the one to do this, but we have determined >> > (rightly, I think) that doing so for the entirely clk tree of an SoC >> > would >> > involve importing an unmanageable amount of code into Xen[0]. >> > >> > In the meantime we defer this to Dom0. >> > >> > I wonder though if it would be possible to manage the clocks for a >> > passthrough device from the toolstack, i.e. is there a sysfs node where >> > one >> > can say "keep the clock for this device enabled (at xMhz) even though >> > you >> > think the device is unused"? >> >> I am afraid not. The linux device driver without xen can work well, >> because >> there is clk tree and "clk_get,clk_prepare,clk_set_rate" work well in the >> driver. >> I do not want to remove the clk apis usage from the device driver for >> xen, because the driver >> also serves when no xen support. The pvclk interface is mainly to let the >> clk api can work as no xen. > >Would adding a dummy fixed-clock[0] (or several of them) to the guest >passthrough DT satisfy the driver's requirements? They would be hardcoded >to whatever rate dom0 and/or the tools has decided upon (and had set in the >real h/w). If using this way, we have the assumption that DomU device driver would not change the rate of the clock driving the device. I am not sure whether this is ok for so many platform devices based ARM core. In /sys/kernel/debug/clk/, there are clk tree info, but clk api are not exposed to userspace as far as I know, so if using sysfs interface to set a known fixed rate or enable/disable the clock, we need to expose the clk info to userspace. Jan said using hypercall in the other mail, do you have any ideas about this? Thanks, Peng. > >Ian. > >[0] Documentation/devicetree/bindings/clock/fixed-clock.txt > ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [RFC V2] xen: interface: introduce pvclk interface
On Thu, 21 Jan 2016, Peng Fan wrote: > Hi Ian, > > On Thu, Jan 21, 2016 at 12:26:04PM +, Ian Campbell wrote: > >On Thu, 2016-01-21 at 19:55 +0800, Peng Fan wrote: > >> Hi Ian, > >> > >> On Thu, Jan 21, 2016 at 10:19:32AM +, Ian Campbell wrote: > >> > On Thu, 2016-01-21 at 16:59 +0800, Peng Fan wrote: > >> > > > >> > > To platform device of ARM, hypervisor is responsible for the mapping > >> > > between machine address and guest physical address, also responsible > >> > > for the irq mapping. > >> > > > >> > > But to embedded ARM SoC, the hardware clk IP is handled in Dom0. > >> > > >> > Arguably Xen ought to be the one to do this, but we have determined > >> > (rightly, I think) that doing so for the entirely clk tree of an SoC > >> > would > >> > involve importing an unmanageable amount of code into Xen[0]. > >> > > >> > In the meantime we defer this to Dom0. > >> > > >> > I wonder though if it would be possible to manage the clocks for a > >> > passthrough device from the toolstack, i.e. is there a sysfs node where > >> > one > >> > can say "keep the clock for this device enabled (at xMhz) even though > >> > you > >> > think the device is unused"? > >> > >> I am afraid not. The linux device driver without xen can work well, > >> because > >> there is clk tree and "clk_get,clk_prepare,clk_set_rate" work well in the > >> driver. > >> I do not want to remove the clk apis usage from the device driver for > >> xen, because the driver > >> also serves when no xen support. The pvclk interface is mainly to let the > >> clk api can work as no xen. > > > >Would adding a dummy fixed-clock[0] (or several of them) to the guest > >passthrough DT satisfy the driver's requirements? They would be hardcoded > >to whatever rate dom0 and/or the tools has decided upon (and had set in the > >real h/w). > > If using this way, we have the assumption that DomU device driver would not > change the rate of the clock driving the device. I am not sure whether this is > ok for so many platform devices based ARM core. > > In /sys/kernel/debug/clk/, there are clk tree info, but > clk api are not exposed to userspace as far as I know, so > if using sysfs interface to set a known fixed rate or enable/disable the > clock, > we need to expose the clk info to userspace. Keeping in mind that we might now want to let DomU change the actual clock frequency anyway, exposing a dummy clock to keep the DomU driver happy might be a good option. However whether we set the frequency from the Dom0 kernel or from the toolstack, how do we find out the right clock rate? From a grep in the kernel sources, it looks like some drivers still have hardcoded values. Asking the user to provide the clock rate seems a bit too much to me. > Jan said using hypercall in the other mail, do you have any ideas about this? I recall having discussed this in the past and the conclusion was that moving the clock framework into Xen, so that the hypervisor could directly control clock frequencies, although it might make sense from an architectural point of view, would be too complex to be feasible.___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] Xen PV PTE ABI (or lack thereof)
>>> On 21.01.16 at 12:16,wrote: > On 21/01/16 11:07, Jan Beulich wrote: > On 20.01.16 at 21:10, wrote: >>> First of all, SMEP and SMAP. 32bit PV guests are subject to Xen's >>> SMEP/SMAP choices, because of running in ring 1. >>> >>> SMAP in particular is problematic because older Linux guests do fall >>> foul of it; they don't understand what a SMAP pagefault is, and enter an >>> infinite loop of pagefaults. SMEP is also problematic because it breaks >>> any guest wishing to use a shared address space between kernel and >>> user. (I had some fun getting the test framework to function until I >>> twigged what was happening). >>> >>> Both of these are regressions; older guests relying on existing >>> behaviour cease to function on newer hardware/Xen despite identical >>> settings. >> And for both of them there simply should be a way for the guest to >> state whether it's compatible (which should be the case for anything >> we can't deal with completely transparently to guests). >> >>> For the PTE bits, _PAGE_GNTTAB (bit 62) is used exclusively in debug >>> build (so there is a guest observable difference between running on a >>> debug and a non-debug Xen), and the comment beside it even identifies >>> that it breaks BSD guests. PTE bits 62:59 used by hardware if CR4.PKE >>> is set. Currently this means that we are not able to support Protection >>> Key for PV guests (although this restriction technically only applies to >>> debug builds of the hypervisor). >>> >>> The other PTE bit used by Xen is _PAGE_GUEST_KERNEL (bit 52). This bit >>> is used to notice when a 64bit PV guest attempts to override the fixup >>> Xen applies to its PTEs. Xen unilaterally sets _PAGE_GLOBAL for user >>> pages, and clears _PAGE_GLOBAL for supervisor mappings, setting >>> _PAGE_USER in both cases as the PV kernel runs in ring3. The only thing >>> _PAGE_GUEST_KERNEL is used for is to notice when the kernel deliberately >>> tries to create a _PAGE_GUEST_KERNEL|_PAGE_GLOBAL, at which point a >>> warning is logged and the kernel overridden. >>> >>> >>> Neither of the used PTE bits exist in the Xen public ABI. Neither of >>> them serve a purpose other than a debugging aid. >>> >>> I propose hiding them behind CONFIG_PV_PTE_DEBUG and declaring an ABI of >>> "all bits available for guest use". >> And a kernel using any of the conflicting bits would then become >> unusable on a hypervisor with that debug option enabled? I'd >> rather see us document the state things are in... > > _PAGE_GNTMAP is already states: > > /* > * Debug option: Ensure that granted mappings are not implicitly unmapped. > * WARNING: This will need to be disabled to run OSes that use the spare PTE > * bits themselves (e.g., *BSD). > */ But that's (assuming the use of the two bits were spelled out) a guest OS not fully playing by the spec. To me, "available" PTE bits being shared implies that some of them may be claimed by Xen, while others may be claimed by guests. You're right that this needs to be written down, but I don't think we need to go as far as forbidding Xen to use any of them. And even less so should we preclude their use for any purpose going forward. In the end, with (as it seems) not much effort this could even be made dynamic: A guest could advertise which of the bits it doesn't use, and then Xen could pick two of them for the two purposes it currently needs them for. Should a guest leave no or just one bit available, the debugging aid could then be disabled. > I was intending to have CONFIG_PV_PTE_DEBUG as an EXPERT option, > disabled by default even in debug builds. > > There should not be an ABI difference between release and "normal" debug > builds. Well, I see your point, but as said above I'm not convinced disabling all that code is the right solution. In fact, what you propose is not far away from removing that code altogether. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 4/4] hvmloader: add support to load extra ACPI tables from qemu
On 01/21/16 03:25, Jan Beulich wrote: > >>> On 21.01.16 at 10:10,wrote: > > On 01/21/2016 04:53 PM, Jan Beulich wrote: > > On 21.01.16 at 09:25, wrote: > >>> On 01/21/2016 04:18 PM, Jan Beulich wrote: > Yes. But then - other than you said above - it still looks to me as > if the split between PMEM and PBLK is arranged for by firmware? > >>> > >>> Yes. But OS/Hypervisor is not excepted to dynamically change its configure > >>> (re-split), > >>> i,e, for PoV of OS/Hypervisor, it is static. > >> > >> Exactly, that has been my understanding. And hence the PMEM part > >> could be under the hypervisor's control, while the PBLK part could be > >> Dom0's responsibility. > >> > > > > I am not sure if i have understood your point. What your suggestion is that > > leave PMEM for hypervisor and all other parts (PBLK and _DSM handling) to > > Dom0? If yes, we should: > > a) handle hotplug in hypervisor (new PMEM add/remove) that causes hyperivsor > > interpret ACPI SSDT/DSDT. > > Why would this be different from ordinary memory hotplug, where > Dom0 deals with the ACPI CA interaction, notifying Xen about the > added memory? > The process of NVDIMM hotplug is similar to the ordinary memory hotplug, and seemingly possible to support it in Xen hypervisor like ordinary memory hotplug. > > b) some _DSMs control PMEM so you should filter out these kind of _DSMs and > > handle them in hypervisor. > > Not if (see above) following the model we currently have in place. > You mean let dom0 linux evaluates those _DSMs and interact with hypervisor if necessary (e.g. XENPF_mem_hotadd for memory hotplug)? > > c) hypervisor should mange PMEM resource pool and partition it to multiple > > VMs. > > Yes. > But I Still do not quite understand this part: why must pmem resource management and partition be done in hypervisor? I mean if we allow the following steps of operations (for example) (1) partition pmem in dom 0 (2) get address and size of each partition (part_addr, part_size) (3) call a hypercall like nvdimm_memory_mapping(d, part_addr, part_size, gpfn) to map a partition to the address gpfn in dom d. Only the last step requires hypervisor. Would anything be wrong if we allow above operations? Ha ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [RFC] VirtFS support on Xen
On 21/01/16 10:28, Wei Liu wrote: > [RFC] VirtFS support on Xen > > # Introduction > > QEMU/KVM supports file system passthrough via an interface called > VirtFS [0]. VirtFS is in turn implemented with 9pfs protocol [1] and > VirtIO transport. > > Xen used to have its own implementation of file system passthrough > called XenFS, but that has been inactive for a few years. The latest > update was in 2009 [2]. > > This project aims to add VirtFS support on Xen. This is more > sustainable than inventing our own wheel.# What's the use case for this? Who wants this feature? David ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [xen-4.5-testing baseline-only test] 38676: trouble: blocked/broken/fail/pass
On 21/01/16 11:00, Ian Campbell wrote: > On Thu, 2016-01-21 at 07:49 +0100, Juergen Gross wrote: >> On 21/01/16 07:31, Platform Team regression test user wrote: >>> This run is configured for baseline tests only. >>> >>> flight 38676 xen-4.5-testing real [real] >>> http://osstest.xs.citrite.net/~osstest/testlogs/logs/38676/ >> >> I've seen this more than once now: some test results seem not to be >> reachable from outside Citrix: >> >> $ host osstest.xs.citrite.net >> Host osstest.xs.citrite.net not found: 3(NXDOMAIN) >> >> Is this on purpose? > > More due to a lack of somewhere to post them publicly. Okay. I just wanted to make sure you are aware of that issue. I'm fine with ignoring those mails. :-) Juergen ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] Xen PV PTE ABI (or lack thereof)
>>> On 20.01.16 at 21:10,wrote: > First of all, SMEP and SMAP. 32bit PV guests are subject to Xen's > SMEP/SMAP choices, because of running in ring 1. > > SMAP in particular is problematic because older Linux guests do fall > foul of it; they don't understand what a SMAP pagefault is, and enter an > infinite loop of pagefaults. SMEP is also problematic because it breaks > any guest wishing to use a shared address space between kernel and > user. (I had some fun getting the test framework to function until I > twigged what was happening). > > Both of these are regressions; older guests relying on existing > behaviour cease to function on newer hardware/Xen despite identical > settings. And for both of them there simply should be a way for the guest to state whether it's compatible (which should be the case for anything we can't deal with completely transparently to guests). > For the PTE bits, _PAGE_GNTTAB (bit 62) is used exclusively in debug > build (so there is a guest observable difference between running on a > debug and a non-debug Xen), and the comment beside it even identifies > that it breaks BSD guests. PTE bits 62:59 used by hardware if CR4.PKE > is set. Currently this means that we are not able to support Protection > Key for PV guests (although this restriction technically only applies to > debug builds of the hypervisor). > > The other PTE bit used by Xen is _PAGE_GUEST_KERNEL (bit 52). This bit > is used to notice when a 64bit PV guest attempts to override the fixup > Xen applies to its PTEs. Xen unilaterally sets _PAGE_GLOBAL for user > pages, and clears _PAGE_GLOBAL for supervisor mappings, setting > _PAGE_USER in both cases as the PV kernel runs in ring3. The only thing > _PAGE_GUEST_KERNEL is used for is to notice when the kernel deliberately > tries to create a _PAGE_GUEST_KERNEL|_PAGE_GLOBAL, at which point a > warning is logged and the kernel overridden. > > > Neither of the used PTE bits exist in the Xen public ABI. Neither of > them serve a purpose other than a debugging aid. > > I propose hiding them behind CONFIG_PV_PTE_DEBUG and declaring an ABI of > "all bits available for guest use". And a kernel using any of the conflicting bits would then become unusable on a hypervisor with that debug option enabled? I'd rather see us document the state things are in... > The other question is what we do when it comes to %cr4 and PV guests. > > The current SMAP issue is a blocker for XenServer, and I have some nasty > logic to fix up behind the guests back. I have only just discovered the > SMEP issue, but it is still a regression (again, nothing states that a > PV guest must have a split address space; segmentation is a perfectly > valid option in 32bit guests). The PK issue is one which shouldn't be > an issue for us to implement in PV guests. > > I am leaning towards allowing a toolstack to permit a PV guest to be > able to play with a few more CR4 bits. We can't give a guest kernel > complete carte blanche, because of the security implications. However, > we do already context switch CR4 for PV guests, so a few extra bits on > a "nominated safe" domain is no extra hassle. Sounds reasonable. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [xen-4.4-testing test] 78620: tolerable FAIL - PUSHED
flight 78620 xen-4.4-testing real [real] http://logs.test-lab.xenproject.org/osstest/logs/78620/ Failures :-/ but no regressions. Regressions which are regarded as allowable (not blocking): test-armhf-armhf-xl-multivcpu 15 guest-start/debian.repeatfail like 77671 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop fail like 77671 test-amd64-i386-xl-qemut-win7-amd64 16 guest-stop fail like 77671 Tests which did not succeed, but are not blocking: test-amd64-i386-rumpuserxen-i386 1 build-check(1) blocked n/a test-amd64-amd64-rumpuserxen-amd64 1 build-check(1) blocked n/a build-amd64-rumpuserxen 6 xen-buildfail never pass test-armhf-armhf-xl-vhd 9 debian-di-installfail never pass build-i386-rumpuserxen6 xen-buildfail never pass test-armhf-armhf-libvirt-qcow2 9 debian-di-installfail never pass test-amd64-amd64-qemuu-nested-intel 16 debian-hvm-install/l1/l2 fail never pass test-amd64-amd64-qemuu-nested-amd 16 debian-hvm-install/l1/l2 fail never pass test-amd64-amd64-libvirt 12 migrate-support-checkfail never pass test-amd64-i386-libvirt 12 migrate-support-checkfail never pass test-armhf-armhf-xl-multivcpu 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-multivcpu 12 migrate-support-checkfail never pass test-armhf-armhf-xl-arndale 12 migrate-support-checkfail never pass test-armhf-armhf-xl-arndale 13 saverestore-support-checkfail never pass test-armhf-armhf-libvirt 11 guest-start fail never pass test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stop fail never pass test-armhf-armhf-xl 13 saverestore-support-checkfail never pass test-armhf-armhf-xl 12 migrate-support-checkfail never pass test-amd64-i386-xend-qemut-winxpsp3 20 leak-check/checkfail never pass test-armhf-armhf-xl-credit2 12 migrate-support-checkfail never pass test-armhf-armhf-xl-credit2 13 saverestore-support-checkfail never pass test-amd64-amd64-libvirt-vhd 11 migrate-support-checkfail never pass test-armhf-armhf-xl-cubietruck 12 migrate-support-checkfail never pass test-armhf-armhf-xl-cubietruck 13 saverestore-support-checkfail never pass test-armhf-armhf-libvirt-raw 9 debian-di-installfail never pass version targeted for testing: xen 425f7f77ce950f39236f7c13e288ce5198c3576a baseline version: xen 6d2c41d676339ecc6255b92876b6153d4f85f579 Last test of basis77671 2016-01-09 20:12:32 Z 11 days Testing same since78620 2016-01-20 13:41:06 Z0 days1 attempts People who touched revisions under test: Andrew CooperIan Campbell Jan Beulich Kevin Tian jobs: build-amd64-xend pass build-i386-xend pass build-amd64 pass build-armhf pass build-i386 pass build-amd64-libvirt pass build-armhf-libvirt pass build-i386-libvirt pass build-amd64-pvopspass build-armhf-pvopspass build-i386-pvops pass build-amd64-rumpuserxen fail build-i386-rumpuserxen fail test-amd64-amd64-xl pass test-armhf-armhf-xl pass test-amd64-i386-xl pass test-amd64-amd64-qemuu-nested-amdfail test-amd64-i386-qemut-rhel6hvm-amd pass test-amd64-i386-qemuu-rhel6hvm-amd pass test-amd64-amd64-xl-qemut-debianhvm-amd64pass test-amd64-i386-xl-qemut-debianhvm-amd64 pass test-amd64-amd64-xl-qemuu-debianhvm-amd64pass test-amd64-i386-xl-qemuu-debianhvm-amd64 pass test-amd64-i386-freebsd10-amd64 pass test-amd64-amd64-xl-qemuu-ovmf-amd64 pass test-amd64-i386-xl-qemuu-ovmf-amd64 pass test-amd64-amd64-rumpuserxen-amd64 blocked
Re: [Xen-devel] [PATCH V13 5/5] xl: add pvusb commands
On Tue, Jan 19, 2016 at 08:10:07PM -0700, Jim Fehlig wrote: > On 01/19/2016 11:11 AM, Ian Jackson wrote: > > Chunyan Liu writes ("[PATCH V13 5/5] xl: add pvusb commands"): > >> Add pvusb commands: usbctrl-attach, usbctrl-detach, usb-list, > >> usbdev-attach and usbdev-detach. > > Thanks for swapping this with the other patch. It is better now. > > > >> +=item B I I > > However, I think you need to explictly state that the user may (and > > indeed, must) pass multiple settings as separate arguments. AFAICT > > the parser here doesn't do the ,-splitting. > > I just noticed this is the case with network devices as well. E.g. > > #xl network-attach hvm-domU mac=00:16:3e:xx:yy:zz,bridge=br0 > libxl: error: libxl_device.c:1095:device_hotplug_child_death_cb: script: Could > not find bridge device xenbr0 > > main_networkattach() in tools/libxl/xl_cmdimpl.c doesn't split on the ',', so > everything after mac=00:16:3e:xx:yy:zz is ignored. I'd need advice on how to > fix > this though. Based on xl-network-configuration doc and Xen tool's long history > of network-attach supporting that syntax, I'd say main_networkattach() should > be > changed to split on ','. I could also change the docs. Do tools maintainers > have > a preference, or alternative option? > It is splitting on " " at the moment which is not very desirable. I think this is a bug. I've added it to my list and will look into fixing it. Wei. > Regards, > Jim > ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] xen-netfront crash when detaching network while some network activity
On 01/20/2016 09:59 PM, Konrad Rzeszutek Wilk wrote: > On Tue, Dec 01, 2015 at 11:32:58PM +0100, Marek Marczykowski-Górecki wrote: >> On Tue, Dec 01, 2015 at 05:00:42PM -0500, Konrad Rzeszutek Wilk wrote: >>> On Tue, Nov 17, 2015 at 03:45:15AM +0100, Marek Marczykowski-Górecki wrote: On Wed, Oct 21, 2015 at 08:57:34PM +0200, Marek Marczykowski-Górecki wrote: > On Wed, May 27, 2015 at 12:03:12AM +0200, Marek Marczykowski-Górecki > wrote: >> On Tue, May 26, 2015 at 11:56:00AM +0100, David Vrabel wrote: >>> On 22/05/15 12:49, Marek Marczykowski-Górecki wrote: Hi all, I'm experiencing xen-netfront crash when doing xl network-detach while some network activity is going on at the same time. It happens only when domU has more than one vcpu. Not sure if this matters, but the backend is in another domU (not dom0). I'm using Xen 4.2.2. It happens on kernel 3.9.4 and 4.1-rc1 as well. Steps to reproduce: 1. Start the domU with some network interface 2. Call there 'ping -f some-IP' 3. Call 'xl network-detach NAME 0' >>> >>> Do you see this all the time or just on occassions? >> >> Using above procedure - all the time. >> >>> I tried to reproduce it and couldn't see it. Is your VM an PV or HVM? >> >> PV, started by libvirt. This may have something to do, the problem didn't >> existed on older Xen (4.1) and started by xl. I'm not sure about kernel >> version there, but I think I've tried there 3.18 too, which has this >> problem. >> >> But I don't see anything special in domU config file (neither backend >> nor frontend) - it may be some libvirt default. If that's really the >> cause. Can I (and how) get any useful information about that? > > libvirt naturally does some libxl calls, and they may be different. > > Any chance you could give me an idea of: > - What commands you use in libvirt? > - Do you use a bond or bridge? > - What version of libvirt you are using? > > Thanks! > CC-ing Joao just in case he has seen this. >> Hm, So far I couldn't reproduce the issue with upstream Xen/linux/libvirt, using both libvirt or plain xl (both on a bridge setup) and also irrespective of the both load and direction of traffic (be it a ping flood, pktgen with min. sized packets or iperf). >> >> -- >> Best Regards, >> Marek Marczykowski-Górecki >> Invisible Things Lab >> A: Because it messes up the order in which people normally read text. >> Q: Why is top-posting such a bad thing? > > ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] netfront/netback multiqueue exhausting grants
On 20/01/16 12:23, Ian Campbell wrote: > There have been a few reports recently[0] which relate to a failure of > netfront to allocate sufficient grant refs for all the queues: > > [0.533589] xen_netfront: can't alloc rx grant refs > [0.533612] net eth0: only created 31 queues > > Which can be worked around by increasing the number of grants on the > hypervisor command line or by limiting the number of queues permitted by > either back or front using a module param (which was broken but is now > fixed on both sides, but I'm not sure it has been backported everywhere > such that it is a reliable thing to always tell users as a workaround). > > Is there any plan to do anything about the default/out of the box > experience? Either limiting the number of queues or making both ends cope > more gracefully with failure to create some queues (or both) might be > sufficient? > > I think the crash after the above in the first link at [0] is fixed? I > think that was the purpose of ca88ea1247df "xen-netfront: update num_queues > to real created" which was in 4.3. I think the correct solution is to increase the default maximum grant table size. Although, unless you're using the not-yet-applied per-cpu rwlock patches multiqueue is terrible on many (multisocket) systems and the number of queue should be limited in netback to 4 or even just 2. David ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] cleancache: constify cleancache_ops structure
On Thu, 21 Jan 2016, David Vrabel wrote: > On 23/12/15 21:06, Julia Lawall wrote: > > The cleancache_ops structure is never modified, so declare it as const. > > > > This also removes the __read_mostly declaration on the cleancache_ops > > variable declaration, since it seems redundant with const. > > > > Done with the help of Coccinelle. > > > > Signed-off-by: Julia Lawall> > > > --- > > > > Not sure that the __read_mostly change is correct. Does it apply to the > > variable, or to what the variable points to? > > The variable, so... Thanks. I'll update the patch, unless you have already fixed it. julia > > --- a/mm/cleancache.c > > +++ b/mm/cleancache.c > > @@ -22,7 +22,7 @@ > > * cleancache_ops is set by cleancache_register_ops to contain the pointers > > * to the cleancache "backend" implementation functions. > > */ > > -static struct cleancache_ops *cleancache_ops __read_mostly; > > +static const struct cleancache_ops *cleancache_ops; > > ...you want to retain the __read_mostly here. > > David > -- > To unsubscribe from this list: send the line "unsubscribe kernel-janitors" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v3 5/5] libxl: add options to enable/disable emulated devices
On Thu, Jan 21, 2016 at 11:01:43AM +0100, Roger Pau Monné wrote: > El 21/01/16 a les 10.39, Ian Campbell ha escrit: > > On Wed, 2016-01-20 at 19:33 +0100, Roger Pau Monné wrote: > >> El 20/01/16 a les 14.01, Ian Campbell ha escrit: > >>> On Wed, 2016-01-20 at 12:57 +0100, Roger Pau Monne wrote: > Allow enabling or disabling emulated devices from the libxl domain > configuration file. For HVM guests with a device model all the > emulated > devices are enabled. For HVM guests without a device model no devices > are > enabled by default, although they can be enabled using the options > provided. > The arbiter of whether a combination is posible or not is always Xen, > libxl > doesn't do any kind of check. > > This set of options is also propagated inside of the libxl migration > record > as part of the contents of the libxl_domain_build_info struct. > >>> > >>> ... and this is the real motivation for this change, not actually > >>> allowing > >>> users to control all this AIUI. > >>> > >>> Did you check that the fields updated using libxl_defbool_setdefault > >>> are > >>> actually updated in the JSON copy and therefore propagated to the other > >>> side of a migration as specific values and not as "pick a default"? I > >>> think > >>> we don't want these changing on migration. I think/hope all this was > >>> automatically handled by the work Wei did in the last release cycle. > >> > >> No, values populated by the {build/create}_info_setdefault functions are > >> not propagated, OTOH values manually set by the user in the config file > >> are indeed propagated. Do we have any guarantee that _setdefault is > >> always going to behave in the same way? > > > > No, part of the purpose of defbool and the other "do the default" values is > > that we can evolve things over time. > > > >> If we don't have that guarantee I think this is already a bug, and we > >> should call _setdefault before sending the domain info to the other end. > >> In fact I have a patch that does exactly that, but I'm unsure if it's > >> needed because I don't know the policy regarding default values in libxl. > > > > Wei, isn't this (turning the defaults into concrete values) supposed to be > > taken care of by the JSON mangling which you added? > > Heh, I think you mean the JSON mangling added by Wei. In order to > propagate the values filled by default in libxl_domain_config I had to > add the following patch, which basically calls the _setdefault > functions before converting the domain_config into JSON. I'm planning > to make it part of this series in the next iteration: The requirement of recording decision made in libxl and pass that to the receiving end is not new. We had the same problem for uuid, disk and some other things. The first way of doing it is to update JSON before it is sent -- see libxl.c:libxl_retrieve_domain_configuration. It uses the stashed JSON file as template and pull in various bits from hypervisor and xenstore. Your need of recording what emulated devices are available fits here. You just need to provide a way to retrieve those bits in that function. Another way of doing it is to update the stashed JSON template before it is saved. See libxl_internal.c:libxl__update_domain_configuration. I think this might be easier than the first way of doing it. You should not export the _setdefault function to xl because it is a layer violation. Hope this clarify things. Wei. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [qemu-mainline test] 78597: tolerable FAIL - PUSHED
flight 78597 qemu-mainline real [real] http://logs.test-lab.xenproject.org/osstest/logs/78597/ Failures :-/ but no regressions. Tests which are failing intermittently (not blocking): test-armhf-armhf-xl-cubietruck 15 guest-start/debian.repeat fail in 78509 pass in 78597 test-armhf-armhf-xl-rtds 11 guest-startfail in 78509 pass in 78597 test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm 9 debian-hvm-install fail pass in 78509 test-armhf-armhf-xl-vhd 14 guest-start/debian.repeat fail pass in 78509 Regressions which are regarded as allowable (not blocking): test-armhf-armhf-xl-rtds 15 guest-start/debian.repeatfail like 78421 Tests which did not succeed, but are not blocking: test-amd64-amd64-xl-pvh-amd 11 guest-start fail never pass test-amd64-amd64-xl-pvh-intel 11 guest-start fail never pass test-armhf-armhf-libvirt-xsm 12 migrate-support-checkfail never pass test-armhf-armhf-libvirt-xsm 14 guest-saverestorefail never pass test-amd64-i386-libvirt 12 migrate-support-checkfail never pass test-amd64-i386-libvirt-xsm 12 migrate-support-checkfail never pass test-amd64-amd64-qemuu-nested-amd 16 debian-hvm-install/l1/l2 fail never pass test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass test-amd64-amd64-libvirt 12 migrate-support-checkfail never pass test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop fail never pass test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop fail never pass test-armhf-armhf-xl-credit2 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-credit2 12 migrate-support-checkfail never pass test-armhf-armhf-xl-multivcpu 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-multivcpu 12 migrate-support-checkfail never pass test-armhf-armhf-xl 12 migrate-support-checkfail never pass test-armhf-armhf-xl 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-cubietruck 12 migrate-support-checkfail never pass test-armhf-armhf-xl-cubietruck 13 saverestore-support-checkfail never pass test-amd64-amd64-libvirt-xsm 12 migrate-support-checkfail never pass test-amd64-amd64-libvirt-vhd 11 migrate-support-checkfail never pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass test-armhf-armhf-xl-rtds 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-rtds 12 migrate-support-checkfail never pass test-armhf-armhf-libvirt 14 guest-saverestorefail never pass test-armhf-armhf-libvirt 12 migrate-support-checkfail never pass test-armhf-armhf-libvirt-qcow2 11 migrate-support-checkfail never pass test-armhf-armhf-libvirt-qcow2 13 guest-saverestorefail never pass test-armhf-armhf-libvirt-raw 13 guest-saverestorefail never pass test-armhf-armhf-libvirt-raw 11 migrate-support-checkfail never pass test-armhf-armhf-xl-vhd 11 migrate-support-checkfail never pass test-armhf-armhf-xl-vhd 12 saverestore-support-checkfail never pass test-armhf-armhf-xl-xsm 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-xsm 12 migrate-support-checkfail never pass test-armhf-armhf-xl-arndale 12 migrate-support-checkfail never pass test-armhf-armhf-xl-arndale 13 saverestore-support-checkfail never pass version targeted for testing: qemuu3db34bf64ab4f8797565dd8750003156c32b301d baseline version: qemuu19b6d84316892c8086e0115d6f09cb01abb86cfc Last test of basis78421 2016-01-18 09:47:22 Z3 days Testing same since78509 2016-01-19 09:21:37 Z2 days2 attempts People who touched revisions under test: Andreas FärberChristophe Fergeau Daniel P. Berrange Gerd Hoffmann Juan Quintela Mark Cave-Ayland Paolo Bonzini Peter Maydell Wolfgang Bumiller jobs: build-amd64-xsm pass build-armhf-xsm pass build-i386-xsm pass build-amd64 pass build-armhf pass build-i386 pass build-amd64-libvirt pass build-armhf-libvirt pass build-i386-libvirt
Re: [Xen-devel] [PATCH V13 5/5] xl: add pvusb commands
On Thu, Jan 21, 2016 at 12:21:11PM +, Ian Campbell wrote: > On Thu, 2016-01-21 at 12:12 +, Wei Liu wrote: > > On Tue, Jan 19, 2016 at 08:10:07PM -0700, Jim Fehlig wrote: > > > On 01/19/2016 11:11 AM, Ian Jackson wrote: > > > > Chunyan Liu writes ("[PATCH V13 5/5] xl: add pvusb commands"): > > > > > Add pvusb commands: usbctrl-attach, usbctrl-detach, usb-list, > > > > > usbdev-attach and usbdev-detach. > > > > Thanks for swapping this with the other patch. It is better now. > > > > > > > > > +=item B I I > > > > However, I think you need to explictly state that the user may (and > > > > indeed, must) pass multiple settings as separate arguments. AFAICT > > > > the parser here doesn't do the ,-splitting. > > > > > > I just noticed this is the case with network devices as well. E.g. > > > > > > #xl network-attach hvm-domU mac=00:16:3e:xx:yy:zz,bridge=br0 > > > libxl: error: libxl_device.c:1095:device_hotplug_child_death_cb: > > > script: Could > > > not find bridge device xenbr0 > > > > > > main_networkattach() in tools/libxl/xl_cmdimpl.c doesn't split on the > > > ',', so > > > everything after mac=00:16:3e:xx:yy:zz is ignored. I'd need advice on > > > how to fix > > > this though. Based on xl-network-configuration doc and Xen tool's long > > > history > > > of network-attach supporting that syntax, I'd say main_networkattach() > > > should be > > > changed to split on ','. I could also change the docs. Do tools > > > maintainers have > > > a preference, or alternative option? > > > > > > > It is splitting on " " at the moment which is not very desirable. > > > > I think this is a bug. I've added it to my list and will look into > > fixing it. > > If you are feeling particularly highly motivated then xl_cmdimpl.c contains > quite a lot of adhoc parsing of various comma a space separated lists > (often dupped for a given instance into cmdline vs cfg file, which leads to > unintentional behavioural differences like above) which could usefully be > consolidated into a series of helpers. > Or rewrite every adhoc parser with bison/flex or Ocaml / Haskell parsec. Jokes aside. I will see what I can do. Consolidating them into helper functions is a good start. Wei. > Ian. > ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH LIBVIRT v3] libxl: Support cmdline= in xl config files
... and consolidate the cmdline/extra/root parsing to facilitate doing so. The logic is the same as xl's parse_cmdline from the current xen.git master branch (e6f0e099d2c17de47fd86e817b1998db903cab61). On the formatting side switch to producing cmdline= instead of extra=. Update a few tests and add serveral more. - test-cmdline is added to test the exclusive use of cmdline. - test-fullvirt-direct-kernel-boot.cfg is updated due to the switch on the formatting side and now tests the exclusive use of cmdline=. - Tests are added for both paravirt and fullvirt where the .cfg uses extra= and (paravirt only) root=. These are format (xl->xml) only since the inverse will generate cmdline= hence is not a round trip (which was already true if using root=, which used to generate extra= on the way back). - Tests are added for both paravirt and fullvirt where the .cfg declares cmdline= as well as bogus extra= and (paravirt only) root= entries which should be ignored. Again these are format only tests since the inverse won't include the bogus lines. The last two bullets here required splitting the DO_TEST macro into two halves, as is done in the xmconfigtest.c case. In order to introduce a use of VIR_WARN for logging I had to add virerror.h and VIR_LOG_INIT. Signed-off-by: Ian Campbell--- v2: Use VIR_INFO (adding necessary infra) Don't initialise things to NULL when there is no need. v3: I know know the answer re VIR_FROM_THIS, because Jim fixed it. Initialise cmdline to NULL, since neither I nor gcc were smart enough to spot the uninitialised path I did this in preference to adding the else case, since that apparently won't be masking the compiler's ability to spot uninitialised vars in this function. Add tests Addjust xenFormatXLOS to produce cmdline= instead of extra=. --- src/xenconfig/xen_xl.c | 70 +- ...est-fullvirt-direct-kernel-boot-bogus-extra.cfg | 31 ++ ...est-fullvirt-direct-kernel-boot-bogus-extra.xml | 51 .../test-fullvirt-direct-kernel-boot-extra.cfg | 30 ++ .../test-fullvirt-direct-kernel-boot-extra.xml | 51 .../test-fullvirt-direct-kernel-boot.cfg | 2 +- .../test-paravirt-cmdline-bogus-extra-root.cfg | 13 .../test-paravirt-cmdline-bogus-extra-root.xml | 32 ++ .../test-paravirt-cmdline-extra-root.cfg | 15 + .../test-paravirt-cmdline-extra-root.xml | 32 ++ tests/xlconfigdata/test-paravirt-cmdline.cfg | 14 + tests/xlconfigdata/test-paravirt-cmdline.xml | 32 ++ tests/xlconfigtest.c | 24 ++-- 13 files changed, 365 insertions(+), 32 deletions(-) create mode 100644 tests/xlconfigdata/test-fullvirt-direct-kernel-boot-bogus-extra.cfg create mode 100644 tests/xlconfigdata/test-fullvirt-direct-kernel-boot-bogus-extra.xml create mode 100644 tests/xlconfigdata/test-fullvirt-direct-kernel-boot-extra.cfg create mode 100644 tests/xlconfigdata/test-fullvirt-direct-kernel-boot-extra.xml create mode 100644 tests/xlconfigdata/test-paravirt-cmdline-bogus-extra-root.cfg create mode 100644 tests/xlconfigdata/test-paravirt-cmdline-bogus-extra-root.xml create mode 100644 tests/xlconfigdata/test-paravirt-cmdline-extra-root.cfg create mode 100644 tests/xlconfigdata/test-paravirt-cmdline-extra-root.xml create mode 100644 tests/xlconfigdata/test-paravirt-cmdline.cfg create mode 100644 tests/xlconfigdata/test-paravirt-cmdline.xml diff --git a/src/xenconfig/xen_xl.c b/src/xenconfig/xen_xl.c index 026cbcc..be194e3 100644 --- a/src/xenconfig/xen_xl.c +++ b/src/xenconfig/xen_xl.c @@ -27,6 +27,7 @@ #include "virconf.h" #include "virerror.h" +#include "virlog.h" #include "domain_conf.h" #include "viralloc.h" #include "virstring.h" @@ -35,6 +36,8 @@ #define VIR_FROM_THIS VIR_FROM_XENXL +VIR_LOG_INIT("xen.xen_xl"); + /* * Xen provides a libxl utility library, with several useful functions, * specifically xlu_disk_parse for parsing xl disk config strings. @@ -58,11 +61,46 @@ extern int xlu_disk_parse(XLU_Config *cfg, libxl_device_disk *disk); #endif +static int xenParseCmdline(virConfPtr conf, char **r_cmdline) +{ +char *cmdline = NULL; +const char *root, *extra, *buf; + +if (xenConfigGetString(conf, "cmdline", , NULL) < 0) +return -1; + +if (xenConfigGetString(conf, "root", , NULL) < 0) +return -1; + +if (xenConfigGetString(conf, "extra", , NULL) < 0) +return -1; + +if (buf) { +if (VIR_STRDUP(cmdline, buf) < 0) +return -1; +if (root || extra) +VIR_WARN("ignoring root= and extra= in favour of cmdline="); +} else { +if (root && extra) { +if (virAsprintf(, "root=%s %s", root, extra) < 0) +return -1; +} else
Re: [Xen-devel] [RFC V2] xen: interface: introduce pvclk interface
On Thu, 2016-01-21 at 20:35 +0800, Peng Fan wrote: > On Thu, Jan 21, 2016 at 12:26:04PM +, Ian Campbell wrote: > > Would adding a dummy fixed-clock[0] (or several of them) to the guest > > passthrough DT satisfy the driver's requirements? They would be hardcoded > > to whatever rate dom0 and/or the tools has decided upon (and had set in the > > real h/w). > > If using this way, we have the assumption that DomU device driver would not > change the rate of the clock driving the device. I am not sure whether this is > ok for so many platform devices based ARM core. Don't (non-buggy) drivers already need to cope with the possibility that the clk core may not be able to satisfy their request for a particular rate due to constraints from other ip blocks? I would also expect the user to want to configure things in dom0 such that the specific drivers used in domU are satisfied with what they get (which is a bit fiddly perhaps, but platform device passthrough already is somewhat that way). > In /sys/kernel/debug/clk/, there are clk tree info, but > clk api are not exposed to userspace as far as I know, so > if using sysfs interface to set a known fixed rate or enable/disable the > clock, > we need to expose the clk info to userspace. That seems possible to arrange given a use case for it though, a SMOP (and convincing the clk maintainer of the need). Worst case is a xen-clkback driver or perhaps even vfio will want to do something like this, we can't normally use vfio on Xen, but in this case perhaps we could. > Jan said using hypercall in the other mail, do you have any ideas about > this? This would need a whole clock infrastructure in Xen, wouldn't it? I described why that is not currently available in an earlier mail, and also my opinion that doing the whole thing in Xen would involve pulling in far too much SoC specific code for each specific platform. Ian. > ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [RFC V2] xen: interface: introduce pvclk interface
>>> On 21.01.16 at 13:06,wrote: > On Thu, Jan 21, 2016 at 03:21:38AM -0700, Jan Beulich wrote: > On 21.01.16 at 09:59, wrote: >>> uart2 needs clock IMX7D_UART2_ROOT_CLK from the ccm. >>> passthrough uart2, hypervisor handles the reg and interrupts, that is >>> because >>> hypervisor handles the memory map and the interrupt controller(GIC). But >>> here >>> CCM is not handled by hypervisor, it is handled by Dom0. >> >>This, I take it, describes the current state, which doesn't make clear >>whether this is intentionally that way (and can't be changed), or >>just happens to be that way because so far it didn't matter. >> >>> For ARMV8 server products, I am not sure whether hypercall is better; but to >>> my case, it's not feasible to use hypercall. >> >>Because of ...? > > I guess you mean DomU issues hypercall and Xen forwards the request to Dom0, > then Dom0 reply the response? Well, no, that wouldn't be a desirable (or even sane) model. >>> Dom0 handles all the clocks, DomU just send request to Dom0 and ask Dom0 to >>> enable/disable/set rate for a clock for the device. So I think it's okay >>> for multipile parties, the clk subsystem in Dom0 can handle mutiple >>> requests >>> even if the clock is shared. >> >>I.e. if one party sets one rate, and later another party sets >>a different rate, things are going to work fine? If so, why would >>the different parties need control over the rate in the first place? > > oh. thanks for teaching me. If two IPs share one clock, and IP1 for Dom0, > IP2 for DomU, > Dom0 needs clock work at 20Hz, but DomU want clock work at 40Hz. pvclk can > not avoid this. But you mustn't allow a DomU to affect Dom0, nor may two DomU-s interact in such a way with one another. > If not using pvclk, we develop a new method to handle clock. I guest we can > also not avoid this? At the very least it would need to be avoided by denying the request. Upon shared use, either all parties agree, or only one may use the clock. And passing through a (platform) device would therefore imply validating that the needed clock(s) are available to the target domain. Doing this in a consistent way with all control in one component's hands seems doable only if hypervisor and/or tool stack are the controlling (and arbitrating) entity. In the end this is one of the reasons why to me a simple PV I/O interface doesn't seem suitable here. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] Xen PV PTE ABI (or lack thereof)
>>> On 21.01.16 at 14:17,wrote: > On 21/01/16 12:59, Jan Beulich wrote: >>> > For the PTE bits, _PAGE_GNTTAB (bit 62) is used exclusively in debug > build (so there is a guest observable difference between running on a > debug and a non-debug Xen), and the comment beside it even identifies > that it breaks BSD guests. PTE bits 62:59 used by hardware if CR4.PKE > is set. Currently this means that we are not able to support Protection > Key for PV guests (although this restriction technically only applies to > debug builds of the hypervisor). > > The other PTE bit used by Xen is _PAGE_GUEST_KERNEL (bit 52). This bit > is used to notice when a 64bit PV guest attempts to override the fixup > Xen applies to its PTEs. Xen unilaterally sets _PAGE_GLOBAL for user > pages, and clears _PAGE_GLOBAL for supervisor mappings, setting > _PAGE_USER in both cases as the PV kernel runs in ring3. The only thing > _PAGE_GUEST_KERNEL is used for is to notice when the kernel deliberately > tries to create a _PAGE_GUEST_KERNEL|_PAGE_GLOBAL, at which point a > warning is logged and the kernel overridden. > > > Neither of the used PTE bits exist in the Xen public ABI. Neither of > them serve a purpose other than a debugging aid. > > I propose hiding them behind CONFIG_PV_PTE_DEBUG and declaring an ABI of > "all bits available for guest use". And a kernel using any of the conflicting bits would then become unusable on a hypervisor with that debug option enabled? I'd rather see us document the state things are in... >>> _PAGE_GNTMAP is already states: >>> >>> /* >>> * Debug option: Ensure that granted mappings are not implicitly unmapped. >>> * WARNING: This will need to be disabled to run OSes that use the spare PTE >>> * bits themselves (e.g., *BSD). >>> */ >> But that's (assuming the use of the two bits were spelled out) a >> guest OS not fully playing by the spec. To me, "available" PTE bits >> being shared implies that some of them may be claimed by Xen, >> while others may be claimed by guests. You're right that this needs >> to be written down, but I don't think we need to go as far as >> forbidding Xen to use any of them. And even less so should we >> preclude their use for any purpose going forward. > > I do not want to make an ABI which mandates the use of certain bits by Xen. Agreed, but that doesn't match what I said: We also shouldn't make an ABI precluding use of some of the bits by Xen. >> In the end, with (as it seems) not much effort this could even be >> made dynamic: A guest could advertise which of the bits it doesn't >> use, and then Xen could pick two of them for the two purposes it >> currently needs them for. Should a guest leave no or just one bit >> available, the debugging aid could then be disabled. > > This is unnecessarily complicated. It only helps going forward, and > leaves Xen with substantially more complicated PTE handling. Substantially? Instead of using a constant, we'd use a per-domain variable. Not much of a complication afaict. > What happens if at some point later, we try to boot a PV guest which has > a different PTE layout to what Xen has chosen? We definitely can't > update every PTE with a newly-chosen layout. Just to re-iterate: A per domain selection of bit mask(s). >>> I was intending to have CONFIG_PV_PTE_DEBUG as an EXPERT option, >>> disabled by default even in debug builds. >>> >>> There should not be an ABI difference between release and "normal" debug >>> builds. >> Well, I see your point, but as said above I'm not convinced >> disabling all that code is the right solution. In fact, what you >> propose is not far away from removing that code altogether. > > The two bits are only used for specialised debugging. They should be > relegated to people doing specific debugging, and not interfere with the > overwhelming majority of cases where Xen doesn't need to use any > software available PTE bits. Your repeated claim that _PAGE_GUEST_KERNEL is purely debugging only makes we wonder how you would mean to adjust adjust_guest_l1e() with that flag gone (most notably the last of its if()-s). Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [RFC v1 4/8] x86/init: add linker table support
El 20/01/16 a les 22.33, Luis R. Rodriguez ha escrit: > On Wed, Jan 20, 2016 at 1:00 PM, Konrad Rzeszutek Wilk >wrote: >>> +static bool x86_init_fn_supports_subarch(struct x86_init_fn *fn) >>> +{ >>> + if (!fn->supp_hardware_subarch) { >>> + pr_err("Init sequence fails to declares any supported >>> subarchs: %pF\n", fn->early_init); >>> + WARN_ON(1); >>> + } >>> + if (BIT(boot_params.hdr.hardware_subarch) & fn->supp_hardware_subarch) >>> + return true; >>> + return false; >>> +} >> >> So the logic for this working is that boot_params.hdr.hardware_subarch >> >> And the macros define two: BIT(X86_SUBARCH_PC) or BIT(X86_SUBARCH_XEN). >> >> But hardware_subarch by default is set to zero. Which means if GRUB2, >> PXELinux, Xen multiboot1 >> don't set it - then the X86_SUBARCH_PC is choosen right? >> >> 1 << 0 & 1 << X86_SUBARCH_PC (which is zero). >> >> For this to nicely work with Xen it ought to do this: >> >> diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c >> index 993b7a7..6cf9afd 100644 >> --- a/arch/x86/xen/enlighten.c >> +++ b/arch/x86/xen/enlighten.c >> @@ -1676,6 +1676,7 @@ asmlinkage __visible void __init xen_start_kernel(void) >> boot_params.hdr.ramdisk_image = initrd_start; >> boot_params.hdr.ramdisk_size = xen_start_info->mod_len; >> boot_params.hdr.cmd_line_ptr = __pa(xen_start_info->cmd_line); >> + boot_params.hdr.hardware_subarch = X86_SUBARCH_XEN; >> >> if (!xen_initial_domain()) { >> add_preferred_console("xenboot", 0, NULL); >> >> >> ? > > That's correct for PV and PVH, likewise when qemu is required for HVM > qemu could set it. I have the qemu change done but that should only > cover HVM. A common place to set this as well could be the hypervisor, > but currently the hypervisor doesn't set any boot_params, instead a > generic struct is passed and the kernel code (for any OS) is expected > to interpret this and then set the required values for the OS in the > init path. Long term though if we wanted to merge init further one way > could be to have the hypervisor just set the zero page cleanly for the > different modes. If we needed more data other than the > hardware_subarch we also have the hardware_subarch_data, that's a u64 > , and how that is used would be up to the subarch. In Xen's case it > could do what it wants with it. That would still mean perhaps defining > as part of a Xen boot protocol a place where xen specific code can > count on finding more Xen data passed by the hypervisor, the > xen_start_info. That is, if we wanted to merge init paths this is > something to consider. > > One thing I considered on the question of who should set the zero page > for Xen with the prospect of merging inits, or at least this subarch > for both short term and long term are the obvious implications in > terms of hypervisor / kernel / qemu combination requirements if the > subarch is needed. Having it set in the kernel is an obvious immediate > choice for PV / PVH but it means we can't merge init paths completely > (down to asm inits), we'd still be able to merge some C init paths > though, the first entry would still be different. Having the zero page > set on the hypervisor would go long ways but it would mean a > hypervisor change required. I don't think the hypervisor should be setting Linux specific boot related parameters, the boot ABI should be OS agnostic. IMHO, a small shim should be added to Linux in order to set what Linux requires when entering from a Xen entry point. Roger. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [xen-4.5-testing baseline-only test] 38676: trouble: blocked/broken/fail/pass
On Thu, 2016-01-21 at 07:49 +0100, Juergen Gross wrote: > On 21/01/16 07:31, Platform Team regression test user wrote: > > This run is configured for baseline tests only. > > > > flight 38676 xen-4.5-testing real [real] > > http://osstest.xs.citrite.net/~osstest/testlogs/logs/38676/ > > I've seen this more than once now: some test results seem not to be > reachable from outside Citrix: > > $ host osstest.xs.citrite.net > Host osstest.xs.citrite.net not found: 3(NXDOMAIN) > > Is this on purpose? More due to a lack of somewhere to post them publicly. These results from "citrix-osst...@xenproject.org", generally with "baseline-only" in the subject and a flight > 5000 are from the osstest instance running in Cambridge, which used to be the production instance prior to the colo. Reports from the production colo come from "osstest service owner ". The Cambridge baseline-only tests were originally setup so we could make use of the historical info e.g. regarding machine specific failures in the period after the switch (i.e. when there wasn't much historical data for the colo based machines) and to give some overlap in the results of the old and new instances. Since then I have left them running since we have versions of hardware (particularly older machines) which the COLO doesn't have, and also because it gives those of us here a set of baselines to use for any adhoc testing we might run here. Being "baseline-only" each branch is run exactly once on any new baseline, i.e. when the XenProject instance gets a push of that branch. Right now our instance is suffering from a host-install issue exposed by the Debian Jessie upgrade (arising because some of those machines have two disks, which caused us to notice that none of the machines in the COLO did, which is being fixed), so the results look bad right now. The fix is in transit through the colo push gate where it will be picked up by the Cambridge instance and the results should improve. Anyway, I kept the mails going to the list because they might be of interest, and because they are baseline-only they should be relatively low volume (compared with the production instance). I'm happy to push logs for specific flights to my user space on xenbits if someone wants to look at an issue, but for the most part I would expect them to be ignored. If they are considered too annoying I can arrange for the reports to go elsewhere. Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [RFC] VirtFS support on Xen
[RFC] VirtFS support on Xen # Introduction QEMU/KVM supports file system passthrough via an interface called VirtFS [0]. VirtFS is in turn implemented with 9pfs protocol [1] and VirtIO transport. Xen used to have its own implementation of file system passthrough called XenFS, but that has been inactive for a few years. The latest update was in 2009 [2]. This project aims to add VirtFS support on Xen. This is more sustainable than inventing our own wheel. # QEMU Some preliminary work has been done. The original code made a lot of assumptions that VirtFS was tied to VirtIO, which wasn't true. Patches to disentangle VirtFS generic code and VirtIO transport have been upstreamed. What we now need to do is to implement a Xen transport inside QEMU. # Linux kernel Linux kernel has a better shape than QEMU. The generic code and transport layers are separated cleanly. We need to add a new transport for Xen. # Xen toolstack New interfaces will be added to exposed relevant configuration options to users. In essence it is just plumbing the right options to QEMU. # Security Xen transport will utilise grant table to limit access from backend to frontend. Users can use several security modes provided by QEMU. By and large the security of VirtFS on Xen depends on the quality of QEMU VirtFS code (this is in line with our expectation of other PV backends in QEMU) and the particular setup of VirtFS (whether a chroot fs-proxy is used, which security model is picked etc). # Device type and wire format I propose to use "virtfs" as device type. A header file named virtfs.h shall be committed to xen.git as reference for future implementation. The design of wire format is limited by two factors: 1) 9pfs protocol is based on transaction -- client sends out request and expects response, 2) the arrangement of Linux kernel 9pfs places request buffer and response buffer on the wire at the same time. Linux seems to be the only one among several popular UNIX / UNIX-like systems that has in-kernel 9p protocol implementation (correct me if I'm wrong), so I couldn't fathom what other OSes such as FreeBSD and NetBSD would need from this wire format. But it would be safe to bet that they would like to reuse as much code as possible and follow the same path as Linux does. So the conclusion so far is that it would make sense for us to follow the Linux approach (putting two buffers on wire at the same time) to minimise our work. Other opinions are welcome. As for the wire format itself, I've attached a draft from my code. It's far from complete, but it contains the wire format I envisage. # Optimisation The Xen trasnport can be easily extended to use multiple-page ring should the need arise. A more sophisticated optimisation is to support "indirect-descriptor" like interface to boost bandwidth. Wei. [0] http://www.linux-kvm.org/page/VirtFS [1] https://en.wikipedia.org/wiki/9P_(protocol) [2] https://blog.xenproject.org/2009/03/26/status-of-xenfs/ [3] http://wiki.qemu.org/Documentation/9psetup [4] http://man.cat-v.org/plan_9/5/intro DRAFT HEADER /** * virtfs.h * * Copyright (c) 2016, Wei Liu*/ #ifndef __XEN_PUBLIC_IO_VIRTFS_H__ #define __XEN_PUBLIC_IO_VIRTFS_H__ #include #include struct xen_virtfsif_request { grant_ref_t gref; /* Reference to buffer page */ uint16_t offset; /* Offset within buffer page */ uint16_t flags;/* XEN_VIRTFSF_* */ uint16_t id; /* Echoed in response message. */ }; struct xen_virtfsif_response { uint16_t id; int16_t status;/* Indicate whether backend successfully copy / map */ }; /* TODO: RFC style diagrams go in here */ /* TODO: document xenstore path */ /* * /local/domain/$guest/device/virtfs/$x/tag * /local/domain/$guest/device/virtfs/$x/ring-ref * /local/domain/$guest/device/virtfs/$x/event-channel */ /* * This is wire format for virtfs requests: * Request 1: xen_virtfsif_request -- XEN_VIRTFSF_* (any flag) * Request 2: xen_virtfsif_request -- if request 1 has XEN_VIRTFSF_MORE * Request 3: xen_virtfsif_request -- if request 2 has XEN_VIRTFSF_MORE * ... * Request N: xen_virtfsif_request -- if request (N-1) has XEN_VIRTFSF_MORE, * while itself doesn't contain * XEN_VIRTFSF_MORE * * * | slot 0 | slot 1 | ... | slot m | ... | slot n | * | 9p request buffer| 9p response buffer | * * Grefs for 9p requests (if any) go in first half of wire format, * grefs for 9p responses (if any) go in second half. */ /* If this bit is set, the next slot if part of the request. */ #define _XEN_VIRTFSF_MORE (0) #define XEN_VIRTFSF_MORE (1U << _XEN_VIRTFSF_MORE) /* * If this bit is set, this gref is for backend to write data (9p * response buffer); otherwise it is for backend to read (9p
Re: [Xen-devel] [PATCH v3 5/5] libxl: add options to enable/disable emulated devices
On Thu, 2016-01-21 at 11:01 +0100, Roger Pau Monné wrote: > El 21/01/16 a les 10.39, Ian Campbell ha escrit: > > > If we don't have that guarantee I think this is already a bug, and we > > > should call _setdefault before sending the domain info to the other end. > > > In fact I have a patch that does exactly that, but I'm unsure if it's > > > needed because I don't know the policy regarding default values in libxl. > > > > Wei, isn't this (turning the defaults into concrete values) supposed to be > > taken care of by the JSON mangling which you added? > > Heh, I think you mean the JSON mangling added by Wei. Correct. > In order to > propagate the values filled by default in libxl_domain_config I had to > add the following patch, which basically calls the _setdefault > functions before converting the domain_config into JSON. I'm planning > to make it part of this series in the next iteration: I'll let Wei comment on why this isn't already done. > > > With the current code, libxl basically limits the set of allowed masks > > > to what it knows. After the change libxl just becomes a proxy for > > > transmitting what the user has selected to Xen, and Xen either accepts > > > or refuses it, basically making Xen the only arbiter that decides which > > > emulated devices get enabled or not. This means that if we want to make > > > more emulated devices available to the guest in the future no libxl > > > changes will be required. > > > > We would need to add a new defbool for the new feature. > > Yes, but I was thinking more in the direction of enabling them, rather > than adding new ones. Which would then require changing the defbool_set_default in libxl after this change, so you do still need to change libxl. > > > It also means that HVMlite guests created with current Xen will be > > > capable of migrating to newer version of Xen, that might have a > > > different default policy. For example in the future we might want to > > > enable the lapic by default, so if a guest is created with the current > > > Xen version it doesn't get a lapic at all, and then when migrated to > > > newer versions a lapic would magically appear after the migration, which > > > is not desired. > > > > ... and the reason these details can't be propagated via the Xen blob is > > that this emul stuff needs to be set exactly once at domain create time I > > suppose? Changing it to be later binding is considered to be too hard/too > > big a yak? > > Right, emulated devices are initialised as part of the > XEN_DOMCTL_createdomain hypercall. Allowing them to be added later on > and introducing a kind of intermediate domain building phase where only > a certain set of hypercalls are valid is a possibility that Andrew > already pointed out, however this seems like a very big project. This seems like the right approach to me, but I appreciate you not wanting to tackle this here and now. Would it be possible to set the set of "potential" emulated devices at create time and only "commit" to it after the save state is loaded? That would essentially mean init-all, load state, de-init those which didn't get any state loaded? Not as nice as doing it with the split of hypercall availability, but might be more achievable, since you already have the de- init code for domain teardown time. In any case, whatever is chosen as the solution the commit message needs to go into a fair amount of detail as to why we picked that way of doing things, particularly if it is a compromise vs doing it properly. This is important so we can answer "why did we do it this way" in 2 years time. > > Even with the set of devices set at domain creation time Xen needs to take > > care when reading its blob, and not fall apart (from a security PoV, it's > > allowed to fail the state load) when presented with a save record relating > > to something which is supposedly disabled. Has this been checked? > > Yes, trying to load a state into a disable device will result in > -ENODEV. Grand. Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [RFC V2] xen: interface: introduce pvclk interface
Hi Jan, On Thu, Jan 21, 2016 at 12:53:01AM -0700, Jan Beulich wrote: On 21.01.16 at 02:29,wrote: >> The platform device passthrough part for arm is to mapping the machine io >> address >> to the guest physical io address. Then guest can map the phsical io address >> to its >> own virtual address, then by accessing virtual address, guest can access >> machine io address space. >> So the platform device passthrough does not needs frontend/backend driver to >> support, except smmu is handled by xen. >> >> But the platform device needs clk to drive the hardware IP, also may needs >> pinmux settings. >> the driver in guest needs to drive the hardware IP passed through to the >> guest, so it needs to operate on the clk. >> >> Just pasted comments from George for V1: >> >> " >> Just speaking from the perspective of a Xen dev who's not an ARM dev: >> a few more words on the relationship between pvclk and >> device-passthrough would be helpful to set the context. It sounds >> like: >> >> * On ARM, passing through a device requires a clocksource (at least >> for many devices) >> >> * dom0 has the hardware clocksource, but at the moment domUs don't >> have a suitable clocksource >> >> * This patch implements pvclk front/backend suitable for such devices >> >> Is that right? In which case something like the following would be helpful: >> >> "This patch introduces pvclk, a paravirtualized clock source suitable >> for devices to use when passing through to domUs on ARM systems." >> " > >That's a possible perspective to take, but not the only one. In >fact, coming to what I said previously, I wonder whether placing >the "backend" in Dom0 is the right thing in the first place - >fundamentally arbitration of hardware use should be done >(or at least checked/enforced) by the hypervisor. I.e. just like >while Dom0 may assign a PCI device to a guest, the hypervisor >is in charge of actually making all the resources needed for this >to work accessible to the guest, and/or verifying that permissions >are in place (like e.g. when setting up interrupts). Yet the model >proposed here completely bypasses the hypervisor afaict. To platform device of ARM, hypervisor is responsible for the mapping between machine address and guest physical address, also responsible for the irq mapping. But to embedded ARM SoC, the hardware clk IP is handled in Dom0. On my i.MX platform, the hardware clk IP named Clock Controller Module will output clks to drive different IPs, such as uart,gpu,lcd,sd controller,scsi controller,pcie controller. The hardware clock IP is not same for all ARM SoC vendors. ARM has spec, such as GIC and SMMU to ask SoC vendors follow the spec, then it's easy to let hypervisor handle them. But there is no common spec for the clock IP. > >Are there connections between a platform device and its clock(s), >e.g. in DT? If so, wouldn't it be possible for granting access to a >platform device to imply granting control of the respective clock? clock hardware IP is also a device. The following is partial dts for i.MX7Dual. clks: ccm@3038 { compatible = "fsl,imx7d-ccm"; reg = <0x3038 0x1>; interrupts = , ; #clock-cells = <1>; clocks = <>, <>; clock-names = "ckil", "osc"; }; uart2: serial@3089 { compatible = "fsl,imx7d-uart", "fsl,imx6q-uart"; reg = <0x3089 0x1>; interrupts = ; clocks = < IMX7D_UART2_ROOT_CLK>, < IMX7D_UART2_ROOT_CLK>; clock-names = "ipg", "per"; status = "disabled"; }; uart2 needs clock IMX7D_UART2_ROOT_CLK from the ccm. passthrough uart2, hypervisor handles the reg and interrupts, that is because hypervisor handles the memory map and the interrupt controller(GIC). But here CCM is not handled by hypervisor, it is handled by Dom0. >In which case clock control might perhaps better become a >hypercall based mechanism? And further - are all clocks in use for >at most one platform device (i.e. is there no sharing possible)? If >not, how do you envision to make multiple parties agree on the >clock settings and state? For ARMV8 server products, I am not sure whether hypercall is better; but to my case, it's not feasible to use hypercall. Dom0 handles all the clocks, DomU just send request to Dom0 and ask Dom0 to enable/disable/set rate for a clock for the device. So I think it's okay for multipile parties, the clk subsystem in Dom0 can handle
Re: [Xen-devel] [PATCH v3 5/5] libxl: add options to enable/disable emulated devices
On Wed, 2016-01-20 at 19:33 +0100, Roger Pau Monné wrote: > El 20/01/16 a les 14.01, Ian Campbell ha escrit: > > On Wed, 2016-01-20 at 12:57 +0100, Roger Pau Monne wrote: > > > Allow enabling or disabling emulated devices from the libxl domain > > > configuration file. For HVM guests with a device model all the > > > emulated > > > devices are enabled. For HVM guests without a device model no devices > > > are > > > enabled by default, although they can be enabled using the options > > > provided. > > > The arbiter of whether a combination is posible or not is always Xen, > > > libxl > > > doesn't do any kind of check. > > > > > > This set of options is also propagated inside of the libxl migration > > > record > > > as part of the contents of the libxl_domain_build_info struct. > > > > ... and this is the real motivation for this change, not actually > > allowing > > users to control all this AIUI. > > > > Did you check that the fields updated using libxl_defbool_setdefault > > are > > actually updated in the JSON copy and therefore propagated to the other > > side of a migration as specific values and not as "pick a default"? I > > think > > we don't want these changing on migration. I think/hope all this was > > automatically handled by the work Wei did in the last release cycle. > > No, values populated by the {build/create}_info_setdefault functions are > not propagated, OTOH values manually set by the user in the config file > are indeed propagated. Do we have any guarantee that _setdefault is > always going to behave in the same way? No, part of the purpose of defbool and the other "do the default" values is that we can evolve things over time. > If we don't have that guarantee I think this is already a bug, and we > should call _setdefault before sending the domain info to the other end. > In fact I have a patch that does exactly that, but I'm unsure if it's > needed because I don't know the policy regarding default values in libxl. Wei, isn't this (turning the defaults into concrete values) supposed to be taken care of by the JSON mangling which you added? > > > > Signed-off-by: Roger Pau Monné> > > --- > > > Cc: Ian Jackson > > > Cc: Ian Campbell > > > Cc: Wei Liu > > > --- > > > docs/man/xl.cfg.pod.5 | 39 > > > +++ > > > tools/libxl/libxl.h | 11 +++ > > > tools/libxl/libxl_create.c | 21 - > > > tools/libxl/libxl_types.idl | 6 ++ > > > tools/libxl/libxl_x86.c | 33 - > > > tools/libxl/xl_cmdimpl.c| 7 +++ > > > 6 files changed, 111 insertions(+), 6 deletions(-) > > > > > > diff --git a/docs/man/xl.cfg.pod.5 b/docs/man/xl.cfg.pod.5 > > > index 8899f75..46d4529 100644 > > > --- a/docs/man/xl.cfg.pod.5 > > > +++ b/docs/man/xl.cfg.pod.5 > > > @@ -1762,6 +1762,45 @@ See F > > > for > > > more information. > > > > > > =back > > > > > > +=head3 HVM without a device model options > > > + > > > +This options can be used to change the set of emulated devices > > > provided > > > > "These..." > > > > > +to guests without a device model. Note that Xen might not support > > > all > > > +possible combinations. By default HVM guests without a device model > > > +don't have any of them enabled. > > > > ... and for those with a device model? The title and text suggest these > > options are invalid/ignored in that case, but the code does actually > > honour > > what the user specified in this case. > > Right, I've clarified this by adding the following paragraph: > > "It is important to notice that these options (except the hpet one) are > not available to HVM guests with a device model, and trying to set them > will cause xl to exit with an error." > > I've also fixed up the code in libxl__domain_build_info_setdefault to > actually error out if a HVM guest with device model tries to set any of > them. > > > > diff --git a/tools/libxl/libxl_types.idl > > > b/tools/libxl/libxl_types.idl > > > index 92c95e5..8a21cda 100644 > > > --- a/tools/libxl/libxl_types.idl > > > +++ b/tools/libxl/libxl_types.idl > > > @@ -519,6 +519,12 @@ libxl_domain_build_info = > > > Struct("domain_build_info",[ > > > ("serial_list", libxl_st > > > ring_list), > > > ("rdm", libxl_rdm_reserve), > > > ("rdm_mem_boundary_memkb", > > > MemKB), > > > + ("lapic",libxl_de > > > fbool), > > > + ("ioapic", libxl_de > > > fbool), > > > + ("rtc", libxl_de > > > fbool), > > > + ("power_management", > > > libxl_defbool), > > > + ("pic", libxl_de
Re: [Xen-devel] [RFC V2] xen: interface: introduce pvclk interface
>>> On 21.01.16 at 09:59,wrote: > uart2 needs clock IMX7D_UART2_ROOT_CLK from the ccm. > passthrough uart2, hypervisor handles the reg and interrupts, that is > because > hypervisor handles the memory map and the interrupt controller(GIC). But > here > CCM is not handled by hypervisor, it is handled by Dom0. This, I take it, describes the current state, which doesn't make clear whether this is intentionally that way (and can't be changed), or just happens to be that way because so far it didn't matter. > For ARMV8 server products, I am not sure whether hypercall is better; but to > my case, it's not feasible to use hypercall. Because of ...? > Dom0 handles all the clocks, DomU just send request to Dom0 and ask Dom0 to > enable/disable/set rate for a clock for the device. So I think it's okay > for multipile parties, the clk subsystem in Dom0 can handle mutiple requests > even if the clock is shared. I.e. if one party sets one rate, and later another party sets a different rate, things are going to work fine? If so, why would the different parties need control over the rate in the first place? Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 4/4] hvmloader: add support to load extra ACPI tables from qemu
>>> On 20.01.16 at 18:17,wrote: > > On 01/21/2016 01:07 AM, Jan Beulich wrote: > On 20.01.16 at 16:29, wrote: >>> On 01/20/2016 07:20 PM, Jan Beulich wrote: To answer this I need to have my understanding of the partitioning being done by firmware confirmed: If that's the case, then "normal" means the part that doesn't get exposed as a block device (SSD). In any event there's no correlation to guest exposure here. >>> >>> Firmware does not manage NVDIMM. All the operations of nvdimm are handled >>> by OS. >>> >>> Actually, there are lots of things we should take into account if we move >>> the NVDIMM management to hypervisor: >>> a) ACPI NFIT interpretation >>> A new ACPI table introduced in ACPI 6.0 is named NFIT which exports the >>> base information of NVDIMM devices which includes PMEM info, PBLK >>> info, nvdimm device interleave, vendor info, etc. Let me explain it one >>> by one. >>> >>> PMEM and PBLK are two modes to access NVDIMM devices: >>> 1) PMEM can be treated as NV-RAM which is directly mapped to CPU's >>> address >>> space so that CPU can r/w it directly. >>> 2) as NVDIMM has huge capability and CPU's address space is limited, >>> NVDIMM >>> only offers two windows which are mapped to CPU's address space, >>> the data >>> window and access window, so that CPU can use these two windows to >>> access >>> the whole NVDIMM device. >> >> You fail to mention PBLK. The question above really was about what > > The 2) is PBLK. > >> entity controls which of the two modes get used (and perhaps for >> which parts of the overall NVDIMM). > > So i think the "normal" you mentioned is about PMEM. :) Yes. But then - other than you said above - it still looks to me as if the split between PMEM and PBLK is arranged for by firmware? Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v10 6/7] vmx: VT-d posted-interrupt core logic handling
> > +spin_lock_irqsave(_cpu(pi_blocked_vcpu_lock, pi_block_cpu), > flags); > > + > > +/* > > + * v->arch.hvm_vmx.pi_block_cpu == NR_CPUS here means the vCPU > was > > + * removed from the blocking list while we are acquiring the lock. > > + */ > > +if ( v->arch.hvm_vmx.pi_block_cpu == NR_CPUS ) > > With you wanting to deal with changes behind your back here, > isn't there come kind of barrier needed between reading and using > pi_block_cpu, such that the compiler won't convert the local > variable accesses into multiple reads of > v->arch.hvm_vmx.pi_block_cpu (which iiuc it is allowed to do)? > Thinking about this more. Seems we call spin_lock_irqsave() before using ' v->arch.hvm_vmx.pi_block_cpu ', can spin_lock_irqsave() used as the serializing instruction here? IIRIC, spin lock operations can be used for this purpose, right? Thanks, Feng ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 4/4] hvmloader: add support to load extra ACPI tables from qemu
On 21/01/16 09:10, Xiao Guangrong wrote: > > > On 01/21/2016 04:53 PM, Jan Beulich wrote: > On 21.01.16 at 09:25,wrote: >>> On 01/21/2016 04:18 PM, Jan Beulich wrote: Yes. But then - other than you said above - it still looks to me as if the split between PMEM and PBLK is arranged for by firmware? >>> >>> Yes. But OS/Hypervisor is not excepted to dynamically change its >>> configure >>> (re-split), >>> i,e, for PoV of OS/Hypervisor, it is static. >> >> Exactly, that has been my understanding. And hence the PMEM part >> could be under the hypervisor's control, while the PBLK part could be >> Dom0's responsibility. >> > > I am not sure if i have understood your point. What your suggestion is > that > leave PMEM for hypervisor and all other parts (PBLK and _DSM handling) to > Dom0? If yes, we should: > a) handle hotplug in hypervisor (new PMEM add/remove) that causes > hyperivsor >interpret ACPI SSDT/DSDT. > b) some _DSMs control PMEM so you should filter out these kind of > _DSMs and >handle them in hypervisor. > c) hypervisor should mange PMEM resource pool and partition it to > multiple >VMs. It is not possible for Xen to handle ACPI such as this. There can only be one OSPM on a system, and 9/10ths of the functionality needing it already lives in Dom0. The only rational course of action is for Xen to treat both PBLK and PMEM as "devices" and leave them in Dom0's hands. ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v3 5/5] libxl: add options to enable/disable emulated devices
El 21/01/16 a les 10.39, Ian Campbell ha escrit: > On Wed, 2016-01-20 at 19:33 +0100, Roger Pau Monné wrote: >> El 20/01/16 a les 14.01, Ian Campbell ha escrit: >>> On Wed, 2016-01-20 at 12:57 +0100, Roger Pau Monne wrote: Allow enabling or disabling emulated devices from the libxl domain configuration file. For HVM guests with a device model all the emulated devices are enabled. For HVM guests without a device model no devices are enabled by default, although they can be enabled using the options provided. The arbiter of whether a combination is posible or not is always Xen, libxl doesn't do any kind of check. This set of options is also propagated inside of the libxl migration record as part of the contents of the libxl_domain_build_info struct. >>> >>> ... and this is the real motivation for this change, not actually >>> allowing >>> users to control all this AIUI. >>> >>> Did you check that the fields updated using libxl_defbool_setdefault >>> are >>> actually updated in the JSON copy and therefore propagated to the other >>> side of a migration as specific values and not as "pick a default"? I >>> think >>> we don't want these changing on migration. I think/hope all this was >>> automatically handled by the work Wei did in the last release cycle. >> >> No, values populated by the {build/create}_info_setdefault functions are >> not propagated, OTOH values manually set by the user in the config file >> are indeed propagated. Do we have any guarantee that _setdefault is >> always going to behave in the same way? > > No, part of the purpose of defbool and the other "do the default" values is > that we can evolve things over time. > >> If we don't have that guarantee I think this is already a bug, and we >> should call _setdefault before sending the domain info to the other end. >> In fact I have a patch that does exactly that, but I'm unsure if it's >> needed because I don't know the policy regarding default values in libxl. > > Wei, isn't this (turning the defaults into concrete values) supposed to be > taken care of by the JSON mangling which you added? Heh, I think you mean the JSON mangling added by Wei. In order to propagate the values filled by default in libxl_domain_config I had to add the following patch, which basically calls the _setdefault functions before converting the domain_config into JSON. I'm planning to make it part of this series in the next iteration: --- commit b1b2cea4b61ce9bd05797d3dc5ff0f5fffccfd05 Author: Roger Pau MonneDate: Wed Jan 20 19:06:50 2016 +0100 libxl: introduce libxl_domain_info_setdefault to the public API The newly introduced function populates the libxl_domain_config with their default values, just like it's done during domain creation. This is needed so the libxl_domain_config sent to the restore side during migration is accurate, since default values might change between libxl versions. Signed-off-by: Roger Pau Monné --- --- NB: I'm unsure whether this is a bug or not. IMHO I think it is, because there's no guarantee that the default values will stay the same between libxl versions, so a domain created with an old libxl version might see differences when migrated to a newer libxl version. diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h index 157f07c..70bb6e1 100644 --- a/tools/libxl/libxl.h +++ b/tools/libxl/libxl.h @@ -886,6 +886,15 @@ void libxl_mac_copy(libxl_ctx *ctx, libxl_mac *dst, libxl_mac *src); */ #define LIBXL_HAVE_VGA_INTERFACE_TYPE_UNKNOWN 1 +/* + * LIBXL_HAVE_DOMAIN_INFO_SETDEFAULT + * + * In the case that LIBXL_HAVE_DOMAIN_INFO_SETDEFAULT is set libxl + * provides the libxl_domain_info_setdefault function that can be used + * to set the libxl_domain_config fields to their default values. + */ +#define LIBXL_HAVE_DOMAIN_INFO_SETDEFAULT 1 + typedef char **libxl_string_list; void libxl_string_list_dispose(libxl_string_list *sl); int libxl_string_list_length(const libxl_string_list *sl); @@ -1202,6 +1211,9 @@ int libxl_domain_soft_reset(libxl_ctx *ctx, void libxl_domain_config_init(libxl_domain_config *d_config); void libxl_domain_config_dispose(libxl_domain_config *d_config); +/* Fill the libxl_domain_config struct with their default values. */ +int libxl_domain_info_setdefault(libxl_ctx *ctx, libxl_domain_config *d_config); + /* * Retrieve domain configuration and filled it in d_config. The * returned configuration can be used to rebuild a domain. It only diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c index c7700a7..c988c2e 100644 --- a/tools/libxl/libxl_create.c +++ b/tools/libxl/libxl_create.c @@ -886,17 +886,10 @@ static void initiate_domain_create(libxl__egc *egc, goto error_out; } -ret = libxl__domain_create_info_setdefault(gc, _config->c_info); -if (ret) { -LOG(ERROR,
Re: [Xen-devel] [RFC V2] xen: interface: introduce pvclk interface
On Thu, 2016-01-21 at 16:59 +0800, Peng Fan wrote: > > To platform device of ARM, hypervisor is responsible for the mapping > between machine address and guest physical address, also responsible > for the irq mapping. > > But to embedded ARM SoC, the hardware clk IP is handled in Dom0. Arguably Xen ought to be the one to do this, but we have determined (rightly, I think) that doing so for the entirely clk tree of an SoC would involve importing an unmanageable amount of code into Xen[0]. In the meantime we defer this to Dom0. I wonder though if it would be possible to manage the clocks for a passthrough device from the toolstack, i.e. is there a sysfs node where one can say "keep the clock for this device enabled (at xMhz) even though you think the device is unused"? If so (or if it can be easily added) then the toolstack would just need to manage that value as part of the passthrough process rather than having the frontend do it via a PV protocol. Ian. [0] I'd like at some point for Xen to gain sufficient knowledge of clock IP to minimally control things like the CPU and DRAM clocks etc, but that needn't imply full clock tree support and would hopefully be a manageable amount of code, which would (hopefully) mostly be about trapping and emulating writes to one or two clock controllers per platform and arbitrating a small set of bits (while allowing dom0 to control the others). This is at least a medium if not long term idea though, and for all I know it might turn out to be unworkable in practice. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] netfront/netback multiqueue exhausting grants
On Thu, 2016-01-21 at 10:25 +, Wei Liu wrote: > On Thu, Jan 21, 2016 at 10:12:27AM +, Ian Campbell wrote: > [...] > > > I've asked the reporter to send logs for the 4.4 case to xen-devel. > > > > User confirmed[0] that 4.4 is actually OK. > > > > Did someone request stable backports yet, or shall I do so? > > > > I vaguely remember we requested backport for relevant patches long time > ago, but I admit I have lost track. So it wouldn't hurt if you do it > again. So I think we'd be looking for: 32a8440 xen-netfront: respect user provided max_queues 4c82ac3 xen-netback: respect user provided max_queues ca88ea1 xen-netfront: update num_queues to real created which certainly resolves things such that the workarounds work, and I think will also fix the default case such that it works with up to 32 vcpus (although it will consume all the grants and only get 31/32 queues). Does that sound correct? As Annie said, we may still want to consider what a sensible default max queues would be. Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 4/4] hvmloader: add support to load extra ACPI tables from qemu
On 01/21/2016 04:53 PM, Jan Beulich wrote: On 21.01.16 at 09:25,wrote: On 01/21/2016 04:18 PM, Jan Beulich wrote: Yes. But then - other than you said above - it still looks to me as if the split between PMEM and PBLK is arranged for by firmware? Yes. But OS/Hypervisor is not excepted to dynamically change its configure (re-split), i,e, for PoV of OS/Hypervisor, it is static. Exactly, that has been my understanding. And hence the PMEM part could be under the hypervisor's control, while the PBLK part could be Dom0's responsibility. I am not sure if i have understood your point. What your suggestion is that leave PMEM for hypervisor and all other parts (PBLK and _DSM handling) to Dom0? If yes, we should: a) handle hotplug in hypervisor (new PMEM add/remove) that causes hyperivsor interpret ACPI SSDT/DSDT. b) some _DSMs control PMEM so you should filter out these kind of _DSMs and handle them in hypervisor. c) hypervisor should mange PMEM resource pool and partition it to multiple VMs. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH LIBVIRT v2] libxl: Support cmdline= in xl config files
On Wed, 2016-01-20 at 11:16 -0700, Jim Fehlig wrote: > Ian Campbell wrote: > > ... and consolidate the cmdline/extra/root parsing to facilitate doing > > so. > > > > The logic is the same as xl's parse_cmdline from the current xen.git > > master > > branch (e6f0e099d2c17de47fd86e817b1998db903cab61). > > > > In order to introduce a use of VIR_WARN for logging I had to add > > virerror.h and VIR_LOG_INIT. > > > > Signed-off-by: Ian Campbell> > --- > > NB: I was unsure if I was supposed to change VIR_FROM_NONE into > > VIR_FROM_XEN, so I didn't (but will on advice). > > It seems some of the VIR_FROM_ needs cleanup throughout the various > Xen-related > files. We already have VIR_FROM_SEXPR and VIR_FROM_XENXM. > src/xenconfig/xen_sxpr.c should use the former, src/xenconfig/xen_xm.c the > latter. And given the pattern, I think we should introduce VIR_FROM_XENXL to > cover xl.cfg related code. I can take care of this. I've acked you patches about this. > > +static int xenParseCmdline(virConfPtr conf, char **r_cmdline) > > +{ > > +char *cmdline; > > One too many initializers removed :-). > > > +const char *root, *extra, *buf; > > + > > +if (xenConfigGetString(conf, "cmdline", , NULL) < 0) > > +return -1; > > + > > +if (xenConfigGetString(conf, "root", , NULL) < 0) > > +return -1; > > + > > +if (xenConfigGetString(conf, "extra", , NULL) < 0) > > +return -1; > > + > > +if (buf) { > > +if (VIR_STRDUP(cmdline, buf) < 0) > > +return -1; > > +if (root || extra) > > +VIR_WARN("ignoring root= and extra= in favour of > > cmdline="); > > +} else { > > +if (root && extra) { > > +if (virAsprintf(, "root=%s %s", root, extra) < 0) > > +return -1; > > +} else if (root) { > > +if (virAsprintf(, "root=%s", root) < 0) > > +return -1; > > +} else if (extra) { > > +if (VIR_STRDUP(cmdline, extra) < 0) > > +return -1; > > +} > > +} > > + > > +*r_cmdline = cmdline; > > If none of cmdline, extra, or root are set in the config, def->os.cmdline gets > set to garbage. xlconfigtest explodes when running 'make check'. I even thought about this quite carefully but missed this case :-/ Would you prefer and = NULL on the decl or an else cmdline = NULL at the end of that chain? > Sorry for not mentioning it in my previous review, but we should add a > test for cmdline, root, and extra. Ack, will do so for v3. Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] VPMU backports for 4.6
On Thu, 2016-01-21 at 00:35 -0700, Jan Beulich wrote: > > > > On 20.01.16 at 18:36,wrote: > > (As a side --- XSA-163 says that VPMU is "unsupported security-wise". > > Do > > we make any distinction between a feature being generally or > > security-wise unsupported?) > > Not sure; considering stable tree maintenance one might imply > general support to be a superset of security support. But I can > easily see other views on that being equally possible/legitimate. I think I would take the view that if it is Unsupported then it is Unsupported until we explicitly say otherwise. That doesn't mean we can't backport fixes for issues which people (presumably those who are working on taking the feature from Unsupported to Supported in unstable) identify and would like fixed, assuming they don't impact other Supported features. I think it is OK for folks working on a currently unsupported feature in unstable to want to fix known issues even in stable releases, particularly (although not exclusively) those which would block people from doing further testing. Fixing those issues expands the amount of feedback which can be gathered from users who might be willing to test the feature, but not use a development version, which helps the developers find the next bug after the one which was fixed, which will help to move that feature forwards in the development branch too. It's also possible that a feature might improve sufficiently that we would consider it supported from a particular point release. We've done so n the past, I think, although not always successfully, I'm thinking xsave in and around the 4.0.x releases. Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 4/4] hvmloader: add support to load extra ACPI tables from qemu
>>> On 21.01.16 at 10:10,wrote: > On 01/21/2016 04:53 PM, Jan Beulich wrote: > On 21.01.16 at 09:25, wrote: >>> On 01/21/2016 04:18 PM, Jan Beulich wrote: Yes. But then - other than you said above - it still looks to me as if the split between PMEM and PBLK is arranged for by firmware? >>> >>> Yes. But OS/Hypervisor is not excepted to dynamically change its configure >>> (re-split), >>> i,e, for PoV of OS/Hypervisor, it is static. >> >> Exactly, that has been my understanding. And hence the PMEM part >> could be under the hypervisor's control, while the PBLK part could be >> Dom0's responsibility. >> > > I am not sure if i have understood your point. What your suggestion is that > leave PMEM for hypervisor and all other parts (PBLK and _DSM handling) to > Dom0? If yes, we should: > a) handle hotplug in hypervisor (new PMEM add/remove) that causes hyperivsor > interpret ACPI SSDT/DSDT. Why would this be different from ordinary memory hotplug, where Dom0 deals with the ACPI CA interaction, notifying Xen about the added memory? > b) some _DSMs control PMEM so you should filter out these kind of _DSMs and > handle them in hypervisor. Not if (see above) following the model we currently have in place. > c) hypervisor should mange PMEM resource pool and partition it to multiple > VMs. Yes. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] netfront/netback multiqueue exhausting grants
On Thu, Jan 21, 2016 at 10:12:27AM +, Ian Campbell wrote: [...] > > I've asked the reporter to send logs for the 4.4 case to xen-devel. > > User confirmed[0] that 4.4 is actually OK. > > Did someone request stable backports yet, or shall I do so? > I vaguely remember we requested backport for relevant patches long time ago, but I admit I have lost track. So it wouldn't hurt if you do it again. Wei. > Ian. > > [0] http://lists.xen.org/archives/html/xen-users/2016-01/msg00110.html ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 4/4] hvmloader: add support to load extra ACPI tables from qemu
On 01/21/2016 04:18 PM, Jan Beulich wrote: On 20.01.16 at 18:17,wrote: On 01/21/2016 01:07 AM, Jan Beulich wrote: On 20.01.16 at 16:29, wrote: On 01/20/2016 07:20 PM, Jan Beulich wrote: To answer this I need to have my understanding of the partitioning being done by firmware confirmed: If that's the case, then "normal" means the part that doesn't get exposed as a block device (SSD). In any event there's no correlation to guest exposure here. Firmware does not manage NVDIMM. All the operations of nvdimm are handled by OS. Actually, there are lots of things we should take into account if we move the NVDIMM management to hypervisor: a) ACPI NFIT interpretation A new ACPI table introduced in ACPI 6.0 is named NFIT which exports the base information of NVDIMM devices which includes PMEM info, PBLK info, nvdimm device interleave, vendor info, etc. Let me explain it one by one. PMEM and PBLK are two modes to access NVDIMM devices: 1) PMEM can be treated as NV-RAM which is directly mapped to CPU's address space so that CPU can r/w it directly. 2) as NVDIMM has huge capability and CPU's address space is limited, NVDIMM only offers two windows which are mapped to CPU's address space, the data window and access window, so that CPU can use these two windows to access the whole NVDIMM device. You fail to mention PBLK. The question above really was about what The 2) is PBLK. entity controls which of the two modes get used (and perhaps for which parts of the overall NVDIMM). So i think the "normal" you mentioned is about PMEM. :) Yes. But then - other than you said above - it still looks to me as if the split between PMEM and PBLK is arranged for by firmware? Yes. But OS/Hypervisor is not excepted to dynamically change its configure (re-split), i,e, for PoV of OS/Hypervisor, it is static. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v10 6/7] vmx: VT-d posted-interrupt core logic handling
>>> On 21.01.16 at 10:05,wrote: >> > +spin_lock_irqsave(_cpu(pi_blocked_vcpu_lock, pi_block_cpu), >> flags); >> > + >> > +/* >> > + * v->arch.hvm_vmx.pi_block_cpu == NR_CPUS here means the vCPU >> was >> > + * removed from the blocking list while we are acquiring the lock. >> > + */ >> > +if ( v->arch.hvm_vmx.pi_block_cpu == NR_CPUS ) >> >> With you wanting to deal with changes behind your back here, >> isn't there come kind of barrier needed between reading and using >> pi_block_cpu, such that the compiler won't convert the local >> variable accesses into multiple reads of >> v->arch.hvm_vmx.pi_block_cpu (which iiuc it is allowed to do)? > > Thinking about this more. Seems we call spin_lock_irqsave() before > using ' v->arch.hvm_vmx.pi_block_cpu ', can spin_lock_irqsave() > used as the serializing instruction here? IIRIC, spin lock operations > can be used for this purpose, right? The way I placed the comment may have been misleading, and I'm sorry for that. The comment really is about the use of the local variable in acquiring the lock (I may have got misguided by local variable and field name being the same into thinking this latter access also checked the local variable). Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 4/4] hvmloader: add support to load extra ACPI tables from qemu
>>> On 21.01.16 at 10:29,wrote: > On 21/01/16 09:10, Xiao Guangrong wrote: >> I am not sure if i have understood your point. What your suggestion is >> that >> leave PMEM for hypervisor and all other parts (PBLK and _DSM handling) to >> Dom0? If yes, we should: >> a) handle hotplug in hypervisor (new PMEM add/remove) that causes >> hyperivsor >>interpret ACPI SSDT/DSDT. >> b) some _DSMs control PMEM so you should filter out these kind of >> _DSMs and >>handle them in hypervisor. >> c) hypervisor should mange PMEM resource pool and partition it to >> multiple >>VMs. > > It is not possible for Xen to handle ACPI such as this. > > There can only be one OSPM on a system, and 9/10ths of the functionality > needing it already lives in Dom0. > > The only rational course of action is for Xen to treat both PBLK and > PMEM as "devices" and leave them in Dom0's hands. See my other reply: Why would this be different from "ordinary" memory hotplug? Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 4/4] hvmloader: add support to load extra ACPI tables from qemu
>>> On 21.01.16 at 09:25,wrote: > On 01/21/2016 04:18 PM, Jan Beulich wrote: >> Yes. But then - other than you said above - it still looks to me as >> if the split between PMEM and PBLK is arranged for by firmware? > > Yes. But OS/Hypervisor is not excepted to dynamically change its configure > (re-split), > i,e, for PoV of OS/Hypervisor, it is static. Exactly, that has been my understanding. And hence the PMEM part could be under the hypervisor's control, while the PBLK part could be Dom0's responsibility. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] netfront/netback multiqueue exhausting grants
On Wed, 2016-01-20 at 15:16 +, Ian Campbell wrote: > On Wed, 2016-01-20 at 10:10 -0500, Boris Ostrovsky wrote: > > On 01/20/2016 10:02 AM, David Vrabel wrote: > > > On 20/01/16 14:52, Ian Campbell wrote: > > > > On Wed, 2016-01-20 at 09:40 -0500, Boris Ostrovsky wrote: > > > > > On 01/20/2016 07:23 AM, Ian Campbell wrote: > > > > > > There have been a few reports recently[0] which relate to a > > > > > > failure of > > > > > > netfront to allocate sufficient grant refs for all the queues: > > > > > > > > > > > > [0.533589] xen_netfront: can't alloc rx grant refs > > > > > > [0.533612] net eth0: only created 31 queues > > > > > > > > > > > > Which can be worked around by increasing the number of grants > > > > > > on > > > > > > the > > > > > > hypervisor command line or by limiting the number of queues > > > > > > permitted > > > > > > by > > > > > > either back or front using a module param (which was broken but > > > > > > is now > > > > > > fixed on both sides, but I'm not sure it has been backported > > > > > > everywhere > > > > > > such that it is a reliable thing to always tell users as a > > > > > > workaround). > > > > > > > > > > > > Is there any plan to do anything about the default/out of the > > > > > > box > > > > > > experience? Either limiting the number of queues or making both > > > > > > ends > > > > > > cope > > > > > > more gracefully with failure to create some queues (or both) > > > > > > might be > > > > > > sufficient? > > > > > > > > > > > > I think the crash after the above in the first link at [0] is > > > > > > fixed? I > > > > > > think that was the purpose of ca88ea1247df "xen-netfront: > > > > > > update > > > > > > num_queues > > > > > > to real created" which was in 4.3. > > > > > I think ca88ea1247df is the solution --- it will limit the number > > > > > of > > > > > queues. > > > > That's in 4.4, which the first link at [0] claimed to have tested. > > > > I > > > > can > > > > see this fixing the crash, but does it really fix the "actually > > > > works > > > > with > > > > less queues than it tried to get" issue? > > > > That's what I thought it does too. I didn't notice that 4.4 was tested > > as well, so maybe not. > > I've asked the reporter to send logs for the 4.4 case to xen-devel. User confirmed[0] that 4.4 is actually OK. Did someone request stable backports yet, or shall I do so? Ian. [0] http://lists.xen.org/archives/html/xen-users/2016-01/msg00110.html ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH v2] libxl: dispose libxl_dominfo after libxl_domain_info()
As suggested in a previous thread [0] this patch adds some missing calls to libxl_dominfo_{init,dispose} when doing some of the libxl_domain_info operations which would otherwise lead to memory leaks. [0] https://www.redhat.com/archives/libvir-list/2015-September/msg00519.html Signed-off-by: Joao Martins--- Changes since v1: - Add missing libxl_dominfo_init() on MemoryStats and GetInfo() --- src/libxl/libxl_driver.c | 12 ++-- 1 file changed, 10 insertions(+), 2 deletions(-) diff --git a/src/libxl/libxl_driver.c b/src/libxl/libxl_driver.c index 73ed448..560f2a6 100644 --- a/src/libxl/libxl_driver.c +++ b/src/libxl/libxl_driver.c @@ -358,6 +358,8 @@ libxlReconnectDomain(virDomainObjPtr vm, virObjectLock(vm); +libxl_dominfo_init(_info); + /* Does domain still exist? */ rc = libxl_domain_info(cfg->ctx, _info, vm->def->id); if (rc == ERROR_INVAL) { @@ -389,11 +391,13 @@ libxlReconnectDomain(virDomainObjPtr vm, /* Enable domain death events */ libxl_evenable_domain_death(cfg->ctx, vm->def->id, 0, >deathW); +libxl_dominfo_dispose(_info); virObjectUnlock(vm); virObjectUnref(cfg); return 0; out: +libxl_dominfo_dispose(_info); libxlDomainCleanup(driver, vm); if (!vm->persistent) virDomainObjListRemoveLocked(driver->domains, vm); @@ -1589,6 +1593,8 @@ libxlDomainGetInfo(virDomainPtr dom, virDomainInfoPtr info) info->memory = vm->def->mem.cur_balloon; info->maxMem = virDomainDefGetMemoryActual(vm->def); } else { +libxl_dominfo_init(_info); + if (libxl_domain_info(cfg->ctx, _info, vm->def->id) != 0) { virReportError(VIR_ERR_INTERNAL_ERROR, _("libxl_domain_info failed for domain '%d'"), @@ -1598,6 +1604,8 @@ libxlDomainGetInfo(virDomainPtr dom, virDomainInfoPtr info) info->cpuTime = d_info.cpu_time; info->memory = d_info.current_memkb; info->maxMem = d_info.max_memkb; + +libxl_dominfo_dispose(_info); } info->state = virDomainObjGetState(vm, NULL); @@ -4791,6 +4799,7 @@ libxlDomainMemoryStats(virDomainPtr dom, virCheckFlags(0, -1); +libxl_dominfo_init(_info); cfg = libxlDriverConfigGet(driver); if (!(vm = libxlDomObjFromDomain(dom))) @@ -4822,13 +4831,12 @@ libxlDomainMemoryStats(virDomainPtr dom, ret = i; -libxl_dominfo_dispose(_info); - endjob: if (!libxlDomainObjEndJob(driver, vm)) vm = NULL; cleanup: +libxl_dominfo_dispose(_info); if (vm) virObjectUnlock(vm); virObjectUnref(cfg); -- 2.1.4 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] libxl: dispose libxl_dominfo after libxl_domain_info()
On 01/21/2016 01:41 AM, Jim Fehlig wrote: > Joao Martins wrote: >> As suggested in a previous thread [0] this patch adds some missing calls >> to libxl_dominfo_dispose when doing some of the libxl_domain_info >> operations which would otherwise lead to memory leaks. >> >> [0] >> https://www.redhat.com/archives/libvir-list/2015-September/msg00519.html >> >> Signed-off-by: Joao Martins>> --- >> src/libxl/libxl_driver.c | 6 ++ >> 1 file changed, 6 insertions(+) >> >> diff --git a/src/libxl/libxl_driver.c b/src/libxl/libxl_driver.c >> index 73ed448..a449730 100644 >> --- a/src/libxl/libxl_driver.c >> +++ b/src/libxl/libxl_driver.c >> @@ -358,6 +358,8 @@ libxlReconnectDomain(virDomainObjPtr vm, >> >> virObjectLock(vm); >> >> +libxl_dominfo_init(_info); >> + >> /* Does domain still exist? */ >> rc = libxl_domain_info(cfg->ctx, _info, vm->def->id); >> if (rc == ERROR_INVAL) { >> @@ -389,11 +391,13 @@ libxlReconnectDomain(virDomainObjPtr vm, >> /* Enable domain death events */ >> libxl_evenable_domain_death(cfg->ctx, vm->def->id, 0, >deathW); >> >> +libxl_dominfo_dispose(_info); >> virObjectUnlock(vm); >> virObjectUnref(cfg); >> return 0; >> >> out: >> +libxl_dominfo_dispose(_info); >> libxlDomainCleanup(driver, vm); >> if (!vm->persistent) >> virDomainObjListRemoveLocked(driver->domains, vm); >> @@ -1598,6 +1602,8 @@ libxlDomainGetInfo(virDomainPtr dom, virDomainInfoPtr >> info) >> info->cpuTime = d_info.cpu_time; >> info->memory = d_info.current_memkb; >> info->maxMem = d_info.max_memkb; >> + >> +libxl_dominfo_dispose(_info); > > There should be a corresponding libxl_dominfo_init(). While looking at this, I > audited all uses of libxl_domain_info() and found another missing _init(). > With > the below diff squashed in, I think this code is finally correct :-). Do you > agree? > Ah, missed the dominfo_init() calls. Yeah, I definitively agree. I just sent it over v2. Joao > Regards, > Jim > > diff --git a/src/libxl/libxl_driver.c b/src/libxl/libxl_driver.c > index 21c7610..4a9134e 100644 > --- a/src/libxl/libxl_driver.c > +++ b/src/libxl/libxl_driver.c > @@ -1593,6 +1593,8 @@ libxlDomainGetInfo(virDomainPtr dom, virDomainInfoPtr > info) > info->memory = vm->def->mem.cur_balloon; > info->maxMem = virDomainDefGetMemoryActual(vm->def); > } else { > +libxl_dominfo_init(_info); > + > if (libxl_domain_info(cfg->ctx, _info, vm->def->id) != 0) { > virReportError(VIR_ERR_INTERNAL_ERROR, > _("libxl_domain_info failed for domain '%d'"), > @@ -4797,6 +4799,7 @@ libxlDomainMemoryStats(virDomainPtr dom, > > virCheckFlags(0, -1); > > +libxl_dominfo_init(_info); > cfg = libxlDriverConfigGet(driver); > > if (!(vm = libxlDomObjFromDomain(dom))) > @@ -4828,13 +4831,12 @@ libxlDomainMemoryStats(virDomainPtr dom, > > ret = i; > > -libxl_dominfo_dispose(_info); > - > endjob: > if (!libxlDomainObjEndJob(driver, vm)) > vm = NULL; > > cleanup: > +libxl_dominfo_dispose(_info); > if (vm) > virObjectUnlock(vm); > virObjectUnref(cfg); > ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] cleancache: constify cleancache_ops structure
On 23/12/15 21:06, Julia Lawall wrote: > The cleancache_ops structure is never modified, so declare it as const. > > This also removes the __read_mostly declaration on the cleancache_ops > variable declaration, since it seems redundant with const. > > Done with the help of Coccinelle. > > Signed-off-by: Julia Lawall> > --- > > Not sure that the __read_mostly change is correct. Does it apply to the > variable, or to what the variable points to? The variable, so... > --- a/mm/cleancache.c > +++ b/mm/cleancache.c > @@ -22,7 +22,7 @@ > * cleancache_ops is set by cleancache_register_ops to contain the pointers > * to the cleancache "backend" implementation functions. > */ > -static struct cleancache_ops *cleancache_ops __read_mostly; > +static const struct cleancache_ops *cleancache_ops; ...you want to retain the __read_mostly here. David ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [xen-unstable bisection] complete test-amd64-i386-xl-qemut-win7-amd64
branch xen-unstable xenbranch xen-unstable job test-amd64-i386-xl-qemut-win7-amd64 testid windows-install Tree: linux git://xenbits.xen.org/linux-pvops.git Tree: linuxfirmware git://xenbits.xen.org/osstest/linux-firmware.git Tree: qemu git://xenbits.xen.org/qemu-xen-traditional.git Tree: qemuu git://xenbits.xen.org/qemu-xen.git Tree: xen git://xenbits.xen.org/xen.git *** Found and reproduced problem changeset *** Bug is in tree: xen git://xenbits.xen.org/xen.git Bug introduced: d23da94b123a0d9326408c376e5735697bd2d96a Bug not present: 83ae0bb2260c71e8dcec538dd477f7253aaef327 Last fail repro: http://logs.test-lab.xenproject.org/osstest/logs/78681/ commit d23da94b123a0d9326408c376e5735697bd2d96a Author: Roger Pau MonnéDate: Thu Jan 14 10:43:36 2016 +0100 Revert "libxc: create an initial FPU state for HVM guests" This reverts commit d64dbbcc7c9934a46126c59d78536235908377ad: Xen always set the FPU as initialized when loading a HVM context, so libxc has to provide a valid FPU context when setting the CPU registers. This was a stop-gap measure in order to unblock OSSTest Windows 7 failures while a proper fix for the HVM CPU save/restore is being worked on. This can now be reverted because a proper fix is in place and we can signal in the save record whether the FPU is initialized or not. Signed-off-by: Roger Pau Monné Reviewed-by: Andrew Cooper Acked-by: Wei Liu For bisection revision-tuple graph see: http://logs.test-lab.xenproject.org/osstest/results/bisect/xen-unstable/test-amd64-i386-xl-qemut-win7-amd64.windows-install.html Revision IDs in each graph node refer, respectively, to the Trees above. Running cs-bisection-step --graph-out=/home/logs/results/bisect/xen-unstable/test-amd64-i386-xl-qemut-win7-amd64.windows-install --summary-out=tmp/78681.bisection-summary --basis-template=77892 --blessings=real,real-bisect xen-unstable test-amd64-i386-xl-qemut-win7-amd64 windows-install Searching for failure / basis pass: 78525 fail [host=pinot0] / 78008 ok. Failure / basis pass flights: 78525 / 78008 (tree with no url: ovmf) (tree with no url: seabios) Tree: linux git://xenbits.xen.org/linux-pvops.git Tree: linuxfirmware git://xenbits.xen.org/osstest/linux-firmware.git Tree: qemu git://xenbits.xen.org/qemu-xen-traditional.git Tree: qemuu git://xenbits.xen.org/qemu-xen.git Tree: xen git://xenbits.xen.org/xen.git Latest 5d7b0fcc26d66db767a477574effc764022c19ac c530a75c1e6a472b0eb9558310b518f0dfcd8860 569eac99e8ddccd15fe78e8a3af5622afe780e3b f165e581d9a6f7cf81aa7496d3eee1e31212c8ad 0c2f1283486f953604a282188a06e2db143f475d Basis pass 5d7b0fcc26d66db767a477574effc764022c19ac c530a75c1e6a472b0eb9558310b518f0dfcd8860 569eac99e8ddccd15fe78e8a3af5622afe780e3b f165e581d9a6f7cf81aa7496d3eee1e31212c8ad 20c8f1a8a5fd61cb6f0ba6f3c3b3d567b1765116 Generating revisions with ./adhoc-revtuple-generator git://xenbits.xen.org/linux-pvops.git#5d7b0fcc26d66db767a477574effc764022c19ac-5d7b0fcc26d66db767a477574effc764022c19ac git://xenbits.xen.org/osstest/linux-firmware.git#c530a75c1e6a472b0eb9558310b518f0dfcd8860-c530a75c1e6a472b0eb9558310b518f0dfcd8860 git://xenbits.xen.org/qemu-xen-traditional.git#569eac99e8ddccd15fe78e8a3af5622afe780e3b-569eac99e8ddccd15fe78e8a3af5622afe780e3b git://xenbits.xen.org/qemu-xen.git#f165e581d9a6f7cf81aa7496d3eee1e31212c8ad-f165e581d9a6f7cf81aa7496d3eee1e31212c8ad git://xenbits.xen.org/xen.git#20c8f1a8a5fd61cb6f0ba6f3c3b3d567b1765116-0c2f1283486f953604a282188a06e2db143f475d Loaded 1002 nodes in revision graph Searching for test results: 77892 pass irrelevant 78008 pass 5d7b0fcc26d66db767a477574effc764022c19ac c530a75c1e6a472b0eb9558310b518f0dfcd8860 569eac99e8ddccd15fe78e8a3af5622afe780e3b f165e581d9a6f7cf81aa7496d3eee1e31212c8ad 20c8f1a8a5fd61cb6f0ba6f3c3b3d567b1765116 77945 pass 5d7b0fcc26d66db767a477574effc764022c19ac c530a75c1e6a472b0eb9558310b518f0dfcd8860 569eac99e8ddccd15fe78e8a3af5622afe780e3b f165e581d9a6f7cf81aa7496d3eee1e31212c8ad 20c8f1a8a5fd61cb6f0ba6f3c3b3d567b1765116 78129 fail 5d7b0fcc26d66db767a477574effc764022c19ac c530a75c1e6a472b0eb9558310b518f0dfcd8860 569eac99e8ddccd15fe78e8a3af5622afe780e3b f165e581d9a6f7cf81aa7496d3eee1e31212c8ad 7167222b15dc661ff0b44cc93d558f6bb4ff6492 78159 fail 5d7b0fcc26d66db767a477574effc764022c19ac c530a75c1e6a472b0eb9558310b518f0dfcd8860 569eac99e8ddccd15fe78e8a3af5622afe780e3b f165e581d9a6f7cf81aa7496d3eee1e31212c8ad 7167222b15dc661ff0b44cc93d558f6bb4ff6492 78210 fail 5d7b0fcc26d66db767a477574effc764022c19ac c530a75c1e6a472b0eb9558310b518f0dfcd8860 569eac99e8ddccd15fe78e8a3af5622afe780e3b f165e581d9a6f7cf81aa7496d3eee1e31212c8ad 7a1d1becf020f1fa511692bd97d50402588ef28d 78343 fail 5d7b0fcc26d66db767a477574effc764022c19ac c530a75c1e6a472b0eb9558310b518f0dfcd8860
Re: [Xen-devel] Error booting Xen
On Wed, 2016-01-20 at 03:06 -0700, Jan Beulich wrote: > > > > On 19.01.16 at 20:55,wrote: > > > > $ 'addr2line -e xen-syms 82d0801c1cce' returns > > 'xen/xen/arch/x86/xstate.c:387' which again points to > > xsave. Also, adding 'xsave=0' makes it boot just fine. > > Ah, I think I see the issue: We're zeroing the entire save area in > the fixup code, which will make XRSTORS fault unconditionally. > Shuai, would you please look into possible ways of fixing this? > Just setting the compressed flag may not be enough, and falling > back to plain XRSTOR doesn't seem to be an option either - I'm in > particular worried that the current model of zeroing everything > is bogus with the growing number of different components which > get loaded here. > > But of course another question then is why the XRSTORS faults > in the first place. I guess we'll need a debugging patch to dump > the full state to understand that. > If someone can produce and send such patch, I'm sure Harmandeep will be happy to give it a try on her hardware. Thanks and Regards, Dario -- <> (Raistlin Majere) - Dario Faggioli, Ph.D, http://about.me/dario.faggioli Senior Software Engineer, Citrix Systems R Ltd., Cambridge (UK) signature.asc Description: This is a digitally signed message part ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 4/4] hvmloader: add support to load extra ACPI tables from qemu
>>> On 21.01.16 at 15:01,wrote: > On 01/21/16 03:25, Jan Beulich wrote: >> >>> On 21.01.16 at 10:10, wrote: >> > b) some _DSMs control PMEM so you should filter out these kind of _DSMs and >> > handle them in hypervisor. >> >> Not if (see above) following the model we currently have in place. >> > > You mean let dom0 linux evaluates those _DSMs and interact with > hypervisor if necessary (e.g. XENPF_mem_hotadd for memory hotplug)? Yes. >> > c) hypervisor should mange PMEM resource pool and partition it to multiple >> > VMs. >> >> Yes. >> > > But I Still do not quite understand this part: why must pmem resource > management and partition be done in hypervisor? Because that's where memory management belongs. And PMEM, other than PBLK, is just another form of RAM. > I mean if we allow the following steps of operations (for example) > (1) partition pmem in dom 0 > (2) get address and size of each partition (part_addr, part_size) > (3) call a hypercall like nvdimm_memory_mapping(d, part_addr, part_size, > gpfn) to > map a partition to the address gpfn in dom d. > Only the last step requires hypervisor. Would anything be wrong if we > allow above operations? The main issue is that this would imo be a layering violation. I'm sure it can be made work, but that doesn't mean that's the way it ought to work. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] Xen PV PTE ABI (or lack thereof)
On 21/01/16 13:55, Jan Beulich wrote: I was intending to have CONFIG_PV_PTE_DEBUG as an EXPERT option, disabled by default even in debug builds. There should not be an ABI difference between release and "normal" debug builds. >>> Well, I see your point, but as said above I'm not convinced >>> disabling all that code is the right solution. In fact, what you >>> propose is not far away from removing that code altogether. >> The two bits are only used for specialised debugging. They should be >> relegated to people doing specific debugging, and not interfere with the >> overwhelming majority of cases where Xen doesn't need to use any >> software available PTE bits. > Your repeated claim that _PAGE_GUEST_KERNEL is purely debugging only > makes we wonder how you would mean to adjust adjust_guest_l1e() with > that flag gone (most notably the last of its if()-s). diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c index b81d1fd..46ef5ce 100644 --- a/xen/arch/x86/mm.c +++ b/xen/arch/x86/mm.c @@ -1060,10 +1060,11 @@ get_page_from_l4e( == (_PAGE_GUEST_KERNEL|_PAGE_GLOBAL) ) \ MEM_LOG("Global bit is set to kernel page %lx", \ l1e_get_pfn((pl1e))); \ -if ( !(l1e_get_flags((pl1e)) & _PAGE_USER) ) \ -l1e_add_flags((pl1e), (_PAGE_GUEST_KERNEL|_PAGE_USER)); \ -if ( !(l1e_get_flags((pl1e)) & _PAGE_GUEST_KERNEL) ) \ -l1e_add_flags((pl1e), (_PAGE_GLOBAL|_PAGE_USER)); \ +if ( l1e_get_flags((pl1e)) & _PAGE_USER ) \ +l1e_add_flags((pl1e), _PAGE_GLOBAL); \ + else \ +l1e_remove_flags((pl1e), _PAGE_GLOBAL); \ +l1e_add_flags((pl1e), _PAGE_USER); \ } \ } while ( 0 ) _PAGE_GUEST_KERNEL isn't in the ABI, which means that the 2nd if() is the only piece of code which validly sets it. Read-modify-write operations already don't function correctly as _PAGE_GUEST_KERNEL is a hidden saturating bit from the guests point of view. ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] Xen PV PTE ABI (or lack thereof)
On 21/01/16 13:17, Andrew Cooper wrote: > > As we see with the Protection Key feature, newer hardware feature start > using bits which were previously software available, and we absolutely > don't want to be in a position where our ABI prevents us from ever > supporting a new feature. I don't see a problem in not supporting new features in /PV/ guests. I think we should document the PV ABI as-is. i.e., that these two PTE bits are not available for PV guest use. David ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCHv5 1/3] rwlock: Add per-cpu reader-writer lock infrastructure
On Fri, 2015-12-18 at 16:08 +, Malcolm Crossley wrote: > diff --git a/xen/include/asm-arm/percpu.h b/xen/include/asm-arm/percpu.h > index 71e7649..c308a56 100644 > --- a/xen/include/asm-arm/percpu.h > +++ b/xen/include/asm-arm/percpu.h > @@ -27,6 +27,11 @@ void percpu_init_areas(void); > #define __get_cpu_var(var) \ > (*RELOC_HIDE(_cpu__##var, READ_SYSREG(TPIDR_EL2))) > > +#define per_cpu_ptr(var, cpu) \ > +(*RELOC_HIDE(, __per_cpu_offset[cpu])) > +#define __get_cpu_ptr(var) \ > +(*RELOC_HIDE(, READ_SYSREG(TPIDR_EL2))) > + > #define DECLARE_PER_CPU(type, name) extern __typeof__(type) > per_cpu__##name > For ARM: Acked-by :Ian Campbell___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] Error booting Xen
On Thu, Jan 21, 2016 at 8:44 PM, Dario Faggioliwrote: > On Wed, 2016-01-20 at 03:06 -0700, Jan Beulich wrote: >> > > > On 19.01.16 at 20:55, wrote: >> > >> > $ 'addr2line -e xen-syms 82d0801c1cce' returns >> > 'xen/xen/arch/x86/xstate.c:387' which again points to >> > xsave. Also, adding 'xsave=0' makes it boot just fine. >> >> Ah, I think I see the issue: We're zeroing the entire save area in >> the fixup code, which will make XRSTORS fault unconditionally. >> Shuai, would you please look into possible ways of fixing this? >> Just setting the compressed flag may not be enough, and falling >> back to plain XRSTOR doesn't seem to be an option either - I'm in >> particular worried that the current model of zeroing everything >> is bogus with the growing number of different components which >> get loaded here. >> >> But of course another question then is why the XRSTORS faults >> in the first place. I guess we'll need a debugging patch to dump >> the full state to understand that. >> > If someone can produce and send such patch, I'm sure Harmandeep will be > happy to give it a try on her hardware. > Yes, sure. I will be more than happy to do this :) Thank you Dario. Regards, Harman > Thanks and Regards, > Dario > -- > <> (Raistlin Majere) > - > Dario Faggioli, Ph.D, http://about.me/dario.faggioli > Senior Software Engineer, Citrix Systems R Ltd., Cambridge (UK) > ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] public/io/netif.h: change semantics of "request-multicast-control" flag
On Wed, 2016-01-20 at 12:50 +, Paul Durrant wrote: > My patch b2700877 "move and amend multicast control documentation" > clarified use of the multicast control protocol between frontend and > backend. However, it transpires that the restrictions that documentation > placed on the "request-multicast-control" flag make it hard for a > frontend to enable 'all multicast' promiscuous mode, in that to do so > would require the frontend and backend to disconnect and re-connect. > > This patch adds a new "feature-dynamic-multicast-control" flag to allow > a backend to advertise that it will watch "request-multicast-control" hence > allowing it to be meaningfully modified by the frontend at any time rather > than only when the frontend and backend are disconnected. > > Signed-off-by: Paul Durrant> Cc: Ian Campbell > Cc: Ian Jackson > Cc: Jan Beulich > Cc: Keir Fraser > Cc: Tim Deegan This looks good to me, but also adding Wei (Linux netback + BSD stuff) and Roger (BSD stuff) for their perspective. I should probably have done that for the last set of netif.h changes too, since apart from the nominal maintainers of xen/include/public/io/*.h it's worth getting input from the maintainers of the consumers. Not sure we can express that very well in MAINTAINERS :-(. Ian. > --- > xen/include/public/io/netif.h | 34 ++ > 1 file changed, 22 insertions(+), 12 deletions(-) > > diff --git a/xen/include/public/io/netif.h > b/xen/include/public/io/netif.h > index fe0a87f..8816e0f 100644 > --- a/xen/include/public/io/netif.h > +++ b/xen/include/public/io/netif.h > @@ -136,18 +136,28 @@ > */ > > /* > - * "feature-multicast-control" advertises the capability to filter ethernet > - * multicast packets in the backend. To enable use of this capability the > - * frontend must set "request-multicast-control" before moving into the > - * connected state. > - * > - * If "request-multicast-control" is set then the backend transmit side > should > - * no longer flood multicast packets to the frontend, it should instead drop > any > - * multicast packet that does not match in a filter list. The list is > - * amended by the frontend by sending dummy transmit requests containing > - * XEN_NETIF_EXTRA_TYPE_MCAST_{ADD,DEL} extra-info fragments as specified > below. > - * Once enabled by the frontend, the feature cannot be disabled except by > - * closing and re-connecting to the backend. > + * "feature-multicast-control" and "feature-dynamic-multicast-control" > + * advertise the capability to filter ethernet multicast packets in the > + * backend. If the frontend wishes to take advantage of this feature then > + * it may set "request-multicast-control". If the backend only advertises > + * "feature-multicast-control" then "request-multicast-control" must be set > + * before the frontend moves into the connected state. The backend will > + * sample the value on this state transition and any subsequent change in > + * value will have no effect. However, if the backend also advertises > + * "feature-dynamic-multicast-control" then "request-multicast-control" > + * may be set by the frontend at any time. In this case, the backend will > + * watch the value and re-sample on watch events. > + * > + * If the sampled value of "request-multicast-control" is set then the > + * backend transmit side should no longer flood multicast packets to the > + * frontend, it should instead drop any multicast packet that does not > + * match in a filter list. > + * The list is amended by the frontend by sending dummy transmit requests > + * containing XEN_NETIF_EXTRA_TYPE_MCAST_{ADD,DEL} extra-info fragments as > + * specified below. > + * Note that the filter list may be amended even if the sampled value of > + * "request-multicast-control" is not set, however the filter should only > + * be applied if it is set. > */ > > /* ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCHv5 2/3] grant_table: convert grant table rwlock to percpu rwlock
On Fri, 2015-12-18 at 16:08 +, Malcolm Crossley wrote: > > xen/arch/arm/mm.c | 4 +- [...] > diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c > index 47bfb27..81f9e2e 100644 > --- a/xen/arch/arm/mm.c > +++ b/xen/arch/arm/mm.c > @@ -1055,7 +1055,7 @@ int xenmem_add_to_physmap_one( > switch ( space ) > { > case XENMAPSPACE_grant_table: > -write_lock(>grant_table->lock); > +grant_write_lock(d->grant_table); > > if ( d->grant_table->gt_version == 0 ) > d->grant_table->gt_version = 1; > @@ -1085,7 +1085,7 @@ int xenmem_add_to_physmap_one( > > t = p2m_ram_rw; > > -write_unlock(>grant_table->lock); > +grant_write_unlock(d->grant_table); ARML Acked-by: Ian Campbell___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH v2] cleancache: constify cleancache_ops structure
The cleancache_ops structure is never modified, so declare it as const. Done with the help of Coccinelle. Signed-off-by: Julia Lawall--- v2: put back the read mostly drivers/xen/tmem.c |2 +- include/linux/cleancache.h |2 +- mm/cleancache.c|4 ++-- 3 files changed, 4 insertions(+), 4 deletions(-) diff --git a/include/linux/cleancache.h b/include/linux/cleancache.h index bda5ec0b4..cb3e142 100644 --- a/include/linux/cleancache.h +++ b/include/linux/cleancache.h @@ -37,7 +37,7 @@ struct cleancache_ops { void (*invalidate_fs)(int); }; -extern int cleancache_register_ops(struct cleancache_ops *ops); +extern int cleancache_register_ops(const struct cleancache_ops *ops); extern void __cleancache_init_fs(struct super_block *); extern void __cleancache_init_shared_fs(struct super_block *); extern int __cleancache_get_page(struct page *); diff --git a/drivers/xen/tmem.c b/drivers/xen/tmem.c index 945fc43..4ac2ca8 100644 --- a/drivers/xen/tmem.c +++ b/drivers/xen/tmem.c @@ -242,7 +242,7 @@ static int tmem_cleancache_init_shared_fs(char *uuid, size_t pagesize) return xen_tmem_new_pool(shared_uuid, TMEM_POOL_SHARED, pagesize); } -static struct cleancache_ops tmem_cleancache_ops = { +static const struct cleancache_ops tmem_cleancache_ops = { .put_page = tmem_cleancache_put_page, .get_page = tmem_cleancache_get_page, .invalidate_page = tmem_cleancache_flush_page, diff --git a/mm/cleancache.c b/mm/cleancache.c index 8fc5081..c6356d6 100644 --- a/mm/cleancache.c +++ b/mm/cleancache.c @@ -22,7 +22,7 @@ * cleancache_ops is set by cleancache_register_ops to contain the pointers * to the cleancache "backend" implementation functions. */ -static struct cleancache_ops *cleancache_ops __read_mostly; +static const struct cleancache_ops *cleancache_ops __read_mostly; /* * Counters available via /sys/kernel/debug/cleancache (if debugfs is @@ -49,7 +49,7 @@ static void cleancache_register_ops_sb(struct super_block *sb, void *unused) /* * Register operations for cleancache. Returns 0 on success. */ -int cleancache_register_ops(struct cleancache_ops *ops) +int cleancache_register_ops(const struct cleancache_ops *ops) { if (cmpxchg(_ops, NULL, ops)) return -EBUSY; ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] public/io/netif.h: change semantics of "request-multicast-control" flag
On Thu, Jan 21, 2016 at 03:29:36PM +, Ian Campbell wrote: > On Wed, 2016-01-20 at 12:50 +, Paul Durrant wrote: > > My patch b2700877 "move and amend multicast control documentation" > > clarified use of the multicast control protocol between frontend and > > backend. However, it transpires that the restrictions that documentation > > placed on the "request-multicast-control" flag make it hard for a > > frontend to enable 'all multicast' promiscuous mode, in that to do so > > would require the frontend and backend to disconnect and re-connect. > > > > This patch adds a new "feature-dynamic-multicast-control" flag to allow > > a backend to advertise that it will watch "request-multicast-control" hence > > allowing it to be meaningfully modified by the frontend at any time rather > > than only when the frontend and backend are disconnected. > > > > Signed-off-by: Paul Durrant> > Cc: Ian Campbell > > Cc: Ian Jackson > > Cc: Jan Beulich > > Cc: Keir Fraser > > Cc: Tim Deegan > > > This looks good to me, but also adding Wei (Linux netback + BSD stuff) and > Roger (BSD stuff) for their perspective. > > I should probably have done that for the last set of netif.h changes too, > since apart from the nominal maintainers of xen/include/public/io/*.h it's > worth getting input from the maintainers of the consumers. Not sure we can > express that very well in MAINTAINERS :-(. > > Ian. LGTM Acked-by: Wei Liu ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v3 5/5] libxl: add options to enable/disable emulated devices
El 21/01/16 a les 12.31, Wei Liu ha escrit: > On Thu, Jan 21, 2016 at 11:01:43AM +0100, Roger Pau Monné wrote: >> El 21/01/16 a les 10.39, Ian Campbell ha escrit: >>> On Wed, 2016-01-20 at 19:33 +0100, Roger Pau Monné wrote: El 20/01/16 a les 14.01, Ian Campbell ha escrit: > On Wed, 2016-01-20 at 12:57 +0100, Roger Pau Monne wrote: >> Allow enabling or disabling emulated devices from the libxl domain >> configuration file. For HVM guests with a device model all the >> emulated >> devices are enabled. For HVM guests without a device model no devices >> are >> enabled by default, although they can be enabled using the options >> provided. >> The arbiter of whether a combination is posible or not is always Xen, >> libxl >> doesn't do any kind of check. >> >> This set of options is also propagated inside of the libxl migration >> record >> as part of the contents of the libxl_domain_build_info struct. > > ... and this is the real motivation for this change, not actually > allowing > users to control all this AIUI. > > Did you check that the fields updated using libxl_defbool_setdefault > are > actually updated in the JSON copy and therefore propagated to the other > side of a migration as specific values and not as "pick a default"? I > think > we don't want these changing on migration. I think/hope all this was > automatically handled by the work Wei did in the last release cycle. No, values populated by the {build/create}_info_setdefault functions are not propagated, OTOH values manually set by the user in the config file are indeed propagated. Do we have any guarantee that _setdefault is always going to behave in the same way? >>> >>> No, part of the purpose of defbool and the other "do the default" values is >>> that we can evolve things over time. >>> If we don't have that guarantee I think this is already a bug, and we should call _setdefault before sending the domain info to the other end. In fact I have a patch that does exactly that, but I'm unsure if it's needed because I don't know the policy regarding default values in libxl. >>> >>> Wei, isn't this (turning the defaults into concrete values) supposed to be >>> taken care of by the JSON mangling which you added? >> >> Heh, I think you mean the JSON mangling added by Wei. In order to >> propagate the values filled by default in libxl_domain_config I had to >> add the following patch, which basically calls the _setdefault >> functions before converting the domain_config into JSON. I'm planning >> to make it part of this series in the next iteration: > > The requirement of recording decision made in libxl and pass that to the > receiving end is not new. We had the same problem for uuid, disk and > some other things. > > The first way of doing it is to update JSON before it is sent -- see > libxl.c:libxl_retrieve_domain_configuration. It uses the stashed JSON > file as template and pull in various bits from hypervisor and xenstore. > Your need of recording what emulated devices are available fits here. > You just need to provide a way to retrieve those bits in that function. > > Another way of doing it is to update the stashed JSON template before it > is saved. See libxl_internal.c:libxl__update_domain_configuration. I > think this might be easier than the first way of doing it. > > You should not export the _setdefault function to xl because it is a > layer violation. > > Hope this clarify things. Hello, Yes, thank you very much, it has indeed clarified things. I've implemented it inside of libxl__update_domain_configuration without issues. Roger. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] netfront/netback multiqueue exhausting grants
On 21/01/16 12:19, Ian Campbell wrote: > On Thu, 2016-01-21 at 10:56 +, David Vrabel wrote: >> On 20/01/16 12:23, Ian Campbell wrote: >>> There have been a few reports recently[0] which relate to a failure of >>> netfront to allocate sufficient grant refs for all the queues: >>> >>> [0.533589] xen_netfront: can't alloc rx grant refs >>> [0.533612] net eth0: only created 31 queues >>> >>> Which can be worked around by increasing the number of grants on the >>> hypervisor command line or by limiting the number of queues permitted >>> by >>> either back or front using a module param (which was broken but is now >>> fixed on both sides, but I'm not sure it has been backported everywhere >>> such that it is a reliable thing to always tell users as a workaround). >>> >>> Is there any plan to do anything about the default/out of the box >>> experience? Either limiting the number of queues or making both ends >>> cope >>> more gracefully with failure to create some queues (or both) might be >>> sufficient? >>> >>> I think the crash after the above in the first link at [0] is fixed? I >>> think that was the purpose of ca88ea1247df "xen-netfront: update >>> num_queues >>> to real created" which was in 4.3. >> >> I think the correct solution is to increase the default maximum grant >> table size. > > That could well make sense, but then there will just be another higher > limit, so we should perhaps do both. > > i.e. factoring in: > * performance i.e. ability for N queues to saturate whatever sort of link >contemporary Linux can saturate these days, plus some headroom, or >whatever other ceiling seems sensible) > * grant table resource consumption i.e. (sensible max number of blks * nr >gnts per blk + sensible max number of vifs * nr gnts per vif + other >devs needs) < per guest grant limit) to pick both the default gnttab >size and the default max queuers. Yes. >> Although, unless you're using the not-yet-applied per-cpu rwlock patches >> multiqueue is terrible on many (multisocket) systems and the number of >> queue should be limited in netback to 4 or even just 2. > > Presumably the guest can't tell, so it can't do this. > > I think when you say "terrible" you don't mean "worse than without mq" but > rather "not realising the expected gains from a larger nunber of queues", > right?. Malcolm did the analysis but if I remember correctly, 8 queues performed about the same as 1 queue and 16 were worse than 1 queue. David ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] netfront/netback multiqueue exhausting grants
On 2016/1/21 9:17, David Vrabel wrote: On 21/01/16 12:19, Ian Campbell wrote: On Thu, 2016-01-21 at 10:56 +, David Vrabel wrote: On 20/01/16 12:23, Ian Campbell wrote: There have been a few reports recently[0] which relate to a failure of netfront to allocate sufficient grant refs for all the queues: [0.533589] xen_netfront: can't alloc rx grant refs [0.533612] net eth0: only created 31 queues Which can be worked around by increasing the number of grants on the hypervisor command line or by limiting the number of queues permitted by either back or front using a module param (which was broken but is now fixed on both sides, but I'm not sure it has been backported everywhere such that it is a reliable thing to always tell users as a workaround). Is there any plan to do anything about the default/out of the box experience? Either limiting the number of queues or making both ends cope more gracefully with failure to create some queues (or both) might be sufficient? I think the crash after the above in the first link at [0] is fixed? I think that was the purpose of ca88ea1247df "xen-netfront: update num_queues to real created" which was in 4.3. I think the correct solution is to increase the default maximum grant table size. That could well make sense, but then there will just be another higher limit, so we should perhaps do both. i.e. factoring in: * performance i.e. ability for N queues to saturate whatever sort of link contemporary Linux can saturate these days, plus some headroom, or whatever other ceiling seems sensible) * grant table resource consumption i.e. (sensible max number of blks * nr gnts per blk + sensible max number of vifs * nr gnts per vif + other devs needs) < per guest grant limit) to pick both the default gnttab size and the default max queuers. Yes. Would it waste lots of resources in the case where guest vif has lots of queue but no network load? Here is an example of gntref consumed by vif, Dom0 20vcpu, domu 20vcpu, one vif would consumes 20*256*2=10240 gntref. If setting the maximum grant table size to 64pages(default value of xen is 32pages now?), then only 3 vif is supported in guest. Even blk isn't taken account in, and also blk multi-page ring feature. Thanks Annie Although, unless you're using the not-yet-applied per-cpu rwlock patches multiqueue is terrible on many (multisocket) systems and the number of queue should be limited in netback to 4 or even just 2. Presumably the guest can't tell, so it can't do this. I think when you say "terrible" you don't mean "worse than without mq" but rather "not realising the expected gains from a larger nunber of queues", right?. Malcolm did the analysis but if I remember correctly, 8 queues performed about the same as 1 queue and 16 were worse than 1 queue. David ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCHv1 1/4] atomic: replace atomic_compareandswap() with atomic_cmpxchg()
On Fri, 2015-12-18 at 14:09 +, David Vrabel wrote: > atomic_compareandswap() used atomic_t as the new, old and returned > values which is less convinient than using just int. "convenient" diff --git a/xen/include/asm-arm/atomic.h b/xen/include/asm-arm/atomic.h > index 5a38c67..29ab265 100644 > --- a/xen/include/asm-arm/atomic.h > +++ b/xen/include/asm-arm/atomic.h > @@ -138,14 +138,6 @@ static inline void _atomic_set(atomic_t *v, int i) > # error "unknown ARM variant" > #endif > > -static inline atomic_t atomic_compareandswap( > -atomic_t old, atomic_t new, atomic_t *v) > -{ > -atomic_t rc; > -rc.counter = __cmpxchg(>counter, old.counter, new.counter, > sizeof(int)); > -return rc; > -} > - ARM: Acked-by: Ian Campbell___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] public/io/netif.h: change semantics of "request-multicast-control" flag
On Thu, Jan 21, 2016 at 04:35:40PM +0100, Roger Pau Monné wrote: > El 21/01/16 a les 16.29, Ian Campbell ha escrit: > > On Wed, 2016-01-20 at 12:50 +, Paul Durrant wrote: > >> My patch b2700877 "move and amend multicast control documentation" > >> clarified use of the multicast control protocol between frontend and > >> backend. However, it transpires that the restrictions that documentation > >> placed on the "request-multicast-control" flag make it hard for a > >> frontend to enable 'all multicast' promiscuous mode, in that to do so > >> would require the frontend and backend to disconnect and re-connect. > >> > >> This patch adds a new "feature-dynamic-multicast-control" flag to allow > >> a backend to advertise that it will watch "request-multicast-control" hence > >> allowing it to be meaningfully modified by the frontend at any time rather > >> than only when the frontend and backend are disconnected. > >> > >> Signed-off-by: Paul Durrant> >> Cc: Ian Campbell > >> Cc: Ian Jackson > >> Cc: Jan Beulich > >> Cc: Keir Fraser > >> Cc: Tim Deegan > > > > > > This looks good to me, but also adding Wei (Linux netback + BSD stuff) and > > Roger (BSD stuff) for their perspective. > > > > I should probably have done that for the last set of netif.h changes too, > > since apart from the nominal maintainers of xen/include/public/io/*.h it's > > worth getting input from the maintainers of the consumers. Not sure we can > > express that very well in MAINTAINERS :-(. > > I'm going to leave this one to Wei, he has more experience than me > regarding FreeBSD netfront (and xen-net related topics). > > FWIW, a quick and dirty grep on FreeBSD netfront doesn't show any > results for "request-multicast-control", so I guess it's not implemented. > No, it is not implemented. Wei. > Roger. > ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] Xen PV PTE ABI (or lack thereof)
On 21/01/16 14:29, David Vrabel wrote: > On 21/01/16 13:17, Andrew Cooper wrote: >> As we see with the Protection Key feature, newer hardware feature start >> using bits which were previously software available, and we absolutely >> don't want to be in a position where our ABI prevents us from ever >> supporting a new feature. > I don't see a problem in not supporting new features in /PV/ guests. I know you want to kill PV guests. I am not going to deliberately cripple PV guests in an attempt to achieve that goal. > > I think we should document the PV ABI as-is. i.e., that these two PTE > bits are not available for PV guest use. This is already known to break some guests. It shouldn't stay as it currently is. ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] public/io/netif.h: change semantics of "request-multicast-control" flag
El 21/01/16 a les 16.29, Ian Campbell ha escrit: > On Wed, 2016-01-20 at 12:50 +, Paul Durrant wrote: >> My patch b2700877 "move and amend multicast control documentation" >> clarified use of the multicast control protocol between frontend and >> backend. However, it transpires that the restrictions that documentation >> placed on the "request-multicast-control" flag make it hard for a >> frontend to enable 'all multicast' promiscuous mode, in that to do so >> would require the frontend and backend to disconnect and re-connect. >> >> This patch adds a new "feature-dynamic-multicast-control" flag to allow >> a backend to advertise that it will watch "request-multicast-control" hence >> allowing it to be meaningfully modified by the frontend at any time rather >> than only when the frontend and backend are disconnected. >> >> Signed-off-by: Paul Durrant>> Cc: Ian Campbell >> Cc: Ian Jackson >> Cc: Jan Beulich >> Cc: Keir Fraser >> Cc: Tim Deegan > > > This looks good to me, but also adding Wei (Linux netback + BSD stuff) and > Roger (BSD stuff) for their perspective. > > I should probably have done that for the last set of netif.h changes too, > since apart from the nominal maintainers of xen/include/public/io/*.h it's > worth getting input from the maintainers of the consumers. Not sure we can > express that very well in MAINTAINERS :-(. I'm going to leave this one to Wei, he has more experience than me regarding FreeBSD netfront (and xen-net related topics). FWIW, a quick and dirty grep on FreeBSD netfront doesn't show any results for "request-multicast-control", so I guess it's not implemented. Roger. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] Xen PV PTE ABI (or lack thereof)
On 21/01/16 14:37, Andrew Cooper wrote: >> I think we should document the PV ABI as-is. i.e., that these two PTE >> bits are not available for PV guest use. > > This is already known to break some guests. It shouldn't stay as it > currently is. It's unfortunate that the current ABI is incompatible with some guests in some situations, but it can't be fixed without adding a mechanism for the guest to query or negotiate PTE bit availability since existing guests have to be able to run on older hypervisors. David ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v4 3/3] VT-d: Fix vt-d Device-TLB flush timeout issue.
> On January 20, 2016 at 7:29 pm,wrote: > >>> On 20.01.16 at 11:26, wrote: > >> On January 15, 2016 at 9:10, wrote: > >> >>> On 23.12.15 at 09:25, wrote: > >> > @@ -229,6 +239,63 @@ int qinval_device_iotlb(struct iommu *iommu, > >> > return 0; > >> > } > >> > > >> > +static void dev_invalidate_iotlb_timeout(struct iommu *iommu, u16 did, > >> > + u16 seg, u8 bus, u8 > >> > +devfn) > >> { > >> > +struct domain *d; > >> > +struct pci_dev *pdev; > >> > + > >> > +d = rcu_lock_domain_by_id(iommu->domid_map[did]); > >> > +ASSERT(d); > >> > >> Don't you need to acquire some lock in order to safely assert this? > > > > Referred to the other use case of 'rcu_lock_domain_by_id()', Xen > > checks whether The domain is 'NULL'. > > Could I also replace the 'ASSERT(d)' with > > + If ( d == NULL ) > > + return; > > ? > > At a first glance this doesn't look right, but in the end that's something > you need > to answer. > Is it quite similar to whether the domain has been destroyed when Device-TLB is flushing? Correct me if I still doesn't get you point.. > >> Also note that unused slots hold zero, i.e. there's at least a > >> theoretical > > risk of > >> problems here when you don't also look at > >> iommu->domid_bitmap. > >> > > I am not clear how to fix this point. Do you have good idea? > > Add a lock to 'iommu->domid_bitmap'? > > How would a lock help avoiding mistaking an unused slot to mean Dom0? As > already suggested - I think you simply need to consult the bitmap along with > the > domain ID array. > Try to get domain id with iommu->domid_map[did] ? > >> > +{ > >> > +list_del(>domain_list); > >> > >> This should happen under pcidevs_lock - you need to either acquire it > >> or > >> ASSERT() it being held. > >> > > > > I should check whether it is under pcidevs_lock -- with > > spin_is_locked(_lock) > > If it is not under pcidevs_lock, I should acquire it. > > I think ASSERT() is not a good idea. Hypervisor acquires this lock and > > then remove the resource. > > I don't understand this last sentence. > For example: in pci_remove_device() { ... spin_lock(_lock); .. iommu_remove_device().. .. spin_unlock(_lock); } Device-TLB is maybe flush error in iommu_remove_device().. Then it is under pcidevs_lock.. In this case, I need to check whether it is under pcidevs_lock. If not, I need to acquire the pcidevs_lock. > >> > @@ -349,9 +416,18 @@ static int flush_iotlb_qi( > >> > queue_invalidate_iotlb(iommu, > >> > type >> > >> DMA_TLB_FLUSH_GRANU_OFFSET, dr, > >> > dw, did, size_order, 0, addr); > >> > + > >> > +/* > >> > + * Synchronize with hardware for invalidation request > descriptors > >> > + * submitted before Device-TLB invalidate descriptor. > >> > + */ > >> > +rc = invalidate_sync(iommu); > >> > +if ( rc ) > >> > + return rc; > >> > + > >> > if ( flush_dev_iotlb ) > >> > ret = dev_invalidate_iotlb(iommu, did, addr, size_order, > type); > >> > -rc = invalidate_sync(iommu); > >> > + > >> > if ( !ret ) > >> > ret = rc; > >> > } > >> > >> This change is because of ...? > >> > > As Kevin's comments, > > I add 'dev_invalidate_iotlb_sync()' to dev_invalidate_iotlb(). > > Then i can know which device is flush timeout. > > I don't understand: How is your reply related to you moving up the invocation > of > invalidate_sync()? This invalidate_sync() is to synchronize with hardware for invalidation request descriptors submitted before Device-TLB invalidate descriptor. If I don't add this invalidate_sync().. dev_invalidate_iotlb_timeout() is not clear whether the flush time out is about Device-TLB flush error or the other previous iotlb/iec/context flush error. > > >> Plus logging the error code and device coordinates might turn out > >> helpful in such cases. But first of all - if there was a timeout, > >> we'd make it here only for Dom0. Perhaps the printing should move into an > else to the domain_crash()? > >> And if there was another error, the message would be outright wrong. > >> > > IMO, the print for Dom0 is enough. > > I think it is not a good idea to move into an else to domain_crash(). > > Domain_crash is quite a common part. > > Anyway I can improve it in low priority. > > At the very least the message should not end up being actively misleading. > What about the below? +printk(XENLOG_ERR + "Device %04x:%02x:%02x is flush error.", + seg, bus, devfn); Quan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH LIBVIRT v2] libxl: Support cmdline= in xl config files
Ian Campbell wrote: > On Wed, 2016-01-20 at 11:16 -0700, Jim Fehlig wrote: >> Ian Campbell wrote: >>> ... and consolidate the cmdline/extra/root parsing to facilitate doing >>> so. >>> >>> The logic is the same as xl's parse_cmdline from the current xen.git >>> master >>> branch (e6f0e099d2c17de47fd86e817b1998db903cab61). >>> >>> In order to introduce a use of VIR_WARN for logging I had to add >>> virerror.h and VIR_LOG_INIT. >>> >>> Signed-off-by: Ian Campbell>>> --- >>> NB: I was unsure if I was supposed to change VIR_FROM_NONE into >>> VIR_FROM_XEN, so I didn't (but will on advice). >> It seems some of the VIR_FROM_ needs cleanup throughout the various >> Xen-related >> files. We already have VIR_FROM_SEXPR and VIR_FROM_XENXM. >> src/xenconfig/xen_sxpr.c should use the former, src/xenconfig/xen_xm.c the >> latter. And given the pattern, I think we should introduce VIR_FROM_XENXL to >> cover xl.cfg related code. I can take care of this. > > I've acked you patches about this. Thanks. I'll push those trivial patches shortly. > >>> +static int xenParseCmdline(virConfPtr conf, char **r_cmdline) >>> +{ >>> +char *cmdline; >> One too many initializers removed :-). >> >>> +const char *root, *extra, *buf; >>> + >>> +if (xenConfigGetString(conf, "cmdline", , NULL) < 0) >>> +return -1; >>> + >>> +if (xenConfigGetString(conf, "root", , NULL) < 0) >>> +return -1; >>> + >>> +if (xenConfigGetString(conf, "extra", , NULL) < 0) >>> +return -1; >>> + >>> +if (buf) { >>> +if (VIR_STRDUP(cmdline, buf) < 0) >>> +return -1; >>> +if (root || extra) >>> +VIR_WARN("ignoring root= and extra= in favour of >>> cmdline="); >>> +} else { >>> +if (root && extra) { >>> +if (virAsprintf(, "root=%s %s", root, extra) < 0) >>> +return -1; >>> +} else if (root) { >>> +if (virAsprintf(, "root=%s", root) < 0) >>> +return -1; >>> +} else if (extra) { >>> +if (VIR_STRDUP(cmdline, extra) < 0) >>> +return -1; >>> +} >>> +} >>> + >>> +*r_cmdline = cmdline; >> If none of cmdline, extra, or root are set in the config, def->os.cmdline >> gets >> set to garbage. xlconfigtest explodes when running 'make check'. > > I even thought about this quite carefully but missed this case :-/ > > Would you prefer and = NULL on the decl or an else cmdline = NULL at the > end of that chain? 'char *cmdline = NULL;' is fine. > >> Sorry for not mentioning it in my previous review, but we should add a >> test for cmdline, root, and extra. > > Ack, will do so for v3. Thanks! Regards, Jim ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH v4 0/6] HVMlite: DomU fixes and a Dom0 preparatory patch
Hello, The series is sorted in the following way: - Patch 1 is a preparatory patch for Dom0 HVMlite support. - Patch 4 is a fix from a fallout introduced by the HVMlite series, which inadvertently disabled the emulated PIT that was added to PV(H) guests. - Patches 2, 3 and 5 expand the fields of the libxl_domain_build_info structure in order to store what emulated devices inside of Xen are enabled. - Patch 6 is new in this iteration and aims to provide a simple way to report to the guest which emulated devices are enabled. I belive this is the last part of the guest ABI that I wanted to do (if we leave pci-passthrough out of the picture). IMHO, the DomU ABI is now complete. The full series can also be found in my git tree: http://xenbits.xen.org/gitweb/?p=people/royger/xen.git;a=shortlog;h=refs/heads/hvmlite_fixes_v4 I hope this will help review of patch 1/6, which is a significant rework, and will probably be better reviewed by looking at the resulting code. Thanks, Roger. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH v4 3/6] libxl: initialise the build info before calling prepare_config
libxl__arch_domain_prepare_config has access to the libxl_domain_build_info struct, so make sure it's properly initialised. Note that prepare_config is called from within libxl__domain_make. This is not a bug at the moment, because prepare_config doesn't access libxl_domain_build_info yet, but this is likely going to change. Signed-off-by: Roger Pau Monné--- Cc: Ian Jackson Cc: Ian Campbell Cc: Wei Liu --- Changes since v3: - Reword commit message to make it clear that prepare_config is called from domain_make, and that the fact that build_info is not initialised is not a bug ATM because prepare_config doesn't make use of it. --- tools/libxl/libxl_create.c | 12 ++-- 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c index b669359..c7700a7 100644 --- a/tools/libxl/libxl_create.c +++ b/tools/libxl/libxl_create.c @@ -892,6 +892,12 @@ static void initiate_domain_create(libxl__egc *egc, goto error_out; } +ret = libxl__domain_build_info_setdefault(gc, _config->b_info); +if (ret) { +LOG(ERROR, "Unable to set domain build info defaults"); +goto error_out; +} + ret = libxl__domain_make(gc, d_config, , >config); if (ret) { LOG(ERROR, "cannot make domain: %d", ret); @@ -903,12 +909,6 @@ static void initiate_domain_create(libxl__egc *egc, dcs->guest_domid = domid; dcs->dmss.dm.guest_domid = 0; /* means we haven't spawned */ -ret = libxl__domain_build_info_setdefault(gc, _config->b_info); -if (ret) { -LOG(ERROR, "Unable to set domain build info defaults"); -goto error_out; -} - if (d_config->c_info.type == LIBXL_DOMAIN_TYPE_HVM && (libxl_defbool_val(d_config->b_info.u.hvm.nested_hvm) && libxl_defbool_val(d_config->b_info.u.hvm.altp2m))) { -- 1.9.5 (Apple Git-50.3) ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH RFC 20/31] x86: Improvements to in-hypervisor cpuid sanity checks
On 21/01/16 17:02, Jan Beulich wrote: On 16.12.15 at 22:24,wrote: >> @@ -864,69 +865,27 @@ void pv_cpuid(struct cpu_user_regs *regs) >> >> cpuid_count(a, c, , , , ); >> >> -if ( (regs->eax & 0x7fff) == 0x0001 ) >> -{ >> -/* Modify Feature Information. */ >> -if ( !cpu_has_apic ) >> -__clear_bit(X86_FEATURE_APIC, ); >> - >> -if ( !is_pvh_domain(currd) ) >> -{ >> -__clear_bit(X86_FEATURE_PSE, ); >> -__clear_bit(X86_FEATURE_PGE, ); >> -__clear_bit(X86_FEATURE_PSE36, ); >> -__clear_bit(X86_FEATURE_VME, ); >> -} >> -} > This I understand goes away because pv_featureset[] never has > those set? > >> case 0x8001: >> -/* Modify Feature Information. */ >> -if ( is_pv_32bit_domain(currd) ) >> -{ >> -__clear_bit(X86_FEATURE_LM % 32, ); >> -__clear_bit(X86_FEATURE_LAHF_LM % 32, ); >> -} >> -if ( is_pv_32bit_domain(currd) && >> - boot_cpu_data.x86_vendor != X86_VENDOR_AMD ) >> -__clear_bit(X86_FEATURE_SYSCALL % 32, ); > But what about these 32-bit specific removals? LM, from the deep feature dependency removal in libxc, when it is known that the domain is 32bit. For SYSCALL, as far as I can tell, the logic is wrong. 32bit compat mode code on Intel can use SYSCALL, as Xen is running in Long mode. (This is opposite to the AMD case where 32bit compat code cannot use SYSENTER, because Xen is in Long mode.) > Overall this of course makes things quite a bit more readable. And there is more to come. By the time my cpuid phase 2 plans are complete, all validitiy checks will be done at the set_cpuid_policy hypercall boundary, meaning that all these time-of-use checks can be dropped. ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH RFC 23/31] xen/x86: Export cpuid levelling capabilities via SYSCTL
>>> On 16.12.15 at 22:24,wrote: > --- a/xen/arch/x86/cpu/common.c > +++ b/xen/arch/x86/cpu/common.c > @@ -32,6 +32,9 @@ integer_param("cpuid_mask_ext_ecx", opt_cpuid_mask_ext_ecx); > unsigned int __devinitdata opt_cpuid_mask_ext_edx = ~0u; > integer_param("cpuid_mask_ext_edx", opt_cpuid_mask_ext_edx); > > +unsigned int __initdata expected_levelling_cap; Unused variable. > +unsigned int __read_mostly levelling_caps; Never written to variable. > +/* > + * XEN_SYSCTL_get_levelling_caps (x86 specific) > + * > + * Return hardware capabilities concerning masking or faulting of the cpuid > + * instruction for PV guests. > + */ > +struct xen_sysctl_levelling_caps { > +/* > + * Featureset array index encoding: > + * - (possibly) 'e' Extended > + * - leaf, uppercase hex > + * - register, lowercase > + * - (possibly) subleaf > + */ > +#define XEN_SYSCTL_LEVELCAP_faulting (1ul << 0) /* CPUID faulting*/ > +#define XEN_SYSCTL_LEVELCAP_1c (1ul << 1) /* 0x0001.ecx*/ > +#define XEN_SYSCTL_LEVELCAP_1d (1ul << 2) /* 0x0001.edx*/ > +#define XEN_SYSCTL_LEVELCAP_e1c (1ul << 3) /* 0x8001.ecx*/ > +#define XEN_SYSCTL_LEVELCAP_e1d (1ul << 4) /* 0x8001.edx*/ > +#define XEN_SYSCTL_LEVELCAP_Da1 (1ul << 5) /* 0x000D:1.eax */ > +#define XEN_SYSCTL_LEVELCAP_6c (1ul << 6) /* 0x0006.ecx*/ > +#define XEN_SYSCTL_LEVELCAP_7a0 (1ul << 7) /* 0x0007:0.eax */ > +#define XEN_SYSCTL_LEVELCAP_7b0 (1ul << 8) /* 0x0007:0.ebx */ Similar comment apply to the naming here as I had given for the feature sysctl. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [BUG] EDAC infomation partially missing
>>> On 20.01.16 at 16:01,wrote: > Initially reported to debian > (http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=810964), redirected here: > > With AMD Opteron 6xxx processors, half of the memory controllers are > missing from /sys/devices/system/edac/mc > Checked with single 6120 (dual memory controller) and twin 6344 (2x dual > MC), other dual-module CPUs might be affected too. > > Booting plain Linux (3.2, 3.16, 4.1, 4.3), all memory controllers are > listed under /sys/devices/system/edac/mc as expected. Same happens, when > Xen 4.1 is used: all MCs present. > > Starting with Xen 4.4 (Debian Jessie), only mc1 (on the single CPU > machine) or mc2/mc3 (dual CPU machine) are present, although the full > system memory is accessible. Checked versions were 4.1.4 (Debian > Wheezy), 4.4.1 (Jessie) and 4.6.0 (Sid) As already indicated by Ian in that bug, you should supply us with full kernel and hypervisor logs for both the good and bad cases (ideally with the same kernel version use in both runs, so that we can exclude kernel behavior differences). Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [linux-next test] 78603: regressions - FAIL
flight 78603 linux-next real [real] http://logs.test-lab.xenproject.org/osstest/logs/78603/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: test-amd64-amd64-xl-credit2 15 guest-localmigratefail REGR. vs. 78487 test-amd64-amd64-xl-xsm 15 guest-localmigratefail REGR. vs. 78487 test-amd64-amd64-xl 15 guest-localmigratefail REGR. vs. 78487 test-amd64-amd64-xl-multivcpu 15 guest-localmigrate fail REGR. vs. 78487 test-amd64-i386-xl-xsm 15 guest-localmigratefail REGR. vs. 78487 test-amd64-i386-xl 15 guest-localmigratefail REGR. vs. 78487 test-amd64-amd64-pair 22 guest-migrate/dst_host/src_host fail REGR. vs. 78487 test-amd64-i386-pair 22 guest-migrate/dst_host/src_host fail REGR. vs. 78487 Regressions which are regarded as allowable (not blocking): test-amd64-i386-libvirt-xsm 15 guest-saverestore.2 fail REGR. vs. 78487 test-amd64-amd64-libvirt-xsm 15 guest-saverestore.2 fail REGR. vs. 78487 test-amd64-amd64-libvirt-pair 22 guest-migrate/dst_host/src_host fail REGR. vs. 78487 test-amd64-i386-libvirt 15 guest-saverestore.2 fail REGR. vs. 78487 test-amd64-amd64-xl-rtds 15 guest-localmigratefail REGR. vs. 78487 test-amd64-i386-libvirt-pair 22 guest-migrate/dst_host/src_host fail REGR. vs. 78487 test-amd64-amd64-libvirt 15 guest-saverestore.2 fail REGR. vs. 78487 test-armhf-armhf-libvirt-raw 9 debian-di-install fail REGR. vs. 78487 test-armhf-armhf-xl-cubietruck 15 guest-start/debian.repeat fail blocked in 78487 test-armhf-armhf-xl-credit2 15 guest-start/debian.repeat fail blocked in 78487 test-amd64-i386-rumpuserxen-i386 10 guest-startfail like 78487 test-armhf-armhf-xl-rtds 15 guest-start/debian.repeatfail like 78487 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop fail like 78487 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop fail like 78487 test-amd64-i386-xl-qemut-win7-amd64 16 guest-stop fail like 78487 Tests which did not succeed, but are not blocking: test-amd64-amd64-xl-pvh-intel 14 guest-saverestorefail never pass test-amd64-amd64-xl-pvh-amd 11 guest-start fail never pass test-amd64-i386-libvirt-xsm 12 migrate-support-checkfail never pass test-amd64-amd64-libvirt-xsm 12 migrate-support-checkfail never pass test-amd64-i386-libvirt 12 migrate-support-checkfail never pass test-armhf-armhf-xl 12 migrate-support-checkfail never pass test-armhf-armhf-xl 13 saverestore-support-checkfail never pass test-armhf-armhf-libvirt-qcow2 11 migrate-support-checkfail never pass test-armhf-armhf-libvirt-qcow2 13 guest-saverestorefail never pass test-armhf-armhf-xl-arndale 12 migrate-support-checkfail never pass test-armhf-armhf-xl-arndale 13 saverestore-support-checkfail never pass test-amd64-amd64-qemuu-nested-intel 13 xen-boot/l1 fail never pass test-armhf-armhf-xl-xsm 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-xsm 12 migrate-support-checkfail never pass test-armhf-armhf-libvirt 14 guest-saverestorefail never pass test-armhf-armhf-libvirt 12 migrate-support-checkfail never pass test-armhf-armhf-xl-multivcpu 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-multivcpu 12 migrate-support-checkfail never pass test-armhf-armhf-libvirt-xsm 12 migrate-support-checkfail never pass test-armhf-armhf-libvirt-xsm 14 guest-saverestorefail never pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass test-armhf-armhf-xl-rtds 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-rtds 12 migrate-support-checkfail never pass test-armhf-armhf-xl-cubietruck 12 migrate-support-checkfail never pass test-armhf-armhf-xl-cubietruck 13 saverestore-support-checkfail never pass test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stop fail never pass test-armhf-armhf-xl-credit2 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-credit2 12 migrate-support-checkfail never pass test-amd64-amd64-qemuu-nested-amd 13 xen-boot/l1 fail never pass test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass test-amd64-amd64-libvirt 12 migrate-support-checkfail never pass test-amd64-amd64-libvirt-vhd 11 migrate-support-checkfail never pass test-armhf-armhf-xl-vhd 11 migrate-support-checkfail never pass test-armhf-armhf-xl-vhd 12 saverestore-support-checkfail never pass version targeted for testing: linux
Re: [Xen-devel] [PATCH V13 5/5] xl: add pvusb commands
Wei Liu writes ("Re: [Xen-devel] [PATCH V13 5/5] xl: add pvusb commands"): > Or rewrite every adhoc parser with bison/flex or Ocaml / Haskell parsec. I endorse this product and/or service. > Jokes aside. I will see what I can do. Consolidating them into helper > functions is a good start. Yes :-). Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [RFC V2] xen: interface: introduce pvclk interface
On Thu, 21 Jan 2016, Ian Campbell wrote: > On Thu, 2016-01-21 at 12:55 +, Stefano Stabellini wrote: > > On Thu, 21 Jan 2016, Peng Fan wrote: > > > Hi Ian, > > > > > > On Thu, Jan 21, 2016 at 12:26:04PM +, Ian Campbell wrote: > > > > On Thu, 2016-01-21 at 19:55 +0800, Peng Fan wrote: > > > > > Hi Ian, > > > > > > > > > > On Thu, Jan 21, 2016 at 10:19:32AM +, Ian Campbell wrote: > > > > > > On Thu, 2016-01-21 at 16:59 +0800, Peng Fan wrote: > > > > > > > > > > > > > > To platform device of ARM, hypervisor is responsible for the > > > > > > > mapping > > > > > > > between machine address and guest physical address, also > > > > > > > responsible > > > > > > > for the irq mapping. > > > > > > > > > > > > > > But to embedded ARM SoC, the hardware clk IP is handled in > > > > > > > Dom0. > > > > > > > > > > > > Arguably Xen ought to be the one to do this, but we have > > > > > > determined > > > > > > (rightly, I think) that doing so for the entirely clk tree of an > > > > > > SoC > > > > > > would > > > > > > involve importing an unmanageable amount of code into Xen[0]. > > > > > > > > > > > > In the meantime we defer this to Dom0. > > > > > > > > > > > > I wonder though if it would be possible to manage the clocks for > > > > > > a > > > > > > passthrough device from the toolstack, i.e. is there a sysfs node > > > > > > where > > > > > > one > > > > > > can say "keep the clock for this device enabled (at xMhz) even > > > > > > though > > > > > > you > > > > > > think the device is unused"? > > > > > > > > > > I am afraid not. The linux device driver without xen can work well, > > > > > because > > > > > there is clk tree and "clk_get,clk_prepare,clk_set_rate" work well > > > > > in the > > > > > driver. > > > > > I do not want to remove the clk apis usage from the device driver > > > > > for > > > > > xen, because the driver > > > > > also serves when no xen support. The pvclk interface is mainly to > > > > > let the > > > > > clk api can work as no xen. > > > > > > > > Would adding a dummy fixed-clock[0] (or several of them) to the guest > > > > passthrough DT satisfy the driver's requirements? They would be > > > > hardcoded > > > > to whatever rate dom0 and/or the tools has decided upon (and had set > > > > in the > > > > real h/w). > > > > > > If using this way, we have the assumption that DomU device driver would > > > not > > > change the rate of the clock driving the device. I am not sure whether > > > this is > > > ok for so many platform devices based ARM core. > > > > > > In /sys/kernel/debug/clk/, there are clk tree info, but > > > clk api are not exposed to userspace as far as I know, so > > > if using sysfs interface to set a known fixed rate or enable/disable > > > the clock, > > > we need to expose the clk info to userspace. > > > > Keeping in mind that we might now want to let DomU change the actual > > s/now/not/? Sorry, I meant "we might not want to let DomU change" ... > > clock frequency anyway, exposing a dummy clock to keep the DomU driver > > happy might be a good option. > > > > However whether we set the frequency from the Dom0 kernel or from the > > toolstack, how do we find out the right clock rate? From a grep in the > > kernel sources, it looks like some drivers still have hardcoded values. > > > > Asking the user to provide the clock rate seems a bit too much to me. > > Remember that for this use case they already need to provide the host DT > path to the device, the list of iomem resources, the list of irqs and a > suitable device tree snippet. > > platform device passthrough is very much an "expert" option (intended for > folks building embedded device, not really for normal end users), which > already needs a fair bit of familiarity with the system in question, I > don't think providing some info about the clocks goes too much further than > this. > > Seamless & easy passthrough is something we are aiming for for PCI, but not > for the general case of platform passthrough. > > Making platform device passthrough as easy as PCI passthrough is much > harder and far reaching than avoiding the need to specify some clock > frequencies (remember we had very long discussions about the design and > couldn't come up with anything especially feasible). I agree that passthrough of non-PCI devices is a feature for "experts" and we don't have many good solutions to this problem. But one thing is asking for a device tree snippet, another is asking for the clock frequency: as an expert user, finding out the device tree node can be done by reading the device tree, the docs or googling for it, but how is a user supposed to find out the clock frequency? I don't think that asking expert users to grep for values in the Linux source tree should be an option. Is the frequency at least supposed to be documented in the hardware manual? Should we have a list of known devices and frequencies in libxl? Maybe not, but we could consider giving users the option to allow the DomU to
Re: [Xen-devel] [PATCH RFC 21/31] x86/domctl: Break out logic to update domain state from cpuid information
>>> On 16.12.15 at 22:24,wrote: > Later changes will add to this logic. > > Signed-off-by: Andrew Cooper Reviewed-by: Jan Beulich And actually looks like this is pretty independent of the earlier patches, and hence could go in right away? Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PULL 11/11] Xen PCI passthru: convert to realize()
From: Cao jinSigned-off-by: Cao jin Reviewed-by: Eric Blake Reviewed-by: Stefano Stabellini --- hw/xen/xen_pt.c | 53 - 1 file changed, 28 insertions(+), 25 deletions(-) diff --git a/hw/xen/xen_pt.c b/hw/xen/xen_pt.c index 9eef3df..d33221b 100644 --- a/hw/xen/xen_pt.c +++ b/hw/xen/xen_pt.c @@ -760,10 +760,10 @@ static void xen_pt_destroy(PCIDevice *d) { } /* init */ -static int xen_pt_initfn(PCIDevice *d) +static void xen_pt_realize(PCIDevice *d, Error **errp) { XenPCIPassthroughState *s = XEN_PT_DEVICE(d); -int rc = 0; +int i, rc = 0; uint8_t machine_irq = 0, scratch; uint16_t cmd = 0; int pirq = XEN_PT_UNASSIGNED_PIRQ; @@ -781,8 +781,8 @@ static int xen_pt_initfn(PCIDevice *d) ); if (err) { error_append_hint(, "Failed to \"open\" the real pci device"); -error_report_err(err); -return -1; +error_propagate(errp, err); +return; } s->is_virtfn = s->real_device.is_virtfn; @@ -802,19 +802,19 @@ static int xen_pt_initfn(PCIDevice *d) if ((s->real_device.domain == 0) && (s->real_device.bus == 0) && (s->real_device.dev == 2) && (s->real_device.func == 0)) { if (!is_igd_vga_passthrough(>real_device)) { -XEN_PT_ERR(d, "Need to enable igd-passthru if you're trying" - " to passthrough IGD GFX.\n"); +error_setg(errp, "Need to enable igd-passthru if you're trying" +" to passthrough IGD GFX"); xen_host_pci_device_put(>real_device); -return -1; +return; } xen_pt_setup_vga(s, >real_device, ); if (err) { error_append_hint(, "Setup VGA BIOS of passthrough" " GFX failed"); -error_report_err(err); +error_propagate(errp, err); xen_host_pci_device_put(>real_device); -return -1; +return; } /* Register ISA bridge for passthrough GFX. */ @@ -836,20 +836,19 @@ static int xen_pt_initfn(PCIDevice *d) /* Bind interrupt */ rc = xen_host_pci_get_byte(>real_device, PCI_INTERRUPT_PIN, ); if (rc) { -XEN_PT_ERR(d, "Failed to read PCI_INTERRUPT_PIN! (rc:%d)\n", rc); +error_setg_errno(errp, errno, "Failed to read PCI_INTERRUPT_PIN"); goto err_out; } if (!scratch) { -XEN_PT_LOG(d, "no pin interrupt\n"); +error_setg(errp, "no pin interrupt"); goto out; } machine_irq = s->real_device.irq; rc = xc_physdev_map_pirq(xen_xc, xen_domid, machine_irq, ); - if (rc < 0) { -XEN_PT_ERR(d, "Mapping machine irq %u to pirq %i failed, (err: %d)\n", - machine_irq, pirq, errno); +error_setg_errno(errp, errno, "Mapping machine irq %u to" + " pirq %i failed", machine_irq, pirq); /* Disable PCI intx assertion (turn on bit10 of devctl) */ cmd |= PCI_COMMAND_INTX_DISABLE; @@ -870,8 +869,8 @@ static int xen_pt_initfn(PCIDevice *d) PCI_SLOT(d->devfn), e_intx); if (rc < 0) { -XEN_PT_ERR(d, "Binding of interrupt %i failed! (err: %d)\n", - e_intx, errno); +error_setg_errno(errp, errno, "Binding of interrupt %u failed", + e_intx); /* Disable PCI intx assertion (turn on bit10 of devctl) */ cmd |= PCI_COMMAND_INTX_DISABLE; @@ -879,8 +878,8 @@ static int xen_pt_initfn(PCIDevice *d) if (xen_pt_mapped_machine_irq[machine_irq] == 0) { if (xc_physdev_unmap_pirq(xen_xc, xen_domid, machine_irq)) { -XEN_PT_ERR(d, "Unmapping of machine interrupt %i failed!" - " (err: %d)\n", machine_irq, errno); +error_setg_errno(errp, errno, "Unmapping of machine" +" interrupt %u failed", machine_irq); } } s->machine_irq = 0; @@ -893,14 +892,14 @@ out: rc = xen_host_pci_get_word(>real_device, PCI_COMMAND, ); if (rc) { -XEN_PT_ERR(d, "Failed to read PCI_COMMAND! (rc: %d)\n", rc); +error_setg_errno(errp, errno, "Failed to read PCI_COMMAND"); goto err_out; } else { val |= cmd; rc = xen_host_pci_set_word(>real_device, PCI_COMMAND, val); if (rc) { -XEN_PT_ERR(d, "Failed to write PCI_COMMAND val=0x%x!(rc: %d)\n", - val, rc); +error_setg_errno(errp, errno, "Failed to write PCI_COMMAND" + " val = 0x%x", val); goto err_out;
[Xen-devel] [libvirt test] 78659: tolerable FAIL - PUSHED
flight 78659 libvirt real [real] http://logs.test-lab.xenproject.org/osstest/logs/78659/ Failures :-/ but no regressions. Tests which did not succeed, but are not blocking: test-armhf-armhf-libvirt-xsm 12 migrate-support-checkfail never pass test-armhf-armhf-libvirt-xsm 14 guest-saverestorefail never pass test-armhf-armhf-libvirt-raw 13 guest-saverestorefail never pass test-armhf-armhf-libvirt-raw 11 migrate-support-checkfail never pass test-amd64-i386-libvirt-xsm 12 migrate-support-checkfail never pass test-armhf-armhf-libvirt 14 guest-saverestorefail never pass test-armhf-armhf-libvirt 12 migrate-support-checkfail never pass test-amd64-amd64-libvirt 12 migrate-support-checkfail never pass test-amd64-i386-libvirt 12 migrate-support-checkfail never pass test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass test-amd64-amd64-libvirt-xsm 12 migrate-support-checkfail never pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass test-armhf-armhf-libvirt-qcow2 11 migrate-support-checkfail never pass test-armhf-armhf-libvirt-qcow2 13 guest-saverestorefail never pass test-amd64-amd64-libvirt-vhd 11 migrate-support-checkfail never pass version targeted for testing: libvirt a6cfd22eba5f204deaf0fde3bd3ca96474d77c55 baseline version: libvirt 020135dc855712d09b92a1c7ba3341cd22010f18 Last test of basis78582 2016-01-20 04:21:30 Z1 days Testing same since78659 2016-01-21 04:21:11 Z0 days1 attempts People who touched revisions under test: Cole RobinsonDaniel P. Berrange Ján Tomko jobs: build-amd64-xsm pass build-armhf-xsm pass build-i386-xsm pass build-amd64 pass build-armhf pass build-i386 pass build-amd64-libvirt pass build-armhf-libvirt pass build-i386-libvirt pass build-amd64-pvopspass build-armhf-pvopspass build-i386-pvops pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm pass test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsmpass test-amd64-amd64-libvirt-xsm pass test-armhf-armhf-libvirt-xsm fail test-amd64-i386-libvirt-xsm pass test-amd64-amd64-libvirt pass test-armhf-armhf-libvirt fail test-amd64-i386-libvirt pass test-amd64-amd64-libvirt-pairpass test-amd64-i386-libvirt-pair pass test-armhf-armhf-libvirt-qcow2 fail test-armhf-armhf-libvirt-raw fail test-amd64-amd64-libvirt-vhd pass sg-report-flight on osstest.test-lab.xenproject.org logs: /home/logs/logs images: /home/logs/images Logs, config files, etc. are available at http://logs.test-lab.xenproject.org/osstest/logs Explanation of these reports, and of osstest in general, is at http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master Test harness code can be found at http://xenbits.xen.org/gitweb?p=osstest.git;a=summary Pushing revision : + branch=libvirt + revision=a6cfd22eba5f204deaf0fde3bd3ca96474d77c55 + . ./cri-lock-repos ++ . ./cri-common +++ . ./cri-getconfig +++ umask 002 +++ getrepos getconfig Repos perl -e ' use Osstest; readglobalconfig(); print $c{"Repos"} or die $!; ' +++ local repos=/home/osstest/repos +++ '[' -z /home/osstest/repos ']' +++ '[' '!' -d /home/osstest/repos ']' +++ echo /home/osstest/repos ++ repos=/home/osstest/repos ++ repos_lock=/home/osstest/repos/lock ++ '[' x '!=' x/home/osstest/repos/lock ']' ++ OSSTEST_REPOS_LOCK_LOCKED=/home/osstest/repos/lock ++ exec with-lock-ex -w /home/osstest/repos/lock ./ap-push libvirt a6cfd22eba5f204deaf0fde3bd3ca96474d77c55 + branch=libvirt +
Re: [Xen-devel] [PATCH v4 3/3] VT-d: Fix vt-d Device-TLB flush timeout issue.
>>> On 21.01.16 at 17:16,wrote: >> On January 20, 2016 at 7:29 pm, wrote: >> >>> On 20.01.16 at 11:26, wrote: >> >> On January 15, 2016 at 9:10, wrote: >> >> >>> On 23.12.15 at 09:25, wrote: >> >> > @@ -229,6 +239,63 @@ int qinval_device_iotlb(struct iommu *iommu, >> >> > return 0; >> >> > } >> >> > >> >> > +static void dev_invalidate_iotlb_timeout(struct iommu *iommu, u16 did, >> >> > + u16 seg, u8 bus, u8 >> >> > +devfn) >> >> { >> >> > +struct domain *d; >> >> > +struct pci_dev *pdev; >> >> > + >> >> > +d = rcu_lock_domain_by_id(iommu->domid_map[did]); >> >> > +ASSERT(d); >> >> >> >> Don't you need to acquire some lock in order to safely assert this? >> > >> > Referred to the other use case of 'rcu_lock_domain_by_id()', Xen >> > checks whether The domain is 'NULL'. >> > Could I also replace the 'ASSERT(d)' with >> > + If ( d == NULL ) >> > + return; >> > ? >> >> At a first glance this doesn't look right, but in the end that's something > you need >> to answer. >> > > Is it quite similar to whether the domain has been destroyed when Device-TLB > is flushing? Going back to the original question I raised, you need to ask yourself: Can d validly be NULL when you get here (e.g. due to some other processor changing something in parallel)? If yes, you can't ASSERT(), and the next question then is what exactly it means that it got ripped off the table. The answer to that determines whether simply returning would be okay. >> >> Also note that unused slots hold zero, i.e. there's at least a >> >> theoretical >> > risk of >> >> problems here when you don't also look at >> >> iommu->domid_bitmap. >> >> >> > I am not clear how to fix this point. Do you have good idea? >> > Add a lock to 'iommu->domid_bitmap'? >> >> How would a lock help avoiding mistaking an unused slot to mean Dom0? As >> already suggested - I think you simply need to consult the bitmap along with > the >> domain ID array. >> > > Try to get domain id with iommu->domid_map[did] ? ??? >> >> > +{ >> >> > +list_del(>domain_list); >> >> >> >> This should happen under pcidevs_lock - you need to either acquire it >> >> or >> >> ASSERT() it being held. >> >> >> > >> > I should check whether it is under pcidevs_lock -- with >> > spin_is_locked(_lock) >> > If it is not under pcidevs_lock, I should acquire it. >> > I think ASSERT() is not a good idea. Hypervisor acquires this lock and >> > then remove the resource. >> >> I don't understand this last sentence. >> > For example: in > pci_remove_device() > { > ... > spin_lock(_lock); > .. > iommu_remove_device().. > .. > spin_unlock(_lock); > } > > Device-TLB is maybe flush error in iommu_remove_device().. > Then it is under pcidevs_lock.. > In this case, I need to check whether it is under pcidevs_lock. > If not, I need to acquire the pcidevs_lock. Ah, okay. But you can't use spin_is_locked() for that purpose. >> >> > @@ -349,9 +416,18 @@ static int flush_iotlb_qi( >> >> > queue_invalidate_iotlb(iommu, >> >> > type >> >> >> DMA_TLB_FLUSH_GRANU_OFFSET, dr, >> >> > dw, did, size_order, 0, addr); >> >> > + >> >> > +/* >> >> > + * Synchronize with hardware for invalidation request >> descriptors >> >> > + * submitted before Device-TLB invalidate descriptor. >> >> > + */ >> >> > +rc = invalidate_sync(iommu); >> >> > +if ( rc ) >> >> > + return rc; >> >> > + >> >> > if ( flush_dev_iotlb ) >> >> > ret = dev_invalidate_iotlb(iommu, did, addr, size_order, >> type); >> >> > -rc = invalidate_sync(iommu); >> >> > + >> >> > if ( !ret ) >> >> > ret = rc; >> >> > } >> >> >> >> This change is because of ...? >> >> >> > As Kevin's comments, >> > I add 'dev_invalidate_iotlb_sync()' to dev_invalidate_iotlb(). >> > Then i can know which device is flush timeout. >> >> I don't understand: How is your reply related to you moving up the > invocation of >> invalidate_sync()? > > This invalidate_sync() is to synchronize with hardware for invalidation > request descriptors > submitted before Device-TLB invalidate descriptor. > > If I don't add this invalidate_sync().. > dev_invalidate_iotlb_timeout() is not clear whether the flush time out is > about Device-TLB flush error or the other previous iotlb/iec/context flush > error. This being anything but obvious, maybe a brief comment would help? >> >> Plus logging the error code and device coordinates might turn out >> >> helpful in such cases. But first of all - if there was a timeout, >> >> we'd make it here only for Dom0. Perhaps the printing should move into an >> else to the domain_crash()? >> >> And if there was another error, the
Re: [Xen-devel] [PATCH RFC 18/31] xen/x86: Improve disabling of features which have dependencies
>>> On 16.12.15 at 22:24,wrote: > --- a/xen/arch/x86/xstate.c > +++ b/xen/arch/x86/xstate.c > @@ -278,11 +278,16 @@ unsigned int xstate_ctxt_size(u64 xcr0) > /* Collect the information of processor's extended state */ > void xstate_init(struct cpuinfo_x86 *c) > { > +static bool_t __initdata use_xsave = 1; > + > bool_t bsp = c == _cpu_data; > u32 eax, ebx, ecx, edx; > u64 feature_mask; > > -if ( boot_cpu_data.cpuid_level < XSTATE_CPUID ) > +boolean_param("xsave", use_xsave); Please make this sit next to the related variable. With that Reviewed-by: Jan Beulich ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [xen-4.6-testing test] 78618: tolerable trouble: broken/fail/pass - PUSHED
flight 78618 xen-4.6-testing real [real] http://logs.test-lab.xenproject.org/osstest/logs/78618/ Failures :-/ but no regressions. Regressions which are regarded as allowable (not blocking): test-armhf-armhf-libvirt-xsm 3 host-install(3) broken REGR. vs. 78528 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop fail REGR. vs. 78528 test-armhf-armhf-xl-rtds15 guest-start/debian.repeat fail blocked in 78528 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop fail like 78528 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stop fail like 78528 Tests which did not succeed, but are not blocking: test-amd64-amd64-xl-pvh-intel 11 guest-start fail never pass test-amd64-amd64-xl-pvh-amd 11 guest-start fail never pass test-armhf-armhf-libvirt 14 guest-saverestorefail never pass test-armhf-armhf-libvirt 12 migrate-support-checkfail never pass test-amd64-i386-libvirt-xsm 12 migrate-support-checkfail never pass test-amd64-amd64-libvirt 12 migrate-support-checkfail never pass test-amd64-amd64-qemuu-nested-amd 16 debian-hvm-install/l1/l2 fail never pass test-amd64-i386-libvirt 12 migrate-support-checkfail never pass test-amd64-amd64-libvirt-xsm 12 migrate-support-checkfail never pass test-armhf-armhf-xl-arndale 12 migrate-support-checkfail never pass test-armhf-armhf-xl-arndale 13 saverestore-support-checkfail never pass test-armhf-armhf-libvirt-qcow2 11 migrate-support-checkfail never pass test-armhf-armhf-libvirt-qcow2 13 guest-saverestorefail never pass test-armhf-armhf-xl-multivcpu 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-multivcpu 12 migrate-support-checkfail never pass test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass test-amd64-amd64-libvirt-vhd 11 migrate-support-checkfail never pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass test-armhf-armhf-xl-credit2 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-credit2 12 migrate-support-checkfail never pass test-armhf-armhf-xl 12 migrate-support-checkfail never pass test-armhf-armhf-xl 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-cubietruck 12 migrate-support-checkfail never pass test-armhf-armhf-xl-cubietruck 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-xsm 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-xsm 12 migrate-support-checkfail never pass test-armhf-armhf-xl-rtds 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-rtds 12 migrate-support-checkfail never pass test-armhf-armhf-libvirt-raw 13 guest-saverestorefail never pass test-armhf-armhf-libvirt-raw 11 migrate-support-checkfail never pass test-armhf-armhf-xl-vhd 11 migrate-support-checkfail never pass test-armhf-armhf-xl-vhd 12 saverestore-support-checkfail never pass version targeted for testing: xen 1fd615aa0108490ffc558d27627f509183cbfdaf baseline version: xen 6150df9f3f99ecbcbd9917002186d1d895b5602e Last test of basis78528 2016-01-19 14:09:10 Z2 days Testing same since78618 2016-01-20 13:17:45 Z1 days1 attempts People who touched revisions under test: Andrew CooperFeng Wu Ian Campbell Jan Beulich Kevin Tian Yang Zhang jobs: build-amd64-xsm pass build-armhf-xsm pass build-i386-xsm pass build-amd64 pass build-armhf pass build-i386 pass build-amd64-libvirt pass build-armhf-libvirt pass build-i386-libvirt pass build-amd64-prev pass build-i386-prev pass build-amd64-pvopspass build-armhf-pvopspass build-i386-pvops pass build-amd64-rumpuserxen pass build-i386-rumpuserxen
[Xen-devel] [PULL 03/11] xen-hvm: Clean up xen_hvm_init() error handling
From: Markus Armbrusterxen_hvm_init() returns -1 without cleaning up on some errors (harmless long as the caller exit()s on error), dies with hw_error() on others. hw_error() isn't approprate here. Clean up to exit() on all errors. Signed-off-by: Markus Armbruster Reviewed-by: Stefano Stabellini Signed-off-by: Stefano Stabellini --- hw/i386/pc_piix.c|5 ++--- hw/i386/pc_q35.c |5 ++--- include/hw/xen/xen.h |2 +- xen-hvm-stub.c |3 +-- xen-hvm.c| 61 ++ 5 files changed, 38 insertions(+), 38 deletions(-) diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c index df2b824..db0ae9c 100644 --- a/hw/i386/pc_piix.c +++ b/hw/i386/pc_piix.c @@ -121,9 +121,8 @@ static void pc_init1(MachineState *machine, pcms->below_4g_mem_size = machine->ram_size; } -if (xen_enabled() && xen_hvm_init(pcms, _memory) != 0) { -fprintf(stderr, "xen hardware virtual machine initialisation failed\n"); -exit(1); +if (xen_enabled()) { +xen_hvm_init(pcms, _memory); } pc_cpus_init(pcms); diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c index 412b3cd..abbcbf9 100644 --- a/hw/i386/pc_q35.c +++ b/hw/i386/pc_q35.c @@ -111,9 +111,8 @@ static void pc_q35_init(MachineState *machine) pcms->below_4g_mem_size = machine->ram_size; } -if (xen_enabled() && xen_hvm_init(pcms, _memory) != 0) { -fprintf(stderr, "xen hardware virtual machine initialisation failed\n"); -exit(1); +if (xen_enabled()) { +xen_hvm_init(pcms, _memory); } pc_cpus_init(pcms); diff --git a/include/hw/xen/xen.h b/include/hw/xen/xen.h index e90931a..d07bc99 100644 --- a/include/hw/xen/xen.h +++ b/include/hw/xen/xen.h @@ -39,7 +39,7 @@ qemu_irq *xen_interrupt_controller_init(void); void xenstore_store_pv_console_info(int i, struct CharDriverState *chr); #if defined(NEED_CPU_H) && !defined(CONFIG_USER_ONLY) -int xen_hvm_init(PCMachineState *pcms, MemoryRegion **ram_memory); +void xen_hvm_init(PCMachineState *pcms, MemoryRegion **ram_memory); void xen_ram_alloc(ram_addr_t ram_addr, ram_addr_t size, struct MemoryRegion *mr); void xen_modified_memory(ram_addr_t start, ram_addr_t length); diff --git a/xen-hvm-stub.c b/xen-hvm-stub.c index 6a39425..7ff0602 100644 --- a/xen-hvm-stub.c +++ b/xen-hvm-stub.c @@ -47,9 +47,8 @@ void xen_modified_memory(ram_addr_t start, ram_addr_t length) { } -int xen_hvm_init(PCMachineState *pcms, MemoryRegion **ram_memory) +void xen_hvm_init(PCMachineState *pcms, MemoryRegion **ram_memory) { -return 0; } void qmp_xen_set_global_dirty_log(bool enable, Error **errp) diff --git a/xen-hvm.c b/xen-hvm.c index 2a93390..cb7128c 100644 --- a/xen-hvm.c +++ b/xen-hvm.c @@ -17,6 +17,7 @@ #include "qmp-commands.h" #include "sysemu/char.h" +#include "qemu/error-report.h" #include "qemu/range.h" #include "sysemu/xen-mapcache.h" #include "trace.h" @@ -1189,16 +1190,8 @@ static void xen_wakeup_notifier(Notifier *notifier, void *data) xc_set_hvm_param(xen_xc, xen_domid, HVM_PARAM_ACPI_S_STATE, 0); } -/* return 0 means OK, or -1 means critical issue -- will exit(1) */ -int xen_hvm_init(PCMachineState *pcms, - MemoryRegion **ram_memory) +void xen_hvm_init(PCMachineState *pcms, MemoryRegion **ram_memory) { -/* - * FIXME Returns -1 without cleaning up on some errors (harmless - * as long as the caller exit()s on error), dies with hw_error() - * on others. hw_error() isn't approprate here. Should probably - * simply exit() on all errors. - */ int i, rc; xen_pfn_t ioreq_pfn; xen_pfn_t bufioreq_pfn; @@ -1210,19 +1203,19 @@ int xen_hvm_init(PCMachineState *pcms, state->xce_handle = xen_xc_evtchn_open(NULL, 0); if (state->xce_handle == XC_HANDLER_INITIAL_VALUE) { perror("xen: event channel open"); -return -1; +goto err; } state->xenstore = xs_daemon_open(); if (state->xenstore == NULL) { perror("xen: xenstore open"); -return -1; +goto err; } rc = xen_create_ioreq_server(xen_xc, xen_domid, >ioservid); if (rc < 0) { perror("xen: ioreq server create"); -return -1; +goto err; } state->exit.notify = xen_exit_notifier; @@ -1238,8 +1231,9 @@ int xen_hvm_init(PCMachineState *pcms, _pfn, _pfn, _evtchn); if (rc < 0) { -hw_error("failed to get ioreq server info: error %d handle=" XC_INTERFACE_FMT, - errno, xen_xc); +error_report("failed to get ioreq server info: error %d handle=" XC_INTERFACE_FMT, + errno, xen_xc); +goto err; } DPRINTF("shared page at pfn %lx\n", ioreq_pfn); @@ -1249,8