[Xen-devel] [qemu-upstream-4.2-testing test] 60553: regressions - FAIL
flight 60553 qemu-upstream-4.2-testing real [real] http://logs.test-lab.xenproject.org/osstest/logs/60553/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: test-amd64-i386-xl-qemuu-win7-amd64 6 xen-boot fail REGR. vs. 60207 Regressions which are regarded as allowable (not blocking): test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop fail blocked in 60207 Tests which did not succeed, but are not blocking: test-amd64-i386-xl-qemuu-ovmf-amd64 9 debian-hvm-install fail never pass test-amd64-amd64-xl-qemuu-ovmf-amd64 9 debian-hvm-install fail never pass test-amd64-i386-xend-qemuu-winxpsp3 20 leak-check/checkfail never pass version targeted for testing: qemuu138906105dd47b9dc6b1e5010e81fc606983dd75 baseline version: qemuuda2e633ec99da897f51f388217f070c53aea6674 Last test of basis60207 2015-07-31 22:20:43 Z5 days Testing same since60553 2015-08-03 19:06:47 Z2 days1 attempts People who touched revisions under test: Stefan Hajnoczi stefa...@redhat.com Stefano Stabellini stefano.stabell...@eu.citrix.com jobs: build-amd64 pass build-i386 pass build-amd64-libvirt pass build-i386-libvirt pass build-amd64-pvopspass build-i386-pvops pass test-amd64-i386-qemuu-rhel6hvm-amd pass test-amd64-amd64-xl-qemuu-debianhvm-amd64pass test-amd64-i386-xl-qemuu-debianhvm-amd64 pass test-amd64-amd64-xl-qemuu-ovmf-amd64 fail test-amd64-i386-xl-qemuu-ovmf-amd64 fail test-amd64-amd64-xl-qemuu-win7-amd64 fail test-amd64-i386-xl-qemuu-win7-amd64 fail test-amd64-i386-qemuu-rhel6hvm-intel pass test-amd64-i386-xl-qemuu-winxpsp3-vcpus1 pass test-amd64-i386-xend-qemuu-winxpsp3 fail test-amd64-amd64-xl-qemuu-winxpsp3 pass test-i386-i386-xl-qemuu-winxpsp3 pass sg-report-flight on osstest.test-lab.xenproject.org logs: /home/logs/logs images: /home/logs/images Logs, config files, etc. are available at http://logs.test-lab.xenproject.org/osstest/logs Explanation of these reports, and of osstest in general, is at http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master Test harness code can be found at http://xenbits.xen.org/gitweb?p=osstest.git;a=summary Not pushing. commit 138906105dd47b9dc6b1e5010e81fc606983dd75 Author: Stefano Stabellini stefano.stabell...@eu.citrix.com Date: Mon Aug 3 14:07:02 2015 + Fix release_drive on unplugged devices (pci_piix3_xen_ide_unplug) pci_piix3_xen_ide_unplug should completely unhook the unplugged IDEDevice from the corresponding BlockBackend, otherwise the next call to release_drive will try to detach the drive again. Suggested-by: Kevin Wolf kw...@redhat.com Signed-off-by: Stefano Stabellini stefano.stabell...@eu.citrix.com commit 3b841b4daa66fba7422697df8d4af2e96bba89ad Author: Stefan Hajnoczi stefa...@redhat.com Date: Wed Jul 15 18:17:04 2015 +0100 rtl8139: check TCP Data Offset field The TCP Data Offset field contains the length of the header. Make sure it is valid and does not exceed the IP data length. Signed-off-by: Stefan Hajnoczi stefa...@redhat.com commit e2c8930853198fc4c9e50e46653bc1e577f785c8 Author: Stefan Hajnoczi stefa...@redhat.com Date: Wed Jul 15 18:17:03 2015 +0100 rtl8139: skip offload on short TCP header TCP Large Segment Offload accesses the TCP header in the packet. If the packet is too short we must not attempt to access header fields: tcp_header *p_tcp_hdr = (tcp_header*)(eth_payload_data + hlen); int tcp_hlen = TCP_HEADER_DATA_OFFSET(p_tcp_hdr); Signed-off-by: Stefan Hajnoczi stefa...@redhat.com commit a1c76be898c9da3a4a6c71d21e5ca6b8591fc9f9 Author: Stefan Hajnoczi stefa...@redhat.com Date: Wed Jul 15 18:17:02 2015 +0100 rtl8139: check IP Total Length field The IP Total Length field includes the IP header and data. Make sure it is valid and does not exceed the Ethernet payload size. Signed-off-by: Stefan Hajnoczi stefa...@redhat.com commit
Re: [Xen-devel] OSSTEST -- nested test case development, RFC: ts-guest-destroy doesn't call guest_await_dhcp_tcp() if guest has fixed IP
-Original Message- From: Ian Campbell [mailto:ian.campb...@citrix.com] Sent: Wednesday, August 5, 2015 8:27 PM To: Hu, Robert; ian.jack...@eu.citrix.com; wei.l...@citrix.com Cc: xen-devel@lists.xen.org Subject: Re: OSSTEST -- nested test case development, RFC: ts-guest-destroy doesn't call guest_await_dhcp_tcp() if guest has fixed IP On Wed, 2015-08-05 at 06:22 +, Hu, Robert wrote: Hi Ians, I don't 100% recall how this is supposed to fit together. IIRC: 1# L0 is installed as usual #2 An L1 guest is installed. That L1 guest gets an IP address from DHCP in the normal way. yes 3# Then ts-nested setup customises the L1 guest into an L1 host, storing the DHCP assigned address in $r{${l1ident}_ip}. (I'm not sure if it is actually called l1ident, but whatever it is). After l1 installed Xen and turned into Xen environment (12 testid xen-install/l1 below), its IP get fixed (this is current normal Xen installation behavior, same in l0's Xen install). See /etc/network/interface in l1: root@l1:~# cat /etc/network/interfaces ... auto xenbr0 iface xenbr0 inet static address 192.168.199.175 netmask 255.255.255.0 gateway 192.168.199.1 bridge_ports eth0 bridge_fd 0 bridge_stp off Then, in guest-destroy, we cannot get its IP by searching lease file. 4# Then operations which selecthost(l1ident) see that $r{${l1ident}_ip} and use it as the $ho-{Ip} instead of looking for it in the host db. Yes 5# At some point an L2 guest is installed on the L1 host and it also gets an IP from DHCP in the usual way. Yes, L2 will get IP from dhcp. Is that all correct? Here is current test steps 2015-08-06 01:44:58 Z standalone.test-amd64-amd64-qemuu-nested == 1 testid build-check(1) == 2015-08-06 01:44:58 Z standalone.test-amd64-amd64-qemuu-nested == 2 testid hosts-allocate == 2015-08-06 01:44:58 Z standalone.test-amd64-amd64-qemuu-nested == 3 testid host-install(3) == 2015-08-06 01:44:58 Z standalone.test-amd64-amd64-qemuu-nested == 4 testid host-ping-check-native == 2015-08-06 01:44:58 Z standalone.test-amd64-amd64-qemuu-nested == 5 testid xen-install == 2015-08-06 01:44:58 Z standalone.test-amd64-amd64-qemuu-nested == 6 testid xen-boot == 2015-08-06 01:44:58 Z standalone.test-amd64-amd64-qemuu-nested == 7 testid host-ping-check-xen == 2015-08-06 01:44:58 Z standalone.test-amd64-amd64-qemuu-nested == 8 testid leak-check/basis(8) == 2015-08-06 01:44:58 Z standalone.test-amd64-amd64-qemuu-nested == 9 testid debian-hvm-install == 2015-08-06 01:44:58 Z standalone.test-amd64-amd64-qemuu-nested == 10 testid nested-setup == 2015-08-06 01:44:58 Z standalone.test-amd64-amd64-qemuu-nested == 11 testid host-ping-check-native/l1 == 2015-08-06 01:44:58 Z standalone.test-amd64-amd64-qemuu-nested == 12 testid xen-install/l1 == 2015-08-06 01:44:58 Z standalone.test-amd64-amd64-qemuu-nested == 13 testid xen-boot/l1 == 2015-08-06 01:44:58 Z standalone.test-amd64-amd64-qemuu-nested == 14 testid host-ping-check-xen/l1 == 2015-08-06 01:44:58 Z standalone.test-amd64-amd64-qemuu-nested == 15 testid leak-check/basis/l1(15) == 2015-08-06 01:44:58 Z standalone.test-amd64-amd64-qemuu-nested == 16 testid debian-hvm-install/l1/l2 == 2015-08-06 01:44:58 Z standalone.test-amd64-amd64-qemuu-nested == 17 testid guest-stop/l1/l2 == 2015-08-06 01:44:58 Z standalone.test-amd64-amd64-qemuu-nested == 18 testid guest-destroy == 2015-08-06 01:44:58 Z standalone.test-amd64-amd64-qemuu-nested == 19 testid leak-check/check/l1 == 2015-08-06 01:44:58 Z standalone.test-amd64-amd64-qemuu-nested == 20 testid capture-logs/l1(20) == 2015-08-06 01:44:58 Z standalone.test-amd64-amd64-qemuu-nested == 21 testid final-poweroff/l1 == 2015-08-06 01:44:58 Z standalone.test-amd64-amd64-qemuu-nested == 22 testid leak-check/check == 2015-08-06 01:44:58 Z standalone.test-amd64-amd64-qemuu-nested == 23 testid capture-logs(23) == Current ts-guest-destory will invoke guest_await_dhcp_tcp(); but in nested case, after l1 turns into Xen environment, it then has fixed IP address; which in turn has failed at dhcp lease check. So here are we talking about ts-guest-destroy of an L2 guest on the L1 host, or of the L1 guest on the L0 host? l1 on l0. I think you are talking about the L1 guest on the L0 host. In that context ts-guest-destroy will be considering the L1 as a guest, so I would expect that guest_await_dhcp_tcp should work, because the L1 guest's IP was assigned via DHCP in #2 above. explained above, it is not after we xen-install l1. So I suppose the question is how/why is guest_await_dhcp_tcp failing when
Re: [Xen-devel] [PATCH V5 3/7] libxl: add pvusb API
As 4.6 goes to bug fixing stage, maybe we can pick up this thread? :-) Beside to call for your precious review comments and suggestions so that we can make progress, I also want to confirm about the previous discussed two TODO things: 1) use UDEV name rule to specify usb device uniquely even across reboot. That got consensus. Next thing is exposing that name to some sysfs entry, right? 2) use libusb instead of reading sysfs by ourselves. As George mentioned, using libusb is not simpler than reading sysfs; and if UDEV name is stored to some sysfs entry for us to retrieve, then we still need reading sysfs things. Could we get to a final decision? If these are settled down, I can update related code. Thanks, Chunyan On 7/7/2015 at 05:57 PM, in message 559ba270.4000...@eu.citrix.com, George Dunlap george.dun...@eu.citrix.com wrote: On 07/07/2015 02:25 AM, Chun Yan Liu wrote: Any comments on the implementation? If there is anything improper, I can update. Since we decided to wait until the next release, I've been prioritizing reviewing patch series which might make it into 4.6. I'll come back to it once the freeze starts. Believe me, I'd much rather be working on the USB stuff. :-) -George Thanks, Chunyan On 6/25/2015 at 06:07 PM, in message 1435226838-3067-4-git-send-email-cy...@suse.com, Chunyan Liu cy...@suse.com wrote: Add pvusb APIs, including: - attach/detach (create/destroy) virtual usb controller. - attach/detach usb device - list usb controller and usb devices - some other helper functions Signed-off-by: Chunyan Liu cy...@suse.com Signed-off-by: Simon Cao caobosi...@gmail.com --- changes: - Use macros DEFINE_DEVICE_ADD and DEFINE_DEVICES_ADD for adding usbctrl and usb, update all related codes. - Use an extend macro DEFINE_DEVICE_REMOVE_EXT for removimg usbctrl update all related codes. - Remove busid from libxl_device_usb definition, keep bus.addr only, update all related codes. - Remove documentation since it's mostly about design, move to cover-letter. Some parts are moved to code comments. - Address some other comments tools/libxl/Makefile |2 +- tools/libxl/libxl.c | 53 ++ tools/libxl/libxl.h | 64 ++ tools/libxl/libxl_device.c |4 + tools/libxl/libxl_internal.h | 20 +- tools/libxl/libxl_osdeps.h | 13 + tools/libxl/libxl_pvusb.c| 1305 ++ tools/libxl/libxl_types.idl | 50 ++ tools/libxl/libxl_types_internal.idl |1 + tools/libxl/libxl_utils.c| 16 + tools/libxl/libxl_utils.h|5 + 11 files changed, 1531 insertions(+), 2 deletions(-) create mode 100644 tools/libxl/libxl_pvusb.c diff --git a/tools/libxl/Makefile b/tools/libxl/Makefile index cc9c152..b820105 100644 --- a/tools/libxl/Makefile +++ b/tools/libxl/Makefile @@ -95,7 +95,7 @@ LIBXL_OBJS = flexarray.o libxl.o libxl_create.o libxl_dm.o libxl_pci.o \ libxl_internal.o libxl_utils.o libxl_uuid.o \ libxl_json.o libxl_aoutils.o libxl_numa.o libxl_vnuma.o \ libxl_save_callout.o _libxl_save_msgs_callout.o \ - libxl_qmp.o libxl_event.o libxl_fork.o $(LIBXL_OBJS-y) + libxl_qmp.o libxl_event.o libxl_fork.o libxl_pvusb.o $(LIBXL_OBJS-y) LIBXL_OBJS += libxl_genid.o LIBXL_OBJS += _libxl_types.o libxl_flask.o _libxl_types_internal.o diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c index 6570476..6542127 100644 --- a/tools/libxl/libxl.c +++ b/tools/libxl/libxl.c @@ -4233,11 +4233,54 @@ DEFINE_DEVICE_REMOVE(vtpm, destroy, 1) / **/ +/* Macro for defining device remove/destroy functions for usbctrl */ +/* Following functions are defined: + * libxl_device_usbctrl_remove + * libxl_device_usbctrl_destroy + */ + +#define DEFINE_DEVICE_REMOVE_EXT(type, removedestroy, f)\ +int libxl_device_##type##_##removedestroy(libxl_ctx *ctx, \ +uint32_t domid, libxl_device_##type *type, \ +const libxl_asyncop_how *ao_how)\ +{ \ +AO_CREATE(ctx, domid, ao_how); \ +libxl__device *device; \ +libxl__ao_device *aodev;\ +int rc;
[Xen-devel] [qemu-upstream-4.4-testing test] 60565: tolerable FAIL - PUSHED
flight 60565 qemu-upstream-4.4-testing real [real] http://logs.test-lab.xenproject.org/osstest/logs/60565/ Failures :-/ but no regressions. Regressions which are regarded as allowable (not blocking): test-amd64-i386-libvirt 11 guest-start fail like 58380 test-amd64-amd64-libvirt 11 guest-start fail like 58380 Tests which did not succeed, but are not blocking: test-amd64-i386-libvirt-qcow2 9 debian-di-installfail never pass test-amd64-amd64-xl-raw 9 debian-di-installfail never pass test-amd64-amd64-libvirt-qcow2 11 migrate-support-checkfail never pass test-amd64-amd64-libvirt-raw 11 migrate-support-checkfail never pass test-amd64-i386-libvirt-raw 11 migrate-support-checkfail never pass test-amd64-amd64-libvirt-vhd 11 migrate-support-checkfail never pass test-amd64-i386-libvirt-vhd 11 migrate-support-checkfail never pass test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop fail never pass test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop fail never pass version targeted for testing: qemuu0fc147387f0b683d2dfefec7b1af569f17b72e9c baseline version: qemuu32226f429cca6c79826364d8d18acb2226f2f102 Last test of basis58380 2015-06-10 17:15:50 Z 56 days Failing since 59500 2015-07-13 10:48:26 Z 23 days4 attempts Testing same since60565 2015-08-04 01:59:38 Z2 days1 attempts People who touched revisions under test: James Harper james.har...@ejbdigital.com.au James Harper ja...@ejbdigital.com.au Kevin Wolf kw...@redhat.com Stefan Hajnoczi stefa...@redhat.com Stefano Stabellini stefano.stabell...@eu.citrix.com jobs: build-amd64-xend pass build-i386-xend pass build-amd64 pass build-i386 pass build-amd64-libvirt pass build-i386-libvirt pass build-amd64-pvopspass build-i386-pvops pass test-amd64-amd64-xl pass test-amd64-i386-xl pass test-amd64-i386-qemuu-rhel6hvm-amd pass test-amd64-amd64-xl-qemuu-debianhvm-amd64pass test-amd64-i386-xl-qemuu-debianhvm-amd64 pass test-amd64-i386-freebsd10-amd64 pass test-amd64-amd64-xl-qemuu-ovmf-amd64 pass test-amd64-i386-xl-qemuu-ovmf-amd64 pass test-amd64-amd64-xl-qemuu-win7-amd64 fail test-amd64-i386-xl-qemuu-win7-amd64 fail test-amd64-amd64-xl-credit2 pass test-amd64-i386-freebsd10-i386 pass test-amd64-i386-qemuu-rhel6hvm-intel pass test-amd64-amd64-libvirt fail test-amd64-i386-libvirt fail test-amd64-amd64-xl-multivcpupass test-amd64-amd64-pairpass test-amd64-i386-pair pass test-amd64-amd64-pv pass test-amd64-i386-pv pass test-amd64-amd64-amd64-pvgrubpass test-amd64-amd64-i386-pvgrub pass test-amd64-amd64-pygrub pass test-amd64-amd64-libvirt-qcow2 pass test-amd64-i386-libvirt-qcow2fail test-amd64-amd64-xl-qcow2pass test-amd64-i386-xl-qcow2 pass test-amd64-amd64-libvirt-raw pass test-amd64-i386-libvirt-raw pass test-amd64-amd64-xl-raw fail test-amd64-i386-xl-raw pass test-amd64-i386-xl-qemuu-winxpsp3-vcpus1 pass test-amd64-amd64-libvirt-vhd pass test-amd64-i386-libvirt-vhd pass test-amd64-amd64-xl-vhd pass test-amd64-i386-xl-vhd pass test-amd64-amd64-xl-qemuu-winxpsp3
Re: [Xen-devel] [v4 16/17] vmx: Add some scheduler hooks for VT-d posted interrupts
-Original Message- From: Dario Faggioli [mailto:dario.faggi...@citrix.com] Sent: Monday, August 03, 2015 6:02 PM To: Wu, Feng Cc: xen-devel@lists.xen.org; Keir Fraser; Jan Beulich; Andrew Cooper; Tian, Kevin; George Dunlap Subject: Re: [v4 16/17] vmx: Add some scheduler hooks for VT-d posted interrupts On Mon, 2015-08-03 at 01:36 +, Wu, Feng wrote: -Original Message- From: Dario Faggioli [mailto:dario.faggi...@citrix.com] It's safe in any case. In fact, the spinlock will prevent both the vcpu's processor to schedule, as well as any other processors to steal the waking vcpu from the runqueue to run it. Good to know this. For as well as any other processors to steal the waking vcpu from the runqueue to run it , could you please show some hints in the code side, so I can better understand how this can be protected by the spinlock. Thank you! Sure. For instance, for the Credit1 scheduler, look at csched_load_balance(). vcpus are moved from one runqueue to another by means of csched_runq_steal() which, as you can see, is called only if the spinlock for the pcpu from where we want to try stealing has been successfully acquired: spinlock_t *lock = pcpu_schedule_trylock(peer_cpu); if ( !lock ) { SCHED_STAT_CRANK(steal_trylock_failed); peer_cpu = cpumask_cycle(peer_cpu, workers); continue; } /* Any work over there to steal? */ speer = cpumask_test_cpu(peer_cpu, online) ? csched_runq_steal(peer_cpu, cpu, snext-pri, bstep) : NULL; pcpu_schedule_unlock(lock, peer_cpu); Same happen in Credit2. Check choose_cpu(), and you'll fine similar reasoning and code. In RTDS, finally, there's just one runqueue, for all pcpus, so each time each one of them has to schedule it has to take the only one spinlock protecting it. Hope this helped. Thanks for the elaborate description, it is very helpful. Thanks, Feng Regards, Dario -- This happens because I choose it to happen! (Raistlin Majere) - Dario Faggioli, Ph.D, http://about.me/dario.faggioli Senior Software Engineer, Citrix Systems RD Ltd., Cambridge (UK) ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] OSSTEST -- nested test case development, RFC: ts-guest-destroy doesn't call guest_await_dhcp_tcp() if guest has fixed IP
Hi Ians, Current ts-guest-destory will invoke guest_await_dhcp_tcp(); but in nested case, after l1 turns into Xen environment, it then has fixed IP address; which in turn has failed at dhcp lease check. So, how about if I in ts-guest-destroy bypass guest_await_dhcp_tcp() if we have $r{guest-Guest_ip}? Best Regards, Robert Ho ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] OSSTEST -- nested test case development, RFC: ts-guest-destroy doesn't call guest_await_dhcp_tcp() if guest has fixed IP
-Original Message- From: xen-devel-boun...@lists.xen.org [mailto:xen-devel-boun...@lists.xen.org] On Behalf Of Hu, Robert Sent: Wednesday, August 5, 2015 2:22 PM To: ian.jack...@eu.citrix.com; Ian Campbell; wei.l...@citrix.com Cc: xen-devel@lists.xen.org Subject: [Xen-devel] OSSTEST -- nested test case development, RFC: ts-guest-destroy doesn't call guest_await_dhcp_tcp() if guest has fixed IP Hi Ians, Current ts-guest-destory will invoke guest_await_dhcp_tcp(); but in nested case, after l1 turns into Xen environment, it then has fixed IP address; which in turn has failed at dhcp lease check. So, how about if I in ts-guest-destroy bypass guest_await_dhcp_tcp() if we have $r{guest-Guest_ip}? Detailed diff @@ -1354,6 +1355,8 @@ sub selectguest ($$) { $gho-{Options}{$opt}++; } logm(guest: using $gn on $gho-{Host}{Name}); +$gho-{Ip} = $r{$gho-{Guest}_ip}; +logm(guest: $gn has fixed IP $gho-{Ip}); guest_find_lv($gho); guest_find_ether($gho); guest_find_tcpcheckport($gho); @@ -1758,7 +1761,7 @@ sub guest_await_dhcp_tcp ($$) { $gho-{TcpCheckPort}. link/ip/tcp, sub { -my $err= guest_check_ip($gho); +my $err= guest_check_ip($gho) if !$gho-{Ip}; return $err if defined $err; return Best Regards, Robert Ho ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 1/2] ts-raisin-build: consolidate open coded iterations over enabled components
On Tue, 2015-08-04 at 14:02 +0100, Ian Campbell wrote: +my %treerev_override = ( This is a kindofcrappy name actually. Maybe %comp2runvar or %raisin2osstest or (perhaps going too far) just %r2o? Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v3 1/1] x86: allow to enable/disable modify_ldt at run time
* Willy Tarreau w...@1wt.eu wrote: @@ -276,6 +282,15 @@ asmlinkage int sys_modify_ldt(int func, void __user *ptr, { int ret = -ENOSYS; + if (!sysctl_modify_ldt) { + printk_ratelimited(KERN_INFO + Denied a call to modify_ldt() from %s[%d] (uid: %d). + Adjust sysctl if this was not an exploit attempt.\n, + current-comm, task_pid_nr(current), + from_kuid_munged(current_user_ns(), current_uid())); UI nit: so this message should really tell the user _which_ sysctl to configure, instead of passive-aggressively alluding to the fact that there's a sysctl somewhere that might do the trick... Thanks, Ingo ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v3 1/1] x86: allow to enable/disable modify_ldt at run time
Hi Ingo, On Wed, Aug 05, 2015 at 10:00:37AM +0200, Ingo Molnar wrote: * Willy Tarreau w...@1wt.eu wrote: @@ -276,6 +282,15 @@ asmlinkage int sys_modify_ldt(int func, void __user *ptr, { int ret = -ENOSYS; + if (!sysctl_modify_ldt) { + printk_ratelimited(KERN_INFO + Denied a call to modify_ldt() from %s[%d] (uid: %d). +Adjust sysctl if this was not an exploit attempt.\n, + current-comm, task_pid_nr(current), + from_kuid_munged(current_user_ns(), current_uid())); UI nit: so this message should really tell the user _which_ sysctl to configure, instead of passive-aggressively alluding to the fact that there's a sysctl somewhere that might do the trick... I agree, I did it first and changed my mind due to the repetition of the word modify_ldt. Here's an updated version instead. Willy From 17b2720cd54df0fde6686c1d85aaed38d679cbe7 Mon Sep 17 00:00:00 2001 From: Willy Tarreau w...@1wt.eu Date: Sat, 25 Jul 2015 12:18:33 +0200 Subject: [PATCH] x86/ldt: allow to disable modify_ldt at runtime For distros who prefer not to take the risk of completely disabling the modify_ldt syscall using CONFIG_MODIFY_LDT_SYSCALL, this patch adds a sysctl to enable or/disable it at runtime, and proposes to disable it by default. This can be a safe alternative. A message is logged if an attempt was stopped so that it's easy to spot if/when it is needed. Cc: Andy Lutomirski l...@kernel.org Cc: Kees Cook keesc...@chromium.org Signed-off-by: Willy Tarreau w...@1wt.eu --- Documentation/sysctl/kernel.txt | 15 +++ arch/x86/Kconfig| 17 + arch/x86/kernel/ldt.c | 16 kernel/sysctl.c | 14 ++ 4 files changed, 62 insertions(+) diff --git a/Documentation/sysctl/kernel.txt b/Documentation/sysctl/kernel.txt index 6fccb69..60c7c7a 100644 --- a/Documentation/sysctl/kernel.txt +++ b/Documentation/sysctl/kernel.txt @@ -41,6 +41,7 @@ show up in /proc/sys/kernel: - kptr_restrict - kstack_depth_to_print [ X86 only ] - l2cr[ PPC only ] +- modify_ldt [ X86 only ] - modprobe== Documentation/debugging-modules.txt - modules_disabled - msg_next_id[ sysv ipc ] @@ -391,6 +392,20 @@ This flag controls the L2 cache of G3 processor boards. If == +modify_ldt: (X86 only) + +Enables (1) or disables (0) the modify_ldt syscall. Modifying the LDT +(Local Descriptor Table) may be needed to run a 16-bit or segmented code +such as Dosemu or Wine. This is done via a system call which is not needed +to run portable applications, and which can sometimes be abused to exploit +some weaknesses of the architecture, opening new vulnerabilities. + +This sysctl allows one to increase the system's security by disabling the +system call, or to restore compatibility with specific applications when it +was already disabled. + +== + modules_disabled: A toggle value indicating if modules are allowed to be loaded diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index beabf30..88d10a0 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -2069,6 +2069,23 @@ config MODIFY_LDT_SYSCALL surface. Disabling it removes the modify_ldt(2) system call. Saying 'N' here may make sense for embedded or server kernels. + If really unsure, say 'Y', you'll be able to disable it at runtime. + +config DEFAULT_MODIFY_LDT_SYSCALL + bool Allow userspace to modify the LDT by default + depends on MODIFY_LDT_SYSCALL + default y + ---help--- + Modifying the LDT (Local Descriptor Table) may be needed to run a + 16-bit or segmented code such as Dosemu or Wine. This is done via + a system call which is not needed to run portable applications, + and which can sometimes be abused to exploit some weaknesses of + the architecture, opening new vulnerabilities. + + For this reason this option allows one to enable or disable the + feature at runtime. It is recommended to say 'N' here to leave + the system protected, and to enable it at runtime only if needed + by setting the sys.kernel.modify_ldt sysctl. source kernel/livepatch/Kconfig diff --git a/arch/x86/kernel/ldt.c b/arch/x86/kernel/ldt.c index 2bcc052..354e854 100644 --- a/arch/x86/kernel/ldt.c +++ b/arch/x86/kernel/ldt.c @@ -11,6 +11,7 @@ #include linux/sched.h #include linux/string.h #include linux/mm.h +#include linux/ratelimit.h #include linux/smp.h #include linux/slab.h #include linux/vmalloc.h @@ -21,6 +22,11 @@ #include asm/mmu_context.h #include asm/syscalls.h +#ifdef CONFIG_MODIFY_LDT_SYSCALL +int sysctl_modify_ldt __read_mostly = +
Re: [Xen-devel] [PATCH v3 1/1] x86: allow to enable/disable modify_ldt at run time
* Willy Tarreau w...@1wt.eu wrote: Hi Ingo, On Wed, Aug 05, 2015 at 10:00:37AM +0200, Ingo Molnar wrote: * Willy Tarreau w...@1wt.eu wrote: @@ -276,6 +282,15 @@ asmlinkage int sys_modify_ldt(int func, void __user *ptr, { int ret = -ENOSYS; + if (!sysctl_modify_ldt) { + printk_ratelimited(KERN_INFO + Denied a call to modify_ldt() from %s[%d] (uid: %d). + Adjust sysctl if this was not an exploit attempt.\n, + current-comm, task_pid_nr(current), + from_kuid_munged(current_user_ns(), current_uid())); UI nit: so this message should really tell the user _which_ sysctl to configure, instead of passive-aggressively alluding to the fact that there's a sysctl somewhere that might do the trick... I agree, I did it first and changed my mind due to the repetition of the word modify_ldt. Here's an updated version instead. Willy From 17b2720cd54df0fde6686c1d85aaed38d679cbe7 Mon Sep 17 00:00:00 2001 From: Willy Tarreau w...@1wt.eu Date: Sat, 25 Jul 2015 12:18:33 +0200 Subject: [PATCH] x86/ldt: allow to disable modify_ldt at runtime For distros who prefer not to take the risk of completely disabling the modify_ldt syscall using CONFIG_MODIFY_LDT_SYSCALL, this patch adds a sysctl to enable or/disable it at runtime, and proposes to disable it by default. This can be a safe alternative. A message is logged if an attempt was stopped so that it's easy to spot if/when it is needed. Cc: Andy Lutomirski l...@kernel.org Cc: Kees Cook keesc...@chromium.org Signed-off-by: Willy Tarreau w...@1wt.eu --- Documentation/sysctl/kernel.txt | 15 +++ arch/x86/Kconfig| 17 + arch/x86/kernel/ldt.c | 16 kernel/sysctl.c | 14 ++ 4 files changed, 62 insertions(+) diff --git a/Documentation/sysctl/kernel.txt b/Documentation/sysctl/kernel.txt index 6fccb69..60c7c7a 100644 --- a/Documentation/sysctl/kernel.txt +++ b/Documentation/sysctl/kernel.txt @@ -41,6 +41,7 @@ show up in /proc/sys/kernel: - kptr_restrict - kstack_depth_to_print [ X86 only ] - l2cr[ PPC only ] +- modify_ldt [ X86 only ] - modprobe== Documentation/debugging-modules.txt - modules_disabled - msg_next_id [ sysv ipc ] @@ -391,6 +392,20 @@ This flag controls the L2 cache of G3 processor boards. If == +modify_ldt: (X86 only) + +Enables (1) or disables (0) the modify_ldt syscall. Modifying the LDT +(Local Descriptor Table) may be needed to run a 16-bit or segmented code s/run a/run +such as Dosemu or Wine. This is done via a system call which is not needed s/Dosemu/DOSEMU +to run portable applications, and which can sometimes be abused to exploit +some weaknesses of the architecture, opening new vulnerabilities. So that's pretty vague IMHO, and a bit FUD-ish in character. How about: ... , and which system call exposes complex, rarely used legacy hardware features and semantics that had suffered vulnerabilities in the past. + +This sysctl allows one to increase the system's security by disabling the +system call, or to restore compatibility with specific applications when it +was already disabled. + +== + modules_disabled: A toggle value indicating if modules are allowed to be loaded diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index beabf30..88d10a0 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -2069,6 +2069,23 @@ config MODIFY_LDT_SYSCALL surface. Disabling it removes the modify_ldt(2) system call. Saying 'N' here may make sense for embedded or server kernels. + If really unsure, say 'Y', you'll be able to disable it at runtime. + +config DEFAULT_MODIFY_LDT_SYSCALL + bool Allow userspace to modify the LDT by default + depends on MODIFY_LDT_SYSCALL + default y + ---help--- + Modifying the LDT (Local Descriptor Table) may be needed to run a + 16-bit or segmented code such as Dosemu or Wine. This is done via + a system call which is not needed to run portable applications, + and which can sometimes be abused to exploit some weaknesses of + the architecture, opening new vulnerabilities. + + For this reason this option allows one to enable or disable the + feature at runtime. It is recommended to say 'N' here to leave + the system protected, and to enable it at runtime only if needed + by setting the sys.kernel.modify_ldt sysctl. Here I'd do the same modifications as to the sysctl text above. + if (!sysctl_modify_ldt) { + printk_ratelimited(KERN_INFO + Denied a call to
Re: [Xen-devel] [PATCH V3 4/6] libxc: expose xsaves/xgetbv1/xsavec to hvm guest
On Wed, 2015-08-05 at 09:57 +0800, Shuai Ruan wrote: This patch exposes xsaves/xgetbv1/xsavec to hvm guest. The reserved bits of eax/ebx/ecx/edx must be cleaned up when call cpuid(0dh) with leaf 1 or 2..63. According to the spec the following bits must be reseved: reserved For leaf 1, bits 03-04/08-31 of ecx is reserved. Edx is reserved. For leaf 2...63, bits 01-31 of ecx is reserved. Edx is reserved. Signed-off-by: Shuai Ruan shuai.r...@linux.intel.com Although this is toolstack code I think this really ought to be acked by the hypervisor x86 maintainers, if they are happy with it then I am too, and in that case you may add: Acked-by: Ian Campbell ian.campb...@citrix.com --- tools/libxc/xc_cpuid_x86.c | 13 + 1 file changed, 9 insertions(+), 4 deletions(-) diff --git a/tools/libxc/xc_cpuid_x86.c b/tools/libxc/xc_cpuid_x86.c index c97f91a..b69676a 100644 --- a/tools/libxc/xc_cpuid_x86.c +++ b/tools/libxc/xc_cpuid_x86.c @@ -211,6 +211,9 @@ static void intel_xc_cpuid_policy( } #define XSAVEOPT(1 0) +#define XSAVEC (1 1) +#define XGETBV1 (1 2) +#define XSAVES (1 3) /* Configure extended state enumeration leaves (0x000D for xsave) */ static void xc_cpuid_config_xsave( xc_interface *xch, domid_t domid, uint64_t xfeature_mask, @@ -247,8 +250,9 @@ static void xc_cpuid_config_xsave( regs[1] = 512 + 64; /* FP/SSE + XSAVE.HEADER */ break; case 1: /* leaf 1 */ -regs[0] = XSAVEOPT; -regs[1] = regs[2] = regs[3] = 0; +regs[0] = (XSAVEOPT | XSAVEC | XGETBV1 | XSAVES); +regs[2] = 0xe7; +regs[3] = 0; break; case 2 ... 63: /* sub-leaves */ if ( !(xfeature_mask (1ULL input[1])) ) @@ -256,8 +260,9 @@ static void xc_cpuid_config_xsave( regs[0] = regs[1] = regs[2] = regs[3] = 0; break; } -/* Don't touch EAX, EBX. Also cleanup ECX and EDX */ -regs[2] = regs[3] = 0; +/* Don't touch EAX, EBX. Also cleanup EDX. Cleanup bits 01-32 of ECX*/ +regs[2] = 0x1; +regs[3] = 0; break; } } ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] tools/libxl: Correct use of phyinfo_{init, dispose}()
On Tue, 2015-08-04 at 19:58 +0100, Andrew Cooper wrote: Signed-off-by: Andrew Cooper andrew.coop...@citrix.com --- CC: Ian Campbell ian.campb...@citrix.com CC: Ian Jackson ian.jack...@eu.citrix.com CC: Wei Liu wei.l...@citrix.com For my cpuid work which I have just started, I added an array to physinfo. Coverity then frowned at me when it spotted all the memory leaks. Technically speaking, 4.6 isn't broken due to not having an allocation to free in _dispose(), but this patch might still be worth taking in 4.6. We've tended to be rather lax about this for internal code when there is no actual work in the init/dispose functions in the current code base, so while this improvement is the sort of thing we should/would routinely except during development window I personally don't think it is worth a freeze exception. @@ -4653,6 +4654,8 @@ static int libxl__fill_dom0_memory_info(libxl__gc *gc, uint32_t *target_memkb, if (rc 0) goto out; +libxl_physinfo_dispose(physinfo); +libxl_physinfo_init(physinfo); This pattern is starting to crop up a lot in loops of this type. I wonder if I should make the idl generate a libxl_FOO_reinit() which just == dispose+init? Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [BUG] kernel panics with drbd
On Tue, 2015-08-04 at 22:24 +0100, Matthew Vernon wrote: On 04/08/15 15:39, Matthew Vernon wrote: Hi, On 04/08/15 15:22, Ian Campbell wrote: On Tue, 2015-08-04 at 14:52 +0100, Matthew Vernon wrote: Hi, Hello, I'm getting dom0 kernel panics, relating to moderately heavy use of drbd. I think this is a Xen bug. It is remarkably similar looking to http://blog.chinewalking.com/drbd-kernel-oops-w-trim/ . Do you have trim? I seem to, yes: Aug 4 14:28:24 ophon kernel: [2856757.049680] drbd mws-02474: Agreed to support TRIM on protocol level I have DRBD 8.9.2~rc1-2. [FTAOD, I think this means that this cannot be a repeat of the issue mentioned in that blog post?] I must confess I'm a bit confused about the versioning of drbd, while Jessie ships the 8.9.2~rc1-2 userspace it appears (from inspection of the kernel source) to contain version 8.4.3 of the kernel side. I think it's the kernel side which the blog post is referring too, which would also be consistent with the kernel side crash. What does /proc/drbd say for you? My usual first suggestion would be to try a newer kernel from jessie -backports, but there doesn't appear to be one right now. Depending on the nature of the systems (e.g. production vs. testlab) you could consider installing the 4.0.8 kernel from Stretch. Alternatively I don't know if drbd upstream provides a convenient way to build and install the modules in an out of tree way? Or maybe there is some option somewhere to tell drbd to not attempt to do trim, even if it thinks it can? Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [RFC PATCH v3.1 2/2] xsplice: Add hook for build_id
On 27.07.2015 21:20, Konrad Rzeszutek Wilk wrote: Signed-off-by: Konrad Rzeszutek Wilk konrad.w...@oracle.com --- tools/libxc/xc_private.c | 3 +++ tools/misc/xen-xsplice.c | 25 + xen/common/kernel.c | 11 +++ xen/common/version.c | 5 + xen/include/public/version.h | 4 xen/include/xen/compile.h.in | 1 + xen/include/xen/version.h| 1 + 7 files changed, 50 insertions(+) diff --git a/tools/libxc/xc_private.c b/tools/libxc/xc_private.c index 2ffebd9..7c039ca 100644 --- a/tools/libxc/xc_private.c +++ b/tools/libxc/xc_private.c @@ -713,6 +713,9 @@ int xc_version(xc_interface *xch, int cmd, void *arg) case XENVER_commandline: sz = sizeof(xen_commandline_t); break; +case XENVER_build_id: +sz = sizeof(xen_build_id_t); +break; default: ERROR(xc_version: unknown command %d\n, cmd); return -EINVAL; diff --git a/tools/misc/xen-xsplice.c b/tools/misc/xen-xsplice.c index 7cf9879..dd8266c 100644 --- a/tools/misc/xen-xsplice.c +++ b/tools/misc/xen-xsplice.c @@ -17,6 +17,7 @@ void show_help(void) id An unique name of payload. Up to 40 characters.\n Commands:\n help display this help\n + build-id display build-id of hypervisor.\n upload id file upload file cpuid with id name\n list list payloads uploaded.\n apply id apply id patch.\n @@ -306,12 +307,36 @@ int action_func(int argc, char *argv[], unsigned int idx) return rc; } + +static int build_id_func(int argc, char *argv[]) +{ +xen_build_id_t build_id; + +if ( argc ) +{ +show_help(); +return -1; +} + +memset(build_id, 0, sizeof(*build_id)); + +if ( xc_version(xch, XENVER_build_id, build_id) 0 ) +{ +printf(Failed to get build_id: %d(%s)\n, errno, strerror(errno)); +return -1; +} + +printf(%s\n, build_id); +return 0; +} + struct { const char *name; int (*function)(int argc, char *argv[]); } main_options[] = { { help, help_func }, { list, list_func }, +{ build-id, build_id_func }, { upload, upload_func }, }; diff --git a/xen/common/kernel.c b/xen/common/kernel.c index 6a3196a..e9d41b6 100644 --- a/xen/common/kernel.c +++ b/xen/common/kernel.c @@ -357,6 +357,17 @@ DO(xen_version)(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg) if ( copy_to_guest(arg, saved_cmdline, ARRAY_SIZE(saved_cmdline)) ) return -EFAULT; return 0; + +case XENVER_build_id: +{ +xen_build_id_t build_id; + +memset(build_id, 0, sizeof(build_id)); +safe_strcpy(build_id, xen_build_id()); You seem to want to store and transfer the build_id as a string. Any reason why we don't directly expose the build_id embedded by the linker in binary format? +if ( copy_to_guest(arg, build_id, ARRAY_SIZE(build_id)) ) +return -EFAULT; +return 0; +} We should not expose the build_id to normal guests, but only to Dom0. A build_id uniquely identifies a specific build and I don't see how that information would be required from DomU. It might actually help an attacker to build his return-oriented programming exploit against a specific build. The normal version numbers should be enough to know about capabilities and API. } return -ENOSYS; diff --git a/xen/common/version.c b/xen/common/version.c index b152e27..5c3dbb0 100644 --- a/xen/common/version.c +++ b/xen/common/version.c @@ -55,3 +55,8 @@ const char *xen_banner(void) { return XEN_BANNER; } + +const char *xen_build_id(void) +{ +return XEN_BUILD_ID; +} diff --git a/xen/include/public/version.h b/xen/include/public/version.h index 44f26b0..c863393 100644 --- a/xen/include/public/version.h +++ b/xen/include/public/version.h @@ -83,6 +83,10 @@ typedef struct xen_feature_info xen_feature_info_t; #define XENVER_commandline 9 typedef char xen_commandline_t[1024]; +#define XENVER_build_id 10 +typedef char xen_build_id_t[1024]; +#define XEN_BUILD_ID_LEN (sizeof(xen_build_id_t)) + #endif /* __XEN_PUBLIC_VERSION_H__ */ /* diff --git a/xen/include/xen/compile.h.in b/xen/include/xen/compile.h.in index 440ecb2..939685e 100644 --- a/xen/include/xen/compile.h.in +++ b/xen/include/xen/compile.h.in @@ -10,4 +10,5 @@ #define XEN_EXTRAVERSION @@extraversion@@ #define XEN_CHANGESET@@changeset@@ +#define XEN_BUILD_ID@@changeset@@ That leads to a chicken and egg problem when embedding a real build_id. Some linker script magic seems to be required. I will try to refine the patch. #define XEN_BANNER \ diff --git a/xen/include/xen/version.h b/xen/include/xen/version.h index
Re: [Xen-devel] [PATCH] tools/libxl: Correct use of phyinfo_{init, dispose}()
On 05/08/15 09:40, Ian Campbell wrote: On Tue, 2015-08-04 at 19:58 +0100, Andrew Cooper wrote: Signed-off-by: Andrew Cooper andrew.coop...@citrix.com --- CC: Ian Campbell ian.campb...@citrix.com CC: Ian Jackson ian.jack...@eu.citrix.com CC: Wei Liu wei.l...@citrix.com For my cpuid work which I have just started, I added an array to physinfo. Coverity then frowned at me when it spotted all the memory leaks. Technically speaking, 4.6 isn't broken due to not having an allocation to free in _dispose(), but this patch might still be worth taking in 4.6. We've tended to be rather lax about this for internal code when there is no actual work in the init/dispose functions in the current code base, so while this improvement is the sort of thing we should/would routinely except during development window I personally don't think it is worth a freeze exception. Ok. @@ -4653,6 +4654,8 @@ static int libxl__fill_dom0_memory_info(libxl__gc *gc, uint32_t *target_memkb, if (rc 0) goto out; +libxl_physinfo_dispose(physinfo); +libxl_physinfo_init(physinfo); This pattern is starting to crop up a lot in loops of this type. I wonder if I should make the idl generate a libxl_FOO_reinit() which just == dispose+init? Probably a good idea. I believe I introduced the first use of that style, finding nothing better as an alternative. ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] Hotpatch construction and __LINE__ (was: [RFC PATCH v3.1] xSplice design.)
Hi, Another high-level point to think about is how we want to handle inlined __LINE__ references. This problem is related to hotpatch construction and potentially has influence on the design of the hotpatching infrastructure in Xen. Let me try to explain the problem: We have file1.c with functions f1 and f2 (in that order). f2 contains a BUG() (or WARN()) macro and at that point embeds the source line number into the generated code for f2. Now we want to hotpatch f1 and the hotpatch source-code patch adds 2 lines to f1 and as a consequence shifts out f2 by two lines. The newly constructed file1.o will now contain differences in both binary functions f1 (because we actually changed it with the applied patch) and f2 (because the contained BUG macro embeds the new line number). Without additional information, an algorithm comparing file1.o before and after hotpatch application will determine both functions to be changed and will have to include both into the binary hotpatch. Options: 1. Transform source code patches for hotpatches to be line-neutral for each chunk. This can be done in almost all cases with either reformatting of the source code or by introducing artificial preprocessor #line n directives to adjust for the introduced differences. This approach is low-tech and simple. Potentially generated backtraces and existing debug information refers to the original build and does not reflect hotpatching state except for actually hotpatched functions but should be mostly correct. 2. Ignoring the problem and living with artificially large hotpatches that unnecessarily patch many functions. This approach might lead to some very large hotpatches depending on content of specific source file. It may also trigger pulling in functions into the hotpatch that cannot reasonable be hotpatched due to limitations of a hotpatching framework (init-sections, parts of the hotpatching framework itself, ...) and may thereby prevent us from patching a specific problem. The decision between 1. and 2. can be made on a patch--by-patch basis. 3. Introducing an indirection table for storing line numbers and treating that specially for binary diffing. I believe Linux follows this approach, but the details escape me ATM. We might either use this indirection table for runtime use and patch that with each hotpatch (similarly to exception tables) or we might purely use it when building hotpatches to ignore functions that only differ at exactly the location where a line-number is embedded. Similar considerations are true to a lesser extent for __FILE__, but I would argue that file renaming should be done outside of hotpatches. Martin On 27.07.2015 21:20, Konrad Rzeszutek Wilk wrote: Hey! Since v3 [http://lists.xen.org/archives/html/xen-devel/2015-07/msg00990.html] - Nailed down the comments, ingested them in. - Wrote and tested some code. RFC v2 [http://lists.xen.org/archives/html/xen-devel/2015-05/msg02142.html] - Ingested every review comment in. The patches for the code are a shell - there is no patching done at all and it is very much just to test out the design and hypercalls. The hard parts are yet to come :-) At the Seattle LinuxCon/Xen Summit I will be presenting about xSplice and referring to this URL. There is also an slot for brainstorming to talk in details about things we disagree - and there is ample time to talk during dinner. Martin who has been heavily reviewing the design will be there and I hope other folks will be there as well to shape the design and how we want this to work. The big outstanding issues are how we want to handle preemption. That is the problem of making an hypercall and waiting for the hypervisor to do its job (and the VCPU is blocked). In the past some XSAs have come out to resolve this and I would very much like this to have it addressed at start. I think the other issues that have been raised should also be discussed naturally, but the above is crucial (at least for me). I've attached the patches on how I thought the preemption part could be solved by having an 'worker' in hypervisor acting on the commands - and we just poll on the status to see what the hypervisor has done so far. Lastly, I also plan to add an Wiki to outline the dependency implementation parts that so far bubbled up - I figured Wiki would be better as some folks could put their name behind it. Now please excuse the roughness of the patch and this giant one huge having everything in it. It ought to be split in three at least: hypervisor, toolstacks (libxc and libxl) - that is to be done later. docs/misc/xsplice.h | 80 +++ docs/misc/xsplice.markdown| 1230 + docs/misc/xsplice_test.c | 78 +++ tools/libxc/include/xenctrl.h | 16 + tools/libxc/xc_misc.c | 183 ++ tools/libxc/xc_private.c |3
Re: [Xen-devel] [RFC PATCH v3.1 2/2] xsplice: Add hook for build_id
On 05/08/15 09:50, Martin Pohlack wrote: On 27.07.2015 21:20, Konrad Rzeszutek Wilk wrote: Signed-off-by: Konrad Rzeszutek Wilk konrad.w...@oracle.com --- tools/libxc/xc_private.c | 3 +++ tools/misc/xen-xsplice.c | 25 + xen/common/kernel.c | 11 +++ xen/common/version.c | 5 + xen/include/public/version.h | 4 xen/include/xen/compile.h.in | 1 + xen/include/xen/version.h| 1 + 7 files changed, 50 insertions(+) diff --git a/tools/libxc/xc_private.c b/tools/libxc/xc_private.c index 2ffebd9..7c039ca 100644 --- a/tools/libxc/xc_private.c +++ b/tools/libxc/xc_private.c @@ -713,6 +713,9 @@ int xc_version(xc_interface *xch, int cmd, void *arg) case XENVER_commandline: sz = sizeof(xen_commandline_t); break; +case XENVER_build_id: +sz = sizeof(xen_build_id_t); +break; default: ERROR(xc_version: unknown command %d\n, cmd); return -EINVAL; diff --git a/tools/misc/xen-xsplice.c b/tools/misc/xen-xsplice.c index 7cf9879..dd8266c 100644 --- a/tools/misc/xen-xsplice.c +++ b/tools/misc/xen-xsplice.c @@ -17,6 +17,7 @@ void show_help(void) id An unique name of payload. Up to 40 characters.\n Commands:\n help display this help\n + build-id display build-id of hypervisor.\n upload id file upload file cpuid with id name\n list list payloads uploaded.\n apply id apply id patch.\n @@ -306,12 +307,36 @@ int action_func(int argc, char *argv[], unsigned int idx) return rc; } + +static int build_id_func(int argc, char *argv[]) +{ +xen_build_id_t build_id; + +if ( argc ) +{ +show_help(); +return -1; +} + +memset(build_id, 0, sizeof(*build_id)); + +if ( xc_version(xch, XENVER_build_id, build_id) 0 ) +{ +printf(Failed to get build_id: %d(%s)\n, errno, strerror(errno)); +return -1; +} + +printf(%s\n, build_id); +return 0; +} + struct { const char *name; int (*function)(int argc, char *argv[]); } main_options[] = { { help, help_func }, { list, list_func }, +{ build-id, build_id_func }, { upload, upload_func }, }; diff --git a/xen/common/kernel.c b/xen/common/kernel.c index 6a3196a..e9d41b6 100644 --- a/xen/common/kernel.c +++ b/xen/common/kernel.c @@ -357,6 +357,17 @@ DO(xen_version)(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg) if ( copy_to_guest(arg, saved_cmdline, ARRAY_SIZE(saved_cmdline)) ) return -EFAULT; return 0; + +case XENVER_build_id: +{ +xen_build_id_t build_id; + +memset(build_id, 0, sizeof(build_id)); +safe_strcpy(build_id, xen_build_id()); You seem to want to store and transfer the build_id as a string. Any reason why we don't directly expose the build_id embedded by the linker in binary format? +if ( copy_to_guest(arg, build_id, ARRAY_SIZE(build_id)) ) +return -EFAULT; +return 0; +} We should not expose the build_id to normal guests, but only to Dom0. A build_id uniquely identifies a specific build and I don't see how that information would be required from DomU. It might actually help an attacker to build his return-oriented programming exploit against a specific build. The normal version numbers should be enough to know about capabilities and API. It will need its own XSM hook, but need not be strictly limited to just dom0. } return -ENOSYS; diff --git a/xen/common/version.c b/xen/common/version.c index b152e27..5c3dbb0 100644 --- a/xen/common/version.c +++ b/xen/common/version.c @@ -55,3 +55,8 @@ const char *xen_banner(void) { return XEN_BANNER; } + +const char *xen_build_id(void) +{ +return XEN_BUILD_ID; +} diff --git a/xen/include/public/version.h b/xen/include/public/version.h index 44f26b0..c863393 100644 --- a/xen/include/public/version.h +++ b/xen/include/public/version.h @@ -83,6 +83,10 @@ typedef struct xen_feature_info xen_feature_info_t; #define XENVER_commandline 9 typedef char xen_commandline_t[1024]; +#define XENVER_build_id 10 +typedef char xen_build_id_t[1024]; +#define XEN_BUILD_ID_LEN (sizeof(xen_build_id_t)) + #endif /* __XEN_PUBLIC_VERSION_H__ */ /* diff --git a/xen/include/xen/compile.h.in b/xen/include/xen/compile.h.in index 440ecb2..939685e 100644 --- a/xen/include/xen/compile.h.in +++ b/xen/include/xen/compile.h.in @@ -10,4 +10,5 @@ #define XEN_EXTRAVERSION@@extraversion@@ #define XEN_CHANGESET @@changeset@@ +#define XEN_BUILD_ID@@changeset@@ That leads to a chicken and egg problem when embedding a real build_id. Some linker script magic seems to be required. I will try to refine
Re: [Xen-devel] [PATCH v3 1/1] x86: allow to enable/disable modify_ldt at run time
On Wed, Aug 05, 2015 at 10:26:16AM +0200, Ingo Molnar wrote: +modify_ldt: (X86 only) + +Enables (1) or disables (0) the modify_ldt syscall. Modifying the LDT +(Local Descriptor Table) may be needed to run a 16-bit or segmented code s/run a/run good catch, thanks. +such as Dosemu or Wine. This is done via a system call which is not needed s/Dosemu/DOSEMU I didn't know. Fixed. +to run portable applications, and which can sometimes be abused to exploit +some weaknesses of the architecture, opening new vulnerabilities. So that's pretty vague IMHO, and a bit FUD-ish in character. How about: ... , and which system call exposes complex, rarely used legacy hardware features and semantics that had suffered vulnerabilities in the past. OK fine for me, fixed. + if (!sysctl_modify_ldt) { + printk_ratelimited(KERN_INFO + Denied a call to modify_ldt() from %s[%d] (uid: %d). +Adjust the modify_ldt sysctl if this was not an Would it really be so difficult to write this as: Set sys.kernel.modify_ldt = 1 in /etc/sysctl.conf if this was not an exploit attempt. It's just a matter of taste. Normally I consider the kernel distro-agnostic so I don't like to suggest one way to adjust sysctls nor to reference config files. Here we're in a case where only standard distro users may hit the issue, and users of embedded distros will not face this message or will easily translate it into their respective configuration scheme. So OK for this one. 99% of the users seeing this message will see it right after an app of theirs ended up not working. Let's not add to the annoyance factor! That's exactly why I wrote the message in the first place! +exploit attempt.\n, + current-comm, task_pid_nr(current), + from_kuid_munged(current_user_ns(), current_uid())); Also generally please don't break message lines in the source code while they are a single line in the syslog, to make it easier to grep for and to expose kernel hackers to the form of message they are emitting. Ignore checkpatch. I'm fine with this rule as well, it's only in the kenrel that I tend to care about the 80-col limit, in my own code I easily ignore it for the same reason. @@ -960,6 +963,17 @@ static struct ctl_table kern_table[] = { .mode = 0644, .proc_handler = proc_dointvec, }, +#ifdef CONFIG_MODIFY_LDT_SYSCALL + { + .procname = modify_ldt, + .data = sysctl_modify_ldt, + .maxlen = sizeof(int), + .mode = 0644, + .proc_handler = proc_dointvec_minmax, + .extra1 = zero, + .extra2 = one, + }, +#endif So I'd actually make the permissions 0600: to make it a tiny bit harder for exploits to silently query the current value to figure out whether they can safely attempt the syscall or not ... That's a good point. If we later make other syscalls configurable, it might be different because some users might want to contact their admin to ask for a specific one. But here, there's usually no admin so I'm fine with hardening it. (Sadly /etc/sysctl.conf is world-readable on most distros.) Yes, just like most executables are readable while not strictly needed, especially setuid ones which allow the code to be studied in place or easily duplicated to script some exploits. But we could discuss hours about basic system hardening! I've updated the patch with your suggestions. Thanks, Willy From 8521f170515dfc0f390c396100140504c8dfbcfc Mon Sep 17 00:00:00 2001 From: Willy Tarreau w...@1wt.eu Date: Sat, 25 Jul 2015 12:18:33 +0200 Subject: [PATCH] x86/ldt: allow to disable modify_ldt at runtime For distros who prefer not to take the risk of completely disabling the modify_ldt syscall using CONFIG_MODIFY_LDT_SYSCALL, this patch adds a sysctl to enable or/disable it at runtime, and proposes to disable it by default. This can be a safe alternative. A message is logged if an attempt was stopped so that it's easy to spot if/when it is needed. Cc: Andy Lutomirski l...@kernel.org Cc: Kees Cook keesc...@chromium.org Cc: Ingo Molnar mi...@kernel.org Signed-off-by: Willy Tarreau w...@1wt.eu --- Documentation/sysctl/kernel.txt | 16 arch/x86/Kconfig| 17 + arch/x86/kernel/ldt.c | 15 +++ kernel/sysctl.c | 14 ++ 4 files changed, 62 insertions(+) diff --git a/Documentation/sysctl/kernel.txt b/Documentation/sysctl/kernel.txt index 6fccb69..7dcdebd 100644 --- a/Documentation/sysctl/kernel.txt +++ b/Documentation/sysctl/kernel.txt @@ -41,6 +41,7 @@ show up in /proc/sys/kernel: - kptr_restrict - kstack_depth_to_print [ X86 only ] - l2cr[ PPC only ] +- modify_ldt [ X86
Re: [Xen-devel] [PATCH v3 1/1] x86: allow to enable/disable modify_ldt at run time
* Willy Tarreau w...@1wt.eu wrote: + if (!sysctl_modify_ldt) { + printk_ratelimited(KERN_INFO + Denied a call to modify_ldt() from %s[%d] (uid: %d). + Adjust the modify_ldt sysctl if this was not an Would it really be so difficult to write this as: Set sys.kernel.modify_ldt = 1 in /etc/sysctl.conf if this was not an exploit attempt. It's just a matter of taste. Normally I consider the kernel distro-agnostic so I don't like to suggest one way to adjust sysctls nor to reference config files. Here we're in a case where only standard distro users may hit the issue, and users of embedded distros will not face this message or will easily translate it into their respective configuration scheme. So OK for this one. So it's a side issue, but it's not a matter of taste at all: why should we end up hurting 99% of Linux users (that use regular distros), just to make it slightly more 'correct' for the weird 1% 'embedded distro' case that decided to put sysctl configuration elsewhere? Users of 'embedded' distros won't normally see kernel messages, and even if they do, the message is crystal clear even to them... Such messages should be as helpful to the regular case as possible. The weird cases will be OK too: and it does not help to make a message unhelpful for _both_ cases. Thanks, Ingo ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH v3] VT-d: add iommu=igfx option to workaround graphics issues
When using Linux = 3.19 (commit 47591df) as dom0 on some Intel Ironlake devices, It is possible to encounter graphics issues that make screen unreadable or crash the system. It was reported in freedesktop bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90037 As we still cannot find a proper fix for this problem, this patch adds iommu=igfx option to control whether Intel graphics IOMMU is enabled. Running Xen with iommu=no-igfx is similar to running Linux with intel_iommu=igfx_off, which disables IOMMU for Intel GPU. This can be used by users to manually workaround the problem before a fix is available for i915 driver. Signed-off-by: Ting-Wei Lan lant...@gmail.com Reviewed-by: Andrew Cooper andrew.coop...@citrix.com --- Changed since v2: * Make no-igfx available for all Intel devices, not just Calpella/Ironlake Changed since v1: * Replace igfx_off with igfx docs/misc/xen-command-line.markdown | 11 ++- xen/drivers/passthrough/iommu.c | 3 +++ xen/drivers/passthrough/vtd/quirks.c | 3 +++ xen/include/xen/iommu.h | 2 +- 4 files changed, 17 insertions(+), 2 deletions(-) diff --git a/docs/misc/xen-command-line.markdown b/docs/misc/xen-command-line.markdown index 13f03ad..486e53b 100644 --- a/docs/misc/xen-command-line.markdown +++ b/docs/misc/xen-command-line.markdown @@ -793,7 +793,7 @@ debug hypervisor only). Default: `new` unless directed-EOI is supported ### iommu - `= List of [ boolean | force | required | intremap | qinval | snoop | sharept | dom0-passthrough | dom0-strict | amd-iommu-perdev-intremap | workaround_bios_bug | verbose | debug ]` + `= List of [ boolean | force | required | intremap | qinval | snoop | sharept | dom0-passthrough | dom0-strict | amd-iommu-perdev-intremap | workaround_bios_bug | igfx | verbose | debug ]` Sub-options: @@ -867,6 +867,15 @@ debug hypervisor only). ignored (normally IOMMU setup fails if any of the devices listed by a DRHD entry aren't PCI discoverable). + `igfx` (VT-d) + + Default: `true` + + Enable IOMMU for Intel graphics devices. The intended usage of this option + is `no-igfx`, which is silimar to Linux `intel_iommu=igfx_off` option used + to workaround graphics issues. If adding `no-igfx` fixes anything, you + should file a bug reporting the problem. + `verbose` Default: `false` diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c index cc12735..966cc66 100644 --- a/xen/drivers/passthrough/iommu.c +++ b/xen/drivers/passthrough/iommu.c @@ -47,6 +47,7 @@ bool_t __read_mostly force_iommu; bool_t __hwdom_initdata iommu_dom0_strict; bool_t __read_mostly iommu_verbose; bool_t __read_mostly iommu_workaround_bios_bug; +bool_t __read_mostly iommu_igfx = 1; bool_t __read_mostly iommu_passthrough; bool_t __read_mostly iommu_snoop = 1; bool_t __read_mostly iommu_qinval = 1; @@ -87,6 +88,8 @@ static void __init parse_iommu_param(char *s) force_iommu = val; else if ( !strcmp(s, workaround_bios_bug) ) iommu_workaround_bios_bug = val; +else if ( !strcmp(s, igfx) ) +iommu_igfx = val; else if ( !strcmp(s, verbose) ) iommu_verbose = val; else if ( !strcmp(s, snoop) ) diff --git a/xen/drivers/passthrough/vtd/quirks.c b/xen/drivers/passthrough/vtd/quirks.c index 69d29ab..92cff1d 100644 --- a/xen/drivers/passthrough/vtd/quirks.c +++ b/xen/drivers/passthrough/vtd/quirks.c @@ -72,6 +72,9 @@ int is_igd_vt_enabled_quirk(void) { u16 ggc; +if ( !iommu_igfx ) +return 0; + if ( !IS_ILK(ioh_id) ) return 1; diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h index 8eb764a..29eed51 100644 --- a/xen/include/xen/iommu.h +++ b/xen/include/xen/iommu.h @@ -29,7 +29,7 @@ extern bool_t iommu_enable, iommu_enabled; extern bool_t force_iommu, iommu_verbose; -extern bool_t iommu_workaround_bios_bug, iommu_passthrough; +extern bool_t iommu_workaround_bios_bug, iommu_igfx, iommu_passthrough; extern bool_t iommu_snoop, iommu_qinval, iommu_intremap; extern bool_t iommu_hap_pt_share; extern bool_t iommu_debug; -- 2.4.3 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [xen 4.6 retrospective] [urgent] rename freeze window and make release branch as soon as possible after RC1
This is one item of feedback, which I believe is a quick win for us. This is one piece of feedback from a list of items that have during the last few weeks been raised with me personally, either during face-2-face conversations in a private e-mail thread. See http://lists.xenproject.org/archives/html/xen-devel/2015-08/msg00173.html http://lists.xenproject.org/archives/html/xen-devel/2015-08/msg00173.html for information on the retrospective = Issue / Observation = The name freeze window/period - aka the time period from when we feature freeze until we branch master and/or make the release leads some contributors to mistakenly assume that development for the next release stops. I saw a few mails on xen-devel@ recently, pointing out to contributors that development does not stop during freeze. Chatting to Ian Campbell, he mentioned that he replied to several different people who said they were waiting for the tree to reopen. Maybe choosing a better name will help. In addition, we used to branch master a lot earlier I believe up to Xen 4.1 (around RC2 or RC3). At some point we started branching master when we release. I do not know why we changed, but it seems we did not have any issues branching master around RC2 or RC3. Branching earlier, would mean that contributors do not have to carry patches for as long as they do now, and the risk of having to rebase patches several times is lower. = Possible Solution / Improvement = Change Terminology: * Keep Feature Freeze as is * Rename Freeze Window/Period to Stabilisation Window/Period or something similar * Make clear that Stabilisation Window/Period != no development in the Development Update x.y mail template Release Process improvements: * Reopen the tree development tree as soon as possible after RC1 (I will let you guys figure out the best RC - let's call it RCx) * In other words, create the release branch at RCx There could be some optimisations and additional things that may make sense: * Encourage maintainers/committers to refrain from committing big refactoring changes during RCx and the final RC for a release to avoid complications if we want to cherry port bug fixes, etc. from master to the release branch * Committers should be permitted to apply backports to the release branch until the actual release rather than putting all the burden on the stable maintainer(s)___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] Design doc of adding ACPI support for arm64 on Xen
On 2015/8/4 22:37, Stefano Stabellini wrote: On Tue, 4 Aug 2015, Shannon Zhao wrote: This document is going to explain the design details of Xen booting with ACPI on ARM. Maybe parts of it may not be appropriate. Any comments are welcome. Good start! To Xen itself booting with ACPI, this is similar to Linux kernel except that Xen doesn't parse DSDT table. So I'll skip this part and focus on how Xen prepares ACPI tables for DOM0 and how Xen passes them to DOM0. 1)copy and change some EFI and ACPI tables. a) Copy EFI_SYSTEM_TABLE and change the value of FirmwareVendor, VendorGuid, VendorTable, ConfigurationTable. These changes are not very special and it just assign values to these members. b) Create EFI_MEMORY_DESCRIPTOR table. This will add memory start and size information of DOM0. And DOM0 will get the memory information through this EFI table. c) Copy FADT table. Change the value of arm_boot_flags to enable PSCI and HVC. Let the hypervisor_id be XenVMM in order to tell DOM0 that it runs on Xen hypervisor, so DOM0 can call hypercall to get some informations for booting necessity, such as grant tab start address and size. Change header revison, length and checksum as well. d) Copy GTDT table. Set non_secure_el2_interrupt and non_secure_el2_flags to 0 to mask EL2 timer for DOM0. e) Copy MADT table. According to the value of dom0_max_vcpus to change the number GICC entries. f) Create STAO table. This table is a new added one that's used to define a list of ACPI namespace names that are to be ignored by the OSPM in DOM0. Currently we use it to tell OSPM should ignore UART defined in SPCR table. g) Copy XSDT table. Add a new table entry for STAO and change other table's entries. h) Change the value of xsdt_physical_address in RSDP table. i) The reset of tables are not copied or changed. They are reused including DSDT, SPCR. OK so far All these tables will be copied or mapped to guest memory. Are they copied or mapped? Also I think we need to recalculate the md5sum? 2)Create minimal DT to pass required informations to DOM0 The minimal DT mainly passes DOM0 bootargs, address and size of initrd (if available), address and size of uefi system table, address and size of uefi memory table, uefi-mmap-desc-size and uefi-mmap-desc-ver. I think we need to specify which Linux entry point is called, that I think will be the proper non-EFI kernel entry point, which requires MMU off (see Documentation/efi-stub.txt in linux). Also it would be better to write the full bindings of the generated minimal DT, see http://marc.info/?l=linux-kernelm=142362266626403w=2 and Documentation/arm/uefi.txt in linux. An example of the minimal DT: / { #address-cells = 2; #size-cells = 1; chosen { bootargs = kernel=Image console=hvc0 earlycon=pl011,0x1c09 root=/dev/vda2 rw rootfstype=ext4 init=/bin/sh acpi=force; linux,initrd-start = 0x; linux,initrd-end = 0x; linux,uefi-system-table = 0x; linux,uefi-mmap-start = 0x; linux,uefi-mmap-size = 0x; linux,uefi-mmap-desc-size = 0x; linux,uefi-mmap-desc-ver = 0x; }; }; 3)DOM0 how to get grant table and event channel irq informations As said above, we assign the hypervisor_id be XenVMM to tell DOM0 that it runs on Xen hypervisor. Then save the start address and size of grant table in domain-grant_table-start_addr and domain-grant_table-size. DOM0 can call a new hypercall GNTTABOP_get_start_addr to get these info. Same to event channel, we've already save interrupt number in d-arch.evtchn_irq, so DOM0 can call a new hypercall EVTCHNOP_get_irq to get the irq. It would be nice to go down into more details and write the parameters of the hypercalls in the doc as they will become a newly supported ABI. The parameters of GNTTABOP_get_start_addr is like below: struct gnttab_get_start_addr { /* IN parameters */ domid_t dom; uint16_t pad; /* OUT parameters */ uint64_t start_addr; uint64_t size; }; The parameters of EVTCHNOP_get_irq is like below: struct evtchn_get_irq { /* IN parameters. */ domid_t dom; uint16_t pad; /* OUT parameters. */ uint32_t irq; }; The evtchnop would need to be called something like EVTCHNOP_get_notification_irq and would need to be ARM specific (on x86 things are different). 4)How to map MMIO regions a)Current implementation is mapping MMIO regions in Dom0 on demand when trapping in Xen with a data abort. I think this approach is prone to failures. A driver could program a device for DMA involving regions not yet mapped. As a consequence the DMA operation would fail because the SMMU would stop the transaction.
Re: [Xen-devel] [PATCH for-4.6 v3 2/6] docs/libxl: Re-specify XENSTORE_DATA as EMULATOR_XENSTORE_DATA
On Tue, 2015-08-04 at 18:16 +0100, Andrew Cooper wrote: The legacy toolstack record as implemented in libxl turns out not to be 32/64bit safe. As migration v2 has not shipped yet, take this opportunity to adjust the specification and fix the incompatibility. Libxl shall loose all knowledge of the old toolstack blob and use this EMULATOR_XENSTORE_DATA record instead. Compatibility shall be handled by the legacy conversion script. Signed-off-by: Andrew Cooper andrew.coop...@citrix.com --- Acked-by: Ian Campbell ian.campb...@citrix.com ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH for-4.6 v3 3/6] tools/libxl: Save and restore EMULATOR_XENSTORE_DATA content
On Tue, 2015-08-04 at 18:16 +0100, Andrew Cooper wrote: The new EMULATOR_XENSTORE_DATA content is a sequence of NUL terminated key/value strings, with the key relative to the device model's xenstore tree. A sample might look like (as decoded by verify-stream-v2): Emulator Xenstore Data (Qemu Upstream, idx 0) 'physmap/1f0/start_addr' = 'f000' 'physmap/1f0/size' = '80' 'physmap/1f0/name' = 'vga.vram' This patch introduces libxl helpers to save and restore this new format, which reimplement the existing libxl__toolstack_{save,restore}() logic. Signed-off-by: Andrew Cooper andrew.coop...@citrix.com --- Acked-by: Ian Campbell ian.campb...@citrix.com ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH for-4.6 v3 2/6] docs/libxl: Re-specify XENSTORE_DATA as EMULATOR_XENSTORE_DATA
On Tue, Aug 04, 2015 at 06:16:32PM +0100, Andrew Cooper wrote: The legacy toolstack record as implemented in libxl turns out not to be 32/64bit safe. As migration v2 has not shipped yet, take this opportunity to adjust the specification and fix the incompatibility. Libxl shall loose all knowledge of the old toolstack blob and use this EMULATOR_XENSTORE_DATA record instead. Compatibility shall be handled by the legacy conversion script. Signed-off-by: Andrew Cooper andrew.coop...@citrix.com --- CC: Ian Campbell ian.campb...@citrix.com CC: Ian Jackson ian.jack...@eu.citrix.com CC: Wei Liu wei.l...@citrix.com Acked-by: Wei Liu wei.l...@citrix.com ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH for-4.6 v3 3/6] tools/libxl: Save and restore EMULATOR_XENSTORE_DATA content
On Tue, Aug 04, 2015 at 06:16:33PM +0100, Andrew Cooper wrote: The new EMULATOR_XENSTORE_DATA content is a sequence of NUL terminated key/value strings, with the key relative to the device model's xenstore tree. A sample might look like (as decoded by verify-stream-v2): Emulator Xenstore Data (Qemu Upstream, idx 0) 'physmap/1f0/start_addr' = 'f000' 'physmap/1f0/size' = '80' 'physmap/1f0/name' = 'vga.vram' This patch introduces libxl helpers to save and restore this new format, which reimplement the existing libxl__toolstack_{save,restore}() logic. Signed-off-by: Andrew Cooper andrew.coop...@citrix.com --- CC: Ian Campbell ian.campb...@citrix.com CC: Ian Jackson ian.jack...@eu.citrix.com CC: Wei Liu wei.l...@citrix.com Acked-by: Wei Liu wei.l...@citrix.com v2: v3 :-) ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 2/8] xen: Make clear that swiotlb and biomerge are dealing with DMA address
On Tue, 4 Aug 2015, Julien Grall wrote: The swiotlb is required when programming a DMA address on ARM when a device is not protected by an IOMMU. In this case, the DMA address should always be equal to the machine address. For DOM0 memory, Xen ensure it by have an identity mapping between the guest address and host address. However, when mapping a foreign grant reference, the 1:1 model doesn't work. For ARM guest, most of the callers of pfn_to_mfn expects to get a GFN (Guest Frame Number), i.e a PFN (Page Frame Number) from the Linux point of view given that all ARM guest are auto-translated. Even though the name pfn_to_mfn is misleading, we need to ensure that those caller get a GFN and not by mistake a MFN. In pratical, I haven't seen error related to this but we should fix it for the sake of correctness. In order to fix the implementation of pfn_to_mfn on ARM in a follow-up patch, we have to introduce new helpers to return the DMA from a PFN and the invert. On x86, the new helpers will be an alias of pfn_to_mfn and mfn_to_pfn. The helpers will be used in swiotlb and xen_biovec_phys_mergeable. This is necessary in the latter because we have to ensure that the biovec code will not try to merge a biovec using foreign page and another using Linux memory. Lastly, the helper mfn_to_local_pfn has been renamed to dnf_to_local_pfn ^ please update given that the only usage was in swiotlb. Signed-off-by: Julien Grall julien.gr...@citrix.com Cc: Stefano Stabellini stefano.stabell...@eu.citrix.com Cc: Russell King li...@arm.linux.org.uk Cc: Konrad Rzeszutek Wilk konrad.w...@oracle.com Cc: Boris Ostrovsky boris.ostrov...@oracle.com Cc: David Vrabel david.vra...@citrix.com Cc: Thomas Gleixner t...@linutronix.de Cc: Ingo Molnar mi...@redhat.com Cc: H. Peter Anvin h...@zytor.com Cc: x...@kernel.org Cc: linux-arm-ker...@lists.infradead.org --- Changes in v2: - Use bfn (Bus Frame Number) rather than dfn to match the proposed terminology for pv-iommu hypercall. --- arch/arm/include/asm/xen/page.h | 23 +-- arch/arm/xen/mm.c | 4 ++-- arch/x86/include/asm/xen/page.h | 8 ++-- drivers/xen/biomerge.c | 6 +++--- drivers/xen/swiotlb-xen.c | 16 5 files changed, 40 insertions(+), 17 deletions(-) diff --git a/arch/arm/include/asm/xen/page.h b/arch/arm/include/asm/xen/page.h index 98b1084..bc5e77c 100644 --- a/arch/arm/include/asm/xen/page.h +++ b/arch/arm/include/asm/xen/page.h @@ -52,7 +52,26 @@ static inline unsigned long mfn_to_pfn(unsigned long mfn) return mfn; } -#define mfn_to_local_pfn(mfn) mfn_to_pfn(mfn) +/* Pseudo-physical - BUS conversion */ +static inline unsigned long pfn_to_bfn(unsigned long pfn) +{ + unsigned long mfn; + + if (phys_to_mach.rb_node != NULL) { + mfn = __pfn_to_mfn(pfn); + if (mfn != INVALID_P2M_ENTRY) + return mfn; + } + + return pfn; +} + +static inline unsigned long bfn_to_pfn(unsigned long bfn) +{ + return bfn; +} + +#define bfn_to_local_pfn(bfn)bfn_to_pfn(bfn) /* VIRT - MACHINE conversion */ #define virt_to_mfn(v) (pfn_to_mfn(virt_to_pfn(v))) @@ -96,7 +115,7 @@ static inline bool set_phys_to_machine(unsigned long pfn, unsigned long mfn) bool xen_arch_need_swiotlb(struct device *dev, unsigned long pfn, -unsigned long mfn); +unsigned long dfn); unsigned long xen_get_swiotlb_free_pages(unsigned int order); You missed a bunch of dfn-bfn renamings. Aside from those and the commit message error: Reviewed-by: Stefano Stabellini stefano.stabell...@eu.citrix.com #endif /* _ASM_ARM_XEN_PAGE_H */ diff --git a/arch/arm/xen/mm.c b/arch/arm/xen/mm.c index 03e75fe..12bde72 100644 --- a/arch/arm/xen/mm.c +++ b/arch/arm/xen/mm.c @@ -139,9 +139,9 @@ void __xen_dma_sync_single_for_device(struct device *hwdev, bool xen_arch_need_swiotlb(struct device *dev, unsigned long pfn, -unsigned long mfn) +unsigned long dfn) { - return (!hypercall_cflush (pfn != mfn) !is_device_dma_coherent(dev)); + return (!hypercall_cflush (pfn != dfn) !is_device_dma_coherent(dev)); } int xen_create_contiguous_region(phys_addr_t pstart, unsigned int order, diff --git a/arch/x86/include/asm/xen/page.h b/arch/x86/include/asm/xen/page.h index c44a5d5..8ba04b8 100644 --- a/arch/x86/include/asm/xen/page.h +++ b/arch/x86/include/asm/xen/page.h @@ -178,6 +178,10 @@ static inline xpaddr_t machine_to_phys(xmaddr_t machine) return XPADDR(PFN_PHYS(mfn_to_pfn(PFN_DOWN(machine.maddr))) | offset); } +/* Pseudo-physical - Bus conversion */ +#define pfn_to_bfn(pfn)
Re: [Xen-devel] [PATCH v2 3/8] arm/xen: implement correctly pfn_to_mfn
On Tue, 4 Aug 2015, Julien Grall wrote: After the commit introducing convertion between DMA and guest address, ^ addresses all the callers of pfn_to_mfn are expecting to get a GFN (Guest Frame Number). On ARM, all the guests are auto-translated so the GFN is equal to the Linux PFN (Pseudo-physical Frame Number). The current implementation may return an MFN if the caller is passing a PFN associated to a mapped foreign grant. In pratical, I haven't seen ^ practice the problem on running guest but we should fix it for the sake of correctness. Correct the implementation by always returning the pfn passed in parameter. A follow-up patch will take care to rename pfn_to_mfn to a suitable name. Signed-off-by: Julien Grall julien.gr...@citrix.com Cc: Stefano Stabellini stefano.stabell...@eu.citrix.com Cc: Russell King li...@arm.linux.org.uk Cc: linux-arm-ker...@lists.infradead.org Reviewed-by: Stefano Stabellini stefano.stabell...@eu.citrix.com arch/arm/include/asm/xen/page.h | 8 1 file changed, 8 deletions(-) diff --git a/arch/arm/include/asm/xen/page.h b/arch/arm/include/asm/xen/page.h index bc5e77c..087d86e 100644 --- a/arch/arm/include/asm/xen/page.h +++ b/arch/arm/include/asm/xen/page.h @@ -36,14 +36,6 @@ extern struct rb_root phys_to_mach; static inline unsigned long pfn_to_mfn(unsigned long pfn) { - unsigned long mfn; - - if (phys_to_mach.rb_node != NULL) { - mfn = __pfn_to_mfn(pfn); - if (mfn != INVALID_P2M_ENTRY) - return mfn; - } - return pfn; } -- 2.1.4 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v3 07/32] xen/x86: fix arch_set_info_guest for HVM guests
El 04/08/15 a les 20.08, Andrew Cooper ha escrit: On 03/08/15 18:31, Roger Pau Monné wrote: Therefore, I am leaning slightly towards the specify the internals side of things, which removes some complexity from the Xen side of the hypercall. I've updated the proposed structure, so now it looks like: (I still need to figure out how to order the bits in the access rights (_ar) fields.) All struct's need a flags register. I was unsure about adding the {e/r}flags register, I will add it in new versions. struct vcpu_hvm_x86_16 { uint16_t ax; uint16_t cx; uint16_t dx; uint16_t bx; uint16_t sp; uint16_t bp; uint16_t si; uint16_t di; uint16_t ip; uint32_t cr[8]; Definitely no need for anything other than cr0 and 4 in 16 bit mode. Yes. Would you rather prefer to spell out the exact control registers that are going to be used instead of using an array? For 16bit mode this is going to be CR0/CR4, for 32bit or long mode mode CR0/CR3/CR4. uint32_t cs_base; uint32_t ds_base; uint32_t ss_base; uint32_t cs_limit; uint32_t ds_limit; uint32_t ss_limit; uint16_t cs_ar; uint16_t ds_ar; uint16_t ss_ar; }; struct vcpu_hvm_x86_32 { uint32_t eax; uint32_t ecx; uint32_t edx; uint32_t ebx; uint32_t esp; uint32_t ebp; uint32_t esi; uint32_t edi; uint32_t eip; uint32_t cr[8]; Don't need cr's 5-8. uint64_t efer; uint32_t cs_base; uint32_t ds_base; uint32_t ss_base; uint32_t cs_limit; uint32_t ds_limit; uint32_t ss_limit; uint16_t cs_ar; uint16_t ds_ar; uint16_t ss_ar; You need selector entries for each segment as well. Really? What's the point in having the selector if we don't have a GDT, and we allow loading the cached part, which is the relevant one. }; struct vcpu_hvm_x86_64 { uint64_t rax; uint64_t rcx; uint64_t rdx; uint64_t rbx; uint64_t rsp; uint64_t rbp; uint64_t rsi; uint64_t rdi; uint64_t r8; uint64_t r9; uint64_t r10; uint64_t r11; uint64_t r12; uint64_t r13; uint64_t r14; uint64_t r15; uint64_t rip; uint64_t cr[8]; uint64_t efer; What has happened to the segment information? It cannot be omitted completely, even in 64bit. I had in mind to set them to the values required for long mode: base = 0 limit = 0x and the attributes field for CS to: ar = 0x29b (L bit set) And for SS/DS: ar = 0x093 But maybe it makes sense to allow setting them in case someone wants to start in compatibility mode. }; typedef enum { VCPU_HVM_MODE_16B, /* 16bit fields of the structure will be used. */ VCPU_HVM_MODE_32B, /* 32bit fields of the structure will be used. */ VCPU_HVM_MODE_64B, /* 64bit fields of the structure will be used. */ } vcpu_hvm_mode_t; struct vcpu_hvm_context { vcpu_hvm_mode_t mode; The width of an enum is implementation defined, which makes them unsuitable in the public interface. Right, I'm going to change it to a uint32_t and the modes to defines. /* CPU registers. */ union { struct vcpu_hvm_x86_16 x86_16; struct vcpu_hvm_x86_32 x86_32; struct vcpu_hvm_x86_64 x86_64; }; We require C89 compatibility for the public interface, so no anonymous unions. Ack, thanks for the review. Roger. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] Reminder: Urgent - Action Required - Xen Dev Summit Bof's Developer Meetings and WG Meetings on Aug 18 19t (print and food order deadline Friday 7th)
Hi folks, this email is for people planning to attend the Xen Dev Summit in Seattle (Aug 17 18) and the Developer Meeting on the 19th. Please read and take action before *Friday* Aug 7th Best Regards Lars = BoF's (Print Deadline Friday 7th) == If you do want to host some BoF's and/or suggest topics for the Bofs at the Developer Summit on Aug 18th, please fill out * http://wiki.xenproject.org/wiki/Developer_Meeting/Aug2015-BoFs#Topics http://wiki.xenproject.org/wiki/Developer_Meeting/Aug2015-BoFs#Topics - otherwise we will miss the print deadline * For the overall program, see http://events.linuxfoundation.org/events/xen-project-developer-summit/program/schedule http://events.linuxfoundation.org/events/xen-project-developer-summit/program/schedule Alternatively you can reply to this mail (make sure I am CC'ed). = Developer Meeting = Also, I need to order food and coffee for the Developer Meeting on Aug 19th, 10:00-13:30, and need to get a sense of numbers. In the past we were between 25-30 and I expect this to be similar this year. If you plan to attend, please fill out * http://wiki.xenproject.org/wiki/Developer_Meeting/Aug2015 http://wiki.xenproject.org/wiki/Developer_Meeting/Aug2015 Do note that the Advisory Board meeting is from 9:00 to 10:00. I am intending to maybe have a 30 minute overlap, where AB members and Developer Meeting attendees can mingle and raise issues and suggestions with board members. Alternatively you can reply to this mail (make sure I am CC'ed). = WG Meetings = I do have a meeting room on Aug 19th, which is free from 13:30. If any working groups such as OpenStack or Test Framework folks want to get together then do let me know. I can book space in the room. Right now, it looks as if we don't have enough critical mass in terms of attendees (we have maybe 2 or 3 from each WG).___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] libxl: remove stray declaration of libxl__hotplug_settings
On Tue, 2015-08-04 at 11:25 +0100, Ian Campbell wrote: On Tue, 2015-08-04 at 11:16 +0100, Wei Liu wrote: That function was removed in 2ba368d1 (libxl: Remove linux udev rules) Signed-off-by: Wei Liu wei.l...@citrix.com Acked-by: Ian Campbell ian.campb...@citrix.com Trivially fine for 4.6 IMHO. Applied. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH for-4.6] libxl: increase hotplug timeout to 40s
On Tue, 2015-08-04 at 11:26 +0100, Wei Liu wrote: On Tue, Aug 04, 2015 at 12:02:55PM +0200, Roger Pau Monne wrote: The default libxl timeout for hotplug scripts execution is too low, when launching 40 HVM guests in parallel, all using the same file as disk, execution times of ~20s are expected. Increase the timeout to 40s in order to be sure hotplug scripts have enough time to execute. Signed-off-by: Roger Pau Monné roger@citrix.com Reported-by: Konrad Rzeszutek Wilk konrad.w...@oracle.com --- Cc: Ian Jackson ian.jack...@eu.citrix.com Cc: Stefano Stabellini stefano.stabell...@eu.citrix.com Cc: Ian Campbell ian.campb...@citrix.com Cc: Wei Liu wei.l...@citrix.com As I understand this from the discussion in the other thread, there is no regression in hotplug scripts. Increasing the timeout is a reasonable solution for 4.6. Acked-by: Wei Liu wei.l...@citrix.com Might be worth mentioning this is short term solution in commit message though. After consultation on IRC I added This is a short term solution and applied, thanks. Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH for-4.6] x86/gdt: Drop write-only, xalloc()'d array from set_gdt()
On Tue, 2015-08-04 at 15:17 +0100, George Dunlap wrote: On Mon, Aug 3, 2015 at 6:05 PM, Andrew Cooper andrew.coop...@citrix.com wrote: It is not used, and can cause a spurious failure of the set_gdt() hypercall in low memory situations. Signed-off-by: Andrew Cooper andrew.coop...@citrix.com Reviewed-by: George Dunlap george.dun...@eu.citrix.com It's unclear from MAINTAINERS if this is formally enough for the patch to be applied but under the circumstances (pretty obvious patch, Reviewed-by more than one person, including the X86 MEMORY MANAGEMENT maintainer even if mm.c isn't strictly under that for some reason) I've added my own Revieed-by and applied, hopefully that is ok. Perhaps there should be a patch to MAINTAINERS to add arch/x86/mm.c to the relevant subsection? (Or maybe the file is just horribly named?) Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH for-4.6 v3 0/6] Fix libxl TOOLSTACK records for migration v2
On Tue, 2015-08-04 at 18:16 +0100, Andrew Cooper wrote: The libxl blob for toolstack records was not 32/64bit safe. In 64bit builds, it had an extra 4 bytes of padding due to alignment, and because of the pointer arithmatic used to build it, leaked 4 bytes of libxl heap into the stream. Respecify XENSTORE_DATA as EMULATOR_XENSTORE_DATA and take remedial action to prevent this bitness issue escaping into a Xen release. Only minor changes from v3. Details in patches. Summary of Modified/Acks: Andrew Cooper (6): A tools/libxl: Make libxl__conversion_helper_abort() safe to use M docs/libxl: Re-specify XENSTORE_DATA as EMULATOR_XENSTORE_DATA M tools/libxl: Save and restore EMULATOR_XENSTORE_DATA content I acked these two and applied, thanks! (I don't recall if I saw a formal release-ack for these, but it's on the blocker list so I didn't bother looking...) ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 5/8] xen/tmem: Use page_to_gfn rather than pfn_to_gfn
On Tue, 4 Aug 2015, Julien Grall wrote: All the caller of xen_tmem_{get,put}_page have a struct page * in hand and call pfn_to_gfn for the only benefits of these 2 functions. Rather than passing the pfn in parameter, pass directly the page and use directly page_to_gfn. Signed-off-by: Julien Grall julien.gr...@citrix.com Cc: Konrad Rzeszutek Wilk konrad.w...@oracle.com Cc: Boris Ostrovsky boris.ostrov...@oracle.com Cc: David Vrabel david.vra...@citrix.com Reviewed-by: Stefano Stabellini stefano.stabell...@eu.citrix.com drivers/xen/tmem.c | 21 +++-- 1 file changed, 7 insertions(+), 14 deletions(-) diff --git a/drivers/xen/tmem.c b/drivers/xen/tmem.c index 28c97ff..e0c8dc7 100644 --- a/drivers/xen/tmem.c +++ b/drivers/xen/tmem.c @@ -129,21 +129,17 @@ static int xen_tmem_new_pool(struct tmem_pool_uuid uuid, /* xen generic tmem ops */ static int xen_tmem_put_page(u32 pool_id, struct tmem_oid oid, - u32 index, unsigned long pfn) + u32 index, struct page *page) { - unsigned long gmfn = pfn_to_gfn(pfn); - return xen_tmem_op(TMEM_PUT_PAGE, pool_id, oid, index, - gmfn, 0, 0, 0); +page_to_gfn(page), 0, 0, 0); } static int xen_tmem_get_page(u32 pool_id, struct tmem_oid oid, - u32 index, unsigned long pfn) + u32 index, struct page *page) { - unsigned long gmfn = pfn_to_gfn(pfn); - return xen_tmem_op(TMEM_GET_PAGE, pool_id, oid, index, - gmfn, 0, 0, 0); +page_to_gfn(page), 0, 0, 0); } static int xen_tmem_flush_page(u32 pool_id, struct tmem_oid oid, u32 index) @@ -173,14 +169,13 @@ static void tmem_cleancache_put_page(int pool, struct cleancache_filekey key, { u32 ind = (u32) index; struct tmem_oid oid = *(struct tmem_oid *)key; - unsigned long pfn = page_to_pfn(page); if (pool 0) return; if (ind != index) return; mb(); /* ensure page is quiescent; tmem may address it with an alias */ - (void)xen_tmem_put_page((u32)pool, oid, ind, pfn); + (void)xen_tmem_put_page((u32)pool, oid, ind, page); } static int tmem_cleancache_get_page(int pool, struct cleancache_filekey key, @@ -287,7 +282,6 @@ static int tmem_frontswap_store(unsigned type, pgoff_t offset, { u64 ind64 = (u64)offset; u32 ind = (u32)offset; - unsigned long pfn = page_to_pfn(page); int pool = tmem_frontswap_poolid; int ret; @@ -296,7 +290,7 @@ static int tmem_frontswap_store(unsigned type, pgoff_t offset, if (ind64 != ind) return -1; mb(); /* ensure page is quiescent; tmem may address it with an alias */ - ret = xen_tmem_put_page(pool, oswiz(type, ind), iswiz(ind), pfn); + ret = xen_tmem_put_page(pool, oswiz(type, ind), iswiz(ind), page); /* translate Xen tmem return values to linux semantics */ if (ret == 1) return 0; @@ -313,7 +307,6 @@ static int tmem_frontswap_load(unsigned type, pgoff_t offset, { u64 ind64 = (u64)offset; u32 ind = (u32)offset; - unsigned long pfn = page_to_pfn(page); int pool = tmem_frontswap_poolid; int ret; @@ -321,7 +314,7 @@ static int tmem_frontswap_load(unsigned type, pgoff_t offset, return -1; if (ind64 != ind) return -1; - ret = xen_tmem_get_page(pool, oswiz(type, ind), iswiz(ind), pfn); + ret = xen_tmem_get_page(pool, oswiz(type, ind), iswiz(ind), page); /* translate Xen tmem return values to linux semantics */ if (ret == 1) return 0; -- 2.1.4 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v7 2/8] cxenstored: add support for systemd active sockets
On Fri, Jul 18, 2014 at 12:28 AM, Luis R. Rodriguez mcg...@do-not-panic.com wrote: From: Luis R. Rodriguez mcg...@suse.com This adds systemd socket activation support for the C xenstored. Active sockets enable xenstored to be loaded only if required by a system onto which Xen is installed on. Socket activation is handled by systemd, once a port for a service which claims a socket is used systemd will start the required services for it, on demand. For more details on socket activation refer to Lennart's socket-activation post regarding this [0]. Right now this code adds a no-op for this functionality, leaving the enablement to be done later once systemd is properly hooked into the build system. The socket activation is ordered in aligment with the socket activation order passed on to systemd. [0] http://0pointer.de/blog/projects/socket-activation2.html So with this patch in place, xenstored will not start on a system that has systemd, *even if it wasn't started from systemd*. Lots of systems (e.g., CentOS 7) have legacy systems in place to allow you to do things like chkconfig --add xencommons even on a systemd system. I think we still want to work with those, right? -George ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 4/8] xen: Use the correctly the Xen memory terminologies
On Tue, 4 Aug 2015, Boris Ostrovsky wrote: On 08/04/2015 02:12 PM, Julien Grall wrote: /* * We detect special mappings in one of two ways: @@ -217,9 +232,13 @@ static inline unsigned long bfn_to_local_pfn(unsigned long mfn) /* VIRT - MACHINE conversion */ #define virt_to_machine(v)(phys_to_machine(XPADDR(__pa(v -#define virt_to_pfn(v) (PFN_DOWN(__pa(v))) #define virt_to_mfn(v)(pfn_to_mfn(virt_to_pfn(v))) #define mfn_to_virt(m)(__va(mfn_to_pfn(m) PAGE_SHIFT)) +#define virt_to_pfn(v) (PFN_DOWN(__pa(v))) This looks like unnecessary change. diff --git a/drivers/video/fbdev/xen-fbfront.c b/drivers/video/fbdev/xen-fbfront.c index 09dc447..25e3cce 100644 --- a/drivers/video/fbdev/xen-fbfront.c +++ b/drivers/video/fbdev/xen-fbfront.c @@ -539,7 +539,7 @@ static int xenfb_remove(struct xenbus_device *dev) static unsigned long vmalloc_to_mfn(void *address) { - return pfn_to_mfn(vmalloc_to_pfn(address)); + return pfn_to_gfn(vmalloc_to_pfn(address)); } Are you sure? This will return vmalloc_to_pfn(address)). I think that is OK: there is no behavioural change here. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 4/8] xen: Use the correctly the Xen memory terminologies
On Tue, 4 Aug 2015, Julien Grall wrote: Based on include/xen/mm.h [1], Linux is mistakenly using MFN when GFN is meant, I suspect this is because the first support for Xen was for PV. This resulted in some misimplementation of helpers on ARM and confused developers about the expected behavior. For instance, with pfn_to_mfn, we expect to get an MFN based on the name. Although, if we look at the implementation on x86, it's returning a GFN. For clarity and avoid new confusion, replace any reference to mfn with gfn in any helpers used by PV drivers. The x86 code will still keep some reference of pfn_to_mfn but exclusively for PV (a BUG_ON has been added to ensure this). No changes as been made in the hypercall field, even though they may be invalid, in order to keep the same as the defintion in xen repo. Take also the opportunity to simplify simple construction such as pfn_to_mfn(page_to_pfn(page)) into page_to_gfn. More complex clean up will come in follow-up patches. [1] http://xenbits.xen.org/gitweb/?p=xen.git;a=commitdiff;h=e758ed14f390342513405dd766e874934573e6cb Signed-off-by: Julien Grall julien.gr...@citrix.com Cc: Stefano Stabellini stefano.stabell...@eu.citrix.com Cc: Russell King li...@arm.linux.org.uk Cc: Konrad Rzeszutek Wilk konrad.w...@oracle.com Cc: Boris Ostrovsky boris.ostrov...@oracle.com Cc: David Vrabel david.vra...@citrix.com Cc: Thomas Gleixner t...@linutronix.de Cc: Ingo Molnar mi...@redhat.com Cc: H. Peter Anvin h...@zytor.com Cc: x...@kernel.org Cc: Roger Pau Monné roger@citrix.com Cc: Dmitry Torokhov dmitry.torok...@gmail.com Cc: Ian Campbell ian.campb...@citrix.com Cc: Wei Liu wei.l...@citrix.com Cc: Juergen Gross jgr...@suse.com Cc: James E.J. Bottomley jbottom...@odin.com Cc: Greg Kroah-Hartman gre...@linuxfoundation.org Cc: Jiri Slaby jsl...@suse.com Cc: Jean-Christophe Plagniol-Villard plagn...@jcrosoft.com Cc: Tomi Valkeinen tomi.valkei...@ti.com Cc: linux-in...@vger.kernel.org Cc: net...@vger.kernel.org Cc: linux-s...@vger.kernel.org Cc: linuxppc-...@lists.ozlabs.org Cc: linux-fb...@vger.kernel.org Cc: linux-arm-ker...@lists.infradead.org Aside from the x86 bits: Reviewed-by: Stefano Stabellini stefano.stabell...@eu.citrix.com Note that I've re-introduced mfn_to_pfn co only for x86 PV code. The helpers contain a BUG_ON to ensure that it's never called for auto-translated guests. I did as best as my can to determine whether mfn or gfn helpers should be used. Although, I haven't tried to boot it. It may be possible to do further cleanup in the mmu.c where I found some check to auto-translated. I'm not sure why given that the pvmmu callback are only used for non-auto translated guest. Finally, given those changes, I didn't retain the Reviewed-by/Acked-by. Changes in v2: - Give directly the URL to the commit rather than the commit ID - xenstored_local_init: keep the cast to void * - Typoes - Keep pfn_to_mfn for x86 and PV-only. The *mfn* helpers are used in arch/x86/xen for enlighten.c, mmu.c, p2m.c, setup.c, smp.c and mm.c --- arch/arm/include/asm/xen/page.h | 13 +++-- arch/x86/include/asm/xen/page.h | 33 ++--- arch/x86/xen/smp.c | 2 +- drivers/block/xen-blkfront.c| 6 +++--- drivers/input/misc/xen-kbdfront.c | 4 ++-- drivers/net/xen-netback/netback.c | 4 ++-- drivers/net/xen-netfront.c | 8 drivers/scsi/xen-scsifront.c| 8 +++- drivers/tty/hvc/hvc_xen.c | 5 +++-- drivers/video/fbdev/xen-fbfront.c | 4 ++-- drivers/xen/balloon.c | 2 +- drivers/xen/events/events_base.c| 2 +- drivers/xen/events/events_fifo.c| 4 ++-- drivers/xen/gntalloc.c | 3 ++- drivers/xen/manage.c| 2 +- drivers/xen/tmem.c | 4 ++-- drivers/xen/xenbus/xenbus_client.c | 2 +- drivers/xen/xenbus/xenbus_dev_backend.c | 2 +- drivers/xen/xenbus/xenbus_probe.c | 8 +++- include/xen/page.h | 4 ++-- 20 files changed, 69 insertions(+), 51 deletions(-) diff --git a/arch/arm/include/asm/xen/page.h b/arch/arm/include/asm/xen/page.h index 087d86e..51e5bf1 100644 --- a/arch/arm/include/asm/xen/page.h +++ b/arch/arm/include/asm/xen/page.h @@ -34,14 +34,15 @@ typedef struct xpaddr { unsigned long __pfn_to_mfn(unsigned long pfn); extern struct rb_root phys_to_mach; -static inline unsigned long pfn_to_mfn(unsigned long pfn) +/* Pseudo-physical - Guest conversion */ +static inline unsigned long pfn_to_gfn(unsigned long pfn) { return pfn; } -static inline unsigned long mfn_to_pfn(unsigned long mfn) +static inline unsigned long gfn_to_pfn(unsigned long gfn) { - return mfn; + return gfn; } /*
Re: [Xen-devel] [PATCH v7 2/8] cxenstored: add support for systemd active sockets
On Wed, 2015-08-05 at 11:06 +0100, George Dunlap wrote: On Fri, Jul 18, 2014 at 12:28 AM, Luis R. Rodriguez mcg...@do-not-panic.com wrote: From: Luis R. Rodriguez mcg...@suse.com This adds systemd socket activation support for the C xenstored. Active sockets enable xenstored to be loaded only if required by a system onto which Xen is installed on. Socket activation is handled by systemd, once a port for a service which claims a socket is used systemd will start the required services for it, on demand. For more details on socket activation refer to Lennart's socket-activation post regarding this [0]. Right now this code adds a no-op for this functionality, leaving the enablement to be done later once systemd is properly hooked into the build system. The socket activation is ordered in aligment with the socket activation order passed on to systemd. [0] http://0pointer.de/blog/projects/socket-activation2.html So with this patch in place, xenstored will not start on a system that has systemd, *even if it wasn't started from systemd*. But where systemd is /sbin/init, right? The case where xenstored was compiled with systemd support but systemd is not /sbin/init should still be expected to work, and isn't what I think you are complaining about here. Lots of systems (e.g., CentOS 7) have legacy systems in place to allow you to do things like chkconfig --add xencommons even on a systemd system. I think we still want to work with those, right? Isn't chkconfig --add still arranging for the thing to be started by systemd under the hood? If not systemd on a host where /sbin/init==systemd then what does else would start it? If you are asking should the sysvinit initscripts still be us(ed|able) even though systemd is being used as /sbin/init on the host and a unit file is present then AIUI the systemd answer is no. (We may choose to disagree with systemd on this I suppose) On the other hand, does this mean I can no longer start xenstored by hand from the CLI? _That_ would seem to be worth preserving, for debugging etc if nothing else. Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH for-4.6] libxl: move calling libxl__arch_domain_construct_memmap to right place
This function was called in the wrong place, because both libxl__vnuma_build_vmemrange_hvm and xc_hvm_build rely on its output. Move the call of said function to the right place -- before the other two functions which reply on its output. Signed-off-by: Wei Liu wei.l...@citrix.com --- Cc: Chen, Tiejun tiejun.c...@intel.com Cc: Ian Campbell ian.campb...@citrix.com Cc: Ian Jackson ian.jack...@eu.citrix.com Discovered this issue by code inspection. This issue is not discovered by osstest because we don't have hardware or test case to test that code path. Tiejun, can you confirm this is the right fix? Can you test this change? --- tools/libxl/libxl_dom.c | 10 +- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c index fea..811f7da 100644 --- a/tools/libxl/libxl_dom.c +++ b/tools/libxl/libxl_dom.c @@ -960,6 +960,11 @@ int libxl__build_hvm(libxl__gc *gc, uint32_t domid, goto out; } +if (libxl__arch_domain_construct_memmap(gc, d_config, domid, args)) { +LOG(ERROR, setting domain memory map failed); +goto out; +} + if (info-num_vnuma_nodes != 0) { int i; @@ -997,11 +1002,6 @@ int libxl__build_hvm(libxl__gc *gc, uint32_t domid, goto out; } -if (libxl__arch_domain_construct_memmap(gc, d_config, domid, args)) { -LOG(ERROR, setting domain memory map failed); -goto out; -} - ret = hvm_build_set_params(ctx-xch, domid, info, state-store_port, state-store_mfn, state-console_port, state-console_mfn, state-store_domid, -- 2.4.6 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] Design doc of adding ACPI support for arm64 on Xen
Hi Shannon, On 05/08/15 10:29, Shannon Zhao wrote: 3)DOM0 how to get grant table and event channel irq informations As said above, we assign the hypervisor_id be XenVMM to tell DOM0 that it runs on Xen hypervisor. Then save the start address and size of grant table in domain-grant_table-start_addr and domain-grant_table-size. DOM0 can call a new hypercall GNTTABOP_get_start_addr to get these info. Same to event channel, we've already save interrupt number in d-arch.evtchn_irq, so DOM0 can call a new hypercall EVTCHNOP_get_irq to get the irq. It would be nice to go down into more details and write the parameters of the hypercalls in the doc as they will become a newly supported ABI. The parameters of GNTTABOP_get_start_addr is like below: The hypercall is not only giving the start but also the size. So I would rename into GNTTABOP_get_region struct gnttab_get_start_addr { /* IN parameters */ domid_t dom; The domain ID is not necessary. We only want to retrieve grant table region of the current domain. uint16_t pad; /* OUT parameters */ uint64_t start_addr; uint64_t size; }; The parameters of EVTCHNOP_get_irq is like below: struct evtchn_get_irq { /* IN parameters. */ domid_t dom; Same remark here. uint16_t pad; /* OUT parameters. */ uint32_t irq; We also need to expose the type of the IRQ (i.e level/edge ...) as ACPI and DT does. This is to avoid been tight on edge interrupt for the event channel. }; OOI, did you consider to use an HVM_PARAM rather than introducing two new hypercalls? The evtchnop would need to be called something like EVTCHNOP_get_notification_irq and would need to be ARM specific (on x86 things are different). 4)How to map MMIO regions a)Current implementation is mapping MMIO regions in Dom0 on demand when trapping in Xen with a data abort. I think this approach is prone to failures. A driver could program a device for DMA involving regions not yet mapped. As a consequence the DMA operation would fail because the SMMU would stop the transaction. b)Another way is to map all the non-ram memory regions before booting. But as suggested by Stefano, this will use a lot of memory to store the pagetables. c)Another suggested way is to use a hypercall from DOM0 to request MMIO regions mappings after Linux complete parsing the DSDT. But I didn't find a proper place to issue this call. Anyone has some suggestion? I suggested to exploit the bus_notifier callbacks and issue an hypercall there. In the case of the PCI bus, we are already handling notifications in drivers/xen/pci.c:xen_pci_notifier. Once you have a struct pci_dev pointer in your hand, you can get the MMIO regions from pdev-resource[bar]. Does that work? I investigate and test this approach. Adding a bus notifier for platform bus, it could map the mmio regions. Stefano, thanks for your suggestion. And does anyone else have other comments on this approach? I think this approach would be the best one. Regards, -- Julien Grall ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 2/8] xen: Make clear that swiotlb and biomerge are dealing with DMA address
Hi Stefano, On 05/08/15 10:49, Stefano Stabellini wrote: /* VIRT - MACHINE conversion */ #define virt_to_mfn(v) (pfn_to_mfn(virt_to_pfn(v))) @@ -96,7 +115,7 @@ static inline bool set_phys_to_machine(unsigned long pfn, unsigned long mfn) bool xen_arch_need_swiotlb(struct device *dev, unsigned long pfn, - unsigned long mfn); + unsigned long dfn); unsigned long xen_get_swiotlb_free_pages(unsigned int order); You missed a bunch of dfn-bfn renamings. Sorry, I forgot to double check that before sending the patch. I will fix it in the next version. Aside from those and the commit message error: Reviewed-by: Stefano Stabellini stefano.stabell...@eu.citrix.com Thank you! Regards, -- Julien Grall ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] Design doc of adding ACPI support for arm64 on Xen
On Wed, 5 Aug 2015, Shannon Zhao wrote: On 2015/8/4 22:37, Stefano Stabellini wrote: On Tue, 4 Aug 2015, Shannon Zhao wrote: This document is going to explain the design details of Xen booting with ACPI on ARM. Maybe parts of it may not be appropriate. Any comments are welcome. Good start! To Xen itself booting with ACPI, this is similar to Linux kernel except that Xen doesn't parse DSDT table. So I'll skip this part and focus on how Xen prepares ACPI tables for DOM0 and how Xen passes them to DOM0. 1)copy and change some EFI and ACPI tables. a) Copy EFI_SYSTEM_TABLE and change the value of FirmwareVendor, VendorGuid, VendorTable, ConfigurationTable. These changes are not very special and it just assign values to these members. b) Create EFI_MEMORY_DESCRIPTOR table. This will add memory start and size information of DOM0. And DOM0 will get the memory information through this EFI table. c) Copy FADT table. Change the value of arm_boot_flags to enable PSCI and HVC. Let the hypervisor_id be XenVMM in order to tell DOM0 that it runs on Xen hypervisor, so DOM0 can call hypercall to get some informations for booting necessity, such as grant tab start address and size. Change header revison, length and checksum as well. d) Copy GTDT table. Set non_secure_el2_interrupt and non_secure_el2_flags to 0 to mask EL2 timer for DOM0. e) Copy MADT table. According to the value of dom0_max_vcpus to change the number GICC entries. f) Create STAO table. This table is a new added one that's used to define a list of ACPI namespace names that are to be ignored by the OSPM in DOM0. Currently we use it to tell OSPM should ignore UART defined in SPCR table. g) Copy XSDT table. Add a new table entry for STAO and change other table's entries. h) Change the value of xsdt_physical_address in RSDP table. i) The reset of tables are not copied or changed. They are reused including DSDT, SPCR. OK so far All these tables will be copied or mapped to guest memory. Are they copied or mapped? Also I think we need to recalculate the md5sum? 2)Create minimal DT to pass required informations to DOM0 The minimal DT mainly passes DOM0 bootargs, address and size of initrd (if available), address and size of uefi system table, address and size of uefi memory table, uefi-mmap-desc-size and uefi-mmap-desc-ver. I think we need to specify which Linux entry point is called, that I think will be the proper non-EFI kernel entry point, which requires MMU off (see Documentation/efi-stub.txt in linux). Also it would be better to write the full bindings of the generated minimal DT, see http://marc.info/?l=linux-kernelm=142362266626403w=2 and Documentation/arm/uefi.txt in linux. An example of the minimal DT: / { #address-cells = 2; #size-cells = 1; chosen { bootargs = kernel=Image console=hvc0 earlycon=pl011,0x1c09 root=/dev/vda2 rw rootfstype=ext4 init=/bin/sh acpi=force; linux,initrd-start = 0x; linux,initrd-end = 0x; linux,uefi-system-table = 0x; linux,uefi-mmap-start = 0x; linux,uefi-mmap-size = 0x; linux,uefi-mmap-desc-size = 0x; linux,uefi-mmap-desc-ver = 0x; }; }; Good, please include this example in the doc. Please include a pointer to Documentation/arm/uefi.txt which lists these paramaters. 3)DOM0 how to get grant table and event channel irq informations As said above, we assign the hypervisor_id be XenVMM to tell DOM0 that it runs on Xen hypervisor. Then save the start address and size of grant table in domain-grant_table-start_addr and domain-grant_table-size. DOM0 can call a new hypercall GNTTABOP_get_start_addr to get these info. Same to event channel, we've already save interrupt number in d-arch.evtchn_irq, so DOM0 can call a new hypercall EVTCHNOP_get_irq to get the irq. It would be nice to go down into more details and write the parameters of the hypercalls in the doc as they will become a newly supported ABI. The parameters of GNTTABOP_get_start_addr is like below: struct gnttab_get_start_addr { /* IN parameters */ domid_t dom; uint16_t pad; /* OUT parameters */ uint64_t start_addr; uint64_t size; }; The parameters of EVTCHNOP_get_irq is like below: struct evtchn_get_irq { /* IN parameters. */ domid_t dom; uint16_t pad; /* OUT parameters. */ uint32_t irq; }; I think that it makes sense to reuse the existing HVM_PARAM_CALLBACK_IRQ hvmop call in this case. See drivers/xen/events/events_base.c:xen_set_callback_via in Linux and xen/include/public/hvm/params.h in Xen. I would just add a new
Re: [Xen-devel] Design doc of adding ACPI support for arm64 on Xen
On Wed, 5 Aug 2015, Julien Grall wrote: Hi Shannon, On 05/08/15 10:29, Shannon Zhao wrote: 3)DOM0 how to get grant table and event channel irq informations As said above, we assign the hypervisor_id be XenVMM to tell DOM0 that it runs on Xen hypervisor. Then save the start address and size of grant table in domain-grant_table-start_addr and domain-grant_table-size. DOM0 can call a new hypercall GNTTABOP_get_start_addr to get these info. Same to event channel, we've already save interrupt number in d-arch.evtchn_irq, so DOM0 can call a new hypercall EVTCHNOP_get_irq to get the irq. It would be nice to go down into more details and write the parameters of the hypercalls in the doc as they will become a newly supported ABI. The parameters of GNTTABOP_get_start_addr is like below: The hypercall is not only giving the start but also the size. So I would rename into GNTTABOP_get_region I agree struct gnttab_get_start_addr { /* IN parameters */ domid_t dom; The domain ID is not necessary. We only want to retrieve grant table region of the current domain. Most other GNTTABOP have a domid parameter, so for uniformity I would keep it. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH for-4.6] libxl: move calling libxl__arch_domain_construct_memmap to right place
On Wed, 2015-08-05 at 11:23 +0100, Wei Liu wrote: This function was called in the wrong place, because both libxl__vnuma_build_vmemrange_hvm and xc_hvm_build rely on its output. What is the effect of this call being in the wrong place? Presumably one or the other of those functions reaches the wrong conclusion? Move the call of said function to the right place -- before the other two functions which reply on its output. Signed-off-by: Wei Liu wei.l...@citrix.com --- Cc: Chen, Tiejun tiejun.c...@intel.com Cc: Ian Campbell ian.campb...@citrix.com Cc: Ian Jackson ian.jack...@eu.citrix.com Discovered this issue by code inspection. This issue is not discovered by osstest because we don't have hardware or test case to test that code path. Tiejun, can you confirm this is the right fix? Can you test this change? --- tools/libxl/libxl_dom.c | 10 +- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c index fea..811f7da 100644 --- a/tools/libxl/libxl_dom.c +++ b/tools/libxl/libxl_dom.c @@ -960,6 +960,11 @@ int libxl__build_hvm(libxl__gc *gc, uint32_t domid, goto out; } +if (libxl__arch_domain_construct_memmap(gc, d_config, domid, args)) { +LOG(ERROR, setting domain memory map failed); +goto out; +} + if (info-num_vnuma_nodes != 0) { int i; @@ -997,11 +1002,6 @@ int libxl__build_hvm(libxl__gc *gc, uint32_t domid, goto out; } -if (libxl__arch_domain_construct_memmap(gc, d_config, domid, args)) { -LOG(ERROR, setting domain memory map failed); -goto out; -} - ret = hvm_build_set_params(ctx-xch, domid, info, state-store_port, state-store_mfn, state-console_port, state-console_mfn, state-store_domid, ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH for-4.6] libxl: move calling libxl__arch_domain_construct_memmap to right place
On Wed, Aug 05, 2015 at 11:38:54AM +0100, Ian Campbell wrote: On Wed, 2015-08-05 at 11:23 +0100, Wei Liu wrote: This function was called in the wrong place, because both libxl__vnuma_build_vmemrange_hvm and xc_hvm_build rely on its output. What is the effect of this call being in the wrong place? Presumably one or the other of those functions reaches the wrong conclusion? Originally, by the time that function got called, all guest pages were already populated. The end result is E820 map disagrees with what vNUMA says and what address ranges memory actually resides, i.e. risk of guest accessing region that doesn't have backing pages. Move the call of said function to the right place -- before the other two functions which reply on its output. Signed-off-by: Wei Liu wei.l...@citrix.com --- Cc: Chen, Tiejun tiejun.c...@intel.com Cc: Ian Campbell ian.campb...@citrix.com Cc: Ian Jackson ian.jack...@eu.citrix.com Discovered this issue by code inspection. This issue is not discovered by osstest because we don't have hardware or test case to test that code path. Tiejun, can you confirm this is the right fix? Can you test this change? --- tools/libxl/libxl_dom.c | 10 +- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c index fea..811f7da 100644 --- a/tools/libxl/libxl_dom.c +++ b/tools/libxl/libxl_dom.c @@ -960,6 +960,11 @@ int libxl__build_hvm(libxl__gc *gc, uint32_t domid, goto out; } +if (libxl__arch_domain_construct_memmap(gc, d_config, domid, args)) { +LOG(ERROR, setting domain memory map failed); +goto out; +} + if (info-num_vnuma_nodes != 0) { int i; @@ -997,11 +1002,6 @@ int libxl__build_hvm(libxl__gc *gc, uint32_t domid, goto out; } -if (libxl__arch_domain_construct_memmap(gc, d_config, domid, args)) { -LOG(ERROR, setting domain memory map failed); -goto out; -} - ret = hvm_build_set_params(ctx-xch, domid, info, state-store_port, state-store_mfn, state-console_port, state-console_mfn, state-store_domid, ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH for-4.6] libxl: move calling libxl__arch_domain_construct_memmap to right place
On Wed, 2015-08-05 at 11:43 +0100, Wei Liu wrote: On Wed, Aug 05, 2015 at 11:38:54AM +0100, Ian Campbell wrote: On Wed, 2015-08-05 at 11:23 +0100, Wei Liu wrote: This function was called in the wrong place, because both libxl__vnuma_build_vmemrange_hvm and xc_hvm_build rely on its output. What is the effect of this call being in the wrong place? Presumably one or the other of those functions reaches the wrong conclusion? Originally, by the time that function got called, all guest pages were already populated. The end result is E820 map disagrees with what vNUMA says and what address ranges memory actually resides, i.e. risk of guest accessing region that doesn't have backing pages. Ouch. This should certainly be explained in the commit message. With that: Acked-by: Ian Campbell ian.campb...@citrix.com Although perhaps we should wait for confirmation this fix doesn't regress RMRR somehow? Move the call of said function to the right place -- before the other two functions which reply on its output. Signed-off-by: Wei Liu wei.l...@citrix.com --- Cc: Chen, Tiejun tiejun.c...@intel.com Cc: Ian Campbell ian.campb...@citrix.com Cc: Ian Jackson ian.jack...@eu.citrix.com Discovered this issue by code inspection. This issue is not discovered by osstest because we don't have hardware or test case to test that code path. Tiejun, can you confirm this is the right fix? Can you test this change? --- tools/libxl/libxl_dom.c | 10 +- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c index fea..811f7da 100644 --- a/tools/libxl/libxl_dom.c +++ b/tools/libxl/libxl_dom.c @@ -960,6 +960,11 @@ int libxl__build_hvm(libxl__gc *gc, uint32_t domid, goto out; } +if (libxl__arch_domain_construct_memmap(gc, d_config, domid, args)) { +LOG(ERROR, setting domain memory map failed); +goto out; +} + if (info-num_vnuma_nodes != 0) { int i; @@ -997,11 +1002,6 @@ int libxl__build_hvm(libxl__gc *gc, uint32_t domid, goto out; } -if (libxl__arch_domain_construct_memmap(gc, d_config, domid, args)) { -LOG(ERROR, setting domain memory map failed); -goto out; -} - ret = hvm_build_set_params(ctx-xch, domid, info, state -store_port, state-store_mfn, state -console_port, state-console_mfn, state -store_domid, ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH OSSTEST v2] standalone: Extend -h to support ident=host style specifications
On Fri, 2015-07-31 at 16:24 +0100, Ian Jackson wrote: Ian Campbell writes ([PATCH OSSTEST v2] standalone: Extend -h to support ident=host style specifications): Allowing for multi-host tests. Also make reset-host reset all hosts. Acked-by: Ian Jackson ian.jack...@eu.citrix.com Thanks, added to pretest. Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH OSSTEST] get_hostflags: return an empty list when there is no flight/job.
On Fri, 2015-07-31 at 16:39 +0100, Ian Campbell wrote: On Fri, 2015-07-31 at 16:31 +0100, Ian Jackson wrote: Ian Campbell writes ([PATCH OSSTEST] get_hostflags: return an empty list when there is no flight/job.): From: Ian Campbell ian.campb...@citrix.com Otherwise trying to use mg-hosts mkpxedir fails with: I think your proposed fix is incorrect. It is wrong to call get_hostflags outside the context of a job, because get_hostflags is supposed to return the job's host flags for that ident. The bug was introduced by me in 11e788f7 JobDB/Executive: Improve an internal `die' error, where a refactoring meant that we always call get_hostflags. How about this instead ? diff --git a/Osstest/JobDB/Executive.pm b/Osstest/JobDB/Executive.pm index 1ec947e..cc52f57 100644 --- a/Osstest/JobDB/Executive.pm +++ b/Osstest/JobDB/Executive.pm @@ -128,7 +128,7 @@ sub host_check_allocated ($$) { #method $ho-{Shared} $ho-{Shared}{State} eq 'ready'; my $harness = get_harness_rev(); -my @flags = get_hostflags($ho-{Ident}); +my @flags = defined($job) ? get_hostflags($ho-{Ident}) : qw(OUTSIDE -JOB); $ho-{SharedReady}= $ho-{SharedMaybeOthers} !! (grep { $_. .$harness eq share-.$ho-{Shared}{Type} } Signed-off-by: Ian Jackson ian.jack...@eu.citrix.com LGTM, I suppose the use of the OUTSIDE-JOB sentinel value is just for the benefit of the reader of the following die() should it occur. Acked-by: Ian Campbell ian.campb...@citrix.com I fabricated some sort of commit message and pushed to pretest. commit 2e51119a34d06162b69275f38010130193d5501e Author: Ian Jackson ian.jack...@eu.citrix.com Date: Wed Aug 5 11:42:10 2015 +0100 Executive: Support host_check_allocated outside a job. When called outside a job there are no hostflags, so get_hostflags cannot be used. Instead assume a new pseudo-flag OUTSIDE-JOB when there is no $job. Otherwise uses of select_host such as mg-hosts mkpxedir fail. Signed-off-by: Ian Jackson ian.jack...@eu.citrix.com Acked-by: Ian Campbell ian.campb...@citrix.com [ ijc -- wrong commit message ] diff --git a/Osstest/JobDB/Executive.pm b/Osstest/JobDB/Executive.pm index 1ec947e..cc52f57 100644 --- a/Osstest/JobDB/Executive.pm +++ b/Osstest/JobDB/Executive.pm @@ -128,7 +128,7 @@ sub host_check_allocated ($$) { #method $ho-{Shared} $ho-{Shared}{State} eq 'ready'; my $harness = get_harness_rev(); -my @flags = get_hostflags($ho-{Ident}); +my @flags = defined($job) ? get_hostflags($ho-{Ident}) : qw(OUTSIDE-JOB); $ho-{SharedReady}= $ho-{SharedMaybeOthers} !! (grep { $_. .$harness eq share-.$ho-{Shared}{Type} } ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 2/3] ts-debian-hvm-install: use di_installcmdline_core
On Fri, 2015-07-31 at 16:20 +0100, Ian Jackson wrote: Ian Campbell writes ([PATCH v2 2/3] ts-debian-hvm-install: use di_installcmdline_core): This is primarily to get DEBIAN_FRONTEND=test, for easier to read logging. text Aside from that, Acked-by: Ian Jackson ian.jack...@eu.citrix.com Thanks, I fixed the typo and push the series to pretest. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 4/8] xen: Use the correctly the Xen memory terminologies
Hi Boris, On 05/08/15 00:16, Boris Ostrovsky wrote: On 08/04/2015 02:12 PM, Julien Grall wrote: /* * We detect special mappings in one of two ways: @@ -217,9 +232,13 @@ static inline unsigned long bfn_to_local_pfn(unsigned long mfn) /* VIRT - MACHINE conversion */ #define virt_to_machine(v)(phys_to_machine(XPADDR(__pa(v -#define virt_to_pfn(v) (PFN_DOWN(__pa(v))) #define virt_to_mfn(v)(pfn_to_mfn(virt_to_pfn(v))) #define mfn_to_virt(m)(__va(mfn_to_pfn(m) PAGE_SHIFT)) +#define virt_to_pfn(v) (PFN_DOWN(__pa(v))) This looks like unnecessary change. Right, I made the mistake when I re-introduced virt_to_mfn in this version. It was dropped in the previous one. diff --git a/drivers/video/fbdev/xen-fbfront.c b/drivers/video/fbdev/xen-fbfront.c index 09dc447..25e3cce 100644 --- a/drivers/video/fbdev/xen-fbfront.c +++ b/drivers/video/fbdev/xen-fbfront.c @@ -539,7 +539,7 @@ static int xenfb_remove(struct xenbus_device *dev) static unsigned long vmalloc_to_mfn(void *address) { -return pfn_to_mfn(vmalloc_to_pfn(address)); +return pfn_to_gfn(vmalloc_to_pfn(address)); } Are you sure? This will return vmalloc_to_pfn(address)). I guess you mean vmalloc_to_mfn will return vmalloc_to_pfn? If so, it will be only the case on auto-translated case (because pfn == gfn). In the case of PV, the mfn will be returned. Although, this function is misnamed. It's fixed in a follow-up patch (see #6) because it's required more renaming than this function. I didn't want to add such changes within this patch. Regards, -- Julien Grall ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] osstest: install libnl3 packages
On Thu, 2015-07-02 at 16:26 +0100, Ian Jackson wrote: Roger Pau Monne writes ([PATCH] osstest: install libnl3 packages): Install the libnl3 packages needed by the remus code. Those are available on both Wheezy and Jessie, although the Wheezy ones are too old. This patch implicitly drops support for lenny and squeeze. I think you should mention that in the commit message. I added your This patch...squeeze. sentence to the commit message, acked it and pushed to pretest. There was a conflict with the addition of ebtables, the result is below. commit 2ff90f75e5b40152998b69900adf1985382409bf Author: Roger Pau Monne roger@citrix.com Date: Wed Jul 1 17:12:25 2015 +0200 osstest: install libnl3 packages Install the libnl3 packages needed by the remus code. Those are available on both Wheezy and Jessie, although the Wheezy ones are too old. This patch implicitly drops support for lenny and squeeze. Signed-off-by: Roger Pau Monné roger@citrix.com Acked-by: Ian Campbell ian.campb...@citrix.com Cc: Ian Jackson ian.jack...@eu.citrix.com Cc: Ian Campbell ian.campb...@citrix.com Cc: Shriram Rajagopalan rshri...@cs.ubc.ca Cc: Yang Hongyang yan...@cn.fujitsu.com [ ijc -- resolved conflict with addition of ebtables ] diff --git a/ts-xen-build-prep b/ts-xen-build-prep index 03ad35c..ace8080 100755 --- a/ts-xen-build-prep +++ b/ts-xen-build-prep @@ -206,14 +206,9 @@ sub prep () { autoconf automake libtool xsltproc libxml2-utils libxml2-dev libdevmapper-dev w3c-dtd-xhtml libxml-xpath-perl - ccache nasm checkpolicy ebtables); + ccache nasm checkpolicy ebtalbes + libnl-3-dev libnl-route-3-dev); -if ($ho-{Suite} =~ m/wheezy|squeeze|lenny/) { - push(@packages, libnl-dev); -} else { - # jessie (jessie?) - push(@packages, libnl-route-3-dev); -} target_install_packages($ho, @packages); target_cmd_root($ho, chmod -R a+r /usr/share/git-core/templates); # workaround for Debian #595728 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH for-4.6] x86/gdt: Drop write-only, xalloc()'d array from set_gdt()
At 14:59 +0100 on 04 Aug (1438700361), Ian Campbell wrote: On Tue, 2015-08-04 at 10:34 +0100, Wei Liu wrote: On Mon, Aug 03, 2015 at 06:05:43PM +0100, Andrew Cooper wrote: It is not used, and can cause a spurious failure of the set_gdt() hypercall in low memory situations. Signed-off-by: Andrew Cooper andrew.coop...@citrix.com --- CC: Jan Beulich jbeul...@suse.com --- This array does appear to be write-only and never gets read. FWIW Reviewed-by: Wei Liu wei.l...@citrix.com The use was removed by: commit 5889a3e1d123bdad4a3d150310d647db55459966 Author: Tim Deegan t...@xen.org Date: Thu May 17 10:24:54 2012 +0100 x86/mm: Use get_page_from_gfn() instead of get_gfn()/put_gfn. Signed-off-by: Tim Deegan t...@xen.org Signed-off-by: Andres Lagar-Cavilla and...@lagarcavilla.org Yep, clearly an oversight in this patch. after it was introduced by: commit 51032ca058e43fbd37ea1f7c7c003496f6451340 Author: Andres Lagar-Cavilla and...@lagarcavilla.org Date: Fri Nov 11 18:11:34 2011 + Modify naming of queries into the p2m [...] Signed-off-by: Andres Lagar-Cavilla and...@lagarcavilla.org Acked-by: Tim Deegan t...@xen.org Committed-by: Keir Fraser k...@xen.org It's obviously correct and fixes a problem, so it can be applied to 4.6 tree. +1, FWIW. Cheers, Tim. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v7 2/8] cxenstored: add support for systemd active sockets
On Wed, Aug 5, 2015 at 11:17 AM, Ian Campbell ian.campb...@citrix.com wrote: On Wed, 2015-08-05 at 11:06 +0100, George Dunlap wrote: On Fri, Jul 18, 2014 at 12:28 AM, Luis R. Rodriguez mcg...@do-not-panic.com wrote: From: Luis R. Rodriguez mcg...@suse.com This adds systemd socket activation support for the C xenstored. Active sockets enable xenstored to be loaded only if required by a system onto which Xen is installed on. Socket activation is handled by systemd, once a port for a service which claims a socket is used systemd will start the required services for it, on demand. For more details on socket activation refer to Lennart's socket-activation post regarding this [0]. Right now this code adds a no-op for this functionality, leaving the enablement to be done later once systemd is properly hooked into the build system. The socket activation is ordered in aligment with the socket activation order passed on to systemd. [0] http://0pointer.de/blog/projects/socket-activation2.html So with this patch in place, xenstored will not start on a system that has systemd, *even if it wasn't started from systemd*. But where systemd is /sbin/init, right? The case where xenstored was compiled with systemd support but systemd is not /sbin/init should still be expected to work, and isn't what I think you are complaining about here. Lots of systems (e.g., CentOS 7) have legacy systems in place to allow you to do things like chkconfig --add xencommons even on a systemd system. I think we still want to work with those, right? Isn't chkconfig --add still arranging for the thing to be started by systemd under the hood? If not systemd on a host where /sbin/init==systemd then what does else would start it? If you are asking should the sysvinit initscripts still be us(ed|able) even though systemd is being used as /sbin/init on the host and a unit file is present then AIUI the systemd answer is no. (We may choose to disagree with systemd on this I suppose) Well that's not (apparently) the RHEL answer; doing chkconfig --add [foo] Just Works on CentOS 7 for all the sysvinit scripts I've used (including the Xen 4.4 Xen4CentOS packages). I think we want to still *enable* people to use that mode if they want to. But I won't argue if people feel strongly otherwise. On the other hand, does this mean I can no longer start xenstored by hand from the CLI? _That_ would seem to be worth preserving, for debugging etc if nothing else. So what happens at the moment is that xenstored, run either from the command-line says, Oh, look! I'm running on a systemd system. I'll check for my systemd sockets. Oh no, no sockets! *dies*. If run from xencommons, it doesn't even get that far: it says, Oh, look! I'm running on a systemd system. But wait! You asked me to use a pidfile! BAD USER! NO PIDFILE ON SYSTEMD! *dies*. Modifying xenstored to try to open the systemd sockets, and fall back to normal sockets if it doesn't find any, works when started from the command-line. But for some reason, if you take out the aforementioned check preventing --pid-file, it still segfaults (!) at some point. I haven't tracked down what the problem is there yet; but I don't want to bother trying if that's not what we're going for. :-) -George ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH for-4.6] libxl: move calling libxl__arch_domain_construct_memmap to right place
On Wed, Aug 05, 2015 at 11:48:55AM +0100, Ian Campbell wrote: On Wed, 2015-08-05 at 11:43 +0100, Wei Liu wrote: On Wed, Aug 05, 2015 at 11:38:54AM +0100, Ian Campbell wrote: On Wed, 2015-08-05 at 11:23 +0100, Wei Liu wrote: This function was called in the wrong place, because both libxl__vnuma_build_vmemrange_hvm and xc_hvm_build rely on its output. What is the effect of this call being in the wrong place? Presumably one or the other of those functions reaches the wrong conclusion? Originally, by the time that function got called, all guest pages were already populated. The end result is E820 map disagrees with what vNUMA says and what address ranges memory actually resides, i.e. risk of guest accessing region that doesn't have backing pages. Ouch. This should certainly be explained in the commit message. With that: Acked-by: Ian Campbell ian.campb...@citrix.com I will post v2 shortly with that information in commit message. Although perhaps we should wait for confirmation this fix doesn't regress RMRR somehow? I doubt it will. The code as-is is already broken for RMRR. But let's wait till tomorrow for Tiejun to reply. If he doesn't reply by tomorrow, I suggest we apply v2 first and fix up any subsequent issues later. Wei. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [xen-unstable test] 60541: tolerable FAIL - PUSHED
flight 60541 xen-unstable real [real] http://logs.test-lab.xenproject.org/osstest/logs/60541/ Failures :-/ but no regressions. Regressions which are regarded as allowable (not blocking): test-amd64-i386-rumpuserxen-i386 15 rumpuserxen-demo-xenstorels/xenstorels.repeat fail REGR. vs. 60199 test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm 9 debian-hvm-install fail like 60199 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop fail like 60199 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stop fail like 60199 test-armhf-armhf-xl-rtds 11 guest-start fail like 60199 test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm 12 guest-localmigrate fail like 60199 Tests which did not succeed, but are not blocking: test-armhf-armhf-libvirt-vhd 9 debian-di-installfail never pass test-amd64-amd64-xl-pvh-amd 11 guest-start fail never pass test-armhf-armhf-xl-vhd 9 debian-di-installfail never pass test-armhf-armhf-libvirt-raw 9 debian-di-installfail never pass test-armhf-armhf-xl-raw 9 debian-di-installfail never pass test-amd64-amd64-xl-pvh-intel 11 guest-start fail never pass test-armhf-armhf-libvirt-qcow2 9 debian-di-installfail never pass test-armhf-armhf-xl-qcow2 9 debian-di-installfail never pass test-amd64-amd64-libvirt-xsm 12 migrate-support-checkfail never pass test-armhf-armhf-xl-arndale 12 migrate-support-checkfail never pass test-amd64-amd64-libvirt-qcow2 11 migrate-support-checkfail never pass test-amd64-i386-libvirt-qcow2 11 migrate-support-checkfail never pass test-amd64-i386-libvirt-raw 11 migrate-support-checkfail never pass test-amd64-i386-libvirt 12 migrate-support-checkfail never pass test-armhf-armhf-xl-xsm 12 migrate-support-checkfail never pass test-amd64-amd64-libvirt 12 migrate-support-checkfail never pass test-armhf-armhf-xl-credit2 12 migrate-support-checkfail never pass test-armhf-armhf-xl-multivcpu 12 migrate-support-checkfail never pass test-amd64-amd64-libvirt-raw 11 migrate-support-checkfail never pass test-armhf-armhf-libvirt-xsm 12 migrate-support-checkfail never pass test-armhf-armhf-xl-cubietruck 12 migrate-support-checkfail never pass test-amd64-i386-libvirt-xsm 12 migrate-support-checkfail never pass test-amd64-i386-libvirt-vhd 11 migrate-support-checkfail never pass test-armhf-armhf-xl 12 migrate-support-checkfail never pass test-armhf-armhf-libvirt 12 migrate-support-checkfail never pass test-amd64-i386-xl-qemut-win7-amd64 16 guest-stop fail never pass test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop fail never pass test-amd64-amd64-libvirt-vhd 11 migrate-support-checkfail never pass version targeted for testing: xen 480b83162a12520466d3933b0bea18aa43344d11 baseline version: xen 27524b5e1556067545ee19b3f482ec755aa82de3 Last test of basis60199 2015-07-31 13:32:03 Z4 days Testing same since60541 2015-08-03 11:55:45 Z1 days1 attempts People who touched revisions under test: Paul Durrant paul.durr...@citrix.com Roger Pau Monné roger@citrix.com jobs: build-amd64-xsm pass build-armhf-xsm pass build-i386-xsm pass build-amd64 pass build-armhf pass build-i386 pass build-amd64-libvirt pass build-armhf-libvirt pass build-i386-libvirt pass build-amd64-oldkern pass build-i386-oldkern pass build-amd64-pvopspass build-armhf-pvopspass build-i386-pvops pass build-amd64-rumpuserxen pass build-i386-rumpuserxen pass test-amd64-amd64-xl pass test-armhf-armhf-xl pass test-amd64-i386-xl pass test-amd64-amd64-xl-qemut-debianhvm-amd64-xsmpass test-amd64-i386-xl-qemut-debianhvm-amd64-xsm pass test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsmpass
Re: [Xen-devel] [PATCH for-4.6] libxl: move calling libxl__arch_domain_construct_memmap to right place
On Wed, 2015-08-05 at 11:58 +0100, Wei Liu wrote: On Wed, Aug 05, 2015 at 11:48:55AM +0100, Ian Campbell wrote: On Wed, 2015-08-05 at 11:43 +0100, Wei Liu wrote: On Wed, Aug 05, 2015 at 11:38:54AM +0100, Ian Campbell wrote: On Wed, 2015-08-05 at 11:23 +0100, Wei Liu wrote: This function was called in the wrong place, because both libxl__vnuma_build_vmemrange_hvm and xc_hvm_build rely on its output. What is the effect of this call being in the wrong place? Presumably one or the other of those functions reaches the wrong conclusion? Originally, by the time that function got called, all guest pages were already populated. The end result is E820 map disagrees with what vNUMA says and what address ranges memory actually resides, i.e. risk of guest accessing region that doesn't have backing pages. Ouch. This should certainly be explained in the commit message. With that: Acked-by: Ian Campbell ian.campb...@citrix.com I will post v2 shortly with that information in commit message. Although perhaps we should wait for confirmation this fix doesn't regress RMRR somehow? I doubt it will. The code as-is is already broken for RMRR. But let's wait till tomorrow for Tiejun to reply. If he doesn't reply by tomorrow, I suggest we apply v2 first and fix up any subsequent issues later. Agreed. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v7 2/8] cxenstored: add support for systemd active sockets
On Wed, 2015-08-05 at 11:56 +0100, George Dunlap wrote: On Wed, Aug 5, 2015 at 11:17 AM, Ian Campbell ian.campb...@citrix.com wrote: On Wed, 2015-08-05 at 11:06 +0100, George Dunlap wrote: On Fri, Jul 18, 2014 at 12:28 AM, Luis R. Rodriguez mcg...@do-not-panic.com wrote: From: Luis R. Rodriguez mcg...@suse.com This adds systemd socket activation support for the C xenstored. Active sockets enable xenstored to be loaded only if required by a system onto which Xen is installed on. Socket activation is handled by systemd, once a port for a service which claims a socket is used systemd will start the required services for it, on demand. For more details on socket activation refer to Lennart's socket-activation post regarding this [0]. Right now this code adds a no-op for this functionality, leaving the enablement to be done later once systemd is properly hooked into the build system. The socket activation is ordered in aligment with the socket activation order passed on to systemd. [0] http://0pointer.de/blog/projects/socket-activation2.html So with this patch in place, xenstored will not start on a system that has systemd, *even if it wasn't started from systemd*. But where systemd is /sbin/init, right? The case where xenstored was compiled with systemd support but systemd is not /sbin/init should still be expected to work, and isn't what I think you are complaining about here. Lots of systems (e.g., CentOS 7) have legacy systems in place to allow you to do things like chkconfig --add xencommons even on a systemd system. I think we still want to work with those, right? Isn't chkconfig --add still arranging for the thing to be started by systemd under the hood? If not systemd on a host where /sbin/init==systemd then what does else would start it? If you are asking should the sysvinit initscripts still be us(ed|able) even though systemd is being used as /sbin/init on the host and a unit file is present then AIUI the systemd answer is no. (We may choose to disagree with systemd on this I suppose) Well that's not (apparently) the RHEL answer; doing chkconfig --add [foo] Just Works on CentOS 7 for all the sysvinit scripts I've used (including the Xen 4.4 Xen4CentOS packages). I would expect that the CentOS 7 packaging guidelines would require/encourage you to use the systemd unit files (via whatever command that is) in preference to the sysvinit initscripts when they are available. AUIU the compatibility works the other way round, which is if you use the new systemd commands and there is no unit file with that name but there is a sysv initscript with that name then systemd will invoke the initscript in a compatibility mode. I'm a bit surprised that chkconfig doesn't just to the right thing. It's possible that the fact that our initscript and our systemd unitfiles do not share the same names has defeated its heuristics. I think we want to still *enable* people to use that mode if they want to. But I won't argue if people feel strongly otherwise. On the other hand, does this mean I can no longer start xenstored by hand from the CLI? _That_ would seem to be worth preserving, for debugging etc if nothing else. So what happens at the moment is that xenstored, run either from the command-line says, Oh, look! I'm running on a systemd system. I'll check for my systemd sockets. Oh no, no sockets! *dies*. This shouldn't happen... If run from xencommons, it doesn't even get that far: it says, Oh, look! I'm running on a systemd system. But wait! You asked me to use a pidfile! BAD USER! NO PIDFILE ON SYSTEMD! *dies*. This isn't strictly speaking what we want people to be doing (they should use the systemd units), but I think by properly fixing the first issue this will make this start working too. Modifying xenstored to try to open the systemd sockets, and fall back to normal sockets if it doesn't find any, works when started from the command-line. But for some reason, if you take out the aforementioned check preventing --pid-file, it still segfaults (!) at some point. I haven't tracked down what the problem is there yet; but I don't want to bother trying if that's not what we're going for. :-) I think the check for socket activation should be gated on a new command line option as well as the presence of systemd, and the systemd unit file should pass that option. Then invoking from the CLI works. Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v7 2/8] cxenstored: add support for systemd active sockets
On Wed, 2015-08-05 at 12:11 +0100, Ian Campbell wrote: I'm a bit surprised that chkconfig doesn't just to the right thing. It's possible that the fact that our initscript and our systemd unitfiles do not share the same names has defeated its heuristics. Perhaps as well as fixing xenstored we should also add a xencommons meta -unit which depends on the others, solely for the purpose of having chkconfig spot that it should use those and not try the initscript? ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [Patch V6 12/16] mm: provide early_memremap_ro to establish read-only mapping
MM maintainers, is it really so hard to comment on this patch? Juergen On 07/29/2015 11:20 AM, Juergen Gross wrote: On 07/21/2015 06:49 AM, Juergen Gross wrote: Hi MM maintainers, this patch is the last requiring an ack for the series to go in. Could you please comment? PING? Juergen On 07/17/2015 06:51 AM, Juergen Gross wrote: During early boot as Xen pv domain the kernel needs to map some page tables supplied by the hypervisor read only. This is needed to be able to relocate some data structures conflicting with the physical memory map especially on systems with huge RAM (above 512GB). Provide the function early_memremap_ro() to provide this read only mapping. Signed-off-by: Juergen Gross jgr...@suse.com Acked-by: Konrad Rzeszutek Wilk konrad.w...@oracle.com Cc: Arnd Bergmann a...@arndb.de Cc: linux...@kvack.org Cc: linux-a...@vger.kernel.org --- include/asm-generic/early_ioremap.h | 2 ++ include/asm-generic/fixmap.h| 3 +++ mm/early_ioremap.c | 12 3 files changed, 17 insertions(+) diff --git a/include/asm-generic/early_ioremap.h b/include/asm-generic/early_ioremap.h index a5de55c..316bd04 100644 --- a/include/asm-generic/early_ioremap.h +++ b/include/asm-generic/early_ioremap.h @@ -11,6 +11,8 @@ extern void __iomem *early_ioremap(resource_size_t phys_addr, unsigned long size); extern void *early_memremap(resource_size_t phys_addr, unsigned long size); +extern void *early_memremap_ro(resource_size_t phys_addr, + unsigned long size); extern void early_iounmap(void __iomem *addr, unsigned long size); extern void early_memunmap(void *addr, unsigned long size); diff --git a/include/asm-generic/fixmap.h b/include/asm-generic/fixmap.h index f23174f..1cbb833 100644 --- a/include/asm-generic/fixmap.h +++ b/include/asm-generic/fixmap.h @@ -46,6 +46,9 @@ static inline unsigned long virt_to_fix(const unsigned long vaddr) #ifndef FIXMAP_PAGE_NORMAL #define FIXMAP_PAGE_NORMAL PAGE_KERNEL #endif +#if !defined(FIXMAP_PAGE_RO) defined(PAGE_KERNEL_RO) +#define FIXMAP_PAGE_RO PAGE_KERNEL_RO +#endif #ifndef FIXMAP_PAGE_NOCACHE #define FIXMAP_PAGE_NOCACHE PAGE_KERNEL_NOCACHE #endif diff --git a/mm/early_ioremap.c b/mm/early_ioremap.c index e10ccd2..0cfadaf 100644 --- a/mm/early_ioremap.c +++ b/mm/early_ioremap.c @@ -217,6 +217,13 @@ early_memremap(resource_size_t phys_addr, unsigned long size) return (__force void *)__early_ioremap(phys_addr, size, FIXMAP_PAGE_NORMAL); } +#ifdef FIXMAP_PAGE_RO +void __init * +early_memremap_ro(resource_size_t phys_addr, unsigned long size) +{ +return (__force void *)__early_ioremap(phys_addr, size, FIXMAP_PAGE_RO); +} +#endif #else /* CONFIG_MMU */ void __init __iomem * @@ -231,6 +238,11 @@ early_memremap(resource_size_t phys_addr, unsigned long size) { return (void *)phys_addr; } +void __init * +early_memremap_ro(resource_size_t phys_addr, unsigned long size) +{ +return (void *)phys_addr; +} void __init early_iounmap(void __iomem *addr, unsigned long size) { -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/ -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/ ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v7 2/8] cxenstored: add support for systemd active sockets
On Wed, Aug 5, 2015 at 12:11 PM, Ian Campbell ian.campb...@citrix.com wrote: On Wed, 2015-08-05 at 11:56 +0100, George Dunlap wrote: On Wed, Aug 5, 2015 at 11:17 AM, Ian Campbell ian.campb...@citrix.com wrote: On Wed, 2015-08-05 at 11:06 +0100, George Dunlap wrote: On Fri, Jul 18, 2014 at 12:28 AM, Luis R. Rodriguez mcg...@do-not-panic.com wrote: From: Luis R. Rodriguez mcg...@suse.com This adds systemd socket activation support for the C xenstored. Active sockets enable xenstored to be loaded only if required by a system onto which Xen is installed on. Socket activation is handled by systemd, once a port for a service which claims a socket is used systemd will start the required services for it, on demand. For more details on socket activation refer to Lennart's socket-activation post regarding this [0]. Right now this code adds a no-op for this functionality, leaving the enablement to be done later once systemd is properly hooked into the build system. The socket activation is ordered in aligment with the socket activation order passed on to systemd. [0] http://0pointer.de/blog/projects/socket-activation2.html So with this patch in place, xenstored will not start on a system that has systemd, *even if it wasn't started from systemd*. But where systemd is /sbin/init, right? The case where xenstored was compiled with systemd support but systemd is not /sbin/init should still be expected to work, and isn't what I think you are complaining about here. Lots of systems (e.g., CentOS 7) have legacy systems in place to allow you to do things like chkconfig --add xencommons even on a systemd system. I think we still want to work with those, right? Isn't chkconfig --add still arranging for the thing to be started by systemd under the hood? If not systemd on a host where /sbin/init==systemd then what does else would start it? If you are asking should the sysvinit initscripts still be us(ed|able) even though systemd is being used as /sbin/init on the host and a unit file is present then AIUI the systemd answer is no. (We may choose to disagree with systemd on this I suppose) Well that's not (apparently) the RHEL answer; doing chkconfig --add [foo] Just Works on CentOS 7 for all the sysvinit scripts I've used (including the Xen 4.4 Xen4CentOS packages). I would expect that the CentOS 7 packaging guidelines would require/encourage you to use the systemd unit files (via whatever command that is) in preference to the sysvinit initscripts when they are available. AUIU the compatibility works the other way round, which is if you use the new systemd commands and there is no unit file with that name but there is a sysv initscript with that name then systemd will invoke the initscript in a compatibility mode. I'm a bit surprised that chkconfig doesn't just to the right thing. It's possible that the fact that our initscript and our systemd unitfiles do not share the same names has defeated its heuristics. It seems to me that the right thing for chkconfig to do is to run the script you've asked it to run, not do some other thing you haven't asked it to do. :-) If you think about how different systemd is than sysvinit, the chance of a script with a similar name to a systemd rule being *actually* interchangeable is pretty low. If they really want to push people into using systemd they should remove chkconfig altogether, or make it print a warning, not do something completely different. I think we want to still *enable* people to use that mode if they want to. But I won't argue if people feel strongly otherwise. On the other hand, does this mean I can no longer start xenstored by hand from the CLI? _That_ would seem to be worth preserving, for debugging etc if nothing else. So what happens at the moment is that xenstored, run either from the command-line says, Oh, look! I'm running on a systemd system. I'll check for my systemd sockets. Oh no, no sockets! *dies*. This shouldn't happen... If run from xencommons, it doesn't even get that far: it says, Oh, look! I'm running on a systemd system. But wait! You asked me to use a pidfile! BAD USER! NO PIDFILE ON SYSTEMD! *dies*. This isn't strictly speaking what we want people to be doing (they should use the systemd units), but I think by properly fixing the first issue this will make this start working too. Modifying xenstored to try to open the systemd sockets, and fall back to normal sockets if it doesn't find any, works when started from the command-line. But for some reason, if you take out the aforementioned check preventing --pid-file, it still segfaults (!) at some point. I haven't tracked down what the problem is there yet; but I don't want to bother trying if that's not what we're going for. :-) I think the check for socket activation should be gated on a new
Re: [Xen-devel] [PATCH for-4.6] libxl: move calling libxl__arch_domain_construct_memmap to right place
On Wed, Aug 05, 2015 at 12:06:22PM +0100, Ian Campbell wrote: On Wed, 2015-08-05 at 11:58 +0100, Wei Liu wrote: On Wed, Aug 05, 2015 at 11:48:55AM +0100, Ian Campbell wrote: On Wed, 2015-08-05 at 11:43 +0100, Wei Liu wrote: On Wed, Aug 05, 2015 at 11:38:54AM +0100, Ian Campbell wrote: On Wed, 2015-08-05 at 11:23 +0100, Wei Liu wrote: This function was called in the wrong place, because both libxl__vnuma_build_vmemrange_hvm and xc_hvm_build rely on its output. What is the effect of this call being in the wrong place? Presumably one or the other of those functions reaches the wrong conclusion? Originally, by the time that function got called, all guest pages were already populated. The end result is E820 map disagrees with what vNUMA says and what address ranges memory actually resides, i.e. risk of guest accessing region that doesn't have backing pages. Ouch. This should certainly be explained in the commit message. With that: Acked-by: Ian Campbell ian.campb...@citrix.com I will post v2 shortly with that information in commit message. Although perhaps we should wait for confirmation this fix doesn't regress RMRR somehow? I doubt it will. The code as-is is already broken for RMRR. But let's wait till tomorrow for Tiejun to reply. If he doesn't reply by tomorrow, I suggest we apply v2 first and fix up any subsequent issues later. Agreed. Actually I want to retract this patch. I confused hvm path with pv path and drew my conclusion when looking at both code paths. In hvm path, neither libxl__vnuma_build_vmemragen_hvm nor xc_hvm_build depends on output of libxl__arch_domain_construct_memmap (in fact it doesn't change anything). So the code is OK. In pv path, there is a path which relies on having a valid E820 map first, but that path 1) relies on host E820 map; 2) doesn't involve RMRR support. In the end, moving that function call has no effect whatsoever. Sorry for the noise! Wei. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v7 2/8] cxenstored: add support for systemd active sockets
On Wed, 2015-08-05 at 12:21 +0100, George Dunlap wrote: On Wed, Aug 5, 2015 at 12:11 PM, Ian Campbell ian.campb...@citrix.com wrote: On Wed, 2015-08-05 at 11:56 +0100, George Dunlap wrote: On Wed, Aug 5, 2015 at 11:17 AM, Ian Campbell ian.campb...@citrix.com wrote: On Wed, 2015-08-05 at 11:06 +0100, George Dunlap wrote: On Fri, Jul 18, 2014 at 12:28 AM, Luis R. Rodriguez mcg...@do-not-panic.com wrote: From: Luis R. Rodriguez mcg...@suse.com This adds systemd socket activation support for the C xenstored. Active sockets enable xenstored to be loaded only if required by a system onto which Xen is installed on. Socket activation is handled by systemd, once a port for a service which claims a socket is used systemd will start the required services for it, on demand. For more details on socket activation refer to Lennart's socket -activation post regarding this [0]. Right now this code adds a no-op for this functionality, leaving the enablement to be done later once systemd is properly hooked into the build system. The socket activation is ordered in aligment with the socket activation order passed on to systemd. [0] http://0pointer.de/blog/projects/socket-activation2.html So with this patch in place, xenstored will not start on a system that has systemd, *even if it wasn't started from systemd*. But where systemd is /sbin/init, right? The case where xenstored was compiled with systemd support but systemd is not /sbin/init should still be expected to work, and isn't what I think you are complaining about here. Lots of systems (e.g., CentOS 7) have legacy systems in place to allow you to do things like chkconfig --add xencommons even on a systemd system. I think we still want to work with those, right? Isn't chkconfig --add still arranging for the thing to be started by systemd under the hood? If not systemd on a host where /sbin/init==systemd then what does else would start it? If you are asking should the sysvinit initscripts still be us(ed|able) even though systemd is being used as /sbin/init on the host and a unit file is present then AIUI the systemd answer is no. (We may choose to disagree with systemd on this I suppose) Well that's not (apparently) the RHEL answer; doing chkconfig --add [foo] Just Works on CentOS 7 for all the sysvinit scripts I've used (including the Xen 4.4 Xen4CentOS packages). I would expect that the CentOS 7 packaging guidelines would require/encourage you to use the systemd unit files (via whatever command that is) in preference to the sysvinit initscripts when they are available. AUIU the compatibility works the other way round, which is if you use the new systemd commands and there is no unit file with that name but there is a sysv initscript with that name then systemd will invoke the initscript in a compatibility mode. I'm a bit surprised that chkconfig doesn't just to the right thing. It's possible that the fact that our initscript and our systemd unitfiles do not share the same names has defeated its heuristics. It seems to me that the right thing for chkconfig to do is to run the script you've asked it to run, not do some other thing you haven't asked it to do. :-) If you think about how different systemd is than sysvinit, the chance of a script with a similar name to a systemd rule being *actually* interchangeable is pretty low. If they really want to push people into using systemd they should remove chkconfig altogether, or make it print a warning, not do something completely different. There's no point arguing about that here or with me. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [RFC 1/2] xen/mm: Clarify the granularity for each Frame Number
From: Stefano Stabellini stefano.stabell...@eu.citrix.com ARM64 is able to support 64KB and 4KB page granularities. While the guest will support both granularities, Xen and hypercall interface will always be in 4KB. Signed-off-by: Stefano Stabellini stefano.stabell...@citrix.com Signed-off-by: Julien Grall julien.gr...@citrix.com --- Cc: Ian Campbell ian.campb...@citrix.com Cc: Ian Jackson ian.jack...@eu.citrix.com Cc: Jan Beulich jbeul...@suse.com Cc: Keir Fraser k...@xen.org Cc: Tim Deegan t...@xen.org Cc: Andrew Cooper andrew.coop...@citrix.com I'm missing one term for Linux Frame Number but always in 4KB granularity. It's necessary in few places such as the balloon code where we need to map a Linux 4KB Frame Number into a Machine Frame Number. I was thinking to name it xen_pfn. --- xen/include/xen/mm.h | 20 ++-- 1 file changed, 14 insertions(+), 6 deletions(-) diff --git a/xen/include/xen/mm.h b/xen/include/xen/mm.h index 876d370..8dfd61a 100644 --- a/xen/include/xen/mm.h +++ b/xen/include/xen/mm.h @@ -15,16 +15,24 @@ * * mfn: Machine Frame Number * The values Xen puts into its own pagetables. This is the host physical - * memory address space with RAM, MMIO etc. + * memory address space with RAM, MMIO etc. Always 4KB granularity. * * gfn: Guest Frame Number - * The values a guest puts in its own pagetables. For an auto-translated - * guest (hardware assisted with 2nd stage translation, or shadowed), gfn != - * mfn. For a non-translated guest which is aware of Xen, gfn == mfn. + * The values a guest puts in its own pagetables, except for 64KB + * granularity. Gfns are always based on 4KB granularity, while actually + * the guest could use other granularities. For an auto-translated guest + * (hardware assisted with 2nd stage translation, or shadowed), gfn != mfn. + * For a non-translated guest which is aware of Xen, gfn == mfn. + * Hypercalls take gfns, not mfns, as parameters unless clearly specified + * otherwise. * * pfn: Pseudophysical Frame Number - * A linear idea of a guest physical address space. For an auto-translated - * guest, pfn == gfn while for a non-translated guest, pfn != gfn. + * A linear idea of a guest physical address space. Pfns are in guest + * granularity, which can be 64KB or 4KB. PV guests must only use 4KB + * granularity. For an auto-translated guest, pfn == gfn shift, + * where the shift is the different between the Xen and Linux page + * granularity, while for a non-translated guest, pfn != gfn. Pfns are + * internal to the guest and are not passed to hypercalls. * * WARNING: Some of these terms have changed over time while others have been * used inconsistently, meaning that a lot of existing code does not match the -- 2.1.4 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [RFC 2/2] xen/public: grant-table: Specificy the size of the grant
The grant is always 4KB irrespectively of the page granularity of the guest. Signed-off-by: Julien Grall julien.gr...@citrix.com --- Cc: Ian Campbell ian.campb...@citrix.com Cc: Ian Jackson ian.jack...@eu.citrix.com Cc: Jan Beulich jbeul...@suse.com Cc: Keir Fraser k...@xen.org Cc: Tim Deegan t...@xen.org --- xen/include/public/grant_table.h | 5 + 1 file changed, 5 insertions(+) diff --git a/xen/include/public/grant_table.h b/xen/include/public/grant_table.h index e9393fd..67a250d 100644 --- a/xen/include/public/grant_table.h +++ b/xen/include/public/grant_table.h @@ -51,6 +51,11 @@ * know the real machine address of a page it is sharing. This makes * it possible to share memory correctly with domains running in * fully virtualised memory. + * + * The size of a grant is always 4KB irrespectively of the page + * granularity of the guest. This is means that when the guest is using + * 64KB page granularity, it will have to split the page in 4KB chunks + * and request a grant for every of them. */ /*** -- 2.1.4 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [RFC 0/2] xen: Clarify the page granularity for the hypercall
Hi all, ARM64 is able to support both 64KB and 4KB page granularity. With the upcoming support of Linux guest with 64KB page granularity, the in-tree documentation needs to be clarify in order to avoid mixing granularity. I'm not sure if the wording is clear and correct, hence, the RFC. Regards, Cc: Ian Campbell ian.campb...@citrix.com Cc: Ian Jackson ian.jack...@eu.citrix.com Cc: Jan Beulich jbeul...@suse.com Cc: Keir Fraser k...@xen.org Cc: Tim Deegan t...@xen.org Cc: Andrew Cooper andrew.coop...@citrix.com Julien Grall (1): xen/public: grant-table: Specificy the size of the grant Stefano Stabellini (1): xen/mm: Clarify the granularity for each Frame Number xen/include/public/grant_table.h | 5 + xen/include/xen/mm.h | 20 ++-- 2 files changed, 19 insertions(+), 6 deletions(-) -- 2.1.4 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [RFC 1/2] xen/mm: Clarify the granularity for each Frame Number
On 05/08/15 12:28, Julien Grall wrote: From: Stefano Stabellini stefano.stabell...@eu.citrix.com ARM64 is able to support 64KB and 4KB page granularities. While the guest will support both granularities, Xen and hypercall interface will always be in 4KB. Signed-off-by: Stefano Stabellini stefano.stabell...@citrix.com Signed-off-by: Julien Grall julien.gr...@citrix.com --- Cc: Ian Campbell ian.campb...@citrix.com Cc: Ian Jackson ian.jack...@eu.citrix.com Cc: Jan Beulich jbeul...@suse.com Cc: Keir Fraser k...@xen.org Cc: Tim Deegan t...@xen.org Cc: Andrew Cooper andrew.coop...@citrix.com I'm missing one term for Linux Frame Number but always in 4KB granularity. It's necessary in few places such as the balloon code where we need to map a Linux 4KB Frame Number into a Machine Frame Number. I was thinking to name it xen_pfn. --- xen/include/xen/mm.h | 20 ++-- 1 file changed, 14 insertions(+), 6 deletions(-) diff --git a/xen/include/xen/mm.h b/xen/include/xen/mm.h index 876d370..8dfd61a 100644 --- a/xen/include/xen/mm.h +++ b/xen/include/xen/mm.h @@ -15,16 +15,24 @@ * * mfn: Machine Frame Number * The values Xen puts into its own pagetables. This is the host physical - * memory address space with RAM, MMIO etc. + * memory address space with RAM, MMIO etc. Always 4KB granularity. * * gfn: Guest Frame Number - * The values a guest puts in its own pagetables. For an auto-translated - * guest (hardware assisted with 2nd stage translation, or shadowed), gfn != - * mfn. For a non-translated guest which is aware of Xen, gfn == mfn. + * The values a guest puts in its own pagetables, except for 64KB + * granularity. Gfns are always based on 4KB granularity, while actually + * the guest could use other granularities. For an auto-translated guest + * (hardware assisted with 2nd stage translation, or shadowed), gfn != mfn. + * For a non-translated guest which is aware of Xen, gfn == mfn. + * Hypercalls take gfns, not mfns, as parameters unless clearly specified + * otherwise. * * pfn: Pseudophysical Frame Number - * A linear idea of a guest physical address space. For an auto-translated - * guest, pfn == gfn while for a non-translated guest, pfn != gfn. + * A linear idea of a guest physical address space. Pfns are in guest + * granularity, which can be 64KB or 4KB. PV guests must only use 4KB + * granularity. For an auto-translated guest, pfn == gfn shift, + * where the shift is the different between the Xen and Linux page + * granularity, while for a non-translated guest, pfn != gfn. Pfns are + * internal to the guest and are not passed to hypercalls. * * WARNING: Some of these terms have changed over time while others have been * used inconsistently, meaning that a lot of existing code does not match the I am not certain whether this is relevant information in these locations specifically. These descriptions are for the address spaces themselves, rather than for the representations therewithin. 64K granularity is also similar to 2M/1G superpages in their handling, the difference being that 64K can't be subdivided if necessary? I think a section about granularity is worthwhile, but probably a separate paragraph. I think it is also worth keeping Xen's idea of memory all at 4K, and in cases where 64K is in use, require appropriate alignment in the parameter. ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] Design doc of adding ACPI support for arm64 on Xen
On 2015/8/5 18:31, Stefano Stabellini wrote: On Wed, 5 Aug 2015, Shannon Zhao wrote: On 2015/8/4 22:37, Stefano Stabellini wrote: On Tue, 4 Aug 2015, Shannon Zhao wrote: This document is going to explain the design details of Xen booting with ACPI on ARM. Maybe parts of it may not be appropriate. Any comments are welcome. Good start! To Xen itself booting with ACPI, this is similar to Linux kernel except that Xen doesn't parse DSDT table. So I'll skip this part and focus on how Xen prepares ACPI tables for DOM0 and how Xen passes them to DOM0. 1)copy and change some EFI and ACPI tables. a) Copy EFI_SYSTEM_TABLE and change the value of FirmwareVendor, VendorGuid, VendorTable, ConfigurationTable. These changes are not very special and it just assign values to these members. b) Create EFI_MEMORY_DESCRIPTOR table. This will add memory start and size information of DOM0. And DOM0 will get the memory information through this EFI table. c) Copy FADT table. Change the value of arm_boot_flags to enable PSCI and HVC. Let the hypervisor_id be XenVMM in order to tell DOM0 that it runs on Xen hypervisor, so DOM0 can call hypercall to get some informations for booting necessity, such as grant tab start address and size. Change header revison, length and checksum as well. d) Copy GTDT table. Set non_secure_el2_interrupt and non_secure_el2_flags to 0 to mask EL2 timer for DOM0. e) Copy MADT table. According to the value of dom0_max_vcpus to change the number GICC entries. f) Create STAO table. This table is a new added one that's used to define a list of ACPI namespace names that are to be ignored by the OSPM in DOM0. Currently we use it to tell OSPM should ignore UART defined in SPCR table. g) Copy XSDT table. Add a new table entry for STAO and change other table's entries. h) Change the value of xsdt_physical_address in RSDP table. i) The reset of tables are not copied or changed. They are reused including DSDT, SPCR. OK so far All these tables will be copied or mapped to guest memory. Are they copied or mapped? Also I think we need to recalculate the md5sum? 2)Create minimal DT to pass required informations to DOM0 The minimal DT mainly passes DOM0 bootargs, address and size of initrd (if available), address and size of uefi system table, address and size of uefi memory table, uefi-mmap-desc-size and uefi-mmap-desc-ver. I think we need to specify which Linux entry point is called, that I think will be the proper non-EFI kernel entry point, which requires MMU off (see Documentation/efi-stub.txt in linux). Also it would be better to write the full bindings of the generated minimal DT, see http://marc.info/?l=linux-kernelm=142362266626403w=2 and Documentation/arm/uefi.txt in linux. An example of the minimal DT: / { #address-cells = 2; #size-cells = 1; chosen { bootargs = kernel=Image console=hvc0 earlycon=pl011,0x1c09 root=/dev/vda2 rw rootfstype=ext4 init=/bin/sh acpi=force; linux,initrd-start = 0x; linux,initrd-end = 0x; linux,uefi-system-table = 0x; linux,uefi-mmap-start = 0x; linux,uefi-mmap-size = 0x; linux,uefi-mmap-desc-size = 0x; linux,uefi-mmap-desc-ver = 0x; }; }; Good, please include this example in the doc. Please include a pointer to Documentation/arm/uefi.txt which lists these paramaters. ok, will add it in this doc v2. 3)DOM0 how to get grant table and event channel irq informations As said above, we assign the hypervisor_id be XenVMM to tell DOM0 that it runs on Xen hypervisor. Then save the start address and size of grant table in domain-grant_table-start_addr and domain-grant_table-size. DOM0 can call a new hypercall GNTTABOP_get_start_addr to get these info. Same to event channel, we've already save interrupt number in d-arch.evtchn_irq, so DOM0 can call a new hypercall EVTCHNOP_get_irq to get the irq. It would be nice to go down into more details and write the parameters of the hypercalls in the doc as they will become a newly supported ABI. The parameters of GNTTABOP_get_start_addr is like below: struct gnttab_get_start_addr { /* IN parameters */ domid_t dom; uint16_t pad; /* OUT parameters */ uint64_t start_addr; uint64_t size; }; The parameters of EVTCHNOP_get_irq is like below: struct evtchn_get_irq { /* IN parameters. */ domid_t dom; uint16_t pad; /* OUT parameters. */ uint32_t irq; }; I think that it makes sense to reuse the existing HVM_PARAM_CALLBACK_IRQ hvmop call in this case. See drivers/xen/events/events_base.c:xen_set_callback_via in Linux and xen/include/public/hvm/params.h in Xen. I would just add a
Re: [Xen-devel] [RFC 1/2] xen/mm: Clarify the granularity for each Frame Number
On Wed, 2015-08-05 at 12:40 +0100, Andrew Cooper wrote: 64K granularity is also similar to 2M/1G superpages in their handling, the difference being that 64K can't be subdivided if necessary? 64K is actually a separate basic granule (to use the ARM term), i.e. alternative to the 4K leaf page size, you can still have e.g. 32M super pages pages if them etc (assuming there are still 512 PTEs per page, I didn't actually check). So I think thinking of 64K as a superpage while superficially correct would probably lead to some misunderstanding or confusion at some point. FWIW there is also a 16K granule available on ARM systems. PPC supports 256K pages too... Ian ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] osstest: install libnl3 packages
On Wed, 2015-08-05 at 11:51 +0100, Ian Campbell wrote: There was a conflict with the addition of ebtables, the result is below. commit 2ff90f75e5b40152998b69900adf1985382409bf Author: Roger Pau Monne roger@citrix.com Date: Wed Jul 1 17:12:25 2015 +0200 osstest: install libnl3 packages Install the libnl3 packages needed by the remus code. Those are available on both Wheezy and Jessie, although the Wheezy ones are too old. This patch implicitly drops support for lenny and squeeze. Signed-off-by: Roger Pau Monné roger@citrix.com Acked-by: Ian Campbell ian.campb...@citrix.com Cc: Ian Jackson ian.jack...@eu.citrix.com Cc: Ian Campbell ian.campb...@citrix.com Cc: Shriram Rajagopalan rshri...@cs.ubc.ca Cc: Yang Hongyang yan...@cn.fujitsu.com [ ijc -- resolved conflict with addition of ebtables ] diff --git a/ts-xen-build-prep b/ts-xen-build-prep index 03ad35c..ace8080 100755 --- a/ts-xen-build-prep +++ b/ts-xen-build-prep @@ -206,14 +206,9 @@ sub prep () { autoconf automake libtool xsltproc libxml2-utils libxml2-dev libdevmapper-dev w3c-dtd-xhtml libxml-xpath-perl - ccache nasm checkpolicy ebtables); + ccache nasm checkpolicy ebtalbes Ahem, fixed that and repushed... + libnl-3-dev libnl-route-3-dev); -if ($ho-{Suite} =~ m/wheezy|squeeze|lenny/) { - push(@packages, libnl-dev); -} else { - # jessie (jessie?) - push(@packages, libnl-route-3-dev); -} target_install_packages($ho, @packages); target_cmd_root($ho, chmod -R a+r /usr/share/git-core/templates); # workaround for Debian #595728 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [qemu-upstream-unstable test] 60546: trouble: blocked/broken/fail/pass
flight 60546 qemu-upstream-unstable real [real] http://logs.test-lab.xenproject.org/osstest/logs/60546/ Failures and problems with tests :-( Tests which did not succeed and are blocking, including tests which could not be run: build-armhf 3 host-install(3) broken REGR. vs. 58880 Tests which did not succeed, but are not blocking: test-armhf-armhf-xl-qcow2 1 build-check(1) blocked n/a test-armhf-armhf-xl-vhd 1 build-check(1) blocked n/a test-armhf-armhf-libvirt-qcow2 1 build-check(1) blocked n/a test-armhf-armhf-libvirt-vhd 1 build-check(1) blocked n/a test-armhf-armhf-libvirt-raw 1 build-check(1) blocked n/a test-armhf-armhf-xl-rtds 1 build-check(1) blocked n/a build-armhf-libvirt 1 build-check(1) blocked n/a test-armhf-armhf-xl 1 build-check(1) blocked n/a test-armhf-armhf-libvirt-xsm 1 build-check(1) blocked n/a test-armhf-armhf-xl-raw 1 build-check(1) blocked n/a test-armhf-armhf-xl-cubietruck 1 build-check(1) blocked n/a test-armhf-armhf-xl-multivcpu 1 build-check(1) blocked n/a test-armhf-armhf-libvirt 1 build-check(1) blocked n/a test-armhf-armhf-xl-arndale 1 build-check(1) blocked n/a test-armhf-armhf-xl-credit2 1 build-check(1) blocked n/a test-amd64-amd64-libvirt 12 migrate-support-checkfail never pass test-amd64-i386-libvirt-raw 11 migrate-support-checkfail never pass test-amd64-amd64-libvirt-raw 11 migrate-support-checkfail never pass test-amd64-amd64-libvirt-qcow2 11 migrate-support-checkfail never pass test-amd64-i386-libvirt-qcow2 11 migrate-support-checkfail never pass test-amd64-amd64-libvirt-vhd 11 migrate-support-checkfail never pass test-amd64-i386-libvirt-vhd 11 migrate-support-checkfail never pass test-amd64-amd64-xl-pvh-amd 11 guest-start fail never pass test-amd64-i386-libvirt-xsm 12 migrate-support-checkfail never pass test-amd64-amd64-libvirt-xsm 12 migrate-support-checkfail never pass test-amd64-i386-libvirt 12 migrate-support-checkfail never pass test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop fail never pass test-amd64-amd64-xl-pvh-intel 11 guest-start fail never pass test-armhf-armhf-xl-xsm 12 migrate-support-checkfail never pass test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop fail never pass version targeted for testing: qemuubcf35eec0b621c46dbf0aeb40c6bc06b5d3981aa baseline version: qemuuc4a962ec0c61aa9b860a3635c8424472e6c2cc2c Last test of basis58880 2015-06-24 13:45:58 Z 41 days Failing since 59777 2015-07-20 12:49:32 Z 15 days 10 attempts Testing same since60546 2015-08-03 16:17:04 Z1 days1 attempts People who touched revisions under test: Gerd Hoffmann kra...@redhat.com Jan Beulich jbeul...@suse.com Kevin Wolf kw...@redhat.com Marc-André Lureau marcandre.lur...@gmail.com Paolo Bonzini pbonz...@redhat.com Stefan Hajnoczi stefa...@redhat.com Stefano Stabellini stefano.stabell...@eu.citrix.com jobs: build-amd64-xsm pass build-armhf-xsm pass build-i386-xsm pass build-amd64 pass build-armhf broken build-i386 pass build-amd64-libvirt pass build-armhf-libvirt blocked build-i386-libvirt pass build-amd64-pvopspass build-armhf-pvopspass build-i386-pvops pass test-amd64-amd64-xl pass test-armhf-armhf-xl blocked test-amd64-i386-xl pass test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsmpass test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm pass test-amd64-amd64-libvirt-xsm pass test-armhf-armhf-libvirt-xsm blocked test-amd64-i386-libvirt-xsm pass test-amd64-amd64-xl-xsm pass test-armhf-armhf-xl-xsm pass test-amd64-i386-xl-xsm
Re: [Xen-devel] [PATCH v2 4/8] xen: Use the correctly the Xen memory terminologies
On 08/05/2015 06:51 AM, Julien Grall wrote: diff --git a/drivers/video/fbdev/xen-fbfront.c b/drivers/video/fbdev/xen-fbfront.c index 09dc447..25e3cce 100644 --- a/drivers/video/fbdev/xen-fbfront.c +++ b/drivers/video/fbdev/xen-fbfront.c @@ -539,7 +539,7 @@ static int xenfb_remove(struct xenbus_device *dev) static unsigned long vmalloc_to_mfn(void *address) { -return pfn_to_mfn(vmalloc_to_pfn(address)); +return pfn_to_gfn(vmalloc_to_pfn(address)); } Are you sure? This will return vmalloc_to_pfn(address)). I guess you mean vmalloc_to_mfn will return vmalloc_to_pfn? If so, it will be only the case on auto-translated case (because pfn == gfn). In the case of PV, the mfn will be returned. How will mfn be returned on PV when pfn_to_gfn() is an identity function? static inline unsigned long pfn_to_gfn(unsigned long pfn) { return pfn; } -boris Although, this function is misnamed. It's fixed in a follow-up patch (see #6) because it's required more renaming than this function. I didn't want to add such changes within this patch. Regards, ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] OSSTEST -- nested test case development, RFC: ts-guest-destroy doesn't call guest_await_dhcp_tcp() if guest has fixed IP
On Wed, 2015-08-05 at 06:22 +, Hu, Robert wrote: Hi Ians, I don't 100% recall how this is supposed to fit together. IIRC: 1# L0 is installed as usual #2 An L1 guest is installed. That L1 guest gets an IP address from DHCP in the normal way. 3# Then ts-nested setup customises the L1 guest into an L1 host, storing the DHCP assigned address in $r{${l1ident}_ip}. (I'm not sure if it is actually called l1ident, but whatever it is). 4# Then operations which selecthost(l1ident) see that $r{${l1ident}_ip} and use it as the $ho-{Ip} instead of looking for it in the host db. 5# At some point an L2 guest is installed on the L1 host and it also gets an IP from DHCP in the usual way. Is that all correct? Current ts-guest-destory will invoke guest_await_dhcp_tcp(); but in nested case, after l1 turns into Xen environment, it then has fixed IP address; which in turn has failed at dhcp lease check. So here are we talking about ts-guest-destroy of an L2 guest on the L1 host, or of the L1 guest on the L0 host? I think you are talking about the L1 guest on the L0 host. In that context ts-guest-destroy will be considering the L1 as a guest, so I would expect that guest_await_dhcp_tcp should work, because the L1 guest's IP was assigned via DHCP in #2 above. So I suppose the question is how/why is guest_await_dhcp_tcp failing when operating on the L1 guest? It should be just a guest in this context I think. So, how about if I in ts-guest-destroy bypass guest_await_dhcp_tcp() if we have $r{guest-Guest_ip}? Best Regards, Robert Ho ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 4/8] xen: Use the correctly the Xen memory terminologies
On 05/08/15 13:19, Boris Ostrovsky wrote: On 08/05/2015 06:51 AM, Julien Grall wrote: diff --git a/drivers/video/fbdev/xen-fbfront.c b/drivers/video/fbdev/xen-fbfront.c index 09dc447..25e3cce 100644 --- a/drivers/video/fbdev/xen-fbfront.c +++ b/drivers/video/fbdev/xen-fbfront.c @@ -539,7 +539,7 @@ static int xenfb_remove(struct xenbus_device *dev) static unsigned long vmalloc_to_mfn(void *address) { -return pfn_to_mfn(vmalloc_to_pfn(address)); +return pfn_to_gfn(vmalloc_to_pfn(address)); } Are you sure? This will return vmalloc_to_pfn(address)). I guess you mean vmalloc_to_mfn will return vmalloc_to_pfn? If so, it will be only the case on auto-translated case (because pfn == gfn). In the case of PV, the mfn will be returned. How will mfn be returned on PV when pfn_to_gfn() is an identity function? static inline unsigned long pfn_to_gfn(unsigned long pfn) { return pfn; } The identity function is only for ARM guest which are always auto-translated (arch/arm/include/asm/xen/page.h). The x86 version contains a check if the guest is auto-translated or not (arch/x86/include/asm/xen/page.): static inline unsigned long pfn_to_gfn(unsigned long pfn) { if (xen_feature(XENFEAT_auto_translated_physmap)) return pfn; else return pfn_to_mfn(pfn); } Regards, -- Julien Grall ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH for-4.6] x86/mm: Make {hap, shadow}_teardown() preemptible
From: Anshul Makkar anshul.mak...@citrix.com A domain with sufficient shadow allocation can cause a watchdog timeout during domain destruction. Expand the existing -ERESTART logic in paging_teardown() to allow {hap/sh}_set_allocation() to become restartable during the DOMCTL_destroydomain hypercall. Signed-off-by: Anshul Makkar anshul.mak...@citrix.com Signed-off-by: Andrew Cooper andrew.coop...@citrix.com --- CC: Tim Deegan t...@xen.org CC: George Dunlap george.dun...@eu.citrix.com CC: Jan Beulich jbeul...@suse.com CC: Wei Liu wei.l...@citrix.com Wei: Currently, Xen can't actually run 1TB guests without suffering a watchdog timeout when destroying the domain. Therefore this is fairly important for 4.6 (and backport), or we will have to retroactively drop the currently-stated supported limits on the wiki. The patch has had extensive testing in XenServer, although has been forward ported from 4.5. --- xen/arch/x86/mm/hap/hap.c |9 ++--- xen/arch/x86/mm/paging.c|8 +--- xen/arch/x86/mm/shadow/common.c |9 ++--- xen/include/asm-x86/hap.h |2 +- xen/include/asm-x86/shadow.h|4 ++-- 5 files changed, 20 insertions(+), 12 deletions(-) diff --git a/xen/arch/x86/mm/hap/hap.c b/xen/arch/x86/mm/hap/hap.c index d375c4d..21ae5d4 100644 --- a/xen/arch/x86/mm/hap/hap.c +++ b/xen/arch/x86/mm/hap/hap.c @@ -551,7 +551,7 @@ void hap_final_teardown(struct domain *d) } if ( d-arch.paging.hap.total_pages != 0 ) -hap_teardown(d); +hap_teardown(d, NULL); p2m_teardown(p2m_get_hostp2m(d)); /* Free any memory that the p2m teardown released */ @@ -561,7 +561,7 @@ void hap_final_teardown(struct domain *d) paging_unlock(d); } -void hap_teardown(struct domain *d) +void hap_teardown(struct domain *d, int *preempted) { struct vcpu *v; mfn_t mfn; @@ -595,7 +595,9 @@ void hap_teardown(struct domain *d) d-arch.paging.hap.total_pages, d-arch.paging.hap.free_pages, d-arch.paging.hap.p2m_pages); -hap_set_allocation(d, 0, NULL); +hap_set_allocation(d, 0, preempted); +if ( preempted *preempted ) +goto out; HAP_PRINTK(teardown done. pages total = %u, free = %u, p2m=%u\n, d-arch.paging.hap.total_pages, @@ -609,6 +611,7 @@ void hap_teardown(struct domain *d) xfree(d-arch.hvm_domain.dirty_vram); d-arch.hvm_domain.dirty_vram = NULL; +out: paging_unlock(d); } diff --git a/xen/arch/x86/mm/paging.c b/xen/arch/x86/mm/paging.c index 618f475..43fdd48 100644 --- a/xen/arch/x86/mm/paging.c +++ b/xen/arch/x86/mm/paging.c @@ -799,12 +799,14 @@ long paging_domctl_continuation(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl) /* Call when destroying a domain */ int paging_teardown(struct domain *d) { -int rc; +int rc, preempted = 0; if ( hap_enabled(d) ) -hap_teardown(d); +hap_teardown(d, preempted); else -shadow_teardown(d); +shadow_teardown(d, preempted); +if ( preempted ) +return -ERESTART; /* clean up log dirty resources. */ rc = paging_free_log_dirty_bitmap(d, 0); diff --git a/xen/arch/x86/mm/shadow/common.c b/xen/arch/x86/mm/shadow/common.c index abce8e2..953cacf 100644 --- a/xen/arch/x86/mm/shadow/common.c +++ b/xen/arch/x86/mm/shadow/common.c @@ -3071,7 +3071,7 @@ int shadow_enable(struct domain *d, u32 mode) return rv; } -void shadow_teardown(struct domain *d) +void shadow_teardown(struct domain *d, int *preempted) /* Destroy the shadow pagetables of this domain and free its shadow memory. * Should only be called for dying domains. */ { @@ -3139,7 +3139,9 @@ void shadow_teardown(struct domain *d) d-arch.paging.shadow.free_pages, d-arch.paging.shadow.p2m_pages); /* Destroy all the shadows and release memory to domheap */ -sh_set_allocation(d, 0, NULL); +sh_set_allocation(d, 0, preempted); +if ( preempted *preempted ) +goto out; /* Release the hash table back to xenheap */ if (d-arch.paging.shadow.hash_table) shadow_hash_teardown(d); @@ -3177,6 +3179,7 @@ void shadow_teardown(struct domain *d) d-arch.hvm_domain.dirty_vram = NULL; } +out: paging_unlock(d); /* Must be called outside the lock */ @@ -3198,7 +3201,7 @@ void shadow_final_teardown(struct domain *d) * It is possible for a domain that never got domain_kill()ed * to get here with its shadow allocation intact. */ if ( d-arch.paging.shadow.total_pages != 0 ) -shadow_teardown(d); +shadow_teardown(d, NULL); /* It is now safe to pull down the p2m map. */ p2m_teardown(p2m_get_hostp2m(d)); diff --git a/xen/include/asm-x86/hap.h b/xen/include/asm-x86/hap.h index bd87481..c613836 100644 --- a/xen/include/asm-x86/hap.h +++
Re: [Xen-devel] [RFC 1/2] xen/mm: Clarify the granularity for each Frame Number
On 05/08/15 12:40, Andrew Cooper wrote: I think a section about granularity is worthwhile, but probably a separate paragraph. I think it is also worth keeping Xen's idea of memory all at 4K, and in cases where 64K is in use, require appropriate alignment in the parameter. Which would confuse the reader because PFN which, based on the description, is the OS Frame Number. This frame will always be in the granularity of the OS. So we need to introduce the concept of in each definition. This patch makes clear that MFN and GFN is always 4KB and PFN may vary. Regards, -- Julien Grall ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH OSSTEST] libvirt: Pass correct arguments to virsh migrate
$dst is a host hash/object, resulting in: 2015-08-04 22:35:25 Z executing ssh ... root@172.16.144.34 virsh migrate debian.guest.osstest HASH(0x28f4310) bash: -c: line 0: syntax error near unexpected token `(' bash: -c: line 0: `virsh migrate debian.guest.osstest HASH(0x28f4310)' Switch to using the same pattern as xl.pm, which is to call the argument (containing the host hash) $dho and for $dst to be a local variable containing $dho-{Name}. Also s/$ho/$sho/ to match xl.pm, since I think that is clearer about what role everything has. Signed-off-by: Ian Campbell ian.campb...@citrix.com --- Osstest/Toolstack/libvirt.pm | 7 --- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/Osstest/Toolstack/libvirt.pm b/Osstest/Toolstack/libvirt.pm index 32dca84..e2d646c 100644 --- a/Osstest/Toolstack/libvirt.pm +++ b/Osstest/Toolstack/libvirt.pm @@ -104,10 +104,11 @@ sub saverestore_check ($) { } sub migrate ($) { -my ($self,$gho,$dst,$timeout) = @_; -my $ho = $self-{Host}; +my ($self,$gho,$dho,$timeout) = @_; +my $sho = $self-{Host}; +my $dst = $dho-{Name}; my $gn = $gho-{Name}; -target_cmd_root($ho, virsh migrate $gn $dst, $timeout); +target_cmd_root($sho, virsh migrate $gn $dst, $timeout); } sub save () { -- 2.1.4 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH OSSTEST] libvirt: Pass correct arguments to virsh migrate
On Wed, Aug 05, 2015 at 01:42:19PM +0100, Ian Campbell wrote: $dst is a host hash/object, resulting in: 2015-08-04 22:35:25 Z executing ssh ... root@172.16.144.34 virsh migrate debian.guest.osstest HASH(0x28f4310) bash: -c: line 0: syntax error near unexpected token `(' bash: -c: line 0: `virsh migrate debian.guest.osstest HASH(0x28f4310)' Switch to using the same pattern as xl.pm, which is to call the argument (containing the host hash) $dho and for $dst to be a local variable containing $dho-{Name}. Also s/$ho/$sho/ to match xl.pm, since I think that is clearer about what role everything has. Signed-off-by: Ian Campbell ian.campb...@citrix.com --- Osstest/Toolstack/libvirt.pm | 7 --- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/Osstest/Toolstack/libvirt.pm b/Osstest/Toolstack/libvirt.pm index 32dca84..e2d646c 100644 --- a/Osstest/Toolstack/libvirt.pm +++ b/Osstest/Toolstack/libvirt.pm @@ -104,10 +104,11 @@ sub saverestore_check ($) { } sub migrate ($) { sub migrate () ? -my ($self,$gho,$dst,$timeout) = @_; -my $ho = $self-{Host}; +my ($self,$gho,$dho,$timeout) = @_; +my $sho = $self-{Host}; +my $dst = $dho-{Name}; my $gn = $gho-{Name}; -target_cmd_root($ho, virsh migrate $gn $dst, $timeout); +target_cmd_root($sho, virsh migrate $gn $dst, $timeout); } sub save () { -- 2.1.4 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 4/8] xen: Use the correctly the Xen memory terminologies
On 08/05/2015 08:33 AM, Julien Grall wrote: On 05/08/15 13:19, Boris Ostrovsky wrote: On 08/05/2015 06:51 AM, Julien Grall wrote: diff --git a/drivers/video/fbdev/xen-fbfront.c b/drivers/video/fbdev/xen-fbfront.c index 09dc447..25e3cce 100644 --- a/drivers/video/fbdev/xen-fbfront.c +++ b/drivers/video/fbdev/xen-fbfront.c @@ -539,7 +539,7 @@ static int xenfb_remove(struct xenbus_device *dev) static unsigned long vmalloc_to_mfn(void *address) { -return pfn_to_mfn(vmalloc_to_pfn(address)); +return pfn_to_gfn(vmalloc_to_pfn(address)); } Are you sure? This will return vmalloc_to_pfn(address)). I guess you mean vmalloc_to_mfn will return vmalloc_to_pfn? If so, it will be only the case on auto-translated case (because pfn == gfn). In the case of PV, the mfn will be returned. How will mfn be returned on PV when pfn_to_gfn() is an identity function? static inline unsigned long pfn_to_gfn(unsigned long pfn) { return pfn; } The identity function is only for ARM guest which are always auto-translated (arch/arm/include/asm/xen/page.h). The x86 version contains a check if the guest is auto-translated or not (arch/x86/include/asm/xen/page.): static inline unsigned long pfn_to_gfn(unsigned long pfn) { if (xen_feature(XENFEAT_auto_translated_physmap)) return pfn; else return pfn_to_mfn(pfn); } Of course --- I was looking at the top of the patch and didn't realize it was ARM changes. Sorry for the noise. -boris ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [RFC 1/2] xen/mm: Clarify the granularity for each Frame Number
On 05/08/15 13:36, Julien Grall wrote: On 05/08/15 12:40, Andrew Cooper wrote: I think a section about granularity is worthwhile, but probably a separate paragraph. I think it is also worth keeping Xen's idea of memory all at 4K, and in cases where 64K is in use, require appropriate alignment in the parameter. Which would confuse the reader because PFN which, based on the description, is the OS Frame Number. This frame will always be in the granularity of the OS. A linear idea of a guest physical address space. says nothing about the OS. It is purely a Xen concept, as described here. So we need to introduce the concept of in each definition. This patch makes clear that MFN and GFN is always 4KB and PFN may vary. Is (or rather will) a 4K dom0 able to make 4K mappings of a 64K domU? How is a 64K dom0 expected to make mappings of a 4K domU? The primary use of pfn in Xen is logdirty tracking (which ARM doesn't have yet), but will have to be set at the minimum granularity of the toolstack domain, domU and the logdirty ABI which currently is assumed to be 4K pages because of its x86 heritage. ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH OSSTEST] libvirt: Pass correct arguments to virsh migrate
On Wed, 2015-08-05 at 13:43 +0100, Wei Liu wrote: On Wed, Aug 05, 2015 at 01:42:19PM +0100, Ian Campbell wrote: $dst is a host hash/object, resulting in: 2015-08-04 22:35:25 Z executing ssh ... root@172.16.144.34 virsh migrate debian.guest.osstest HASH(0x28f4310) bash: -c: line 0: syntax error near unexpected token `(' bash: -c: line 0: `virsh migrate debian.guest.osstest HASH(0x28f4310)' Switch to using the same pattern as xl.pm, which is to call the argument (containing the host hash) $dho and for $dst to be a local variable containing $dho-{Name}. Also s/$ho/$sho/ to match xl.pm, since I think that is clearer about what role everything has. Signed-off-by: Ian Campbell ian.campb...@citrix.com --- Osstest/Toolstack/libvirt.pm | 7 --- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/Osstest/Toolstack/libvirt.pm b/Osstest/Toolstack/libvirt.pm index 32dca84..e2d646c 100644 --- a/Osstest/Toolstack/libvirt.pm +++ b/Osstest/Toolstack/libvirt.pm @@ -104,10 +104,11 @@ sub saverestore_check ($) { } sub migrate ($) { sub migrate () ? I think so, I wonder why Perl wasn't complaining about that... I'll respin. -my ($self,$gho,$dst,$timeout) = @_; -my $ho = $self-{Host}; +my ($self,$gho,$dho,$timeout) = @_; +my $sho = $self-{Host}; +my $dst = $dho-{Name}; my $gn = $gho-{Name}; -target_cmd_root($ho, virsh migrate $gn $dst, $timeout); +target_cmd_root($sho, virsh migrate $gn $dst, $timeout); } sub save () { ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH OSSTEST v2] libvirt: Pass correct arguments to virsh migrate
$dst is a host hash/object, resulting in: 2015-08-04 22:35:25 Z executing ssh ... root@172.16.144.34 virsh migrate debian.guest.osstest HASH(0x28f4310) bash: -c: line 0: syntax error near unexpected token `(' bash: -c: line 0: `virsh migrate debian.guest.osstest HASH(0x28f4310)' Switch to using the same pattern as xl.pm, which is to call the argument (containing the host hash) $dho and for $dst to be a local variable containing $dho-{Name}. Also s/$ho/$sho/ to match xl.pm, since I think that is clearer about what role everything has. Fix the prototype too while editing this function. Signed-off-by: Ian Campbell ian.campb...@citrix.com --- v2: Fix the prototype as well. --- Osstest/Toolstack/libvirt.pm | 9 + 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/Osstest/Toolstack/libvirt.pm b/Osstest/Toolstack/libvirt.pm index 32dca84..bfb5ccb 100644 --- a/Osstest/Toolstack/libvirt.pm +++ b/Osstest/Toolstack/libvirt.pm @@ -103,11 +103,12 @@ sub saverestore_check ($) { return check_for_command($self, save); } -sub migrate ($) { -my ($self,$gho,$dst,$timeout) = @_; -my $ho = $self-{Host}; +sub migrate () { +my ($self,$gho,$dho,$timeout) = @_; +my $sho = $self-{Host}; +my $dst = $dho-{Name}; my $gn = $gho-{Name}; -target_cmd_root($ho, virsh migrate $gn $dst, $timeout); +target_cmd_root($sho, virsh migrate $gn $dst, $timeout); } sub save () { -- 2.1.4 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] Design doc of adding ACPI support for arm64 on Xen
On 05/08/15 12:49, Shannon Zhao wrote: That's great! Keep in mind that many ARM platforms have non-PCI busses, so I think we'll need an amba and a platform bus_notifier too, in addition to the existing pci bus notifier. Thanks for your reminding. I thought about amba. Since ACPI of current linux kernel doesn't support probe amba bus devices, so this bus_notifier will not be used at the moment. But there are some voice that we need to make ACPI support amba on the linux arm kernel mail list. And to me it doesn't matter to add the amba bus_notifier. This comment raised one question. What happen if the hardware has MMIO region not described in the ACPI? Does the driver would have to call a specific xen function to register the missing I/O? Regards, -- Julien Grall ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH OSSTEST v2] libvirt: Pass correct arguments to virsh migrate
On Wed, Aug 05, 2015 at 01:48:27PM +0100, Ian Campbell wrote: $dst is a host hash/object, resulting in: 2015-08-04 22:35:25 Z executing ssh ... root@172.16.144.34 virsh migrate debian.guest.osstest HASH(0x28f4310) bash: -c: line 0: syntax error near unexpected token `(' bash: -c: line 0: `virsh migrate debian.guest.osstest HASH(0x28f4310)' Switch to using the same pattern as xl.pm, which is to call the argument (containing the host hash) $dho and for $dst to be a local variable containing $dho-{Name}. Also s/$ho/$sho/ to match xl.pm, since I think that is clearer about what role everything has. Fix the prototype too while editing this function. Signed-off-by: Ian Campbell ian.campb...@citrix.com Acked-by: Wei Liu wei.l...@citrix.com Thanks for the fix! --- v2: Fix the prototype as well. --- Osstest/Toolstack/libvirt.pm | 9 + 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/Osstest/Toolstack/libvirt.pm b/Osstest/Toolstack/libvirt.pm index 32dca84..bfb5ccb 100644 --- a/Osstest/Toolstack/libvirt.pm +++ b/Osstest/Toolstack/libvirt.pm @@ -103,11 +103,12 @@ sub saverestore_check ($) { return check_for_command($self, save); } -sub migrate ($) { -my ($self,$gho,$dst,$timeout) = @_; -my $ho = $self-{Host}; +sub migrate () { +my ($self,$gho,$dho,$timeout) = @_; +my $sho = $self-{Host}; +my $dst = $dho-{Name}; my $gn = $gho-{Name}; -target_cmd_root($ho, virsh migrate $gn $dst, $timeout); +target_cmd_root($sho, virsh migrate $gn $dst, $timeout); } sub save () { -- 2.1.4 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH OSSTEST v2] libvirt: Pass correct arguments to virsh migrate
On Wed, 2015-08-05 at 13:49 +0100, Wei Liu wrote: On Wed, Aug 05, 2015 at 01:48:27PM +0100, Ian Campbell wrote: $dst is a host hash/object, resulting in: 2015-08-04 22:35:25 Z executing ssh ... root@172.16.144.34 virsh migrate debian.guest.osstest HASH(0x28f4310) bash: -c: line 0: syntax error near unexpected token `(' bash: -c: line 0: `virsh migrate debian.guest.osstest HASH(0x28f4310)' Switch to using the same pattern as xl.pm, which is to call the argument (containing the host hash) $dho and for $dst to be a local variable containing $dho-{Name}. Also s/$ho/$sho/ to match xl.pm, since I think that is clearer about what role everything has. Fix the prototype too while editing this function. Signed-off-by: Ian Campbell ian.campb...@citrix.com Acked-by: Wei Liu wei.l...@citrix.com Thanks for the fix! NP. I've pushed this to pretest and killed flight 60606 which hadn't alloca ted any hosts yet. Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] Design doc of adding ACPI support for arm64 on Xen
On 2015/8/5 20:48, Julien Grall wrote: On 05/08/15 12:49, Shannon Zhao wrote: That's great! Keep in mind that many ARM platforms have non-PCI busses, so I think we'll need an amba and a platform bus_notifier too, in addition to the existing pci bus notifier. Thanks for your reminding. I thought about amba. Since ACPI of current linux kernel doesn't support probe amba bus devices, so this bus_notifier will not be used at the moment. But there are some voice that we need to make ACPI support amba on the linux arm kernel mail list. And to me it doesn't matter to add the amba bus_notifier. This comment raised one question. What happen if the hardware has MMIO region not described in the ACPI? This sounds weird. If a device is described in ACPI table, it will not describe the MMIO region which the driver will use? Does this situation exist? If the hardware has mmio region not described in the ACPI, how does the driver know the region and use it? Does the driver would have to call a specific xen function to register the missing I/O? -- Shannon ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v7 2/8] cxenstored: add support for systemd active sockets
On Wed, Aug 05, 2015 at 11:56:44AM +0100, George Dunlap wrote: On Wed, Aug 5, 2015 at 11:17 AM, Ian Campbell ian.campb...@citrix.com wrote: On Wed, 2015-08-05 at 11:06 +0100, George Dunlap wrote: On Fri, Jul 18, 2014 at 12:28 AM, Luis R. Rodriguez mcg...@do-not-panic.com wrote: From: Luis R. Rodriguez mcg...@suse.com This adds systemd socket activation support for the C xenstored. Active sockets enable xenstored to be loaded only if required by a system onto which Xen is installed on. Socket activation is handled by systemd, once a port for a service which claims a socket is used systemd will start the required services for it, on demand. For more details on socket activation refer to Lennart's socket-activation post regarding this [0]. Right now this code adds a no-op for this functionality, leaving the enablement to be done later once systemd is properly hooked into the build system. The socket activation is ordered in aligment with the socket activation order passed on to systemd. [0] http://0pointer.de/blog/projects/socket-activation2.html So with this patch in place, xenstored will not start on a system that has systemd, *even if it wasn't started from systemd*. But where systemd is /sbin/init, right? The case where xenstored was compiled with systemd support but systemd is not /sbin/init should still be expected to work, and isn't what I think you are complaining about here. Lots of systems (e.g., CentOS 7) have legacy systems in place to allow you to do things like chkconfig --add xencommons even on a systemd system. I think we still want to work with those, right? Isn't chkconfig --add still arranging for the thing to be started by systemd under the hood? If not systemd on a host where /sbin/init==systemd then what does else would start it? If you are asking should the sysvinit initscripts still be us(ed|able) even though systemd is being used as /sbin/init on the host and a unit file is present then AIUI the systemd answer is no. (We may choose to disagree with systemd on this I suppose) Well that's not (apparently) the RHEL answer; doing chkconfig --add [foo] Just Works on CentOS 7 for all the sysvinit scripts I've used (including the Xen 4.4 Xen4CentOS packages). I think we want to still *enable* people to use that mode if they want to. But I won't argue if people feel strongly otherwise. On the other hand, does this mean I can no longer start xenstored by hand from the CLI? _That_ would seem to be worth preserving, for debugging etc if nothing else. So what happens at the moment is that xenstored, run either from the command-line says, Oh, look! I'm running on a systemd system. I'll check for my systemd sockets. Oh no, no sockets! *dies*. If run from xencommons, it doesn't even get that far: it says, Oh, look! I'm running on a systemd system. But wait! You asked me to use a pidfile! BAD USER! NO PIDFILE ON SYSTEMD! *dies*. Modifying xenstored to try to open the systemd sockets, and fall back to normal sockets if it doesn't find any, works when started from the command-line. I have always thought this is the expected behaviour. Just that the code has a bug. Here is a patch that is not even compile test. :-) ---8--- From 6f050ca085014fc121e2bc2c0ff66feded0cd210 Mon Sep 17 00:00:00 2001 From: Wei Liu wei.l...@citrix.com Date: Wed, 5 Aug 2015 14:15:27 +0100 Subject: [PATCH] cxenstored: fix systemd activation Function sd_booted() returns positive number when systemd is running. Don't use barf when we don't intent to exit the program. Signed-off-by: Wei Liu wei.l...@citrix.com --- tools/xenstore/xenstored_core.c | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/tools/xenstore/xenstored_core.c b/tools/xenstore/xenstored_core.c index b7e4936..3a8e2fe 100644 --- a/tools/xenstore/xenstored_core.c +++ b/tools/xenstore/xenstored_core.c @@ -1990,10 +1990,10 @@ int main(int argc, char *argv[]) barf(%s: No arguments desired, argv[0]); #if defined(XEN_SYSTEMD_ENABLED) - if (sd_booted()) { + if (sd_booted() 0) { dofork = false; if (pidfile) - barf(%s: PID file not needed on systemd, argv[0]); + xprintf(%s: PID file not needed on systemd, argv[0]); pidfile = NULL; } #endif @@ -2020,7 +2020,7 @@ int main(int argc, char *argv[]) signal(SIGPIPE, SIG_IGN); #if defined(XEN_SYSTEMD_ENABLED) - if (sd_booted()) + if (sd_booted() 0) xen_claim_active_sockets(sock, ro_sock); else #endif @@ -2057,7 +2057,7 @@ int main(int argc, char *argv[]) xenbus_notify_running(); #if defined(XEN_SYSTEMD_ENABLED) - if (sd_booted()) { + if (sd_booted() 0) { sd_notify(1, READY=1); fprintf(stderr, SD_NOTICE xenstored is ready\n);
Re: [Xen-devel] [RFC 1/2] xen/mm: Clarify the granularity for each Frame Number
Hi, On 05/08/15 13:46, Andrew Cooper wrote: On 05/08/15 13:36, Julien Grall wrote: On 05/08/15 12:40, Andrew Cooper wrote: I think a section about granularity is worthwhile, but probably a separate paragraph. I think it is also worth keeping Xen's idea of memory all at 4K, and in cases where 64K is in use, require appropriate alignment in the parameter. Which would confuse the reader because PFN which, based on the description, is the OS Frame Number. This frame will always be in the granularity of the OS. A linear idea of a guest physical address space. says nothing about the OS. It is purely a Xen concept, as described here. Right. Thank you for your explanation IRL. I will have to find another place to clear specify the granularity of the hypercalls. So we need to introduce the concept of in each definition. This patch makes clear that MFN and GFN is always 4KB and PFN may vary. Is (or rather will) a 4K dom0 able to make 4K mappings of a 64K domU? How is a 64K dom0 expected to make mappings of a 4K domU? The Xen interface will stay 4K even with 64K guest. We have to support 64K guest/dom0 on the current Xen because some distro may do the choice to only ship 64K. In my current implementation of Linux 64K support (see [1]), there is no changes in Xen (hypervisor and tools). Linux is breaking the 64K page in 4K chunk. When the backend is 64K, it will map the foreign 4K at the top of a 64K page. It's a waste of memory, but it's easier to implement and it's still and improvement compare to have Linux crashing at boot. Note that there is lots of room of improvement but I'd like to get a series as soon as possible. That doesn't mean we should have a hack, just something sensible and working. The primary use of pfn in Xen is logdirty tracking (which ARM doesn't have yet), but will have to be set at the minimum granularity of the toolstack domain, domU and the logdirty ABI which currently is assumed to be 4K pages because of its x86 heritage. Which will be fine for ARM given that ARM32 only supports 4K. Regards, [1] https://lkml.org/lkml/2015/7/9/611 -- Julien Grall ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] Hotpatch construction and __LINE__
On 05/08/15 09:55, Martin Pohlack wrote: Hi, Another high-level point to think about is how we want to handle inlined __LINE__ references. This problem is related to hotpatch construction and potentially has influence on the design of the hotpatching infrastructure in Xen. Let me try to explain the problem: We have file1.c with functions f1 and f2 (in that order). f2 contains a BUG() (or WARN()) macro and at that point embeds the source line number into the generated code for f2. Now we want to hotpatch f1 and the hotpatch source-code patch adds 2 lines to f1 and as a consequence shifts out f2 by two lines. The newly constructed file1.o will now contain differences in both binary functions f1 (because we actually changed it with the applied patch) and f2 (because the contained BUG macro embeds the new line number). Without additional information, an algorithm comparing file1.o before and after hotpatch application will determine both functions to be changed and will have to include both into the binary hotpatch. Options: 1. Transform source code patches for hotpatches to be line-neutral for each chunk. This can be done in almost all cases with either reformatting of the source code or by introducing artificial preprocessor #line n directives to adjust for the introduced differences. This approach is low-tech and simple. Potentially generated backtraces and existing debug information refers to the original build and does not reflect hotpatching state except for actually hotpatched functions but should be mostly correct. 2. Ignoring the problem and living with artificially large hotpatches that unnecessarily patch many functions. This approach might lead to some very large hotpatches depending on content of specific source file. It may also trigger pulling in functions into the hotpatch that cannot reasonable be hotpatched due to limitations of a hotpatching framework (init-sections, parts of the hotpatching framework itself, ...) and may thereby prevent us from patching a specific problem. The decision between 1. and 2. can be made on a patch--by-patch basis. 3. Introducing an indirection table for storing line numbers and treating that specially for binary diffing. I believe Linux follows this approach, but the details escape me ATM. We might either use this indirection table for runtime use and patch that with each hotpatch (similarly to exception tables) or we might purely use it when building hotpatches to ignore functions that only differ at exactly the location where a line-number is embedded. Similar considerations are true to a lesser extent for __FILE__, but I would argue that file renaming should be done outside of hotpatches. Looking at `strings xen-syms`, we have very few embedded __FILE__/__LINE__'s in other strings, meaning that most come from %s/%d formatting. bug/warn/assert frames have their references in the exception table, so should be reasonably self contained to fix up. Regular printk()s are definitely more problematic however. Their references will typically be in %rsi/%rdx and shifting any #line reference across the 256 boundary will change the encoding size of setting the print up. Jan had a plan to make Xen read its own DWARF symbol table rather than using the current cludge we have where we partially link Xen, extract the public symbol table, rewrite symbol-offsets.c and relink it onto the end. Perhaps there is an opportunity to piggy-back onto that and have all file/line references coming from the DWARF information. I suspect this will be similar to what Linux does, as it already has support for using its own DWARF/STABS information. Personally, I think #2 should be avoided wherever possible. The bigger the hotpatch, the greater the opportunity for something to go wrong. Any panic/stack trace should be able to identify exactly which Xen is running, including some hotpatch status. Hopatches will be[1] small targeted fixes, so I don't think it is unreasonable to expect someone interpreting a stack trace to need to adjust file/line references by the hotpatch delta. ~Andrew [1] woe betide anyone attempting to use hotpatches for general software updates. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [RFC PATCH v3.1 2/2] xsplice: Add hook for build_id
On 05.08.2015 10:58, Andrew Cooper wrote: On 05/08/15 09:50, Martin Pohlack wrote: On 27.07.2015 21:20, Konrad Rzeszutek Wilk wrote: Signed-off-by: Konrad Rzeszutek Wilk konrad.w...@oracle.com --- tools/libxc/xc_private.c | 3 +++ tools/misc/xen-xsplice.c | 25 + xen/common/kernel.c | 11 +++ xen/common/version.c | 5 + xen/include/public/version.h | 4 xen/include/xen/compile.h.in | 1 + xen/include/xen/version.h| 1 + 7 files changed, 50 insertions(+) diff --git a/tools/libxc/xc_private.c b/tools/libxc/xc_private.c index 2ffebd9..7c039ca 100644 --- a/tools/libxc/xc_private.c +++ b/tools/libxc/xc_private.c @@ -713,6 +713,9 @@ int xc_version(xc_interface *xch, int cmd, void *arg) case XENVER_commandline: sz = sizeof(xen_commandline_t); break; +case XENVER_build_id: +sz = sizeof(xen_build_id_t); +break; default: ERROR(xc_version: unknown command %d\n, cmd); return -EINVAL; diff --git a/tools/misc/xen-xsplice.c b/tools/misc/xen-xsplice.c index 7cf9879..dd8266c 100644 --- a/tools/misc/xen-xsplice.c +++ b/tools/misc/xen-xsplice.c @@ -17,6 +17,7 @@ void show_help(void) id An unique name of payload. Up to 40 characters.\n Commands:\n help display this help\n + build-id display build-id of hypervisor.\n upload id file upload file cpuid with id name\n list list payloads uploaded.\n apply id apply id patch.\n @@ -306,12 +307,36 @@ int action_func(int argc, char *argv[], unsigned int idx) return rc; } + +static int build_id_func(int argc, char *argv[]) +{ +xen_build_id_t build_id; + +if ( argc ) +{ +show_help(); +return -1; +} + +memset(build_id, 0, sizeof(*build_id)); + +if ( xc_version(xch, XENVER_build_id, build_id) 0 ) +{ +printf(Failed to get build_id: %d(%s)\n, errno, strerror(errno)); +return -1; +} + +printf(%s\n, build_id); +return 0; +} + struct { const char *name; int (*function)(int argc, char *argv[]); } main_options[] = { { help, help_func }, { list, list_func }, +{ build-id, build_id_func }, { upload, upload_func }, }; diff --git a/xen/common/kernel.c b/xen/common/kernel.c index 6a3196a..e9d41b6 100644 --- a/xen/common/kernel.c +++ b/xen/common/kernel.c @@ -357,6 +357,17 @@ DO(xen_version)(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg) if ( copy_to_guest(arg, saved_cmdline, ARRAY_SIZE(saved_cmdline)) ) return -EFAULT; return 0; + +case XENVER_build_id: +{ +xen_build_id_t build_id; + +memset(build_id, 0, sizeof(build_id)); +safe_strcpy(build_id, xen_build_id()); You seem to want to store and transfer the build_id as a string. Any reason why we don't directly expose the build_id embedded by the linker in binary format? +if ( copy_to_guest(arg, build_id, ARRAY_SIZE(build_id)) ) +return -EFAULT; +return 0; +} We should not expose the build_id to normal guests, but only to Dom0. A build_id uniquely identifies a specific build and I don't see how that information would be required from DomU. It might actually help an attacker to build his return-oriented programming exploit against a specific build. The normal version numbers should be enough to know about capabilities and API. It will need its own XSM hook, but need not be strictly limited to just dom0. } return -ENOSYS; diff --git a/xen/common/version.c b/xen/common/version.c index b152e27..5c3dbb0 100644 --- a/xen/common/version.c +++ b/xen/common/version.c @@ -55,3 +55,8 @@ const char *xen_banner(void) { return XEN_BANNER; } + +const char *xen_build_id(void) +{ +return XEN_BUILD_ID; +} diff --git a/xen/include/public/version.h b/xen/include/public/version.h index 44f26b0..c863393 100644 --- a/xen/include/public/version.h +++ b/xen/include/public/version.h @@ -83,6 +83,10 @@ typedef struct xen_feature_info xen_feature_info_t; #define XENVER_commandline 9 typedef char xen_commandline_t[1024]; +#define XENVER_build_id 10 +typedef char xen_build_id_t[1024]; +#define XEN_BUILD_ID_LEN (sizeof(xen_build_id_t)) + #endif /* __XEN_PUBLIC_VERSION_H__ */ /* diff --git a/xen/include/xen/compile.h.in b/xen/include/xen/compile.h.in index 440ecb2..939685e 100644 --- a/xen/include/xen/compile.h.in +++ b/xen/include/xen/compile.h.in @@ -10,4 +10,5 @@ #define XEN_EXTRAVERSION @@extraversion@@ #define XEN_CHANGESET @@changeset@@ +#define XEN_BUILD_ID@@changeset@@ That leads to a chicken and egg problem when embedding a real build_id. Some linker script magic
Re: [Xen-devel] [PATCH v3] VT-d: add iommu=igfx option to workaround graphics issues
On Wed, Aug 05, 2015 at 01:18:05PM +0100, Andrew Cooper wrote: On 05/08/15 10:11, Ting-Wei Lan wrote: When using Linux = 3.19 (commit 47591df) as dom0 on some Intel Ironlake devices, It is possible to encounter graphics issues that make screen unreadable or crash the system. It was reported in freedesktop bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90037 As we still cannot find a proper fix for this problem, this patch adds iommu=igfx option to control whether Intel graphics IOMMU is enabled. Running Xen with iommu=no-igfx is similar to running Linux with intel_iommu=igfx_off, which disables IOMMU for Intel GPU. This can be used by users to manually workaround the problem before a fix is available for i915 driver. Signed-off-by: Ting-Wei Lan lant...@gmail.com Reviewed-by: Andrew Cooper andrew.coop...@citrix.com Wei: I think this is a candidate for inclusion into 4.6 This patch is straight-forward and it's basically risk free. So Release-acked-by: Wei Liu wei.l...@citrix.com (subject to an ack from vt-d maintainers) ~Andrew --- Changed since v2: * Make no-igfx available for all Intel devices, not just Calpella/Ironlake Changed since v1: * Replace igfx_off with igfx docs/misc/xen-command-line.markdown | 11 ++- xen/drivers/passthrough/iommu.c | 3 +++ xen/drivers/passthrough/vtd/quirks.c | 3 +++ xen/include/xen/iommu.h | 2 +- 4 files changed, 17 insertions(+), 2 deletions(-) diff --git a/docs/misc/xen-command-line.markdown b/docs/misc/xen-command-line.markdown index 13f03ad..486e53b 100644 --- a/docs/misc/xen-command-line.markdown +++ b/docs/misc/xen-command-line.markdown @@ -793,7 +793,7 @@ debug hypervisor only). Default: `new` unless directed-EOI is supported ### iommu - `= List of [ boolean | force | required | intremap | qinval | snoop | sharept | dom0-passthrough | dom0-strict | amd-iommu-perdev-intremap | workaround_bios_bug | verbose | debug ]` + `= List of [ boolean | force | required | intremap | qinval | snoop | sharept | dom0-passthrough | dom0-strict | amd-iommu-perdev-intremap | workaround_bios_bug | igfx | verbose | debug ]` Sub-options: @@ -867,6 +867,15 @@ debug hypervisor only). ignored (normally IOMMU setup fails if any of the devices listed by a DRHD entry aren't PCI discoverable). + `igfx` (VT-d) + + Default: `true` + + Enable IOMMU for Intel graphics devices. The intended usage of this option + is `no-igfx`, which is silimar to Linux `intel_iommu=igfx_off` option used ^ similar Wei. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH] xsplice: Use ld-embedded build-ids
Todo: * Should be moved to sysctl to only allow Dom0 access * Maybe convert to binary transport to userland instead of printable form * use ld to actually embed the build ID * convert to textual representation in hypervisor and report in printable form Signed-off-by: Martin Pohlack mpohl...@amazon.de --- xen/arch/x86/Makefile| 4 ++-- xen/arch/x86/xen.lds.S | 5 + xen/common/kernel.c | 33 + xen/common/version.c | 5 - xen/include/xen/compile.h.in | 1 - 5 files changed, 36 insertions(+), 12 deletions(-) diff --git a/xen/arch/x86/Makefile b/xen/arch/x86/Makefile index 5f24951..f724bd8 100644 --- a/xen/arch/x86/Makefile +++ b/xen/arch/x86/Makefile @@ -108,11 +108,11 @@ $(TARGET)-syms: prelink.o xen.lds $(BASEDIR)/common/symbols-dummy.o $(BASEDIR)/common/symbols-dummy.o -o $(@D)/.$(@F).0 $(NM) -n $(@D)/.$(@F).0 | $(BASEDIR)/tools/symbols $(@D)/.$(@F).0.S $(MAKE) -f $(BASEDIR)/Rules.mk $(@D)/.$(@F).0.o - $(LD) $(LDFLAGS) -T xen.lds -N prelink.o \ + $(LD) $(LDFLAGS) -T xen.lds -N prelink.o --build-id=sha1 \ $(@D)/.$(@F).0.o -o $(@D)/.$(@F).1 $(NM) -n $(@D)/.$(@F).1 | $(BASEDIR)/tools/symbols $(@D)/.$(@F).1.S $(MAKE) -f $(BASEDIR)/Rules.mk $(@D)/.$(@F).1.o - $(LD) $(LDFLAGS) -T xen.lds -N prelink.o \ + $(LD) $(LDFLAGS) -T xen.lds -N prelink.o --build-id=sha1 \ $(@D)/.$(@F).1.o -o $@ rm -f $(@D)/.$(@F).[0-9]* diff --git a/xen/arch/x86/xen.lds.S b/xen/arch/x86/xen.lds.S index 6553cff..2176782 100644 --- a/xen/arch/x86/xen.lds.S +++ b/xen/arch/x86/xen.lds.S @@ -67,6 +67,11 @@ SECTIONS *(.rodata.*) } :text + .note.gnu.build-id : { + __note_gnu_build_id_start = .; + *(.note.gnu.build-id) + } :text + . = ALIGN(SMP_CACHE_BYTES); .data.read_mostly : { /* Exception table */ diff --git a/xen/common/kernel.c b/xen/common/kernel.c index e9d41b6..9814585 100644 --- a/xen/common/kernel.c +++ b/xen/common/kernel.c @@ -6,9 +6,11 @@ #include xen/init.h #include xen/lib.h +#include xen/elf.h #include xen/errno.h #include xen/version.h #include xen/sched.h +#include xen/types.h #include xen/paging.h #include xen/nmi.h #include xen/guest_access.h @@ -227,6 +229,10 @@ void __init do_initcalls(void) * Simple hypercalls. */ +#define NT_GNU_BUILD_ID 3 + +extern char * __note_gnu_build_id_start; /* defined in linker script */ + DO(xen_version)(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg) { switch ( cmd ) @@ -360,11 +366,30 @@ DO(xen_version)(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg) case XENVER_build_id: { -xen_build_id_t build_id; +xen_build_id_t ascii_id; +Elf_Note * n = (Elf_Note *)__note_gnu_build_id_start; +char * binary_id; +int i; + +memset(ascii_id, 0, sizeof(ascii_id)); + +/* check if we really have a build-id */ +if ( NT_GNU_BUILD_ID != n-type ) +return 0; + +/* sanity check, name should be GNU for ld-generated build-id */ +if ( 0 != strncmp(ELFNOTE_NAME(n), GNU, n-namesz)) +return 0; + +binary_id = (char *)ELFNOTE_DESC(n); + +/* convert to printable format */ +for (i = 0; i n-descsz (i + 1) * 2 sizeof(xen_build_id_t); i++) +{ +snprintf(ascii_id[i * 2], 3, %02hhx, binary_id[i]); +} -memset(build_id, 0, sizeof(build_id)); -safe_strcpy(build_id, xen_build_id()); -if ( copy_to_guest(arg, build_id, ARRAY_SIZE(build_id)) ) +if ( copy_to_guest(arg, ascii_id, ARRAY_SIZE(ascii_id)) ) return -EFAULT; return 0; } diff --git a/xen/common/version.c b/xen/common/version.c index 5c3dbb0..b152e27 100644 --- a/xen/common/version.c +++ b/xen/common/version.c @@ -55,8 +55,3 @@ const char *xen_banner(void) { return XEN_BANNER; } - -const char *xen_build_id(void) -{ -return XEN_BUILD_ID; -} diff --git a/xen/include/xen/compile.h.in b/xen/include/xen/compile.h.in index 939685e..440ecb2 100644 --- a/xen/include/xen/compile.h.in +++ b/xen/include/xen/compile.h.in @@ -10,5 +10,4 @@ #define XEN_EXTRAVERSION @@extraversion@@ #define XEN_CHANGESET @@changeset@@ -#define XEN_BUILD_ID@@changeset@@ #define XEN_BANNER \ -- 2.5.0 Amazon Development Center Germany GmbH Krausenstr. 38 10117 Berlin Geschaeftsfuehrer: Dr. Ralf Herbrich, Christian Schlaeger Ust-ID: DE289237879 Eingetragen am Amtsgericht Charlottenburg HRB 149173 B ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH for-4.6] x86/mm: Make {hap, shadow}_teardown() preemptible
Hi, At 13:36 +0100 on 05 Aug (1438781760), Andrew Cooper wrote: From: Anshul Makkar anshul.mak...@citrix.com A domain with sufficient shadow allocation can cause a watchdog timeout during domain destruction. Expand the existing -ERESTART logic in paging_teardown() to allow {hap/sh}_set_allocation() to become restartable during the DOMCTL_destroydomain hypercall. Signed-off-by: Anshul Makkar anshul.mak...@citrix.com Signed-off-by: Andrew Cooper andrew.coop...@citrix.com Good catch, thanks. @@ -3139,7 +3139,9 @@ void shadow_teardown(struct domain *d) d-arch.paging.shadow.free_pages, d-arch.paging.shadow.p2m_pages); /* Destroy all the shadows and release memory to domheap */ -sh_set_allocation(d, 0, NULL); +sh_set_allocation(d, 0, preempted); +if ( preempted *preempted ) +goto out; /* Release the hash table back to xenheap */ if (d-arch.paging.shadow.hash_table) shadow_hash_teardown(d); If the debug printout just above this is ever enabled, it will get very loud since the printout will make preemption likely. Please just delete the SHADOW_PRINTK above; you can also delete the one below if you like. The HAP side looks like it needs the same adjustment. With that done, Reviewed-by: Tim Deegan t...@xen.org Cheers, Tim. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 02/20] xen: Introduce a function to split a Linux page into Xen page
Hi David, On 24/07/15 11:10, David Vrabel wrote: On 24/07/15 10:54, Julien Grall wrote: On 24/07/15 10:31, David Vrabel wrote: On 09/07/15 21:42, Julien Grall wrote: The Xen interface is always using 4KB page. This means that a Linux page may be split across multiple Xen page when the page granularity is not the same. This helper will break down a Linux page into 4KB chunk and call the helper on each of them. [...] --- a/include/xen/page.h +++ b/include/xen/page.h @@ -39,4 +39,24 @@ struct xen_memory_region xen_extra_mem[XEN_EXTRA_MEM_MAX_REGIONS]; extern unsigned long xen_released_pages; +typedef int (*xen_pfn_fn_t)(struct page *page, unsigned long pfn, void *data); + +/* Break down the page in 4KB granularity and call fn foreach xen pfn */ +static inline int xen_apply_to_page(struct page *page, xen_pfn_fn_t fn, + void *data) I think this should be outlined (unless you have measurements that support making it inlined). I don't have any performance measurements. Although, when Linux is using 4KB page granularity, the loop in this helper will be dropped by the helper. The code would look like: unsigned long pfn = xen_page_to_pfn(page); ret = fn(page, fn, data); if (ret) return ret; The compiler could even inline the callback (fn). So it drops 2 functions call. Ok, keep it inlined. Also perhaps make it int xen_for_each_gfn(struct page *page, xen_gfn_fn_t fn, void *data); gfn standing for Guest Frame Number right? Yes. This suggestion is just changing the name to make it more obvious what it does. Thinking more about this suggestion. The callback (fn) is getting a 4K PFN in parameter and not a GFN. This is because the balloon code seems to require having a 4K PFN in hand in few places. For instance XENMEM_populate_physmap and HYPERVISOR_update_va_mapping. Although, I'm not sure to understand the difference between GMFN, and GPFN in the hypercall doc. Regards, -- Julien Grall ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [linux-4.1 test] 60547: tolerable FAIL - PUSHED
flight 60547 linux-4.1 real [real] http://logs.test-lab.xenproject.org/osstest/logs/60547/ Failures :-/ but no regressions. Regressions which are regarded as allowable (not blocking): test-armhf-armhf-xl-rtds 11 guest-start fail like 59909 test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm 9 debian-hvm-install fail like 59909 test-amd64-amd64-rumpuserxen-amd64 15 rumpuserxen-demo-xenstorels/xenstorels.repeat fail like 59960 test-amd64-i386-xl-xsm 13 guest-saverestorefail like 60030 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stop fail like 60030 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop fail like 60030 test-amd64-i386-pair21 guest-migrate/src_host/dst_host fail like 60030 test-amd64-i386-xl 13 guest-saverestorefail like 60030 Tests which did not succeed, but are not blocking: test-armhf-armhf-libvirt-qcow2 9 debian-di-installfail never pass test-amd64-amd64-xl-pvh-intel 13 guest-saverestorefail never pass test-armhf-armhf-xl-raw 9 debian-di-installfail never pass test-armhf-armhf-libvirt-vhd 9 debian-di-installfail never pass test-armhf-armhf-xl-qcow2 9 debian-di-installfail never pass test-armhf-armhf-libvirt-raw 9 debian-di-installfail never pass test-amd64-i386-libvirt-xsm 12 migrate-support-checkfail never pass test-amd64-amd64-libvirt-xsm 12 migrate-support-checkfail never pass test-amd64-i386-libvirt-qcow2 11 migrate-support-checkfail never pass test-armhf-armhf-xl-arndale 12 migrate-support-checkfail never pass test-amd64-i386-libvirt 12 migrate-support-checkfail never pass test-amd64-amd64-libvirt 12 migrate-support-checkfail never pass test-amd64-amd64-libvirt-qcow2 11 migrate-support-checkfail never pass test-amd64-i386-libvirt-vhd 11 migrate-support-checkfail never pass test-amd64-amd64-libvirt-raw 11 migrate-support-checkfail never pass test-armhf-armhf-xl 12 migrate-support-checkfail never pass test-armhf-armhf-xl-xsm 12 migrate-support-checkfail never pass test-armhf-armhf-xl-cubietruck 12 migrate-support-checkfail never pass test-amd64-amd64-xl-pvh-amd 11 guest-start fail never pass test-amd64-amd64-libvirt-vhd 11 migrate-support-checkfail never pass test-armhf-armhf-xl-multivcpu 12 migrate-support-checkfail never pass test-armhf-armhf-libvirt-xsm 12 migrate-support-checkfail never pass test-amd64-i386-libvirt-raw 11 migrate-support-checkfail never pass test-armhf-armhf-xl-vhd 9 debian-di-installfail never pass test-armhf-armhf-libvirt 12 migrate-support-checkfail never pass test-armhf-armhf-xl-credit2 12 migrate-support-checkfail never pass test-amd64-i386-xl-qemut-win7-amd64 16 guest-stop fail never pass test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop fail never pass version targeted for testing: linux89e419960fb6a260f6a112821507d516117d5aa1 baseline version: linuxc8bde72f9af412de57f0ceae218d648640118b0b Last test of basis60030 2015-07-27 16:52:24 Z8 days Testing same since60547 2015-08-03 16:41:21 Z1 days1 attempts People who touched revisions under test: Aaro Koskinen aaro.koski...@nokia.com Aaron Brown aaron.f.br...@intel.com Adriana Reus adriana.r...@intel.com Al Viro v...@zeniv.linux.org.uk Alan Stern st...@rowland.harvard.edu Aleksei Mamlin mamli...@gmail.com Aleksei Volkov i...@dv2c.ru Alex Deucher alexander.deuc...@amd.com Alexander Aring alex.ar...@gmail.com Alexander Duyck alexander.h.du...@redhat.com Alexander Sverdlin alexander.sverd...@nokia.com Alexandre Belloni alexandre.bell...@free-electrons.com Alexandre Courbot acour...@nvidia.com Alexei Starovoitov a...@plumgrid.com Alexey Khoroshilov khoroshi...@ispras.ru Aman Deep aman.d...@samsung.com Andrew Morton a...@linux-foundation.org Andrey Ryabinin a.ryabi...@samsung.com Archit Taneja arch...@codeaurora.org Ard Biesheuvel ard.biesheu...@linaro.org Arnaldo Carvalho de Melo a...@redhat.com Arnd Bergmann a...@arndb.de Arne Fitzenreiter arn...@ipfire.org Axel Lin axel@ingics.com Bo Shen voice.s...@atmel.com Bob Moore robert.mo...@intel.com Boris Brezillon boris.brezil...@free-electrons.com Borislav Petkov b...@suse.de Brian Foster bfos...@redhat.com Bryan Wu coolo...@gmail.com Catalin Marinas catalin.mari...@arm.com Charles Keepax ckee...@opensource.wolfsonmicro.com Chris Mason c...@fb.com Chris Metcalf cmetc...@ezchip.com Chris Wilson ch...@chris-wilson.co.uk Christian König christian.koe...@amd.com Christoffer Dall
Re: [Xen-devel] virtio on pv/pvh xen
On Wed, Aug 05, 2015 at 11:09:41PM +0800, Lai Jiangshan wrote: Hi, Liu Does pv or pvh guest support virtio devices? No. If yes, how can I configure the guest? If not, how can I make it support? A new transport which makes use of xenbus and grant table needs to be developed. There were talks on standardising virtio and adding Xen PV transport support long time ago but no visible progress was made. Wei. Thanks Lai ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v3 07/32] xen/x86: fix arch_set_info_guest for HVM guests
On 05/08/15 10:53, Roger Pau Monné wrote: El 04/08/15 a les 20.08, Andrew Cooper ha escrit: On 03/08/15 18:31, Roger Pau Monné wrote: Therefore, I am leaning slightly towards the specify the internals side of things, which removes some complexity from the Xen side of the hypercall. I've updated the proposed structure, so now it looks like: (I still need to figure out how to order the bits in the access rights (_ar) fields.) All struct's need a flags register. I was unsure about adding the {e/r}flags register, I will add it in new versions. For first-gen VT-x, CR0.PE in unable to be cleared. HVMLoader has to do a dance with vm86 mode, which is where HVM_PARAM_IDENT_PT gets used. I don't realistically expect this to be a problem for DMlite guests. struct vcpu_hvm_x86_16 { uint16_t ax; uint16_t cx; uint16_t dx; uint16_t bx; uint16_t sp; uint16_t bp; uint16_t si; uint16_t di; uint16_t ip; uint32_t cr[8]; Definitely no need for anything other than cr0 and 4 in 16 bit mode. Yes. Would you rather prefer to spell out the exact control registers that are going to be used instead of using an array? I would suggest so. CRs 1,5-7 are unconditionally #UD to attempt to use. CR2 is useless until you enter the #PF handler. CR8 may (hardware dependent) have the TPR in it, which is useless until the IDT is set up and interrupts have been enabled. For 16bit mode this is going to be CR0/CR4, for 32bit or long mode mode CR0/CR3/CR4. Agreed. uint32_t cs_base; uint32_t ds_base; uint32_t ss_base; uint32_t cs_limit; uint32_t ds_limit; uint32_t ss_limit; uint16_t cs_ar; uint16_t ds_ar; uint16_t ss_ar; }; struct vcpu_hvm_x86_32 { uint32_t eax; uint32_t ecx; uint32_t edx; uint32_t ebx; uint32_t esp; uint32_t ebp; uint32_t esi; uint32_t edi; uint32_t eip; uint32_t cr[8]; Don't need cr's 5-8. uint64_t efer; uint32_t cs_base; uint32_t ds_base; uint32_t ss_base; uint32_t cs_limit; uint32_t ds_limit; uint32_t ss_limit; uint16_t cs_ar; uint16_t ds_ar; uint16_t ss_ar; You need selector entries for each segment as well. Really? What's the point in having the selector if we don't have a GDT, and we allow loading the cached part, which is the relevant one. push %cs; push 1f; lret At all points when segment details are updated, it is the responsibility of software to ensure that the details match with the GDT/LDT entry. See for example the Intel and AMD manuals for syscall/sysenter where similar updating segment details behind the scenes occurs. }; struct vcpu_hvm_x86_64 { uint64_t rax; uint64_t rcx; uint64_t rdx; uint64_t rbx; uint64_t rsp; uint64_t rbp; uint64_t rsi; uint64_t rdi; uint64_t r8; uint64_t r9; uint64_t r10; uint64_t r11; uint64_t r12; uint64_t r13; uint64_t r14; uint64_t r15; uint64_t rip; uint64_t cr[8]; uint64_t efer; What has happened to the segment information? It cannot be omitted completely, even in 64bit. I had in mind to set them to the values required for long mode: base = 0 limit = 0x and the attributes field for CS to: ar = 0x29b (L bit set) And for SS/DS: ar = 0x093 But maybe it makes sense to allow setting them in case someone wants to start in compatibility mode. Agreed. ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH for-4.6 v2] x86/mm: Make {hap, shadow}_teardown() preemptible
From: Anshul Makkar anshul.mak...@citrix.com A domain with sufficient shadow allocation can cause a watchdog timeout during domain destruction. Expand the existing -ERESTART logic in paging_teardown() to allow {hap/sh}_set_allocation() to become restartable during the DOMCTL_destroydomain hypercall. Signed-off-by: Anshul Makkar anshul.mak...@citrix.com Signed-off-by: Andrew Cooper andrew.coop...@citrix.com Reviewed-by: Tim Deegan t...@xen.org --- CC: George Dunlap george.dun...@eu.citrix.com CC: Jan Beulich jbeul...@suse.com CC: Wei Liu wei.l...@citrix.com v2: * Drop {HAP,SHADOW}_PRINTK(). They are not very useful, and will get very noisy if enabled. Wei: Currently, Xen can't actually run 1TB guests without suffering a watchdog timeout when destroying the domain. Therefore this is fairly important for 4.6 (and backport), or we will have to retroactively drop the currently-stated supported limits on the wiki. The patch has had extensive testing in XenServer, although has been forward ported from 4.5. --- xen/arch/x86/mm/hap/hap.c | 22 -- xen/arch/x86/mm/paging.c|9 ++--- xen/arch/x86/mm/shadow/common.c | 24 +--- xen/include/asm-x86/hap.h |2 +- xen/include/asm-x86/shadow.h|4 ++-- 5 files changed, 26 insertions(+), 35 deletions(-) diff --git a/xen/arch/x86/mm/hap/hap.c b/xen/arch/x86/mm/hap/hap.c index d375c4d..e9c0080 100644 --- a/xen/arch/x86/mm/hap/hap.c +++ b/xen/arch/x86/mm/hap/hap.c @@ -551,7 +551,7 @@ void hap_final_teardown(struct domain *d) } if ( d-arch.paging.hap.total_pages != 0 ) -hap_teardown(d); +hap_teardown(d, NULL); p2m_teardown(p2m_get_hostp2m(d)); /* Free any memory that the p2m teardown released */ @@ -561,7 +561,7 @@ void hap_final_teardown(struct domain *d) paging_unlock(d); } -void hap_teardown(struct domain *d) +void hap_teardown(struct domain *d, int *preempted) { struct vcpu *v; mfn_t mfn; @@ -589,18 +589,11 @@ void hap_teardown(struct domain *d) if ( d-arch.paging.hap.total_pages != 0 ) { -HAP_PRINTK(teardown of domain %u starts. -pages total = %u, free = %u, p2m=%u\n, - d-domain_id, - d-arch.paging.hap.total_pages, - d-arch.paging.hap.free_pages, - d-arch.paging.hap.p2m_pages); -hap_set_allocation(d, 0, NULL); -HAP_PRINTK(teardown done. -pages total = %u, free = %u, p2m=%u\n, - d-arch.paging.hap.total_pages, - d-arch.paging.hap.free_pages, - d-arch.paging.hap.p2m_pages); +hap_set_allocation(d, 0, preempted); + +if ( preempted *preempted ) +goto out; + ASSERT(d-arch.paging.hap.total_pages == 0); } @@ -609,6 +602,7 @@ void hap_teardown(struct domain *d) xfree(d-arch.hvm_domain.dirty_vram); d-arch.hvm_domain.dirty_vram = NULL; +out: paging_unlock(d); } diff --git a/xen/arch/x86/mm/paging.c b/xen/arch/x86/mm/paging.c index 618f475..5becee8 100644 --- a/xen/arch/x86/mm/paging.c +++ b/xen/arch/x86/mm/paging.c @@ -799,12 +799,15 @@ long paging_domctl_continuation(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl) /* Call when destroying a domain */ int paging_teardown(struct domain *d) { -int rc; +int rc, preempted = 0; if ( hap_enabled(d) ) -hap_teardown(d); +hap_teardown(d, preempted); else -shadow_teardown(d); +shadow_teardown(d, preempted); + +if ( preempted ) +return -ERESTART; /* clean up log dirty resources. */ rc = paging_free_log_dirty_bitmap(d, 0); diff --git a/xen/arch/x86/mm/shadow/common.c b/xen/arch/x86/mm/shadow/common.c index abce8e2..0264b91 100644 --- a/xen/arch/x86/mm/shadow/common.c +++ b/xen/arch/x86/mm/shadow/common.c @@ -3071,7 +3071,7 @@ int shadow_enable(struct domain *d, u32 mode) return rv; } -void shadow_teardown(struct domain *d) +void shadow_teardown(struct domain *d, int *preempted) /* Destroy the shadow pagetables of this domain and free its shadow memory. * Should only be called for dying domains. */ { @@ -3132,23 +3132,16 @@ void shadow_teardown(struct domain *d) if ( d-arch.paging.shadow.total_pages != 0 ) { -SHADOW_PRINTK(teardown of domain %u starts. - Shadow pages total = %u, free = %u, p2m=%u\n, - d-domain_id, - d-arch.paging.shadow.total_pages, - d-arch.paging.shadow.free_pages, - d-arch.paging.shadow.p2m_pages); /* Destroy all the shadows and release memory to domheap */ -sh_set_allocation(d, 0, NULL); +sh_set_allocation(d, 0, preempted); + +if ( preempted *preempted ) +goto out; + /* Release the hash table back to xenheap */
Re: [Xen-devel] [PATCH v2 02/20] xen: Introduce a function to split a Linux page into Xen page
On 05/08/15 15:30, Julien Grall wrote: Hi David, On 24/07/15 11:10, David Vrabel wrote: On 24/07/15 10:54, Julien Grall wrote: On 24/07/15 10:31, David Vrabel wrote: On 09/07/15 21:42, Julien Grall wrote: The Xen interface is always using 4KB page. This means that a Linux page may be split across multiple Xen page when the page granularity is not the same. This helper will break down a Linux page into 4KB chunk and call the helper on each of them. [...] --- a/include/xen/page.h +++ b/include/xen/page.h @@ -39,4 +39,24 @@ struct xen_memory_region xen_extra_mem[XEN_EXTRA_MEM_MAX_REGIONS]; extern unsigned long xen_released_pages; +typedef int (*xen_pfn_fn_t)(struct page *page, unsigned long pfn, void *data); + +/* Break down the page in 4KB granularity and call fn foreach xen pfn */ +static inline int xen_apply_to_page(struct page *page, xen_pfn_fn_t fn, + void *data) I think this should be outlined (unless you have measurements that support making it inlined). I don't have any performance measurements. Although, when Linux is using 4KB page granularity, the loop in this helper will be dropped by the helper. The code would look like: unsigned long pfn = xen_page_to_pfn(page); ret = fn(page, fn, data); if (ret) return ret; The compiler could even inline the callback (fn). So it drops 2 functions call. Ok, keep it inlined. Also perhaps make it int xen_for_each_gfn(struct page *page, xen_gfn_fn_t fn, void *data); gfn standing for Guest Frame Number right? Yes. This suggestion is just changing the name to make it more obvious what it does. Thinking more about this suggestion. The callback (fn) is getting a 4K PFN in parameter and not a GFN. I would like only APIs that deal with 64 KiB PFNs and 4 KiB GFNs. I think having a 4 KiB PFN is confusing. Can you rework this xen_for_each_gfn() to pass GFNs to fn, instead? This is because the balloon code seems to require having a 4K PFN in hand in few places. For instance XENMEM_populate_physmap and HYPERVISOR_update_va_mapping. Ug. For an auto-xlate guest frame-list needs GFNs, for a PV guest XENMEM_populate_physmap does want PFNs (so it can fill in the M2P). Perhaps in increase_reservation: if (auto-xlate) frame_list[i] = page_to_gfn(page); /* Or whatever per-GFN loop you need. */ else frame_list[i] = page_to_pfn(page); update_va_mapping takes VAs (e.g, __va(pfn PAGE_SHIFT) could be page_to_virt(page). Sorry for being so picky here, but the inconsistency of terminology and API misuse is already confusing and I don't want to see it get worse. David Although, I'm not sure to understand the difference between GMFN, and GPFN in the hypercall doc. Regards, ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel