Re: [Xen-devel] [PATCH v2 5/6] tools/libxl: move save/restore code into libxl_dom_save.c
On 06/16/2015 09:09 PM, Ian Campbell wrote: On Wed, 2015-06-03 at 16:01 +0800, Yang Hongyang wrote: move save/restore code into libxl_dom_save.c. If this (unlike other patches in the series) is purely code motion please indicate that this is the case. You might also like to consider refactoring things such that all patches are pure motion. Yes, this is only code move, no refactoring or other thing. . -- Thanks, Yang. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 01/27] tools/libxl: Fix libxl__ev_child_inuse() check for not-yet-initialised children
Andrew Cooper writes (Re: [PATCH 01/27] tools/libxl: Fix libxl__ev_child_inuse() check for not-yet-initialised children): It is possible that one bit fails before it can be calculated whether the second bit needs to start or not. At the moment, all bits in libxl in this area do initialisation immediately before use; most bits are even initialised in the function which starts their actions. Some bits are initialised differently depending on the path taken to get to the initialisation site. As a rule of thumb a function libxl__initiate_foo_ which takes a libxl__foo_state* should do this initialisation for the whole libxl__foo_state. I don't see why you can't do that. Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 15/27] tools/libxl: Migration v2 stream format
On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote: From: Ross Lagerwall ross.lagerw...@citrix.com C structures describing the Libxl migration v2 stream format Do we think these should be internal or are they part of the library API? I suppose it's a bit of a grey area, obviously the file format is ABI, but its not one a user should ever interact with directly. What I'm getting at is may most of this should be in the libxl__ rather than libxl_ namespace (also some of it being un-namespaced would further suggest these are strictly speaking install, as would the fact it isn't installed...) Signed-off-by: Ross Lagerwall ross.lagerw...@citrix.com Signed-off-by: Andrew Cooper andrew.coop...@citrix.com CC: Ian Campbell ian.campb...@citrix.com CC: Ian Jackson ian.jack...@eu.citrix.com CC: Wei Liu wei.l...@citrix.com --- tools/libxl/libxl_sr_stream_format.h | 57 ++ 1 file changed, 57 insertions(+) create mode 100644 tools/libxl/libxl_sr_stream_format.h diff --git a/tools/libxl/libxl_sr_stream_format.h b/tools/libxl/libxl_sr_stream_format.h new file mode 100644 index 000..487f9e2 --- /dev/null +++ b/tools/libxl/libxl_sr_stream_format.h @@ -0,0 +1,57 @@ +#ifndef LIBXL_SR_STREAM_FORMAT_H +#define LIBXL_SR_STREAM_FORMAT_H + +/* + * C structures for the Migration v2 stream format. + * See docs/specs/libxl-migration-stream.pandoc + */ + +#include stdint.h + +typedef struct libxl_sr_hdr +{ +uint64_t ident; +uint32_t version; +uint32_t options; +} libxl_sr_hdr; + +#define RESTORE_STREAM_IDENT 0x4c6962786c466d74UL +#define RESTORE_STREAM_VERSION 0x0002U + +#define RESTORE_OPT_BIG_ENDIAN (1 0) +#define RESTORE_OPT_LEGACY (1 1) + + +typedef struct libxl_sr_rec_hdr +{ +uint32_t type; +uint32_t length; +} libxl_sr_rec_hdr; + +/* All records must be aligned up to an 8 octet boundary */ +#define REC_ALIGN_ORDER 3U + +#define REC_TYPE_END 0xU +#define REC_TYPE_LIBXC_CONTEXT 0x0001U +#define REC_TYPE_XENSTORE_DATA 0x0002U +#define REC_TYPE_EMULATOR_CONTEXT0x0003U + +typedef struct libxl_sr_emulator_hdr +{ +uint32_t id; +uint32_t index; +} libxl_sr_emulator_hdr; + +#define EMULATOR_UNKNOWN 0xU +#define EMULATOR_QEMU_TRADITIONAL0x0001U +#define EMULATOR_QEMU_UPSTREAM 0x0002U + +#endif /* LIBXL_SR_STREAM_FORMAT_H */ + +/* + * Local variables: + * mode: C + * c-basic-offset: 4 + * indent-tabs-mode: nil + * End: + */ ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 2/6] xen/MSI-X: drive maskall and enable bits through hypercalls
On 16.06.15 at 16:03, stefano.stabell...@eu.citrix.com wrote: On Fri, 5 Jun 2015, Jan Beulich wrote: +} else if (s-msix-enabled) { +if (!(value PCI_MSIX_FLAGS_ENABLE)) { +xen_pt_msix_disable(s); +s-msix-enabled = false; +} else if (!s-msix-maskall) { Why are you changing the state of s-msix-maskall here? This is the value PCI_MSIX_FLAGS_ENABLE case, nothing to do with maskall, right? We're at an else if inside an else if here. The only case left after the if() still seen above is that value has PCI_MSIX_FLAGS_MASKALL set. +s-msix-maskall = true; +xen_pt_msix_maskall(s, true); +} } -debug_msix_enabled_old = s-msix-enabled; -s-msix-enabled = !!(*val PCI_MSIX_FLAGS_ENABLE); if (s-msix-enabled != debug_msix_enabled_old) { XEN_PT_LOG(s-dev, %s MSI-X\n, s-msix-enabled ? enable : disable); } +xen_host_pci_get_word(s-real_device, s-msix-ctrl_offset, dev_value); I have to say that I don't like the asymmetry between reading and writing PCI config registers. If writes go via hypercalls, reads should go via hypercalls too. We're not doing any cfg register write via hypercalls (not here, and not elsewhere). What is being replaced by the patch are write to two bits which happen to live in PCI config space. Plus, reading directly, and doing writes via hypercall only when really needed would still be the right thing from a performance pov. --- a/qemu/upstream/hw/xen/xen_pt_msi.c +++ b/qemu/upstream/hw/xen/xen_pt_msi.c @@ -301,8 +301,11 @@ static int msix_set_enable(XenPCIPassthr return -1; } -return msi_msix_enable(s, s-msix-ctrl_offset, PCI_MSIX_FLAGS_ENABLE, - enabled); Would it make sense to remove msi_msix_enable completely to avoid any further mistakes? Perhaps, yes. I think I actually had suggested so quite a while back. But I don't see myself wasting much more time on this, ehm, code. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCHv2 5/5] evtchn: pad struct evtchn to 64 bytes
On 16/06/15 14:13, David Vrabel wrote: On 16/06/15 13:57, Julien Grall wrote: On 16/06/15 12:59, Jan Beulich wrote: On 16.06.15 at 13:14, julien.gr...@citrix.com wrote: On 15/06/2015 16:48, David Vrabel wrote: diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h index 44ea92d..a0ff9d2 100644 --- a/xen/include/xen/sched.h +++ b/xen/include/xen/sched.h @@ -129,7 +129,7 @@ struct evtchn #endif } ssid; #endif -}; +} __attribute__((aligned(64))); Why don't you use __cacheline_aligned? That would double the size on x86, for little or no benefit. Well, the cache line size is not necessarily 64 bytes on every architecture. In the case of ARM, the cache line depends on the processor version. __cacheline_aligned is the only way to ensure that the cache line is not shared on ARM. AFAIU, the goal of this patch is to avoid sharing the cache line. If not, the commit message is misleading because it claims that a cache line is always 64 bytes... We want to avoid sharing the cache line where we can do so for no additional memory cost. I would expand the commit message to make clear that we may not share the cache line. The 64 bytes (cache line) is confusing and without any background it can be interpreted wrongly. Regards, -- Julien Grall ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 01/27] tools/libxl: Fix libxl__ev_child_inuse() check for not-yet-initialised children
Ian Campbell writes (Re: [PATCH 01/27] tools/libxl: Fix libxl__ev_child_inuse() check for not-yet-initialised children): On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote: Shortly, libxl will be juggling multiple parallel operations, and will possibly have to take error decisions before some tasks have been set up. It would be preferable, I think, to arrange to call libxl__ev_child_init on all such libxl__ev_child structs either up front or certainly before there is any possibility of needing to unwind them. Yes. Such an init would at worst correspond to exactly the place where the zeroed structure you refer to is zeroed. I would welcome a patch which caused an assertion failure if -pid==0. Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 10/27] docs: Libxl migration v2 stream specification
On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote: Signed-off-by: Andrew Cooper andrew.coop...@citrix.com +EMULATOR\_CONTEXT + + +A context blob for a specific emulator associated with the domain. + + 0 1 2 3 4 5 6 7 octet ++++ +| emulator_id| index | ++++ +| emulator_ctx| +... ++-+ + + +FieldDescription + --- +emulator_id 0x: Unknown (In the case of a legacy stream) + + 0x0001: Qemu Traditional + + 0x0002: Qemu Upstream + + 0x0003 - 0x: Reserved for future emulators. Would it be useful for future proofing to carve out some space for a per-emulator version field too? Otherwise LGTM. One thought, it might be useful (here or elsewhere) to have an explicit overview of the expected control flow (as in the ownership of the fd, and/or nesting of the layers as you prefer to think about it) between libxc, libxl and the next layer (i.e. xl). Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] Proposal: Feature Maturity Lifecycle
On Mon, Jun 15, 2015 at 05:34:08PM +0100, Lars Kurth wrote: On 15 Jun 2015, at 15:51, George Dunlap dunl...@umich.edu wrote: A more practical requirement for a feature with Supported status is that _either_: * The feature is tested automatically. * At least once during each release freeze, the feature's maintainers produce a test report (by a deadline specified by the release manager). Features with no test report get downgraded from Supported to some lower status. This seems a reasonable compromise. Rather than forcing a maintainer to engage with osstest as a condition of considering the feature flagged Supported, we gently nudge them to automate it to save themselves the hassle of testing it every release. :-) Alright. I will make a note of this and amend the next version. I was wondering whether any maintainers who will be impacted by this and the Release Manager (current and past) have a view The end result is probably we spam all maintainers for their components, regardlessly whether that components are tested in OSSTest or not. Triaging whether a component is tested by OSSTest can only be done by hand and is tedious and time consuming. The extra work load may be moderate if we only do it once per release and have the corresponding tool to generate emails automatically. Wei. Also, please let me know of any additional feedback on the main proposal, such that I can make a note of people's views and make modifications. Best Regards Lars ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v24 12/15] x86/VPMU: Handle PMU interrupts for PV(H) guests
On 06/16/2015 03:45 AM, Jan Beulich wrote: On 15.06.15 at 19:17, boris.ostrov...@oracle.com wrote: +for ( i = 0; i num_counters; i++ ) +{ +if ( (ctrl_regs[i] CTRL_RSVD_MASK) != ctrl_rsvd[i] ) +{ +/* + * Not necessary to re-init context since we should never load + * it until guest provides valid values. But just to be safe. + */ +amd_vpmu_init_regs(ctxt); +return -EINVAL; +} + +if ( is_pmu_enabled(ctrl_regs[i]) ) +num_enabled++; +} + +if ( num_enabled ) Looks like a boolean flag would do - the exact count doesn't seem to be of interest here or in later patches? So the reason why I use a counter here is because keeping track of VPMU_RUNNING state is currently broken on AMD, I noticed it when I was updating this patch. amd_vpmu_do_wrmsr() will reset VPMU_RUNNING if the MSR write is disabling current counter, even though there may still be other counters running. This may be related to HVM brokenness of AMD counters that I mentioned a while ago where the guest, when running with multiple counters, sometimes gets unexpected NMIs. (This goes back all the way to to 4.1.) I don't want to fix this in this series but I will likely need to count number of active counters when I do (just like I do for Intel). I can use a boolean now though since I am not dealing with this problem here. If another rev is needed, I'd prefer if you did so. But if we can have this version go in (provided we get all the necessary acks), I wouldn't insist on you doing another round just because of this. I think there are a couple of (fairly cosmetic) changes that need to be done so there will be another rev. OTOH I just tried a quick-and-dirty fix for this problem and it doesn't resolve it so presumably there is more to this. It's not particularly invasive but I think it would be rather pointless to put it in as it still doesn't allow multiple counters on AMD and I suspect the final fix will be touching the same code again. -boris ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 2/6] tools/libxl: move domain suspend code into libxl_dom_suspend.c
On Wed, 2015-06-03 at 16:01 +0800, Yang Hongyang wrote: Move domain suspend code into a separate file libxl_dom_suspend.c. export an API libxl__domain_suspend() which wrappers the static just ..which wraps the... function domain_suspend_callback_common() for internal use. Note that the newly added file libxl_dom_suspend.c is used for suspend/resume code. Signed-off-by: Yang Hongyang yan...@cn.fujitsu.com CC: Ian Campbell ian.campb...@citrix.com CC: Ian Jackson ian.jack...@eu.citrix.com CC: Wei Liu wei.l...@citrix.com CC: Andrew Cooper andrew.coop...@citrix.com Acked-by: Ian Campbell ian.campb...@citrix.com --- tools/libxl/Makefile| 3 +- tools/libxl/libxl_dom.c | 350 +--- tools/libxl/libxl_dom_suspend.c | 381 tools/libxl/libxl_internal.h| 6 + 4 files changed, 393 insertions(+), 347 deletions(-) create mode 100644 tools/libxl/libxl_dom_suspend.c diff --git a/tools/libxl/Makefile b/tools/libxl/Makefile index cc9c152..3f98d62 100644 --- a/tools/libxl/Makefile +++ b/tools/libxl/Makefile @@ -95,7 +95,8 @@ LIBXL_OBJS = flexarray.o libxl.o libxl_create.o libxl_dm.o libxl_pci.o \ libxl_internal.o libxl_utils.o libxl_uuid.o \ libxl_json.o libxl_aoutils.o libxl_numa.o libxl_vnuma.o \ libxl_save_callout.o _libxl_save_msgs_callout.o \ - libxl_qmp.o libxl_event.o libxl_fork.o $(LIBXL_OBJS-y) + libxl_qmp.o libxl_event.o libxl_fork.o libxl_dom_suspend.o \ + $(LIBXL_OBJS-y) LIBXL_OBJS += libxl_genid.o LIBXL_OBJS += _libxl_types.o libxl_flask.o _libxl_types_internal.o diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c index cce04dd..9444329 100644 --- a/tools/libxl/libxl_dom.c +++ b/tools/libxl/libxl_dom.c @@ -1103,11 +1103,6 @@ int libxl__toolstack_restore(uint32_t domid, const uint8_t *buf, /* Domain suspend (save) */ -static void domain_save_done(libxl__egc *egc, - libxl__domain_suspend_state *dss, int rc); -static void domain_suspend_callback_common_done(libxl__egc *egc, -libxl__domain_suspend_state *dss, int ok); - /*- complicated callback, called by xc_domain_save -*/ /* @@ -1324,35 +1319,6 @@ static void switch_logdirty_done(libxl__egc *egc, /*- callbacks, called by xc_domain_save -*/ -int libxl__domain_suspend_device_model(libxl__gc *gc, - libxl__domain_suspend_state *dss) -{ -int ret = 0; -uint32_t const domid = dss-domid; -const char *const filename = dss-dm_savefile; - -switch (libxl__device_model_version_running(gc, domid)) { -case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL: { -LOG(DEBUG, Saving device model state to %s, filename); -libxl__qemu_traditional_cmd(gc, domid, save); -libxl__wait_for_device_model_deprecated(gc, domid, paused, NULL, NULL, NULL); -break; -} -case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN: -if (libxl__qmp_stop(gc, domid)) -return ERROR_FAIL; -/* Save DM state into filename */ -ret = libxl__qmp_save(gc, domid, filename); -if (ret) -unlink(filename); -break; -default: -return ERROR_INVAL; -} - -return ret; -} - int libxl__domain_resume_device_model(libxl__gc *gc, uint32_t domid) { @@ -1373,301 +1339,6 @@ int libxl__domain_resume_device_model(libxl__gc *gc, uint32_t domid) return 0; } -static void domain_suspend_common_wait_guest(libxl__egc *egc, - libxl__domain_suspend_state *dss); -static void domain_suspend_common_guest_suspended(libxl__egc *egc, - libxl__domain_suspend_state *dss); - -static void domain_suspend_common_pvcontrol_suspending(libxl__egc *egc, - libxl__xswait_state *xswa, int rc, const char *state); -static void domain_suspend_common_wait_guest_evtchn(libxl__egc *egc, -libxl__ev_evtchn *evev); -static void suspend_common_wait_guest_watch(libxl__egc *egc, - libxl__ev_xswatch *xsw, const char *watch_path, const char *event_path); -static void suspend_common_wait_guest_check(libxl__egc *egc, -libxl__domain_suspend_state *dss); -static void suspend_common_wait_guest_timeout(libxl__egc *egc, - libxl__ev_time *ev, const struct timeval *requested_abs); - -static void domain_suspend_common_failed(libxl__egc *egc, - libxl__domain_suspend_state *dss); -static void domain_suspend_common_done(libxl__egc *egc, - libxl__domain_suspend_state *dss, - bool ok); -
Re: [Xen-devel] [PATCH V4 3/7] libxl: add pvusb API
On 06/16/2015 01:45 PM, Ian Jackson wrote: Juergen Gross writes (Re: [Xen-devel] [PATCH V4 3/7] libxl: add pvusb API): On 06/16/2015 01:10 PM, Ian Jackson wrote: AIUI some devices have serial numbers, which means you can distinguish them ? Yes, they have. The question is whether those are different on multiple instances? With lsusb I've found a device with serial number 0123456789ABCD. Do you really believe I'm just lucky owning the one with such a nice serial number? ;-) Heh. I think, though, that this suggests that if the user has devices whose serial numbers are actually different, they should be able to specify a device by vid/pid/serial. I think you are right. The next question: should this be part of libxl or qemu? It should be possible to extend the qemu vendorId:productId syntax to vendorId:productId:serial without too much effort, as libusb includes some support to obtain the serial number. Juergen ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH V4 3/7] libxl: add pvusb API
On 06/16/2015 02:06 PM, Juergen Gross wrote: On 06/16/2015 01:45 PM, Ian Jackson wrote: Juergen Gross writes (Re: [Xen-devel] [PATCH V4 3/7] libxl: add pvusb API): On 06/16/2015 01:10 PM, Ian Jackson wrote: AIUI some devices have serial numbers, which means you can distinguish them ? Yes, they have. The question is whether those are different on multiple instances? With lsusb I've found a device with serial number 0123456789ABCD. Do you really believe I'm just lucky owning the one with such a nice serial number? ;-) Heh. I think, though, that this suggests that if the user has devices whose serial numbers are actually different, they should be able to specify a device by vid/pid/serial. I think you are right. The next question: should this be part of libxl or qemu? It should be possible to extend the qemu vendorId:productId syntax to vendorId:productId:serial without too much effort, as libusb includes some support to obtain the serial number. It doesn't really matter what qemu or pvusb can do, as long as libxl can convert what it has into something that they can use. So if libxl is given vid, pid, and serial (remember, this will be a struct, not a string), it can look in sysfs and find the bus:addr (which is unique at any given time) and hand it to qemu (or the bus-port in the case of pvusb). -George ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [RFC] Xen PV IOMMU interface draft B
On 12.06.15 at 18:43, malcolm.cross...@citrix.com wrote: IOMMUOP_query_caps -- This subop queries the runtime capabilities of the PV-IOMMU interface for the specific called domain. This subop uses `struct pv_iommu_op` directly. calling domain perhaps? -- Field Purpose - --- `flags`[out] This field details the IOMMUOP capabilities. `status` [out] Status of this op, op specific values listed below -- Defined bits for flags field: -- NameBitDefinition -- -- IOMMU_QUERY_map_cap 0IOMMUOP_map_page or IOMMUOP_map_foreign can be used for this domain this (see also above) perhaps being the calling domain? In which case I wonder how the for and IOMMUOP_map_foreign are meant to fit together: I assume the flag to indicate that mapping into the (calling) domain is possible. Which then makes me wonder - what use if the new hypercall when this flag isn't set? IOMMU_QUERY_map_all_gfns 1IOMMUOP_map_page subop can map any MFN not used by Xen gfns or MFN? Defined values for map_page subop status field: Value Reason -- -- 0 subop successfully returned -EIOIOMMU unit returned error when attempting to map BFN to GFN. -EPERM GFN could not be mapped because the GFN belongs to Xen. -EPERM Domain is not a domain and GFN does not belong to domain is not a hardware domain? Also, I think we're pretty determined for there to ever only be one, so perhaps it should be the hardware domain here and elsewhere. IOMMUOP_unmap_page -- This subop uses `struct unmap_page` part of the `struct pv_iommu_op`. The subop usage of the struct pv_iommu_op and ``struct unmap_page` fields are detailed below: Field Purpose - - `bfn` [in] Bus address frame number to be unmapped in DOMID_SELF Has it been determined that unmapping based on GFN is never going to be needed, and that unmapping by BFN is the more practical solution? The map counterpart doesn't seem to exclude establishing multiple mappings for the same BFN, and hence the inverse here would become kind of fuzzy in that case. IOMMUOP_map_foreign_page This subop uses `struct map_foreign_page` part of the `struct pv_iommu_op`. It is not valid to use domid representing the calling domain. And what's the point of that? Was it considered to have only one map/unmap pair, capable of mapping both local and foreign pages? If so, what speaks against that? The M2B mechanism is a MFN to (BFN,domid,ioserver) tuple. I didn't see anything explaining the significance of this (namely the ioserver part, I think I can see the need for the domid), can you explain the backgound here please? Every new M2B entry will take a reference to the MFN backing the GFN. What happens when that GFN-MFN mapping changes? All the following conditions are required to be true for PV IOMMU map_foreign subop to succeed: 1. IOMMU detected and supported by Xen 2. The domain has IOMMU controlled hardware allocated to it 3. The domain is a hardware_domain and the following Xen IOMMU options are NOT enabled: dom0-passthrough Is there a way for the hardware domain to know it is running in pass-through mode? Also, the domain is ambiguous here; I'm sure you mean the invoking domain, not the one owning the page. This subop usage of the struct pv_iommu_op and ``struct map_foreign_page` fields are detailed below: Field Purpose - - `domid`[in] The domain ID for which the gfn field applies `ioserver` [in] IOREQ server id associated with mapping `bfn` [in] Bus address frame number for gfn address In the description above you speak of returning data in this field. Is [in] really correct? Defined bits for flags field: Name BitDefinition - -- IOMMUOP_readable 0BFN IOMMU mapping is readable IOMMUOP_writeable 1BFN IOMMU mapping is writeable IOMMUOP_swap_mfn 2BFN IOMMU mapping can be safely
Re: [Xen-devel] [PATCH 1/6] xen/MSI-X: latch MSI-X table writes
On Fri, 5 Jun 2015, Jan Beulich wrote: The remaining log message in pci_msix_write() is wrong, as there guest behavior may only appear to be wrong: For one, the old logic didn't take the mask-all bit into account. And then this shouldn't depend on host device state (i.e. the host may have masked the entry without the guest having done so). Plus these writes shouldn't be dropped even when an entry gets unmasked. Instead, if they can't be made take effect right away, they should take effect on the next unmasking or enabling operation - the specification explicitly describes such caching behavior. Signed-off-by: Jan Beulich jbeul...@suse.com @@ -322,6 +323,13 @@ static int xen_pt_msix_update_one(XenPCI pirq = entry-pirq; +if (pirq == XEN_PT_UNASSIGNED_PIRQ || s-msix-maskall || +(entry-latch(VECTOR_CTRL) PCI_MSIX_ENTRY_CTRL_MASKBIT)) { I admit I am having difficulties understanding the full purpose of these checks. Please add a comment on them. I guess the intention is only to make changes using the latest values, the ones in entry-latch, when the right conditions are met, otherwise keep using the old values. Is that right? In that case, don't we want to use the latest values on MASKBIT - !MASKBIT transitions? In general when unmasking? Here it looks like we are using the latest values when maskall is set or PCI_MSIX_ENTRY_CTRL_MASKBIT is set, that is the opposite of what we want. +entry-addr = entry-latch(LOWER_ADDR) | + ((uint64_t)entry-latch(UPPER_ADDR) 32); +entry-data = entry-latch(DATA); +} + rc = msi_msix_setup(s, entry-addr, entry-data, pirq, true, entry_nr, entry-pirq == XEN_PT_UNASSIGNED_PIRQ); if (rc) { @@ -444,39 +432,28 @@ static void pci_msix_write(void *opaque, offset = addr % PCI_MSIX_ENTRY_SIZE; if (offset != PCI_MSIX_ENTRY_VECTOR_CTRL) { -const volatile uint32_t *vec_ctrl; - if (get_entry_value(entry, offset) == val entry-pirq != XEN_PT_UNASSIGNED_PIRQ) { return; } +entry-updated = true; +} else if (msix-enabled entry-updated + !(val PCI_MSIX_ENTRY_CTRL_MASKBIT)) { +const volatile uint32_t *vec_ctrl; + /* * If Xen intercepts the mask bit access, entry-vec_ctrl may not be * up-to-date. Read from hardware directly. */ vec_ctrl = s-msix-phys_iomem_base + entry_nr * PCI_MSIX_ENTRY_SIZE + PCI_MSIX_ENTRY_VECTOR_CTRL; +set_entry_value(entry, offset, *vec_ctrl); Why are you calling set_entry_value with the hardware vec_ctrl value? It doesn't look correct to me. In any case, if you wanted to do it, shouldn't you just set/unset PCI_MSIX_ENTRY_CTRL_MASKBIT instead of the whole *vec_ctrl? -if (msix-enabled !(*vec_ctrl PCI_MSIX_ENTRY_CTRL_MASKBIT)) { -if (!entry-warned) { -entry-warned = true; -XEN_PT_ERR(s-dev, Can't update msix entry %d since MSI-X is -already enabled.\n, entry_nr); -} -return; -} - -entry-updated = true; +xen_pt_msix_update_one(s, entry_nr); Shouldn't we call xen_pt_msix_update_one only if (*vec_ctrl PCI_MSIX_ENTRY_CTRL_MASKBIT)? In other words, only when we see a MASKBIT - !MASKBIT transition? } set_entry_value(entry, offset, val); - -if (offset == PCI_MSIX_ENTRY_VECTOR_CTRL) { -if (msix-enabled !(val PCI_MSIX_ENTRY_CTRL_MASKBIT)) { -xen_pt_msix_update_one(s, entry_nr); -} -} } static uint64_t pci_msix_read(void *opaque, hwaddr addr, ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 4/6] tools/libxl: move remus code into libxl_remus.c
On 06/16/2015 09:08 PM, Ian Campbell wrote: On Wed, 2015-06-03 at 16:01 +0800, Yang Hongyang wrote: move remus code into libxl_remus.c. Please say something like ... by refactoring bits of libxl_domain_remus_start and domain_save_done into X and Y and moving the remaining functionality unchanged into the new file. I gave two examples of functions which changed there, but please make sure the list is complete+accurate. Okay, will fix the commit message next time, thank you! Signed-off-by: Yang Hongyang yan...@cn.fujitsu.com CC: Ian Campbell ian.campb...@citrix.com CC: Ian Jackson ian.jack...@eu.citrix.com CC: Wei Liu wei.l...@citrix.com --- tools/libxl/Makefile | 2 +- tools/libxl/libxl.c | 55 +--- tools/libxl/libxl_dom.c | 206 + tools/libxl/libxl_internal.h | 11 ++ tools/libxl/libxl_remus.c| 304 +++ 5 files changed, 318 insertions(+), 260 deletions(-) create mode 100644 tools/libxl/libxl_remus.c diff --git a/tools/libxl/Makefile b/tools/libxl/Makefile index 3f98d62..8535eaa 100644 --- a/tools/libxl/Makefile +++ b/tools/libxl/Makefile @@ -56,7 +56,7 @@ else LIBXL_OBJS-y += libxl_nonetbuffer.o endif -LIBXL_OBJS-y += libxl_remus_device.o libxl_remus_disk_drbd.o +LIBXL_OBJS-y += libxl_remus.o libxl_remus_device.o libxl_remus_disk_drbd.o LIBXL_OBJS-$(CONFIG_X86) += libxl_cpuid.o libxl_x86.o libxl_psr.o LIBXL_OBJS-$(CONFIG_ARM) += libxl_nocpuid.o libxl_arm.o libxl_libfdt_compat.o diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c index 77c6a36..0f9248e 100644 --- a/tools/libxl/libxl.c +++ b/tools/libxl/libxl.c @@ -792,10 +792,6 @@ out: return ptr; } -static void libxl__remus_setup_done(libxl__egc *egc, -libxl__remus_devices_state *rds, int rc); -static void libxl__remus_setup_failed(libxl__egc *egc, - libxl__remus_devices_state *rds, int rc); static void remus_failover_cb(libxl__egc *egc, libxl__domain_suspend_state *dss, int rc); @@ -844,63 +840,14 @@ int libxl_domain_remus_start(libxl_ctx *ctx, libxl_domain_remus_info *info, assert(info); -/* Convenience aliases */ -libxl__remus_devices_state *const rds = dss-rds; - -if (libxl_defbool_val(info-netbuf)) { -if (!libxl__netbuffer_enabled(gc)) { -LOG(ERROR, Remus: No support for network buffering); -rc = ERROR_FAIL; -goto out; -} -rds-device_kind_flags |= (1 LIBXL__DEVICE_KIND_VIF); -} - -if (libxl_defbool_val(info-diskbuf)) -rds-device_kind_flags |= (1 LIBXL__DEVICE_KIND_VBD); - -rds-ao = ao; -rds-domid = domid; -rds-callback = libxl__remus_setup_done; - /* Point of no return */ -libxl__remus_devices_setup(egc, rds); +libxl__remus_setup(egc, dss); return AO_INPROGRESS; out: return AO_ABORT(rc); } -static void libxl__remus_setup_done(libxl__egc *egc, -libxl__remus_devices_state *rds, int rc) -{ -libxl__domain_suspend_state *dss = CONTAINER_OF(rds, *dss, rds); -STATE_AO_GC(dss-ao); - -if (!rc) { -libxl__domain_save(egc, dss); -return; -} - -LOG(ERROR, Remus: failed to setup device for guest with domid %u, rc %d, -dss-domid, rc); -rds-callback = libxl__remus_setup_failed; -libxl__remus_devices_teardown(egc, rds); -} - -static void libxl__remus_setup_failed(libxl__egc *egc, - libxl__remus_devices_state *rds, int rc) -{ -libxl__domain_suspend_state *dss = CONTAINER_OF(rds, *dss, rds); -STATE_AO_GC(dss-ao); - -if (rc) -LOG(ERROR, Remus: failed to teardown device after setup failed - for guest with domid %u, rc %d, dss-domid, rc); - -dss-callback(egc, dss, rc); -} - static void remus_failover_cb(libxl__egc *egc, libxl__domain_suspend_state *dss, int rc) { diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c index 701e9f7..0f81081 100644 --- a/tools/libxl/libxl_dom.c +++ b/tools/libxl/libxl_dom.c @@ -1409,189 +1409,6 @@ int libxl__toolstack_save(uint32_t domid, uint8_t **buf, return 0; } -/*- remus callbacks -*/ -static void remus_domain_suspend_callback_common_done(libxl__egc *egc, -libxl__domain_suspend_state *dss, int ok); -static void remus_devices_postsuspend_cb(libxl__egc *egc, - libxl__remus_devices_state *rds, - int rc); -static void remus_devices_preresume_cb(libxl__egc *egc, - libxl__remus_devices_state *rds, - int rc); - -static void libxl__remus_domain_suspend_callback(void *data) -{ -libxl__save_helper_state *shs = data; -libxl__egc *egc = shs-egc;
Re: [Xen-devel] [PATCH 09/27] tools/libxl: Pass restore_fd as a parameter to libxl__xc_domain_restore()
On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote: If a conversion of a legacy stream is needed, libxl__xc_domain_restore() will need to use an fd other to the one found in the domain_create_state. No functional change. It could be argued that the one in domain_create_state should always be the correct one. If that means that it differs from the one ultimately passed in by the user then the original ought to be saved in the state associated with the filter as its input and the filter's output plumbed into domain_create_state instead. I'll reserve judgement until I see a bit more of how this pans out through the series. @@ -41,13 +41,13 @@ static void helper_exited(libxl__egc *egc, libxl__ev_child *ch, /*- entrypoints -*/ void libxl__xc_domain_restore(libxl__egc *egc, libxl__domain_create_state *dcs, + int restore_fd, Assuming we go this way: can it remain const? int hvm, int pae, int superpages) { STATE_AO_GC(dcs-ao); /* Convenience aliases */ const uint32_t domid = dcs-guest_domid; -const int restore_fd = dcs-restore_fd; libxl__domain_build_state *const state = dcs-build_state; unsigned cbflags = libxl__srm_callout_enumcallbacks_restore ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 07/12] x86/virt/guest/xen: Remove use of pgd_list from the Xen guest code
On 06/16/2015 10:15 AM, David Vrabel wrote: On 15/06/15 21:35, Ingo Molnar wrote: * David Vrabel david.vra...@citrix.com wrote: On 15/06/15 10:05, Ian Campbell wrote: On Sat, 2015-06-13 at 11:49 +0200, Ingo Molnar wrote: xen_mm_pin_all()/unpin_all() are used to implement full guest instance suspend/restore. It's a stop-all method that needs to iterate through all allocated pgds in the system to fix them up for Xen's use. This code uses pgd_list, probably because it was an easy interface. But we want to remove the pgd_list, so convert the code over to walk all tasks in the system. This is an equivalent method. It is not equivalent because pgd_alloc() now populates entries in pgds that are not visible to xen_mm_pin_all() (note how the original code adds the pgd to the pgd_list in pgd_ctor() before calling pgd_prepopulate_pmd()). These newly allocated page tables won't be correctly converted on suspend/resume and the new process will die after resume. So how should the Xen logic be fixed for the new scheme? I can't say I can see through the paravirt complexity here. Actually, since we freeze_processes() before trying to pin page tables, I think it should be ok as-is. I'll put the patch through some tests. Actually, I just ran this through a couple of boot/suspend/resume tests and didn't see any issues (with the one fix I mentioned to Ingo earlier). On unstable Xen only. -boris ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCHv2 5/5] evtchn: pad struct evtchn to 64 bytes
On 16.06.15 at 13:14, julien.gr...@citrix.com wrote: On 15/06/2015 16:48, David Vrabel wrote: diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h index 44ea92d..a0ff9d2 100644 --- a/xen/include/xen/sched.h +++ b/xen/include/xen/sched.h @@ -129,7 +129,7 @@ struct evtchn #endif } ssid; #endif -}; +} __attribute__((aligned(64))); Why don't you use __cacheline_aligned? That would double the size on x86, for little or no benefit. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH V4 3/7] libxl: add pvusb API
On 06/16/2015 02:23 PM, Juergen Gross wrote: On 06/16/2015 03:09 PM, George Dunlap wrote: On 06/16/2015 02:06 PM, Juergen Gross wrote: On 06/16/2015 01:45 PM, Ian Jackson wrote: Juergen Gross writes (Re: [Xen-devel] [PATCH V4 3/7] libxl: add pvusb API): On 06/16/2015 01:10 PM, Ian Jackson wrote: AIUI some devices have serial numbers, which means you can distinguish them ? Yes, they have. The question is whether those are different on multiple instances? With lsusb I've found a device with serial number 0123456789ABCD. Do you really believe I'm just lucky owning the one with such a nice serial number? ;-) Heh. I think, though, that this suggests that if the user has devices whose serial numbers are actually different, they should be able to specify a device by vid/pid/serial. I think you are right. The next question: should this be part of libxl or qemu? It should be possible to extend the qemu vendorId:productId syntax to vendorId:productId:serial without too much effort, as libusb includes some support to obtain the serial number. It doesn't really matter what qemu or pvusb can do, as long as libxl can convert what it has into something that they can use. So if libxl is given vid, pid, and serial (remember, this will be a struct, not a string), it can look in sysfs and find the bus:addr (which is unique at any given time) and hand it to qemu (or the bus-port in the case of pvusb). Hmm, I'd rather have it all in one place. Putting it in qemu would enable us to handle hotplug as well. A USB device assigned via it's serial to a domU could be automatically passed through by qemu when showing up. This isn't possible today, but we wouldn't have to change libxl again for supporting it, only qemu would have to be adapted. Look, we're talking here about the libxl interface. Ian thinks that *at the libxl layer* we need to be able to specify all these things. It doesn't matter at this point whether qemu can also do it or not. In the future, if we actually implement the automatically grab this device when it shows up functionality, then yes, having it in qemu rather than having a libxl daemon sit around and watch for those things, might be handy. But we can cross that bridge when we come to it. -George ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 1/6] xen/MSI-X: latch MSI-X table writes
On 16.06.15 at 15:35, stefano.stabell...@eu.citrix.com wrote: On Fri, 5 Jun 2015, Jan Beulich wrote: @@ -322,6 +323,13 @@ static int xen_pt_msix_update_one(XenPCI pirq = entry-pirq; +if (pirq == XEN_PT_UNASSIGNED_PIRQ || s-msix-maskall || +(entry-latch(VECTOR_CTRL) PCI_MSIX_ENTRY_CTRL_MASKBIT)) { I admit I am having difficulties understanding the full purpose of these checks. Please add a comment on them. The comment would (pointlessly imo) re-state what the code already says: I guess the intention is only to make changes using the latest values, the ones in entry-latch, when the right conditions are met, otherwise keep using the old values. Is that right? In that case, don't we want to use the latest values on MASKBIT - !MASKBIT transitions? In general when unmasking? This is what we want. And with that, the questions you ask further down should be answered too: The function gets invoked with the pre-change mask flag state in -latch[], and updates the values used for actually setting up when that one has the entry masked (or mask-all is set). The actual new value gets written to -latch[] after the call. @@ -444,39 +432,28 @@ static void pci_msix_write(void *opaque, offset = addr % PCI_MSIX_ENTRY_SIZE; if (offset != PCI_MSIX_ENTRY_VECTOR_CTRL) { -const volatile uint32_t *vec_ctrl; - if (get_entry_value(entry, offset) == val entry-pirq != XEN_PT_UNASSIGNED_PIRQ) { return; } +entry-updated = true; +} else if (msix-enabled entry-updated + !(val PCI_MSIX_ENTRY_CTRL_MASKBIT)) { +const volatile uint32_t *vec_ctrl; + /* * If Xen intercepts the mask bit access, entry-vec_ctrl may not be * up-to-date. Read from hardware directly. */ vec_ctrl = s-msix-phys_iomem_base + entry_nr * PCI_MSIX_ENTRY_SIZE + PCI_MSIX_ENTRY_VECTOR_CTRL; +set_entry_value(entry, offset, *vec_ctrl); Why are you calling set_entry_value with the hardware vec_ctrl value? It doesn't look correct to me. In any case, if you wanted to do it, shouldn't you just set/unset PCI_MSIX_ENTRY_CTRL_MASKBIT instead of the whole *vec_ctrl? The comment above the code explains it: What we have stored locally may not reflect reality, as we may not have seen all writes (and this indeed isn't just a may). And if out cached value isn't valid anymore, why would we not want to update all of it, rather than just the mask bit? -if (msix-enabled !(*vec_ctrl PCI_MSIX_ENTRY_CTRL_MASKBIT)) { -if (!entry-warned) { -entry-warned = true; -XEN_PT_ERR(s-dev, Can't update msix entry %d since MSI-X is -already enabled.\n, entry_nr); -} -return; -} - -entry-updated = true; +xen_pt_msix_update_one(s, entry_nr); Shouldn't we call xen_pt_msix_update_one only if (*vec_ctrl PCI_MSIX_ENTRY_CTRL_MASKBIT)? In other words, only when we see a MASKBIT - !MASKBIT transition? The combination of the !(val PCI_MSIX_ENTRY_CTRL_MASKBIT) check in the if() surrounding this call and the (entry-latch(VECTOR_CTRL) PCI_MSIX_ENTRY_CTRL_MASKBIT) check inside the function guarantee just that (i.e. the function invocation is benign in the other case, as entry-addr/entry-data would remain unchanged). Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 2/6] xen/MSI-X: drive maskall and enable bits through hypercalls
On Fri, 5 Jun 2015, Jan Beulich wrote: Particularly the maskall bit has to be under exclusive hypervisor control (and since they live in the same config space field, the enable bit has to follow suit). Use the replacement hypercall interfaces. From this commit message, I expect a patch that simply substitutes maskall and enable writings with hypercalls and nothing else. Signed-off-by: Jan Beulich jbeul...@suse.com --- a/qemu/upstream/hw/xen/xen_pt.h +++ b/qemu/upstream/hw/xen/xen_pt.h @@ -181,6 +181,7 @@ typedef struct XenPTMSIXEntry { typedef struct XenPTMSIX { uint32_t ctrl_offset; bool enabled; +bool maskall; int total_entries; int bar_index; uint64_t table_base; @@ -293,7 +294,9 @@ int xen_pt_msix_init(XenPCIPassthroughSt void xen_pt_msix_delete(XenPCIPassthroughState *s); int xen_pt_msix_update(XenPCIPassthroughState *s); int xen_pt_msix_update_remap(XenPCIPassthroughState *s, int bar_index); +void xen_pt_msix_enable(XenPCIPassthroughState *s); void xen_pt_msix_disable(XenPCIPassthroughState *s); +int xen_pt_msix_maskall(XenPCIPassthroughState *s, bool mask); static inline bool xen_pt_has_msix_mapping(XenPCIPassthroughState *s, int bar) { --- a/qemu/upstream/hw/xen/xen_pt_config_init.c +++ b/qemu/upstream/hw/xen/xen_pt_config_init.c @@ -1436,32 +1436,58 @@ static int xen_pt_msixctrl_reg_write(Xen uint16_t dev_value, uint16_t valid_mask) { XenPTRegInfo *reg = cfg_entry-reg; -uint16_t writable_mask = 0; +uint16_t writable_mask, value; uint16_t throughable_mask = get_throughable_mask(s, reg, valid_mask); int debug_msix_enabled_old; /* modify emulate register */ writable_mask = reg-emu_mask ~reg-ro_mask valid_mask; -cfg_entry-data = XEN_PT_MERGE_VALUE(*val, cfg_entry-data, writable_mask); +value = XEN_PT_MERGE_VALUE(*val, cfg_entry-data, writable_mask); +cfg_entry-data = value; /* create value for writing to I/O device register */ *val = XEN_PT_MERGE_VALUE(*val, dev_value, throughable_mask); +debug_msix_enabled_old = s-msix-enabled; + /* update MSI-X */ -if ((*val PCI_MSIX_FLAGS_ENABLE) Please explain in the commit message the reason of the *val/value change. - !(*val PCI_MSIX_FLAGS_MASKALL)) { +if ((value PCI_MSIX_FLAGS_ENABLE) + !(value PCI_MSIX_FLAGS_MASKALL)) { +if (!s-msix-enabled) { +if (!s-msix-maskall) { +xen_pt_msix_maskall(s, true); +} +xen_pt_msix_enable(s); +} xen_pt_msix_update(s); -} else if (!(*val PCI_MSIX_FLAGS_ENABLE) s-msix-enabled) { -xen_pt_msix_disable(s); +s-msix-enabled = true; +s-msix-maskall = false; +xen_pt_msix_maskall(s, false); Please explain in the commit message the reason behind the maskall-enable-update-un_maskall logic, that wasn't present before. +} else if (s-msix-enabled) { +if (!(value PCI_MSIX_FLAGS_ENABLE)) { +xen_pt_msix_disable(s); +s-msix-enabled = false; +} else if (!s-msix-maskall) { Why are you changing the state of s-msix-maskall here? This is the value PCI_MSIX_FLAGS_ENABLE case, nothing to do with maskall, right? +s-msix-maskall = true; +xen_pt_msix_maskall(s, true); +} } -debug_msix_enabled_old = s-msix-enabled; -s-msix-enabled = !!(*val PCI_MSIX_FLAGS_ENABLE); if (s-msix-enabled != debug_msix_enabled_old) { XEN_PT_LOG(s-dev, %s MSI-X\n, s-msix-enabled ? enable : disable); } +xen_host_pci_get_word(s-real_device, s-msix-ctrl_offset, dev_value); I have to say that I don't like the asymmetry between reading and writing PCI config registers. If writes go via hypercalls, reads should go via hypercalls too. +if (s-msix-enabled !(dev_value PCI_MSIX_FLAGS_ENABLE)) { +XEN_PT_ERR(s-dev, MSI-X unexpectedly disabled\n); +} else if ((dev_value PCI_MSIX_FLAGS_ENABLE) + s-msix-maskall + !(dev_value PCI_MSIX_FLAGS_MASKALL)) { +XEN_PT_ERR(s-dev, MSI-X unexpectedly unmasked\n); +} + return 0; } @@ -1483,9 +1509,12 @@ static XenPTRegInfo xen_pt_emu_reg_msix[ .offset = PCI_MSI_FLAGS, .size = 2, .init_val = 0x, -.res_mask = 0x3800, -.ro_mask= 0x07FF, -.emu_mask = 0x, +/* This must not be split into res_mask (0x3800) and ro_mask (0x07FF) + * because even in permissive mode there must not be any write back + * to this register. + */ +.ro_mask= 0x3FFF, +.emu_mask = 0xC000, .init = xen_pt_msixctrl_reg_init, .u.w.read = xen_pt_word_reg_read, .u.w.write = xen_pt_msixctrl_reg_write, ---
Re: [Xen-devel] [PATCH V4 3/7] libxl: add pvusb API
George Dunlap writes (Re: [Xen-devel] [PATCH V4 3/7] libxl: add pvusb API): So it sounds like we're converging on Allow multiple ways to specify the interface, with at least the following fields: - bus (int - 1,2,3, c) - port (string - 2.1.3, c) - address/devnum (int) - vendorid (uint16_t) - deviceid (uint16_t) You're missing the full device path from that list. Typically that will include at least one pci bus address. Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH V4 3/7] libxl: add pvusb API
On 06/16/2015 03:29 PM, George Dunlap wrote: On 06/16/2015 02:23 PM, Juergen Gross wrote: On 06/16/2015 03:09 PM, George Dunlap wrote: On 06/16/2015 02:06 PM, Juergen Gross wrote: On 06/16/2015 01:45 PM, Ian Jackson wrote: Juergen Gross writes (Re: [Xen-devel] [PATCH V4 3/7] libxl: add pvusb API): On 06/16/2015 01:10 PM, Ian Jackson wrote: AIUI some devices have serial numbers, which means you can distinguish them ? Yes, they have. The question is whether those are different on multiple instances? With lsusb I've found a device with serial number 0123456789ABCD. Do you really believe I'm just lucky owning the one with such a nice serial number? ;-) Heh. I think, though, that this suggests that if the user has devices whose serial numbers are actually different, they should be able to specify a device by vid/pid/serial. I think you are right. The next question: should this be part of libxl or qemu? It should be possible to extend the qemu vendorId:productId syntax to vendorId:productId:serial without too much effort, as libusb includes some support to obtain the serial number. It doesn't really matter what qemu or pvusb can do, as long as libxl can convert what it has into something that they can use. So if libxl is given vid, pid, and serial (remember, this will be a struct, not a string), it can look in sysfs and find the bus:addr (which is unique at any given time) and hand it to qemu (or the bus-port in the case of pvusb). Hmm, I'd rather have it all in one place. Putting it in qemu would enable us to handle hotplug as well. A USB device assigned via it's serial to a domU could be automatically passed through by qemu when showing up. This isn't possible today, but we wouldn't have to change libxl again for supporting it, only qemu would have to be adapted. Look, we're talking here about the libxl interface. Ian thinks that *at the libxl layer* we need to be able to specify all these things. It doesn't matter at this point whether qemu can also do it or not. When I'm adding new functionality (and this is the case here) I always try to add it at the level where it should be. qemu already is capable of finding a USB device by various means and can be extended easily to support our needs. And please remember, for writing the pvusb backend I'm already doing changes to qemu. So why should we add the same functionality to libxl by reading sysfs instead of letting it do qemu via libusb? And what about BSD? Letting qemu find the device would avoid two variants in libxl. In the future, if we actually implement the automatically grab this device when it shows up functionality, then yes, having it in qemu rather than having a libxl daemon sit around and watch for those things, might be handy. But we can cross that bridge when we come to it. I would agree if the efforts in libxl would be much smaller than in qemu. But if the efforts are comparable I'd rather do it in the component which will make such an enhancement easier. Just my $0.02 Juergen ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 08/27] tools/libxl: Extra APIs for the save helper
On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote: With libxl migration v2, there will be other moving parts which might fail, requiring the helper to be stopped for reasons which are not its fault. Signed-off-by: Andrew Cooper andrew.coop...@citrix.com CC: Ian Campbell ian.campb...@citrix.com CC: Ian Jackson ian.jack...@eu.citrix.com CC: Wei Liu wei.l...@citrix.com --- tools/libxl/libxl_internal.h |8 tools/libxl/libxl_save_callout.c | 16 2 files changed, 24 insertions(+) diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h index 4f204f9..3fcc37a 100644 --- a/tools/libxl/libxl_internal.h +++ b/tools/libxl/libxl_internal.h @@ -3182,6 +3182,14 @@ _hidden void libxl__xc_domain_restore(libxl__egc *egc, _hidden void libxl__xc_domain_restore_done(libxl__egc *egc, void *dcs_void, int rc, int retval, int errnoval); +_hidden void libxl__save_helper_abort(libxl__egc *egc, + libxl__save_helper_state *shs); + +static inline bool libxl__save_helper_inuse(const libxl__save_helper_state *shs) +{ +return libxl__ev_child_inuse(shs-child); +} Will this be used other than in libxl__save_helper_abort? + /* Each time the dm needs to be saved, we must call suspend and then save */ _hidden int libxl__domain_suspend_device_model(libxl__gc *gc, libxl__domain_suspend_state *dss); diff --git a/tools/libxl/libxl_save_callout.c b/tools/libxl/libxl_save_callout.c index 231de2f..71de297 100644 --- a/tools/libxl/libxl_save_callout.c +++ b/tools/libxl/libxl_save_callout.c @@ -256,6 +256,22 @@ static void helper_failed(libxl__egc *egc, libxl__save_helper_state *shs, libxl__kill(gc, shs-child.pid, SIGKILL, save/restore helper); } +void libxl__save_helper_abort(libxl__egc *egc, + libxl__save_helper_state *shs) +{ +STATE_AO_GC(shs-ao); + +if (!libxl__ev_child_inuse(shs-child)) { +helper_failed(egc, shs, ERROR_FAIL); Did you mean helper_done here? helper_failed deregisters the fd, and potentially sends SIGKILL, although in this case it won't because its not inuse. So all this is doing is deregistering the fd, which AFAICT isn't registered if there is no pid. In fact, apart from the specific signal used, this function looks a lot like helper_failed. +return; +} + +if (!shs-rc) +shs-rc = ERROR_FAIL; + +libxl__kill(gc, shs-child.pid, SIGTERM, save/restore helper); +} + static void helper_stdout_readable(libxl__egc *egc, libxl__ev_fd *ev, int fd, short events, short revents) { ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCHv2 5/5] evtchn: pad struct evtchn to 64 bytes
On 16/06/15 12:59, Jan Beulich wrote: On 16.06.15 at 13:14, julien.gr...@citrix.com wrote: On 15/06/2015 16:48, David Vrabel wrote: diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h index 44ea92d..a0ff9d2 100644 --- a/xen/include/xen/sched.h +++ b/xen/include/xen/sched.h @@ -129,7 +129,7 @@ struct evtchn #endif } ssid; #endif -}; +} __attribute__((aligned(64))); Why don't you use __cacheline_aligned? That would double the size on x86, for little or no benefit. Well, the cache line size is not necessarily 64 bytes on every architecture. In the case of ARM, the cache line depends on the processor version. __cacheline_aligned is the only way to ensure that the cache line is not shared on ARM. AFAIU, the goal of this patch is to avoid sharing the cache line. If not, the commit message is misleading because it claims that a cache line is always 64 bytes... Regards, -- Julien Grall ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCHv2 5/5] evtchn: pad struct evtchn to 64 bytes
On 16/06/15 13:57, Julien Grall wrote: On 16/06/15 12:59, Jan Beulich wrote: On 16.06.15 at 13:14, julien.gr...@citrix.com wrote: On 15/06/2015 16:48, David Vrabel wrote: diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h index 44ea92d..a0ff9d2 100644 --- a/xen/include/xen/sched.h +++ b/xen/include/xen/sched.h @@ -129,7 +129,7 @@ struct evtchn #endif } ssid; #endif -}; +} __attribute__((aligned(64))); Why don't you use __cacheline_aligned? That would double the size on x86, for little or no benefit. Well, the cache line size is not necessarily 64 bytes on every architecture. In the case of ARM, the cache line depends on the processor version. __cacheline_aligned is the only way to ensure that the cache line is not shared on ARM. AFAIU, the goal of this patch is to avoid sharing the cache line. If not, the commit message is misleading because it claims that a cache line is always 64 bytes... We want to avoid sharing the cache line where we can do so for no additional memory cost. David ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH V4 3/7] libxl: add pvusb API
On 06/16/2015 03:19 PM, George Dunlap wrote: On Tue, Jun 16, 2015 at 1:02 PM, Ian Jackson ian.jack...@eu.citrix.com wrote: George Dunlap writes (Re: [Xen-devel] [PATCH V4 3/7] libxl: add pvusb API): So it sounds like we're converging on Allow multiple ways to specify the interface, with at least the following fields: - bus (int - 1,2,3, c) - port (string - 2.1.3, c) - address/devnum (int) - vendorid (uint16_t) - deviceid (uint16_t) You're missing the full device path from that list. Typically that will include at least one pci bus address. I'm not sure what constitutes a full device path. Is that defined somewhere? Remember that the path you gave in your previous e-mail isn't the path for the *usb device*, it's the path for the *block device*. It contains a PCI address, but it looks like it also contains part of the USB topology. Are you sure that's actually a stable interface, or does it just happen that on your hardware the discovery always happens in the same order? On my system /sys/bus/usb/devices/2-3.3 is a link to /sys/devices/pci:00/:00:1d.7/usb2/2-3/2-3.3/. This contains the pci bus address, but it also contains the bus number, which we've just said may be unstable across reboots. I suppose it might be possible to specify buspci,port -- the pci address of the root bus, and the topology from there. In theory I guess that should be stable? Hmm, perhaps. On my system I've got: /sys/devices/pci:00/:00:14.0/usb3/ /sys/devices/pci:00/:00:14.0/usb4/ So two busses on one pci bus address. Are usb3 and usb4 always in this order or are they sometimes just numbered the other way round? Juergen ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 01/27] tools/libxl: Fix libxl__ev_child_inuse() check for not-yet-initialised children
On 16/06/15 14:21, Ian Campbell wrote: On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote: Shortly, libxl will be juggling multiple parallel operations, and will possibly have to take error decisions before some tasks have been set up. It would be preferable, I think, to arrange to call libxl__ev_child_init on all such libxl__ev_child structs either up front or certainly before there is any possibility of needing to unwind them. Such an init would at worst correspond to exactly the place where the zeroed structure you refer to is zeroed. It is possible that one bit fails before it can be calculated whether the second bit needs to start or not. At the moment, all bits in libxl in this area do initialisation immediately before use; most bits are even initialised in the function which starts their actions. Some bits are initialised differently depending on the path taken to get to the initialisation site. It would be non-trivial to initialise everything appropriately at the very start. ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 03/27] tools/libxl: Stash all restore parameters in domain_create_state
On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote: Shortly more parameters will appear, and this saves unboxing each one. No functional change. Signed-off-by: Andrew Cooper andrew.coop...@citrix.com CC: Ian Campbell ian.campb...@citrix.com CC: Ian Jackson ian.jack...@eu.citrix.com CC: Wei Liu wei.l...@citrix.com --- tools/libxl/libxl_create.c | 12 ++-- tools/libxl/libxl_internal.h |2 +- tools/libxl/libxl_save_callout.c |2 +- 3 files changed, 8 insertions(+), 8 deletions(-) diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c index 86384d2..385891c 100644 --- a/tools/libxl/libxl_create.c +++ b/tools/libxl/libxl_create.c @@ -1577,8 +1577,8 @@ static void domain_create_cb(libxl__egc *egc, int rc, uint32_t domid); static int do_domain_create(libxl_ctx *ctx, libxl_domain_config *d_config, -uint32_t *domid, -int restore_fd, int checkpointed_stream, +uint32_t *domid, int restore_fd, +const libxl_domain_restore_params *params, const libxl_asyncop_how *ao_how, const libxl_asyncprogress_how *aop_console_how) { @@ -1591,8 +1591,8 @@ static int do_domain_create(libxl_ctx *ctx, libxl_domain_config *d_config, libxl_domain_config_init(cdcs-dcs.guest_config_saved); libxl_domain_config_copy(ctx, cdcs-dcs.guest_config_saved, d_config); cdcs-dcs.restore_fd = restore_fd; +if (params) cdcs-dcs.restore_params = *params; Is this eventually going to become non-optional? I think not and its validity is entirely intertwined with the validity of restore_fd (as I suspect it was before, but I've not checked). Perhaps an error check to that effect would be useful? Anyway, I think what you've done here is correct, so: Acked-by: Ian Campbell ian.campb...@citrix.com [...] @@ -3122,11 +3122,11 @@ struct libxl__domain_create_state { libxl_domain_config *guest_config; libxl_domain_config guest_config_saved; /* vanilla config */ int restore_fd; +libxl_domain_restore_params restore_params; libxl__domain_create_cb *callback; libxl_asyncprogress_how aop_console_how; /* private to domain_create */ int guest_domid; -int checkpointed_stream; This has, in effect moved from private to domain_create to filled in by user, I don't think the change here has actually changed its status, but I suspect it was wrong before (alternatively restore_fd is in the wrong place instead). Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 2/6] tools/libxl: move domain suspend code into libxl_dom_suspend.c
On 06/16/2015 09:00 PM, Ian Campbell wrote: On Wed, 2015-06-03 at 16:01 +0800, Yang Hongyang wrote: Move domain suspend code into a separate file libxl_dom_suspend.c. export an API libxl__domain_suspend() which wrappers the static just ..which wraps the... will fix, thank you ! function domain_suspend_callback_common() for internal use. Note that the newly added file libxl_dom_suspend.c is used for suspend/resume code. Signed-off-by: Yang Hongyang yan...@cn.fujitsu.com CC: Ian Campbell ian.campb...@citrix.com CC: Ian Jackson ian.jack...@eu.citrix.com CC: Wei Liu wei.l...@citrix.com CC: Andrew Cooper andrew.coop...@citrix.com Acked-by: Ian Campbell ian.campb...@citrix.com --- tools/libxl/Makefile| 3 +- tools/libxl/libxl_dom.c | 350 +--- tools/libxl/libxl_dom_suspend.c | 381 tools/libxl/libxl_internal.h| 6 + 4 files changed, 393 insertions(+), 347 deletions(-) create mode 100644 tools/libxl/libxl_dom_suspend.c diff --git a/tools/libxl/Makefile b/tools/libxl/Makefile index cc9c152..3f98d62 100644 --- a/tools/libxl/Makefile +++ b/tools/libxl/Makefile @@ -95,7 +95,8 @@ LIBXL_OBJS = flexarray.o libxl.o libxl_create.o libxl_dm.o libxl_pci.o \ libxl_internal.o libxl_utils.o libxl_uuid.o \ libxl_json.o libxl_aoutils.o libxl_numa.o libxl_vnuma.o \ libxl_save_callout.o _libxl_save_msgs_callout.o \ - libxl_qmp.o libxl_event.o libxl_fork.o $(LIBXL_OBJS-y) + libxl_qmp.o libxl_event.o libxl_fork.o libxl_dom_suspend.o \ + $(LIBXL_OBJS-y) LIBXL_OBJS += libxl_genid.o LIBXL_OBJS += _libxl_types.o libxl_flask.o _libxl_types_internal.o diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c index cce04dd..9444329 100644 --- a/tools/libxl/libxl_dom.c +++ b/tools/libxl/libxl_dom.c @@ -1103,11 +1103,6 @@ int libxl__toolstack_restore(uint32_t domid, const uint8_t *buf, /* Domain suspend (save) */ -static void domain_save_done(libxl__egc *egc, - libxl__domain_suspend_state *dss, int rc); -static void domain_suspend_callback_common_done(libxl__egc *egc, -libxl__domain_suspend_state *dss, int ok); - /*- complicated callback, called by xc_domain_save -*/ /* @@ -1324,35 +1319,6 @@ static void switch_logdirty_done(libxl__egc *egc, /*- callbacks, called by xc_domain_save -*/ -int libxl__domain_suspend_device_model(libxl__gc *gc, - libxl__domain_suspend_state *dss) -{ -int ret = 0; -uint32_t const domid = dss-domid; -const char *const filename = dss-dm_savefile; - -switch (libxl__device_model_version_running(gc, domid)) { -case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL: { -LOG(DEBUG, Saving device model state to %s, filename); -libxl__qemu_traditional_cmd(gc, domid, save); -libxl__wait_for_device_model_deprecated(gc, domid, paused, NULL, NULL, NULL); -break; -} -case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN: -if (libxl__qmp_stop(gc, domid)) -return ERROR_FAIL; -/* Save DM state into filename */ -ret = libxl__qmp_save(gc, domid, filename); -if (ret) -unlink(filename); -break; -default: -return ERROR_INVAL; -} - -return ret; -} - int libxl__domain_resume_device_model(libxl__gc *gc, uint32_t domid) { @@ -1373,301 +1339,6 @@ int libxl__domain_resume_device_model(libxl__gc *gc, uint32_t domid) return 0; } -static void domain_suspend_common_wait_guest(libxl__egc *egc, - libxl__domain_suspend_state *dss); -static void domain_suspend_common_guest_suspended(libxl__egc *egc, - libxl__domain_suspend_state *dss); - -static void domain_suspend_common_pvcontrol_suspending(libxl__egc *egc, - libxl__xswait_state *xswa, int rc, const char *state); -static void domain_suspend_common_wait_guest_evtchn(libxl__egc *egc, -libxl__ev_evtchn *evev); -static void suspend_common_wait_guest_watch(libxl__egc *egc, - libxl__ev_xswatch *xsw, const char *watch_path, const char *event_path); -static void suspend_common_wait_guest_check(libxl__egc *egc, -libxl__domain_suspend_state *dss); -static void suspend_common_wait_guest_timeout(libxl__egc *egc, - libxl__ev_time *ev, const struct timeval *requested_abs); - -static void domain_suspend_common_failed(libxl__egc *egc, - libxl__domain_suspend_state *dss); -static void domain_suspend_common_done(libxl__egc *egc, - libxl__domain_suspend_state *dss, - bool ok); - -static bool
Re: [Xen-devel] [PATCH V4 3/7] libxl: add pvusb API
On 06/16/2015 02:49 PM, Juergen Gross wrote: On 06/16/2015 03:29 PM, George Dunlap wrote: On 06/16/2015 02:23 PM, Juergen Gross wrote: Hmm, I'd rather have it all in one place. Putting it in qemu would enable us to handle hotplug as well. A USB device assigned via it's serial to a domU could be automatically passed through by qemu when showing up. This isn't possible today, but we wouldn't have to change libxl again for supporting it, only qemu would have to be adapted. Look, we're talking here about the libxl interface. Ian thinks that *at the libxl layer* we need to be able to specify all these things. It doesn't matter at this point whether qemu can also do it or not. When I'm adding new functionality (and this is the case here) I always try to add it at the level where it should be. qemu already is capable of finding a USB device by various means and can be extended easily to support our needs. And please remember, for writing the pvusb backend I'm already doing changes to qemu. So why should we add the same functionality to libxl by reading sysfs instead of letting it do qemu via libusb? And what about BSD? Letting qemu find the device would avoid two variants in libxl. If the libxl interface only has bus and port, then it doesn't matter what qemu can do -- the caller has no way of asking qemu to watch for vid:did:serial. If libxl has vid:did:serial in the interface, then it's only an implementation detail whether we pass that into qemu or whether we pass some other way of uniquely indentifying a particular device. And given that no *current* versions of qemu support vid:did:serial, the most sensible way to implement this would be to have libxl do the conversion and send over bus:addr -- that way when you update libxl you get that functionality regardless of whether qemu has it. Unless you're suggesting that the libxl interface should be a string that we just pass verbatim to qemu. If that's what you mean, it's a complete non-starter, and I'm sure Ian will agree with me. There's no way that we can provide any interface consistency or any documentation of what this string does if it boils down to This does whatever the version of qemu you happen to be running does. In the future, if we actually implement the automatically grab this device when it shows up functionality, then yes, having it in qemu rather than having a libxl daemon sit around and watch for those things, might be handy. But we can cross that bridge when we come to it. I would agree if the efforts in libxl would be much smaller than in qemu. But if the efforts are comparable I'd rather do it in the component which will make such an enhancement easier. They're both small, so this is really bikeshedding at the moment. The most important question is the interface: do we include vid:did:serial in the libxl interface. Since you want qemu to do that, I take it that your answer to that is yes. When we have patches submitted, we can discuss whether libxl should only support the (as-yet unreleased) versions of qemu that do the search themselves, or whether they should support all versions of qemu-xen. (The answer seems pretty obvious to me...) -George ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 07/12] x86/virt/guest/xen: Remove use of pgd_list from the Xen guest code
On 15/06/15 21:35, Ingo Molnar wrote: * David Vrabel david.vra...@citrix.com wrote: On 15/06/15 10:05, Ian Campbell wrote: On Sat, 2015-06-13 at 11:49 +0200, Ingo Molnar wrote: xen_mm_pin_all()/unpin_all() are used to implement full guest instance suspend/restore. It's a stop-all method that needs to iterate through all allocated pgds in the system to fix them up for Xen's use. This code uses pgd_list, probably because it was an easy interface. But we want to remove the pgd_list, so convert the code over to walk all tasks in the system. This is an equivalent method. It is not equivalent because pgd_alloc() now populates entries in pgds that are not visible to xen_mm_pin_all() (note how the original code adds the pgd to the pgd_list in pgd_ctor() before calling pgd_prepopulate_pmd()). These newly allocated page tables won't be correctly converted on suspend/resume and the new process will die after resume. So how should the Xen logic be fixed for the new scheme? I can't say I can see through the paravirt complexity here. Actually, since we freeze_processes() before trying to pin page tables, I think it should be ok as-is. I'll put the patch through some tests. David ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 3/6] tools/libxl: move domain resume code into libxl_dom_suspend.c
On Wed, 2015-06-03 at 16:01 +0800, Yang Hongyang wrote: move domain resume code into libxl_dom_suspend.c. Even though it has resume in the name, I'm not sure that libxl__domain_s3_resume is a good candidate for moving to the suspend code, it's called only from libxl_send_trigger and IIRC we don't really implement s3, just a fake version where the domain is paused/unpaused (rather the destroyed and resumed, say. Having moved libxl__domain_resume and libxl__domain_resume_device_model into the same file I think the latter could become static. (If you do that in this patch please say something along the lines of pure code motion except for making libxl__domain_resume_device_model static in the commit message) ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 02/27] tools/libxc: Always compile the compat qemu variables into xc_sr_context
On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote: This is safe (as the variable will simply be unused), and is required for correct compilation when midway through untangling the libxc/libxl interaction. The #define is left in place to highlight that the variables can be removed once the untangling is complete. Signed-off-by: Andrew Cooper andrew.coop...@citrix.com Acked-by: Ian Campbell ian.campb...@citrix.com ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH V4 3/7] libxl: add pvusb API
On 06/16/2015 03:09 PM, George Dunlap wrote: On 06/16/2015 02:06 PM, Juergen Gross wrote: On 06/16/2015 01:45 PM, Ian Jackson wrote: Juergen Gross writes (Re: [Xen-devel] [PATCH V4 3/7] libxl: add pvusb API): On 06/16/2015 01:10 PM, Ian Jackson wrote: AIUI some devices have serial numbers, which means you can distinguish them ? Yes, they have. The question is whether those are different on multiple instances? With lsusb I've found a device with serial number 0123456789ABCD. Do you really believe I'm just lucky owning the one with such a nice serial number? ;-) Heh. I think, though, that this suggests that if the user has devices whose serial numbers are actually different, they should be able to specify a device by vid/pid/serial. I think you are right. The next question: should this be part of libxl or qemu? It should be possible to extend the qemu vendorId:productId syntax to vendorId:productId:serial without too much effort, as libusb includes some support to obtain the serial number. It doesn't really matter what qemu or pvusb can do, as long as libxl can convert what it has into something that they can use. So if libxl is given vid, pid, and serial (remember, this will be a struct, not a string), it can look in sysfs and find the bus:addr (which is unique at any given time) and hand it to qemu (or the bus-port in the case of pvusb). Hmm, I'd rather have it all in one place. Putting it in qemu would enable us to handle hotplug as well. A USB device assigned via it's serial to a domU could be automatically passed through by qemu when showing up. This isn't possible today, but we wouldn't have to change libxl again for supporting it, only qemu would have to be adapted. Juergen ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 01/27] tools/libxl: Fix libxl__ev_child_inuse() check for not-yet-initialised children
On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote: Shortly, libxl will be juggling multiple parallel operations, and will possibly have to take error decisions before some tasks have been set up. It would be preferable, I think, to arrange to call libxl__ev_child_init on all such libxl__ev_child structs either up front or certainly before there is any possibility of needing to unwind them. Such an init would at worst correspond to exactly the place where the zeroed structure you refer to is zeroed. No child process of libxl will ever have a pid of 0, so gate libxl__ev_child_inuse() on a pid strictly greater than 0. This makes it safe to use on a zeroed structure of a task which has not yet been set up. Signed-off-by: Andrew Cooper andrew.coop...@citrix.com CC: Ian Campbell ian.campb...@citrix.com CC: Ian Jackson ian.jack...@eu.citrix.com CC: Wei Liu wei.l...@citrix.com --- This change does make libxl__ev_child_init() functionally useless. I am undecided between leaving it in place in case it is useful in the future, or to remove it completely. --- tools/libxl/libxl_internal.h |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h index e96d6b5..6226c18 100644 --- a/tools/libxl/libxl_internal.h +++ b/tools/libxl/libxl_internal.h @@ -880,7 +880,7 @@ _hidden pid_t libxl__ev_child_fork(libxl__gc *gc, libxl__ev_child *childw_out, static inline void libxl__ev_child_init(libxl__ev_child *childw_out) { childw_out-pid = -1; } static inline int libxl__ev_child_inuse(const libxl__ev_child *childw_out) -{ return childw_out-pid = 0; } +{ return childw_out-pid 0; } /* Useable (only) in the child to once more make the ctx useable for * xenstore operations. logs failure in the form what: error ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 04/27] tools/xl: Mandatory flag indicating the format of the migration stream
On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote: Introduced at this point so the python stream conversion code has a concrete ABI to use. Please could you also explicitly mention that it isn't added to FLAG_ALL yet because we don't actually implement it yet and that it will be added there later. With that: Signed-off-by: Andrew Cooper andrew.coop...@citrix.com Acked-by: Ian Campbell ian.campb...@citrix.com (although I wish I had a better name in mind...) ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH V4 3/7] libxl: add pvusb API [and 1 more messages]
George Dunlap writes (Re: [Xen-devel] [PATCH V4 3/7] libxl: add pvusb API): Remember that the path you gave in your previous e-mail isn't the path for the *usb device*, it's the path for the *block device*. It contains a PCI address, but it looks like it also contains part of the USB topology. Are you sure that's actually a stable interface, or does it just happen that on your hardware the discovery always happens in the same order? The block device is (in path terms) underneath the usb device, obviously. Not all of that path is relevant to identifying the USB device. On my system /sys/bus/usb/devices/2-3.3 is a link to /sys/devices/pci:00/:00:1d.7/usb2/2-3/2-3.3/. This contains the pci bus address, but it also contains the bus number, which we've just said may be unstable across reboots. You mean the 2 in `usb2' ? I think that bus number is the bus number within the controller, not globally. I suppose it might be possible to specify buspci,port -- the pci address of the root bus, and the topology from there. In theory I guess that should be stable? Yes. The whole point of paths like this is that they are stable if the physical topology doesn't change. So on my netbook /dev/disk/by-path/pci-:00:1d.7-usb-0:1:1.0-scsi-0:0:0:0-part1 always refers to the 1st MBR partition on logical device 0 on the USB storage device plugged into the USB port physically on the front left of the computer. In any case, at the moment you're essentially inventing from whole cloth a new way of specifying USB devices that (as far as I know) isn't supported by any other program that uses USB. If you can't specify the device by hardware path, you can't specify it deterministically. And as you can see it _is_ supported by other programs that use USB. mount can use it! I think the hardware path to the controller, at least, should be treated as an opaque OS-specific string. It might have a different format on BSD. Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 3/6] tools/libxl: move domain resume code into libxl_dom_suspend.c
On 06/16/2015 09:04 PM, Ian Campbell wrote: On Wed, 2015-06-03 at 16:01 +0800, Yang Hongyang wrote: move domain resume code into libxl_dom_suspend.c. Even though it has resume in the name, I'm not sure that libxl__domain_s3_resume is a good candidate for moving to the suspend code, it's called only from libxl_send_trigger and IIRC we don't really implement s3, just a fake version where the domain is paused/unpaused (rather the destroyed and resumed, say. Having moved libxl__domain_resume and libxl__domain_resume_device_model into the same file I think the latter could become static. (If you do that in this patch please say something along the lines of pure code motion except for making libxl__domain_resume_device_model static in the commit message) ok, will fix this in the next version, thank you for the review! . -- Thanks, Yang. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 3/6] xen/MSI-X: really enforce alignment
On Fri, 5 Jun 2015, Jan Beulich wrote: The way the generic infrastructure works the intention of not allowing unaligned accesses can't be achieved by simply setting .unaligned to false. The benefit is that we can now replace the conditionals in {get,set}_entry_value() by assert()-s. Signed-off-by: Jan Beulich jbeul...@suse.com Reviewed-by: Stefano Stabellini stefano.stabell...@eu.citrix.com --- a/qemu/upstream/hw/xen/xen_pt_msi.c +++ b/qemu/upstream/hw/xen/xen_pt_msi.c @@ -421,16 +421,14 @@ int xen_pt_msix_update_remap(XenPCIPasst static uint32_t get_entry_value(XenPTMSIXEntry *e, int offset) { -return !(offset % sizeof(*e-latch)) - ? e-latch[offset / sizeof(*e-latch)] : 0; +assert(!(offset % sizeof(*e-latch))); +return e-latch[offset / sizeof(*e-latch)]; } static void set_entry_value(XenPTMSIXEntry *e, int offset, uint32_t val) { -if (!(offset % sizeof(*e-latch))) -{ -e-latch[offset / sizeof(*e-latch)] = val; -} +assert(!(offset % sizeof(*e-latch))); +e-latch[offset / sizeof(*e-latch)] = val; } static void pci_msix_write(void *opaque, hwaddr addr, @@ -496,6 +494,12 @@ static uint64_t pci_msix_read(void *opaq } } +static bool pci_msix_accepts(void *opaque, hwaddr addr, + unsigned size, bool is_write) +{ +return !(addr (size - 1)); +} + static const MemoryRegionOps pci_msix_ops = { .read = pci_msix_read, .write = pci_msix_write, @@ -504,7 +508,13 @@ static const MemoryRegionOps pci_msix_op .min_access_size = 4, .max_access_size = 4, .unaligned = false, +.accepts = pci_msix_accepts }, +.impl = { +.min_access_size = 4, +.max_access_size = 4, +.unaligned = false +} }; int xen_pt_msix_init(XenPCIPassthroughState *s, uint32_t base) ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 03/27] tools/libxl: Stash all restore parameters in domain_create_state
On 16/06/15 14:37, Ian Campbell wrote: On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote: Shortly more parameters will appear, and this saves unboxing each one. No functional change. Signed-off-by: Andrew Cooper andrew.coop...@citrix.com CC: Ian Campbell ian.campb...@citrix.com CC: Ian Jackson ian.jack...@eu.citrix.com CC: Wei Liu wei.l...@citrix.com --- tools/libxl/libxl_create.c | 12 ++-- tools/libxl/libxl_internal.h |2 +- tools/libxl/libxl_save_callout.c |2 +- 3 files changed, 8 insertions(+), 8 deletions(-) diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c index 86384d2..385891c 100644 --- a/tools/libxl/libxl_create.c +++ b/tools/libxl/libxl_create.c @@ -1577,8 +1577,8 @@ static void domain_create_cb(libxl__egc *egc, int rc, uint32_t domid); static int do_domain_create(libxl_ctx *ctx, libxl_domain_config *d_config, -uint32_t *domid, -int restore_fd, int checkpointed_stream, +uint32_t *domid, int restore_fd, +const libxl_domain_restore_params *params, const libxl_asyncop_how *ao_how, const libxl_asyncprogress_how *aop_console_how) { @@ -1591,8 +1591,8 @@ static int do_domain_create(libxl_ctx *ctx, libxl_domain_config *d_config, libxl_domain_config_init(cdcs-dcs.guest_config_saved); libxl_domain_config_copy(ctx, cdcs-dcs.guest_config_saved, d_config); cdcs-dcs.restore_fd = restore_fd; +if (params) cdcs-dcs.restore_params = *params; Is this eventually going to become non-optional? I think not and its validity is entirely intertwined with the validity of restore_fd (as I suspect it was before, but I've not checked). Perhaps an error check to that effect would be useful? It is mandatory for restore, and currently unused for plain create. restore_fd being -1 does appear to be the canonical switch between a restore and a create, so should be the qualification of validity. Anyway, I think what you've done here is correct, so: Acked-by: Ian Campbell ian.campb...@citrix.com [...] @@ -3122,11 +3122,11 @@ struct libxl__domain_create_state { libxl_domain_config *guest_config; libxl_domain_config guest_config_saved; /* vanilla config */ int restore_fd; +libxl_domain_restore_params restore_params; libxl__domain_create_cb *callback; libxl_asyncprogress_how aop_console_how; /* private to domain_create */ int guest_domid; -int checkpointed_stream; This has, in effect moved from private to domain_create to filled in by user, I don't think the change here has actually changed its status, but I suspect it was wrong before (alternatively restore_fd is in the wrong place instead). I think it was wrong before. It was always a caller-provided parameter, albeit implicit by virtue of essentially being a remus boolean. ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 04/27] tools/xl: Mandatory flag indicating the format of the migration stream
On 16/06/15 14:39, Ian Campbell wrote: On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote: Introduced at this point so the python stream conversion code has a concrete ABI to use. Please could you also explicitly mention that it isn't added to FLAG_ALL yet because we don't actually implement it yet and that it will be added there later. Certainly. With that: Signed-off-by: Andrew Cooper andrew.coop...@citrix.com Acked-by: Ian Campbell ian.campb...@citrix.com (although I wish I had a better name in mind...) Any suggestions welcome. ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH V4 3/7] libxl: add pvusb API
On 06/16/2015 04:06 PM, George Dunlap wrote: On 06/16/2015 02:49 PM, Juergen Gross wrote: On 06/16/2015 03:29 PM, George Dunlap wrote: On 06/16/2015 02:23 PM, Juergen Gross wrote: Hmm, I'd rather have it all in one place. Putting it in qemu would enable us to handle hotplug as well. A USB device assigned via it's serial to a domU could be automatically passed through by qemu when showing up. This isn't possible today, but we wouldn't have to change libxl again for supporting it, only qemu would have to be adapted. Look, we're talking here about the libxl interface. Ian thinks that *at the libxl layer* we need to be able to specify all these things. It doesn't matter at this point whether qemu can also do it or not. When I'm adding new functionality (and this is the case here) I always try to add it at the level where it should be. qemu already is capable of finding a USB device by various means and can be extended easily to support our needs. And please remember, for writing the pvusb backend I'm already doing changes to qemu. So why should we add the same functionality to libxl by reading sysfs instead of letting it do qemu via libusb? And what about BSD? Letting qemu find the device would avoid two variants in libxl. If the libxl interface only has bus and port, then it doesn't matter what qemu can do -- the caller has no way of asking qemu to watch for vid:did:serial. If libxl has vid:did:serial in the interface, then it's only an implementation detail whether we pass that into qemu or whether we pass some other way of uniquely indentifying a particular device. And given that no *current* versions of qemu support vid:did:serial, the most sensible way to implement this would be to have libxl do the conversion and send over bus:addr -- that way when you update libxl you get that functionality regardless of whether qemu has it. Unless you're suggesting that the libxl interface should be a string that we just pass verbatim to qemu. If that's what you mean, it's a complete non-starter, and I'm sure Ian will agree with me. There's no way that we can provide any interface consistency or any documentation of what this string does if it boils down to This does whatever the version of qemu you happen to be running does. No, I didn't expect libxl to just pass an arbitrary string to qemu. My point was to avoid the sysfs accesses in libxl in order to support BSD as well and to reduce the complexity. In the future, if we actually implement the automatically grab this device when it shows up functionality, then yes, having it in qemu rather than having a libxl daemon sit around and watch for those things, might be handy. But we can cross that bridge when we come to it. I would agree if the efforts in libxl would be much smaller than in qemu. But if the efforts are comparable I'd rather do it in the component which will make such an enhancement easier. They're both small, so this is really bikeshedding at the moment. The most important question is the interface: do we include vid:did:serial in the libxl interface. Since you want qemu to do that, I take it that your answer to that is yes. Correct. When we have patches submitted, we can discuss whether libxl should only support the (as-yet unreleased) versions of qemu that do the search themselves, or whether they should support all versions of qemu-xen. (The answer seems pretty obvious to me...) May be I made one wrong assumption: I thought adding USB-passthrough for HVM domains would require qemu changes as well. If this is not the case I can understand your reasoning. Juergen ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 4/6] xen/pass-through: correctly deal with RW1C bits
On Fri, 5 Jun 2015, Jan Beulich wrote: Introduce yet another mask for them, so that the generic routine can handle them, at once rendering xen_pt_pmcsr_reg_write() superfluous. Signed-off-by: Jan Beulich jbeul...@suse.com --- a/qemu/upstream/hw/xen/xen_pt.h +++ b/qemu/upstream/hw/xen/xen_pt.h @@ -105,6 +105,8 @@ struct XenPTRegInfo { uint32_t res_mask; /* reg read only field mask (ON:RO/ROS, OFF:other) */ uint32_t ro_mask; +/* reg read/write-1-clear field mask (ON:RW1C/RW1CS, OFF:other) */ +uint32_t rw1c_mask; /* reg emulate field mask (ON:emu, OFF:passthrough) */ uint32_t emu_mask; xen_pt_conf_reg_init init; --- a/qemu/upstream/hw/xen/xen_pt_config_init.c +++ b/qemu/upstream/hw/xen/xen_pt_config_init.c @@ -176,7 +176,8 @@ static int xen_pt_byte_reg_write(XenPCIP cfg_entry-data = XEN_PT_MERGE_VALUE(*val, cfg_entry-data, writable_mask); /* create value for writing to I/O device register */ -*val = XEN_PT_MERGE_VALUE(*val, dev_value, throughable_mask); +*val = XEN_PT_MERGE_VALUE(*val, dev_value ~reg-rw1c_mask, + throughable_mask); return 0; } @@ -193,7 +194,8 @@ static int xen_pt_word_reg_write(XenPCIP cfg_entry-data = XEN_PT_MERGE_VALUE(*val, cfg_entry-data, writable_mask); /* create value for writing to I/O device register */ -*val = XEN_PT_MERGE_VALUE(*val, dev_value, throughable_mask); +*val = XEN_PT_MERGE_VALUE(*val, dev_value ~reg-rw1c_mask, + throughable_mask); return 0; } @@ -210,7 +212,8 @@ static int xen_pt_long_reg_write(XenPCIP cfg_entry-data = XEN_PT_MERGE_VALUE(*val, cfg_entry-data, writable_mask); /* create value for writing to I/O device register */ -*val = XEN_PT_MERGE_VALUE(*val, dev_value, throughable_mask); +*val = XEN_PT_MERGE_VALUE(*val, dev_value ~reg-rw1c_mask, + throughable_mask); return 0; } @@ -611,6 +614,7 @@ static XenPTRegInfo xen_pt_emu_reg_heade .init_val = 0x, .res_mask = 0x0007, .ro_mask= 0x06F8, +.rw1c_mask = 0xF900, .emu_mask = 0x0010, .init = xen_pt_status_reg_init, .u.w.read = xen_pt_word_reg_read, @@ -910,6 +914,7 @@ static XenPTRegInfo xen_pt_emu_reg_pcie[ .size = 2, .res_mask = 0xFFC0, .ro_mask= 0x0030, +.rw1c_mask = 0x000F, .init = xen_pt_common_reg_init, .u.w.read = xen_pt_word_reg_read, .u.w.write = xen_pt_word_reg_write, @@ -930,6 +935,7 @@ static XenPTRegInfo xen_pt_emu_reg_pcie[ .offset = PCI_EXP_LNKSTA, .size = 2, .ro_mask= 0x3FFF, +.rw1c_mask = 0xC000, .init = xen_pt_common_reg_init, .u.w.read = xen_pt_word_reg_read, .u.w.write = xen_pt_word_reg_write, @@ -966,26 +972,6 @@ static XenPTRegInfo xen_pt_emu_reg_pcie[ * Power Management Capability */ -/* write Power Management Control/Status register */ -static int xen_pt_pmcsr_reg_write(XenPCIPassthroughState *s, - XenPTReg *cfg_entry, uint16_t *val, - uint16_t dev_value, uint16_t valid_mask) -{ -XenPTRegInfo *reg = cfg_entry-reg; -uint16_t writable_mask = 0; -uint16_t throughable_mask = get_throughable_mask(s, reg, valid_mask); - -/* modify emulate register */ -writable_mask = reg-emu_mask ~reg-ro_mask valid_mask; -cfg_entry-data = XEN_PT_MERGE_VALUE(*val, cfg_entry-data, writable_mask); - -/* create value for writing to I/O device register */ -*val = XEN_PT_MERGE_VALUE(*val, dev_value ~PCI_PM_CTRL_PME_STATUS, - throughable_mask); - -return 0; -} - /* Power Management Capability reg static information table */ static XenPTRegInfo xen_pt_emu_reg_pm[] = { /* Next Pointer reg */ @@ -1016,11 +1002,12 @@ static XenPTRegInfo xen_pt_emu_reg_pm[] .size = 2, .init_val = 0x0008, .res_mask = 0x00F0, -.ro_mask= 0xE10C, +.ro_mask= 0x610C, +.rw1c_mask = 0x8000, .emu_mask = 0x810B, .init = xen_pt_common_reg_init, .u.w.read = xen_pt_word_reg_read, -.u.w.write = xen_pt_pmcsr_reg_write, +.u.w.write = xen_pt_word_reg_write, }, { .size = 0, I can see that the code change doesn't cause a change in behaviour for PCI_PM_CTRL, but it does for PCI_STATUS, PCI_EXP_DEVSTA and PCI_EXP_LNKSTA. Please explain why in the commit message. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 1/6] tools/libxl: rename libxl__domain_suspend to libxl__domain_save
On Wed, 2015-06-03 at 16:01 +0800, Yang Hongyang wrote: Rename libxl__domain_suspend() to libxl__domain_save() since it actually do the save domain work. This results in some strangeness in that some functions called *save* are now passed a struct called *suspend*. I think this is probably temporary and is all fixed up by the end of the series, is that true? If so then this temporary state affairs is: Acked-by: Ian Campbell ian.campb...@citrix.com ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 4/6] tools/libxl: move remus code into libxl_remus.c
On Wed, 2015-06-03 at 16:01 +0800, Yang Hongyang wrote: move remus code into libxl_remus.c. Please say something like ... by refactoring bits of libxl_domain_remus_start and domain_save_done into X and Y and moving the remaining functionality unchanged into the new file. I gave two examples of functions which changed there, but please make sure the list is complete+accurate. Signed-off-by: Yang Hongyang yan...@cn.fujitsu.com CC: Ian Campbell ian.campb...@citrix.com CC: Ian Jackson ian.jack...@eu.citrix.com CC: Wei Liu wei.l...@citrix.com --- tools/libxl/Makefile | 2 +- tools/libxl/libxl.c | 55 +--- tools/libxl/libxl_dom.c | 206 + tools/libxl/libxl_internal.h | 11 ++ tools/libxl/libxl_remus.c| 304 +++ 5 files changed, 318 insertions(+), 260 deletions(-) create mode 100644 tools/libxl/libxl_remus.c diff --git a/tools/libxl/Makefile b/tools/libxl/Makefile index 3f98d62..8535eaa 100644 --- a/tools/libxl/Makefile +++ b/tools/libxl/Makefile @@ -56,7 +56,7 @@ else LIBXL_OBJS-y += libxl_nonetbuffer.o endif -LIBXL_OBJS-y += libxl_remus_device.o libxl_remus_disk_drbd.o +LIBXL_OBJS-y += libxl_remus.o libxl_remus_device.o libxl_remus_disk_drbd.o LIBXL_OBJS-$(CONFIG_X86) += libxl_cpuid.o libxl_x86.o libxl_psr.o LIBXL_OBJS-$(CONFIG_ARM) += libxl_nocpuid.o libxl_arm.o libxl_libfdt_compat.o diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c index 77c6a36..0f9248e 100644 --- a/tools/libxl/libxl.c +++ b/tools/libxl/libxl.c @@ -792,10 +792,6 @@ out: return ptr; } -static void libxl__remus_setup_done(libxl__egc *egc, -libxl__remus_devices_state *rds, int rc); -static void libxl__remus_setup_failed(libxl__egc *egc, - libxl__remus_devices_state *rds, int rc); static void remus_failover_cb(libxl__egc *egc, libxl__domain_suspend_state *dss, int rc); @@ -844,63 +840,14 @@ int libxl_domain_remus_start(libxl_ctx *ctx, libxl_domain_remus_info *info, assert(info); -/* Convenience aliases */ -libxl__remus_devices_state *const rds = dss-rds; - -if (libxl_defbool_val(info-netbuf)) { -if (!libxl__netbuffer_enabled(gc)) { -LOG(ERROR, Remus: No support for network buffering); -rc = ERROR_FAIL; -goto out; -} -rds-device_kind_flags |= (1 LIBXL__DEVICE_KIND_VIF); -} - -if (libxl_defbool_val(info-diskbuf)) -rds-device_kind_flags |= (1 LIBXL__DEVICE_KIND_VBD); - -rds-ao = ao; -rds-domid = domid; -rds-callback = libxl__remus_setup_done; - /* Point of no return */ -libxl__remus_devices_setup(egc, rds); +libxl__remus_setup(egc, dss); return AO_INPROGRESS; out: return AO_ABORT(rc); } -static void libxl__remus_setup_done(libxl__egc *egc, -libxl__remus_devices_state *rds, int rc) -{ -libxl__domain_suspend_state *dss = CONTAINER_OF(rds, *dss, rds); -STATE_AO_GC(dss-ao); - -if (!rc) { -libxl__domain_save(egc, dss); -return; -} - -LOG(ERROR, Remus: failed to setup device for guest with domid %u, rc %d, -dss-domid, rc); -rds-callback = libxl__remus_setup_failed; -libxl__remus_devices_teardown(egc, rds); -} - -static void libxl__remus_setup_failed(libxl__egc *egc, - libxl__remus_devices_state *rds, int rc) -{ -libxl__domain_suspend_state *dss = CONTAINER_OF(rds, *dss, rds); -STATE_AO_GC(dss-ao); - -if (rc) -LOG(ERROR, Remus: failed to teardown device after setup failed - for guest with domid %u, rc %d, dss-domid, rc); - -dss-callback(egc, dss, rc); -} - static void remus_failover_cb(libxl__egc *egc, libxl__domain_suspend_state *dss, int rc) { diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c index 701e9f7..0f81081 100644 --- a/tools/libxl/libxl_dom.c +++ b/tools/libxl/libxl_dom.c @@ -1409,189 +1409,6 @@ int libxl__toolstack_save(uint32_t domid, uint8_t **buf, return 0; } -/*- remus callbacks -*/ -static void remus_domain_suspend_callback_common_done(libxl__egc *egc, -libxl__domain_suspend_state *dss, int ok); -static void remus_devices_postsuspend_cb(libxl__egc *egc, - libxl__remus_devices_state *rds, - int rc); -static void remus_devices_preresume_cb(libxl__egc *egc, - libxl__remus_devices_state *rds, - int rc); - -static void libxl__remus_domain_suspend_callback(void *data) -{ -libxl__save_helper_state *shs = data; -libxl__egc
Re: [Xen-devel] [PATCH v2 1/6] tools/libxl: rename libxl__domain_suspend to libxl__domain_save
On 06/16/2015 08:59 PM, Ian Campbell wrote: On Wed, 2015-06-03 at 16:01 +0800, Yang Hongyang wrote: Rename libxl__domain_suspend() to libxl__domain_save() since it actually do the save domain work. This results in some strangeness in that some functions called *save* are now passed a struct called *suspend*. I think this is probably temporary and is all fixed up by the end of the series, is that true? Yes, it is fixed by the refactor of the suspend state. If so then this temporary state affairs is: Acked-by: Ian Campbell ian.campb...@citrix.com . -- Thanks, Yang. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 11/27] tools/python: Libxc migration v2 infrastructure
On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote: Contains: * Python implementation of the libxc migration v2 records * Verification code for spec compliance * Unit tests Signed-off-by: Andrew Cooper andrew.coop...@citrix.com For this, and the following lobxl and verification patches: Acked-by: Ian Campbell ian.campb...@citrix.com (I've not read them closely on this pass, I recall looking at them before) ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 01/27] tools/libxl: Fix libxl__ev_child_inuse() check for not-yet-initialised children
On 16/06/15 14:47, Ian Jackson wrote: Andrew Cooper writes (Re: [PATCH 01/27] tools/libxl: Fix libxl__ev_child_inuse() check for not-yet-initialised children): It is possible that one bit fails before it can be calculated whether the second bit needs to start or not. At the moment, all bits in libxl in this area do initialisation immediately before use; most bits are even initialised in the function which starts their actions. Some bits are initialised differently depending on the path taken to get to the initialisation site. As a rule of thumb a function libxl__initiate_foo_ which takes a libxl__foo_state* should do this initialisation for the whole libxl__foo_state. I don't see why you can't do that. The only example of libxl__initiate_foo_ is libxl__initiate_device_remove() which starts the first action involved with removing a device. I will see what I can do, but there are areas of this code which can't have their initialisation brought any further forward. ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 14/27] tools/python: Conversion utility for legacy migration streams
On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote: This utility will take a legacy stream as in input, and produce a v2 stream as an output. It is exec()'d by libxl to provide backwards compatibility. Signed-off-by: Andrew Cooper andrew.coop...@citrix.com CC: Ian Campbell ian.campb...@citrix.com CC: Ian Jackson ian.jack...@eu.citrix.com CC: Wei Liu wei.l...@citrix.com --- tools/python/Makefile |4 + tools/python/scripts/convert-legacy-stream.py | 683 + 2 files changed, 687 insertions(+) create mode 100755 tools/python/scripts/convert-legacy-stream.py diff --git a/tools/python/Makefile b/tools/python/Makefile index e933be8..531c862 100644 --- a/tools/python/Makefile +++ b/tools/python/Makefile @@ -17,9 +17,13 @@ build: genwrap.py $(XEN_ROOT)/tools/libxl/libxl_types.idl \ .PHONY: install install: + $(INSTALL_DIR) $(DESTDIR)$(PRIVATE_BINDIR) + CC=$(CC) CFLAGS=$(PY_CFLAGS) $(PYTHON) setup.py install \ $(PYTHON_PREFIX_ARG) --root=$(DESTDIR) --force + $(INSTALL_PROG) scripts/convert-legacy-stream.py $(DESTDIR)$(PRIVATE_BINDIR) Please drop the .py on install, or from the source too if you prefer. With that: Acked-by: Ian Campbell ian.campb...@citrix.com ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 5/6] tools/libxl: move save/restore code into libxl_dom_save.c
On Wed, 2015-06-03 at 16:01 +0800, Yang Hongyang wrote: move save/restore code into libxl_dom_save.c. If this (unlike other patches in the series) is purely code motion please indicate that this is the case. You might also like to consider refactoring things such that all patches are pure motion. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 6/6] libxl/save: Refactor libxl__domain_suspend_state
On Wed, 2015-06-03 at 16:01 +0800, Yang Hongyang wrote: Currently struct libxl__domain_suspend_state contains 2 type of states, one is save state, another is suspend state. This patch separate it out. This patch separates those two out. The motivation of this is that COLO will need to do suspend/resume continuesly, we need a more common suspend state. continuously After this change, dss stands for libxl__domain_save_state, dsps stands for libxl__domain_suspend_state. Signed-off-by: Yang Hongyang yan...@cn.fujitsu.com CC: Ian Campbell ian.campb...@citrix.com CC: Ian Jackson ian.jack...@eu.citrix.com CC: Wei Liu wei.l...@citrix.com CC: Andrew Cooper andrew.coop...@citrix.com I presume this is almost entirely mechanical and the compiler mostly took care of identifying everywhere to be changed: Acked-by: Ian Campbell ian.campb...@citrix.com [...] @@ -2839,9 +2840,27 @@ typedef struct libxl__logdirty_switch { } libxl__logdirty_switch; struct libxl__domain_suspend_state { [...] +struct libxl__domain_save_state { Please could you add a little comment before each of these explaining their scope for the hard of thinking (i.e. me). One is the live phase and one is the final stop-and-copy phase, I think. Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH V4 3/7] libxl: add pvusb API
On Tue, Jun 16, 2015 at 1:02 PM, Ian Jackson ian.jack...@eu.citrix.com wrote: George Dunlap writes (Re: [Xen-devel] [PATCH V4 3/7] libxl: add pvusb API): So it sounds like we're converging on Allow multiple ways to specify the interface, with at least the following fields: - bus (int - 1,2,3, c) - port (string - 2.1.3, c) - address/devnum (int) - vendorid (uint16_t) - deviceid (uint16_t) You're missing the full device path from that list. Typically that will include at least one pci bus address. I'm not sure what constitutes a full device path. Is that defined somewhere? Remember that the path you gave in your previous e-mail isn't the path for the *usb device*, it's the path for the *block device*. It contains a PCI address, but it looks like it also contains part of the USB topology. Are you sure that's actually a stable interface, or does it just happen that on your hardware the discovery always happens in the same order? On my system /sys/bus/usb/devices/2-3.3 is a link to /sys/devices/pci:00/:00:1d.7/usb2/2-3/2-3.3/. This contains the pci bus address, but it also contains the bus number, which we've just said may be unstable across reboots. I suppose it might be possible to specify buspci,port -- the pci address of the root bus, and the topology from there. In theory I guess that should be stable? In any case, at the moment you're essentially inventing from whole cloth a new way of specifying USB devices that (as far as I know) isn't supported by any other program that uses USB. -George ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 6/6] libxl/save: Refactor libxl__domain_suspend_state
Yang Hongyang writes ([PATCH v2 6/6] libxl/save: Refactor libxl__domain_suspend_state): Currently struct libxl__domain_suspend_state contains 2 type of states, one is save state, another is suspend state. This patch separate it out. The motivation of this is that COLO will need to do suspend/resume continuesly, we need a more common suspend state. Currently in libxl/libxc/etc. suspend and save have referred to the same thing, but different terminology has been used at different layers. Ie both suspend and save each refer to either or both of save to disk or suspend for live migrate, or to the relevant underlying mechanisms. So I'm not sure introducing a distinction between those two terms in libxl is really helpful. If it is to be done there should be a clear explanation of what the difference is. On IRC you said: 14:21 yanghy Diziet, currently, in libxl, suspend is used as 2 means, one is save(corrspond to libxc save), another is suspend the guest(related to suspend callback) That's rather different, I think. Or, at least, I'm not sure that I understand this distinction as you are making it. The suspend callback is part of the implementation of what at higher layers we save/suspend/restore/migration. AIUI this callback is related to pausing the guest, or manipulating its VCPUs ? Perhaps we should rename this callback ? Maybe Andrew Cooper can suggest a name ? Forgive me if I'm confused and going off in the wrong direction... Thanks, Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 05/27] tools/libxl: Introduce ROUNDUP()
On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote: This is the same as is used by libxc. Signed-off-by: Andrew Cooper andrew.coop...@citrix.com CC: Ian Campbell ian.campb...@citrix.com CC: Ian Jackson ian.jack...@eu.citrix.com CC: Wei Liu wei.l...@citrix.com Acked-by: Ian Campbell ian.campb...@citrix.com ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 5/6] xen/pass-through: log errno values rather than function return ones
On Fri, 5 Jun 2015, Jan Beulich wrote: Functions setting errno commonly return just -1, which is of no particular use in the log file. Signed-off-by: Jan Beulich jbeul...@suse.com Reviewed-by: Stefano Stabellini stefano.stabell...@eu.citrix.com --- a/qemu/upstream/hw/xen/xen_pt.c +++ b/qemu/upstream/hw/xen/xen_pt.c @@ -609,8 +609,8 @@ static void xen_pt_region_update(XenPCIP guest_port, machine_port, size, op); if (rc) { -XEN_PT_ERR(d, %s ioport mapping failed! (rc: %i)\n, - adding ? create new : remove old, rc); +XEN_PT_ERR(d, %s ioport mapping failed! (err: %i)\n, + adding ? create new : remove old, errno); } } else { pcibus_t guest_addr = sec-offset_within_address_space; @@ -623,8 +623,8 @@ static void xen_pt_region_update(XenPCIP XEN_PFN(size + XC_PAGE_SIZE - 1), op); if (rc) { -XEN_PT_ERR(d, %s mem mapping failed! (rc: %i)\n, - adding ? create new : remove old, rc); +XEN_PT_ERR(d, %s mem mapping failed! (err: %i)\n, + adding ? create new : remove old, errno); } } } @@ -738,8 +738,8 @@ static int xen_pt_initfn(PCIDevice *d) rc = xc_physdev_map_pirq(xen_xc, xen_domid, machine_irq, pirq); if (rc 0) { -XEN_PT_ERR(d, Mapping machine irq %u to pirq %i failed, (rc: %d)\n, - machine_irq, pirq, rc); +XEN_PT_ERR(d, Mapping machine irq %u to pirq %i failed, (err: %d)\n, + machine_irq, pirq, errno); /* Disable PCI intx assertion (turn on bit10 of devctl) */ cmd |= PCI_COMMAND_INTX_DISABLE; @@ -760,8 +760,8 @@ static int xen_pt_initfn(PCIDevice *d) PCI_SLOT(d-devfn), e_intx); if (rc 0) { -XEN_PT_ERR(d, Binding of interrupt %i failed! (rc: %d)\n, - e_intx, rc); +XEN_PT_ERR(d, Binding of interrupt %i failed! (err: %d)\n, + e_intx, errno); /* Disable PCI intx assertion (turn on bit10 of devctl) */ cmd |= PCI_COMMAND_INTX_DISABLE; @@ -770,7 +770,7 @@ static int xen_pt_initfn(PCIDevice *d) if (xen_pt_mapped_machine_irq[machine_irq] == 0) { if (xc_physdev_unmap_pirq(xen_xc, xen_domid, machine_irq)) { XEN_PT_ERR(d, Unmapping of machine interrupt %i failed! -(rc: %d)\n, machine_irq, rc); +(err: %d)\n, machine_irq, errno); } } s-machine_irq = 0; @@ -808,9 +808,9 @@ static void xen_pt_unregister_device(PCI 0 /* isa_irq */); if (rc 0) { XEN_PT_ERR(d, unbinding of interrupt INT%c failed. -(machine irq: %i, rc: %d) +(machine irq: %i, err: %d) But bravely continuing on..\n, - 'a' + intx, machine_irq, rc); + 'a' + intx, machine_irq, errno); } } @@ -828,9 +828,9 @@ static void xen_pt_unregister_device(PCI rc = xc_physdev_unmap_pirq(xen_xc, xen_domid, machine_irq); if (rc 0) { -XEN_PT_ERR(d, unmapping of interrupt %i failed. (rc: %d) +XEN_PT_ERR(d, unmapping of interrupt %i failed. (err: %d) But bravely continuing on..\n, - machine_irq, rc); + machine_irq, errno); } } } --- a/qemu/upstream/hw/xen/xen_pt_msi.c +++ b/qemu/upstream/hw/xen/xen_pt_msi.c @@ -133,8 +133,8 @@ static int msi_msix_setup(XenPCIPassthro msix_entry, table_base); if (rc) { XEN_PT_ERR(s-dev, - Mapping of MSI%s (rc: %i, vec: %#x, entry %#x)\n, - is_msix ? -X : , rc, gvec, msix_entry); + Mapping of MSI%s (err: %i, vec: %#x, entry %#x)\n, + is_msix ? -X : , errno, gvec, msix_entry); return rc; } } @@ -167,12 +167,12 @@ static int msi_msix_update(XenPCIPassthr pirq, gflags, table_addr); if (rc) { -XEN_PT_ERR(d, Updating of MSI%s failed. (rc: %d)\n, - is_msix ? -X : , rc); +XEN_PT_ERR(d, Updating of MSI%s failed. (err: %d)\n, + is_msix ? -X : , errno); if (xc_physdev_unmap_pirq(xen_xc, xen_domid, *old_pirq)) { -XEN_PT_ERR(d, Unmapping of MSI%s pirq %d
[Xen-devel] [Patch V3 2/3] usb: Introduce Xen pvUSB frontend (xen hcd)
Introduces the Xen pvUSB frontend. With pvUSB it is possible for a Xen domU to communicate with a USB device assigned to that domU. The communication is all done via the pvUSB backend in a driver domain (usually Dom0) which is owner of the physical device. The pvUSB frontend is a USB hcd for a virtual USB host connector. The code is taken from the pvUSB implementation in Xen done by Fujitsu based on Linux kernel 2.6.18. Changes from the original version are: - port to upstream kernel - put all code in just one source file - move module to appropriate location in kernel tree - adapt to Linux style guide - minor code modifications to increase readability Signed-off-by: Juergen Gross jgr...@suse.com --- drivers/usb/host/Kconfig | 11 + drivers/usb/host/Makefile |1 + drivers/usb/host/xen-hcd.c | 1638 3 files changed, 1650 insertions(+) create mode 100644 drivers/usb/host/xen-hcd.c diff --git a/drivers/usb/host/Kconfig b/drivers/usb/host/Kconfig index 197a6a3..3361b4b 100644 --- a/drivers/usb/host/Kconfig +++ b/drivers/usb/host/Kconfig @@ -788,3 +788,14 @@ config USB_HCD_TEST_MODE This option is of interest only to developers who need to validate their USB hardware designs. It is not needed for normal use. If unsure, say N. + +config USB_XEN_HCD + tristate Xen usb virtual host driver + depends on XEN + select XEN_XENBUS_FRONTEND + help + The Xen usb virtual host driver serves as a frontend driver enabling + a Xen guest system to access USB Devices passed through to the guest + by the Xen host (usually Dom0). + Only needed if the kernel is running in a Xen guest and generic + access to a USB device is needed. diff --git a/drivers/usb/host/Makefile b/drivers/usb/host/Makefile index 65b0b6a..3779696 100644 --- a/drivers/usb/host/Makefile +++ b/drivers/usb/host/Makefile @@ -75,3 +75,4 @@ obj-$(CONFIG_USB_HCD_SSB) += ssb-hcd.o obj-$(CONFIG_USB_FUSBH200_HCD) += fusbh200-hcd.o obj-$(CONFIG_USB_FOTG210_HCD) += fotg210-hcd.o obj-$(CONFIG_USB_MAX3421_HCD) += max3421-hcd.o +obj-$(CONFIG_USB_XEN_HCD) += xen-hcd.o diff --git a/drivers/usb/host/xen-hcd.c b/drivers/usb/host/xen-hcd.c new file mode 100644 index 000..9da2856 --- /dev/null +++ b/drivers/usb/host/xen-hcd.c @@ -0,0 +1,1638 @@ +/* + * xen-hcd.c + * + * Xen USB Virtual Host Controller driver + * + * Copyright (C) 2009, FUJITSU LABORATORIES LTD. + * Author: Noboru Iwamatsu n_iwama...@jp.fujitsu.com + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, see http://www.gnu.org/licenses/. + * + * Or, by your choice: + * + * When distributed separately from the Linux kernel or incorporated into + * other software packages, subject to the following license: + * + * Permission is hereby granted, free of charge, to any person obtaining a copy + * of this software and associated documentation files (the Software), to + * deal in the Software without restriction, including without limitation the + * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or + * sell copies of the Software, and to permit persons to whom the Software is + * furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE + * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER + * DEALINGS IN THE SOFTWARE. + */ + +#include linux/module.h +#include linux/usb.h +#include linux/list.h +#include linux/usb/hcd.h +#include linux/io.h + +#include xen/xen.h +#include xen/xenbus.h +#include xen/grant_table.h +#include xen/events.h +#include xen/page.h + +#include xen/interface/io/usbif.h + +/* Private per-URB data */ +struct urb_priv { + struct list_head list; + struct urb *urb; + int req_id; /* RING_REQUEST id for submitting */ + int unlink_req_id; /* RING_REQUEST id for unlinking */ +
Re: [Xen-devel] [PATCH 19/27] tools/libxc+libxl+xl: Restore v2 streams
On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote: @@ -377,6 +384,28 @@ static void record_body_done(libxl__egc *egc, stream_failed(egc, stream, ret); } +void libxl__xc_domain_restore_done(libxl__egc *egc, void *dcs_void, + int ret, int retval, int errnoval) +{ +libxl__domain_create_state *dcs = dcs_void; +STATE_AO_GC(dcs-ao); + +if (ret) +goto err; + +if (retval) { +LOGEV(ERROR, errnoval, restoring domain); +ret = ERROR_FAIL; +goto err; +} + +libxl__stream_read_continue(egc, dcs-srs); continue? Is this something to do with checkpointing? diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl index 23f27d4..7418d92 100644 --- a/tools/libxl/libxl_types.idl +++ b/tools/libxl/libxl_types.idl @@ -346,6 +346,8 @@ libxl_domain_create_info = Struct(domain_create_info,[ libxl_domain_restore_params = Struct(domain_restore_params, [ At some point we will need a LIBXL_HAVE #define. (checkpointed_stream, integer), +(stream_version, uint32, {'init_val': '1'}), If we aren't going to go for an IDL enum rather than a uint32 you probably just want the bare integer 1. But, I suspect we would prefer an enum, i.e an explicit list of known versions, rather than an integer? I wonder when, if ever, we will be able to flip this to 2? I suppose whenever the legacy conversion stuff gets pulled out. +(legacy_width, uint32), From what I've seen so far this is never user provided but is internal to libxl and detected[0] at runtime. As such it belongs somewhere else other than in the public API. [0] FVO detected == hardcoded depending on the build arch ]) libxl_domain_sched_params = Struct(domain_sched_params,[ diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c index ddb293c..14d96c9 100644 --- a/tools/libxl/xl_cmdimpl.c +++ b/tools/libxl/xl_cmdimpl.c @@ -110,7 +110,9 @@ #define XL_MANDATORY_FLAG_JSON (1U 0) /* config data is in JSON format */ #define XL_MANDATORY_FLAG_STREAMv2 (1U 1) /* stream is v2 */ -#define XL_MANDATORY_FLAG_ALL (XL_MANDATORY_FLAG_JSON) +#define XL_MANDATORY_FLAG_ALL (XL_MANDATORY_FLAG_JSON |\ +XL_MANDATORY_FLAG_STREAMv2) + struct save_file_header { char magic[32]; /* savefileheader_magic */ /* All uint32_ts are in domain's byte order. */ @@ -2724,6 +2726,9 @@ static uint32_t create_domain(struct domain_create *dom_info) libxl_domain_restore_params_init(params); params.checkpointed_stream = dom_info-checkpointed_stream; +params.stream_version = +(hdr.mandatory_flags XL_MANDATORY_FLAG_STREAMv2) ? 2 : 1; + ret = libxl_domain_create_restore(ctx, d_config, domid, restore_fd, params, ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH V4 3/7] libxl: add pvusb API
On Tue, Jun 16, 2015 at 12:25 PM, Juergen Gross jgr...@suse.com wrote: Yes, they have. The question is whether those are different on multiple instances? With lsusb I've found a device with serial number 0123456789ABCD. Do you really believe I'm just lucky owning the one with such a nice serial number? ;-) FWIW apparently the spec for USB printers requires a unique serial number; so there are probably cases where serial number is actually useful. -George ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 16/27] tools/libxl: Infrastructure for reading a libxl migration v2 stream
On Tue, 2015-06-16 at 16:01 +0100, Andrew Cooper wrote: On 16/06/15 15:31, Ian Campbell wrote: On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote: From: Ross Lagerwall ross.lagerw...@citrix.com Signed-off-by: Ross Lagerwall ross.lagerw...@citrix.com Signed-off-by: Andrew Cooper andrew.coop...@citrix.com CC: Ian Campbell ian.campb...@citrix.com CC: Ian Jackson ian.jack...@eu.citrix.com CC: Wei Liu wei.l...@citrix.com Overall looks good, I've got some comments below and I think it almost certainly wants eyes from Ian who knows more about the dc infra etc. +void libxl__stream_read_start(libxl__egc *egc, + libxl__stream_read_state *stream) +{ +libxl__datacopier_state *dc = stream-dc; +int ret = 0; + +/* State initialisation. */ +assert(!stream-running); + +memset(dc, 0, sizeof(*dc)); libxl__datacopier_init, please That call is made by libxl__datacopier_start() each and every time, and unlike here, is matched with an equivalent _kill() call. Hrm, I think I'd best defer to Ian J on what the right way to deal with a dc being setup and resused in this way is. +/* Start reading the stream header. */ +dc-readwhat = stream header; +dc-readbuf = stream-hdr; +stream-expected_len = dc-bytes_to_read = sizeof(stream-hdr); +dc-used = 0; +dc-callback = stream_header_done; This pattern of resetting and reinitialising the dc occurs in multiple places, I think a helper would be in order, some sort of stream_next_record_init or something perhaps? The only feasible helper would have to take everything as parameters; there is insufficient similarity between all users. I dunno whether that would be harder to read... I was more concerned about ensuring everyone does everything (especially if a new field gets added), having a function with parameters would cause a compile failure when the addition was (hopefully) propagated to that function and added as a parameter. Lets see what Ian thinks. +void libxl__stream_read_abort(libxl__egc *egc, + libxl__stream_read_state *stream, int rc) +{ +stream_failed(egc, stream, rc); +} + +static void stream_success(libxl__egc *egc, libxl__stream_read_state *stream) +{ +stream-rc = 0; +stream-running = false; + +stream_done(egc, stream); Push the running = false into stream_done and flip the assert there? Logically the stream is still running until it is done, so having done assert it isn't running seems counter-intuitive. This is more for piece of mind. stream_done() my strictly only ever be called once, hence its assert. assert(stream-running); stream-running = false in stream_done() gives the same piece of mind, doesn't it? +if (onwrite || dc-used != stream-expected_len) { +ret = ERROR_FAIL; +LOG(ERROR, write %d, err %d, expected %zu, got %zu, +onwrite, errnoval, stream-expected_len, dc-used); +goto err; +} I think you need to check errnoval == 0 in the !onwrite case, otherwise you may miss a read error? dc-used != stream-expected_len covers all possible read errors, in the something went wrong kind of way. With the current implementation, perhaps, but the doc doesn't seem to say you can rely on it (an appropriate reaction might be to change the doc rather than the code, I don't mind). Sharing the dc between al these differing usages is starting to rankle a little, but I think it is necessary because it may have queued data from a previous read which was larger than the current record, correct? Hrm, isn't setting dc-used = 0 on each reset potentially throwing some stuff away? We should never be in a case where we are setting up a new read/write from the dc with any previous IO pending. Essentially because dc uses its idea of bytes remaining to do the read, the scenario I was imagining only comes about if the dc is reading large blocks and segmenting them later, which isn't how it is used here. +if (dc-writefd == -1) { +ret = ERROR_FAIL; +LOGE(ERROR, Unable to open '%s', path); +goto err; +} +dc-maxsz = dc-bytes_to_read = rec_hdr-length - sizeof(*emu_hdr); +stream-expected_len = dc-used = 0; expecting 0? This differs from the pattern common everywhere else and I'm not sure why. The datacopier has been overloaded so many times, it is messy to use. In this case, we are splicing from read fd to a write fd, rather than to a local buffer. Therefore, when the IO is complete, we expect 0 bytes in the local buffer, as all data should end up in the fd. I think using 2 or more data copiers to cover the different configurations might help? You can still reuse one for the normal record processing but a separate dedicated one for writing the emu to a file might iron out a wrinkle. +memset(dc, 0,
Re: [Xen-devel] [Draft F] Xen on ARM vITS Handling
On 16/06/15 16:10, Ian Campbell wrote: This could happen when the device is not quiescent. We had this issue on the vexpress at boot time when the network card was trying to send an interrupt before DOM0 is setup. I don't fully understand the issue you are trying to describe, but do you want to propose a change to the spec? I actually don't know how to modify it. So it's an open question. For SPI too, or just for LPI? Only LPI. How is it fixed for SPI? We queue the SPI but don't inject it (gic_raise_guest_irq) to the guest. The injection will be done when the guest enable the IRQ. We can't use it for LPIs as the mapping vLPI - pLPI may not yet exists. vgic_vcpu_inject_irq doesn't queue the interrupt if a VCPU is down. I think this is because the state of the VCPU wouldn't be correct. The process would be something like: - Creation of the domain = All vCPUs are down - Device is assigned to the guest = Enable physical LPIs * physical LPI is received * = Will be ignored and not EOIed (VCPU0 is down) = The LPI will never fired again during the guest life - Domain is started by the toolstack = VCPU0 is online Is it sufficient to queue interrupts even for VCPUs which are down? How does the lack of a vITT entry when this interrupt occurred affect this? Well, in this case we don't know on which vLPI we have to inject it. But as said above, I guess we don't care if we ensure that the device can't send an event (by ensuring that the device doesn't know the GITS_TRANSLATER address is). Yes, I think that works. I'm not sure where to spell that out though. We want to find a way to avoid the PCI device to send an event. I see multiple possibility to do it: 1) Mask the event MSI-X allow to mask a specific event. There is a similar feature for MSI but it's optional. 2) Compose the MSI message only when the vLPI is enabled. That require Xen/PCI-back to delay the write in the config space. None of this solution satisfy me. But I don't have much knowledge on how PCI-passthrough works on Xen. Regards, -- Julien Grall ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 6/6] xen/pass-through: constify some static data
On 16.06.15 at 16:27, stefano.stabell...@eu.citrix.com wrote: On Fri, 5 Jun 2015, Jan Beulich wrote: This is done indirectly by adjusting two typedefs and helps emphasizing that the respective tables aren't supposed to be modified at runtime (as they may be shared between devices). Signed-off-by: Jan Beulich jbeul...@suse.com Reviewed-by: Stefano Stabellini stefano.stabell...@eu.citrix.com Perhaps patch 5 and 6 could go in right away, without waiting for the issues on the other ones addressed? After all, it may well be that patch 1 will be dropped following the discussion on the hypervisor side patch series... Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 16/27] tools/libxl: Infrastructure for reading a libxl migration v2 stream
On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote: From: Ross Lagerwall ross.lagerw...@citrix.com Signed-off-by: Ross Lagerwall ross.lagerw...@citrix.com Signed-off-by: Andrew Cooper andrew.coop...@citrix.com CC: Ian Campbell ian.campb...@citrix.com CC: Ian Jackson ian.jack...@eu.citrix.com CC: Wei Liu wei.l...@citrix.com Overall looks good, I've got some comments below and I think it almost certainly wants eyes from Ian who knows more about the dc infra etc. +void libxl__stream_read_start(libxl__egc *egc, + libxl__stream_read_state *stream) +{ +libxl__datacopier_state *dc = stream-dc; +int ret = 0; + +/* State initialisation. */ +assert(!stream-running); + +memset(dc, 0, sizeof(*dc)); libxl__datacopier_init, please +dc-ao = stream-ao; +dc-readfd = stream-fd; +dc-writefd = -1; + +/* Start reading the stream header. */ +dc-readwhat = stream header; +dc-readbuf = stream-hdr; +stream-expected_len = dc-bytes_to_read = sizeof(stream-hdr); +dc-used = 0; +dc-callback = stream_header_done; This pattern of resetting and reinitialising the dc occurs in multiple places, I think a helper would be in order, some sort of stream_next_record_init or something perhaps? +void libxl__stream_read_abort(libxl__egc *egc, + libxl__stream_read_state *stream, int rc) +{ +stream_failed(egc, stream, rc); +} + +static void stream_success(libxl__egc *egc, libxl__stream_read_state *stream) +{ +stream-rc = 0; +stream-running = false; + +stream_done(egc, stream); Push the running = false into stream_done and flip the assert there? Logically the stream is still running until it is done, so having done assert it isn't running seems counter-intuitive. +static void stream_done(libxl__egc *egc, +libxl__stream_read_state *stream) +{ +libxl__domain_create_state *dcs = CONTAINER_OF(stream, *dcs, srs); + +assert(!stream-running); + +stream-completion_callback(egc, dcs, stream-rc); +} + +static void stream_header_done(libxl__egc *egc, + libxl__datacopier_state *dc, + int onwrite, int errnoval) +{ +libxl__stream_read_state *stream = CONTAINER_OF(dc, *stream, dc); +libxl_sr_hdr *hdr = stream-hdr; +STATE_AO_GC(dc-ao); +int ret = 0; + +if (onwrite || dc-used != stream-expected_len) { +ret = ERROR_FAIL; +LOG(ERROR, write %d, err %d, expected %zu, got %zu, +onwrite, errnoval, stream-expected_len, dc-used); +goto err; +} I think you need to check errnoval == 0 in the !onwrite case, otherwise you may miss a read error? Also it looks like onwrite can be -1, which is a separate error case. + +static void record_header_done(libxl__egc *egc, + libxl__datacopier_state *dc, + int onwrite, int errnoval) +{ +libxl__stream_read_state *stream = CONTAINER_OF(dc, *stream, dc); +libxl_sr_rec_hdr *rec_hdr = stream-rec_hdr; +STATE_AO_GC(dc-ao); +int ret = 0; + +if (onwrite || dc-used != stream-expected_len) { +ret = ERROR_FAIL; +LOG(ERROR, write %d, err %d, expected %zu, got %zu, +onwrite, errnoval, stream-expected_len, dc-used); +goto err; +} Same comments wrt the arguments as the previous one. Maybe a common helper to check (and log) the status at the head of each callback? So you can effectively do if (!everything_ok(stream, dc) goto err? +assert(!ret); +if (rec_hdr-length) { +free(stream-rec_body); +stream-rec_body = NULL; reset length too? +static void read_emulator_body(libxl__egc *egc, + libxl__stream_read_state *stream) +{ +libxl__domain_create_state *dcs = CONTAINER_OF(stream, *dcs, srs); +libxl__datacopier_state *dc = stream-dc; +libxl_sr_rec_hdr *rec_hdr = stream-rec_hdr; +libxl_sr_emulator_hdr *emu_hdr = stream-rec_body; +STATE_AO_GC(stream-ao); +char path[256]; +int ret = 0; + +sprintf(path, XC_DEVICE_MODEL_RESTORE_FILE.%u, dcs-guest_domid); + +dc-readwhat = save/migration stream; +dc-copywhat = emulator context; +dc-writewhat = qemu save file; +dc-readbuf = NULL; +dc-writefd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0666); Since it this is all done in the same process (or children of it) with not setuid etc, I think 0600 would be better to avoid accidentally leaving the save state world readable (just in case it matters). Also, should consider whether this fd needs to be subject to the carefd machinery. Sharing the dc between al these differing usages is starting to rankle a little, but I think it is necessary because it may have queued data from a previous read which was larger than the current record, correct? Hrm, isn't
Re: [Xen-devel] [PATCH 1/6] xen/MSI-X: latch MSI-X table writes
On Tue, 16 Jun 2015, Jan Beulich wrote: On 16.06.15 at 15:35, stefano.stabell...@eu.citrix.com wrote: On Fri, 5 Jun 2015, Jan Beulich wrote: @@ -322,6 +323,13 @@ static int xen_pt_msix_update_one(XenPCI pirq = entry-pirq; +if (pirq == XEN_PT_UNASSIGNED_PIRQ || s-msix-maskall || +(entry-latch(VECTOR_CTRL) PCI_MSIX_ENTRY_CTRL_MASKBIT)) { I admit I am having difficulties understanding the full purpose of these checks. Please add a comment on them. The comment would (pointlessly imo) re-state what the code already says: I guess the intention is only to make changes using the latest values, the ones in entry-latch, when the right conditions are met, otherwise keep using the old values. Is that right? In that case, don't we want to use the latest values on MASKBIT - !MASKBIT transitions? In general when unmasking? This is what we want. And with that, the questions you ask further down should be answered too: The function gets invoked with the pre-change mask flag state in -latch[], and updates the values used for actually setting up when that one has the entry masked (or mask-all is set). The actual new value gets written to -latch[] after the call. I think this logic is counter-intuitive and prone to confuse the reader. This change doesn't make sense on its own: when one will read xen_pt_msix_update_one, won't be able to understand the function without checking the call sites. Could we turn it around to be more obvious? Here check if !(entry-latch(VECTOR_CTRL) PCI_MSIX_ENTRY_CTRL_MASKBIT) and below only call xen_pt_msix_update_one on MASKBIT - !MASKBIT transactions? Or something like that? @@ -444,39 +432,28 @@ static void pci_msix_write(void *opaque, offset = addr % PCI_MSIX_ENTRY_SIZE; if (offset != PCI_MSIX_ENTRY_VECTOR_CTRL) { -const volatile uint32_t *vec_ctrl; - if (get_entry_value(entry, offset) == val entry-pirq != XEN_PT_UNASSIGNED_PIRQ) { return; } +entry-updated = true; +} else if (msix-enabled entry-updated + !(val PCI_MSIX_ENTRY_CTRL_MASKBIT)) { +const volatile uint32_t *vec_ctrl; + /* * If Xen intercepts the mask bit access, entry-vec_ctrl may not be * up-to-date. Read from hardware directly. */ vec_ctrl = s-msix-phys_iomem_base + entry_nr * PCI_MSIX_ENTRY_SIZE + PCI_MSIX_ENTRY_VECTOR_CTRL; +set_entry_value(entry, offset, *vec_ctrl); Why are you calling set_entry_value with the hardware vec_ctrl value? It doesn't look correct to me. In any case, if you wanted to do it, shouldn't you just set/unset PCI_MSIX_ENTRY_CTRL_MASKBIT instead of the whole *vec_ctrl? The comment above the code explains it: What we have stored locally may not reflect reality, as we may not have seen all writes (and this indeed isn't just a may). And if out cached value isn't valid anymore, why would we not want to update all of it, rather than just the mask bit? OK, however the previous code wasn't actually updating the entirety of vector_ctrl. It was just using the updated value to check for PCI_MSIX_ENTRY_CTRL_MASKBIT. This is something else. The new behavior might be correct, but at least the commit message needs to explain it. -if (msix-enabled !(*vec_ctrl PCI_MSIX_ENTRY_CTRL_MASKBIT)) { -if (!entry-warned) { -entry-warned = true; -XEN_PT_ERR(s-dev, Can't update msix entry %d since MSI-X is -already enabled.\n, entry_nr); -} -return; -} - -entry-updated = true; +xen_pt_msix_update_one(s, entry_nr); Shouldn't we call xen_pt_msix_update_one only if (*vec_ctrl PCI_MSIX_ENTRY_CTRL_MASKBIT)? In other words, only when we see a MASKBIT - !MASKBIT transition? The combination of the !(val PCI_MSIX_ENTRY_CTRL_MASKBIT) check in the if() surrounding this call and the (entry-latch(VECTOR_CTRL) PCI_MSIX_ENTRY_CTRL_MASKBIT) check inside the function guarantee just that (i.e. the function invocation is benign in the other case, as entry-addr/entry-data would remain unchanged). OK, maybe the code works as is, but it took me a long time to make sense of it because it relies on the combinations of three checks in three different places. I would prefer to change it into something more obvious. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [Patch V3 2/3] usb: Introduce Xen pvUSB frontend (xen hcd)
On Tue, Jun 16, 2015 at 04:56:35PM +0200, Juergen Gross wrote: Hmm, I'm beginning to question the value of that file. May be I should just throw it away. Given that you are going to have to justify every one of them, and ensure that you are creating them in a race-free manner (I didn't look to check), you might just drop it if they are not needed. There's always debugfs for things like generic debugging stuff. If you want to do that, use the usb subdir in debugfs for your directory and files. thanks, greg k-h ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v3 2/6] libxl: do not add a vkb backend to hvm guests
On Wed, Jun 10, 2015 at 11:09:50AM +0100, Stefano Stabellini wrote: When QEMU restricts its xenstore connection, it cannot provide PV backends. A separate QEMU instance is required to provide PV backends in userspace, such as qdisk. With two separate instances, it is not possible to take advantage of vkb for mouse and keyboard, as the QEMU that emulates the graphic card (the device model), would be separate from the QEMU running the vkb backend (PV QEMU). The question is that how would this affect the non-split setup. But do I understand correctly we are moving towards a split setup regardlessly whether QEMU supports xsrestrict or not? I.e. there is no non-split setup in the future. Wei. Signed-off-by: Stefano Stabellini stefano.stabell...@eu.citrix.com --- tools/libxl/libxl_create.c |5 - 1 file changed, 5 deletions(-) diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c index f0da7dc..a74b340 100644 --- a/tools/libxl/libxl_create.c +++ b/tools/libxl/libxl_create.c @@ -1275,17 +1275,12 @@ static void domcreate_launch_dm(libxl__egc *egc, libxl__multidev *multidev, { libxl__device_console console; libxl__device device; -libxl_device_vkb vkb; init_console_info(gc, console, 0); console.backend_domid = state-console_domid; libxl__device_console_add(gc, domid, console, state, device); libxl__device_console_dispose(console); -libxl_device_vkb_init(vkb); -libxl__device_vkb_add(gc, domid, vkb); -libxl_device_vkb_dispose(vkb); - dcs-dmss.dm.guest_domid = domid; if (libxl_defbool_val(d_config-b_info.device_model_stubdomain)) libxl__spawn_stub_dm(egc, dcs-dmss); -- 1.7.10.4 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 23/27] tools/libxl: [RFC] Write checkpoint records into the stream
On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote: when signalled to do so by libxl__remus_domain_checkpoint_callback() I think I saw that Remus wasn't currently working, so I'll let you and Hongyang thrash something out before I spend too much effort reviewing these last few RFC bits. Unless you think it is worth my having a look now? ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 22/27] docs/libxl: [RFC] Introduce CHECKPOINT_END to support migration v2 remus streams
On 16/06/15 16:00, Ian Campbell wrote: On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote: In a remus senario, libxc will write a CHECKPOINT record, then hand ownership scenario of the fd to libxl. Libxl then writes any records required and finishes with a CHECKPOINT_END record, then hands ownership of the fd back to libxc. Seems like a plausible scheme to me, if that's what the RFC was for. The RFC was for all the support remus bits, where I wrote code but was unable to test. They will all be dropped for v2, once the suggested adjustments are done. ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 17/27] tools/libxl: Support converting a legacy stream to a v2 stream
On Tue, 2015-06-16 at 16:13 +0100, Andrew Cooper wrote: Since we only support N-N+1 we could perhaps tolerate the duplication if we agreed upon a reasonable schedule to remove all this compat stuff, e.g. in 4.7 or 4.8. N-N+1 is another item on the needs to be resolved before Xapi can use libxl list. In XenServer, we still have support for importing VMs from XenServer 4.0, which is certainly older than Xen 3.2 OK. FWIW I think we should consider importing (i.e. restore) separate to migration, since the former can be subjected to more heavy duty offline conversions if necessary. It's not out of the question that this new migration stream format will allow us to sensibly increase the +1 to some larger number for migrations too. Ian. It might be reasonable to make a compile time option to omit the legacy conversion, but conversion should absolutely be on by default for the first release after this series being accepted. ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] Revert xen-hvm: increase maxmem before calling xc_domain_populate_physmap
On 06/16/2015 04:51 PM, Stefano Stabellini wrote: On Tue, 16 Jun 2015, Wei Liu wrote: On Wed, Jun 10, 2015 at 01:55:13PM +0100, George Dunlap wrote: This reverts commit c1d322e6048796296555dd36fdd102d7fa2f50bf. The original commit fixes a bug when assigning a large number of devices which require option roms to a guest. (One known configuration that needs extra memory is having more than 4 emulated NICs assigned. Three or fewer NICs seems to work without this functionality.) However, by unilaterally increasing maxmem, it introduces two problems. First, now libxl's calculation of the required maxmem during migration is broken -- any guest which exercised this functionality will fail on migration. (Guests which have the default number of devices are not affected.) Secondly, it makes it impossible for a higher-level toolstack or administer to predict how much memory a VM will actually use, making it much more difficult to effectively use all of the memory on a machine. The right solution to the original problem is to figure out a way for qemu to take pages from the existing pool of guest memory, rather than allocating more pages. That fix will take more time to develop than we have until the feature freeze. In the mean time, the simplest way to fix the migration issue is to revert this change. That will re-introduce the original bug, but it's an unusual corner case; and without migration it isn't fully functional yet anyway. Signed-off-by: George Dunlap george.dun...@eu.citrix.com --- I do think this is the right approach, but I'm mainly sending this is mainly to open up discussion. CC: Stefano Stabellini stefano.stabell...@citrix.com CC: Wei Liu wei.l...@citrix.com CC: Ian Campbell ian.campb...@citrix.com CC: Andrew Cooper andrew.coop...@citrix.com Stefano, Andrew, any comments? If we're to do this we need to do it now. I think reverting this change in QEMU and relevant changes in libxl would be the most viable solution to solve this for this release. Reverting this patch doesn't really solve the problem: instead of breaking on migration when the VM has more than 3 emulated NICs, the VM simply refuses to start in that case. I guess it can be considered a small improvement but certainly not a fix. Given that the migration issue only happens in an unusual corner case, are we really in a hurry to revert this commit and go back to the failure to start, even before we actually figure out what the proper fix is? I'm in a hurry to go back to a world where qemu doesn't unexpectedly allocate more RAM to a guest. If the real problem with the original patch was that it broke migration, we could fix that pretty easily; but the real problem (to me) with the original patch is that it unpredicatably allocates more memory to a guest that the toolstack doesn't know about. -George ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCHv2 3/5] evtchn: use a per-event channel lock for sending events
On 16/06/15 16:19, David Vrabel wrote: @@ -1221,6 +1277,8 @@ void notify_via_xen_event_channel(struct domain *ld, int lport) evtchn_port_set_pending(rd, rchn-notify_vcpu_id, rchn); } +spin_unlock(lchn-lock); + spin_unlock(ld-event_lock); } Again I think the event lock can be dropped earlier now. Ditto. Uh, no. This is notify. I've kept the locking like this because of the ld-is_dying check. I think we need the ld-event_lock in case d-is_dying is set and evtchn_destroy(ld) is called. David ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 6/6] xen/pass-through: constify some static data
On Fri, 5 Jun 2015, Jan Beulich wrote: This is done indirectly by adjusting two typedefs and helps emphasizing that the respective tables aren't supposed to be modified at runtime (as they may be shared between devices). Signed-off-by: Jan Beulich jbeul...@suse.com Reviewed-by: Stefano Stabellini stefano.stabell...@eu.citrix.com --- a/qemu/upstream/hw/xen/xen_pt.h +++ b/qemu/upstream/hw/xen/xen_pt.h @@ -31,7 +31,7 @@ void xen_pt_log(const PCIDevice *d, cons /* Helper */ #define XEN_PFN(x) ((x) XC_PAGE_SHIFT) -typedef struct XenPTRegInfo XenPTRegInfo; +typedef const struct XenPTRegInfo XenPTRegInfo; typedef struct XenPTReg XenPTReg; typedef struct XenPCIPassthroughState XenPCIPassthroughState; @@ -135,11 +135,11 @@ struct XenPTReg { uint32_t data; /* emulated value */ }; -typedef struct XenPTRegGroupInfo XenPTRegGroupInfo; +typedef const struct XenPTRegGroupInfo XenPTRegGroupInfo; /* emul reg group size initialize method */ typedef int (*xen_pt_reg_size_init_fn) -(XenPCIPassthroughState *, const XenPTRegGroupInfo *, +(XenPCIPassthroughState *, XenPTRegGroupInfo *, uint32_t base_offset, uint8_t *size); /* emulated register group information */ @@ -154,7 +154,7 @@ struct XenPTRegGroupInfo { /* emul register group management table */ typedef struct XenPTRegGroup { QLIST_ENTRY(XenPTRegGroup) entries; -const XenPTRegGroupInfo *reg_grp; +XenPTRegGroupInfo *reg_grp; uint32_t base_offset; uint8_t size; QLIST_HEAD(, XenPTReg) reg_tbl_list; --- a/qemu/upstream/hw/xen/xen_pt_config_init.c +++ b/qemu/upstream/hw/xen/xen_pt_config_init.c @@ -96,8 +96,7 @@ XenPTReg *xen_pt_find_reg(XenPTRegGroup } static uint32_t get_throughable_mask(const XenPCIPassthroughState *s, - const XenPTRegInfo *reg, - uint32_t valid_mask) + XenPTRegInfo *reg, uint32_t valid_mask) { uint32_t throughable_mask = ~(reg-emu_mask | reg-ro_mask); ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 07/12] x86/virt/guest/xen: Remove use of pgd_list from the Xen guest code
On 16/06/15 15:19, Boris Ostrovsky wrote: On 06/16/2015 10:15 AM, David Vrabel wrote: On 15/06/15 21:35, Ingo Molnar wrote: * David Vrabel david.vra...@citrix.com wrote: On 15/06/15 10:05, Ian Campbell wrote: On Sat, 2015-06-13 at 11:49 +0200, Ingo Molnar wrote: xen_mm_pin_all()/unpin_all() are used to implement full guest instance suspend/restore. It's a stop-all method that needs to iterate through all allocated pgds in the system to fix them up for Xen's use. This code uses pgd_list, probably because it was an easy interface. But we want to remove the pgd_list, so convert the code over to walk all tasks in the system. This is an equivalent method. It is not equivalent because pgd_alloc() now populates entries in pgds that are not visible to xen_mm_pin_all() (note how the original code adds the pgd to the pgd_list in pgd_ctor() before calling pgd_prepopulate_pmd()). These newly allocated page tables won't be correctly converted on suspend/resume and the new process will die after resume. So how should the Xen logic be fixed for the new scheme? I can't say I can see through the paravirt complexity here. Actually, since we freeze_processes() before trying to pin page tables, I think it should be ok as-is. I'll put the patch through some tests. Actually, I just ran this through a couple of boot/suspend/resume tests and didn't see any issues (with the one fix I mentioned to Ingo earlier). On unstable Xen only. In which case this can have a: Reviewed-by: David Vrabel david.vra...@citrix.com Thanks. David ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [Patch V3 3/3] xen: add Xen pvUSB maintainer
Add myself as maintainer for the Xen pvUSB stuff. Signed-off-by: Juergen Gross jgr...@suse.com Acked-by: Konrad Rzeszutek Wilk konrad.w...@oracle.com --- MAINTAINERS | 8 1 file changed, 8 insertions(+) diff --git a/MAINTAINERS b/MAINTAINERS index d8afd29..8ef4e83 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -10965,6 +10965,14 @@ F: drivers/scsi/xen-scsifront.c F: drivers/xen/xen-scsiback.c F: include/xen/interface/io/vscsiif.h +XEN PVUSB DRIVER +M: Juergen Gross jgr...@suse.com +L: xen-de...@lists.xenproject.org (moderated for non-subscribers) +L: linux-...@vger.kernel.org +S: Supported +F: divers/usb/host/xen* +F: include/xen/interface/io/usbif.h + XEN SWIOTLB SUBSYSTEM M: Konrad Rzeszutek Wilk konrad.w...@oracle.com L: xen-de...@lists.xenproject.org (moderated for non-subscribers) -- 2.1.4 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH V4 3/7] libxl: add pvusb API
On 06/16/2015 03:20 PM, Juergen Gross wrote: On 06/16/2015 04:06 PM, George Dunlap wrote: On 06/16/2015 02:49 PM, Juergen Gross wrote: On 06/16/2015 03:29 PM, George Dunlap wrote: On 06/16/2015 02:23 PM, Juergen Gross wrote: Hmm, I'd rather have it all in one place. Putting it in qemu would enable us to handle hotplug as well. A USB device assigned via it's serial to a domU could be automatically passed through by qemu when showing up. This isn't possible today, but we wouldn't have to change libxl again for supporting it, only qemu would have to be adapted. Look, we're talking here about the libxl interface. Ian thinks that *at the libxl layer* we need to be able to specify all these things. It doesn't matter at this point whether qemu can also do it or not. When I'm adding new functionality (and this is the case here) I always try to add it at the level where it should be. qemu already is capable of finding a USB device by various means and can be extended easily to support our needs. And please remember, for writing the pvusb backend I'm already doing changes to qemu. So why should we add the same functionality to libxl by reading sysfs instead of letting it do qemu via libusb? And what about BSD? Letting qemu find the device would avoid two variants in libxl. If the libxl interface only has bus and port, then it doesn't matter what qemu can do -- the caller has no way of asking qemu to watch for vid:did:serial. If libxl has vid:did:serial in the interface, then it's only an implementation detail whether we pass that into qemu or whether we pass some other way of uniquely indentifying a particular device. And given that no *current* versions of qemu support vid:did:serial, the most sensible way to implement this would be to have libxl do the conversion and send over bus:addr -- that way when you update libxl you get that functionality regardless of whether qemu has it. Unless you're suggesting that the libxl interface should be a string that we just pass verbatim to qemu. If that's what you mean, it's a complete non-starter, and I'm sure Ian will agree with me. There's no way that we can provide any interface consistency or any documentation of what this string does if it boils down to This does whatever the version of qemu you happen to be running does. No, I didn't expect libxl to just pass an arbitrary string to qemu. My point was to avoid the sysfs accesses in libxl in order to support BSD as well and to reduce the complexity. In the future, if we actually implement the automatically grab this device when it shows up functionality, then yes, having it in qemu rather than having a libxl daemon sit around and watch for those things, might be handy. But we can cross that bridge when we come to it. I would agree if the efforts in libxl would be much smaller than in qemu. But if the efforts are comparable I'd rather do it in the component which will make such an enhancement easier. They're both small, so this is really bikeshedding at the moment. The most important question is the interface: do we include vid:did:serial in the libxl interface. Since you want qemu to do that, I take it that your answer to that is yes. Correct. When we have patches submitted, we can discuss whether libxl should only support the (as-yet unreleased) versions of qemu that do the search themselves, or whether they should support all versions of qemu-xen. (The answer seems pretty obvious to me...) May be I made one wrong assumption: I thought adding USB-passthrough for HVM domains would require qemu changes as well. If this is not the case I can understand your reasoning. Oh, right. No, getting HVM hotplug working over QMP was pretty straightforward; probably only a week's work. It's been the interface that has been the real difficult bit. There are some things that would be nicer to implement in qemu. Right now there's no way to *query* what devices have been plugged in; my patch series had to store information in xenstore. At some point in the future, it might make sense for libxl to mainly just ferry the options back and forth to qemu; even then I'm not sure. In any case it's an implementation detail that can be changed at any time. It's the interface we have to be really careful about. -George ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH V4 3/7] libxl: add pvusb API [and 1 more messages]
On 06/16/2015 02:37 PM, Ian Jackson wrote: George Dunlap writes (Re: [Xen-devel] [PATCH V4 3/7] libxl: add pvusb API): Remember that the path you gave in your previous e-mail isn't the path for the *usb device*, it's the path for the *block device*. It contains a PCI address, but it looks like it also contains part of the USB topology. Are you sure that's actually a stable interface, or does it just happen that on your hardware the discovery always happens in the same order? The block device is (in path terms) underneath the usb device, obviously. Not all of that path is relevant to identifying the USB device. On my system /sys/bus/usb/devices/2-3.3 is a link to /sys/devices/pci:00/:00:1d.7/usb2/2-3/2-3.3/. This contains the pci bus address, but it also contains the bus number, which we've just said may be unstable across reboots. You mean the 2 in `usb2' ? I think that bus number is the bus number within the controller, not globally. On the contrary: $ ls -l usb* lrwxrwxrwx 1 root root 0 Jun 15 10:14 usb1 - ../../../devices/pci:00/:00:1a.7/usb1/ lrwxrwxrwx 1 root root 0 Jun 15 10:14 usb2 - ../../../devices/pci:00/:00:1d.7/usb2/ lrwxrwxrwx 1 root root 0 Jun 15 10:14 usb3 - ../../../devices/pci:00/:00:1a.0/usb3/ lrwxrwxrwx 1 root root 0 Jun 15 10:14 usb4 - ../../../devices/pci:00/:00:1a.1/usb4/ lrwxrwxrwx 1 root root 0 Jun 15 10:14 usb5 - ../../../devices/pci:00/:00:1a.2/usb5/ lrwxrwxrwx 1 root root 0 Jun 15 10:14 usb6 - ../../../devices/pci:00/:00:1d.0/usb6/ lrwxrwxrwx 1 root root 0 Jun 15 10:14 usb7 - ../../../devices/pci:00/:00:1d.1/usb7/ lrwxrwxrwx 1 root root 0 Jun 15 10:14 usb8 - ../../../devices/pci:00/:00:1d.2/usb8/ $ ls /sys//devices/pci:00/:00:1d.7/ | grep usb usb2/ In other words, the global bus enumeration leaks its way into the device path; which means at very least the sysfs device path is potentially unstable across reboots even if you include the pci controller it's on. I suppose it might be possible to specify buspci,port -- the pci address of the root bus, and the topology from there. In theory I guess that should be stable? Yes. The whole point of paths like this is that they are stable if the physical topology doesn't change. So on my netbook /dev/disk/by-path/pci-:00:1d.7-usb-0:1:1.0-scsi-0:0:0:0-part1 always refers to the 1st MBR partition on logical device 0 on the USB storage device plugged into the USB port physically on the front left of the computer. That would be great if it were true, but I'm not convinced that the above path doesn't include a globally-enumerated bus number, which might change across reboots if it's enumerated in a different order. It may be that the format above is *not* based on the sysfs path of the device; that the '0' immediately after the USB means the first (and perhaps only) controller at this PCI address. In which case, yes, having a string like this might be a unique identifier for a particular port that would be stable across reboots. That needs some investigation. In any case, at the moment you're essentially inventing from whole cloth a new way of specifying USB devices that (as far as I know) isn't supported by any other program that uses USB. If you can't specify the device by hardware path, you can't specify it deterministically. And as you can see it _is_ supported by other programs that use USB. mount can use it! Mount doesn't know anything about USB devices; all it knows are block devices. You might as well say that 'ls' knows about USB devices because it can read files on a filesystem that resides on a USB stick. I'm talking about programs that explicitly manipulate USB devices -- in particular, those that may pass them through to a guest or seize them, like libvirt, qemu, virtualbox, c. I'm not opposed to doing something new if it really is better. But when you find that a dozen other projects before you have done things one way, you should at least take care before you decide to do something radically different, to make sure that it's because you actually have something better, and not because you've just missed something. I think the hardware path to the controller, at least, should be treated as an opaque OS-specific string. It might have a different format on BSD. If we can make an actual path that's stable across reboots, that would certainly be a good thing. But if it's not stable across reboots, it has no advantage over the bus-port[.port.port] interface. -George ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [xen-4.2-testing test] 58604: trouble: blocked/broken/fail/pass
flight 58604 xen-4.2-testing real [real] http://logs.test-lab.xenproject.org/osstest/logs/58604/ Failures and problems with tests :-( Tests which did not succeed and are blocking, including tests which could not be run: build-amd64-libvirt 3 host-install(3) broken in 58584 REGR. vs. 58411 Tests which are failing intermittently (not blocking): test-amd64-amd64-pair 3 host-install/src_host(3) broken in 58584 pass in 58604 test-amd64-i386-pair 4 host-install/dst_host(4) broken in 58584 pass in 58604 test-amd64-i386-xl-winxpsp3-vcpus1 3 host-install(3) broken in 58584 pass in 58604 test-amd64-amd64-xl-qemut-winxpsp3 3 host-install(3) broken in 58584 pass in 58604 test-i386-i386-libvirt3 host-install(3) broken pass in 58540 test-amd64-i386-qemuu-freebsd10-amd64 3 host-install(3) broken pass in 58540 test-amd64-i386-xl-win7-amd64 3 host-install(3) broken pass in 58540 test-amd64-amd64-xl-multivcpu 3 host-install(3) broken pass in 58584 test-i386-i386-xl 3 host-install(3) broken pass in 58584 test-i386-i386-xl-sedf3 host-install(3) broken pass in 58584 test-amd64-i386-xl-qemut-debianhvm-amd64 3 host-install(3) broken pass in 58584 test-amd64-amd64-xl-qemut-debianhvm-amd64 3 host-install(3) broken pass in 58584 test-i386-i386-xl-qemuu-winxpsp3 3 host-install(3) broken pass in 58584 test-amd64-amd64-xl-qemuu-ovmf-amd64 6 xen-boot fail in 58584 pass in 58604 test-amd64-amd64-xl-qemuu-win7-amd64 14 guest-localmigrate.2 fail in 58584 pass in 58604 test-amd64-amd64-xl-win7-amd64 16 guest-stopfail pass in 58584 test-amd64-amd64-xl-qemuu-winxpsp3 15 guest-localmigrate/x10 fail pass in 58584 Regressions which are regarded as allowable (not blocking): test-amd64-i386-xl-win7-amd64 16 guest-stop fail in 58540 like 58411 test-amd64-i386-xl-qemut-win7-amd64 16 guest-stop fail like 58411 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stop fail like 58411 Tests which did not succeed, but are not blocking: test-amd64-amd64-libvirt 1 build-check(1)blocked in 58584 n/a test-amd64-amd64-rumpuserxen-amd64 1 build-check(1) blocked n/a test-amd64-i386-rumpuserxen-i386 1 build-check(1) blocked n/a test-i386-i386-rumpuserxen-i386 1 build-check(1) blocked n/a test-i386-i386-libvirt 12 migrate-support-check fail in 58540 never pass test-amd64-i386-xl-qemuu-ovmf-amd64 9 debian-hvm-install fail never pass build-amd64-rumpuserxen 5 rumpuserxen-buildfail never pass build-i386-rumpuserxen5 rumpuserxen-buildfail never pass test-amd64-amd64-xl-qemuu-ovmf-amd64 9 debian-hvm-install fail never pass test-amd64-amd64-libvirt 12 migrate-support-checkfail never pass test-amd64-i386-libvirt 12 migrate-support-checkfail never pass test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop fail never pass test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop fail never pass test-amd64-i386-xend-winxpsp3 20 leak-check/check fail never pass test-amd64-i386-xend-qemut-winxpsp3 20 leak-check/checkfail never pass version targeted for testing: xen 97134c441d6d81ba0d7cdcfdc4d8315115b99dce baseline version: xen 21a8344ca38a2797a13b4bf57031b6f49ae12ccb People who touched revisions under test: Ian Campbell ian.campb...@citrix.com Ian Jackson ian.jack...@eu.citrix.com Jan Beulich jbeul...@suse.com Stefano Stabellini stefano.stabell...@eu.citrix.com jobs: build-amd64 pass build-i386 pass build-amd64-libvirt pass build-i386-libvirt pass build-amd64-pvopspass build-i386-pvops pass build-amd64-rumpuserxen fail build-i386-rumpuserxen fail test-amd64-amd64-xl pass test-amd64-i386-xl pass test-i386-i386-xlbroken test-amd64-i386-rhel6hvm-amd pass test-amd64-i386-qemut-rhel6hvm-amd pass test-amd64-i386-qemuu-rhel6hvm-amd pass test-amd64-amd64-xl-qemut-debianhvm-amd64broken test-amd64-i386-xl-qemut-debianhvm-amd64 broken test-amd64-amd64-xl-qemuu-debianhvm-amd64
Re: [Xen-devel] [PATCH 01/27] tools/libxl: Fix libxl__ev_child_inuse() check for not-yet-initialised children
On Tue, 2015-06-16 at 14:36 +0100, Andrew Cooper wrote: On 16/06/15 14:21, Ian Campbell wrote: On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote: Shortly, libxl will be juggling multiple parallel operations, and will possibly have to take error decisions before some tasks have been set up. It would be preferable, I think, to arrange to call libxl__ev_child_init on all such libxl__ev_child structs either up front or certainly before there is any possibility of needing to unwind them. Such an init would at worst correspond to exactly the place where the zeroed structure you refer to is zeroed. It is possible that one bit fails before it can be calculated whether the second bit needs to start or not. You can call libxl__ev_child_init without needing to know that, i.e. you can do them all at the point where you allocate/initialise the containing struct. At the moment, all bits in libxl in this area do initialisation immediately before use; most bits are even initialised in the function which starts their actions. Some bits are initialised differently depending on the path taken to get to the initialisation site. It would be non-trivial to initialise everything appropriately at the very start. You don't need to fully init, just call libxl__ev_child_init in order to arrange for correct behaviour from libxl__ev_child_inuse and friends. Actually turning it into a useful child can stay where it is if it is tricky to arrange for those things to happen at the same time. Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 19/27] tools/libxc+libxl+xl: Restore v2 streams
On 16/06/15 15:53, Ian Campbell wrote: On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote: @@ -377,6 +384,28 @@ static void record_body_done(libxl__egc *egc, stream_failed(egc, stream, ret); } +void libxl__xc_domain_restore_done(libxl__egc *egc, void *dcs_void, + int ret, int retval, int errnoval) +{ +libxl__domain_create_state *dcs = dcs_void; +STATE_AO_GC(dcs-ao); + +if (ret) +goto err; + +if (retval) { +LOGEV(ERROR, errnoval, restoring domain); +ret = ERROR_FAIL; +goto err; +} + +libxl__stream_read_continue(egc, dcs-srs); continue? Is this something to do with checkpointing? No. Simply continue reading records from the stream. The code has changed since the first iteration, where libxl__stream_read_continue() was called from outside of this translation unit. As it currently stands, I should make it a static function. diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl index 23f27d4..7418d92 100644 --- a/tools/libxl/libxl_types.idl +++ b/tools/libxl/libxl_types.idl @@ -346,6 +346,8 @@ libxl_domain_create_info = Struct(domain_create_info,[ libxl_domain_restore_params = Struct(domain_restore_params, [ At some point we will need a LIBXL_HAVE #define. (checkpointed_stream, integer), +(stream_version, uint32, {'init_val': '1'}), If we aren't going to go for an IDL enum rather than a uint32 you probably just want the bare integer 1. But, I suspect we would prefer an enum, i.e an explicit list of known versions, rather than an integer? Ideally, this should match the version field in the libxl stream header record. Realistically, I never expect version 2 needing bumping to version 3. I wonder when, if ever, we will be able to flip this to 2? I suppose whenever the legacy conversion stuff gets pulled out. +(legacy_width, uint32), From what I've seen so far this is never user provided but is internal to libxl and detected[0] at runtime. As such it belongs somewhere else other than in the public API. [0] FVO detected == hardcoded depending on the build arch The intention was originally to expose it to the user. While this is possible for `xl restore /path/to/legacy/save --was-32bit-toolstack`, it is much harder for the `xl migrate` case where we cannot control the sending side. I am half tempted to say that anyone experiencing 32-64bit problems should just use the conversion helper themselves. It is deliberately written to be usable on the command line as well as automatically. ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] PCI Passthrough ARM Design : Draft1
El 16/06/15 a les 18.13, Stefano Stabellini ha escrit: On Thu, 11 Jun 2015, Ian Campbell wrote: On Thu, 2015-06-11 at 07:25 -0400, Julien Grall wrote: Hi Ian, On 11/06/2015 04:56, Ian Campbell wrote: On Wed, 2015-06-10 at 15:21 -0400, Julien Grall wrote: Hi, On 10/06/2015 08:45, Ian Campbell wrote: 4. DomU access / assignment PCI device -- When a device is attached to a domU, provision has to be made such that it can access the MMIO space of the device and xen is able to identify the mapping between guest bdf and system bdf. Two hypercalls are introduced I don't think we want/need new hypercalls here, the same existing hypercalls which are used on x86 should be suitable. That's XEN_DOMCTL_memory_mapping from the toolstack I think. XEN_DOMCTL_memory_mapping is done by QEMU for x86 HVM when the guest (i.e hvmloader?) is writing in the PCI BAR. What about for x86 PV? I think it is done by the toolstack there, I don't know what pciback does with accesses to BAR registers. XEN_DOMCTL_memory_mapping is only used to map memory in stage-2 page table. This is only used for auto-translated guest. In the case of x86 PV, the page-table is managed by the guest. The only things to do is to give the MMIO permission to the guest in order to the let him use them. This is done at boot time in the toolstack. Ah yes, makes sense. Manish, this sort of thing and the constraints etc should be discussed in the doc please. I think that the toolstack (libxl) will need to call xc_domain_memory_mapping (XEN_DOMCTL_memory_mapping), in addition to xc_domain_iomem_permission, for auto-translated PV guests on x86 (PVH) and ARM guests. I'm not sure about this, AFAICT you are suggesting that the toolstack (or domain builder for Dom0) should setup the MMIO regions on behalf of the guest using the XEN_DOMCTL_memory_mapping hypercall. IMHO the toolstack should not setup MMIO regions and instead the guest should be in charge of setting them in the p2m by using a hypercall (or at least that was the plan on x86 PVH). Roger. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 0/6] x86: reduce paravirtualized spinlock overhead
AFAIK there are no outstanding questions for more than one month now. I'd appreciate some feedback or accepting these patches. Juergen On 04/30/2015 12:53 PM, Juergen Gross wrote: Paravirtualized spinlocks produce some overhead even if the kernel is running on bare metal. The main reason are the more complex locking and unlocking functions. Especially unlocking is no longer just one instruction but so complex that it is no longer inlined. This patch series addresses this issue by adding two more pvops functions to reduce the size of the inlined spinlock functions. When running on bare metal unlocking is again basically one instruction. Compile tested with CONFIG_PARAVIRT_SPINLOCKS on and off, 32 and 64 bits. Functional testing on bare metal and as Xen dom0. Correct patching verified by disassembly of active kernel. Juergen Gross (6): x86: use macro instead of 0 for setting TICKET_SLOWPATH_FLAG x86: move decision about clearing slowpath flag into arch_spin_lock() x86: introduce new pvops function clear_slowpath x86: introduce new pvops function spin_unlock x86: switch config from UNINLINE_SPIN_UNLOCK to INLINE_SPIN_UNLOCK x86: remove no longer needed paravirt_ticketlocks_enabled arch/x86/Kconfig | 1 - arch/x86/include/asm/paravirt.h | 13 + arch/x86/include/asm/paravirt_types.h | 12 arch/x86/include/asm/spinlock.h | 53 --- arch/x86/include/asm/spinlock_types.h | 3 +- arch/x86/kernel/kvm.c | 14 + arch/x86/kernel/paravirt-spinlocks.c | 42 +-- arch/x86/kernel/paravirt.c| 12 arch/x86/kernel/paravirt_patch_32.c | 25 + arch/x86/kernel/paravirt_patch_64.c | 24 arch/x86/xen/spinlock.c | 23 +-- include/linux/spinlock_api_smp.h | 2 +- kernel/Kconfig.locks | 7 +++-- kernel/Kconfig.preempt| 3 +- kernel/locking/spinlock.c | 2 +- lib/Kconfig.debug | 1 - 16 files changed, 154 insertions(+), 83 deletions(-) ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [Patch V3 2/3] usb: Introduce Xen pvUSB frontend (xen hcd)
On Tue, Jun 16, 2015 at 04:32:34PM +0200, Juergen Gross wrote: Introduces the Xen pvUSB frontend. With pvUSB it is possible for a Xen domU to communicate with a USB device assigned to that domU. The communication is all done via the pvUSB backend in a driver domain (usually Dom0) which is owner of the physical device. The pvUSB frontend is a USB hcd for a virtual USB host connector. The code is taken from the pvUSB implementation in Xen done by Fujitsu based on Linux kernel 2.6.18. Changes from the original version are: - port to upstream kernel - put all code in just one source file - move module to appropriate location in kernel tree - adapt to Linux style guide - minor code modifications to increase readability Signed-off-by: Juergen Gross jgr...@suse.com --- drivers/usb/host/Kconfig | 11 + drivers/usb/host/Makefile |1 + drivers/usb/host/xen-hcd.c | 1638 3 files changed, 1650 insertions(+) create mode 100644 drivers/usb/host/xen-hcd.c diff --git a/drivers/usb/host/Kconfig b/drivers/usb/host/Kconfig index 197a6a3..3361b4b 100644 --- a/drivers/usb/host/Kconfig +++ b/drivers/usb/host/Kconfig @@ -788,3 +788,14 @@ config USB_HCD_TEST_MODE This option is of interest only to developers who need to validate their USB hardware designs. It is not needed for normal use. If unsure, say N. + +config USB_XEN_HCD + tristate Xen usb virtual host driver + depends on XEN + select XEN_XENBUS_FRONTEND + help + The Xen usb virtual host driver serves as a frontend driver enabling + a Xen guest system to access USB Devices passed through to the guest + by the Xen host (usually Dom0). + Only needed if the kernel is running in a Xen guest and generic + access to a USB device is needed. diff --git a/drivers/usb/host/Makefile b/drivers/usb/host/Makefile index 65b0b6a..3779696 100644 --- a/drivers/usb/host/Makefile +++ b/drivers/usb/host/Makefile @@ -75,3 +75,4 @@ obj-$(CONFIG_USB_HCD_SSB) += ssb-hcd.o obj-$(CONFIG_USB_FUSBH200_HCD) += fusbh200-hcd.o obj-$(CONFIG_USB_FOTG210_HCD)+= fotg210-hcd.o obj-$(CONFIG_USB_MAX3421_HCD)+= max3421-hcd.o +obj-$(CONFIG_USB_XEN_HCD)+= xen-hcd.o diff --git a/drivers/usb/host/xen-hcd.c b/drivers/usb/host/xen-hcd.c new file mode 100644 index 000..9da2856 --- /dev/null +++ b/drivers/usb/host/xen-hcd.c @@ -0,0 +1,1638 @@ +/* + * xen-hcd.c + * + * Xen USB Virtual Host Controller driver + * + * Copyright (C) 2009, FUJITSU LABORATORIES LTD. + * Author: Noboru Iwamatsu n_iwama...@jp.fujitsu.com + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, see http://www.gnu.org/licenses/. + * + * Or, by your choice: + * + * When distributed separately from the Linux kernel or incorporated into + * other software packages, subject to the following license: + * + * Permission is hereby granted, free of charge, to any person obtaining a copy + * of this software and associated documentation files (the Software), to + * deal in the Software without restriction, including without limitation the + * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or + * sell copies of the Software, and to permit persons to whom the Software is + * furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE + * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER + * DEALINGS IN THE SOFTWARE. + */ + +#include linux/module.h +#include linux/usb.h +#include linux/list.h +#include linux/usb/hcd.h +#include linux/io.h + +#include xen/xen.h +#include xen/xenbus.h +#include xen/grant_table.h +#include xen/events.h +#include xen/page.h + +#include xen/interface/io/usbif.h + +/* Private per-URB data */ +struct urb_priv { + struct
[Xen-devel] [linux-3.4 test] 58597: regressions - trouble: broken/fail/pass
flight 58597 linux-3.4 real [real] http://logs.test-lab.xenproject.org/osstest/logs/58597/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: test-amd64-amd64-xl 9 debian-install fail REGR. vs. 52209-bisect test-amd64-amd64-pair 15 debian-install/dst_host fail REGR. vs. 52715-bisect test-amd64-i386-pair15 debian-install/dst_host fail REGR. vs. 56366-bisect Regressions which are regarded as allowable (not blocking): test-amd64-amd64-xl-sedf-pin 3 host-install(3) broken blocked in 56366-bisect test-amd64-i386-qemut-rhel6hvm-amd 3 host-install(3) broken blocked in 56366-bisect test-amd64-i386-qemuu-rhel6hvm-amd 3 host-install(3) broken blocked in 56366-bisect test-amd64-i386-libvirt 9 debian-installfail blocked in 56366-bisect test-amd64-i386-rumpuserxen-i386 6 xen-boot fail blocked in 56366-bisect test-amd64-amd64-libvirt 9 debian-installfail blocked in 56366-bisect test-amd64-i386-xl-qemuu-winxpsp3-vcpus1 6 xen-boot fail blocked in 56366-bisect test-amd64-i386-qemut-rhel6hvm-intel 6 xen-boot fail blocked in 56366-bisect test-amd64-i386-freebsd10-amd64 6 xen-boot fail blocked in 56366-bisect test-amd64-amd64-xl-qemuu-debianhvm-amd64 6 xen-boot fail blocked in 56366-bisect test-amd64-i386-xl-qemuu-winxpsp3 6 xen-boot fail blocked in 56366-bisect test-amd64-amd64-rumpuserxen-amd64 6 xen-bootfail blocked in 56366-bisect test-amd64-amd64-xl-multivcpu 6 xen-boot fail blocked in 56366-bisect test-amd64-amd64-xl-qemut-winxpsp3 6 xen-bootfail blocked in 56366-bisect test-amd64-i386-xl-qemuu-ovmf-amd64 6 xen-boot fail blocked in 56366-bisect test-amd64-i386-xl9 debian-installfail blocked in 56366-bisect test-amd64-amd64-xl-qemuu-winxpsp3 6 xen-bootfail blocked in 56366-bisect test-amd64-i386-libvirt-xsm 6 xen-boot fail blocked in 56366-bisect test-amd64-i386-xl-qemut-winxpsp3 6 xen-boot fail blocked in 56366-bisect test-amd64-i386-xl-qemut-winxpsp3-vcpus1 6 xen-boot fail blocked in 56366-bisect test-amd64-amd64-xl-sedf 9 debian-installfail blocked in 56366-bisect test-amd64-i386-xl-qemuu-debianhvm-amd64 6 xen-boot fail blocked in 56366-bisect test-amd64-i386-freebsd10-i386 6 xen-bootfail blocked in 56366-bisect test-amd64-amd64-xl-qemut-debianhvm-amd64 6 xen-boot fail blocked in 56366-bisect test-amd64-i386-qemuu-rhel6hvm-intel 6 xen-boot fail blocked in 56366-bisect test-amd64-i386-xl-qemut-debianhvm-amd64 6 xen-boot fail blocked in 56366-bisect test-amd64-i386-xl-qemut-debianhvm-amd64-xsm 6 xen-boot fail blocked in 56366-bisect test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm 6 xen-boot fail blocked in 56366-bisect test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stop fail blocked in 56366-bisect test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop fail blocked in 56366-bisect test-amd64-i386-xl-qemut-win7-amd64 16 guest-stop fail blocked in 56366-bisect test-amd64-amd64-xl-qemut-debianhvm-amd64-xsm 6 xen-boot fail blocked in 56366-bisect test-amd64-amd64-xl-qemuu-ovmf-amd64 6 xen-bootfail like 53709-bisect Tests which did not succeed, but are not blocking: test-amd64-amd64-xl-xsm 9 debian-install fail never pass test-amd64-amd64-xl-pvh-amd 9 debian-install fail never pass test-amd64-i386-xl-xsm9 debian-install fail never pass test-amd64-amd64-xl-credit2 9 debian-install fail never pass test-amd64-amd64-libvirt-xsm 9 debian-install fail never pass test-amd64-amd64-xl-pvh-intel 9 debian-install fail never pass test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop fail never pass version targeted for testing: linux56b48fcda5076d4070ab00df32ff5ff834e0be86 baseline version: linuxbb4a05a0400ed6d2f1e13d1f82f289ff74300a70 370 people touched revisions under test, not listing them all jobs: build-amd64-xsm pass build-i386-xsm pass build-amd64 pass build-i386 pass build-amd64-libvirt pass build-i386-libvirt pass build-amd64-pvopspass build-i386-pvops pass build-amd64-rumpuserxen pass build-i386-rumpuserxen pass test-amd64-amd64-xl fail test-amd64-i386-xl
Re: [Xen-devel] [PATCH 21/27] tools/libxc+libxl+xl: Save v2 streams
On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote: This is a complicated set of changes which must be done together for bisectability. * libxl-save-helper is updated to unconditionally use libxc migration v2. * libxl compatibility workarounds in libxc are disabled for save operations. * libxl__stream_write_start() is logically spliced into the event location where libxl__xc_domain_save() used to reside. * xl is updated to indicate that the stream is now v2 Signed-off-by: Andrew Cooper andrew.coop...@citrix.com CC: Ian Campbell ian.campb...@citrix.com CC: Ian Jackson ian.jack...@eu.citrix.com CC: Wei Liu wei.l...@citrix.com --- RFC: What kind of ABI/API indication is appropriate here. A LIBXL_HAVE* isn't apppropriate. Whether it has HAVE in the name or not I think some sort of #define, or one each for SAVE, RESTORE and perhaps even LEGACY would be appropriate. We should arrange that we can remove the LEGACY one in the future without causing applications to think we've reverted to Xen 4.5 era. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 16/27] tools/libxl: Infrastructure for reading a libxl migration v2 stream
On 16/06/15 15:31, Ian Campbell wrote: On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote: From: Ross Lagerwall ross.lagerw...@citrix.com Signed-off-by: Ross Lagerwall ross.lagerw...@citrix.com Signed-off-by: Andrew Cooper andrew.coop...@citrix.com CC: Ian Campbell ian.campb...@citrix.com CC: Ian Jackson ian.jack...@eu.citrix.com CC: Wei Liu wei.l...@citrix.com Overall looks good, I've got some comments below and I think it almost certainly wants eyes from Ian who knows more about the dc infra etc. +void libxl__stream_read_start(libxl__egc *egc, + libxl__stream_read_state *stream) +{ +libxl__datacopier_state *dc = stream-dc; +int ret = 0; + +/* State initialisation. */ +assert(!stream-running); + +memset(dc, 0, sizeof(*dc)); libxl__datacopier_init, please That call is made by libxl__datacopier_start() each and every time, and unlike here, is matched with an equivalent _kill() call. +dc-ao = stream-ao; +dc-readfd = stream-fd; +dc-writefd = -1; + +/* Start reading the stream header. */ +dc-readwhat = stream header; +dc-readbuf = stream-hdr; +stream-expected_len = dc-bytes_to_read = sizeof(stream-hdr); +dc-used = 0; +dc-callback = stream_header_done; This pattern of resetting and reinitialising the dc occurs in multiple places, I think a helper would be in order, some sort of stream_next_record_init or something perhaps? The only feasible helper would have to take everything as parameters; there is insufficient similarity between all users. I dunno whether that would be harder to read... +void libxl__stream_read_abort(libxl__egc *egc, + libxl__stream_read_state *stream, int rc) +{ +stream_failed(egc, stream, rc); +} + +static void stream_success(libxl__egc *egc, libxl__stream_read_state *stream) +{ +stream-rc = 0; +stream-running = false; + +stream_done(egc, stream); Push the running = false into stream_done and flip the assert there? Logically the stream is still running until it is done, so having done assert it isn't running seems counter-intuitive. This is more for piece of mind. stream_done() my strictly only ever be called once, hence its assert. +static void stream_done(libxl__egc *egc, +libxl__stream_read_state *stream) +{ +libxl__domain_create_state *dcs = CONTAINER_OF(stream, *dcs, srs); + +assert(!stream-running); + +stream-completion_callback(egc, dcs, stream-rc); +} + +static void stream_header_done(libxl__egc *egc, + libxl__datacopier_state *dc, + int onwrite, int errnoval) +{ +libxl__stream_read_state *stream = CONTAINER_OF(dc, *stream, dc); +libxl_sr_hdr *hdr = stream-hdr; +STATE_AO_GC(dc-ao); +int ret = 0; + +if (onwrite || dc-used != stream-expected_len) { +ret = ERROR_FAIL; +LOG(ERROR, write %d, err %d, expected %zu, got %zu, +onwrite, errnoval, stream-expected_len, dc-used); +goto err; +} I think you need to check errnoval == 0 in the !onwrite case, otherwise you may miss a read error? dc-used != stream-expected_len covers all possible read errors, in the something went wrong kind of way. Also it looks like onwrite can be -1, which is a separate error case. + +static void record_header_done(libxl__egc *egc, + libxl__datacopier_state *dc, + int onwrite, int errnoval) +{ +libxl__stream_read_state *stream = CONTAINER_OF(dc, *stream, dc); +libxl_sr_rec_hdr *rec_hdr = stream-rec_hdr; +STATE_AO_GC(dc-ao); +int ret = 0; + +if (onwrite || dc-used != stream-expected_len) { +ret = ERROR_FAIL; +LOG(ERROR, write %d, err %d, expected %zu, got %zu, +onwrite, errnoval, stream-expected_len, dc-used); +goto err; +} Same comments wrt the arguments as the previous one. Maybe a common helper to check (and log) the status at the head of each callback? So you can effectively do if (!everything_ok(stream, dc) goto err? I will see what I can do. +assert(!ret); +if (rec_hdr-length) { +free(stream-rec_body); +stream-rec_body = NULL; reset length too? +static void read_emulator_body(libxl__egc *egc, + libxl__stream_read_state *stream) +{ +libxl__domain_create_state *dcs = CONTAINER_OF(stream, *dcs, srs); +libxl__datacopier_state *dc = stream-dc; +libxl_sr_rec_hdr *rec_hdr = stream-rec_hdr; +libxl_sr_emulator_hdr *emu_hdr = stream-rec_body; +STATE_AO_GC(stream-ao); +char path[256]; +int ret = 0; + +sprintf(path, XC_DEVICE_MODEL_RESTORE_FILE.%u, dcs-guest_domid); + +dc-readwhat = save/migration stream; +dc-copywhat = emulator context; +dc-writewhat = qemu save file; +
Re: [Xen-devel] [PATCH 2/6] xen/MSI-X: drive maskall and enable bits through hypercalls
On Tue, 16 Jun 2015, Jan Beulich wrote: On 16.06.15 at 16:03, stefano.stabell...@eu.citrix.com wrote: On Fri, 5 Jun 2015, Jan Beulich wrote: +} else if (s-msix-enabled) { +if (!(value PCI_MSIX_FLAGS_ENABLE)) { +xen_pt_msix_disable(s); +s-msix-enabled = false; +} else if (!s-msix-maskall) { Why are you changing the state of s-msix-maskall here? This is the value PCI_MSIX_FLAGS_ENABLE case, nothing to do with maskall, right? We're at an else if inside an else if here. The only case left after the if() still seen above is that value has PCI_MSIX_FLAGS_MASKALL set. You are right. Maybe just add /* (value PCI_MSIX_FLAGS_MASKALL) at this point */ +s-msix-maskall = true; +xen_pt_msix_maskall(s, true); +} } -debug_msix_enabled_old = s-msix-enabled; -s-msix-enabled = !!(*val PCI_MSIX_FLAGS_ENABLE); if (s-msix-enabled != debug_msix_enabled_old) { XEN_PT_LOG(s-dev, %s MSI-X\n, s-msix-enabled ? enable : disable); } +xen_host_pci_get_word(s-real_device, s-msix-ctrl_offset, dev_value); I have to say that I don't like the asymmetry between reading and writing PCI config registers. If writes go via hypercalls, reads should go via hypercalls too. We're not doing any cfg register write via hypercalls (not here, and not elsewhere). What is being replaced by the patch are write to two bits which happen to live in PCI config space. Plus, reading directly, and doing writes via hypercall only when really needed would still be the right thing from a performance pov. OK. It would be nice to document somewhere that QEMU is not supposed to touch those two bits directly, but I wouldn't know where. Maybe a new doc under docs/misc? --- a/qemu/upstream/hw/xen/xen_pt_msi.c +++ b/qemu/upstream/hw/xen/xen_pt_msi.c @@ -301,8 +301,11 @@ static int msix_set_enable(XenPCIPassthr return -1; } -return msi_msix_enable(s, s-msix-ctrl_offset, PCI_MSIX_FLAGS_ENABLE, - enabled); Would it make sense to remove msi_msix_enable completely to avoid any further mistakes? Perhaps, yes. I think I actually had suggested so quite a while back. But I don't see myself wasting much more time on this, ehm, code. Isn't it just a matter of removing msi_msix_enable? ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [Draft F] Xen on ARM vITS Handling
On Tue, 2015-06-16 at 20:20 +0530, Vijay Kilari wrote: On Thu, Jun 11, 2015 at 3:10 PM, Ian Campbell ian.campb...@citrix.com wrote: Draft F follows. Also at: http://xenbits.xen.org/people/ianc/vits/draftF.{pdf,html} ## Per-domain `struct pending_irq` for `vLPI`s Internally Xen uses a `struct pending_irq` to track the status of any pending virtual IRQ, including a virtual LPI. Upon domain creation an array of such `struct pending_irq`'s will be allocated to cover the range `8192..nr_lpis` (for the number of LPIs which the guest is configured with) and a pointer this array will be stored in the `struct domain`. The function `irq_to_pending` will be modified to lookup interupts in the LPI range in this array. . nr_lpis can be large if more devices are assigned to domain. As I was suggesting on #xenarm chat, is it ok to use RB-tree instead of array? what should be value for nr_lpis? It should be user configurable and default to the sum of the number of events on all devices at start of day. I think this removes the need for it to be an R-B tree, an array is tolerable here. Adding an R-B tree not only has a memory overhead, but it then needs more complex management when inserting, searching, etc. Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [DESIGN] Feature Levelling improvements
On 16.06.15 at 12:50, andrew.coop...@citrix.com wrote: How XenServer currently does levelling == The _Heterogeneous Pool Levelling_ support in XenServer appears to predate the libxc CPUID policy API, so does not currently use it. The toolstack has a table of CPU model numbers identifying whether levelling is supported. It then uses native `CPUID` instructions to look at the first four feature masks, and identifies the subset of features across the pool. `cpuid_mask_{,extd_}{ecx,edx}` is then set on Xen's command line for each host in the pool, and all hosts rebooted. This has several limitations: * Xen and dom0 have a reduced feature set despite not needing to migrate I don't think Xen is affected by this, as it reads the CPUID bits before setting the masks (there are a few cpuid() invocations in random code, but I don't think these access maskable ones). Notes and observations == Experimentally, the masking MSRs can be context switched. There is no need to force all PV guests to the same level, and no need to prevent dom0 or Xen from using certain features. Context switching the masking MSRs will however incur an overhead, and should be avoided where possible. The toolstack needs to know how much control Xen has over VM features. In the case that there are insufficient masking MSRs, and no faulting support is present, a PV VM can still potentially be made safe to migrate by explicitly disabling features on the kernel command line. That wouldn't help with user mode code, would it? VCPU context switch --- Xen shall be updated to lazily context switch all available masking MSRs. It is noted that this shall incur a performance overhead if restricted featuresets are assigned to PV guests, and _CPUID Faulting_ is not available. It shall be the responsibility of the host administrator to avoid creating such a scenario, if the performance overhead is a concern. Not sure how feasible this is: Even if you run all PV guests at equal feature levels, context switching between PV and non-PV guests would still incur overhead (unless you mean to run HVM/PVH ones with whatever masking is currently in place). Plus this still wouldn't deal with masks in place when Xen itself wants to look at any of the maskable ones, unless you intend to audit code to make sure no such uses exist (which - as per above - I suppose/hope to be the case). Future work === The above is a minimum quantity of work to support feature levelling, but further problems exist. They are acknowledged as being issues, but are not in scope for fixing as part of feature levelling. * Xen has no notion of per-cpu and per-package data in the cpuid policy. In particular, this causes issues for VMs attempting to detect topology, which find inconsistent/incorrect cache information. * In the case that `domain_cpuid()` can't locate a leaf in the topology, it will fall back to issuing a plain `CPUID` instruction. This breaks VM encapsulation, as a VM which has migrated can observe differences which should be hidden. I think this is actually something that (a) needs addressing not too far in the future and (b) reminds me that I didn't see any talk here regarding black vs white listing of features not explicitly known to Xen or the tool stack. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] Is: graphics corruption with 'xen: Support Xen pv-domains using PAT. Was:Re: [BUG] Characters on the screen are broken on Linux = 3.19 with VT-d enabled
於 二,2015-06-16 於 11:06 +0200,Juergen Gross 提到: On 06/16/2015 10:55 AM, Ting-Wei Lan wrote: Juergen Gross 於 西元2015年06月16日 12:30 寫道: On 06/15/2015 09:03 PM, Ting-Wei Lan wrote: 於 一,2015-06-15 於 14:55 -0400,Konrad Rzeszutek Wilk 提到: On Sat, Jun 13, 2015 at 12:43:14AM +0800, Ting-Wei Lan wrote: When using Linux = 3.19 as the dom0 kernel, characters on the screen become broken after the graphic driver is loaded. The commit that breaks it is (found by git bisect): https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux .git/com mit/?id=47591df Lets CC Juergen Screenshot when the system run in single user mode: https://bugs.freedesktop.org/attachment.cgi?id=115079 Are those messages to be expected: (XEN) Scrubbing Free RAM on 1 nodes using 2 CPUs (XEN) [VT-D]DMAR:[DMA Write] Request device [:00:02.0] fault addr b61eef000, iommu reg = 82c000203000 (XEN) [VT-D]DMAR: reason 05 - PTE Write access is not set (XEN) . .done. I'm not familiar with VT-D internals, but seeing these messages for the video device during RAM scrubbing makes wonder if everything is correct regarding the VT-D and memory setup... ... I still remember that there was a similar problem found two years ago on the same hardware with similar broken screen output and it also crashed after Xorg was started, but I cannot confirm that they are the same problems. I don't know whether error messages are simliar. The old problem happens on Linux 3.7 ~ 3.10 with VT-d enabled. It only happened when not using Xen, so I added 'intel_iommu=off' to Linux boot arguments to workaround it. Hmm, do you see any chance in finding the commit which made it working again? Perhaps there was some workaround for this hardware which is missing in Xen now... After some tests, I found the information I provided before was incorrect. It seems the problem happens on all Linux = 3.7, including Linux 4.0.5, so the old problem was never fixed. Here are some 'dmesg | grep -i iommu' outputs. So a Xen-specific error is rather improbable, correct? But Linux provides 'intel_iommu=igfx_off' to workaround the problem. Does Xen provide similar things? I'd continue with sending the new information to the Intel graphics team. I am going to reopen DRM/Intel bug 90037 on freedesktop bugzilla. Juergen ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 20/27] tools/libxl: Infrastructure for writing a v2 stream
On 16/06/15 15:57, Ian Campbell wrote: On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote: From: Ross Lagerwall ross.lagerw...@citrix.com Signed-off-by: Ross Lagerwall ross.lagerw...@citrix.com Signed-off-by: Andrew Cooper andrew.coop...@citrix.com CC: Ian Campbell ian.campb...@citrix.com CC: Ian Jackson ian.jack...@eu.citrix.com CC: Wei Liu wei.l...@citrix.com --- tools/libxl/Makefile |2 +- tools/libxl/libxl_internal.h | 33 +++ tools/libxl/libxl_stream_write.c | 536 ++ 3 files changed, 570 insertions(+), 1 deletion(-) create mode 100644 tools/libxl/libxl_stream_write.c diff --git a/tools/libxl/Makefile b/tools/libxl/Makefile index ca0ae3e..63e32f7 100644 --- a/tools/libxl/Makefile +++ b/tools/libxl/Makefile @@ -94,7 +94,7 @@ LIBXL_OBJS = flexarray.o libxl.o libxl_create.o libxl_dm.o libxl_pci.o \ libxl_dom.o libxl_exec.o libxl_xshelp.o libxl_device.o \ libxl_internal.o libxl_utils.o libxl_uuid.o \ libxl_json.o libxl_aoutils.o libxl_numa.o libxl_vnuma.o \ -libxl_stream_read.o \ +libxl_stream_read.o libxl_stream_write.o \ libxl_save_callout.o _libxl_save_msgs_callout.o \ libxl_convert_callout.o \ libxl_qmp.o libxl_event.o libxl_fork.o $(LIBXL_OBJS-y) diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h index 5482950..82cd792 100644 --- a/tools/libxl/libxl_internal.h +++ b/tools/libxl/libxl_internal.h @@ -2868,6 +2868,38 @@ typedef void libxl__domain_suspend_cb(libxl__egc*, typedef void libxl__save_device_model_cb(libxl__egc*, libxl__domain_suspend_state*, int rc); +/* State for writing a libxl migration v2 stream */ +typedef struct libxl__stream_write_state libxl__stream_write_state; + +struct libxl__stream_write_state { +/* filled by the user */ +libxl__ao *ao; +int fd; +uint32_t domid; +void (*completion_callback)(libxl__egc *egc, +libxl__domain_suspend_state *dss, +int rc); +/* Private */ +int rc; +int joined_rc; +size_t padding; +bool running; +libxl__datacopier_state dc; +}; + +_hidden void libxl__stream_write_start(libxl__egc *egc, + libxl__stream_write_state *stream); + +_hidden void libxl__stream_write_abort(libxl__egc *egc, + libxl__stream_write_state *stream, + int rc); + +static inline bool libxl__stream_write_inuse( +const libxl__stream_write_state *stream) +{ +return stream-running; +} + typedef struct libxl__logdirty_switch { const char *cmd; const char *cmd_path; @@ -2907,6 +2939,7 @@ struct libxl__domain_suspend_state { /* private for libxl__domain_save_device_model */ libxl__save_device_model_cb *save_dm_callback; libxl__datacopier_state save_dm_datacopier; +libxl__stream_write_state sws; }; diff --git a/tools/libxl/libxl_stream_write.c b/tools/libxl/libxl_stream_write.c new file mode 100644 index 000..856d72e --- /dev/null +++ b/tools/libxl/libxl_stream_write.c @@ -0,0 +1,536 @@ +/* + * Copyright (C) 2015 Citrix Ltd. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU Lesser General Public License as published + * by the Free Software Foundation; version 2.1 only. with the special + * exception on linking described in file LICENSE. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU Lesser General Public License for more details. + */ + +#include libxl_osdeps.h /* must come before any other headers */ + +#include libxl_internal.h + +/* + * Infrastructure for writing a domain to a libxl migration v2 stream. + * + * Entry points from outside: + * - libxl__stream_write_start() + * - Start writing a stream from the start. + * + * In normal operation, there are two tasks running at once; this stream + * processing, and the the libxl-save-helper. check_stream_finished() is used the the. + * to join all the tasks in both success and error cases. + * + * Nomenclature for event callbacks: + * - $FOO_done(): Completion callback for $FOO + * - write_$FOO(): Set up writing a $FOO Set up or actually write? Set up the dc to write $FOO The write doesn't actually happen until we get the dc callback. + +void libxl__stream_write_start(libxl__egc *egc, + libxl__stream_write_state *stream) +{ +libxl__datacopier_state *dc = stream-dc; +
Re: [Xen-devel] [Patch V3 2/3] usb: Introduce Xen pvUSB frontend (xen hcd)
On 16/06/15 15:56, Juergen Gross wrote: Hmm, I'm beginning to question the value of that file. May be I should just throw it away. Thanks for the quick feedback. You could move it to debugfs, but if it's not useful better to get rid of it. David ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH V4 3/7] libxl: add pvusb API [and 1 more messages]
On Tue, Jun 16, 2015 at 4:59 PM, Ian Jackson ian.jack...@eu.citrix.com wrote: George Dunlap writes (Re: [Xen-devel] [PATCH V4 3/7] libxl: add pvusb API [and 1 more messages]): Yes. The whole point of paths like this is that they are stable if the physical topology doesn't change. So on my netbook /dev/disk/by-path/pci-:00:1d.7-usb-0:1:1.0-scsi-0:0:0:0-part1 always refers to the 1st MBR partition on logical device 0 on the USB storage device plugged into the USB port physically on the front left of the computer. That would be great if it were true, but I'm not convinced that the above path doesn't include a globally-enumerated bus number, which might change across reboots if it's enumerated in a different order. It may be that the format above is *not* based on the sysfs path of the device; that the '0' immediately after the USB means the first (and perhaps only) controller at this PCI address. In which case, yes, having a string like this might be a unique identifier for a particular port that would be stable across reboots. That needs some investigation. That does seem to be the case: What seems to be the case -- that it contains the global bus, or not? Looking at the output of udevadm monitor --property for sdc1 (on another plug): DEVPATH=/devices/pci:00/:00:1d.7/usb1/1-1/1-1:1.0/host22/target22:0:0/22:0:0:0/block/sdc/sdc1 ID_PATH=pci-:00:1d.7-usb-0:1:1.0-scsi-0:0:0:0 ID_PATH_TAG=pci-_00_1d_7-usb-0_1_1_0-scsi-0_0_0_0 I don't know where that ID_PATH comes from. It looks like that's constructed in udev: http://cgit.freedesktop.org/systemd/systemd/tree/src/udev/udev-builtin-path_id.c See handle_usb() (line 542) in particular. If I'm reading it right, what it basically does it take the [bus]-[port], and replace it with usb-0:[port]. (IOW, the '0' is hard-coded.) Also, if I'm reading it right, this won't work properly for Juergen's system that has two USB devices at the same pci address -- they'll both end up resolving to [pciaddr]-usb-0:[whatever]. Juergen -- can you give this a try? Run udevadm info on the two USB busses that share the same PCI slot, and see if the ID_PATH is the same for both? -George ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 4/6] xen/pass-through: correctly deal with RW1C bits
On 16.06.15 at 16:19, stefano.stabell...@eu.citrix.com wrote: On Fri, 5 Jun 2015, Jan Beulich wrote: @@ -1016,11 +1002,12 @@ static XenPTRegInfo xen_pt_emu_reg_pm[] .size = 2, .init_val = 0x0008, .res_mask = 0x00F0, -.ro_mask= 0xE10C, +.ro_mask= 0x610C, +.rw1c_mask = 0x8000, .emu_mask = 0x810B, .init = xen_pt_common_reg_init, .u.w.read = xen_pt_word_reg_read, -.u.w.write = xen_pt_pmcsr_reg_write, +.u.w.write = xen_pt_word_reg_write, }, { .size = 0, I can see that the code change doesn't cause a change in behaviour for PCI_PM_CTRL, but it does for PCI_STATUS, PCI_EXP_DEVSTA and PCI_EXP_LNKSTA. Please explain why in the commit message. I'm not sure what you're after in a patch titled correctly deal with RW1C bits. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [Patch V3 0/3] xen, usb: support pvUSB frontend driver
This series adds XEN guest pvUSB support. With pvUSB it is possible to use physical USB devices from a XEN domain. The support consists of a frontend in form of a virtual hcd driver in the unprivileged domU passing I/O-requests to the backend in a driver domain (usually Dom0). The backend is not part of this patch series, as it will be supported via qemu. The code is taken (and adapted) from the original pvUSB implementation done for Linux 2.6 in 2008 by Fujitsu. Normal operation of USB devices by adding and removing them dynamically to/from a domain has been tested using various USB devices (USB 1.1, 2.0 and 3.0). The pvUSB backend for these tests was a SUSE SLES Dom0 with a kernel based backend driver. Changes in V3: - move frontend to drivers/usb/host and rename it to xen-hcd. - changed name prefixes in patch 1 to xenusb as requested by Greg - use __un types rather than uintn_t as requested by Greg Changes in V2: - removed backend, as it can be implemented in user land - added some access macros and definitions to the pvUSB interface description to make it independant from linux kernel USB internals - adapted frontend to newer kernel version and use new pvUSB interface macros - set port status in one chunk as suggested by Oliver Neukum Juergen Gross (3): usb: Add Xen pvUSB protocol description usb: Introduce Xen pvUSB frontend (xen hcd) xen: add Xen pvUSB maintainer MAINTAINERS |8 + drivers/usb/host/Kconfig | 11 + drivers/usb/host/Makefile|1 + drivers/usb/host/xen-hcd.c | 1638 ++ include/xen/interface/io/usbif.h | 252 ++ 5 files changed, 1910 insertions(+) create mode 100644 drivers/usb/host/xen-hcd.c create mode 100644 include/xen/interface/io/usbif.h -- 2.1.4 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 6/6] libxl/save: Refactor libxl__domain_suspend_state
On 16/06/15 14:26, Ian Jackson wrote: Yang Hongyang writes ([PATCH v2 6/6] libxl/save: Refactor libxl__domain_suspend_state): Currently struct libxl__domain_suspend_state contains 2 type of states, one is save state, another is suspend state. This patch separate it out. The motivation of this is that COLO will need to do suspend/resume continuesly, we need a more common suspend state. Currently in libxl/libxc/etc. suspend and save have referred to the same thing, but different terminology has been used at different layers. Ie both suspend and save each refer to either or both of save to disk or suspend for live migrate, or to the relevant underlying mechanisms. So I'm not sure introducing a distinction between those two terms in libxl is really helpful. If it is to be done there should be a clear explanation of what the difference is. On IRC you said: 14:21 yanghy Diziet, currently, in libxl, suspend is used as 2 means, one is save(corrspond to libxc save), another is suspend the guest(related to suspend callback) That's rather different, I think. Or, at least, I'm not sure that I understand this distinction as you are making it. The suspend callback is part of the implementation of what at higher layers we save/suspend/restore/migration. AIUI this callback is related to pausing the guest, or manipulating its VCPUs ? Perhaps we should rename this callback ? Maybe Andrew Cooper can suggest a name ? Forgive me if I'm confused and going off in the wrong direction... The terminology used by libxc is more consistent in this area. suspend refers to quiescing the VM, so pausing qemu, making a remote_shutdown(SHUTDOWN_suspend) hypercall etc. save refers to the actions involved in actually shuffling the state of the VM, so xc_domain_save() etc. I believe that these are useful distinctions to maintain. libxl currently uses suspend to encapsulate both. ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [Draft F] Xen on ARM vITS Handling
On Thu, Jun 11, 2015 at 3:10 PM, Ian Campbell ian.campb...@citrix.com wrote: Draft F follows. Also at: http://xenbits.xen.org/people/ianc/vits/draftF.{pdf,html} ## Per-domain `struct pending_irq` for `vLPI`s Internally Xen uses a `struct pending_irq` to track the status of any pending virtual IRQ, including a virtual LPI. Upon domain creation an array of such `struct pending_irq`'s will be allocated to cover the range `8192..nr_lpis` (for the number of LPIs which the guest is configured with) and a pointer this array will be stored in the `struct domain`. The function `irq_to_pending` will be modified to lookup interupts in the LPI range in this array. . nr_lpis can be large if more devices are assigned to domain. As I was suggesting on #xenarm chat, is it ok to use RB-tree instead of array? what should be value for nr_lpis? Regards Vijay ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 20/27] tools/libxl: Infrastructure for writing a v2 stream
On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote: From: Ross Lagerwall ross.lagerw...@citrix.com Signed-off-by: Ross Lagerwall ross.lagerw...@citrix.com Signed-off-by: Andrew Cooper andrew.coop...@citrix.com CC: Ian Campbell ian.campb...@citrix.com CC: Ian Jackson ian.jack...@eu.citrix.com CC: Wei Liu wei.l...@citrix.com --- tools/libxl/Makefile |2 +- tools/libxl/libxl_internal.h | 33 +++ tools/libxl/libxl_stream_write.c | 536 ++ 3 files changed, 570 insertions(+), 1 deletion(-) create mode 100644 tools/libxl/libxl_stream_write.c diff --git a/tools/libxl/Makefile b/tools/libxl/Makefile index ca0ae3e..63e32f7 100644 --- a/tools/libxl/Makefile +++ b/tools/libxl/Makefile @@ -94,7 +94,7 @@ LIBXL_OBJS = flexarray.o libxl.o libxl_create.o libxl_dm.o libxl_pci.o \ libxl_dom.o libxl_exec.o libxl_xshelp.o libxl_device.o \ libxl_internal.o libxl_utils.o libxl_uuid.o \ libxl_json.o libxl_aoutils.o libxl_numa.o libxl_vnuma.o \ - libxl_stream_read.o \ + libxl_stream_read.o libxl_stream_write.o \ libxl_save_callout.o _libxl_save_msgs_callout.o \ libxl_convert_callout.o \ libxl_qmp.o libxl_event.o libxl_fork.o $(LIBXL_OBJS-y) diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h index 5482950..82cd792 100644 --- a/tools/libxl/libxl_internal.h +++ b/tools/libxl/libxl_internal.h @@ -2868,6 +2868,38 @@ typedef void libxl__domain_suspend_cb(libxl__egc*, typedef void libxl__save_device_model_cb(libxl__egc*, libxl__domain_suspend_state*, int rc); +/* State for writing a libxl migration v2 stream */ +typedef struct libxl__stream_write_state libxl__stream_write_state; + +struct libxl__stream_write_state { +/* filled by the user */ +libxl__ao *ao; +int fd; +uint32_t domid; +void (*completion_callback)(libxl__egc *egc, +libxl__domain_suspend_state *dss, +int rc); +/* Private */ +int rc; +int joined_rc; +size_t padding; +bool running; +libxl__datacopier_state dc; +}; + +_hidden void libxl__stream_write_start(libxl__egc *egc, + libxl__stream_write_state *stream); + +_hidden void libxl__stream_write_abort(libxl__egc *egc, + libxl__stream_write_state *stream, + int rc); + +static inline bool libxl__stream_write_inuse( +const libxl__stream_write_state *stream) +{ +return stream-running; +} + typedef struct libxl__logdirty_switch { const char *cmd; const char *cmd_path; @@ -2907,6 +2939,7 @@ struct libxl__domain_suspend_state { /* private for libxl__domain_save_device_model */ libxl__save_device_model_cb *save_dm_callback; libxl__datacopier_state save_dm_datacopier; +libxl__stream_write_state sws; }; diff --git a/tools/libxl/libxl_stream_write.c b/tools/libxl/libxl_stream_write.c new file mode 100644 index 000..856d72e --- /dev/null +++ b/tools/libxl/libxl_stream_write.c @@ -0,0 +1,536 @@ +/* + * Copyright (C) 2015 Citrix Ltd. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU Lesser General Public License as published + * by the Free Software Foundation; version 2.1 only. with the special + * exception on linking described in file LICENSE. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU Lesser General Public License for more details. + */ + +#include libxl_osdeps.h /* must come before any other headers */ + +#include libxl_internal.h + +/* + * Infrastructure for writing a domain to a libxl migration v2 stream. + * + * Entry points from outside: + * - libxl__stream_write_start() + * - Start writing a stream from the start. + * + * In normal operation, there are two tasks running at once; this stream + * processing, and the the libxl-save-helper. check_stream_finished() is used the the. + * to join all the tasks in both success and error cases. + * + * Nomenclature for event callbacks: + * - $FOO_done(): Completion callback for $FOO + * - write_$FOO(): Set up writing a $FOO Set up or actually write? + +void libxl__stream_write_start(libxl__egc *egc, + libxl__stream_write_state *stream) +{ +libxl__datacopier_state *dc = stream-dc; +STATE_AO_GC(stream-ao); +struct libxl_sr_hdr hdr = { 0 }; +int ret = 0; + +assert(!stream-running); +stream-running
Re: [Xen-devel] [PATCH 22/27] docs/libxl: [RFC] Introduce CHECKPOINT_END to support migration v2 remus streams
On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote: In a remus senario, libxc will write a CHECKPOINT record, then hand ownership scenario of the fd to libxl. Libxl then writes any records required and finishes with a CHECKPOINT_END record, then hands ownership of the fd back to libxc. Seems like a plausible scheme to me, if that's what the RFC was for. Signed-off-by: Andrew Cooper andrew.coop...@citrix.com CC: Ian Campbell ian.campb...@citrix.com CC: Ian Jackson ian.jack...@eu.citrix.com CC: Wei Liu wei.l...@citrix.com --- docs/specs/libxl-migration-stream.pandoc | 15 ++- tools/libxl/libxl_sr_stream_format.h |1 + tools/python/xen/migration/libxl.py | 11 +++ 3 files changed, 26 insertions(+), 1 deletion(-) diff --git a/docs/specs/libxl-migration-stream.pandoc b/docs/specs/libxl-migration-stream.pandoc index 7235317..d41932a 100644 --- a/docs/specs/libxl-migration-stream.pandoc +++ b/docs/specs/libxl-migration-stream.pandoc @@ -119,7 +119,9 @@ type 0x: END 0x0003: EMULATOR_CONTEXT - 0x0004 - 0x7FFF: Reserved for future _mandatory_ + 0x0004: CHECKPOINT_END + + 0x0005 - 0x7FFF: Reserved for future _mandatory_ records. 0x8000 - 0x: Reserved for future _optional_ @@ -203,3 +205,14 @@ indexIndex of this emulator for the domain, if multiple emulator_ctx Emulator context blob. + +CHECKPOINT_END +-- + +A checkpoint end record marks the end of a checkpoint in the image. + + 0 1 2 3 4 5 6 7 octet ++-+ + +The end record contains no fields; its body_length is 0. + diff --git a/tools/libxl/libxl_sr_stream_format.h b/tools/libxl/libxl_sr_stream_format.h index 487f9e2..5dfa55f 100644 --- a/tools/libxl/libxl_sr_stream_format.h +++ b/tools/libxl/libxl_sr_stream_format.h @@ -35,6 +35,7 @@ #define REC_TYPE_LIBXC_CONTEXT 0x0001U #define REC_TYPE_XENSTORE_DATA 0x0002U #define REC_TYPE_EMULATOR_CONTEXT0x0003U +#define REC_TYPE_CHECKPOINT_END 0x0004U typedef struct libxl_sr_emulator_hdr { diff --git a/tools/python/xen/migration/libxl.py b/tools/python/xen/migration/libxl.py index 4e1f4f8..415502e 100644 --- a/tools/python/xen/migration/libxl.py +++ b/tools/python/xen/migration/libxl.py @@ -36,12 +36,14 @@ REC_TYPE_end = 0x REC_TYPE_libxc_context= 0x0001 REC_TYPE_xenstore_data= 0x0002 REC_TYPE_emulator_context = 0x0003 +REC_TYPE_checkpoint_end = 0x0004 rec_type_to_str = { REC_TYPE_end : End, REC_TYPE_libxc_context: Libxc context, REC_TYPE_xenstore_data: Xenstore data, REC_TYPE_emulator_context : Emulator context, +REC_TYPE_checkpoint_end : Checkpoint end, } # emulator_context @@ -176,6 +178,13 @@ class VerifyLibxl(VerifyBase): self.info( Index %d, type %s % (emu_idx, emulator_id_to_str[emu_id])) +def verify_record_checkpoint_end(self, content): + Checkpoint end record + +if len(content) != 0: +raise RecordError(Checkpoint end record with non-zero length) + + record_verifiers = { REC_TYPE_end: VerifyLibxl.verify_record_end, @@ -185,4 +194,6 @@ record_verifiers = { VerifyLibxl.verify_record_xenstore_data, REC_TYPE_emulator_context: VerifyLibxl.verify_record_emulator_context, +REC_TYPE_checkpoint_end: +VerifyLibxl.verify_record_checkpoint_end, } ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH V4 3/7] libxl: add pvusb API [and 1 more messages]
Tuesday, June 16, 2015, 4:41:52 PM, you wrote: On 06/16/2015 02:37 PM, Ian Jackson wrote: George Dunlap writes (Re: [Xen-devel] [PATCH V4 3/7] libxl: add pvusb API): Remember that the path you gave in your previous e-mail isn't the path for the *usb device*, it's the path for the *block device*. It contains a PCI address, but it looks like it also contains part of the USB topology. Are you sure that's actually a stable interface, or does it just happen that on your hardware the discovery always happens in the same order? The block device is (in path terms) underneath the usb device, obviously. Not all of that path is relevant to identifying the USB device. On my system /sys/bus/usb/devices/2-3.3 is a link to /sys/devices/pci:00/:00:1d.7/usb2/2-3/2-3.3/. This contains the pci bus address, but it also contains the bus number, which we've just said may be unstable across reboots. You mean the 2 in `usb2' ? I think that bus number is the bus number within the controller, not globally. On the contrary: $ ls -l usb* lrwxrwxrwx 1 root root 0 Jun 15 10:14 usb1 - ../../../devices/pci:00/:00:1a.7/usb1/ lrwxrwxrwx 1 root root 0 Jun 15 10:14 usb2 - ../../../devices/pci:00/:00:1d.7/usb2/ lrwxrwxrwx 1 root root 0 Jun 15 10:14 usb3 - ../../../devices/pci:00/:00:1a.0/usb3/ lrwxrwxrwx 1 root root 0 Jun 15 10:14 usb4 - ../../../devices/pci:00/:00:1a.1/usb4/ lrwxrwxrwx 1 root root 0 Jun 15 10:14 usb5 - ../../../devices/pci:00/:00:1a.2/usb5/ lrwxrwxrwx 1 root root 0 Jun 15 10:14 usb6 - ../../../devices/pci:00/:00:1d.0/usb6/ lrwxrwxrwx 1 root root 0 Jun 15 10:14 usb7 - ../../../devices/pci:00/:00:1d.1/usb7/ lrwxrwxrwx 1 root root 0 Jun 15 10:14 usb8 - ../../../devices/pci:00/:00:1d.2/usb8/ $ ls /sys//devices/pci:00/:00:1d.7/ | grep usb usb2/ In other words, the global bus enumeration leaks its way into the device path; which means at very least the sysfs device path is potentially unstable across reboots even if you include the pci controller it's on. I suppose it might be possible to specify buspci,port -- the pci address of the root bus, and the topology from there. In theory I guess that should be stable? Yes. The whole point of paths like this is that they are stable if the physical topology doesn't change. So on my netbook /dev/disk/by-path/pci-:00:1d.7-usb-0:1:1.0-scsi-0:0:0:0-part1 always refers to the 1st MBR partition on logical device 0 on the USB storage device plugged into the USB port physically on the front left of the computer. That would be great if it were true, but I'm not convinced that the above path doesn't include a globally-enumerated bus number, which might change across reboots if it's enumerated in a different order. It may be that the format above is *not* based on the sysfs path of the device; that the '0' immediately after the USB means the first (and perhaps only) controller at this PCI address. In which case, yes, having a string like this might be a unique identifier for a particular port that would be stable across reboots. That needs some investigation. In any case, at the moment you're essentially inventing from whole cloth a new way of specifying USB devices that (as far as I know) isn't supported by any other program that uses USB. If you can't specify the device by hardware path, you can't specify it deterministically. And as you can see it _is_ supported by other programs that use USB. mount can use it! Mount doesn't know anything about USB devices; all it knows are block devices. You might as well say that 'ls' knows about USB devices because it can read files on a filesystem that resides on a USB stick. I'm talking about programs that explicitly manipulate USB devices -- in particular, those that may pass them through to a guest or seize them, like libvirt, qemu, virtualbox, c. I'm not opposed to doing something new if it really is better. But when you find that a dozen other projects before you have done things one way, you should at least take care before you decide to do something radically different, to make sure that it's because you actually have something better, and not because you've just missed something. I think the hardware path to the controller, at least, should be treated as an opaque OS-specific string. It might have a different format on BSD. If we can make an actual path that's stable across reboots, that would certainly be a good thing. But if it's not stable across reboots, it has no advantage over the bus-port[.port.port] interface. -George Would a symlink to a sysfs or device node entry also be allowed ? That way you can leverage the power of tools like udev or other device managers and script the linkage based on a multitude of device properties (f.e. serial numbers, manufacturer id, device id, etc.). Which can make it persistent across reboots or
Re: [Xen-devel] [PATCH 2/6] xen/MSI-X: drive maskall and enable bits through hypercalls
On 16.06.15 at 16:56, stefano.stabell...@eu.citrix.com wrote: On Tue, 16 Jun 2015, Jan Beulich wrote: On 16.06.15 at 16:03, stefano.stabell...@eu.citrix.com wrote: On Fri, 5 Jun 2015, Jan Beulich wrote: --- a/qemu/upstream/hw/xen/xen_pt_msi.c +++ b/qemu/upstream/hw/xen/xen_pt_msi.c @@ -301,8 +301,11 @@ static int msix_set_enable(XenPCIPassthr return -1; } -return msi_msix_enable(s, s-msix-ctrl_offset, PCI_MSIX_FLAGS_ENABLE, - enabled); Would it make sense to remove msi_msix_enable completely to avoid any further mistakes? Perhaps, yes. I think I actually had suggested so quite a while back. But I don't see myself wasting much more time on this, ehm, code. Isn't it just a matter of removing msi_msix_enable? It has another caller xen_pt_msi_set_enable(). If we went down the route of what this patch does, then MSI's enable bit should ultimately also be driven through a hypercall, and that would then be the point where the function would naturally disappear. But as said, it looks like we're intending to go a different route anyway. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel