date:20150616




On 06/16/2015 09:09 PM, Ian Campbell wrote:

On Wed, 2015-06-03 at 16:01 +0800, Yang Hongyang wrote:

move save/restore code into libxl_dom_save.c.


If this (unlike other patches in the series) is purely code motion
please indicate that this is the case.

You might also like to consider refactoring things such that all patches
are pure motion.


Yes, this is only code move, no refactoring or other thing.




.



--
Thanks,
Yang.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 01/27] tools/libxl: Fix libxl__ev_child_inuse() check for not-yet-initialised children

Andrew Cooper writes (Re: [PATCH 01/27] tools/libxl: Fix 
libxl__ev_child_inuse() check for not-yet-initialised children):
 It is possible that one bit fails before it can be calculated whether
 the second bit needs to start or not.
 
 At the moment, all bits in libxl in this area do initialisation
 immediately before use; most bits are even initialised in the function
 which starts their actions.  Some bits are initialised differently
 depending on the path taken to get to the initialisation site. 

As a rule of thumb a function libxl__initiate_foo_ which takes a
libxl__foo_state* should do this initialisation for the whole
libxl__foo_state.

I don't see why you can't do that.

Ian.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 15/27] tools/libxl: Migration v2 stream format

On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote:
 From: Ross Lagerwall ross.lagerw...@citrix.com
 
 C structures describing the Libxl migration v2 stream format

Do we think these should be internal or are they part of the library
API? I suppose it's a bit of a grey area, obviously the file format is
ABI, but its not one a user should ever interact with directly.

What I'm getting at is may most of this should be in the libxl__ rather
than libxl_ namespace (also some of it being un-namespaced would further
suggest these are strictly speaking install, as would the fact it isn't
installed...)

 Signed-off-by: Ross Lagerwall ross.lagerw...@citrix.com
 Signed-off-by: Andrew Cooper andrew.coop...@citrix.com
 CC: Ian Campbell ian.campb...@citrix.com
 CC: Ian Jackson ian.jack...@eu.citrix.com
 CC: Wei Liu wei.l...@citrix.com
 ---
  tools/libxl/libxl_sr_stream_format.h |   57 
 ++
  1 file changed, 57 insertions(+)
  create mode 100644 tools/libxl/libxl_sr_stream_format.h
 
 diff --git a/tools/libxl/libxl_sr_stream_format.h 
 b/tools/libxl/libxl_sr_stream_format.h
 new file mode 100644
 index 000..487f9e2
 --- /dev/null
 +++ b/tools/libxl/libxl_sr_stream_format.h
 @@ -0,0 +1,57 @@
 +#ifndef LIBXL_SR_STREAM_FORMAT_H
 +#define LIBXL_SR_STREAM_FORMAT_H
 +
 +/*
 + * C structures for the Migration v2 stream format.
 + * See docs/specs/libxl-migration-stream.pandoc
 + */
 +
 +#include stdint.h
 +
 +typedef struct libxl_sr_hdr
 +{
 +uint64_t ident;
 +uint32_t version;
 +uint32_t options;
 +} libxl_sr_hdr;
 +
 +#define RESTORE_STREAM_IDENT 0x4c6962786c466d74UL
 +#define RESTORE_STREAM_VERSION   0x0002U
 +
 +#define RESTORE_OPT_BIG_ENDIAN   (1  0)
 +#define RESTORE_OPT_LEGACY   (1  1)
 +
 +
 +typedef struct libxl_sr_rec_hdr
 +{
 +uint32_t type;
 +uint32_t length;
 +} libxl_sr_rec_hdr;
 +
 +/* All records must be aligned up to an 8 octet boundary */
 +#define REC_ALIGN_ORDER  3U
 +
 +#define REC_TYPE_END 0xU
 +#define REC_TYPE_LIBXC_CONTEXT   0x0001U
 +#define REC_TYPE_XENSTORE_DATA   0x0002U
 +#define REC_TYPE_EMULATOR_CONTEXT0x0003U
 +
 +typedef struct libxl_sr_emulator_hdr
 +{
 +uint32_t id;
 +uint32_t index;
 +} libxl_sr_emulator_hdr;
 +
 +#define EMULATOR_UNKNOWN 0xU
 +#define EMULATOR_QEMU_TRADITIONAL0x0001U
 +#define EMULATOR_QEMU_UPSTREAM   0x0002U
 +
 +#endif /* LIBXL_SR_STREAM_FORMAT_H */
 +
 +/*
 + * Local variables:
 + * mode: C
 + * c-basic-offset: 4
 + * indent-tabs-mode: nil
 + * End:
 + */



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 2/6] xen/MSI-X: drive maskall and enable bits through hypercalls

 On 16.06.15 at 16:03, stefano.stabell...@eu.citrix.com wrote:
 On Fri, 5 Jun 2015, Jan Beulich wrote:
 +} else if (s-msix-enabled) {
 +if (!(value  PCI_MSIX_FLAGS_ENABLE)) {
 +xen_pt_msix_disable(s);
 +s-msix-enabled = false;
 +} else if (!s-msix-maskall) {
 
 Why are you changing the state of s-msix-maskall here?
 This is the value  PCI_MSIX_FLAGS_ENABLE case, nothing to do with
 maskall, right?

We're at an else if inside an else if here. The only case left
after the if() still seen above is that value has
PCI_MSIX_FLAGS_MASKALL set.

 +s-msix-maskall = true;
 +xen_pt_msix_maskall(s, true);
 +}
  }
  
 -debug_msix_enabled_old = s-msix-enabled;
 -s-msix-enabled = !!(*val  PCI_MSIX_FLAGS_ENABLE);
  if (s-msix-enabled != debug_msix_enabled_old) {
  XEN_PT_LOG(s-dev, %s MSI-X\n,
 s-msix-enabled ? enable : disable);
  }
  
 +xen_host_pci_get_word(s-real_device, s-msix-ctrl_offset, 
 dev_value);
 
 I have to say that I don't like the asymmetry between reading and
 writing PCI config registers. If writes go via hypercalls, reads should
 go via hypercalls too.

We're not doing any cfg register write via hypercalls (not here,
and not elsewhere). What is being replaced by the patch are
write to two bits which happen to live in PCI config space. Plus,
reading directly, and doing writes via hypercall only when really
needed would still be the right thing from a performance pov.

 --- a/qemu/upstream/hw/xen/xen_pt_msi.c
 +++ b/qemu/upstream/hw/xen/xen_pt_msi.c
 @@ -301,8 +301,11 @@ static int msix_set_enable(XenPCIPassthr
  return -1;
  }
  
 -return msi_msix_enable(s, s-msix-ctrl_offset, PCI_MSIX_FLAGS_ENABLE,
 -   enabled);
 
 Would it make sense to remove msi_msix_enable completely to avoid any
 further mistakes?

Perhaps, yes. I think I actually had suggested so quite a while back.
But I don't see myself wasting much more time on this, ehm, code.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCHv2 5/5] evtchn: pad struct evtchn to 64 bytes

2015-06-16 Thread Julien Grall

On 16/06/15 14:13, David Vrabel wrote:
 On 16/06/15 13:57, Julien Grall wrote:
 On 16/06/15 12:59, Jan Beulich wrote:
 On 16.06.15 at 13:14, julien.gr...@citrix.com wrote:
 On 15/06/2015 16:48, David Vrabel wrote:
 diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
 index 44ea92d..a0ff9d2 100644
 --- a/xen/include/xen/sched.h
 +++ b/xen/include/xen/sched.h
 @@ -129,7 +129,7 @@ struct evtchn
   #endif
   } ssid;
   #endif
 -};
 +} __attribute__((aligned(64)));

 Why don't you use __cacheline_aligned?

 That would double the size on x86, for little or no benefit.

 Well, the cache line size is not necessarily 64 bytes on every
 architecture. In the case of ARM, the cache line depends on the
 processor version.

 __cacheline_aligned is the only way to ensure that the cache line is not
 shared on ARM.

 AFAIU, the goal of this patch is to avoid sharing the cache line. If
 not, the commit message is misleading because it claims that a cache
 line is always 64 bytes...
 
 We want to avoid sharing the cache line where we can do so for no
 additional memory cost.

I would expand the commit message to make clear that we may not share
the cache line. The 64 bytes (cache line) is confusing and without any
background it can be interpreted wrongly.

Regards,

-- 
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 01/27] tools/libxl: Fix libxl__ev_child_inuse() check for not-yet-initialised children

Ian Campbell writes (Re: [PATCH 01/27] tools/libxl: Fix 
libxl__ev_child_inuse() check for not-yet-initialised children):
 On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote:
  Shortly, libxl will be juggling multiple parallel operations, and will
  possibly have to take error decisions before some tasks have been set up.
 
 It would be preferable, I think, to arrange to call libxl__ev_child_init
 on all such libxl__ev_child structs either up front or certainly before
 there is any possibility of needing to unwind them.

Yes.

 Such an init would at worst correspond to exactly the place where the
 zeroed structure you refer to is zeroed.

I would welcome a patch which caused an assertion failure if -pid==0.

Ian.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 10/27] docs: Libxl migration v2 stream specification

On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote:
 Signed-off-by: Andrew Cooper andrew.coop...@citrix.com

 +EMULATOR\_CONTEXT
 +
 +
 +A context blob for a specific emulator associated with the domain.
 +
 + 0 1 2 3 4 5 6 7 octet
 ++++
 +| emulator_id| index  |
 ++++
 +| emulator_ctx|
 +...
 ++-+
 +
 +
 +FieldDescription
 + ---
 +emulator_id  0x: Unknown (In the case of a legacy stream)
 +
 + 0x0001: Qemu Traditional
 +
 + 0x0002: Qemu Upstream
 +
 + 0x0003 - 0x: Reserved for future emulators.

Would it be useful for future proofing to carve out some space for a
per-emulator version field too?

Otherwise LGTM.

One thought, it might be useful (here or elsewhere) to have an explicit
overview of the expected control flow (as in the ownership of the fd,
and/or nesting of the layers as you prefer to think about it) between
libxc, libxl and the next layer (i.e. xl).

Ian.


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] Proposal: Feature Maturity Lifecycle

2015-06-16 Thread Wei Liu

On Mon, Jun 15, 2015 at 05:34:08PM +0100, Lars Kurth wrote:
 
  On 15 Jun 2015, at 15:51, George Dunlap dunl...@umich.edu wrote:
  
  A more practical requirement for a feature with Supported status is
  that _either_:
  
  * The feature is tested automatically.
  
  * At least once during each release freeze, the feature's
maintainers produce a test report (by a deadline specified by
the release manager).  Features with no test report get
downgraded from Supported to some lower status.
  
  This seems a reasonable compromise.  Rather than forcing a maintainer
  to engage with osstest as a condition of considering the feature
  flagged Supported, we gently nudge them to automate it to save
  themselves the hassle of testing it every release. :-)
 
 Alright. I will make a note of this and amend the next version. 
 
 I was wondering whether any maintainers who will be impacted by this and the 
 Release Manager (current and past) have a view
 

The end result is probably we spam all maintainers for their components,
regardlessly whether that components are tested in OSSTest or not.
Triaging whether a component is tested by OSSTest can only be done by
hand and is tedious and time consuming.

The extra work load may be moderate if we only do it once per release
and have the corresponding tool to generate emails automatically.

Wei.

 Also, please let me know of any additional feedback on the main proposal, 
 such that I can make a note of people's views and make modifications.
 
 Best Regards
 Lars
 
 
 ___
 Xen-devel mailing list
 Xen-devel@lists.xen.org
 http://lists.xen.org/xen-devel

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v24 12/15] x86/VPMU: Handle PMU interrupts for PV(H) guests

2015-06-16 Thread Boris Ostrovsky


On 06/16/2015 03:45 AM, Jan Beulich wrote:

On 15.06.15 at 19:17, boris.ostrov...@oracle.com wrote:

+for ( i = 0; i  num_counters; i++ )
+{
+if ( (ctrl_regs[i]  CTRL_RSVD_MASK) != ctrl_rsvd[i] )
+{
+/*
+ * Not necessary to re-init context since we should never load
+ * it until guest provides valid values. But just to be safe.
+ */
+amd_vpmu_init_regs(ctxt);
+return -EINVAL;
+}
+
+if ( is_pmu_enabled(ctrl_regs[i]) )
+num_enabled++;
+}
+
+if ( num_enabled )

Looks like a boolean flag would do - the exact count doesn't seem to
be of interest here or in later patches?

So the reason why I use a counter here is because keeping track of
VPMU_RUNNING state is currently broken on AMD, I noticed it when I was
updating this patch. amd_vpmu_do_wrmsr() will reset VPMU_RUNNING if the
MSR write is disabling current counter, even though there may still be
other counters running. This may be related to HVM brokenness of AMD
counters that I mentioned a while ago where the guest, when running with
multiple counters, sometimes gets unexpected NMIs. (This goes back all
the way to to 4.1.)

I don't want to fix this in this series but I will likely need to count
number of active counters when I do (just like I do for Intel).

I can use a boolean now though since I am not dealing with this problem
here.

If another rev is needed, I'd prefer if you did so. But if we can have
this version go in (provided we get all the necessary acks), I wouldn't
insist on you doing another round just because of this.


I think there are a couple of (fairly cosmetic) changes that need to be 
done so there will be another rev.


OTOH I just tried a quick-and-dirty fix for this problem and it doesn't 
resolve it so presumably there is more to this. It's not particularly 
invasive but I think it would be rather pointless to put it in as it 
still doesn't allow multiple counters on AMD and I suspect the final fix 
will be touching the same code again.


-boris

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v2 2/6] tools/libxl: move domain suspend code into libxl_dom_suspend.c

On Wed, 2015-06-03 at 16:01 +0800, Yang Hongyang wrote:
 Move domain suspend code into a separate file libxl_dom_suspend.c.
 export an API libxl__domain_suspend() which wrappers the static

just ..which wraps the...

 function domain_suspend_callback_common() for internal use.
 
 Note that the newly added file libxl_dom_suspend.c is used for
 suspend/resume code.
 
 Signed-off-by: Yang Hongyang yan...@cn.fujitsu.com
 CC: Ian Campbell ian.campb...@citrix.com
 CC: Ian Jackson ian.jack...@eu.citrix.com
 CC: Wei Liu wei.l...@citrix.com
 CC: Andrew Cooper andrew.coop...@citrix.com
 Acked-by: Ian Campbell ian.campb...@citrix.com
 ---
  tools/libxl/Makefile|   3 +-
  tools/libxl/libxl_dom.c | 350 +---
  tools/libxl/libxl_dom_suspend.c | 381 
 
  tools/libxl/libxl_internal.h|   6 +
  4 files changed, 393 insertions(+), 347 deletions(-)
  create mode 100644 tools/libxl/libxl_dom_suspend.c
 
 diff --git a/tools/libxl/Makefile b/tools/libxl/Makefile
 index cc9c152..3f98d62 100644
 --- a/tools/libxl/Makefile
 +++ b/tools/libxl/Makefile
 @@ -95,7 +95,8 @@ LIBXL_OBJS = flexarray.o libxl.o libxl_create.o libxl_dm.o 
 libxl_pci.o \
   libxl_internal.o libxl_utils.o libxl_uuid.o \
   libxl_json.o libxl_aoutils.o libxl_numa.o libxl_vnuma.o 
 \
   libxl_save_callout.o _libxl_save_msgs_callout.o \
 - libxl_qmp.o libxl_event.o libxl_fork.o $(LIBXL_OBJS-y)
 + libxl_qmp.o libxl_event.o libxl_fork.o 
 libxl_dom_suspend.o \
 + $(LIBXL_OBJS-y)
  LIBXL_OBJS += libxl_genid.o
  LIBXL_OBJS += _libxl_types.o libxl_flask.o _libxl_types_internal.o
  
 diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
 index cce04dd..9444329 100644
 --- a/tools/libxl/libxl_dom.c
 +++ b/tools/libxl/libxl_dom.c
 @@ -1103,11 +1103,6 @@ int libxl__toolstack_restore(uint32_t domid, const 
 uint8_t *buf,
  
  /* Domain suspend (save) */
  
 -static void domain_save_done(libxl__egc *egc,
 - libxl__domain_suspend_state *dss, int rc);
 -static void domain_suspend_callback_common_done(libxl__egc *egc,
 -libxl__domain_suspend_state *dss, int ok);
 -
  /*- complicated callback, called by xc_domain_save -*/
  
  /*
 @@ -1324,35 +1319,6 @@ static void switch_logdirty_done(libxl__egc *egc,
  
  /*- callbacks, called by xc_domain_save -*/
  
 -int libxl__domain_suspend_device_model(libxl__gc *gc,
 -   libxl__domain_suspend_state *dss)
 -{
 -int ret = 0;
 -uint32_t const domid = dss-domid;
 -const char *const filename = dss-dm_savefile;
 -
 -switch (libxl__device_model_version_running(gc, domid)) {
 -case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL: {
 -LOG(DEBUG, Saving device model state to %s, filename);
 -libxl__qemu_traditional_cmd(gc, domid, save);
 -libxl__wait_for_device_model_deprecated(gc, domid, paused, NULL, 
 NULL, NULL);
 -break;
 -}
 -case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN:
 -if (libxl__qmp_stop(gc, domid))
 -return ERROR_FAIL;
 -/* Save DM state into filename */
 -ret = libxl__qmp_save(gc, domid, filename);
 -if (ret)
 -unlink(filename);
 -break;
 -default:
 -return ERROR_INVAL;
 -}
 -
 -return ret;
 -}
 -
  int libxl__domain_resume_device_model(libxl__gc *gc, uint32_t domid)
  {
  
 @@ -1373,301 +1339,6 @@ int libxl__domain_resume_device_model(libxl__gc *gc, 
 uint32_t domid)
  return 0;
  }
  
 -static void domain_suspend_common_wait_guest(libxl__egc *egc,
 - libxl__domain_suspend_state 
 *dss);
 -static void domain_suspend_common_guest_suspended(libxl__egc *egc,
 - libxl__domain_suspend_state *dss);
 -
 -static void domain_suspend_common_pvcontrol_suspending(libxl__egc *egc,
 -  libxl__xswait_state *xswa, int rc, const char *state);
 -static void domain_suspend_common_wait_guest_evtchn(libxl__egc *egc,
 -libxl__ev_evtchn *evev);
 -static void suspend_common_wait_guest_watch(libxl__egc *egc,
 -  libxl__ev_xswatch *xsw, const char *watch_path, const char 
 *event_path);
 -static void suspend_common_wait_guest_check(libxl__egc *egc,
 -libxl__domain_suspend_state *dss);
 -static void suspend_common_wait_guest_timeout(libxl__egc *egc,
 -  libxl__ev_time *ev, const struct timeval *requested_abs);
 -
 -static void domain_suspend_common_failed(libxl__egc *egc,
 - libxl__domain_suspend_state *dss);
 -static void domain_suspend_common_done(libxl__egc *egc,
 -   libxl__domain_suspend_state *dss,
 -   bool ok);
 -

Re: [Xen-devel] [PATCH V4 3/7] libxl: add pvusb API


On 06/16/2015 01:45 PM, Ian Jackson wrote:

Juergen Gross writes (Re: [Xen-devel] [PATCH V4 3/7] libxl: add pvusb API):

On 06/16/2015 01:10 PM, Ian Jackson wrote:

AIUI some devices have serial numbers, which means you can distinguish
them ?


Yes, they have. The question is whether those are different on multiple
instances? With lsusb I've found a device with serial number
0123456789ABCD. Do you really believe I'm just lucky owning the one with
such a nice serial number? ;-)


Heh.  I think, though, that this suggests that if the user has devices
whose serial numbers are actually different, they should be able to
specify a device by vid/pid/serial.


I think you are right.

The next question: should this be part of libxl or qemu? It should be
possible to extend the qemu vendorId:productId syntax to
vendorId:productId:serial without too much effort, as libusb
includes some support to obtain the serial number.


Juergen

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH V4 3/7] libxl: add pvusb API

On 06/16/2015 02:06 PM, Juergen Gross wrote:
 On 06/16/2015 01:45 PM, Ian Jackson wrote:
 Juergen Gross writes (Re: [Xen-devel] [PATCH V4 3/7] libxl: add pvusb
 API):
 On 06/16/2015 01:10 PM, Ian Jackson wrote:
 AIUI some devices have serial numbers, which means you can distinguish
 them ?

 Yes, they have. The question is whether those are different on multiple
 instances? With lsusb I've found a device with serial number
 0123456789ABCD. Do you really believe I'm just lucky owning the one with
 such a nice serial number? ;-)

 Heh.  I think, though, that this suggests that if the user has devices
 whose serial numbers are actually different, they should be able to
 specify a device by vid/pid/serial.
 
 I think you are right.
 
 The next question: should this be part of libxl or qemu? It should be
 possible to extend the qemu vendorId:productId syntax to
 vendorId:productId:serial without too much effort, as libusb
 includes some support to obtain the serial number.

It doesn't really matter what qemu or pvusb can do, as long as libxl can
convert what it has into something that they can use.  So if libxl is
given vid, pid, and serial (remember, this will be a struct, not a
string), it can look in sysfs and find the bus:addr (which is unique at
any given time) and hand it to qemu  (or the bus-port in the case of pvusb).

 -George

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [RFC] Xen PV IOMMU interface draft B

 On 12.06.15 at 18:43, malcolm.cross...@citrix.com wrote:
 IOMMUOP_query_caps
 --
 
 This subop queries the runtime capabilities of the PV-IOMMU interface for 
 the
 specific called domain. This subop uses `struct pv_iommu_op` directly.

calling domain perhaps?

 
 --
 Field  Purpose
 -  
 ---
 `flags`[out] This field details the IOMMUOP capabilities.
 
 `status`   [out] Status of this op, op specific values listed below
 
 --
 
 Defined bits for flags field:
 
 
 --
 NameBitDefinition
    -- --
 IOMMU_QUERY_map_cap  0IOMMUOP_map_page or IOMMUOP_map_foreign
   can be used for this domain

this (see also above) perhaps being the calling domain? In which
case I wonder how the for and IOMMUOP_map_foreign are
meant to fit together: I assume the flag to indicate that mapping into
the (calling) domain is possible. Which then makes me wonder - what
use if the new hypercall when this flag isn't set?

 IOMMU_QUERY_map_all_gfns 1IOMMUOP_map_page subop can map any MFN
   not used by Xen

gfns or MFN?

 Defined values for map_page subop status field:
 
 Value   Reason
 --  
 --
 0   subop successfully returned
 -EIOIOMMU unit returned error when attempting to map BFN to GFN.
 -EPERM  GFN could not be mapped because the GFN belongs to Xen.
 -EPERM  Domain is not a  domain and GFN does not belong to domain

is not a hardware domain? Also, I think we're pretty determined
for there to ever only be one, so perhaps it should be the
hardware domain here and elsewhere.

 IOMMUOP_unmap_page
 --
 This subop uses `struct unmap_page` part of the `struct pv_iommu_op`.
 
 The subop usage of the struct pv_iommu_op and ``struct unmap_page` fields
 are detailed below:
 
 
 Field  Purpose
 -  -
 `bfn`  [in] Bus address frame number to be unmapped in DOMID_SELF

Has it been determined that unmapping based on GFN is never
going to be needed, and that unmapping by BFN is the more
practical solution? The map counterpart doesn't seem to exclude
establishing multiple mappings for the same BFN, and hence the
inverse here would become kind of fuzzy in that case.

 IOMMUOP_map_foreign_page
 
 This subop uses `struct map_foreign_page` part of the `struct pv_iommu_op`.
 
 It is not valid to use domid representing the calling domain.

And what's the point of that? Was it considered to have only one
map/unmap pair, capable of mapping both local and foreign pages?
If so, what speaks against that?

 The M2B mechanism is a MFN to (BFN,domid,ioserver) tuple.

I didn't see anything explaining the significance of this (namely
the ioserver part, I think I can see the need for the domid), can
you explain the backgound here please?

 Every new M2B entry will take a reference to the MFN backing the GFN.

What happens when that GFN-MFN mapping changes?

 All the following conditions are required to be true for PV IOMMU 
 map_foreign
 subop to succeed:
 
 1. IOMMU detected and supported by Xen
 2. The domain has IOMMU controlled hardware allocated to it
 3. The domain is a hardware_domain and the following Xen IOMMU options are
NOT enabled: dom0-passthrough

Is there a way for the hardware domain to know it is running in
pass-through mode? Also, the domain is ambiguous here; I'm
sure you mean the invoking domain, not the one owning the page.

 This subop usage of the struct pv_iommu_op and ``struct map_foreign_page`
 fields are detailed below:
 
 
 Field  Purpose
 -  -
 `domid`[in] The domain ID for which the gfn field applies
 
 `ioserver` [in] IOREQ server id associated with mapping
 
 `bfn`  [in] Bus address frame number for gfn address

In the description above you speak of returning data in this field. Is
[in] really correct?

 Defined bits for flags field:
 
 Name BitDefinition
 -  --
 IOMMUOP_readable  0BFN IOMMU mapping is readable
 IOMMUOP_writeable 1BFN IOMMU mapping is writeable
 IOMMUOP_swap_mfn  2BFN IOMMU mapping can be safely

Re: [Xen-devel] [PATCH 1/6] xen/MSI-X: latch MSI-X table writes

On Fri, 5 Jun 2015, Jan Beulich wrote:
 The remaining log message in pci_msix_write() is wrong, as there guest
 behavior may only appear to be wrong: For one, the old logic didn't
 take the mask-all bit into account. And then this shouldn't depend on
 host device state (i.e. the host may have masked the entry without the
 guest having done so). Plus these writes shouldn't be dropped even when
 an entry gets unmasked. Instead, if they can't be made take effect
 right away, they should take effect on the next unmasking or enabling
 operation - the specification explicitly describes such caching
 behavior.
 
 Signed-off-by: Jan Beulich jbeul...@suse.com
 
 @@ -322,6 +323,13 @@ static int xen_pt_msix_update_one(XenPCI
  
  pirq = entry-pirq;
  
 +if (pirq == XEN_PT_UNASSIGNED_PIRQ || s-msix-maskall ||
 +(entry-latch(VECTOR_CTRL)  PCI_MSIX_ENTRY_CTRL_MASKBIT)) {

I admit I am having difficulties understanding the full purpose of these
checks. Please add a comment on them.

I guess the intention is only to make changes using the latest values,
the ones in entry-latch, when the right conditions are met, otherwise
keep using the old values. Is that right?

In that case, don't we want to use the latest values on MASKBIT -
!MASKBIT transitions? In general when unmasking? Here it looks like we
are using the latest values when maskall is set or
PCI_MSIX_ENTRY_CTRL_MASKBIT is set, that is the opposite of what we
want.


 +entry-addr = entry-latch(LOWER_ADDR) |
 +  ((uint64_t)entry-latch(UPPER_ADDR)  32);
 +entry-data = entry-latch(DATA);
 +}
 +
  rc = msi_msix_setup(s, entry-addr, entry-data, pirq, true, entry_nr,
  entry-pirq == XEN_PT_UNASSIGNED_PIRQ);
  if (rc) {
 @@ -444,39 +432,28 @@ static void pci_msix_write(void *opaque,
  offset = addr % PCI_MSIX_ENTRY_SIZE;
  
  if (offset != PCI_MSIX_ENTRY_VECTOR_CTRL) {
 -const volatile uint32_t *vec_ctrl;
 -
  if (get_entry_value(entry, offset) == val
   entry-pirq != XEN_PT_UNASSIGNED_PIRQ) {
  return;
  }
  
 +entry-updated = true;
 +} else if (msix-enabled  entry-updated 
 +   !(val  PCI_MSIX_ENTRY_CTRL_MASKBIT)) {
 +const volatile uint32_t *vec_ctrl;
 +
  /*
   * If Xen intercepts the mask bit access, entry-vec_ctrl may not be
   * up-to-date. Read from hardware directly.
   */
  vec_ctrl = s-msix-phys_iomem_base + entry_nr * PCI_MSIX_ENTRY_SIZE
  + PCI_MSIX_ENTRY_VECTOR_CTRL;
 +set_entry_value(entry, offset, *vec_ctrl);

Why are you calling set_entry_value with the hardware vec_ctrl value? It
doesn't look correct to me.  In any case, if you wanted to do it,
shouldn't you just set/unset PCI_MSIX_ENTRY_CTRL_MASKBIT instead of the
whole *vec_ctrl?


 -if (msix-enabled  !(*vec_ctrl  PCI_MSIX_ENTRY_CTRL_MASKBIT)) {
 -if (!entry-warned) {
 -entry-warned = true;
 -XEN_PT_ERR(s-dev, Can't update msix entry %d since MSI-X 
 is
 -already enabled.\n, entry_nr);
 -}
 -return;
 -}
 -
 -entry-updated = true;
 +xen_pt_msix_update_one(s, entry_nr);

Shouldn't we call xen_pt_msix_update_one only if (*vec_ctrl 
PCI_MSIX_ENTRY_CTRL_MASKBIT)? In other words, only when we see a
MASKBIT - !MASKBIT transition?


  }
  
  set_entry_value(entry, offset, val);
 -
 -if (offset == PCI_MSIX_ENTRY_VECTOR_CTRL) {
 -if (msix-enabled  !(val  PCI_MSIX_ENTRY_CTRL_MASKBIT)) {
 -xen_pt_msix_update_one(s, entry_nr);
 -}
 -}
  }
  
  static uint64_t pci_msix_read(void *opaque, hwaddr addr,

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v2 4/6] tools/libxl: move remus code into libxl_remus.c




On 06/16/2015 09:08 PM, Ian Campbell wrote:

On Wed, 2015-06-03 at 16:01 +0800, Yang Hongyang wrote:

move remus code into libxl_remus.c.


Please say something like ... by refactoring bits of
libxl_domain_remus_start and domain_save_done into X and Y and moving
the remaining functionality unchanged into the new file.

I gave two examples of functions which changed there, but please make
sure the list is complete+accurate.


Okay, will fix the commit message next time, thank you!




Signed-off-by: Yang Hongyang yan...@cn.fujitsu.com
CC: Ian Campbell ian.campb...@citrix.com
CC: Ian Jackson ian.jack...@eu.citrix.com
CC: Wei Liu wei.l...@citrix.com
---
  tools/libxl/Makefile |   2 +-
  tools/libxl/libxl.c  |  55 +---
  tools/libxl/libxl_dom.c  | 206 +
  tools/libxl/libxl_internal.h |  11 ++
  tools/libxl/libxl_remus.c| 304 +++
  5 files changed, 318 insertions(+), 260 deletions(-)
  create mode 100644 tools/libxl/libxl_remus.c

diff --git a/tools/libxl/Makefile b/tools/libxl/Makefile
index 3f98d62..8535eaa 100644
--- a/tools/libxl/Makefile
+++ b/tools/libxl/Makefile
@@ -56,7 +56,7 @@ else
  LIBXL_OBJS-y += libxl_nonetbuffer.o
  endif

-LIBXL_OBJS-y += libxl_remus_device.o libxl_remus_disk_drbd.o
+LIBXL_OBJS-y += libxl_remus.o libxl_remus_device.o libxl_remus_disk_drbd.o

  LIBXL_OBJS-$(CONFIG_X86) += libxl_cpuid.o libxl_x86.o libxl_psr.o
  LIBXL_OBJS-$(CONFIG_ARM) += libxl_nocpuid.o libxl_arm.o libxl_libfdt_compat.o
diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index 77c6a36..0f9248e 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -792,10 +792,6 @@ out:
  return ptr;
  }

-static void libxl__remus_setup_done(libxl__egc *egc,
-libxl__remus_devices_state *rds, int rc);
-static void libxl__remus_setup_failed(libxl__egc *egc,
-  libxl__remus_devices_state *rds, int rc);
  static void remus_failover_cb(libxl__egc *egc,
libxl__domain_suspend_state *dss, int rc);

@@ -844,63 +840,14 @@ int libxl_domain_remus_start(libxl_ctx *ctx, 
libxl_domain_remus_info *info,

  assert(info);

-/* Convenience aliases */
-libxl__remus_devices_state *const rds = dss-rds;
-
-if (libxl_defbool_val(info-netbuf)) {
-if (!libxl__netbuffer_enabled(gc)) {
-LOG(ERROR, Remus: No support for network buffering);
-rc = ERROR_FAIL;
-goto out;
-}
-rds-device_kind_flags |= (1  LIBXL__DEVICE_KIND_VIF);
-}
-
-if (libxl_defbool_val(info-diskbuf))
-rds-device_kind_flags |= (1  LIBXL__DEVICE_KIND_VBD);
-
-rds-ao = ao;
-rds-domid = domid;
-rds-callback = libxl__remus_setup_done;
-
  /* Point of no return */
-libxl__remus_devices_setup(egc, rds);
+libxl__remus_setup(egc, dss);
  return AO_INPROGRESS;

   out:
  return AO_ABORT(rc);
  }

-static void libxl__remus_setup_done(libxl__egc *egc,
-libxl__remus_devices_state *rds, int rc)
-{
-libxl__domain_suspend_state *dss = CONTAINER_OF(rds, *dss, rds);
-STATE_AO_GC(dss-ao);
-
-if (!rc) {
-libxl__domain_save(egc, dss);
-return;
-}
-
-LOG(ERROR, Remus: failed to setup device for guest with domid %u, rc %d,
-dss-domid, rc);
-rds-callback = libxl__remus_setup_failed;
-libxl__remus_devices_teardown(egc, rds);
-}
-
-static void libxl__remus_setup_failed(libxl__egc *egc,
-  libxl__remus_devices_state *rds, int rc)
-{
-libxl__domain_suspend_state *dss = CONTAINER_OF(rds, *dss, rds);
-STATE_AO_GC(dss-ao);
-
-if (rc)
-LOG(ERROR, Remus: failed to teardown device after setup failed
- for guest with domid %u, rc %d, dss-domid, rc);
-
-dss-callback(egc, dss, rc);
-}
-
  static void remus_failover_cb(libxl__egc *egc,
libxl__domain_suspend_state *dss, int rc)
  {
diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index 701e9f7..0f81081 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -1409,189 +1409,6 @@ int libxl__toolstack_save(uint32_t domid, uint8_t **buf,
  return 0;
  }

-/*- remus callbacks -*/
-static void remus_domain_suspend_callback_common_done(libxl__egc *egc,
-libxl__domain_suspend_state *dss, int ok);
-static void remus_devices_postsuspend_cb(libxl__egc *egc,
- libxl__remus_devices_state *rds,
- int rc);
-static void remus_devices_preresume_cb(libxl__egc *egc,
-   libxl__remus_devices_state *rds,
-   int rc);
-
-static void libxl__remus_domain_suspend_callback(void *data)
-{
-libxl__save_helper_state *shs = data;
-libxl__egc *egc = shs-egc;

Re: [Xen-devel] [PATCH 09/27] tools/libxl: Pass restore_fd as a parameter to libxl__xc_domain_restore()

On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote:
 If a conversion of a legacy stream is needed, libxl__xc_domain_restore() will
 need to use an fd other to the one found in the domain_create_state.
 
 No functional change.

It could be argued that the one in domain_create_state should always be
the correct one. If that means that it differs from the one ultimately
passed in by the user then the original ought to be saved in the state
associated with the filter as its input and the filter's output plumbed
into domain_create_state instead.

I'll reserve judgement until I see a bit more of how this pans out
through the series.

 @@ -41,13 +41,13 @@ static void helper_exited(libxl__egc *egc, 
 libxl__ev_child *ch,
  /*- entrypoints -*/
  
  void libxl__xc_domain_restore(libxl__egc *egc, libxl__domain_create_state 
 *dcs,
 +  int restore_fd,

Assuming we go this way: can it remain const?


int hvm, int pae, int superpages)
  {
  STATE_AO_GC(dcs-ao);
  
  /* Convenience aliases */
  const uint32_t domid = dcs-guest_domid;
 -const int restore_fd = dcs-restore_fd;
  libxl__domain_build_state *const state = dcs-build_state;
  
  unsigned cbflags = libxl__srm_callout_enumcallbacks_restore



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 07/12] x86/virt/guest/xen: Remove use of pgd_list from the Xen guest code

2015-06-16 Thread Boris Ostrovsky


On 06/16/2015 10:15 AM, David Vrabel wrote:

On 15/06/15 21:35, Ingo Molnar wrote:

* David Vrabel david.vra...@citrix.com wrote:


On 15/06/15 10:05, Ian Campbell wrote:

On Sat, 2015-06-13 at 11:49 +0200, Ingo Molnar wrote:

xen_mm_pin_all()/unpin_all() are used to implement full guest instance
suspend/restore. It's a stop-all method that needs to iterate through all
allocated pgds in the system to fix them up for Xen's use.

This code uses pgd_list, probably because it was an easy interface.

But we want to remove the pgd_list, so convert the code over to walk all
tasks in the system. This is an equivalent method.

It is not equivalent because pgd_alloc() now populates entries in pgds that are
not visible to xen_mm_pin_all() (note how the original code adds the pgd to the
pgd_list in pgd_ctor() before calling pgd_prepopulate_pmd()).  These newly
allocated page tables won't be correctly converted on suspend/resume and the new
process will die after resume.

So how should the Xen logic be fixed for the new scheme? I can't say I can see
through the paravirt complexity here.

Actually, since we freeze_processes() before trying to pin page tables,
I think it should be ok as-is.

I'll put the patch through some tests.


Actually, I just ran this through a couple of boot/suspend/resume tests 
and didn't see any issues (with the one fix I mentioned to Ingo 
earlier). On unstable Xen only.


-boris



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCHv2 5/5] evtchn: pad struct evtchn to 64 bytes

 On 16.06.15 at 13:14, julien.gr...@citrix.com wrote:
 On 15/06/2015 16:48, David Vrabel wrote:
 diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
 index 44ea92d..a0ff9d2 100644
 --- a/xen/include/xen/sched.h
 +++ b/xen/include/xen/sched.h
 @@ -129,7 +129,7 @@ struct evtchn
   #endif
   } ssid;
   #endif
 -};
 +} __attribute__((aligned(64)));
 
 Why don't you use __cacheline_aligned?

That would double the size on x86, for little or no benefit.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH V4 3/7] libxl: add pvusb API

On 06/16/2015 02:23 PM, Juergen Gross wrote:
 On 06/16/2015 03:09 PM, George Dunlap wrote:
 On 06/16/2015 02:06 PM, Juergen Gross wrote:
 On 06/16/2015 01:45 PM, Ian Jackson wrote:
 Juergen Gross writes (Re: [Xen-devel] [PATCH V4 3/7] libxl: add pvusb
 API):
 On 06/16/2015 01:10 PM, Ian Jackson wrote:
 AIUI some devices have serial numbers, which means you can
 distinguish
 them ?

 Yes, they have. The question is whether those are different on
 multiple
 instances? With lsusb I've found a device with serial number
 0123456789ABCD. Do you really believe I'm just lucky owning the one
 with
 such a nice serial number? ;-)

 Heh.  I think, though, that this suggests that if the user has devices
 whose serial numbers are actually different, they should be able to
 specify a device by vid/pid/serial.

 I think you are right.

 The next question: should this be part of libxl or qemu? It should be
 possible to extend the qemu vendorId:productId syntax to
 vendorId:productId:serial without too much effort, as libusb
 includes some support to obtain the serial number.

 It doesn't really matter what qemu or pvusb can do, as long as libxl can
 convert what it has into something that they can use.  So if libxl is
 given vid, pid, and serial (remember, this will be a struct, not a
 string), it can look in sysfs and find the bus:addr (which is unique at
 any given time) and hand it to qemu  (or the bus-port in the case of
 pvusb).
 
 Hmm, I'd rather have it all in one place. Putting it in qemu would
 enable us to handle hotplug as well. A USB device assigned via it's
 serial to a domU could be automatically passed through by qemu when
 showing up. This isn't possible today, but we wouldn't have to change
 libxl again for supporting it, only qemu would have to be adapted.

Look, we're talking here about the libxl interface.  Ian thinks that *at
the libxl layer* we need to be able to specify all these things.  It
doesn't matter at this point whether qemu can also do it or not.

In the future, if we actually implement the automatically grab this
device when it shows up functionality, then yes, having it in qemu
rather than having a libxl daemon sit around and watch for those things,
might be handy.  But we can cross that bridge when we come to it.

 -George



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 1/6] xen/MSI-X: latch MSI-X table writes

 On 16.06.15 at 15:35, stefano.stabell...@eu.citrix.com wrote:
 On Fri, 5 Jun 2015, Jan Beulich wrote:
 @@ -322,6 +323,13 @@ static int xen_pt_msix_update_one(XenPCI
  
  pirq = entry-pirq;
  
 +if (pirq == XEN_PT_UNASSIGNED_PIRQ || s-msix-maskall ||
 +(entry-latch(VECTOR_CTRL)  PCI_MSIX_ENTRY_CTRL_MASKBIT)) {
 
 I admit I am having difficulties understanding the full purpose of these
 checks. Please add a comment on them.

The comment would (pointlessly imo) re-state what the code already
says:

 I guess the intention is only to make changes using the latest values,
 the ones in entry-latch, when the right conditions are met, otherwise
 keep using the old values. Is that right?
 
 In that case, don't we want to use the latest values on MASKBIT -
 !MASKBIT transitions? In general when unmasking?

This is what we want. And with that, the questions you ask further
down should be answered too: The function gets invoked with the
pre-change mask flag state in -latch[], and updates the values
used for actually setting up when that one has the entry masked
(or mask-all is set). The actual new value gets written to -latch[]
after the call.

 @@ -444,39 +432,28 @@ static void pci_msix_write(void *opaque,
  offset = addr % PCI_MSIX_ENTRY_SIZE;
  
  if (offset != PCI_MSIX_ENTRY_VECTOR_CTRL) {
 -const volatile uint32_t *vec_ctrl;
 -
  if (get_entry_value(entry, offset) == val
   entry-pirq != XEN_PT_UNASSIGNED_PIRQ) {
  return;
  }
  
 +entry-updated = true;
 +} else if (msix-enabled  entry-updated 
 +   !(val  PCI_MSIX_ENTRY_CTRL_MASKBIT)) {
 +const volatile uint32_t *vec_ctrl;
 +
  /*
   * If Xen intercepts the mask bit access, entry-vec_ctrl may not be
   * up-to-date. Read from hardware directly.
   */
  vec_ctrl = s-msix-phys_iomem_base + entry_nr * PCI_MSIX_ENTRY_SIZE
  + PCI_MSIX_ENTRY_VECTOR_CTRL;
 +set_entry_value(entry, offset, *vec_ctrl);
 
 Why are you calling set_entry_value with the hardware vec_ctrl value? It
 doesn't look correct to me.  In any case, if you wanted to do it,
 shouldn't you just set/unset PCI_MSIX_ENTRY_CTRL_MASKBIT instead of the
 whole *vec_ctrl?

The comment above the code explains it: What we have stored locally
may not reflect reality, as we may not have seen all writes (and this
indeed isn't just a may). And if out cached value isn't valid anymore,
why would we not want to update all of it, rather than just the mask
bit?

 -if (msix-enabled  !(*vec_ctrl  PCI_MSIX_ENTRY_CTRL_MASKBIT)) {
 -if (!entry-warned) {
 -entry-warned = true;
 -XEN_PT_ERR(s-dev, Can't update msix entry %d since MSI-X 
 is
 -already enabled.\n, entry_nr);
 -}
 -return;
 -}
 -
 -entry-updated = true;
 +xen_pt_msix_update_one(s, entry_nr);
 
 Shouldn't we call xen_pt_msix_update_one only if (*vec_ctrl 
 PCI_MSIX_ENTRY_CTRL_MASKBIT)? In other words, only when we see a
 MASKBIT - !MASKBIT transition?

The combination of the !(val  PCI_MSIX_ENTRY_CTRL_MASKBIT)
check in the if() surrounding this call and the
(entry-latch(VECTOR_CTRL)  PCI_MSIX_ENTRY_CTRL_MASKBIT)
check inside the function guarantee just that (i.e. the function
invocation is benign in the other case, as entry-addr/entry-data
would remain unchanged).

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 2/6] xen/MSI-X: drive maskall and enable bits through hypercalls

On Fri, 5 Jun 2015, Jan Beulich wrote:
 Particularly the maskall bit has to be under exclusive hypervisor
 control (and since they live in the same config space field, the
 enable bit has to follow suit). Use the replacement hypercall
 interfaces.

From this commit message, I expect a patch that simply substitutes
maskall and enable writings with hypercalls and nothing else.


 Signed-off-by: Jan Beulich jbeul...@suse.com
 
 --- a/qemu/upstream/hw/xen/xen_pt.h
 +++ b/qemu/upstream/hw/xen/xen_pt.h
 @@ -181,6 +181,7 @@ typedef struct XenPTMSIXEntry {
  typedef struct XenPTMSIX {
  uint32_t ctrl_offset;
  bool enabled;
 +bool maskall;
  int total_entries;
  int bar_index;
  uint64_t table_base;
 @@ -293,7 +294,9 @@ int xen_pt_msix_init(XenPCIPassthroughSt
  void xen_pt_msix_delete(XenPCIPassthroughState *s);
  int xen_pt_msix_update(XenPCIPassthroughState *s);
  int xen_pt_msix_update_remap(XenPCIPassthroughState *s, int bar_index);
 +void xen_pt_msix_enable(XenPCIPassthroughState *s);
  void xen_pt_msix_disable(XenPCIPassthroughState *s);
 +int xen_pt_msix_maskall(XenPCIPassthroughState *s, bool mask);
  
  static inline bool xen_pt_has_msix_mapping(XenPCIPassthroughState *s, int 
 bar)
  {
 --- a/qemu/upstream/hw/xen/xen_pt_config_init.c
 +++ b/qemu/upstream/hw/xen/xen_pt_config_init.c
 @@ -1436,32 +1436,58 @@ static int xen_pt_msixctrl_reg_write(Xen
   uint16_t dev_value, uint16_t valid_mask)
  {
  XenPTRegInfo *reg = cfg_entry-reg;
 -uint16_t writable_mask = 0;
 +uint16_t writable_mask, value;
  uint16_t throughable_mask = get_throughable_mask(s, reg, valid_mask);
  int debug_msix_enabled_old;
  
  /* modify emulate register */
  writable_mask = reg-emu_mask  ~reg-ro_mask  valid_mask;
 -cfg_entry-data = XEN_PT_MERGE_VALUE(*val, cfg_entry-data, 
 writable_mask);
 +value = XEN_PT_MERGE_VALUE(*val, cfg_entry-data, writable_mask);
 +cfg_entry-data = value;

  /* create value for writing to I/O device register */
  *val = XEN_PT_MERGE_VALUE(*val, dev_value, throughable_mask);
  
 +debug_msix_enabled_old = s-msix-enabled;
 +
  /* update MSI-X */
 -if ((*val  PCI_MSIX_FLAGS_ENABLE)

Please explain in the commit message the reason of the *val/value
change.


 - !(*val  PCI_MSIX_FLAGS_MASKALL)) {
 +if ((value  PCI_MSIX_FLAGS_ENABLE)
 + !(value  PCI_MSIX_FLAGS_MASKALL)) {
 +if (!s-msix-enabled) {
 +if (!s-msix-maskall) {
 +xen_pt_msix_maskall(s, true);
 +}
 +xen_pt_msix_enable(s);
 +}
  xen_pt_msix_update(s);
 -} else if (!(*val  PCI_MSIX_FLAGS_ENABLE)  s-msix-enabled) {
 -xen_pt_msix_disable(s);
 +s-msix-enabled = true;
 +s-msix-maskall = false;
 +xen_pt_msix_maskall(s, false);

Please explain in the commit message the reason behind the
maskall-enable-update-un_maskall logic, that wasn't present before.


 +} else if (s-msix-enabled) {
 +if (!(value  PCI_MSIX_FLAGS_ENABLE)) {
 +xen_pt_msix_disable(s);
 +s-msix-enabled = false;
 +} else if (!s-msix-maskall) {

Why are you changing the state of s-msix-maskall here?
This is the value  PCI_MSIX_FLAGS_ENABLE case, nothing to do with
maskall, right?


 +s-msix-maskall = true;
 +xen_pt_msix_maskall(s, true);
 +}
  }
  
 -debug_msix_enabled_old = s-msix-enabled;
 -s-msix-enabled = !!(*val  PCI_MSIX_FLAGS_ENABLE);
  if (s-msix-enabled != debug_msix_enabled_old) {
  XEN_PT_LOG(s-dev, %s MSI-X\n,
 s-msix-enabled ? enable : disable);
  }
  
 +xen_host_pci_get_word(s-real_device, s-msix-ctrl_offset, dev_value);

I have to say that I don't like the asymmetry between reading and
writing PCI config registers. If writes go via hypercalls, reads should
go via hypercalls too.


 +if (s-msix-enabled  !(dev_value  PCI_MSIX_FLAGS_ENABLE)) {
 +XEN_PT_ERR(s-dev, MSI-X unexpectedly disabled\n);
 +} else if ((dev_value  PCI_MSIX_FLAGS_ENABLE) 
 +   s-msix-maskall 
 +   !(dev_value  PCI_MSIX_FLAGS_MASKALL)) {
 +XEN_PT_ERR(s-dev, MSI-X unexpectedly unmasked\n);
 +}
 +
  return 0;
  }
  
 @@ -1483,9 +1509,12 @@ static XenPTRegInfo xen_pt_emu_reg_msix[
  .offset = PCI_MSI_FLAGS,
  .size   = 2,
  .init_val   = 0x,
 -.res_mask   = 0x3800,
 -.ro_mask= 0x07FF,
 -.emu_mask   = 0x,
 +/* This must not be split into res_mask (0x3800) and ro_mask (0x07FF)
 + * because even in permissive mode there must not be any write back
 + * to this register.
 + */
 +.ro_mask= 0x3FFF,
 +.emu_mask   = 0xC000,
  .init   = xen_pt_msixctrl_reg_init,
  .u.w.read   = xen_pt_word_reg_read,
  .u.w.write  = xen_pt_msixctrl_reg_write,
 ---

Re: [Xen-devel] [PATCH V4 3/7] libxl: add pvusb API

George Dunlap writes (Re: [Xen-devel] [PATCH V4 3/7] libxl: add pvusb API):
 So it sounds like we're converging on Allow multiple ways to specify
 the interface, with at least the following fields:
 - bus (int - 1,2,3, c)
 - port (string - 2.1.3, c)
 - address/devnum (int)
 - vendorid (uint16_t)
 - deviceid (uint16_t)

You're missing the full device path from that list.  Typically that
will include at least one pci bus address.

Ian.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH V4 3/7] libxl: add pvusb API


On 06/16/2015 03:29 PM, George Dunlap wrote:

On 06/16/2015 02:23 PM, Juergen Gross wrote:

On 06/16/2015 03:09 PM, George Dunlap wrote:

On 06/16/2015 02:06 PM, Juergen Gross wrote:

On 06/16/2015 01:45 PM, Ian Jackson wrote:

Juergen Gross writes (Re: [Xen-devel] [PATCH V4 3/7] libxl: add pvusb
API):

On 06/16/2015 01:10 PM, Ian Jackson wrote:

AIUI some devices have serial numbers, which means you can
distinguish
them ?


Yes, they have. The question is whether those are different on
multiple
instances? With lsusb I've found a device with serial number
0123456789ABCD. Do you really believe I'm just lucky owning the one
with
such a nice serial number? ;-)


Heh.  I think, though, that this suggests that if the user has devices
whose serial numbers are actually different, they should be able to
specify a device by vid/pid/serial.


I think you are right.

The next question: should this be part of libxl or qemu? It should be
possible to extend the qemu vendorId:productId syntax to
vendorId:productId:serial without too much effort, as libusb
includes some support to obtain the serial number.


It doesn't really matter what qemu or pvusb can do, as long as libxl can
convert what it has into something that they can use.  So if libxl is
given vid, pid, and serial (remember, this will be a struct, not a
string), it can look in sysfs and find the bus:addr (which is unique at
any given time) and hand it to qemu  (or the bus-port in the case of
pvusb).


Hmm, I'd rather have it all in one place. Putting it in qemu would
enable us to handle hotplug as well. A USB device assigned via it's
serial to a domU could be automatically passed through by qemu when
showing up. This isn't possible today, but we wouldn't have to change
libxl again for supporting it, only qemu would have to be adapted.


Look, we're talking here about the libxl interface.  Ian thinks that *at
the libxl layer* we need to be able to specify all these things.  It
doesn't matter at this point whether qemu can also do it or not.


When I'm adding new functionality (and this is the case here) I always
try to add it at the level where it should be. qemu already is capable
of finding a USB device by various means and can be extended easily to
support our needs. And please remember, for writing the pvusb backend
I'm already doing changes to qemu.

So why should we add the same functionality to libxl by reading sysfs
instead of letting it do qemu via libusb? And what about BSD? Letting
qemu find the device would avoid two variants in libxl.


In the future, if we actually implement the automatically grab this
device when it shows up functionality, then yes, having it in qemu
rather than having a libxl daemon sit around and watch for those things,
might be handy.  But we can cross that bridge when we come to it.


I would agree if the efforts in libxl would be much smaller than in
qemu. But if the efforts are comparable I'd rather do it in the
component which will make such an enhancement easier.

Just my $0.02


Juergen


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 08/27] tools/libxl: Extra APIs for the save helper

On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote:
 With libxl migration v2, there will be other moving parts which might fail,
 requiring the helper to be stopped for reasons which are not its fault.
 
 Signed-off-by: Andrew Cooper andrew.coop...@citrix.com
 CC: Ian Campbell ian.campb...@citrix.com
 CC: Ian Jackson ian.jack...@eu.citrix.com
 CC: Wei Liu wei.l...@citrix.com
 ---
  tools/libxl/libxl_internal.h |8 
  tools/libxl/libxl_save_callout.c |   16 
  2 files changed, 24 insertions(+)
 
 diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
 index 4f204f9..3fcc37a 100644
 --- a/tools/libxl/libxl_internal.h
 +++ b/tools/libxl/libxl_internal.h
 @@ -3182,6 +3182,14 @@ _hidden void libxl__xc_domain_restore(libxl__egc *egc,
  _hidden void libxl__xc_domain_restore_done(libxl__egc *egc, void *dcs_void,
 int rc, int retval, int errnoval);
  
 +_hidden void libxl__save_helper_abort(libxl__egc *egc,
 +  libxl__save_helper_state *shs);
 +
 +static inline bool libxl__save_helper_inuse(const libxl__save_helper_state 
 *shs)
 +{
 +return libxl__ev_child_inuse(shs-child);
 +}

Will this be used other than in libxl__save_helper_abort?

 +
  /* Each time the dm needs to be saved, we must call suspend and then save */
  _hidden int libxl__domain_suspend_device_model(libxl__gc *gc,
 libxl__domain_suspend_state *dss);
 diff --git a/tools/libxl/libxl_save_callout.c 
 b/tools/libxl/libxl_save_callout.c
 index 231de2f..71de297 100644
 --- a/tools/libxl/libxl_save_callout.c
 +++ b/tools/libxl/libxl_save_callout.c
 @@ -256,6 +256,22 @@ static void helper_failed(libxl__egc *egc, 
 libxl__save_helper_state *shs,
  libxl__kill(gc, shs-child.pid, SIGKILL, save/restore helper);
  }
  
 +void libxl__save_helper_abort(libxl__egc *egc,
 +  libxl__save_helper_state *shs)
 +{
 +STATE_AO_GC(shs-ao);
 +
 +if (!libxl__ev_child_inuse(shs-child)) {
 +helper_failed(egc, shs, ERROR_FAIL);

Did you mean helper_done here?

helper_failed deregisters the fd, and potentially sends SIGKILL,
although in this case it won't because its not inuse.

So all this is doing is deregistering the fd, which AFAICT isn't
registered if there is no pid.

In fact, apart from the specific signal used, this function looks a lot
like helper_failed.

 +return;
 +}
 +
 +if (!shs-rc)
 +shs-rc = ERROR_FAIL;
 +
 +libxl__kill(gc, shs-child.pid, SIGTERM, save/restore helper);
 +}
 +
  static void helper_stdout_readable(libxl__egc *egc, libxl__ev_fd *ev,
 int fd, short events, short revents)
  {



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCHv2 5/5] evtchn: pad struct evtchn to 64 bytes

2015-06-16 Thread Julien Grall

On 16/06/15 12:59, Jan Beulich wrote:
 On 16.06.15 at 13:14, julien.gr...@citrix.com wrote:
 On 15/06/2015 16:48, David Vrabel wrote:
 diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
 index 44ea92d..a0ff9d2 100644
 --- a/xen/include/xen/sched.h
 +++ b/xen/include/xen/sched.h
 @@ -129,7 +129,7 @@ struct evtchn
   #endif
   } ssid;
   #endif
 -};
 +} __attribute__((aligned(64)));

 Why don't you use __cacheline_aligned?
 
 That would double the size on x86, for little or no benefit.

Well, the cache line size is not necessarily 64 bytes on every
architecture. In the case of ARM, the cache line depends on the
processor version.

__cacheline_aligned is the only way to ensure that the cache line is not
shared on ARM.

AFAIU, the goal of this patch is to avoid sharing the cache line. If
not, the commit message is misleading because it claims that a cache
line is always 64 bytes...

Regards,

-- 
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCHv2 5/5] evtchn: pad struct evtchn to 64 bytes

On 16/06/15 13:57, Julien Grall wrote:
 On 16/06/15 12:59, Jan Beulich wrote:
 On 16.06.15 at 13:14, julien.gr...@citrix.com wrote:
 On 15/06/2015 16:48, David Vrabel wrote:
 diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
 index 44ea92d..a0ff9d2 100644
 --- a/xen/include/xen/sched.h
 +++ b/xen/include/xen/sched.h
 @@ -129,7 +129,7 @@ struct evtchn
   #endif
   } ssid;
   #endif
 -};
 +} __attribute__((aligned(64)));

 Why don't you use __cacheline_aligned?

 That would double the size on x86, for little or no benefit.
 
 Well, the cache line size is not necessarily 64 bytes on every
 architecture. In the case of ARM, the cache line depends on the
 processor version.
 
 __cacheline_aligned is the only way to ensure that the cache line is not
 shared on ARM.
 
 AFAIU, the goal of this patch is to avoid sharing the cache line. If
 not, the commit message is misleading because it claims that a cache
 line is always 64 bytes...

We want to avoid sharing the cache line where we can do so for no
additional memory cost.

David

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH V4 3/7] libxl: add pvusb API


On 06/16/2015 03:19 PM, George Dunlap wrote:

On Tue, Jun 16, 2015 at 1:02 PM, Ian Jackson ian.jack...@eu.citrix.com wrote:

George Dunlap writes (Re: [Xen-devel] [PATCH V4 3/7] libxl: add pvusb API):

So it sounds like we're converging on Allow multiple ways to specify
the interface, with at least the following fields:
- bus (int - 1,2,3, c)
- port (string - 2.1.3, c)
- address/devnum (int)
- vendorid (uint16_t)
- deviceid (uint16_t)


You're missing the full device path from that list.  Typically that
will include at least one pci bus address.


I'm not sure what constitutes a full device path.  Is that defined somewhere?

Remember that the path you gave in your previous e-mail isn't the path
for the *usb device*, it's the path for the *block device*.  It
contains a PCI address, but it looks like it also contains part of the
USB topology.  Are you sure that's actually a stable interface, or
does it just happen that on your hardware the discovery always happens
in the same order?

On my system /sys/bus/usb/devices/2-3.3 is a link to
/sys/devices/pci:00/:00:1d.7/usb2/2-3/2-3.3/.  This contains
the pci bus address, but it also contains the bus number, which we've
just said may be unstable across reboots.

I suppose it might be possible to specify buspci,port -- the pci
address of the root bus, and the topology from there.  In theory I
guess that should be stable?


Hmm, perhaps. On my system I've got:

/sys/devices/pci:00/:00:14.0/usb3/
/sys/devices/pci:00/:00:14.0/usb4/

So two busses on one pci bus address. Are usb3 and usb4 always in this
order or are they sometimes just numbered the other way round?


Juergen

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 01/27] tools/libxl: Fix libxl__ev_child_inuse() check for not-yet-initialised children

On 16/06/15 14:21, Ian Campbell wrote:
 On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote:
 Shortly, libxl will be juggling multiple parallel operations, and will
 possibly have to take error decisions before some tasks have been set up.
 It would be preferable, I think, to arrange to call libxl__ev_child_init
 on all such libxl__ev_child structs either up front or certainly before
 there is any possibility of needing to unwind them.

 Such an init would at worst correspond to exactly the place where the
 zeroed structure you refer to is zeroed.

It is possible that one bit fails before it can be calculated whether
the second bit needs to start or not.

At the moment, all bits in libxl in this area do initialisation
immediately before use; most bits are even initialised in the function
which starts their actions.  Some bits are initialised differently
depending on the path taken to get to the initialisation site. 

It would be non-trivial to initialise everything appropriately at the
very start.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 03/27] tools/libxl: Stash all restore parameters in domain_create_state

On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote:
 Shortly more parameters will appear, and this saves unboxing each one.
 
 No functional change.
 
 Signed-off-by: Andrew Cooper andrew.coop...@citrix.com
 CC: Ian Campbell ian.campb...@citrix.com
 CC: Ian Jackson ian.jack...@eu.citrix.com
 CC: Wei Liu wei.l...@citrix.com
 ---
  tools/libxl/libxl_create.c   |   12 ++--
  tools/libxl/libxl_internal.h |2 +-
  tools/libxl/libxl_save_callout.c |2 +-
  3 files changed, 8 insertions(+), 8 deletions(-)
 
 diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
 index 86384d2..385891c 100644
 --- a/tools/libxl/libxl_create.c
 +++ b/tools/libxl/libxl_create.c
 @@ -1577,8 +1577,8 @@ static void domain_create_cb(libxl__egc *egc,
   int rc, uint32_t domid);
  
  static int do_domain_create(libxl_ctx *ctx, libxl_domain_config *d_config,
 -uint32_t *domid,
 -int restore_fd, int checkpointed_stream,
 +uint32_t *domid, int restore_fd,
 +const libxl_domain_restore_params *params,
  const libxl_asyncop_how *ao_how,
  const libxl_asyncprogress_how *aop_console_how)
  {
 @@ -1591,8 +1591,8 @@ static int do_domain_create(libxl_ctx *ctx, 
 libxl_domain_config *d_config,
  libxl_domain_config_init(cdcs-dcs.guest_config_saved);
  libxl_domain_config_copy(ctx, cdcs-dcs.guest_config_saved, d_config);
  cdcs-dcs.restore_fd = restore_fd;
 +if (params) cdcs-dcs.restore_params = *params;

Is this eventually going to become non-optional? I think not and its
validity is entirely intertwined with the validity of restore_fd (as I
suspect it was before, but I've not checked).

Perhaps an error check to that effect would be useful?

Anyway, I think what you've done here is correct, so:
Acked-by: Ian Campbell ian.campb...@citrix.com

[...]
 @@ -3122,11 +3122,11 @@ struct libxl__domain_create_state {
  libxl_domain_config *guest_config;
  libxl_domain_config guest_config_saved; /* vanilla config */
  int restore_fd;
 +libxl_domain_restore_params restore_params;
  libxl__domain_create_cb *callback;
  libxl_asyncprogress_how aop_console_how;
  /* private to domain_create */
  int guest_domid;
 -int checkpointed_stream;

This has, in effect moved from private to domain_create to filled in
by user, I don't think the change here has actually changed its status,
but I suspect it was wrong before (alternatively restore_fd is in the
wrong place instead).

Ian.


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v2 2/6] tools/libxl: move domain suspend code into libxl_dom_suspend.c




On 06/16/2015 09:00 PM, Ian Campbell wrote:

On Wed, 2015-06-03 at 16:01 +0800, Yang Hongyang wrote:

Move domain suspend code into a separate file libxl_dom_suspend.c.
export an API libxl__domain_suspend() which wrappers the static


just ..which wraps the...


will fix, thank you !




function domain_suspend_callback_common() for internal use.

Note that the newly added file libxl_dom_suspend.c is used for
suspend/resume code.

Signed-off-by: Yang Hongyang yan...@cn.fujitsu.com
CC: Ian Campbell ian.campb...@citrix.com
CC: Ian Jackson ian.jack...@eu.citrix.com
CC: Wei Liu wei.l...@citrix.com
CC: Andrew Cooper andrew.coop...@citrix.com
Acked-by: Ian Campbell ian.campb...@citrix.com
---
  tools/libxl/Makefile|   3 +-
  tools/libxl/libxl_dom.c | 350 +---
  tools/libxl/libxl_dom_suspend.c | 381 
  tools/libxl/libxl_internal.h|   6 +
  4 files changed, 393 insertions(+), 347 deletions(-)
  create mode 100644 tools/libxl/libxl_dom_suspend.c

diff --git a/tools/libxl/Makefile b/tools/libxl/Makefile
index cc9c152..3f98d62 100644
--- a/tools/libxl/Makefile
+++ b/tools/libxl/Makefile
@@ -95,7 +95,8 @@ LIBXL_OBJS = flexarray.o libxl.o libxl_create.o libxl_dm.o 
libxl_pci.o \
libxl_internal.o libxl_utils.o libxl_uuid.o \
libxl_json.o libxl_aoutils.o libxl_numa.o libxl_vnuma.o 
\
libxl_save_callout.o _libxl_save_msgs_callout.o \
-   libxl_qmp.o libxl_event.o libxl_fork.o $(LIBXL_OBJS-y)
+   libxl_qmp.o libxl_event.o libxl_fork.o 
libxl_dom_suspend.o \
+   $(LIBXL_OBJS-y)
  LIBXL_OBJS += libxl_genid.o
  LIBXL_OBJS += _libxl_types.o libxl_flask.o _libxl_types_internal.o

diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index cce04dd..9444329 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -1103,11 +1103,6 @@ int libxl__toolstack_restore(uint32_t domid, const 
uint8_t *buf,

  /* Domain suspend (save) */

-static void domain_save_done(libxl__egc *egc,
- libxl__domain_suspend_state *dss, int rc);
-static void domain_suspend_callback_common_done(libxl__egc *egc,
-libxl__domain_suspend_state *dss, int ok);
-
  /*- complicated callback, called by xc_domain_save -*/

  /*
@@ -1324,35 +1319,6 @@ static void switch_logdirty_done(libxl__egc *egc,

  /*- callbacks, called by xc_domain_save -*/

-int libxl__domain_suspend_device_model(libxl__gc *gc,
-   libxl__domain_suspend_state *dss)
-{
-int ret = 0;
-uint32_t const domid = dss-domid;
-const char *const filename = dss-dm_savefile;
-
-switch (libxl__device_model_version_running(gc, domid)) {
-case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL: {
-LOG(DEBUG, Saving device model state to %s, filename);
-libxl__qemu_traditional_cmd(gc, domid, save);
-libxl__wait_for_device_model_deprecated(gc, domid, paused, NULL, 
NULL, NULL);
-break;
-}
-case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN:
-if (libxl__qmp_stop(gc, domid))
-return ERROR_FAIL;
-/* Save DM state into filename */
-ret = libxl__qmp_save(gc, domid, filename);
-if (ret)
-unlink(filename);
-break;
-default:
-return ERROR_INVAL;
-}
-
-return ret;
-}
-
  int libxl__domain_resume_device_model(libxl__gc *gc, uint32_t domid)
  {

@@ -1373,301 +1339,6 @@ int libxl__domain_resume_device_model(libxl__gc *gc, 
uint32_t domid)
  return 0;
  }

-static void domain_suspend_common_wait_guest(libxl__egc *egc,
- libxl__domain_suspend_state *dss);
-static void domain_suspend_common_guest_suspended(libxl__egc *egc,
- libxl__domain_suspend_state *dss);
-
-static void domain_suspend_common_pvcontrol_suspending(libxl__egc *egc,
-  libxl__xswait_state *xswa, int rc, const char *state);
-static void domain_suspend_common_wait_guest_evtchn(libxl__egc *egc,
-libxl__ev_evtchn *evev);
-static void suspend_common_wait_guest_watch(libxl__egc *egc,
-  libxl__ev_xswatch *xsw, const char *watch_path, const char *event_path);
-static void suspend_common_wait_guest_check(libxl__egc *egc,
-libxl__domain_suspend_state *dss);
-static void suspend_common_wait_guest_timeout(libxl__egc *egc,
-  libxl__ev_time *ev, const struct timeval *requested_abs);
-
-static void domain_suspend_common_failed(libxl__egc *egc,
- libxl__domain_suspend_state *dss);
-static void domain_suspend_common_done(libxl__egc *egc,
-   libxl__domain_suspend_state *dss,
-   bool ok);
-
-static bool

Re: [Xen-devel] [PATCH V4 3/7] libxl: add pvusb API

On 06/16/2015 02:49 PM, Juergen Gross wrote:
 On 06/16/2015 03:29 PM, George Dunlap wrote:
 On 06/16/2015 02:23 PM, Juergen Gross wrote:
 Hmm, I'd rather have it all in one place. Putting it in qemu would
 enable us to handle hotplug as well. A USB device assigned via it's
 serial to a domU could be automatically passed through by qemu when
 showing up. This isn't possible today, but we wouldn't have to change
 libxl again for supporting it, only qemu would have to be adapted.

 Look, we're talking here about the libxl interface.  Ian thinks that *at
 the libxl layer* we need to be able to specify all these things.  It
 doesn't matter at this point whether qemu can also do it or not.
 
 When I'm adding new functionality (and this is the case here) I always
 try to add it at the level where it should be. qemu already is capable
 of finding a USB device by various means and can be extended easily to
 support our needs. And please remember, for writing the pvusb backend
 I'm already doing changes to qemu.
 
 So why should we add the same functionality to libxl by reading sysfs
 instead of letting it do qemu via libusb? And what about BSD? Letting
 qemu find the device would avoid two variants in libxl.

If the libxl interface only has bus and port, then it doesn't matter
what qemu can do -- the caller has no way of asking qemu to watch for
vid:did:serial.

If libxl has vid:did:serial in the interface, then it's only an
implementation detail whether we pass that into qemu or whether we pass
some other way of uniquely indentifying a particular device.  And given
that no *current* versions of qemu support vid:did:serial, the most
sensible way to implement this would be to have libxl do the conversion
and send over bus:addr -- that way when you update libxl you get that
functionality regardless of whether qemu has it.

Unless you're suggesting that the libxl interface should be a string
that we just pass verbatim to qemu.  If that's what you mean, it's a
complete non-starter, and I'm sure Ian will agree with me.  There's no
way that we can provide any interface consistency or any documentation
of what this string does if it boils down to This does whatever the
version of qemu you happen to be running does.

 In the future, if we actually implement the automatically grab this
 device when it shows up functionality, then yes, having it in qemu
 rather than having a libxl daemon sit around and watch for those things,
 might be handy.  But we can cross that bridge when we come to it.
 
 I would agree if the efforts in libxl would be much smaller than in
 qemu. But if the efforts are comparable I'd rather do it in the
 component which will make such an enhancement easier.

They're both small, so this is really bikeshedding at the moment.  The
most important question is the interface: do we include vid:did:serial
in the libxl interface.  Since you want qemu to do that, I take it that
your answer to that is yes.

When we have patches submitted, we can discuss whether libxl should only
support the (as-yet unreleased) versions of qemu that do the search
themselves, or whether they should support all versions of qemu-xen.
(The answer seems pretty obvious to me...)

 -George


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 07/12] x86/virt/guest/xen: Remove use of pgd_list from the Xen guest code

On 15/06/15 21:35, Ingo Molnar wrote:
 
 * David Vrabel david.vra...@citrix.com wrote:
 
 On 15/06/15 10:05, Ian Campbell wrote:
 On Sat, 2015-06-13 at 11:49 +0200, Ingo Molnar wrote:
 
 xen_mm_pin_all()/unpin_all() are used to implement full guest instance 
 suspend/restore. It's a stop-all method that needs to iterate through all 
 allocated pgds in the system to fix them up for Xen's use.

 This code uses pgd_list, probably because it was an easy interface.

 But we want to remove the pgd_list, so convert the code over to walk all 
 tasks in the system. This is an equivalent method.

 It is not equivalent because pgd_alloc() now populates entries in pgds that 
 are 
 not visible to xen_mm_pin_all() (note how the original code adds the pgd to 
 the 
 pgd_list in pgd_ctor() before calling pgd_prepopulate_pmd()).  These newly 
 allocated page tables won't be correctly converted on suspend/resume and the 
 new 
 process will die after resume.
 
 So how should the Xen logic be fixed for the new scheme? I can't say I can 
 see 
 through the paravirt complexity here.

Actually, since we freeze_processes() before trying to pin page tables,
I think it should be ok as-is.

I'll put the patch through some tests.

David

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v2 3/6] tools/libxl: move domain resume code into libxl_dom_suspend.c

On Wed, 2015-06-03 at 16:01 +0800, Yang Hongyang wrote:
 move domain resume code into libxl_dom_suspend.c.

Even though it has resume in the name, I'm not sure that
libxl__domain_s3_resume is a good candidate for moving to the suspend
code, it's called only from libxl_send_trigger and IIRC we don't really
implement s3, just a fake version where the domain is paused/unpaused
(rather the destroyed and resumed, say.

Having moved libxl__domain_resume and libxl__domain_resume_device_model
into the same file I think the latter could become static. (If you do
that in this patch please say something along the lines of pure code
motion except for making libxl__domain_resume_device_model static in
the commit message)



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 02/27] tools/libxc: Always compile the compat qemu variables into xc_sr_context

On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote:
 This is safe (as the variable will simply be unused), and is required for
 correct compilation when midway through untangling the libxc/libxl
 interaction.
 
 The #define is left in place to highlight that the variables can be removed
 once the untangling is complete.
 
 Signed-off-by: Andrew Cooper andrew.coop...@citrix.com

Acked-by: Ian Campbell ian.campb...@citrix.com




___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH V4 3/7] libxl: add pvusb API


On 06/16/2015 03:09 PM, George Dunlap wrote:

On 06/16/2015 02:06 PM, Juergen Gross wrote:

On 06/16/2015 01:45 PM, Ian Jackson wrote:

Juergen Gross writes (Re: [Xen-devel] [PATCH V4 3/7] libxl: add pvusb
API):

On 06/16/2015 01:10 PM, Ian Jackson wrote:

AIUI some devices have serial numbers, which means you can distinguish
them ?


Yes, they have. The question is whether those are different on multiple
instances? With lsusb I've found a device with serial number
0123456789ABCD. Do you really believe I'm just lucky owning the one with
such a nice serial number? ;-)


Heh.  I think, though, that this suggests that if the user has devices
whose serial numbers are actually different, they should be able to
specify a device by vid/pid/serial.


I think you are right.

The next question: should this be part of libxl or qemu? It should be
possible to extend the qemu vendorId:productId syntax to
vendorId:productId:serial without too much effort, as libusb
includes some support to obtain the serial number.


It doesn't really matter what qemu or pvusb can do, as long as libxl can
convert what it has into something that they can use.  So if libxl is
given vid, pid, and serial (remember, this will be a struct, not a
string), it can look in sysfs and find the bus:addr (which is unique at
any given time) and hand it to qemu  (or the bus-port in the case of pvusb).


Hmm, I'd rather have it all in one place. Putting it in qemu would
enable us to handle hotplug as well. A USB device assigned via it's
serial to a domU could be automatically passed through by qemu when
showing up. This isn't possible today, but we wouldn't have to change
libxl again for supporting it, only qemu would have to be adapted.


Juergen


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 01/27] tools/libxl: Fix libxl__ev_child_inuse() check for not-yet-initialised children

On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote:
 Shortly, libxl will be juggling multiple parallel operations, and will
 possibly have to take error decisions before some tasks have been set up.

It would be preferable, I think, to arrange to call libxl__ev_child_init
on all such libxl__ev_child structs either up front or certainly before
there is any possibility of needing to unwind them.

Such an init would at worst correspond to exactly the place where the
zeroed structure you refer to is zeroed.

 No child process of libxl will ever have a pid of 0, so gate
 libxl__ev_child_inuse() on a pid strictly greater than 0.
 
 This makes it safe to use on a zeroed structure of a task which has not yet
 been set up.
 
 Signed-off-by: Andrew Cooper andrew.coop...@citrix.com
 CC: Ian Campbell ian.campb...@citrix.com
 CC: Ian Jackson ian.jack...@eu.citrix.com
 CC: Wei Liu wei.l...@citrix.com
 
 ---
 This change does make libxl__ev_child_init() functionally useless.  I am
 undecided between leaving it in place in case it is useful in the future, or 
 to
 remove it completely.
 ---
  tools/libxl/libxl_internal.h |2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)
 
 diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
 index e96d6b5..6226c18 100644
 --- a/tools/libxl/libxl_internal.h
 +++ b/tools/libxl/libxl_internal.h
 @@ -880,7 +880,7 @@ _hidden pid_t libxl__ev_child_fork(libxl__gc *gc, 
 libxl__ev_child *childw_out,
  static inline void libxl__ev_child_init(libxl__ev_child *childw_out)
  { childw_out-pid = -1; }
  static inline int libxl__ev_child_inuse(const libxl__ev_child *childw_out)
 -{ return childw_out-pid = 0; }
 +{ return childw_out-pid  0; }
  
  /* Useable (only) in the child to once more make the ctx useable for
   * xenstore operations.  logs failure in the form what: error



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 04/27] tools/xl: Mandatory flag indicating the format of the migration stream

On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote:
 Introduced at this point so the python stream conversion code has a concrete
 ABI to use.

Please could you also explicitly mention that it isn't added to FLAG_ALL
yet because we don't actually implement it yet and that it will be added
there later.

With that:
 Signed-off-by: Andrew Cooper andrew.coop...@citrix.com

Acked-by: Ian Campbell ian.campb...@citrix.com

(although I wish I had a better name in mind...)



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH V4 3/7] libxl: add pvusb API [and 1 more messages]

George Dunlap writes (Re: [Xen-devel] [PATCH V4 3/7] libxl: add pvusb API):
 Remember that the path you gave in your previous e-mail isn't the path
 for the *usb device*, it's the path for the *block device*.  It
 contains a PCI address, but it looks like it also contains part of the
 USB topology.  Are you sure that's actually a stable interface, or
 does it just happen that on your hardware the discovery always happens
 in the same order?

The block device is (in path terms) underneath the usb device,
obviously.  Not all of that path is relevant to identifying the
USB device.

 On my system /sys/bus/usb/devices/2-3.3 is a link to
 /sys/devices/pci:00/:00:1d.7/usb2/2-3/2-3.3/.  This contains
 the pci bus address, but it also contains the bus number, which we've
 just said may be unstable across reboots.

You mean the 2 in `usb2' ?  I think that bus number is the bus number
within the controller, not globally.

 I suppose it might be possible to specify buspci,port -- the pci
 address of the root bus, and the topology from there.  In theory I
 guess that should be stable?

Yes.  The whole point of paths like this is that they are stable if
the physical topology doesn't change.  So on my netbook

  /dev/disk/by-path/pci-:00:1d.7-usb-0:1:1.0-scsi-0:0:0:0-part1

always refers to the 1st MBR partition on logical device 0 on the USB
storage device plugged into the USB port physically on the front left
of the computer.

 In any case, at the moment you're essentially inventing from whole
 cloth a new way of specifying USB devices that (as far as I know)
 isn't supported by any other program that uses USB.

If you can't specify the device by hardware path, you can't specify it
deterministically.

And as you can see it _is_ supported by other programs that use USB.
mount can use it!

I think the hardware path to the controller, at least, should be
treated as an opaque OS-specific string.  It might have a different
format on BSD.

Ian.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v2 3/6] tools/libxl: move domain resume code into libxl_dom_suspend.c




On 06/16/2015 09:04 PM, Ian Campbell wrote:

On Wed, 2015-06-03 at 16:01 +0800, Yang Hongyang wrote:

move domain resume code into libxl_dom_suspend.c.


Even though it has resume in the name, I'm not sure that
libxl__domain_s3_resume is a good candidate for moving to the suspend
code, it's called only from libxl_send_trigger and IIRC we don't really
implement s3, just a fake version where the domain is paused/unpaused
(rather the destroyed and resumed, say.

Having moved libxl__domain_resume and libxl__domain_resume_device_model
into the same file I think the latter could become static. (If you do
that in this patch please say something along the lines of pure code
motion except for making libxl__domain_resume_device_model static in
the commit message)


ok, will fix this in the next version, thank you for the review!




.



--
Thanks,
Yang.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 3/6] xen/MSI-X: really enforce alignment

On Fri, 5 Jun 2015, Jan Beulich wrote:
 The way the generic infrastructure works the intention of not allowing
 unaligned accesses can't be achieved by simply setting .unaligned to
 false. The benefit is that we can now replace the conditionals in
 {get,set}_entry_value() by assert()-s.
 
 Signed-off-by: Jan Beulich jbeul...@suse.com

Reviewed-by: Stefano Stabellini stefano.stabell...@eu.citrix.com


 --- a/qemu/upstream/hw/xen/xen_pt_msi.c
 +++ b/qemu/upstream/hw/xen/xen_pt_msi.c
 @@ -421,16 +421,14 @@ int xen_pt_msix_update_remap(XenPCIPasst
  
  static uint32_t get_entry_value(XenPTMSIXEntry *e, int offset)
  {
 -return !(offset % sizeof(*e-latch))
 -   ? e-latch[offset / sizeof(*e-latch)] : 0;
 +assert(!(offset % sizeof(*e-latch)));
 +return e-latch[offset / sizeof(*e-latch)];
  }
  
  static void set_entry_value(XenPTMSIXEntry *e, int offset, uint32_t val)
  {
 -if (!(offset % sizeof(*e-latch)))
 -{
 -e-latch[offset / sizeof(*e-latch)] = val;
 -}
 +assert(!(offset % sizeof(*e-latch)));
 +e-latch[offset / sizeof(*e-latch)] = val;
  }
  
  static void pci_msix_write(void *opaque, hwaddr addr,
 @@ -496,6 +494,12 @@ static uint64_t pci_msix_read(void *opaq
  }
  }
  
 +static bool pci_msix_accepts(void *opaque, hwaddr addr,
 + unsigned size, bool is_write)
 +{
 +return !(addr  (size - 1));
 +}
 +
  static const MemoryRegionOps pci_msix_ops = {
  .read = pci_msix_read,
  .write = pci_msix_write,
 @@ -504,7 +508,13 @@ static const MemoryRegionOps pci_msix_op
  .min_access_size = 4,
  .max_access_size = 4,
  .unaligned = false,
 +.accepts = pci_msix_accepts
  },
 +.impl = {
 +.min_access_size = 4,
 +.max_access_size = 4,
 +.unaligned = false
 +}
  };
  
  int xen_pt_msix_init(XenPCIPassthroughState *s, uint32_t base)
 
 
 
 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 03/27] tools/libxl: Stash all restore parameters in domain_create_state

On 16/06/15 14:37, Ian Campbell wrote:
 On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote:
 Shortly more parameters will appear, and this saves unboxing each one.

 No functional change.

 Signed-off-by: Andrew Cooper andrew.coop...@citrix.com
 CC: Ian Campbell ian.campb...@citrix.com
 CC: Ian Jackson ian.jack...@eu.citrix.com
 CC: Wei Liu wei.l...@citrix.com
 ---
  tools/libxl/libxl_create.c   |   12 ++--
  tools/libxl/libxl_internal.h |2 +-
  tools/libxl/libxl_save_callout.c |2 +-
  3 files changed, 8 insertions(+), 8 deletions(-)

 diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
 index 86384d2..385891c 100644
 --- a/tools/libxl/libxl_create.c
 +++ b/tools/libxl/libxl_create.c
 @@ -1577,8 +1577,8 @@ static void domain_create_cb(libxl__egc *egc,
   int rc, uint32_t domid);
  
  static int do_domain_create(libxl_ctx *ctx, libxl_domain_config *d_config,
 -uint32_t *domid,
 -int restore_fd, int checkpointed_stream,
 +uint32_t *domid, int restore_fd,
 +const libxl_domain_restore_params *params,
  const libxl_asyncop_how *ao_how,
  const libxl_asyncprogress_how *aop_console_how)
  {
 @@ -1591,8 +1591,8 @@ static int do_domain_create(libxl_ctx *ctx, 
 libxl_domain_config *d_config,
  libxl_domain_config_init(cdcs-dcs.guest_config_saved);
  libxl_domain_config_copy(ctx, cdcs-dcs.guest_config_saved, d_config);
  cdcs-dcs.restore_fd = restore_fd;
 +if (params) cdcs-dcs.restore_params = *params;
 Is this eventually going to become non-optional? I think not and its
 validity is entirely intertwined with the validity of restore_fd (as I
 suspect it was before, but I've not checked).

 Perhaps an error check to that effect would be useful?

It is mandatory for restore, and currently unused for plain create. 
restore_fd being  -1 does appear to be the canonical switch between a
restore and a create, so should be the qualification of validity.


 Anyway, I think what you've done here is correct, so:
 Acked-by: Ian Campbell ian.campb...@citrix.com
 
 [...]
 @@ -3122,11 +3122,11 @@ struct libxl__domain_create_state {
  libxl_domain_config *guest_config;
  libxl_domain_config guest_config_saved; /* vanilla config */
  int restore_fd;
 +libxl_domain_restore_params restore_params;
  libxl__domain_create_cb *callback;
  libxl_asyncprogress_how aop_console_how;
  /* private to domain_create */
  int guest_domid;
 -int checkpointed_stream;
 This has, in effect moved from private to domain_create to filled in
 by user, I don't think the change here has actually changed its status,
 but I suspect it was wrong before (alternatively restore_fd is in the
 wrong place instead).

I think it was wrong before.  It was always a caller-provided parameter,
albeit implicit by virtue of essentially being a remus boolean.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 04/27] tools/xl: Mandatory flag indicating the format of the migration stream

On 16/06/15 14:39, Ian Campbell wrote:
 On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote:
 Introduced at this point so the python stream conversion code has a concrete
 ABI to use.
 Please could you also explicitly mention that it isn't added to FLAG_ALL
 yet because we don't actually implement it yet and that it will be added
 there later.

Certainly.


 With that:
 Signed-off-by: Andrew Cooper andrew.coop...@citrix.com
 Acked-by: Ian Campbell ian.campb...@citrix.com

 (although I wish I had a better name in mind...)

Any suggestions welcome.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH V4 3/7] libxl: add pvusb API


On 06/16/2015 04:06 PM, George Dunlap wrote:

On 06/16/2015 02:49 PM, Juergen Gross wrote:

On 06/16/2015 03:29 PM, George Dunlap wrote:

On 06/16/2015 02:23 PM, Juergen Gross wrote:

Hmm, I'd rather have it all in one place. Putting it in qemu would
enable us to handle hotplug as well. A USB device assigned via it's
serial to a domU could be automatically passed through by qemu when
showing up. This isn't possible today, but we wouldn't have to change
libxl again for supporting it, only qemu would have to be adapted.


Look, we're talking here about the libxl interface.  Ian thinks that *at
the libxl layer* we need to be able to specify all these things.  It
doesn't matter at this point whether qemu can also do it or not.


When I'm adding new functionality (and this is the case here) I always
try to add it at the level where it should be. qemu already is capable
of finding a USB device by various means and can be extended easily to
support our needs. And please remember, for writing the pvusb backend
I'm already doing changes to qemu.

So why should we add the same functionality to libxl by reading sysfs
instead of letting it do qemu via libusb? And what about BSD? Letting
qemu find the device would avoid two variants in libxl.


If the libxl interface only has bus and port, then it doesn't matter
what qemu can do -- the caller has no way of asking qemu to watch for
vid:did:serial.

If libxl has vid:did:serial in the interface, then it's only an
implementation detail whether we pass that into qemu or whether we pass
some other way of uniquely indentifying a particular device.  And given
that no *current* versions of qemu support vid:did:serial, the most
sensible way to implement this would be to have libxl do the conversion
and send over bus:addr -- that way when you update libxl you get that
functionality regardless of whether qemu has it.

Unless you're suggesting that the libxl interface should be a string
that we just pass verbatim to qemu.  If that's what you mean, it's a
complete non-starter, and I'm sure Ian will agree with me.  There's no
way that we can provide any interface consistency or any documentation
of what this string does if it boils down to This does whatever the
version of qemu you happen to be running does.


No, I didn't expect libxl to just pass an arbitrary string to qemu.
My point was to avoid the sysfs accesses in libxl in order to support
BSD as well and to reduce the complexity.


In the future, if we actually implement the automatically grab this
device when it shows up functionality, then yes, having it in qemu
rather than having a libxl daemon sit around and watch for those things,
might be handy.  But we can cross that bridge when we come to it.


I would agree if the efforts in libxl would be much smaller than in
qemu. But if the efforts are comparable I'd rather do it in the
component which will make such an enhancement easier.


They're both small, so this is really bikeshedding at the moment.  The
most important question is the interface: do we include vid:did:serial
in the libxl interface.  Since you want qemu to do that, I take it that
your answer to that is yes.


Correct.


When we have patches submitted, we can discuss whether libxl should only
support the (as-yet unreleased) versions of qemu that do the search
themselves, or whether they should support all versions of qemu-xen.
(The answer seems pretty obvious to me...)


May be I made one wrong assumption: I thought adding USB-passthrough for
HVM domains would require qemu changes as well. If this is not the case
I can understand your reasoning.


Juergen

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 4/6] xen/pass-through: correctly deal with RW1C bits

On Fri, 5 Jun 2015, Jan Beulich wrote:
 Introduce yet another mask for them, so that the generic routine can
 handle them, at once rendering xen_pt_pmcsr_reg_write() superfluous.
 
 Signed-off-by: Jan Beulich jbeul...@suse.com
 
 --- a/qemu/upstream/hw/xen/xen_pt.h
 +++ b/qemu/upstream/hw/xen/xen_pt.h
 @@ -105,6 +105,8 @@ struct XenPTRegInfo {
  uint32_t res_mask;
  /* reg read only field mask (ON:RO/ROS, OFF:other) */
  uint32_t ro_mask;
 +/* reg read/write-1-clear field mask (ON:RW1C/RW1CS, OFF:other) */
 +uint32_t rw1c_mask;
  /* reg emulate field mask (ON:emu, OFF:passthrough) */
  uint32_t emu_mask;
  xen_pt_conf_reg_init init;
 --- a/qemu/upstream/hw/xen/xen_pt_config_init.c
 +++ b/qemu/upstream/hw/xen/xen_pt_config_init.c
 @@ -176,7 +176,8 @@ static int xen_pt_byte_reg_write(XenPCIP
  cfg_entry-data = XEN_PT_MERGE_VALUE(*val, cfg_entry-data, 
 writable_mask);
  
  /* create value for writing to I/O device register */
 -*val = XEN_PT_MERGE_VALUE(*val, dev_value, throughable_mask);
 +*val = XEN_PT_MERGE_VALUE(*val, dev_value  ~reg-rw1c_mask,
 +  throughable_mask);
  
  return 0;
  }
 @@ -193,7 +194,8 @@ static int xen_pt_word_reg_write(XenPCIP
  cfg_entry-data = XEN_PT_MERGE_VALUE(*val, cfg_entry-data, 
 writable_mask);
  
  /* create value for writing to I/O device register */
 -*val = XEN_PT_MERGE_VALUE(*val, dev_value, throughable_mask);
 +*val = XEN_PT_MERGE_VALUE(*val, dev_value  ~reg-rw1c_mask,
 +  throughable_mask);
  
  return 0;
  }
 @@ -210,7 +212,8 @@ static int xen_pt_long_reg_write(XenPCIP
  cfg_entry-data = XEN_PT_MERGE_VALUE(*val, cfg_entry-data, 
 writable_mask);
  
  /* create value for writing to I/O device register */
 -*val = XEN_PT_MERGE_VALUE(*val, dev_value, throughable_mask);
 +*val = XEN_PT_MERGE_VALUE(*val, dev_value  ~reg-rw1c_mask,
 +  throughable_mask);
  
  return 0;
  }
 @@ -611,6 +614,7 @@ static XenPTRegInfo xen_pt_emu_reg_heade
  .init_val   = 0x,
  .res_mask   = 0x0007,
  .ro_mask= 0x06F8,
 +.rw1c_mask  = 0xF900,
  .emu_mask   = 0x0010,
  .init   = xen_pt_status_reg_init,
  .u.w.read   = xen_pt_word_reg_read,
 @@ -910,6 +914,7 @@ static XenPTRegInfo xen_pt_emu_reg_pcie[
  .size   = 2,
  .res_mask   = 0xFFC0,
  .ro_mask= 0x0030,
 +.rw1c_mask  = 0x000F,
  .init   = xen_pt_common_reg_init,
  .u.w.read   = xen_pt_word_reg_read,
  .u.w.write  = xen_pt_word_reg_write,
 @@ -930,6 +935,7 @@ static XenPTRegInfo xen_pt_emu_reg_pcie[
  .offset = PCI_EXP_LNKSTA,
  .size   = 2,
  .ro_mask= 0x3FFF,
 +.rw1c_mask  = 0xC000,
  .init   = xen_pt_common_reg_init,
  .u.w.read   = xen_pt_word_reg_read,
  .u.w.write  = xen_pt_word_reg_write,
 @@ -966,26 +972,6 @@ static XenPTRegInfo xen_pt_emu_reg_pcie[
   * Power Management Capability
   */
  
 -/* write Power Management Control/Status register */
 -static int xen_pt_pmcsr_reg_write(XenPCIPassthroughState *s,
 -  XenPTReg *cfg_entry, uint16_t *val,
 -  uint16_t dev_value, uint16_t valid_mask)
 -{
 -XenPTRegInfo *reg = cfg_entry-reg;
 -uint16_t writable_mask = 0;
 -uint16_t throughable_mask = get_throughable_mask(s, reg, valid_mask);
 -
 -/* modify emulate register */
 -writable_mask = reg-emu_mask  ~reg-ro_mask  valid_mask;
 -cfg_entry-data = XEN_PT_MERGE_VALUE(*val, cfg_entry-data, 
 writable_mask);
 -
 -/* create value for writing to I/O device register */
 -*val = XEN_PT_MERGE_VALUE(*val, dev_value  ~PCI_PM_CTRL_PME_STATUS,
 -  throughable_mask);
 -
 -return 0;
 -}
 -
  /* Power Management Capability reg static information table */
  static XenPTRegInfo xen_pt_emu_reg_pm[] = {
  /* Next Pointer reg */
 @@ -1016,11 +1002,12 @@ static XenPTRegInfo xen_pt_emu_reg_pm[] 
  .size   = 2,
  .init_val   = 0x0008,
  .res_mask   = 0x00F0,
 -.ro_mask= 0xE10C,
 +.ro_mask= 0x610C,
 +.rw1c_mask  = 0x8000,
  .emu_mask   = 0x810B,
  .init   = xen_pt_common_reg_init,
  .u.w.read   = xen_pt_word_reg_read,
 -.u.w.write  = xen_pt_pmcsr_reg_write,
 +.u.w.write  = xen_pt_word_reg_write,
  },
  {
  .size = 0,

I can see that the code change doesn't cause a change in behaviour for
PCI_PM_CTRL, but it does for PCI_STATUS, PCI_EXP_DEVSTA and
PCI_EXP_LNKSTA. Please explain why in the commit message.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v2 1/6] tools/libxl: rename libxl__domain_suspend to libxl__domain_save

On Wed, 2015-06-03 at 16:01 +0800, Yang Hongyang wrote:
 Rename libxl__domain_suspend() to libxl__domain_save() since it
 actually do the save domain work.

This results in some strangeness in that some functions called *save*
are now passed a struct called *suspend*. I think this is probably
temporary and is all fixed up by the end of the series, is that true?

If so then this temporary state affairs is:
Acked-by: Ian Campbell ian.campb...@citrix.com



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v2 4/6] tools/libxl: move remus code into libxl_remus.c

On Wed, 2015-06-03 at 16:01 +0800, Yang Hongyang wrote:
 move remus code into libxl_remus.c.

Please say something like ... by refactoring bits of
libxl_domain_remus_start and domain_save_done into X and Y and moving
the remaining functionality unchanged into the new file.

I gave two examples of functions which changed there, but please make
sure the list is complete+accurate.

 Signed-off-by: Yang Hongyang yan...@cn.fujitsu.com
 CC: Ian Campbell ian.campb...@citrix.com
 CC: Ian Jackson ian.jack...@eu.citrix.com
 CC: Wei Liu wei.l...@citrix.com
 ---
  tools/libxl/Makefile |   2 +-
  tools/libxl/libxl.c  |  55 +---
  tools/libxl/libxl_dom.c  | 206 +
  tools/libxl/libxl_internal.h |  11 ++
  tools/libxl/libxl_remus.c| 304 
 +++
  5 files changed, 318 insertions(+), 260 deletions(-)
  create mode 100644 tools/libxl/libxl_remus.c
 
 diff --git a/tools/libxl/Makefile b/tools/libxl/Makefile
 index 3f98d62..8535eaa 100644
 --- a/tools/libxl/Makefile
 +++ b/tools/libxl/Makefile
 @@ -56,7 +56,7 @@ else
  LIBXL_OBJS-y += libxl_nonetbuffer.o
  endif
  
 -LIBXL_OBJS-y += libxl_remus_device.o libxl_remus_disk_drbd.o
 +LIBXL_OBJS-y += libxl_remus.o libxl_remus_device.o libxl_remus_disk_drbd.o
  
  LIBXL_OBJS-$(CONFIG_X86) += libxl_cpuid.o libxl_x86.o libxl_psr.o
  LIBXL_OBJS-$(CONFIG_ARM) += libxl_nocpuid.o libxl_arm.o libxl_libfdt_compat.o
 diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
 index 77c6a36..0f9248e 100644
 --- a/tools/libxl/libxl.c
 +++ b/tools/libxl/libxl.c
 @@ -792,10 +792,6 @@ out:
  return ptr;
  }
  
 -static void libxl__remus_setup_done(libxl__egc *egc,
 -libxl__remus_devices_state *rds, int rc);
 -static void libxl__remus_setup_failed(libxl__egc *egc,
 -  libxl__remus_devices_state *rds, int 
 rc);
  static void remus_failover_cb(libxl__egc *egc,
libxl__domain_suspend_state *dss, int rc);
  
 @@ -844,63 +840,14 @@ int libxl_domain_remus_start(libxl_ctx *ctx, 
 libxl_domain_remus_info *info,
  
  assert(info);
  
 -/* Convenience aliases */
 -libxl__remus_devices_state *const rds = dss-rds;
 -
 -if (libxl_defbool_val(info-netbuf)) {
 -if (!libxl__netbuffer_enabled(gc)) {
 -LOG(ERROR, Remus: No support for network buffering);
 -rc = ERROR_FAIL;
 -goto out;
 -}
 -rds-device_kind_flags |= (1  LIBXL__DEVICE_KIND_VIF);
 -}
 -
 -if (libxl_defbool_val(info-diskbuf))
 -rds-device_kind_flags |= (1  LIBXL__DEVICE_KIND_VBD);
 -
 -rds-ao = ao;
 -rds-domid = domid;
 -rds-callback = libxl__remus_setup_done;
 -
  /* Point of no return */
 -libxl__remus_devices_setup(egc, rds);
 +libxl__remus_setup(egc, dss);
  return AO_INPROGRESS;
  
   out:
  return AO_ABORT(rc);
  }
  
 -static void libxl__remus_setup_done(libxl__egc *egc,
 -libxl__remus_devices_state *rds, int rc)
 -{
 -libxl__domain_suspend_state *dss = CONTAINER_OF(rds, *dss, rds);
 -STATE_AO_GC(dss-ao);
 -
 -if (!rc) {
 -libxl__domain_save(egc, dss);
 -return;
 -}
 -
 -LOG(ERROR, Remus: failed to setup device for guest with domid %u, rc 
 %d,
 -dss-domid, rc);
 -rds-callback = libxl__remus_setup_failed;
 -libxl__remus_devices_teardown(egc, rds);
 -}
 -
 -static void libxl__remus_setup_failed(libxl__egc *egc,
 -  libxl__remus_devices_state *rds, int 
 rc)
 -{
 -libxl__domain_suspend_state *dss = CONTAINER_OF(rds, *dss, rds);
 -STATE_AO_GC(dss-ao);
 -
 -if (rc)
 -LOG(ERROR, Remus: failed to teardown device after setup failed
 - for guest with domid %u, rc %d, dss-domid, rc);
 -
 -dss-callback(egc, dss, rc);
 -}
 -
  static void remus_failover_cb(libxl__egc *egc,
libxl__domain_suspend_state *dss, int rc)
  {
 diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
 index 701e9f7..0f81081 100644
 --- a/tools/libxl/libxl_dom.c
 +++ b/tools/libxl/libxl_dom.c
 @@ -1409,189 +1409,6 @@ int libxl__toolstack_save(uint32_t domid, uint8_t 
 **buf,
  return 0;
  }
  
 -/*- remus callbacks -*/
 -static void remus_domain_suspend_callback_common_done(libxl__egc *egc,
 -libxl__domain_suspend_state *dss, int ok);
 -static void remus_devices_postsuspend_cb(libxl__egc *egc,
 - libxl__remus_devices_state *rds,
 - int rc);
 -static void remus_devices_preresume_cb(libxl__egc *egc,
 -   libxl__remus_devices_state *rds,
 -   int rc);
 -
 -static void libxl__remus_domain_suspend_callback(void *data)
 -{
 -libxl__save_helper_state *shs = data;
 -libxl__egc

Re: [Xen-devel] [PATCH v2 1/6] tools/libxl: rename libxl__domain_suspend to libxl__domain_save




On 06/16/2015 08:59 PM, Ian Campbell wrote:

On Wed, 2015-06-03 at 16:01 +0800, Yang Hongyang wrote:

Rename libxl__domain_suspend() to libxl__domain_save() since it
actually do the save domain work.


This results in some strangeness in that some functions called *save*
are now passed a struct called *suspend*. I think this is probably
temporary and is all fixed up by the end of the series, is that true?


Yes, it is fixed by the refactor of the suspend state.



If so then this temporary state affairs is:
 Acked-by: Ian Campbell ian.campb...@citrix.com


.



--
Thanks,
Yang.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 11/27] tools/python: Libxc migration v2 infrastructure

On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote:
 Contains:
  * Python implementation of the libxc migration v2 records
  * Verification code for spec compliance
  * Unit tests
 
 Signed-off-by: Andrew Cooper andrew.coop...@citrix.com

For this, and the following lobxl and verification patches:

Acked-by: Ian Campbell ian.campb...@citrix.com

(I've not read them closely on this pass, I recall looking at them
before)



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 01/27] tools/libxl: Fix libxl__ev_child_inuse() check for not-yet-initialised children

On 16/06/15 14:47, Ian Jackson wrote:
 Andrew Cooper writes (Re: [PATCH 01/27] tools/libxl: Fix 
 libxl__ev_child_inuse() check for not-yet-initialised children):
 It is possible that one bit fails before it can be calculated whether
 the second bit needs to start or not.

 At the moment, all bits in libxl in this area do initialisation
 immediately before use; most bits are even initialised in the function
 which starts their actions.  Some bits are initialised differently
 depending on the path taken to get to the initialisation site. 
 As a rule of thumb a function libxl__initiate_foo_ which takes a
 libxl__foo_state* should do this initialisation for the whole
 libxl__foo_state.

 I don't see why you can't do that.

The only example of libxl__initiate_foo_ is
libxl__initiate_device_remove() which starts the first action involved
with removing a device.

I will see what I can do, but there are areas of this code which can't
have their initialisation brought any further forward.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 14/27] tools/python: Conversion utility for legacy migration streams

On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote:
 This utility will take a legacy stream as in input, and produce a v2 stream as
 an output.  It is exec()'d by libxl to provide backwards compatibility.
 
 Signed-off-by: Andrew Cooper andrew.coop...@citrix.com
 CC: Ian Campbell ian.campb...@citrix.com
 CC: Ian Jackson ian.jack...@eu.citrix.com
 CC: Wei Liu wei.l...@citrix.com
 ---
  tools/python/Makefile |4 +
  tools/python/scripts/convert-legacy-stream.py |  683 
 +
  2 files changed, 687 insertions(+)
  create mode 100755 tools/python/scripts/convert-legacy-stream.py
 
 diff --git a/tools/python/Makefile b/tools/python/Makefile
 index e933be8..531c862 100644
 --- a/tools/python/Makefile
 +++ b/tools/python/Makefile
 @@ -17,9 +17,13 @@ build: genwrap.py $(XEN_ROOT)/tools/libxl/libxl_types.idl \
  
  .PHONY: install
  install:
 + $(INSTALL_DIR) $(DESTDIR)$(PRIVATE_BINDIR)
 +
   CC=$(CC) CFLAGS=$(PY_CFLAGS) $(PYTHON) setup.py install \
   $(PYTHON_PREFIX_ARG) --root=$(DESTDIR) --force
  
 + $(INSTALL_PROG) scripts/convert-legacy-stream.py 
 $(DESTDIR)$(PRIVATE_BINDIR)

Please drop the .py on install, or from the source too if you prefer.
With that:

Acked-by: Ian Campbell ian.campb...@citrix.com



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v2 5/6] tools/libxl: move save/restore code into libxl_dom_save.c

On Wed, 2015-06-03 at 16:01 +0800, Yang Hongyang wrote:
 move save/restore code into libxl_dom_save.c.

If this (unlike other patches in the series) is purely code motion
please indicate that this is the case.

You might also like to consider refactoring things such that all patches
are pure motion.



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v2 6/6] libxl/save: Refactor libxl__domain_suspend_state

On Wed, 2015-06-03 at 16:01 +0800, Yang Hongyang wrote:
 Currently struct libxl__domain_suspend_state contains 2 type of states,
 one is save state, another is suspend state. This patch separate it out.

This patch separates those two out.

 The motivation of this is that COLO will need to do suspend/resume
 continuesly, we need a more common suspend state.

continuously

 After this change, dss stands for libxl__domain_save_state,
 dsps stands for libxl__domain_suspend_state.
 
 Signed-off-by: Yang Hongyang yan...@cn.fujitsu.com
 CC: Ian Campbell ian.campb...@citrix.com
 CC: Ian Jackson ian.jack...@eu.citrix.com
 CC: Wei Liu wei.l...@citrix.com
 CC: Andrew Cooper andrew.coop...@citrix.com

I presume this is almost entirely mechanical and the compiler mostly
took care of identifying everywhere to be changed:

Acked-by: Ian Campbell ian.campb...@citrix.com

[...]
 @@ -2839,9 +2840,27 @@ typedef struct libxl__logdirty_switch {
  } libxl__logdirty_switch;
  
  struct libxl__domain_suspend_state {
[...]
 +struct libxl__domain_save_state {

Please could you add a little comment before each of these explaining
their scope for the hard of thinking (i.e. me). One is the live phase
and one is the final stop-and-copy phase, I think.

Ian.



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH V4 3/7] libxl: add pvusb API

On Tue, Jun 16, 2015 at 1:02 PM, Ian Jackson ian.jack...@eu.citrix.com wrote:
 George Dunlap writes (Re: [Xen-devel] [PATCH V4 3/7] libxl: add pvusb API):
 So it sounds like we're converging on Allow multiple ways to specify
 the interface, with at least the following fields:
 - bus (int - 1,2,3, c)
 - port (string - 2.1.3, c)
 - address/devnum (int)
 - vendorid (uint16_t)
 - deviceid (uint16_t)

 You're missing the full device path from that list.  Typically that
 will include at least one pci bus address.

I'm not sure what constitutes a full device path.  Is that defined somewhere?

Remember that the path you gave in your previous e-mail isn't the path
for the *usb device*, it's the path for the *block device*.  It
contains a PCI address, but it looks like it also contains part of the
USB topology.  Are you sure that's actually a stable interface, or
does it just happen that on your hardware the discovery always happens
in the same order?

On my system /sys/bus/usb/devices/2-3.3 is a link to
/sys/devices/pci:00/:00:1d.7/usb2/2-3/2-3.3/.  This contains
the pci bus address, but it also contains the bus number, which we've
just said may be unstable across reboots.

I suppose it might be possible to specify buspci,port -- the pci
address of the root bus, and the topology from there.  In theory I
guess that should be stable?

In any case, at the moment you're essentially inventing from whole
cloth a new way of specifying USB devices that (as far as I know)
isn't supported by any other program that uses USB.

 -George

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v2 6/6] libxl/save: Refactor libxl__domain_suspend_state

Yang Hongyang writes ([PATCH v2 6/6] libxl/save: Refactor 
libxl__domain_suspend_state):
 Currently struct libxl__domain_suspend_state contains 2 type of states,
 one is save state, another is suspend state. This patch separate it out.
 The motivation of this is that COLO will need to do suspend/resume
 continuesly, we need a more common suspend state.

Currently in libxl/libxc/etc.  suspend and save have referred to
the same thing, but different terminology has been used at different
layers.  Ie both suspend and save each refer to either or both of
save to disk or suspend for live migrate, or to the relevant
underlying mechanisms.

So I'm not sure introducing a distinction between those two terms in
libxl is really helpful.  If it is to be done there should be a clear
explanation of what the difference is.

On IRC you said:

14:21 yanghy Diziet, currently, in libxl, suspend is used as 2
   means, one is save(corrspond to libxc save), another is
   suspend the guest(related to suspend callback)

That's rather different, I think.  Or, at least, I'm not sure that I
understand this distinction as you are making it.  The suspend
callback is part of the implementation of what at higher layers we
save/suspend/restore/migration.

AIUI this callback is related to pausing the guest, or manipulating
its VCPUs ?  Perhaps we should rename this callback ?  Maybe Andrew
Cooper can suggest a name ?

Forgive me if I'm confused and going off in the wrong direction...

Thanks,
Ian.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 05/27] tools/libxl: Introduce ROUNDUP()

On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote:
 This is the same as is used by libxc.
 
 Signed-off-by: Andrew Cooper andrew.coop...@citrix.com
 CC: Ian Campbell ian.campb...@citrix.com
 CC: Ian Jackson ian.jack...@eu.citrix.com
 CC: Wei Liu wei.l...@citrix.com

Acked-by: Ian Campbell ian.campb...@citrix.com



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 5/6] xen/pass-through: log errno values rather than function return ones

On Fri, 5 Jun 2015, Jan Beulich wrote:
 Functions setting errno commonly return just -1, which is of no
 particular use in the log file.
 
 Signed-off-by: Jan Beulich jbeul...@suse.com

Reviewed-by: Stefano Stabellini stefano.stabell...@eu.citrix.com


 --- a/qemu/upstream/hw/xen/xen_pt.c
 +++ b/qemu/upstream/hw/xen/xen_pt.c
 @@ -609,8 +609,8 @@ static void xen_pt_region_update(XenPCIP
guest_port, machine_port, size,
op);
  if (rc) {
 -XEN_PT_ERR(d, %s ioport mapping failed! (rc: %i)\n,
 -   adding ? create new : remove old, rc);
 +XEN_PT_ERR(d, %s ioport mapping failed! (err: %i)\n,
 +   adding ? create new : remove old, errno);
  }
  } else {
  pcibus_t guest_addr = sec-offset_within_address_space;
 @@ -623,8 +623,8 @@ static void xen_pt_region_update(XenPCIP
XEN_PFN(size + XC_PAGE_SIZE - 1),
op);
  if (rc) {
 -XEN_PT_ERR(d, %s mem mapping failed! (rc: %i)\n,
 -   adding ? create new : remove old, rc);
 +XEN_PT_ERR(d, %s mem mapping failed! (err: %i)\n,
 +   adding ? create new : remove old, errno);
  }
  }
  }
 @@ -738,8 +738,8 @@ static int xen_pt_initfn(PCIDevice *d)
  rc = xc_physdev_map_pirq(xen_xc, xen_domid, machine_irq, pirq);
  
  if (rc  0) {
 -XEN_PT_ERR(d, Mapping machine irq %u to pirq %i failed, (rc: %d)\n,
 -   machine_irq, pirq, rc);
 +XEN_PT_ERR(d, Mapping machine irq %u to pirq %i failed, (err: 
 %d)\n,
 +   machine_irq, pirq, errno);
  
  /* Disable PCI intx assertion (turn on bit10 of devctl) */
  cmd |= PCI_COMMAND_INTX_DISABLE;
 @@ -760,8 +760,8 @@ static int xen_pt_initfn(PCIDevice *d)
 PCI_SLOT(d-devfn),
 e_intx);
  if (rc  0) {
 -XEN_PT_ERR(d, Binding of interrupt %i failed! (rc: %d)\n,
 -   e_intx, rc);
 +XEN_PT_ERR(d, Binding of interrupt %i failed! (err: %d)\n,
 +   e_intx, errno);
  
  /* Disable PCI intx assertion (turn on bit10 of devctl) */
  cmd |= PCI_COMMAND_INTX_DISABLE;
 @@ -770,7 +770,7 @@ static int xen_pt_initfn(PCIDevice *d)
  if (xen_pt_mapped_machine_irq[machine_irq] == 0) {
  if (xc_physdev_unmap_pirq(xen_xc, xen_domid, machine_irq)) {
  XEN_PT_ERR(d, Unmapping of machine interrupt %i failed!
 -(rc: %d)\n, machine_irq, rc);
 +(err: %d)\n, machine_irq, errno);
  }
  }
  s-machine_irq = 0;
 @@ -808,9 +808,9 @@ static void xen_pt_unregister_device(PCI
   0 /* isa_irq */);
  if (rc  0) {
  XEN_PT_ERR(d, unbinding of interrupt INT%c failed.
 -(machine irq: %i, rc: %d)
 +(machine irq: %i, err: %d)
  But bravely continuing on..\n,
 -   'a' + intx, machine_irq, rc);
 +   'a' + intx, machine_irq, errno);
  }
  }
  
 @@ -828,9 +828,9 @@ static void xen_pt_unregister_device(PCI
  rc = xc_physdev_unmap_pirq(xen_xc, xen_domid, machine_irq);
  
  if (rc  0) {
 -XEN_PT_ERR(d, unmapping of interrupt %i failed. (rc: %d)
 +XEN_PT_ERR(d, unmapping of interrupt %i failed. (err: %d)
  But bravely continuing on..\n,
 -   machine_irq, rc);
 +   machine_irq, errno);
  }
  }
  }
 --- a/qemu/upstream/hw/xen/xen_pt_msi.c
 +++ b/qemu/upstream/hw/xen/xen_pt_msi.c
 @@ -133,8 +133,8 @@ static int msi_msix_setup(XenPCIPassthro
   msix_entry, table_base);
  if (rc) {
  XEN_PT_ERR(s-dev,
 -   Mapping of MSI%s (rc: %i, vec: %#x, entry %#x)\n,
 -   is_msix ? -X : , rc, gvec, msix_entry);
 +   Mapping of MSI%s (err: %i, vec: %#x, entry %#x)\n,
 +   is_msix ? -X : , errno, gvec, msix_entry);
  return rc;
  }
  }
 @@ -167,12 +167,12 @@ static int msi_msix_update(XenPCIPassthr
pirq, gflags, table_addr);
  
  if (rc) {
 -XEN_PT_ERR(d, Updating of MSI%s failed. (rc: %d)\n,
 -   is_msix ? -X : , rc);
 +XEN_PT_ERR(d, Updating of MSI%s failed. (err: %d)\n,
 +   is_msix ? -X : , errno);
  
  if (xc_physdev_unmap_pirq(xen_xc, xen_domid, *old_pirq)) {
 -XEN_PT_ERR(d, Unmapping of MSI%s pirq %d

[Xen-devel] [Patch V3 2/3] usb: Introduce Xen pvUSB frontend (xen hcd)

Introduces the Xen pvUSB frontend. With pvUSB it is possible for a Xen
domU to communicate with a USB device assigned to that domU. The
communication is all done via the pvUSB backend in a driver domain
(usually Dom0) which is owner of the physical device.

The pvUSB frontend is a USB hcd for a virtual USB host connector.

The code is taken from the pvUSB implementation in Xen done by Fujitsu
based on Linux kernel 2.6.18.

Changes from the original version are:
- port to upstream kernel
- put all code in just one source file
- move module to appropriate location in kernel tree
- adapt to Linux style guide
- minor code modifications to increase readability

Signed-off-by: Juergen Gross jgr...@suse.com
---
 drivers/usb/host/Kconfig   |   11 +
 drivers/usb/host/Makefile  |1 +
 drivers/usb/host/xen-hcd.c | 1638 
 3 files changed, 1650 insertions(+)
 create mode 100644 drivers/usb/host/xen-hcd.c

diff --git a/drivers/usb/host/Kconfig b/drivers/usb/host/Kconfig
index 197a6a3..3361b4b 100644
--- a/drivers/usb/host/Kconfig
+++ b/drivers/usb/host/Kconfig
@@ -788,3 +788,14 @@ config USB_HCD_TEST_MODE
  This option is of interest only to developers who need to validate
  their USB hardware designs.  It is not needed for normal use.  If
  unsure, say N.
+
+config USB_XEN_HCD
+   tristate Xen usb virtual host driver
+   depends on XEN
+   select XEN_XENBUS_FRONTEND
+   help
+ The Xen usb virtual host driver serves as a frontend driver enabling
+ a Xen guest system to access USB Devices passed through to the guest
+ by the Xen host (usually Dom0).
+ Only needed if the kernel is running in a Xen guest and generic
+ access to a USB device is needed.
diff --git a/drivers/usb/host/Makefile b/drivers/usb/host/Makefile
index 65b0b6a..3779696 100644
--- a/drivers/usb/host/Makefile
+++ b/drivers/usb/host/Makefile
@@ -75,3 +75,4 @@ obj-$(CONFIG_USB_HCD_SSB) += ssb-hcd.o
 obj-$(CONFIG_USB_FUSBH200_HCD) += fusbh200-hcd.o
 obj-$(CONFIG_USB_FOTG210_HCD)  += fotg210-hcd.o
 obj-$(CONFIG_USB_MAX3421_HCD)  += max3421-hcd.o
+obj-$(CONFIG_USB_XEN_HCD)  += xen-hcd.o
diff --git a/drivers/usb/host/xen-hcd.c b/drivers/usb/host/xen-hcd.c
new file mode 100644
index 000..9da2856
--- /dev/null
+++ b/drivers/usb/host/xen-hcd.c
@@ -0,0 +1,1638 @@
+/*
+ * xen-hcd.c
+ *
+ * Xen USB Virtual Host Controller driver
+ *
+ * Copyright (C) 2009, FUJITSU LABORATORIES LTD.
+ * Author: Noboru Iwamatsu n_iwama...@jp.fujitsu.com
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, see http://www.gnu.org/licenses/.
+ *
+ * Or, by your choice:
+ *
+ * When distributed separately from the Linux kernel or incorporated into
+ * other software packages, subject to the following license:
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the Software), to
+ * deal in the Software without restriction, including without limitation the
+ * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
+ * sell copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ */
+
+#include linux/module.h
+#include linux/usb.h
+#include linux/list.h
+#include linux/usb/hcd.h
+#include linux/io.h
+
+#include xen/xen.h
+#include xen/xenbus.h
+#include xen/grant_table.h
+#include xen/events.h
+#include xen/page.h
+
+#include xen/interface/io/usbif.h
+
+/* Private per-URB data */
+struct urb_priv {
+   struct list_head list;
+   struct urb *urb;
+   int req_id; /* RING_REQUEST id for submitting */
+   int unlink_req_id;  /* RING_REQUEST id for unlinking */
+

Re: [Xen-devel] [PATCH 19/27] tools/libxc+libxl+xl: Restore v2 streams

On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote:
 @@ -377,6 +384,28 @@ static void record_body_done(libxl__egc *egc,
  stream_failed(egc, stream, ret);
  }
  
 +void libxl__xc_domain_restore_done(libxl__egc *egc, void *dcs_void,
 +   int ret, int retval, int errnoval)
 +{
 +libxl__domain_create_state *dcs = dcs_void;
 +STATE_AO_GC(dcs-ao);
 +
 +if (ret)
 +goto err;
 +
 +if (retval) {
 +LOGEV(ERROR, errnoval, restoring domain);
 +ret = ERROR_FAIL;
 +goto err;
 +}
 +
 +libxl__stream_read_continue(egc, dcs-srs);

continue? Is this something to do with checkpointing?

 diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
 index 23f27d4..7418d92 100644
 --- a/tools/libxl/libxl_types.idl
 +++ b/tools/libxl/libxl_types.idl
 @@ -346,6 +346,8 @@ libxl_domain_create_info = Struct(domain_create_info,[
  
  libxl_domain_restore_params = Struct(domain_restore_params, [

At some point we will need a LIBXL_HAVE #define.

  (checkpointed_stream, integer),
 +(stream_version, uint32, {'init_val': '1'}),

If we aren't going to go for an IDL enum rather than a uint32 you
probably just want the bare integer 1.

But, I suspect we would prefer an enum, i.e an explicit list of known
versions, rather than an integer?

I wonder when, if ever, we will be able to flip this to 2? I suppose
whenever the legacy conversion stuff gets pulled out.

 +(legacy_width, uint32),

From what I've seen so far this is never user provided but is internal
to libxl and detected[0] at runtime. As such it belongs somewhere else
other than in the public API.

[0] FVO detected == hardcoded depending on the build arch

  ])
  
  libxl_domain_sched_params = Struct(domain_sched_params,[
 diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
 index ddb293c..14d96c9 100644
 --- a/tools/libxl/xl_cmdimpl.c
 +++ b/tools/libxl/xl_cmdimpl.c
 @@ -110,7 +110,9 @@
  
  #define XL_MANDATORY_FLAG_JSON (1U  0) /* config data is in JSON format */
  #define XL_MANDATORY_FLAG_STREAMv2 (1U  1) /* stream is v2 */
 -#define XL_MANDATORY_FLAG_ALL  (XL_MANDATORY_FLAG_JSON)
 +#define XL_MANDATORY_FLAG_ALL  (XL_MANDATORY_FLAG_JSON |\
 +XL_MANDATORY_FLAG_STREAMv2)
 +
  struct save_file_header {
  char magic[32]; /* savefileheader_magic */
  /* All uint32_ts are in domain's byte order. */
 @@ -2724,6 +2726,9 @@ static uint32_t create_domain(struct domain_create 
 *dom_info)
  libxl_domain_restore_params_init(params);
  
  params.checkpointed_stream = dom_info-checkpointed_stream;
 +params.stream_version =
 +(hdr.mandatory_flags  XL_MANDATORY_FLAG_STREAMv2) ? 2 : 1;
 +
  ret = libxl_domain_create_restore(ctx, d_config,
domid, restore_fd,
params,



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH V4 3/7] libxl: add pvusb API

On Tue, Jun 16, 2015 at 12:25 PM, Juergen Gross jgr...@suse.com wrote:
 Yes, they have. The question is whether those are different on multiple
 instances? With lsusb I've found a device with serial number
 0123456789ABCD. Do you really believe I'm just lucky owning the one with
 such a nice serial number? ;-)

FWIW apparently the spec for USB printers requires a unique serial
number; so there are probably cases where serial number is actually
useful.

 -George

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 16/27] tools/libxl: Infrastructure for reading a libxl migration v2 stream

On Tue, 2015-06-16 at 16:01 +0100, Andrew Cooper wrote:
 On 16/06/15 15:31, Ian Campbell wrote:
  On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote:
  From: Ross Lagerwall ross.lagerw...@citrix.com
 
  Signed-off-by: Ross Lagerwall ross.lagerw...@citrix.com
  Signed-off-by: Andrew Cooper andrew.coop...@citrix.com
  CC: Ian Campbell ian.campb...@citrix.com
  CC: Ian Jackson ian.jack...@eu.citrix.com
  CC: Wei Liu wei.l...@citrix.com
  Overall looks good, I've got some comments below and I think it almost
  certainly wants eyes from Ian who knows more about the dc infra etc.
 
  +void libxl__stream_read_start(libxl__egc *egc,
  +  libxl__stream_read_state *stream)
  +{
  +libxl__datacopier_state *dc = stream-dc;
  +int ret = 0;
  +
  +/* State initialisation. */
  +assert(!stream-running);
  +
  +memset(dc, 0, sizeof(*dc));
  libxl__datacopier_init, please
 
 That call is made by libxl__datacopier_start() each and every time, and
 unlike here, is matched with an equivalent _kill() call.

Hrm, I think I'd best defer to Ian J on what the right way to deal with
a dc being setup and resused in this way is.

  +/* Start reading the stream header. */
  +dc-readwhat = stream header;
  +dc-readbuf = stream-hdr;
  +stream-expected_len = dc-bytes_to_read = sizeof(stream-hdr);
  +dc-used = 0;
  +dc-callback = stream_header_done;
  This pattern of resetting and reinitialising the dc occurs in multiple
  places, I think a helper would be in order, some sort of
  stream_next_record_init or something perhaps?
 
 The only feasible helper would have to take everything as parameters; 
 there is insufficient similarity between all users.
 
 I dunno whether that would be harder to read...

I was more concerned about ensuring everyone does everything (especially
if a new field gets added), having a function with parameters would
cause a compile failure when the addition was (hopefully) propagated to
that function and added as a parameter.

Lets see what Ian thinks.

 
 
  +void libxl__stream_read_abort(libxl__egc *egc,
  +  libxl__stream_read_state *stream, int rc)
  +{
  +stream_failed(egc, stream, rc);
  +}
  +
  +static void stream_success(libxl__egc *egc, libxl__stream_read_state 
  *stream)
  +{
  +stream-rc = 0;
  +stream-running = false;
  +
  +stream_done(egc, stream);
  Push the running = false into stream_done and flip the assert there?
  Logically the stream is still running until it is done, so having done
  assert it isn't running seems counter-intuitive.
 
 This is more for piece of mind.  stream_done() my strictly only ever be
 called once, hence its assert.

assert(stream-running);
stream-running = false

in stream_done() gives the same piece of mind, doesn't it?

  +if (onwrite || dc-used != stream-expected_len) {
  +ret = ERROR_FAIL;
  +LOG(ERROR, write %d, err %d, expected %zu, got %zu,
  +onwrite, errnoval, stream-expected_len, dc-used);
  +goto err;
  +}
  I think you need to check errnoval == 0 in the !onwrite case, otherwise
  you may miss a read error?
 
 dc-used != stream-expected_len covers all possible read errors, in
 the something went wrong kind of way.

With the current implementation, perhaps, but the doc doesn't seem to
say you can rely on it (an appropriate reaction might be to change the
doc rather than the code, I don't mind).

  Sharing the dc between al these differing usages is starting to rankle a
  little, but I think it is necessary because it may have queued data from
  a previous read which was larger than the current record, correct?
 
  Hrm, isn't setting dc-used = 0 on each reset potentially throwing some
  stuff away?
 
 We should never be in a case where we are setting up a new read/write
 from the dc with any previous IO pending.

Essentially because dc uses its idea of bytes remaining to do the read,
the scenario I was imagining only comes about if the dc is reading large
blocks and segmenting them later, which isn't how it is used here.

  +if (dc-writefd == -1) {
  +ret = ERROR_FAIL;
  +LOGE(ERROR, Unable to open '%s', path);
  +goto err;
  +}
  +dc-maxsz = dc-bytes_to_read = rec_hdr-length - sizeof(*emu_hdr);
  +stream-expected_len = dc-used = 0;
  expecting 0? This differs from the pattern common everywhere else and
  I'm not sure why.
 
 The datacopier has been overloaded so many times, it is messy to use.
 
 In this case, we are splicing from read fd to a write fd, rather than to
 a local buffer.
 
 Therefore, when the IO is complete, we expect 0 bytes in the local
 buffer, as all data should end up in the fd.

I think using 2 or more data copiers to cover the different
configurations might help? You can still reuse one for the normal record
processing but a separate dedicated one for writing the emu to a file
might iron out a wrinkle.

 
  +memset(dc, 0,

Re: [Xen-devel] [Draft F] Xen on ARM vITS Handling

2015-06-16 Thread Julien Grall

On 16/06/15 16:10, Ian Campbell wrote:

 This could happen when the device is not quiescent. We had this issue on
 the vexpress at boot time when the network card was trying to send an
 interrupt before DOM0 is setup.

 I don't fully understand the issue you are trying to describe, but do
 you want to propose a change to the spec?

 I actually don't know how to modify it. So it's an open question.

 For SPI too, or just for LPI?

 Only LPI.
 
 How is it fixed for SPI?

We queue the SPI but don't inject it (gic_raise_guest_irq) to the guest.

The injection will be done when the guest enable the IRQ.

We can't use it for LPIs as the mapping vLPI - pLPI may not yet exists.

 vgic_vcpu_inject_irq doesn't queue the interrupt if a VCPU is down. I
 think this is because the state of the VCPU wouldn't be correct.

 The process would be something like:

   - Creation of the domain
= All vCPUs are down

   - Device is assigned to the guest
   = Enable physical LPIs

   * physical LPI is received *
= Will be ignored and not EOIed (VCPU0 is down)
   = The LPI will never fired again during the guest life

   -  Domain is started by the toolstack
= VCPU0 is online

 Is it sufficient to queue interrupts even for VCPUs which are down? How
 does the lack of a vITT entry when this interrupt occurred affect this?

 Well, in this case we don't know on which vLPI we have to inject it. But 
 as said above, I guess we don't care if we ensure that the device can't 
 send an event (by ensuring that the device doesn't know the 
 GITS_TRANSLATER address is).
 
 Yes, I think that works.
 
 I'm not sure where to spell that out though.

We want to find a way to avoid the PCI device to send an event. I see
multiple possibility to do it:
1) Mask the event MSI-X allow to mask a specific event. There is a
similar feature for MSI but it's optional.
2) Compose the MSI message only when the vLPI is enabled. That require
Xen/PCI-back to delay the write in the config space.

None of this solution satisfy me. But I don't have much knowledge on how
PCI-passthrough works on Xen.

Regards,

-- 
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 6/6] xen/pass-through: constify some static data

 On 16.06.15 at 16:27, stefano.stabell...@eu.citrix.com wrote:
 On Fri, 5 Jun 2015, Jan Beulich wrote:
 This is done indirectly by adjusting two typedefs and helps emphasizing
 that the respective tables aren't supposed to be modified at runtime
 (as they may be shared between devices).
 
 Signed-off-by: Jan Beulich jbeul...@suse.com
 
 Reviewed-by: Stefano Stabellini stefano.stabell...@eu.citrix.com

Perhaps patch 5 and 6 could go in right away, without waiting
for the issues on the other ones addressed? After all, it may
well be that patch 1 will be dropped following the discussion on
the hypervisor side patch series...

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 16/27] tools/libxl: Infrastructure for reading a libxl migration v2 stream

On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote:
 From: Ross Lagerwall ross.lagerw...@citrix.com
 
 Signed-off-by: Ross Lagerwall ross.lagerw...@citrix.com
 Signed-off-by: Andrew Cooper andrew.coop...@citrix.com
 CC: Ian Campbell ian.campb...@citrix.com
 CC: Ian Jackson ian.jack...@eu.citrix.com
 CC: Wei Liu wei.l...@citrix.com

Overall looks good, I've got some comments below and I think it almost
certainly wants eyes from Ian who knows more about the dc infra etc.

 +void libxl__stream_read_start(libxl__egc *egc,
 +  libxl__stream_read_state *stream)
 +{
 +libxl__datacopier_state *dc = stream-dc;
 +int ret = 0;
 +
 +/* State initialisation. */
 +assert(!stream-running);
 +
 +memset(dc, 0, sizeof(*dc));

libxl__datacopier_init, please

 +dc-ao = stream-ao;
 +dc-readfd = stream-fd;
 +dc-writefd = -1;
 +
 +/* Start reading the stream header. */
 +dc-readwhat = stream header;
 +dc-readbuf = stream-hdr;
 +stream-expected_len = dc-bytes_to_read = sizeof(stream-hdr);
 +dc-used = 0;
 +dc-callback = stream_header_done;

This pattern of resetting and reinitialising the dc occurs in multiple
places, I think a helper would be in order, some sort of
stream_next_record_init or something perhaps?

 +void libxl__stream_read_abort(libxl__egc *egc,
 +  libxl__stream_read_state *stream, int rc)
 +{
 +stream_failed(egc, stream, rc);
 +}
 +
 +static void stream_success(libxl__egc *egc, libxl__stream_read_state *stream)
 +{
 +stream-rc = 0;
 +stream-running = false;
 +
 +stream_done(egc, stream);

Push the running = false into stream_done and flip the assert there?
Logically the stream is still running until it is done, so having done
assert it isn't running seems counter-intuitive.

 +static void stream_done(libxl__egc *egc,
 +libxl__stream_read_state *stream)
 +{
 +libxl__domain_create_state *dcs = CONTAINER_OF(stream, *dcs, srs);
 +
 +assert(!stream-running);
 +
 +stream-completion_callback(egc, dcs, stream-rc);
 +}
 +
 +static void stream_header_done(libxl__egc *egc,
 +   libxl__datacopier_state *dc,
 +   int onwrite, int errnoval)
 +{
 +libxl__stream_read_state *stream = CONTAINER_OF(dc, *stream, dc);
 +libxl_sr_hdr *hdr = stream-hdr;
 +STATE_AO_GC(dc-ao);
 +int ret = 0;
 +
 +if (onwrite || dc-used != stream-expected_len) {
 +ret = ERROR_FAIL;
 +LOG(ERROR, write %d, err %d, expected %zu, got %zu,
 +onwrite, errnoval, stream-expected_len, dc-used);
 +goto err;
 +}

I think you need to check errnoval == 0 in the !onwrite case, otherwise
you may miss a read error?

Also it looks like onwrite can be -1, which is a separate error case.

 +
 +static void record_header_done(libxl__egc *egc,
 +   libxl__datacopier_state *dc,
 +   int onwrite, int errnoval)
 +{
 +libxl__stream_read_state *stream = CONTAINER_OF(dc, *stream, dc);
 +libxl_sr_rec_hdr *rec_hdr = stream-rec_hdr;
 +STATE_AO_GC(dc-ao);
 +int ret = 0;
 +
 +if (onwrite || dc-used != stream-expected_len) {
 +ret = ERROR_FAIL;
 +LOG(ERROR, write %d, err %d, expected %zu, got %zu,
 +onwrite, errnoval, stream-expected_len, dc-used);
 +goto err;
 +}

Same comments wrt the arguments as the previous one.

Maybe a common helper to check (and log) the status at the head of each
callback? So you can effectively do if (!everything_ok(stream, dc) goto
err?

 +assert(!ret);
 +if (rec_hdr-length) {
 +free(stream-rec_body);
 +stream-rec_body = NULL;

reset length too?

 +static void read_emulator_body(libxl__egc *egc,
 +   libxl__stream_read_state *stream)
 +{
 +libxl__domain_create_state *dcs = CONTAINER_OF(stream, *dcs, srs);
 +libxl__datacopier_state *dc = stream-dc;
 +libxl_sr_rec_hdr *rec_hdr = stream-rec_hdr;
 +libxl_sr_emulator_hdr *emu_hdr = stream-rec_body;
 +STATE_AO_GC(stream-ao);
 +char path[256];
 +int ret = 0;
 +
 +sprintf(path, XC_DEVICE_MODEL_RESTORE_FILE.%u, dcs-guest_domid);
 +
 +dc-readwhat = save/migration stream;
 +dc-copywhat = emulator context;
 +dc-writewhat = qemu save file;
 +dc-readbuf = NULL;
 +dc-writefd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0666);

Since it this is all done in the same process (or children of it) with
not setuid etc, I think 0600 would be better to avoid accidentally
leaving the save state world readable (just in case it matters).

Also, should consider whether this fd needs to be subject to the carefd
machinery.

Sharing the dc between al these differing usages is starting to rankle a
little, but I think it is necessary because it may have queued data from
a previous read which was larger than the current record, correct?

Hrm, isn't

Re: [Xen-devel] [PATCH 1/6] xen/MSI-X: latch MSI-X table writes

On Tue, 16 Jun 2015, Jan Beulich wrote:
  On 16.06.15 at 15:35, stefano.stabell...@eu.citrix.com wrote:
  On Fri, 5 Jun 2015, Jan Beulich wrote:
  @@ -322,6 +323,13 @@ static int xen_pt_msix_update_one(XenPCI
   
   pirq = entry-pirq;
   
  +if (pirq == XEN_PT_UNASSIGNED_PIRQ || s-msix-maskall ||
  +(entry-latch(VECTOR_CTRL)  PCI_MSIX_ENTRY_CTRL_MASKBIT)) {
  
  I admit I am having difficulties understanding the full purpose of these
  checks. Please add a comment on them.
 
 The comment would (pointlessly imo) re-state what the code already
 says:
 
  I guess the intention is only to make changes using the latest values,
  the ones in entry-latch, when the right conditions are met, otherwise
  keep using the old values. Is that right?
  
  In that case, don't we want to use the latest values on MASKBIT -
  !MASKBIT transitions? In general when unmasking?
 
 This is what we want. And with that, the questions you ask further
 down should be answered too: The function gets invoked with the
 pre-change mask flag state in -latch[], and updates the values
 used for actually setting up when that one has the entry masked
 (or mask-all is set). The actual new value gets written to -latch[]
 after the call.

I think this logic is counter-intuitive and prone to confuse the reader.
This change doesn't make sense on its own: when one will read
xen_pt_msix_update_one, won't be able to understand the function without
checking the call sites.

Could we turn it around to be more obvious?  Here check if
!(entry-latch(VECTOR_CTRL)  PCI_MSIX_ENTRY_CTRL_MASKBIT) and below only
call xen_pt_msix_update_one on MASKBIT - !MASKBIT transactions?

Or something like that?


  @@ -444,39 +432,28 @@ static void pci_msix_write(void *opaque,
   offset = addr % PCI_MSIX_ENTRY_SIZE;
   
   if (offset != PCI_MSIX_ENTRY_VECTOR_CTRL) {
  -const volatile uint32_t *vec_ctrl;
  -
   if (get_entry_value(entry, offset) == val
entry-pirq != XEN_PT_UNASSIGNED_PIRQ) {
   return;
   }
   
  +entry-updated = true;
  +} else if (msix-enabled  entry-updated 
  +   !(val  PCI_MSIX_ENTRY_CTRL_MASKBIT)) {
  +const volatile uint32_t *vec_ctrl;
  +
   /*
* If Xen intercepts the mask bit access, entry-vec_ctrl may not 
  be
* up-to-date. Read from hardware directly.
*/
   vec_ctrl = s-msix-phys_iomem_base + entry_nr * 
  PCI_MSIX_ENTRY_SIZE
   + PCI_MSIX_ENTRY_VECTOR_CTRL;
  +set_entry_value(entry, offset, *vec_ctrl);
  
  Why are you calling set_entry_value with the hardware vec_ctrl value? It
  doesn't look correct to me.  In any case, if you wanted to do it,
  shouldn't you just set/unset PCI_MSIX_ENTRY_CTRL_MASKBIT instead of the
  whole *vec_ctrl?
 
 The comment above the code explains it: What we have stored locally
 may not reflect reality, as we may not have seen all writes (and this
 indeed isn't just a may). And if out cached value isn't valid anymore,
 why would we not want to update all of it, rather than just the mask
 bit?

OK, however the previous code wasn't actually updating the entirety of
vector_ctrl. It was just using the updated value to check for
PCI_MSIX_ENTRY_CTRL_MASKBIT.  This is something else.  The new behavior
might be correct, but at least the commit message needs to explain it.


  -if (msix-enabled  !(*vec_ctrl  PCI_MSIX_ENTRY_CTRL_MASKBIT)) {
  -if (!entry-warned) {
  -entry-warned = true;
  -XEN_PT_ERR(s-dev, Can't update msix entry %d since 
  MSI-X is
  -already enabled.\n, entry_nr);
  -}
  -return;
  -}
  -
  -entry-updated = true;
  +xen_pt_msix_update_one(s, entry_nr);
  
  Shouldn't we call xen_pt_msix_update_one only if (*vec_ctrl 
  PCI_MSIX_ENTRY_CTRL_MASKBIT)? In other words, only when we see a
  MASKBIT - !MASKBIT transition?
 
 The combination of the !(val  PCI_MSIX_ENTRY_CTRL_MASKBIT)
 check in the if() surrounding this call and the
 (entry-latch(VECTOR_CTRL)  PCI_MSIX_ENTRY_CTRL_MASKBIT)
 check inside the function guarantee just that (i.e. the function
 invocation is benign in the other case, as entry-addr/entry-data
 would remain unchanged).

OK, maybe the code works as is, but it took me a long time to make sense
of it because it relies on the combinations of three checks in three
different places. I would prefer to change it into something more
obvious.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [Patch V3 2/3] usb: Introduce Xen pvUSB frontend (xen hcd)

2015-06-16 Thread Greg KH

On Tue, Jun 16, 2015 at 04:56:35PM +0200, Juergen Gross wrote:
 Hmm, I'm beginning to question the value of that file. May be I should
 just throw it away.

Given that you are going to have to justify every one of them, and
ensure that you are creating them in a race-free manner (I didn't look
to check), you might just drop it if they are not needed.

There's always debugfs for things like generic debugging stuff.  If you
want to do that, use the usb subdir in debugfs for your directory and
files.

thanks,

greg k-h

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v3 2/6] libxl: do not add a vkb backend to hvm guests

2015-06-16 Thread Wei Liu

On Wed, Jun 10, 2015 at 11:09:50AM +0100, Stefano Stabellini wrote:
 When QEMU restricts its xenstore connection, it cannot provide PV
 backends. A separate QEMU instance is required to provide PV backends in
 userspace, such as qdisk. With two separate instances, it is not
 possible to take advantage of vkb for mouse and keyboard, as the QEMU
 that emulates the graphic card (the device model), would be separate
 from the QEMU running the vkb backend (PV QEMU).
 

The question is that how would this affect the non-split setup.

But do I understand correctly we are moving towards a split setup
regardlessly whether QEMU supports xsrestrict or not? I.e. there is no
non-split setup in the future.

Wei.

 Signed-off-by: Stefano Stabellini stefano.stabell...@eu.citrix.com
 ---
  tools/libxl/libxl_create.c |5 -
  1 file changed, 5 deletions(-)
 
 diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
 index f0da7dc..a74b340 100644
 --- a/tools/libxl/libxl_create.c
 +++ b/tools/libxl/libxl_create.c
 @@ -1275,17 +1275,12 @@ static void domcreate_launch_dm(libxl__egc *egc, 
 libxl__multidev *multidev,
  {
  libxl__device_console console;
  libxl__device device;
 -libxl_device_vkb vkb;
  
  init_console_info(gc, console, 0);
  console.backend_domid = state-console_domid;
  libxl__device_console_add(gc, domid, console, state, device);
  libxl__device_console_dispose(console);
  
 -libxl_device_vkb_init(vkb);
 -libxl__device_vkb_add(gc, domid, vkb);
 -libxl_device_vkb_dispose(vkb);
 -
  dcs-dmss.dm.guest_domid = domid;
  if (libxl_defbool_val(d_config-b_info.device_model_stubdomain))
  libxl__spawn_stub_dm(egc, dcs-dmss);
 -- 
 1.7.10.4

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 23/27] tools/libxl: [RFC] Write checkpoint records into the stream

On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote:
 when signalled to do so by libxl__remus_domain_checkpoint_callback()

I think I saw that Remus wasn't currently working, so I'll let you and
Hongyang thrash something out before I spend too much effort reviewing
these last few RFC bits. Unless you think it is worth my having a look
now?



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 22/27] docs/libxl: [RFC] Introduce CHECKPOINT_END to support migration v2 remus streams

On 16/06/15 16:00, Ian Campbell wrote:
 On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote:
 In a remus senario, libxc will write a CHECKPOINT record, then hand ownership
 scenario

 of the fd to libxl.  Libxl then writes any records required and finishes with
 a CHECKPOINT_END record, then hands ownership of the fd back to libxc.
 Seems like a plausible scheme to me, if that's what the RFC was for.

The RFC was for all the support remus bits, where I wrote code but was
unable to test.  They will all be dropped for v2, once the suggested
adjustments are done.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 17/27] tools/libxl: Support converting a legacy stream to a v2 stream

On Tue, 2015-06-16 at 16:13 +0100, Andrew Cooper wrote:

  Since we only support N-N+1 we could perhaps tolerate the duplication
  if we agreed upon a reasonable schedule to remove all this compat stuff,
  e.g. in 4.7 or 4.8.
 
 N-N+1 is another item on the needs to be resolved before Xapi can use
 libxl list.
 
 In XenServer, we still have support for importing VMs from XenServer
 4.0, which is certainly older than Xen 3.2

OK. FWIW I think we should consider importing (i.e. restore) separate to
migration, since the former can be subjected to more heavy duty offline
conversions if necessary.

It's not out of the question that this new migration stream format will
allow us to sensibly increase the +1 to some larger number for
migrations too.

Ian.

 It might be reasonable to make a compile time option to omit the legacy
 conversion, but conversion should absolutely be on by default for the
 first release after this series being accepted.
 
 ~Andrew



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH] Revert xen-hvm: increase maxmem before calling xc_domain_populate_physmap

On 06/16/2015 04:51 PM, Stefano Stabellini wrote:
 On Tue, 16 Jun 2015, Wei Liu wrote:
 On Wed, Jun 10, 2015 at 01:55:13PM +0100, George Dunlap wrote:
 This reverts commit c1d322e6048796296555dd36fdd102d7fa2f50bf.

 The original commit fixes a bug when assigning a large number of
 devices which require option roms to a guest.  (One known
 configuration that needs extra memory is having more than 4 emulated
 NICs assigned.  Three or fewer NICs seems to work without this
 functionality.)

 However, by unilaterally increasing maxmem, it introduces two
 problems.

 First, now libxl's calculation of the required maxmem during migration
 is broken -- any guest which exercised this functionality will fail on
 migration.  (Guests which have the default number of devices are not
 affected.)

 Secondly, it makes it impossible for a higher-level toolstack or
 administer to predict how much memory a VM will actually use, making
 it much more difficult to effectively use all of the memory on a
 machine.

 The right solution to the original problem is to figure out a way for
 qemu to take pages from the existing pool of guest memory, rather than
 allocating more pages.

 That fix will take more time to develop than we have until the feature
 freeze.  In the mean time, the simplest way to fix the migration issue
 is to revert this change.  That will re-introduce the original bug,
 but it's an unusual corner case; and without migration it isn't fully
 functional yet anyway.

 Signed-off-by: George Dunlap george.dun...@eu.citrix.com
 ---
 I do think this is the right approach, but I'm mainly sending this is
 mainly to open up discussion.

 CC: Stefano Stabellini stefano.stabell...@citrix.com
 CC: Wei Liu wei.l...@citrix.com
 CC: Ian Campbell ian.campb...@citrix.com
 CC: Andrew Cooper andrew.coop...@citrix.com

 Stefano, Andrew, any comments?

 If we're to do this we need to do it now.

 I think reverting this change in QEMU and relevant changes in libxl
 would be the most viable solution to solve this for this release.
 
 Reverting this patch doesn't really solve the problem: instead of
 breaking on migration when the VM has more than 3 emulated NICs, the VM
 simply refuses to start in that case. I guess it can be considered a
 small improvement but certainly not a fix.
 
 Given that the migration issue only happens in an unusual corner case,
 are we really in a hurry to revert this commit and go back to the
 failure to start, even before we actually figure out what the proper fix
 is?

I'm in a hurry to go back to a world where qemu doesn't unexpectedly
allocate more RAM to a guest.  If the real problem with the original
patch was that it broke migration, we could fix that pretty easily; but
the real problem (to me) with the original patch is that it
unpredicatably allocates more memory to a guest that the toolstack
doesn't know about.

 -George


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCHv2 3/5] evtchn: use a per-event channel lock for sending events

On 16/06/15 16:19, David Vrabel wrote:
 @@ -1221,6 +1277,8 @@ void notify_via_xen_event_channel(struct domain *ld, 
 int lport)
  evtchn_port_set_pending(rd, rchn-notify_vcpu_id, rchn);
  }
  
 +spin_unlock(lchn-lock);
 +
  spin_unlock(ld-event_lock);
  }

 Again I think the event lock can be dropped earlier now.
 
 Ditto.

Uh, no. This is notify.  I've kept the locking like this because of the
ld-is_dying check.  I think we need the ld-event_lock in case
d-is_dying is set and evtchn_destroy(ld) is called.

David

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 6/6] xen/pass-through: constify some static data

On Fri, 5 Jun 2015, Jan Beulich wrote:
 This is done indirectly by adjusting two typedefs and helps emphasizing
 that the respective tables aren't supposed to be modified at runtime
 (as they may be shared between devices).
 
 Signed-off-by: Jan Beulich jbeul...@suse.com

Reviewed-by: Stefano Stabellini stefano.stabell...@eu.citrix.com


 --- a/qemu/upstream/hw/xen/xen_pt.h
 +++ b/qemu/upstream/hw/xen/xen_pt.h
 @@ -31,7 +31,7 @@ void xen_pt_log(const PCIDevice *d, cons
  /* Helper */
  #define XEN_PFN(x) ((x)  XC_PAGE_SHIFT)
  
 -typedef struct XenPTRegInfo XenPTRegInfo;
 +typedef const struct XenPTRegInfo XenPTRegInfo;
  typedef struct XenPTReg XenPTReg;
  
  typedef struct XenPCIPassthroughState XenPCIPassthroughState;
 @@ -135,11 +135,11 @@ struct XenPTReg {
  uint32_t data; /* emulated value */
  };
  
 -typedef struct XenPTRegGroupInfo XenPTRegGroupInfo;
 +typedef const struct XenPTRegGroupInfo XenPTRegGroupInfo;
  
  /* emul reg group size initialize method */
  typedef int (*xen_pt_reg_size_init_fn)
 -(XenPCIPassthroughState *, const XenPTRegGroupInfo *,
 +(XenPCIPassthroughState *, XenPTRegGroupInfo *,
   uint32_t base_offset, uint8_t *size);
  
  /* emulated register group information */
 @@ -154,7 +154,7 @@ struct XenPTRegGroupInfo {
  /* emul register group management table */
  typedef struct XenPTRegGroup {
  QLIST_ENTRY(XenPTRegGroup) entries;
 -const XenPTRegGroupInfo *reg_grp;
 +XenPTRegGroupInfo *reg_grp;
  uint32_t base_offset;
  uint8_t size;
  QLIST_HEAD(, XenPTReg) reg_tbl_list;
 --- a/qemu/upstream/hw/xen/xen_pt_config_init.c
 +++ b/qemu/upstream/hw/xen/xen_pt_config_init.c
 @@ -96,8 +96,7 @@ XenPTReg *xen_pt_find_reg(XenPTRegGroup 
  }
  
  static uint32_t get_throughable_mask(const XenPCIPassthroughState *s,
 - const XenPTRegInfo *reg,
 - uint32_t valid_mask)
 + XenPTRegInfo *reg, uint32_t valid_mask)
  {
  uint32_t throughable_mask = ~(reg-emu_mask | reg-ro_mask);

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 07/12] x86/virt/guest/xen: Remove use of pgd_list from the Xen guest code

On 16/06/15 15:19, Boris Ostrovsky wrote:
 On 06/16/2015 10:15 AM, David Vrabel wrote:
 On 15/06/15 21:35, Ingo Molnar wrote:
 * David Vrabel david.vra...@citrix.com wrote:

 On 15/06/15 10:05, Ian Campbell wrote:
 On Sat, 2015-06-13 at 11:49 +0200, Ingo Molnar wrote:
 xen_mm_pin_all()/unpin_all() are used to implement full guest
 instance
 suspend/restore. It's a stop-all method that needs to iterate
 through all
 allocated pgds in the system to fix them up for Xen's use.

 This code uses pgd_list, probably because it was an easy interface.

 But we want to remove the pgd_list, so convert the code over to
 walk all
 tasks in the system. This is an equivalent method.
 It is not equivalent because pgd_alloc() now populates entries in
 pgds that are
 not visible to xen_mm_pin_all() (note how the original code adds the
 pgd to the
 pgd_list in pgd_ctor() before calling pgd_prepopulate_pmd()).  These
 newly
 allocated page tables won't be correctly converted on suspend/resume
 and the new
 process will die after resume.
 So how should the Xen logic be fixed for the new scheme? I can't say
 I can see
 through the paravirt complexity here.
 Actually, since we freeze_processes() before trying to pin page tables,
 I think it should be ok as-is.

 I'll put the patch through some tests.
 
 Actually, I just ran this through a couple of boot/suspend/resume tests
 and didn't see any issues (with the one fix I mentioned to Ingo
 earlier). On unstable Xen only.

In which case this can have a:

Reviewed-by: David Vrabel david.vra...@citrix.com

Thanks.

David

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

[Xen-devel] [Patch V3 3/3] xen: add Xen pvUSB maintainer

Add myself as maintainer for the Xen pvUSB stuff.

Signed-off-by: Juergen Gross jgr...@suse.com
Acked-by: Konrad Rzeszutek Wilk konrad.w...@oracle.com
---
 MAINTAINERS | 8 
 1 file changed, 8 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index d8afd29..8ef4e83 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -10965,6 +10965,14 @@ F: drivers/scsi/xen-scsifront.c
 F: drivers/xen/xen-scsiback.c
 F: include/xen/interface/io/vscsiif.h
 
+XEN PVUSB DRIVER
+M: Juergen Gross jgr...@suse.com
+L: xen-de...@lists.xenproject.org (moderated for non-subscribers)
+L: linux-...@vger.kernel.org
+S: Supported
+F: divers/usb/host/xen*
+F: include/xen/interface/io/usbif.h
+
 XEN SWIOTLB SUBSYSTEM
 M: Konrad Rzeszutek Wilk konrad.w...@oracle.com
 L: xen-de...@lists.xenproject.org (moderated for non-subscribers)
-- 
2.1.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH V4 3/7] libxl: add pvusb API

On 06/16/2015 03:20 PM, Juergen Gross wrote:
 On 06/16/2015 04:06 PM, George Dunlap wrote:
 On 06/16/2015 02:49 PM, Juergen Gross wrote:
 On 06/16/2015 03:29 PM, George Dunlap wrote:
 On 06/16/2015 02:23 PM, Juergen Gross wrote:
 Hmm, I'd rather have it all in one place. Putting it in qemu would
 enable us to handle hotplug as well. A USB device assigned via it's
 serial to a domU could be automatically passed through by qemu when
 showing up. This isn't possible today, but we wouldn't have to change
 libxl again for supporting it, only qemu would have to be adapted.

 Look, we're talking here about the libxl interface.  Ian thinks that
 *at
 the libxl layer* we need to be able to specify all these things.  It
 doesn't matter at this point whether qemu can also do it or not.

 When I'm adding new functionality (and this is the case here) I always
 try to add it at the level where it should be. qemu already is capable
 of finding a USB device by various means and can be extended easily to
 support our needs. And please remember, for writing the pvusb backend
 I'm already doing changes to qemu.

 So why should we add the same functionality to libxl by reading sysfs
 instead of letting it do qemu via libusb? And what about BSD? Letting
 qemu find the device would avoid two variants in libxl.

 If the libxl interface only has bus and port, then it doesn't matter
 what qemu can do -- the caller has no way of asking qemu to watch for
 vid:did:serial.

 If libxl has vid:did:serial in the interface, then it's only an
 implementation detail whether we pass that into qemu or whether we pass
 some other way of uniquely indentifying a particular device.  And given
 that no *current* versions of qemu support vid:did:serial, the most
 sensible way to implement this would be to have libxl do the conversion
 and send over bus:addr -- that way when you update libxl you get that
 functionality regardless of whether qemu has it.

 Unless you're suggesting that the libxl interface should be a string
 that we just pass verbatim to qemu.  If that's what you mean, it's a
 complete non-starter, and I'm sure Ian will agree with me.  There's no
 way that we can provide any interface consistency or any documentation
 of what this string does if it boils down to This does whatever the
 version of qemu you happen to be running does.
 
 No, I didn't expect libxl to just pass an arbitrary string to qemu.
 My point was to avoid the sysfs accesses in libxl in order to support
 BSD as well and to reduce the complexity.
 
 In the future, if we actually implement the automatically grab this
 device when it shows up functionality, then yes, having it in qemu
 rather than having a libxl daemon sit around and watch for those
 things,
 might be handy.  But we can cross that bridge when we come to it.

 I would agree if the efforts in libxl would be much smaller than in
 qemu. But if the efforts are comparable I'd rather do it in the
 component which will make such an enhancement easier.

 They're both small, so this is really bikeshedding at the moment.  The
 most important question is the interface: do we include vid:did:serial
 in the libxl interface.  Since you want qemu to do that, I take it that
 your answer to that is yes.
 
 Correct.
 
 When we have patches submitted, we can discuss whether libxl should only
 support the (as-yet unreleased) versions of qemu that do the search
 themselves, or whether they should support all versions of qemu-xen.
 (The answer seems pretty obvious to me...)
 
 May be I made one wrong assumption: I thought adding USB-passthrough for
 HVM domains would require qemu changes as well. If this is not the case
 I can understand your reasoning.

Oh, right.  No, getting HVM hotplug working over QMP was pretty
straightforward; probably only a week's work.  It's been the interface
that has been the real difficult bit.

There are some things that would be nicer to implement in qemu.  Right
now there's no way to *query* what devices have been plugged in; my
patch series had to store information in xenstore.

At some point in the future, it might make sense for libxl to mainly
just ferry the options back and forth to qemu; even then I'm not sure.
In any case it's an implementation detail that can be changed at any
time.  It's the interface we have to be really careful about.

 -George


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH V4 3/7] libxl: add pvusb API [and 1 more messages]

On 06/16/2015 02:37 PM, Ian Jackson wrote:
 George Dunlap writes (Re: [Xen-devel] [PATCH V4 3/7] libxl: add pvusb API):
 Remember that the path you gave in your previous e-mail isn't the path
 for the *usb device*, it's the path for the *block device*.  It
 contains a PCI address, but it looks like it also contains part of the
 USB topology.  Are you sure that's actually a stable interface, or
 does it just happen that on your hardware the discovery always happens
 in the same order?
 
 The block device is (in path terms) underneath the usb device,
 obviously.  Not all of that path is relevant to identifying the
 USB device.
 
 On my system /sys/bus/usb/devices/2-3.3 is a link to
 /sys/devices/pci:00/:00:1d.7/usb2/2-3/2-3.3/.  This contains
 the pci bus address, but it also contains the bus number, which we've
 just said may be unstable across reboots.
 
 You mean the 2 in `usb2' ?  I think that bus number is the bus number
 within the controller, not globally.

On the contrary:

$ ls -l usb*
lrwxrwxrwx 1 root root 0 Jun 15 10:14 usb1 -
../../../devices/pci:00/:00:1a.7/usb1/
lrwxrwxrwx 1 root root 0 Jun 15 10:14 usb2 -
../../../devices/pci:00/:00:1d.7/usb2/
lrwxrwxrwx 1 root root 0 Jun 15 10:14 usb3 -
../../../devices/pci:00/:00:1a.0/usb3/
lrwxrwxrwx 1 root root 0 Jun 15 10:14 usb4 -
../../../devices/pci:00/:00:1a.1/usb4/
lrwxrwxrwx 1 root root 0 Jun 15 10:14 usb5 -
../../../devices/pci:00/:00:1a.2/usb5/
lrwxrwxrwx 1 root root 0 Jun 15 10:14 usb6 -
../../../devices/pci:00/:00:1d.0/usb6/
lrwxrwxrwx 1 root root 0 Jun 15 10:14 usb7 -
../../../devices/pci:00/:00:1d.1/usb7/
lrwxrwxrwx 1 root root 0 Jun 15 10:14 usb8 -
../../../devices/pci:00/:00:1d.2/usb8/

$ ls /sys//devices/pci:00/:00:1d.7/ | grep usb
usb2/

In other words, the global bus enumeration leaks its way into the device
path; which means at very least the sysfs device path is potentially
unstable across reboots even if you include the pci controller it's on.

 I suppose it might be possible to specify buspci,port -- the pci
 address of the root bus, and the topology from there.  In theory I
 guess that should be stable?
 
 Yes.  The whole point of paths like this is that they are stable if
 the physical topology doesn't change.  So on my netbook
 
   /dev/disk/by-path/pci-:00:1d.7-usb-0:1:1.0-scsi-0:0:0:0-part1
 
 always refers to the 1st MBR partition on logical device 0 on the USB
 storage device plugged into the USB port physically on the front left
 of the computer.

That would be great if it were true, but I'm not convinced that the
above path doesn't include a globally-enumerated bus number, which might
change across reboots if it's enumerated in a different order.

It may be that the format above is *not* based on the sysfs path of the
device; that the '0' immediately after the USB means the first (and
perhaps only) controller at this PCI address.  In which case, yes,
having a string like this might be a unique identifier for a particular
port that would be stable across reboots.  That needs some investigation.

 In any case, at the moment you're essentially inventing from whole
 cloth a new way of specifying USB devices that (as far as I know)
 isn't supported by any other program that uses USB.
 
 If you can't specify the device by hardware path, you can't specify it
 deterministically.
 
 And as you can see it _is_ supported by other programs that use USB.
 mount can use it!

Mount doesn't know anything about USB devices; all it knows are block
devices.  You might as well say that 'ls' knows about USB devices
because it can read files on a filesystem that resides on a USB stick.

I'm talking about programs that explicitly manipulate USB devices -- in
particular, those that may pass them through to a guest or seize them,
like libvirt, qemu, virtualbox, c.

I'm not opposed to doing something new if it really is better.  But when
you find that a dozen other projects before you have done things one
way, you should at least take care before you decide to do something
radically different, to make sure that it's because you actually have
something better, and not because you've just missed something.

 I think the hardware path to the controller, at least, should be
 treated as an opaque OS-specific string.  It might have a different
 format on BSD.

If we can make an actual path that's stable across reboots, that would
certainly be a good thing.  But if it's not stable across reboots, it
has no advantage over the bus-port[.port.port] interface.

 -George

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

[Xen-devel] [xen-4.2-testing test] 58604: trouble: blocked/broken/fail/pass

2015-06-16 Thread osstest service user

flight 58604 xen-4.2-testing real [real]
http://logs.test-lab.xenproject.org/osstest/logs/58604/

Failures and problems with tests :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-amd64-libvirt  3 host-install(3) broken in 58584 REGR. vs. 58411

Tests which are failing intermittently (not blocking):
 test-amd64-amd64-pair 3 host-install/src_host(3) broken in 58584 pass in 58604
 test-amd64-i386-pair  4 host-install/dst_host(4) broken in 58584 pass in 58604
 test-amd64-i386-xl-winxpsp3-vcpus1 3 host-install(3) broken in 58584 pass in 
58604
 test-amd64-amd64-xl-qemut-winxpsp3 3 host-install(3) broken in 58584 pass in 
58604
 test-i386-i386-libvirt3 host-install(3)   broken pass in 58540
 test-amd64-i386-qemuu-freebsd10-amd64  3 host-install(3)  broken pass in 58540
 test-amd64-i386-xl-win7-amd64  3 host-install(3)  broken pass in 58540
 test-amd64-amd64-xl-multivcpu  3 host-install(3)  broken pass in 58584
 test-i386-i386-xl 3 host-install(3)   broken pass in 58584
 test-i386-i386-xl-sedf3 host-install(3)   broken pass in 58584
 test-amd64-i386-xl-qemut-debianhvm-amd64 3 host-install(3) broken pass in 58584
 test-amd64-amd64-xl-qemut-debianhvm-amd64 3 host-install(3) broken pass in 
58584
 test-i386-i386-xl-qemuu-winxpsp3  3 host-install(3)   broken pass in 58584
 test-amd64-amd64-xl-qemuu-ovmf-amd64  6 xen-boot   fail in 58584 pass in 58604
 test-amd64-amd64-xl-qemuu-win7-amd64 14 guest-localmigrate.2 fail in 58584 
pass in 58604
 test-amd64-amd64-xl-win7-amd64 16 guest-stopfail pass in 58584
 test-amd64-amd64-xl-qemuu-winxpsp3 15 guest-localmigrate/x10 fail pass in 58584

Regressions which are regarded as allowable (not blocking):
 test-amd64-i386-xl-win7-amd64 16 guest-stop   fail in 58540 like 58411
 test-amd64-i386-xl-qemut-win7-amd64 16 guest-stop  fail like 58411
 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stop fail like 58411

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt  1 build-check(1)blocked in 58584 n/a
 test-amd64-amd64-rumpuserxen-amd64  1 build-check(1)   blocked n/a
 test-amd64-i386-rumpuserxen-i386  1 build-check(1)   blocked  n/a
 test-i386-i386-rumpuserxen-i386  1 build-check(1)   blocked  n/a
 test-i386-i386-libvirt   12 migrate-support-check fail in 58540 never pass
 test-amd64-i386-xl-qemuu-ovmf-amd64  9 debian-hvm-install  fail never pass
 build-amd64-rumpuserxen   5 rumpuserxen-buildfail   never pass
 build-i386-rumpuserxen5 rumpuserxen-buildfail   never pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64  9 debian-hvm-install fail never pass
 test-amd64-amd64-libvirt 12 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  12 migrate-support-checkfail   never pass
 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop  fail never pass
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop fail never pass
 test-amd64-i386-xend-winxpsp3 20 leak-check/check fail  never pass
 test-amd64-i386-xend-qemut-winxpsp3 20 leak-check/checkfail never pass

version targeted for testing:
 xen  97134c441d6d81ba0d7cdcfdc4d8315115b99dce
baseline version:
 xen  21a8344ca38a2797a13b4bf57031b6f49ae12ccb


People who touched revisions under test:
  Ian Campbell ian.campb...@citrix.com
  Ian Jackson ian.jack...@eu.citrix.com
  Jan Beulich jbeul...@suse.com
  Stefano Stabellini stefano.stabell...@eu.citrix.com


jobs:
 build-amd64  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-i386-pvops pass
 build-amd64-rumpuserxen  fail
 build-i386-rumpuserxen   fail
 test-amd64-amd64-xl  pass
 test-amd64-i386-xl   pass
 test-i386-i386-xlbroken  
 test-amd64-i386-rhel6hvm-amd pass
 test-amd64-i386-qemut-rhel6hvm-amd   pass
 test-amd64-i386-qemuu-rhel6hvm-amd   pass
 test-amd64-amd64-xl-qemut-debianhvm-amd64broken  
 test-amd64-i386-xl-qemut-debianhvm-amd64 broken  
 test-amd64-amd64-xl-qemuu-debianhvm-amd64

Re: [Xen-devel] [PATCH 01/27] tools/libxl: Fix libxl__ev_child_inuse() check for not-yet-initialised children

On Tue, 2015-06-16 at 14:36 +0100, Andrew Cooper wrote:
 On 16/06/15 14:21, Ian Campbell wrote:
  On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote:
  Shortly, libxl will be juggling multiple parallel operations, and will
  possibly have to take error decisions before some tasks have been set up.
  It would be preferable, I think, to arrange to call libxl__ev_child_init
  on all such libxl__ev_child structs either up front or certainly before
  there is any possibility of needing to unwind them.
 
  Such an init would at worst correspond to exactly the place where the
  zeroed structure you refer to is zeroed.
 
 It is possible that one bit fails before it can be calculated whether
 the second bit needs to start or not.

You can call libxl__ev_child_init without needing to know that, i.e. you
can do them all at the point where you allocate/initialise the
containing struct.

 At the moment, all bits in libxl in this area do initialisation
 immediately before use; most bits are even initialised in the function
 which starts their actions.  Some bits are initialised differently
 depending on the path taken to get to the initialisation site. 
 
 It would be non-trivial to initialise everything appropriately at the
 very start.

You don't need to fully init, just call libxl__ev_child_init in order to
arrange for correct behaviour from libxl__ev_child_inuse and friends.
Actually turning it into a useful child can stay where it is if it is
tricky to arrange for those things to happen at the same time.

Ian.



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 19/27] tools/libxc+libxl+xl: Restore v2 streams

On 16/06/15 15:53, Ian Campbell wrote:
 On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote:
 @@ -377,6 +384,28 @@ static void record_body_done(libxl__egc *egc,
  stream_failed(egc, stream, ret);
  }
  
 +void libxl__xc_domain_restore_done(libxl__egc *egc, void *dcs_void,
 +   int ret, int retval, int errnoval)
 +{
 +libxl__domain_create_state *dcs = dcs_void;
 +STATE_AO_GC(dcs-ao);
 +
 +if (ret)
 +goto err;
 +
 +if (retval) {
 +LOGEV(ERROR, errnoval, restoring domain);
 +ret = ERROR_FAIL;
 +goto err;
 +}
 +
 +libxl__stream_read_continue(egc, dcs-srs);
 continue? Is this something to do with checkpointing?

No.  Simply continue reading records from the stream.

The code has changed since the first iteration, where
libxl__stream_read_continue() was called from outside of this
translation unit.  As it currently stands, I should make it a static
function.


 diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
 index 23f27d4..7418d92 100644
 --- a/tools/libxl/libxl_types.idl
 +++ b/tools/libxl/libxl_types.idl
 @@ -346,6 +346,8 @@ libxl_domain_create_info = Struct(domain_create_info,[
  
  libxl_domain_restore_params = Struct(domain_restore_params, [
 At some point we will need a LIBXL_HAVE #define.

  (checkpointed_stream, integer),
 +(stream_version, uint32, {'init_val': '1'}),
 If we aren't going to go for an IDL enum rather than a uint32 you
 probably just want the bare integer 1.

 But, I suspect we would prefer an enum, i.e an explicit list of known
 versions, rather than an integer?

Ideally, this should match the version field in the libxl stream header
record.  Realistically, I never expect version 2 needing bumping to
version 3.


 I wonder when, if ever, we will be able to flip this to 2? I suppose
 whenever the legacy conversion stuff gets pulled out.

 +(legacy_width, uint32),
 From what I've seen so far this is never user provided but is internal
 to libxl and detected[0] at runtime. As such it belongs somewhere else
 other than in the public API.

 [0] FVO detected == hardcoded depending on the build arch

The intention was originally to expose it to the user.  While this is
possible for `xl restore /path/to/legacy/save --was-32bit-toolstack`, it
is much harder for the `xl migrate` case where we cannot control the
sending side.

I am half tempted to say that anyone experiencing 32-64bit problems
should just use the conversion helper themselves.  It is deliberately
written to be usable on the command line as well as automatically.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] PCI Passthrough ARM Design : Draft1

2015-06-16 Thread Roger Pau Monné

El 16/06/15 a les 18.13, Stefano Stabellini ha escrit:
 On Thu, 11 Jun 2015, Ian Campbell wrote:
 On Thu, 2015-06-11 at 07:25 -0400, Julien Grall wrote:
 Hi Ian,

 On 11/06/2015 04:56, Ian Campbell wrote:
 On Wed, 2015-06-10 at 15:21 -0400, Julien Grall wrote:
 Hi,

 On 10/06/2015 08:45, Ian Campbell wrote:
 4. DomU access / assignment PCI device
 --
 When a device is attached to a domU, provision has to be made such that
 it can
 access the MMIO space of the device and xen is able to identify the 
 mapping
 between guest bdf and system bdf. Two hypercalls are introduced

 I don't think we want/need new hypercalls here, the same existing
 hypercalls which are used on x86 should be suitable. That's
 XEN_DOMCTL_memory_mapping from the toolstack I think.

 XEN_DOMCTL_memory_mapping is done by QEMU for x86 HVM when the guest
 (i.e hvmloader?) is writing in the PCI BAR.

 What about for x86 PV? I think it is done by the toolstack there, I
 don't know what pciback does with accesses to BAR registers.

 XEN_DOMCTL_memory_mapping is only used to map memory in stage-2 page 
 table. This is only used for auto-translated guest.

 In the case of x86 PV, the page-table is managed by the guest. The only 
 things to do is to give the MMIO permission to the guest in order to the 
 let him use them. This is done at boot time in the toolstack.

 Ah yes, makes sense.

 Manish, this sort of thing and the constraints etc should be discussed
 in the doc please.
  
 I think that the toolstack (libxl) will need to call
 xc_domain_memory_mapping (XEN_DOMCTL_memory_mapping), in addition to
 xc_domain_iomem_permission, for auto-translated PV guests on x86 (PVH)
 and ARM guests.

I'm not sure about this, AFAICT you are suggesting that the toolstack
(or domain builder for Dom0) should setup the MMIO regions on behalf of
the guest using the XEN_DOMCTL_memory_mapping hypercall.

IMHO the toolstack should not setup MMIO regions and instead the guest
should be in charge of setting them in the p2m by using a hypercall (or
at least that was the plan on x86 PVH).

Roger.


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 0/6] x86: reduce paravirtualized spinlock overhead


AFAIK there are no outstanding questions for more than one month now.
I'd appreciate some feedback or accepting these patches.


Juergen

On 04/30/2015 12:53 PM, Juergen Gross wrote:

Paravirtualized spinlocks produce some overhead even if the kernel is
running on bare metal. The main reason are the more complex locking
and unlocking functions. Especially unlocking is no longer just one
instruction but so complex that it is no longer inlined.

This patch series addresses this issue by adding two more pvops
functions to reduce the size of the inlined spinlock functions. When
running on bare metal unlocking is again basically one instruction.

Compile tested with CONFIG_PARAVIRT_SPINLOCKS on and off, 32 and 64
bits.

Functional testing on bare metal and as Xen dom0.

Correct patching verified by disassembly of active kernel.

Juergen Gross (6):
   x86: use macro instead of 0 for setting TICKET_SLOWPATH_FLAG
   x86: move decision about clearing slowpath flag into arch_spin_lock()
   x86: introduce new pvops function clear_slowpath
   x86: introduce new pvops function spin_unlock
   x86: switch config from UNINLINE_SPIN_UNLOCK to INLINE_SPIN_UNLOCK
   x86: remove no longer needed paravirt_ticketlocks_enabled

  arch/x86/Kconfig  |  1 -
  arch/x86/include/asm/paravirt.h   | 13 +
  arch/x86/include/asm/paravirt_types.h | 12 
  arch/x86/include/asm/spinlock.h   | 53 ---
  arch/x86/include/asm/spinlock_types.h |  3 +-
  arch/x86/kernel/kvm.c | 14 +
  arch/x86/kernel/paravirt-spinlocks.c  | 42 +--
  arch/x86/kernel/paravirt.c| 12 
  arch/x86/kernel/paravirt_patch_32.c   | 25 +
  arch/x86/kernel/paravirt_patch_64.c   | 24 
  arch/x86/xen/spinlock.c   | 23 +--
  include/linux/spinlock_api_smp.h  |  2 +-
  kernel/Kconfig.locks  |  7 +++--
  kernel/Kconfig.preempt|  3 +-
  kernel/locking/spinlock.c |  2 +-
  lib/Kconfig.debug |  1 -
  16 files changed, 154 insertions(+), 83 deletions(-)




___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [Patch V3 2/3] usb: Introduce Xen pvUSB frontend (xen hcd)

2015-06-16 Thread Greg KH

On Tue, Jun 16, 2015 at 04:32:34PM +0200, Juergen Gross wrote:
 Introduces the Xen pvUSB frontend. With pvUSB it is possible for a Xen
 domU to communicate with a USB device assigned to that domU. The
 communication is all done via the pvUSB backend in a driver domain
 (usually Dom0) which is owner of the physical device.
 
 The pvUSB frontend is a USB hcd for a virtual USB host connector.
 
 The code is taken from the pvUSB implementation in Xen done by Fujitsu
 based on Linux kernel 2.6.18.
 
 Changes from the original version are:
 - port to upstream kernel
 - put all code in just one source file
 - move module to appropriate location in kernel tree
 - adapt to Linux style guide
 - minor code modifications to increase readability
 
 Signed-off-by: Juergen Gross jgr...@suse.com
 ---
  drivers/usb/host/Kconfig   |   11 +
  drivers/usb/host/Makefile  |1 +
  drivers/usb/host/xen-hcd.c | 1638 
 
  3 files changed, 1650 insertions(+)
  create mode 100644 drivers/usb/host/xen-hcd.c
 
 diff --git a/drivers/usb/host/Kconfig b/drivers/usb/host/Kconfig
 index 197a6a3..3361b4b 100644
 --- a/drivers/usb/host/Kconfig
 +++ b/drivers/usb/host/Kconfig
 @@ -788,3 +788,14 @@ config USB_HCD_TEST_MODE
 This option is of interest only to developers who need to validate
 their USB hardware designs.  It is not needed for normal use.  If
 unsure, say N.
 +
 +config USB_XEN_HCD
 + tristate Xen usb virtual host driver
 + depends on XEN
 + select XEN_XENBUS_FRONTEND
 + help
 +   The Xen usb virtual host driver serves as a frontend driver enabling
 +   a Xen guest system to access USB Devices passed through to the guest
 +   by the Xen host (usually Dom0).
 +   Only needed if the kernel is running in a Xen guest and generic
 +   access to a USB device is needed.
 diff --git a/drivers/usb/host/Makefile b/drivers/usb/host/Makefile
 index 65b0b6a..3779696 100644
 --- a/drivers/usb/host/Makefile
 +++ b/drivers/usb/host/Makefile
 @@ -75,3 +75,4 @@ obj-$(CONFIG_USB_HCD_SSB)   += ssb-hcd.o
  obj-$(CONFIG_USB_FUSBH200_HCD)   += fusbh200-hcd.o
  obj-$(CONFIG_USB_FOTG210_HCD)+= fotg210-hcd.o
  obj-$(CONFIG_USB_MAX3421_HCD)+= max3421-hcd.o
 +obj-$(CONFIG_USB_XEN_HCD)+= xen-hcd.o
 diff --git a/drivers/usb/host/xen-hcd.c b/drivers/usb/host/xen-hcd.c
 new file mode 100644
 index 000..9da2856
 --- /dev/null
 +++ b/drivers/usb/host/xen-hcd.c
 @@ -0,0 +1,1638 @@
 +/*
 + * xen-hcd.c
 + *
 + * Xen USB Virtual Host Controller driver
 + *
 + * Copyright (C) 2009, FUJITSU LABORATORIES LTD.
 + * Author: Noboru Iwamatsu n_iwama...@jp.fujitsu.com
 + *
 + * This program is free software; you can redistribute it and/or modify
 + * it under the terms of the GNU General Public License as published by
 + * the Free Software Foundation; either version 2 of the License, or
 + * (at your option) any later version.
 + *
 + * This program is distributed in the hope that it will be useful,
 + * but WITHOUT ANY WARRANTY; without even the implied warranty of
 + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
 + * GNU General Public License for more details.
 + *
 + * You should have received a copy of the GNU General Public License
 + * along with this program; if not, see http://www.gnu.org/licenses/.
 + *
 + * Or, by your choice:
 + *
 + * When distributed separately from the Linux kernel or incorporated into
 + * other software packages, subject to the following license:
 + *
 + * Permission is hereby granted, free of charge, to any person obtaining a 
 copy
 + * of this software and associated documentation files (the Software), to
 + * deal in the Software without restriction, including without limitation the
 + * rights to use, copy, modify, merge, publish, distribute, sublicense, 
 and/or
 + * sell copies of the Software, and to permit persons to whom the Software is
 + * furnished to do so, subject to the following conditions:
 + *
 + * The above copyright notice and this permission notice shall be included in
 + * all copies or substantial portions of the Software.
 + *
 + * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL 
 THE
 + * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
 + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
 + * DEALINGS IN THE SOFTWARE.
 + */
 +
 +#include linux/module.h
 +#include linux/usb.h
 +#include linux/list.h
 +#include linux/usb/hcd.h
 +#include linux/io.h
 +
 +#include xen/xen.h
 +#include xen/xenbus.h
 +#include xen/grant_table.h
 +#include xen/events.h
 +#include xen/page.h
 +
 +#include xen/interface/io/usbif.h
 +
 +/* Private per-URB data */
 +struct urb_priv {
 + struct

[Xen-devel] [linux-3.4 test] 58597: regressions - trouble: broken/fail/pass

2015-06-16 Thread osstest service user

flight 58597 linux-3.4 real [real]
http://logs.test-lab.xenproject.org/osstest/logs/58597/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-xl   9 debian-install fail REGR. vs. 52209-bisect
 test-amd64-amd64-pair   15 debian-install/dst_host fail REGR. vs. 52715-bisect
 test-amd64-i386-pair15 debian-install/dst_host fail REGR. vs. 56366-bisect

Regressions which are regarded as allowable (not blocking):
 test-amd64-amd64-xl-sedf-pin  3 host-install(3) broken blocked in 56366-bisect
 test-amd64-i386-qemut-rhel6hvm-amd 3 host-install(3) broken blocked in 
56366-bisect
 test-amd64-i386-qemuu-rhel6hvm-amd 3 host-install(3) broken blocked in 
56366-bisect
 test-amd64-i386-libvirt   9 debian-installfail blocked in 56366-bisect
 test-amd64-i386-rumpuserxen-i386  6 xen-boot  fail blocked in 56366-bisect
 test-amd64-amd64-libvirt  9 debian-installfail blocked in 56366-bisect
 test-amd64-i386-xl-qemuu-winxpsp3-vcpus1 6 xen-boot fail blocked in 
56366-bisect
 test-amd64-i386-qemut-rhel6hvm-intel  6 xen-boot  fail blocked in 56366-bisect
 test-amd64-i386-freebsd10-amd64  6 xen-boot   fail blocked in 56366-bisect
 test-amd64-amd64-xl-qemuu-debianhvm-amd64 6 xen-boot fail blocked in 
56366-bisect
 test-amd64-i386-xl-qemuu-winxpsp3  6 xen-boot fail blocked in 56366-bisect
 test-amd64-amd64-rumpuserxen-amd64  6 xen-bootfail blocked in 56366-bisect
 test-amd64-amd64-xl-multivcpu  6 xen-boot fail blocked in 56366-bisect
 test-amd64-amd64-xl-qemut-winxpsp3  6 xen-bootfail blocked in 56366-bisect
 test-amd64-i386-xl-qemuu-ovmf-amd64  6 xen-boot   fail blocked in 56366-bisect
 test-amd64-i386-xl9 debian-installfail blocked in 56366-bisect
 test-amd64-amd64-xl-qemuu-winxpsp3  6 xen-bootfail blocked in 56366-bisect
 test-amd64-i386-libvirt-xsm   6 xen-boot  fail blocked in 56366-bisect
 test-amd64-i386-xl-qemut-winxpsp3  6 xen-boot fail blocked in 56366-bisect
 test-amd64-i386-xl-qemut-winxpsp3-vcpus1 6 xen-boot fail blocked in 
56366-bisect
 test-amd64-amd64-xl-sedf  9 debian-installfail blocked in 56366-bisect
 test-amd64-i386-xl-qemuu-debianhvm-amd64 6 xen-boot fail blocked in 
56366-bisect
 test-amd64-i386-freebsd10-i386  6 xen-bootfail blocked in 56366-bisect
 test-amd64-amd64-xl-qemut-debianhvm-amd64 6 xen-boot fail blocked in 
56366-bisect
 test-amd64-i386-qemuu-rhel6hvm-intel  6 xen-boot  fail blocked in 56366-bisect
 test-amd64-i386-xl-qemut-debianhvm-amd64 6 xen-boot fail blocked in 
56366-bisect
 test-amd64-i386-xl-qemut-debianhvm-amd64-xsm 6 xen-boot fail blocked in 
56366-bisect
 test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm 6 xen-boot fail blocked in 
56366-bisect
 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stop fail blocked in 56366-bisect
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop fail blocked in 56366-bisect
 test-amd64-i386-xl-qemut-win7-amd64 16 guest-stop fail blocked in 56366-bisect
 test-amd64-amd64-xl-qemut-debianhvm-amd64-xsm 6 xen-boot fail blocked in 
56366-bisect
 test-amd64-amd64-xl-qemuu-ovmf-amd64  6 xen-bootfail like 53709-bisect

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-xsm   9 debian-install   fail   never pass
 test-amd64-amd64-xl-pvh-amd   9 debian-install   fail   never pass
 test-amd64-i386-xl-xsm9 debian-install   fail   never pass
 test-amd64-amd64-xl-credit2   9 debian-install   fail   never pass
 test-amd64-amd64-libvirt-xsm  9 debian-install   fail   never pass
 test-amd64-amd64-xl-pvh-intel  9 debian-install   fail  never pass
 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop  fail never pass

version targeted for testing:
 linux56b48fcda5076d4070ab00df32ff5ff834e0be86
baseline version:
 linuxbb4a05a0400ed6d2f1e13d1f82f289ff74300a70


370 people touched revisions under test,
not listing them all


jobs:
 build-amd64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-i386-pvops pass
 build-amd64-rumpuserxen  pass
 build-i386-rumpuserxen   pass
 test-amd64-amd64-xl  fail
 test-amd64-i386-xl

Re: [Xen-devel] [PATCH 21/27] tools/libxc+libxl+xl: Save v2 streams

On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote:
 This is a complicated set of changes which must be done together for
 bisectability.
 
  * libxl-save-helper is updated to unconditionally use libxc migration v2.
  * libxl compatibility workarounds in libxc are disabled for save operations.
  * libxl__stream_write_start() is logically spliced into the event location
where libxl__xc_domain_save() used to reside.
  * xl is updated to indicate that the stream is now v2
 
 Signed-off-by: Andrew Cooper andrew.coop...@citrix.com
 CC: Ian Campbell ian.campb...@citrix.com
 CC: Ian Jackson ian.jack...@eu.citrix.com
 CC: Wei Liu wei.l...@citrix.com
 
 ---
 RFC: What kind of ABI/API indication is appropriate here.  A LIBXL_HAVE*
 isn't apppropriate.

Whether it has HAVE in the name or not I think some sort of #define,
or one each for SAVE, RESTORE and perhaps even LEGACY would be
appropriate. We should arrange that we can remove the LEGACY one in the
future without causing applications to think we've reverted to Xen 4.5
era.



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 16/27] tools/libxl: Infrastructure for reading a libxl migration v2 stream

On 16/06/15 15:31, Ian Campbell wrote:
 On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote:
 From: Ross Lagerwall ross.lagerw...@citrix.com

 Signed-off-by: Ross Lagerwall ross.lagerw...@citrix.com
 Signed-off-by: Andrew Cooper andrew.coop...@citrix.com
 CC: Ian Campbell ian.campb...@citrix.com
 CC: Ian Jackson ian.jack...@eu.citrix.com
 CC: Wei Liu wei.l...@citrix.com
 Overall looks good, I've got some comments below and I think it almost
 certainly wants eyes from Ian who knows more about the dc infra etc.

 +void libxl__stream_read_start(libxl__egc *egc,
 +  libxl__stream_read_state *stream)
 +{
 +libxl__datacopier_state *dc = stream-dc;
 +int ret = 0;
 +
 +/* State initialisation. */
 +assert(!stream-running);
 +
 +memset(dc, 0, sizeof(*dc));
 libxl__datacopier_init, please

That call is made by libxl__datacopier_start() each and every time, and
unlike here, is matched with an equivalent _kill() call.


 +dc-ao = stream-ao;
 +dc-readfd = stream-fd;
 +dc-writefd = -1;
 +
 +/* Start reading the stream header. */
 +dc-readwhat = stream header;
 +dc-readbuf = stream-hdr;
 +stream-expected_len = dc-bytes_to_read = sizeof(stream-hdr);
 +dc-used = 0;
 +dc-callback = stream_header_done;
 This pattern of resetting and reinitialising the dc occurs in multiple
 places, I think a helper would be in order, some sort of
 stream_next_record_init or something perhaps?

The only feasible helper would have to take everything as parameters; 
there is insufficient similarity between all users.

I dunno whether that would be harder to read...


 +void libxl__stream_read_abort(libxl__egc *egc,
 +  libxl__stream_read_state *stream, int rc)
 +{
 +stream_failed(egc, stream, rc);
 +}
 +
 +static void stream_success(libxl__egc *egc, libxl__stream_read_state 
 *stream)
 +{
 +stream-rc = 0;
 +stream-running = false;
 +
 +stream_done(egc, stream);
 Push the running = false into stream_done and flip the assert there?
 Logically the stream is still running until it is done, so having done
 assert it isn't running seems counter-intuitive.

This is more for piece of mind.  stream_done() my strictly only ever be
called once, hence its assert.


 +static void stream_done(libxl__egc *egc,
 +libxl__stream_read_state *stream)
 +{
 +libxl__domain_create_state *dcs = CONTAINER_OF(stream, *dcs, srs);
 +
 +assert(!stream-running);
 +
 +stream-completion_callback(egc, dcs, stream-rc);
 +}
 +
 +static void stream_header_done(libxl__egc *egc,
 +   libxl__datacopier_state *dc,
 +   int onwrite, int errnoval)
 +{
 +libxl__stream_read_state *stream = CONTAINER_OF(dc, *stream, dc);
 +libxl_sr_hdr *hdr = stream-hdr;
 +STATE_AO_GC(dc-ao);
 +int ret = 0;
 +
 +if (onwrite || dc-used != stream-expected_len) {
 +ret = ERROR_FAIL;
 +LOG(ERROR, write %d, err %d, expected %zu, got %zu,
 +onwrite, errnoval, stream-expected_len, dc-used);
 +goto err;
 +}
 I think you need to check errnoval == 0 in the !onwrite case, otherwise
 you may miss a read error?

dc-used != stream-expected_len covers all possible read errors, in
the something went wrong kind of way.


 Also it looks like onwrite can be -1, which is a separate error case.

 +
 +static void record_header_done(libxl__egc *egc,
 +   libxl__datacopier_state *dc,
 +   int onwrite, int errnoval)
 +{
 +libxl__stream_read_state *stream = CONTAINER_OF(dc, *stream, dc);
 +libxl_sr_rec_hdr *rec_hdr = stream-rec_hdr;
 +STATE_AO_GC(dc-ao);
 +int ret = 0;
 +
 +if (onwrite || dc-used != stream-expected_len) {
 +ret = ERROR_FAIL;
 +LOG(ERROR, write %d, err %d, expected %zu, got %zu,
 +onwrite, errnoval, stream-expected_len, dc-used);
 +goto err;
 +}
 Same comments wrt the arguments as the previous one.

 Maybe a common helper to check (and log) the status at the head of each
 callback? So you can effectively do if (!everything_ok(stream, dc) goto
 err?

I will see what I can do.


 +assert(!ret);
 +if (rec_hdr-length) {
 +free(stream-rec_body);
 +stream-rec_body = NULL;
 reset length too?

 +static void read_emulator_body(libxl__egc *egc,
 +   libxl__stream_read_state *stream)
 +{
 +libxl__domain_create_state *dcs = CONTAINER_OF(stream, *dcs, srs);
 +libxl__datacopier_state *dc = stream-dc;
 +libxl_sr_rec_hdr *rec_hdr = stream-rec_hdr;
 +libxl_sr_emulator_hdr *emu_hdr = stream-rec_body;
 +STATE_AO_GC(stream-ao);
 +char path[256];
 +int ret = 0;
 +
 +sprintf(path, XC_DEVICE_MODEL_RESTORE_FILE.%u, dcs-guest_domid);
 +
 +dc-readwhat = save/migration stream;
 +dc-copywhat = emulator context;
 +dc-writewhat = qemu save file;
 +

Re: [Xen-devel] [PATCH 2/6] xen/MSI-X: drive maskall and enable bits through hypercalls

On Tue, 16 Jun 2015, Jan Beulich wrote:
  On 16.06.15 at 16:03, stefano.stabell...@eu.citrix.com wrote:
  On Fri, 5 Jun 2015, Jan Beulich wrote:
  +} else if (s-msix-enabled) {
  +if (!(value  PCI_MSIX_FLAGS_ENABLE)) {
  +xen_pt_msix_disable(s);
  +s-msix-enabled = false;
  +} else if (!s-msix-maskall) {
  
  Why are you changing the state of s-msix-maskall here?
  This is the value  PCI_MSIX_FLAGS_ENABLE case, nothing to do with
  maskall, right?
 
 We're at an else if inside an else if here. The only case left
 after the if() still seen above is that value has
 PCI_MSIX_FLAGS_MASKALL set.

You are right.

Maybe just add

/* (value  PCI_MSIX_FLAGS_MASKALL) at this point */


  +s-msix-maskall = true;
  +xen_pt_msix_maskall(s, true);
  +}
   }
   
  -debug_msix_enabled_old = s-msix-enabled;
  -s-msix-enabled = !!(*val  PCI_MSIX_FLAGS_ENABLE);
   if (s-msix-enabled != debug_msix_enabled_old) {
   XEN_PT_LOG(s-dev, %s MSI-X\n,
  s-msix-enabled ? enable : disable);
   }
   
  +xen_host_pci_get_word(s-real_device, s-msix-ctrl_offset, 
  dev_value);
  
  I have to say that I don't like the asymmetry between reading and
  writing PCI config registers. If writes go via hypercalls, reads should
  go via hypercalls too.
 
 We're not doing any cfg register write via hypercalls (not here,
 and not elsewhere). What is being replaced by the patch are
 write to two bits which happen to live in PCI config space. Plus,
 reading directly, and doing writes via hypercall only when really
 needed would still be the right thing from a performance pov.

OK. It would be nice to document somewhere that QEMU is not supposed to
touch those two bits directly, but I wouldn't know where. Maybe a new
doc under docs/misc?


  --- a/qemu/upstream/hw/xen/xen_pt_msi.c
  +++ b/qemu/upstream/hw/xen/xen_pt_msi.c
  @@ -301,8 +301,11 @@ static int msix_set_enable(XenPCIPassthr
   return -1;
   }
   
  -return msi_msix_enable(s, s-msix-ctrl_offset, PCI_MSIX_FLAGS_ENABLE,
  -   enabled);
  
  Would it make sense to remove msi_msix_enable completely to avoid any
  further mistakes?
 
 Perhaps, yes. I think I actually had suggested so quite a while back.
 But I don't see myself wasting much more time on this, ehm, code.

Isn't it just a matter of removing msi_msix_enable?

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [Draft F] Xen on ARM vITS Handling

On Tue, 2015-06-16 at 20:20 +0530, Vijay Kilari wrote:
 On Thu, Jun 11, 2015 at 3:10 PM, Ian Campbell ian.campb...@citrix.com wrote:
  Draft F follows. Also at:
  http://xenbits.xen.org/people/ianc/vits/draftF.{pdf,html}
 
 
  ## Per-domain `struct pending_irq` for `vLPI`s
 
  Internally Xen uses a `struct pending_irq` to track the status of any
  pending virtual IRQ, including a virtual LPI.
 
  Upon domain creation an array of such `struct pending_irq`'s will be
  allocated to cover the range `8192..nr_lpis` (for the number of LPIs
  which the guest is configured with) and a pointer this array will be
  stored in the `struct domain`. The function `irq_to_pending` will be
  modified to lookup interupts in the LPI range in this array.
 .
 
 nr_lpis can be large if more devices are assigned to domain.
 As I was suggesting on #xenarm chat, is it ok to use RB-tree instead of array?
 
 what should be value for nr_lpis?

It should be user configurable and default to the sum of the number of
events on all devices at start of day.

I think this removes the need for it to be an R-B tree, an array is
tolerable here.

Adding an R-B tree not only has a memory overhead, but it then needs
more complex management when inserting, searching, etc.

Ian.


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [DESIGN] Feature Levelling improvements

 On 16.06.15 at 12:50, andrew.coop...@citrix.com wrote:
 How XenServer currently does levelling
 ==
 
 The _Heterogeneous Pool Levelling_ support in XenServer appears to
 predate the
 libxc CPUID policy API, so does not currently use it.  The toolstack has a
 table of CPU model numbers identifying whether levelling is supported.  It
 then uses native `CPUID` instructions to look at the first four feature
 masks,
 and identifies the subset of features across the pool.
 `cpuid_mask_{,extd_}{ecx,edx}` is then set on Xen's command line for
 each host
 in the pool, and all hosts rebooted.
 
 This has several limitations:
 
 * Xen and dom0 have a reduced feature set despite not needing to migrate

I don't think Xen is affected by this, as it reads the CPUID bits
before setting the masks (there are a few cpuid() invocations
in random code, but I don't think these access maskable ones).

 Notes and observations
 ==
 
 Experimentally, the masking MSRs can be context switched.  There is no
 need to
 force all PV guests to the same level, and no need to prevent dom0 or
 Xen from
 using certain features.  Context switching the masking MSRs will however
 incur
 an overhead, and should be avoided where possible.
 
 The toolstack needs to know how much control Xen has over VM features. 
 In the
 case that there are insufficient masking MSRs, and no faulting support is
 present, a PV VM can still potentially be made safe to migrate by explicitly
 disabling features on the kernel command line.

That wouldn't help with user mode code, would it?

 VCPU context switch
 ---
 
 Xen shall be updated to lazily context switch all available masking
 MSRs.  It
 is noted that this shall incur a performance overhead if restricted
 featuresets are assigned to PV guests, and _CPUID Faulting_ is not
 available.
 
 It shall be the responsibility of the host administrator to avoid creating
 such a scenario, if the performance overhead is a concern.

Not sure how feasible this is: Even if you run all PV guests at equal
feature levels, context switching between PV and non-PV guests
would still incur overhead (unless you mean to run HVM/PVH ones
with whatever masking is currently in place). Plus this still wouldn't
deal with masks in place when Xen itself wants to look at any of the
maskable ones, unless you intend to audit code to make sure no
such uses exist (which - as per above - I suppose/hope to be the
case).

 Future work
 ===
 
 The above is a minimum quantity of work to support feature levelling, but
 further problems exist.  They are acknowledged as being issues, but are
 not in
 scope for fixing as part of feature levelling.
 
 * Xen has no notion of per-cpu and per-package data in the cpuid policy.  In
   particular, this causes issues for VMs attempting to detect topology,
 which
   find inconsistent/incorrect cache information.
 
 * In the case that `domain_cpuid()` can't locate a leaf in the topology, it
   will fall back to issuing a plain `CPUID` instruction.  This breaks VM
   encapsulation, as a VM which has migrated can observe differences which
   should be hidden.

I think this is actually something that (a) needs addressing not too
far in the future and (b) reminds me that I didn't see any talk here
regarding black vs white listing of features not explicitly known to
Xen or the tool stack.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] Is: graphics corruption with 'xen: Support Xen pv-domains using PAT. Was:Re: [BUG] Characters on the screen are broken on Linux = 3.19 with VT-d enabled

2015-06-16 Thread 藍挺瑋

於 二，2015-06-16 於 11:06 +0200，Juergen Gross 提到：
 On 06/16/2015 10:55 AM, Ting-Wei Lan wrote:
  Juergen Gross 於 西元2015年06月16日 12:30 寫道:
   On 06/15/2015 09:03 PM, Ting-Wei Lan wrote:
於 一，2015-06-15 於 14:55 -0400，Konrad Rzeszutek Wilk 提到：
 On Sat, Jun 13, 2015 at 12:43:14AM +0800, Ting-Wei Lan wrote:
  When using Linux = 3.19 as the dom0 kernel, characters on 
  the
  screen become
  broken after the graphic driver is loaded. The commit that 
  breaks
  it is
  (found by git bisect):
  
  
  https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux
  .git/com
  mit/?id=47591df
 
 Lets CC Juergen
  
  
  Screenshot when the system run in single user mode:
  
 https://bugs.freedesktop.org/attachment.cgi?id=115079
   
   Are those messages to be expected:
   
   (XEN) Scrubbing Free RAM on 1 nodes using 2 CPUs
   (XEN) [VT-D]DMAR:[DMA Write] Request device [:00:02.0] fault 
   addr
   b61eef000, iommu reg = 82c000203000
   (XEN) [VT-D]DMAR: reason 05 - PTE Write access is not set
   (XEN)
   .
   .done.
   
   I'm not familiar with VT-D internals, but seeing these messages 
   for the
   video device during RAM scrubbing makes wonder if everything is 
   correct
   regarding the VT-D and memory setup...
 ...
  I still remember that there was a similar problem found two 
  years
  ago on the
  same hardware with similar broken screen output and it also 
  crashed
  after
  Xorg was started, but I cannot confirm that they are the 
  same
  problems. I
  don't know whether error messages are simliar.
  
  The old problem happens on Linux 3.7 ~ 3.10 with VT-d 
  enabled. It
  only
  happened when not using Xen, so I added 'intel_iommu=off' 
  to Linux
  boot
  arguments to workaround it.
   
   Hmm, do you see any chance in finding the commit which made it 
   working
   again? Perhaps there was some workaround for this hardware which 
   is
   missing in Xen now...
  
  After some tests, I found the information I provided before was
  incorrect. It seems the problem happens on all Linux = 3.7, 
  including
  Linux 4.0.5, so the old problem was never fixed. Here are some 
  'dmesg |
  grep -i iommu' outputs.
 
 So a Xen-specific error is rather improbable, correct?

But Linux provides 'intel_iommu=igfx_off' to workaround the problem.
Does Xen provide similar things?

 
 I'd continue with sending the new information to the Intel graphics
 team.

I am going to reopen DRM/Intel bug 90037 on freedesktop bugzilla.

 
 
 Juergen

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 20/27] tools/libxl: Infrastructure for writing a v2 stream

On 16/06/15 15:57, Ian Campbell wrote:
 On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote:
 From: Ross Lagerwall ross.lagerw...@citrix.com

 Signed-off-by: Ross Lagerwall ross.lagerw...@citrix.com
 Signed-off-by: Andrew Cooper andrew.coop...@citrix.com
 CC: Ian Campbell ian.campb...@citrix.com
 CC: Ian Jackson ian.jack...@eu.citrix.com
 CC: Wei Liu wei.l...@citrix.com
 ---
  tools/libxl/Makefile |2 +-
  tools/libxl/libxl_internal.h |   33 +++
  tools/libxl/libxl_stream_write.c |  536 
 ++
  3 files changed, 570 insertions(+), 1 deletion(-)
  create mode 100644 tools/libxl/libxl_stream_write.c

 diff --git a/tools/libxl/Makefile b/tools/libxl/Makefile
 index ca0ae3e..63e32f7 100644
 --- a/tools/libxl/Makefile
 +++ b/tools/libxl/Makefile
 @@ -94,7 +94,7 @@ LIBXL_OBJS = flexarray.o libxl.o libxl_create.o libxl_dm.o 
 libxl_pci.o \
  libxl_dom.o libxl_exec.o libxl_xshelp.o libxl_device.o \
  libxl_internal.o libxl_utils.o libxl_uuid.o \
  libxl_json.o libxl_aoutils.o libxl_numa.o libxl_vnuma.o 
 \
 -libxl_stream_read.o \
 +libxl_stream_read.o libxl_stream_write.o \
  libxl_save_callout.o _libxl_save_msgs_callout.o \
  libxl_convert_callout.o \
  libxl_qmp.o libxl_event.o libxl_fork.o $(LIBXL_OBJS-y)
 diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
 index 5482950..82cd792 100644
 --- a/tools/libxl/libxl_internal.h
 +++ b/tools/libxl/libxl_internal.h
 @@ -2868,6 +2868,38 @@ typedef void libxl__domain_suspend_cb(libxl__egc*,
  typedef void libxl__save_device_model_cb(libxl__egc*,
   libxl__domain_suspend_state*, int 
 rc);
  
 +/* State for writing a libxl migration v2 stream */
 +typedef struct libxl__stream_write_state libxl__stream_write_state;
 +
 +struct libxl__stream_write_state {
 +/* filled by the user */
 +libxl__ao *ao;
 +int fd;
 +uint32_t domid;
 +void (*completion_callback)(libxl__egc *egc,
 +libxl__domain_suspend_state *dss,
 +int rc);
 +/* Private */
 +int rc;
 +int joined_rc;
 +size_t padding;
 +bool running;
 +libxl__datacopier_state dc;
 +};
 +
 +_hidden void libxl__stream_write_start(libxl__egc *egc,
 +   libxl__stream_write_state *stream);
 +
 +_hidden void libxl__stream_write_abort(libxl__egc *egc,
 +   libxl__stream_write_state *stream,
 +   int rc);
 +
 +static inline bool libxl__stream_write_inuse(
 +const libxl__stream_write_state *stream)
 +{
 +return stream-running;
 +}
 +
  typedef struct libxl__logdirty_switch {
  const char *cmd;
  const char *cmd_path;
 @@ -2907,6 +2939,7 @@ struct libxl__domain_suspend_state {
  /* private for libxl__domain_save_device_model */
  libxl__save_device_model_cb *save_dm_callback;
  libxl__datacopier_state save_dm_datacopier;
 +libxl__stream_write_state sws;
  };
  

 diff --git a/tools/libxl/libxl_stream_write.c 
 b/tools/libxl/libxl_stream_write.c
 new file mode 100644
 index 000..856d72e
 --- /dev/null
 +++ b/tools/libxl/libxl_stream_write.c
 @@ -0,0 +1,536 @@
 +/*
 + * Copyright (C) 2015  Citrix Ltd.
 + *
 + * This program is free software; you can redistribute it and/or modify
 + * it under the terms of the GNU Lesser General Public License as published
 + * by the Free Software Foundation; version 2.1 only. with the special
 + * exception on linking described in file LICENSE.
 + *
 + * This program is distributed in the hope that it will be useful,
 + * but WITHOUT ANY WARRANTY; without even the implied warranty of
 + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 + * GNU Lesser General Public License for more details.
 + */
 +
 +#include libxl_osdeps.h /* must come before any other headers */
 +
 +#include libxl_internal.h
 +
 +/*
 + * Infrastructure for writing a domain to a libxl migration v2 stream.
 + *
 + * Entry points from outside:
 + *  - libxl__stream_write_start()
 + * - Start writing a stream from the start.
 + *
 + * In normal operation, there are two tasks running at once; this stream
 + * processing, and the the libxl-save-helper.  check_stream_finished() is 
 used
 the the.

 + * to join all the tasks in both success and error cases.
 + *
 + * Nomenclature for event callbacks:
 + *  - $FOO_done(): Completion callback for $FOO
 + *  - write_$FOO(): Set up writing a $FOO
 Set up or actually write?

Set up the dc to write $FOO

The write doesn't actually happen until we get the dc callback.


 +
 +void libxl__stream_write_start(libxl__egc *egc,
 +   libxl__stream_write_state *stream)
 +{
 +libxl__datacopier_state *dc = stream-dc;
 +

Re: [Xen-devel] [Patch V3 2/3] usb: Introduce Xen pvUSB frontend (xen hcd)

On 16/06/15 15:56, Juergen Gross wrote:
 
 Hmm, I'm beginning to question the value of that file. May be I should
 just throw it away.
 
 Thanks for the quick feedback.

You could move it to debugfs, but if it's not useful better to get rid
of it.

David

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH V4 3/7] libxl: add pvusb API [and 1 more messages]

On Tue, Jun 16, 2015 at 4:59 PM, Ian Jackson ian.jack...@eu.citrix.com wrote:
George Dunlap writes (Re: [Xen-devel] [PATCH V4 3/7] libxl: add pvusb API
[and 1 more messages]):
Yes. The whole point of paths like this is that they are stable if
the physical topology doesn't change. So on my netbook

/dev/disk/by-path/pci-:00:1d.7-usb-0:1:1.0-scsi-0:0:0:0-part1

always refers to the 1st MBR partition on logical device 0 on the USB
storage device plugged into the USB port physically on the front left
of the computer.

That would be great if it were true, but I'm not convinced that the
above path doesn't include a globally-enumerated bus number, which might
change across reboots if it's enumerated in a different order.

It may be that the format above is *not* based on the sysfs path of the
device; that the '0' immediately after the USB means the first (and
perhaps only) controller at this PCI address. In which case, yes,
having a string like this might be a unique identifier for a particular
port that would be stable across reboots. That needs some investigation.

That does seem to be the case:

What seems to be the case -- that it contains the global bus, or not?

Looking at the output of udevadm monitor --property for sdc1 (on
another plug):

DEVPATH=/devices/pci:00/:00:1d.7/usb1/1-1/1-1:1.0/host22/target22:0:0/22:0:0:0/block/sdc/sdc1
ID_PATH=pci-:00:1d.7-usb-0:1:1.0-scsi-0:0:0:0
ID_PATH_TAG=pci-_00_1d_7-usb-0_1_1_0-scsi-0_0_0_0

I don't know where that ID_PATH comes from.

It looks like that's constructed in udev:

http://cgit.freedesktop.org/systemd/systemd/tree/src/udev/udev-builtin-path_id.c

See handle_usb() (line 542) in particular.

If I'm reading it right, what it basically does it take the
[bus]-[port], and replace it with usb-0:[port]. (IOW, the '0' is
hard-coded.)

Also, if I'm reading it right, this won't work properly for Juergen's
system that has two USB devices at the same pci address -- they'll
both end up resolving to [pciaddr]-usb-0:[whatever].

Juergen -- can you give this a try? Run udevadm info on the two USB
busses that share the same PCI slot, and see if the ID_PATH is the
same for both?

-George

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 4/6] xen/pass-through: correctly deal with RW1C bits

 On 16.06.15 at 16:19, stefano.stabell...@eu.citrix.com wrote:
 On Fri, 5 Jun 2015, Jan Beulich wrote:
 @@ -1016,11 +1002,12 @@ static XenPTRegInfo xen_pt_emu_reg_pm[] 
  .size   = 2,
  .init_val   = 0x0008,
  .res_mask   = 0x00F0,
 -.ro_mask= 0xE10C,
 +.ro_mask= 0x610C,
 +.rw1c_mask  = 0x8000,
  .emu_mask   = 0x810B,
  .init   = xen_pt_common_reg_init,
  .u.w.read   = xen_pt_word_reg_read,
 -.u.w.write  = xen_pt_pmcsr_reg_write,
 +.u.w.write  = xen_pt_word_reg_write,
  },
  {
  .size = 0,
 
 I can see that the code change doesn't cause a change in behaviour for
 PCI_PM_CTRL, but it does for PCI_STATUS, PCI_EXP_DEVSTA and
 PCI_EXP_LNKSTA. Please explain why in the commit message.

I'm not sure what you're after in a patch titled correctly deal with
RW1C bits.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

[Xen-devel] [Patch V3 0/3] xen, usb: support pvUSB frontend driver

This series adds XEN guest pvUSB support. With pvUSB it is possible to
use physical USB devices from a XEN domain.

The support consists of a frontend in form of a virtual hcd driver in
the unprivileged domU passing I/O-requests to the backend in a driver
domain (usually Dom0). The backend is not part of this patch series,
as it will be supported via qemu.

The code is taken (and adapted) from the original pvUSB implementation
done for Linux 2.6 in 2008 by Fujitsu.

Normal operation of USB devices by adding and removing them dynamically
to/from a domain has been tested using various USB devices (USB 1.1,
2.0 and 3.0). The pvUSB backend for these tests was a SUSE SLES Dom0
with a kernel based backend driver.

Changes in V3:
- move frontend to drivers/usb/host and rename it to xen-hcd.
- changed name prefixes in patch 1 to xenusb as requested by Greg
- use __un types rather than uintn_t as requested by Greg

Changes in V2:
- removed backend, as it can be implemented in user land
- added some access macros and definitions to the pvUSB interface
  description to make it independant from linux kernel USB internals
- adapted frontend to newer kernel version and use new pvUSB
  interface macros
- set port status in one chunk as suggested by Oliver Neukum

Juergen Gross (3):
  usb: Add Xen pvUSB protocol description
  usb: Introduce Xen pvUSB frontend (xen hcd)
  xen: add Xen pvUSB maintainer

 MAINTAINERS  |8 +
 drivers/usb/host/Kconfig |   11 +
 drivers/usb/host/Makefile|1 +
 drivers/usb/host/xen-hcd.c   | 1638 ++
 include/xen/interface/io/usbif.h |  252 ++
 5 files changed, 1910 insertions(+)
 create mode 100644 drivers/usb/host/xen-hcd.c
 create mode 100644 include/xen/interface/io/usbif.h

-- 
2.1.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v2 6/6] libxl/save: Refactor libxl__domain_suspend_state

On 16/06/15 14:26, Ian Jackson wrote:
 Yang Hongyang writes ([PATCH v2 6/6] libxl/save: Refactor 
 libxl__domain_suspend_state):
 Currently struct libxl__domain_suspend_state contains 2 type of states,
 one is save state, another is suspend state. This patch separate it out.
 The motivation of this is that COLO will need to do suspend/resume
 continuesly, we need a more common suspend state.
 Currently in libxl/libxc/etc.  suspend and save have referred to
 the same thing, but different terminology has been used at different
 layers.  Ie both suspend and save each refer to either or both of
 save to disk or suspend for live migrate, or to the relevant
 underlying mechanisms.

 So I'm not sure introducing a distinction between those two terms in
 libxl is really helpful.  If it is to be done there should be a clear
 explanation of what the difference is.

 On IRC you said:

 14:21 yanghy Diziet, currently, in libxl, suspend is used as 2
means, one is save(corrspond to libxc save), another is
suspend the guest(related to suspend callback)

 That's rather different, I think.  Or, at least, I'm not sure that I
 understand this distinction as you are making it.  The suspend
 callback is part of the implementation of what at higher layers we
 save/suspend/restore/migration.

 AIUI this callback is related to pausing the guest, or manipulating
 its VCPUs ?  Perhaps we should rename this callback ?  Maybe Andrew
 Cooper can suggest a name ?

 Forgive me if I'm confused and going off in the wrong direction...

The terminology used by libxc is more consistent in this area.

suspend refers to quiescing the VM, so pausing qemu, making a
remote_shutdown(SHUTDOWN_suspend) hypercall etc.

save refers to the actions involved in actually shuffling the state of
the VM, so xc_domain_save() etc.

I believe that these are useful distinctions to maintain.  libxl
currently uses suspend to encapsulate both.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [Draft F] Xen on ARM vITS Handling

2015-06-16 Thread Vijay Kilari

On Thu, Jun 11, 2015 at 3:10 PM, Ian Campbell ian.campb...@citrix.com wrote:
 Draft F follows. Also at:
 http://xenbits.xen.org/people/ianc/vits/draftF.{pdf,html}


 ## Per-domain `struct pending_irq` for `vLPI`s

 Internally Xen uses a `struct pending_irq` to track the status of any
 pending virtual IRQ, including a virtual LPI.

 Upon domain creation an array of such `struct pending_irq`'s will be
 allocated to cover the range `8192..nr_lpis` (for the number of LPIs
 which the guest is configured with) and a pointer this array will be
 stored in the `struct domain`. The function `irq_to_pending` will be
 modified to lookup interupts in the LPI range in this array.
.

nr_lpis can be large if more devices are assigned to domain.
As I was suggesting on #xenarm chat, is it ok to use RB-tree instead of array?

what should be value for nr_lpis?

Regards
Vijay

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 20/27] tools/libxl: Infrastructure for writing a v2 stream

On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote:
 From: Ross Lagerwall ross.lagerw...@citrix.com
 
 Signed-off-by: Ross Lagerwall ross.lagerw...@citrix.com
 Signed-off-by: Andrew Cooper andrew.coop...@citrix.com
 CC: Ian Campbell ian.campb...@citrix.com
 CC: Ian Jackson ian.jack...@eu.citrix.com
 CC: Wei Liu wei.l...@citrix.com
 ---
  tools/libxl/Makefile |2 +-
  tools/libxl/libxl_internal.h |   33 +++
  tools/libxl/libxl_stream_write.c |  536 
 ++
  3 files changed, 570 insertions(+), 1 deletion(-)
  create mode 100644 tools/libxl/libxl_stream_write.c
 
 diff --git a/tools/libxl/Makefile b/tools/libxl/Makefile
 index ca0ae3e..63e32f7 100644
 --- a/tools/libxl/Makefile
 +++ b/tools/libxl/Makefile
 @@ -94,7 +94,7 @@ LIBXL_OBJS = flexarray.o libxl.o libxl_create.o libxl_dm.o 
 libxl_pci.o \
   libxl_dom.o libxl_exec.o libxl_xshelp.o libxl_device.o \
   libxl_internal.o libxl_utils.o libxl_uuid.o \
   libxl_json.o libxl_aoutils.o libxl_numa.o libxl_vnuma.o 
 \
 - libxl_stream_read.o \
 + libxl_stream_read.o libxl_stream_write.o \
   libxl_save_callout.o _libxl_save_msgs_callout.o \
   libxl_convert_callout.o \
   libxl_qmp.o libxl_event.o libxl_fork.o $(LIBXL_OBJS-y)
 diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
 index 5482950..82cd792 100644
 --- a/tools/libxl/libxl_internal.h
 +++ b/tools/libxl/libxl_internal.h
 @@ -2868,6 +2868,38 @@ typedef void libxl__domain_suspend_cb(libxl__egc*,
  typedef void libxl__save_device_model_cb(libxl__egc*,
   libxl__domain_suspend_state*, int 
 rc);
  
 +/* State for writing a libxl migration v2 stream */
 +typedef struct libxl__stream_write_state libxl__stream_write_state;
 +
 +struct libxl__stream_write_state {
 +/* filled by the user */
 +libxl__ao *ao;
 +int fd;
 +uint32_t domid;
 +void (*completion_callback)(libxl__egc *egc,
 +libxl__domain_suspend_state *dss,
 +int rc);
 +/* Private */
 +int rc;
 +int joined_rc;
 +size_t padding;
 +bool running;
 +libxl__datacopier_state dc;
 +};
 +
 +_hidden void libxl__stream_write_start(libxl__egc *egc,
 +   libxl__stream_write_state *stream);
 +
 +_hidden void libxl__stream_write_abort(libxl__egc *egc,
 +   libxl__stream_write_state *stream,
 +   int rc);
 +
 +static inline bool libxl__stream_write_inuse(
 +const libxl__stream_write_state *stream)
 +{
 +return stream-running;
 +}
 +
  typedef struct libxl__logdirty_switch {
  const char *cmd;
  const char *cmd_path;
 @@ -2907,6 +2939,7 @@ struct libxl__domain_suspend_state {
  /* private for libxl__domain_save_device_model */
  libxl__save_device_model_cb *save_dm_callback;
  libxl__datacopier_state save_dm_datacopier;
 +libxl__stream_write_state sws;
  };
  
 
 diff --git a/tools/libxl/libxl_stream_write.c 
 b/tools/libxl/libxl_stream_write.c
 new file mode 100644
 index 000..856d72e
 --- /dev/null
 +++ b/tools/libxl/libxl_stream_write.c
 @@ -0,0 +1,536 @@
 +/*
 + * Copyright (C) 2015  Citrix Ltd.
 + *
 + * This program is free software; you can redistribute it and/or modify
 + * it under the terms of the GNU Lesser General Public License as published
 + * by the Free Software Foundation; version 2.1 only. with the special
 + * exception on linking described in file LICENSE.
 + *
 + * This program is distributed in the hope that it will be useful,
 + * but WITHOUT ANY WARRANTY; without even the implied warranty of
 + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 + * GNU Lesser General Public License for more details.
 + */
 +
 +#include libxl_osdeps.h /* must come before any other headers */
 +
 +#include libxl_internal.h
 +
 +/*
 + * Infrastructure for writing a domain to a libxl migration v2 stream.
 + *
 + * Entry points from outside:
 + *  - libxl__stream_write_start()
 + * - Start writing a stream from the start.
 + *
 + * In normal operation, there are two tasks running at once; this stream
 + * processing, and the the libxl-save-helper.  check_stream_finished() is 
 used

the the.

 + * to join all the tasks in both success and error cases.
 + *
 + * Nomenclature for event callbacks:
 + *  - $FOO_done(): Completion callback for $FOO
 + *  - write_$FOO(): Set up writing a $FOO

Set up or actually write?

 +
 +void libxl__stream_write_start(libxl__egc *egc,
 +   libxl__stream_write_state *stream)
 +{
 +libxl__datacopier_state *dc = stream-dc;
 +STATE_AO_GC(stream-ao);
 +struct libxl_sr_hdr hdr = { 0 };
 +int ret = 0;
 +
 +assert(!stream-running);
 +stream-running

Re: [Xen-devel] [PATCH 22/27] docs/libxl: [RFC] Introduce CHECKPOINT_END to support migration v2 remus streams

On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote:
 In a remus senario, libxc will write a CHECKPOINT record, then hand ownership

scenario

 of the fd to libxl.  Libxl then writes any records required and finishes with
 a CHECKPOINT_END record, then hands ownership of the fd back to libxc.

Seems like a plausible scheme to me, if that's what the RFC was for.

 Signed-off-by: Andrew Cooper andrew.coop...@citrix.com
 CC: Ian Campbell ian.campb...@citrix.com
 CC: Ian Jackson ian.jack...@eu.citrix.com
 CC: Wei Liu wei.l...@citrix.com
 ---
  docs/specs/libxl-migration-stream.pandoc |   15 ++-
  tools/libxl/libxl_sr_stream_format.h |1 +
  tools/python/xen/migration/libxl.py  |   11 +++
  3 files changed, 26 insertions(+), 1 deletion(-)
 
 diff --git a/docs/specs/libxl-migration-stream.pandoc 
 b/docs/specs/libxl-migration-stream.pandoc
 index 7235317..d41932a 100644
 --- a/docs/specs/libxl-migration-stream.pandoc
 +++ b/docs/specs/libxl-migration-stream.pandoc
 @@ -119,7 +119,9 @@ type 0x: END
  
   0x0003: EMULATOR_CONTEXT
  
 - 0x0004 - 0x7FFF: Reserved for future _mandatory_
 + 0x0004: CHECKPOINT_END
 +
 + 0x0005 - 0x7FFF: Reserved for future _mandatory_
   records.
  
   0x8000 - 0x: Reserved for future _optional_
 @@ -203,3 +205,14 @@ indexIndex of this emulator for the domain, 
 if multiple
  
  emulator_ctx Emulator context blob.
  
 +
 +CHECKPOINT_END
 +--
 +
 +A checkpoint end record marks the end of a checkpoint in the image.
 +
 + 0 1 2 3 4 5 6 7 octet
 ++-+
 +
 +The end record contains no fields; its body_length is 0.
 +
 diff --git a/tools/libxl/libxl_sr_stream_format.h 
 b/tools/libxl/libxl_sr_stream_format.h
 index 487f9e2..5dfa55f 100644
 --- a/tools/libxl/libxl_sr_stream_format.h
 +++ b/tools/libxl/libxl_sr_stream_format.h
 @@ -35,6 +35,7 @@
  #define REC_TYPE_LIBXC_CONTEXT   0x0001U
  #define REC_TYPE_XENSTORE_DATA   0x0002U
  #define REC_TYPE_EMULATOR_CONTEXT0x0003U
 +#define REC_TYPE_CHECKPOINT_END  0x0004U
  
  typedef struct libxl_sr_emulator_hdr
  {
 diff --git a/tools/python/xen/migration/libxl.py 
 b/tools/python/xen/migration/libxl.py
 index 4e1f4f8..415502e 100644
 --- a/tools/python/xen/migration/libxl.py
 +++ b/tools/python/xen/migration/libxl.py
 @@ -36,12 +36,14 @@ REC_TYPE_end  = 0x
  REC_TYPE_libxc_context= 0x0001
  REC_TYPE_xenstore_data= 0x0002
  REC_TYPE_emulator_context = 0x0003
 +REC_TYPE_checkpoint_end   = 0x0004
  
  rec_type_to_str = {
  REC_TYPE_end  : End,
  REC_TYPE_libxc_context: Libxc context,
  REC_TYPE_xenstore_data: Xenstore data,
  REC_TYPE_emulator_context : Emulator context,
 +REC_TYPE_checkpoint_end   : Checkpoint end,
  }
  
  # emulator_context
 @@ -176,6 +178,13 @@ class VerifyLibxl(VerifyBase):
  self.info(  Index %d, type %s % (emu_idx, 
 emulator_id_to_str[emu_id]))
  
 
 +def verify_record_checkpoint_end(self, content):
 + Checkpoint end record 
 +
 +if len(content) != 0:
 +raise RecordError(Checkpoint end record with non-zero length)
 +
 +
  record_verifiers = {
  REC_TYPE_end:
  VerifyLibxl.verify_record_end,
 @@ -185,4 +194,6 @@ record_verifiers = {
  VerifyLibxl.verify_record_xenstore_data,
  REC_TYPE_emulator_context:
  VerifyLibxl.verify_record_emulator_context,
 +REC_TYPE_checkpoint_end:
 +VerifyLibxl.verify_record_checkpoint_end,
  }



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH V4 3/7] libxl: add pvusb API [and 1 more messages]

2015-06-16 Thread Sander Eikelenboom


Tuesday, June 16, 2015, 4:41:52 PM, you wrote:

 On 06/16/2015 02:37 PM, Ian Jackson wrote:
 George Dunlap writes (Re: [Xen-devel] [PATCH V4 3/7] libxl: add pvusb API):
 Remember that the path you gave in your previous e-mail isn't the path
 for the *usb device*, it's the path for the *block device*.  It
 contains a PCI address, but it looks like it also contains part of the
 USB topology.  Are you sure that's actually a stable interface, or
 does it just happen that on your hardware the discovery always happens
 in the same order?
 
 The block device is (in path terms) underneath the usb device,
 obviously.  Not all of that path is relevant to identifying the
 USB device.
 
 On my system /sys/bus/usb/devices/2-3.3 is a link to
 /sys/devices/pci:00/:00:1d.7/usb2/2-3/2-3.3/.  This contains
 the pci bus address, but it also contains the bus number, which we've
 just said may be unstable across reboots.
 
 You mean the 2 in `usb2' ?  I think that bus number is the bus number
 within the controller, not globally.

 On the contrary:

 $ ls -l usb*
 lrwxrwxrwx 1 root root 0 Jun 15 10:14 usb1 -
 ../../../devices/pci:00/:00:1a.7/usb1/
 lrwxrwxrwx 1 root root 0 Jun 15 10:14 usb2 -
 ../../../devices/pci:00/:00:1d.7/usb2/
 lrwxrwxrwx 1 root root 0 Jun 15 10:14 usb3 -
 ../../../devices/pci:00/:00:1a.0/usb3/
 lrwxrwxrwx 1 root root 0 Jun 15 10:14 usb4 -
 ../../../devices/pci:00/:00:1a.1/usb4/
 lrwxrwxrwx 1 root root 0 Jun 15 10:14 usb5 -
 ../../../devices/pci:00/:00:1a.2/usb5/
 lrwxrwxrwx 1 root root 0 Jun 15 10:14 usb6 -
 ../../../devices/pci:00/:00:1d.0/usb6/
 lrwxrwxrwx 1 root root 0 Jun 15 10:14 usb7 -
 ../../../devices/pci:00/:00:1d.1/usb7/
 lrwxrwxrwx 1 root root 0 Jun 15 10:14 usb8 -
 ../../../devices/pci:00/:00:1d.2/usb8/

 $ ls /sys//devices/pci:00/:00:1d.7/ | grep usb
 usb2/

 In other words, the global bus enumeration leaks its way into the device
 path; which means at very least the sysfs device path is potentially
 unstable across reboots even if you include the pci controller it's on.

 I suppose it might be possible to specify buspci,port -- the pci
 address of the root bus, and the topology from there.  In theory I
 guess that should be stable?
 
 Yes.  The whole point of paths like this is that they are stable if
 the physical topology doesn't change.  So on my netbook
 
   /dev/disk/by-path/pci-:00:1d.7-usb-0:1:1.0-scsi-0:0:0:0-part1
 
 always refers to the 1st MBR partition on logical device 0 on the USB
 storage device plugged into the USB port physically on the front left
 of the computer.

 That would be great if it were true, but I'm not convinced that the
 above path doesn't include a globally-enumerated bus number, which might
 change across reboots if it's enumerated in a different order.

 It may be that the format above is *not* based on the sysfs path of the
 device; that the '0' immediately after the USB means the first (and
 perhaps only) controller at this PCI address.  In which case, yes,
 having a string like this might be a unique identifier for a particular
 port that would be stable across reboots.  That needs some investigation.

 In any case, at the moment you're essentially inventing from whole
 cloth a new way of specifying USB devices that (as far as I know)
 isn't supported by any other program that uses USB.
 
 If you can't specify the device by hardware path, you can't specify it
 deterministically.
 
 And as you can see it _is_ supported by other programs that use USB.
 mount can use it!

 Mount doesn't know anything about USB devices; all it knows are block
 devices.  You might as well say that 'ls' knows about USB devices
 because it can read files on a filesystem that resides on a USB stick.

 I'm talking about programs that explicitly manipulate USB devices -- in
 particular, those that may pass them through to a guest or seize them,
 like libvirt, qemu, virtualbox, c.

 I'm not opposed to doing something new if it really is better.  But when
 you find that a dozen other projects before you have done things one
 way, you should at least take care before you decide to do something
 radically different, to make sure that it's because you actually have
 something better, and not because you've just missed something.

 I think the hardware path to the controller, at least, should be
 treated as an opaque OS-specific string.  It might have a different
 format on BSD.

 If we can make an actual path that's stable across reboots, that would
 certainly be a good thing.  But if it's not stable across reboots, it
 has no advantage over the bus-port[.port.port] interface.
  -George

Would a symlink to a sysfs or device node entry also be allowed ?

That way you can leverage the power of tools like udev or other device managers
and script the linkage based on a multitude of device properties 
(f.e. serial numbers, manufacturer id, device id, etc.).

Which can make it persistent across reboots or

Re: [Xen-devel] [PATCH 2/6] xen/MSI-X: drive maskall and enable bits through hypercalls