Re: [Intel-gfx] [Announcement] 2015-Q3 release of XenGT - a Mediated Graphics Passthrough Solution from Intel

2015-12-04 Thread Gerd Hoffmann
  Hi,

> btw some questions here:
> 
> for the non-gl and gl rendering paths in Qemu, are they already based on dma-buf?

> once we can export the guest framebuffer as a dma-buf, is there additional work
> required, or is it straightforward to integrate with SPICE?

Right now we are busy integrating dma-buf support into spice, which will
be used for the gl rendering path, for virtio-gpu.

For intel-vgpu the wireup inside qemu will be slightly different:  We'll
get a dma-buf handle from the igd driver, whereas virtio-gpu renders
into a texture, then exports that texture as dma-buf.

But in both cases we'll pass the dma-buf with the guest framebuffer
(and meta-data such as fourcc and size) to spice-server, which in turn
will pass the dma-buf on to spice-client for (local) display.  So we
have a common code path in spice for both virtio-gpu and intel-vgpu,
based on dma-bufs.  spice-server doesn't even need to know what kind of
graphics device the guest has, it'll just process the dma-bufs.
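
For illustration, the metadata travelling alongside the dma-buf fd on that
path could look roughly like this (a hypothetical struct, not the actual
spice API):

/* Hypothetical bundle handed from qemu to spice-server together with
 * the dma-buf file descriptor for the guest framebuffer. */
#include <stdint.h>

struct scanout_info {
    int      dmabuf_fd;  /* dma-buf handle for the guest framebuffer */
    uint32_t fourcc;     /* drm fourcc, e.g. DRM_FORMAT_XRGB8888     */
    uint32_t width;      /* in pixels */
    uint32_t height;     /* in pixels */
    uint32_t stride;     /* in bytes  */
};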

Longer-term we also plan to support video encoding for remote display,
again based on dma-bufs, by sending them to the gpu video encoder.


The non-gl rendering path needs to be figured out.

With virtio-gpu we'll simply turn off 3d support, so the guest will
fall back to software rendering; we'll get a classic DisplaySurface
and the vnc server can work with that.

That isn't going to fly with intel-vgpu though, so we need something
else.  Importing the dma-buf, then doing glReadPixels into a
DisplaySurface would work.  But as mentioned before I'd prefer a code
path which doesn't require opengl support in qemu, and one option for
that would be the special vfio region.  I've written up a quick draft
meanwhile:


diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
index 751b69f..91b928d 100644
--- a/include/uapi/linux/vfio.h
+++ b/include/uapi/linux/vfio.h
@@ -596,6 +596,28 @@ struct vfio_iommu_spapr_tce_remove {
 };
 #define VFIO_IOMMU_SPAPR_TCE_REMOVE	_IO(VFIO_TYPE, VFIO_BASE + 20)
 
+/*  Additional API for vGPU  */
+
+/*
+ * framebuffer meta data
+ * subregion located at the end of the framebuffer region
+ */
+struct vfio_framebuffer {
+   __u32 argsz;
+
+   /* out */
+   __u32 format;/* drm fourcc */
+   __u32 offset;/* relative to region start */
+   __u32 width; /* in pixels */
+   __u32 height;/* in pixels */
+   __u32 stride;/* in bytes  */
+
+   /* in+out */
+#define VFIO_FB_STATE_REQUEST_UPDATE   1  /* userspace requests update */
+#define VFIO_FB_STATE_UPDATE_COMPLETE  2  /* kernel signals completion */
+   __u32 state; /* VFIO_FB_STATE_ */
+};
+
 /* * */
 
 #endif /* _UAPIVFIO_H */
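
To sketch how qemu could drive such a region (a rough userspace illustration
only; the region offsets below are assumptions and would really be discovered
via VFIO_DEVICE_GET_REGION_INFO):

/* Rough sketch against the draft above; assumes the patched <linux/vfio.h>.
 * device_fd is the vfio device fd, region_start the framebuffer region's
 * offset within that fd, meta_offset the offset of the vfio_framebuffer
 * metadata subregion. */
#include <linux/vfio.h>
#include <sys/types.h>
#include <unistd.h>

static int vgpu_fb_snapshot(int device_fd, off_t region_start, off_t meta_offset,
                            struct vfio_framebuffer *fb,
                            void *dst, size_t dst_len)
{
    fb->argsz = sizeof(*fb);
    fb->state = VFIO_FB_STATE_REQUEST_UPDATE;
    if (pwrite(device_fd, fb, sizeof(*fb), meta_offset) != sizeof(*fb))
        return -1;

    /* Busy-poll for completion; a real implementation would rate-limit
     * this or use an eventfd-style notification instead. */
    do {
        if (pread(device_fd, fb, sizeof(*fb), meta_offset) != sizeof(*fb))
            return -1;
    } while (fb->state != VFIO_FB_STATE_UPDATE_COMPLETE);

    /* Copy the (already linear) framebuffer data out of the region. */
    size_t len = (size_t)fb->stride * fb->height;
    if (len > dst_len ||
        pread(device_fd, dst, len, region_start + fb->offset) != (ssize_t)len)
        return -1;
    return 0;
}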

cheers,
  Gerd



RE: [Intel-gfx] [Announcement] 2015-Q3 release of XenGT - a Mediated Graphics Passthrough Solution from Intel

2015-12-02 Thread Tian, Kevin
> From: Tian, Kevin
> Sent: Friday, November 20, 2015 4:36 PM
> 
> >
> > > > So, for non-opengl rendering qemu needs the guest framebuffer data so it
> > > > can feed it into the vnc server.  The vfio framebuffer region is meant
> > > > to support this use case.
> > >
> > > what's the format requirement on that framebuffer? If you are familiar
> > > with Intel Graphics, there's a so-called tiling feature applied on frame
> > > buffer so it can't be used as a raw input to vnc server. w/o opengl you
> > > need to do some conversion on the CPU first.
> >
> > Yes, that conversion needs to happen, qemu can't deal with tiled
> > graphics.  Anything which pixman can handle will work.  Preferred would
> > be PIXMAN_x8r8g8b8 (aka DRM_FORMAT_XRGB8888 on little endian host) which
> > is the format used by the vnc server (and other places in qemu)
> > internally.

Now the format is reported based on the guest setting. Some agent needs to
do the format conversion in user space.
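
Once that conversion has produced a linear XRGB8888 buffer, wrapping it for
qemu's pixman-based display code is cheap; roughly (a sketch, ignoring buffer
lifetime):

#include <pixman.h>
#include <stdint.h>

/* Wrap an already-linear XRGB8888 guest framebuffer (stride in bytes)
 * into a pixman image, which is what qemu's non-gl paths consume. */
static pixman_image_t *wrap_guest_fb(uint32_t *data, int width, int height,
                                     int stride)
{
    return pixman_image_create_bits(PIXMAN_x8r8g8b8, width, height,
                                    data, stride);
}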

> >
> > qemu can also use the opengl texture for the guest fb, then fetch the
> > data with glReadPixels().  Which will probably do exactly the same
> > conversion.  But it'll add an opengl dependency to the non-opengl
> > rendering path in qemu, would be nice if we can avoid that.
> >
> > While being at it:  When importing a dma-buf with a tiled framebuffer
> > into opengl (via eglCreateImageKHR + EGL_LINUX_DMA_BUF_EXT) I suspect we
> > have to pass in the tile size as attribute to make it work.  Is that
> > correct?
> >
> 
> I'd guess so, but need double confirm later when reaching that level of 
> detail.
> some homework on dma-buf is required first. :-)
> 

btw some questions here:

for the non-gl and gl rendering paths in Qemu, are they already based on dma-buf?

once we can export the guest framebuffer as a dma-buf, is there additional work
required, or is it straightforward to integrate with SPICE?

Thanks
Kevin

Re: [Intel-gfx] [Announcement] 2015-Q3 release of XenGT - a Mediated Graphics Passthrough Solution from Intel

2015-11-24 Thread Daniel Vetter
On Tue, Nov 24, 2015 at 03:12:31PM +0100, Gerd Hoffmann wrote:
>   Hi,
> 
> > But there's some work to add generic mmap support to dma-bufs, and for
> > really simple case (where we don't have a gl driver to handle the dma-buf
> > specially) for untiled framebuffers that would be all we need?
> 
> Not requiring gl is certainly a bonus; people might want to build qemu
> without opengl support to reduce the attack surface and/or package
> dependency chain.
> 
> And, yes, requirements for the non-gl rendering path are pretty low.
> qemu needs something it can mmap, and which it can ask pixman to handle.
> Preferred format is PIXMAN_x8r8g8b8 (qemu uses that internally in a lot
> of places so this avoids conversions).
> 
> Current plan is to have a special vfio region (not visible to the guest)
> where the framebuffer lives, with one or two pages at the end for
> metadata (format and size).  A status field is there too and will be used
> by qemu to request updates and by the kernel to signal update completion.
> Guess I should write that down as a vfio rfc patch ...
> 
> I don't think it makes sense to have fields to notify qemu about which
> framebuffer regions have been updated; I'd expect that with the
> full-screen compositing we have these days this information isn't
> available anyway.  Maybe a flag telling whether there have been updates
> or not, so qemu can skip update processing in case we have the
> screensaver showing a black screen all day long.

GL, wayland, X, EGL and soonish Android's surface flinger (hwc already has
it afaik) all track damage. There are plans to add the same to the atomic
kms api too. But if you do damage tracking you really don't want to
support (maybe allow for perf reasons if the guest is stupid) frontbuffer
rendering, which means you need buffer handles + damage, and not a static
region.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [Intel-gfx] [Announcement] 2015-Q3 release of XenGT - a Mediated Graphics Passthrough Solution from Intel

2015-11-24 Thread Gerd Hoffmann
  Hi,

> But there's some work to add generic mmap support to dma-bufs, and for
> really simple case (where we don't have a gl driver to handle the dma-buf
> specially) for untiled framebuffers that would be all we need?

Not requiring gl is certainly a bonus; people might want to build qemu
without opengl support to reduce the attack surface and/or package
dependency chain.

And, yes, requirements for the non-gl rendering path are pretty low.
qemu needs something it can mmap, and which it can ask pixman to handle.
Preferred format is PIXMAN_x8r8g8b8 (qemu uses that internally in a lot
of places so this avoids conversions).

Current plan is to have a special vfio region (not visible to the guest)
where the framebuffer lives, with one or two pages at the end for
metadata (format and size).  A status field is there too and will be used
by qemu to request updates and by the kernel to signal update completion.
Guess I should write that down as a vfio rfc patch ...

I don't think it makes sense to have fields to notify qemu about which
framebuffer regions have been updated; I'd expect that with the
full-screen compositing we have these days this information isn't
available anyway.  Maybe a flag telling whether there have been updates
or not, so qemu can skip update processing in case we have the
screensaver showing a black screen all day long.

cheers,
  Gerd




Re: [Intel-gfx] [Announcement] 2015-Q3 release of XenGT - a Mediated Graphics Passthrough Solution from Intel

2015-11-24 Thread Daniel Vetter
On Tue, Nov 24, 2015 at 01:38:55PM +0100, Gerd Hoffmann wrote:
>   Hi,
> 
> > > Yes, vGPU may have additional features, like a framebuffer area, that
> > > aren't present or optional for direct assignment.  Obviously we support
> > > direct assignment of GPUs for some vendors already without this feature.
> > 
> > For exposing framebuffers for spice/vnc I highly recommend against
> > anything that looks like a bar/fixed mmio range mapping. First this means
> > the kernel driver needs to internally fake remapping, which isn't fun.
> 
> Sure.  I don't think we should remap here.  More below.
> 
> > My recommendation is to build the actual memory access for underlying
> > framebuffers on top of dma-buf, so that it can be vacuumed up by e.g. the
> > host gpu driver again for rendering.
> 
> We want that too ;)
> 
> Some more background:
> 
> OpenGL support in qemu is still young and emerging, and we are actually
> building on dma-bufs here.  There are a bunch of different ways how
> guest display output is handled.  At the end of the day it boils down to
> only two fundamental cases though:
> 
>   (a) Where qemu doesn't need access to the guest framebuffer
>   - qemu directly renders via opengl (works today with virtio-gpu
> and will be in the qemu 2.5 release)
>   - qemu passes the dma-buf on to the spice client for local display
> (experimental code exists).
>   - qemu feeds the guest display into gpu-assisted video encoder
> to send a stream over the network (no code yet).
> 
>   (b) Where qemu must read the guest framebuffer.
>   - qemu's builtin vnc server.
>   - qemu writing screenshots to file.
>   - (non-opengl legacy code paths for local display, will
>  hopefully disappear long-term though ...)
> 
> So, the question is how to support (b) best.  Even with OpenGL support
> in qemu improving over time I don't expect this going away completely
> anytime soon.
> 
> I think it makes sense to have a special vfio region for that.  I don't
> think remapping makes sense there.  It doesn't need to be "live", it
> doesn't need to support high refresh rates.  Placing a copy of the guest
> framebuffer there on request (and converting from tiled to linear while
> being at it) is perfectly fine.  qemu has an adaptive update rate and
> will stop doing frequent update requests when the vnc client
> disconnects, so there will be nothing to do if nobody actually wants to
> see the guest display.
> 
> A possible alternative approach would be to import a dma-buf, then use
> glReadPixels().  I suspect that when doing the copy in the kernel the
> driver could just ask the gpu to blit the guest framebuffer.  I don't
> know the gfx hardware well enough to be sure though, comments are welcome.

Generally the kernel can't do gpu blts since the required massive state
setup is only in the userspace side of the GL driver stack. But
glReadPixels can do tricks for detiling, and if you use pixel buffer
objects or something similar it'll even be amortized reasonably.

But there's some work to add generic mmap support to dma-bufs, and for
really simple case (where we don't have a gl driver to handle the dma-buf
specially) for untiled framebuffers that would be all we need?
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [Intel-gfx] [Announcement] 2015-Q3 release of XenGT - a Mediated Graphics Passthrough Solution from Intel

2015-11-24 Thread Gerd Hoffmann
  Hi,

> > Yes, vGPU may have additional features, like a framebuffer area, that
> > aren't present or optional for direct assignment.  Obviously we support
> > direct assignment of GPUs for some vendors already without this feature.
> 
> For exposing framebuffers for spice/vnc I highly recommend against
> anything that looks like a bar/fixed mmio range mapping. First this means
> the kernel driver needs to internally fake remapping, which isn't fun.

Sure.  I don't think we should remap here.  More below.

> My recommendation is to build the actual memory access for underlying
> framebuffers on top of dma-buf, so that it can be vacuumed up by e.g. the
> host gpu driver again for rendering.

We want that too ;)

Some more background:

OpenGL support in qemu is still young and emerging, and we are actually
building on dma-bufs here.  There are a bunch of different ways how
guest display output is handled.  At the end of the day it boils down to
only two fundamental cases though:

  (a) Where qemu doesn't need access to the guest framebuffer
  - qemu directly renders via opengl (works today with virtio-gpu
and will be in the qemu 2.5 release)
  - qemu passes the dma-buf on to the spice client for local display
(experimental code exists).
  - qemu feeds the guest display into gpu-assisted video encoder
to send a stream over the network (no code yet).

  (b) Where qemu must read the guest framebuffer.
  - qemu's builtin vnc server.
  - qemu writing screenshots to file.
  - (non-opengl legacy code paths for local display, will
 hopefully disappear long-term though ...)

So, the question is how to support (b) best.  Even with OpenGL support
in qemu improving over time I don't expect this going away completely
anytime soon.

I think it makes sense to have a special vfio region for that.  I don't
think remapping makes sense there.  It doesn't need to be "live", it
doesn't need to support high refresh rates.  Placing a copy of the guest
framebuffer there on request (and converting from tiled to linear while
being at it) is perfectly fine.  qemu has an adaptive update rate and
will stop doing frequent update requests when the vnc client
disconnects, so there will be nothing to do if nobody actually wants to
see the guest display.

A possible alternative approach would be to import a dma-buf, then use
glReadPixels().  I suspect that when doing the copy in the kernel the
driver could just ask the gpu to blit the guest framebuffer.  I don't
know the gfx hardware well enough to be sure though, comments are welcome.
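
For the import-plus-readback variant, the qemu side could look roughly like
this (a sketch only; it assumes an EGLImage has already been created from the
dma-buf and bound to tex, and uses libepoxy for the GL entry points):

#include <epoxy/gl.h>

/* Read a texture-backed guest framebuffer into a linear buffer that
 * pixman/vnc can consume.  Note glReadPixels returns rows bottom-up,
 * so a real implementation would also flip the image vertically. */
static void read_back_fb(GLuint fbo, GLuint tex, int width, int height,
                         void *dst)
{
    glBindFramebuffer(GL_FRAMEBUFFER, fbo);
    glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0,
                           GL_TEXTURE_2D, tex, 0);
    glPixelStorei(GL_PACK_ALIGNMENT, 4);
    /* GL_BGRA + GL_UNSIGNED_BYTE matches PIXMAN_x8r8g8b8 on little endian. */
    glReadPixels(0, 0, width, height, GL_BGRA, GL_UNSIGNED_BYTE, dst);
    glBindFramebuffer(GL_FRAMEBUFFER, 0);
}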

cheers,
  Gerd


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Intel-gfx] [Announcement] 2015-Q3 release of XenGT - a Mediated Graphics Passthrough Solution from Intel

2015-11-24 Thread Chris Wilson
On Tue, Nov 24, 2015 at 12:19:18PM +0100, Daniel Vetter wrote:
> Downside: Tracking mapping changes on the guest side won't be any easier.
> This is mostly a problem for integrated gpus, since discrete ones usually
> require contiguous vram for scanout. I think saying "don't do that" is a
> valid option though, i.e. we're assuming that page mappings for an in-use
> scanout range never change on the guest side. That is true for at least
> all the current linux drivers.

Apart from the fact that we already suffer limitations of fixed mappings and
have patches that want to change the page mapping of active scanouts.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


Re: [Intel-gfx] [Announcement] 2015-Q3 release of XenGT - a Mediated Graphics Passthrough Solution from Intel

2015-11-24 Thread Daniel Vetter
On Thu, Nov 19, 2015 at 01:02:36PM -0700, Alex Williamson wrote:
> On Thu, 2015-11-19 at 04:06 +, Tian, Kevin wrote:
> > > From: Alex Williamson [mailto:alex.william...@redhat.com]
> > > Sent: Thursday, November 19, 2015 2:12 AM
> > > 
> > > [cc +qemu-devel, +paolo, +gerd]
> > > 
> > > Another area of extension is how to expose a framebuffer to QEMU for
> > > seamless integration into a SPICE/VNC channel.  For this I believe we
> > > could use a new region, much like we've done to expose VGA access
> > > through a vfio device file descriptor.  An area within this new
> > > framebuffer region could be directly mappable in QEMU while a
> > > non-mappable page, at a standard location with standardized format,
> > > provides a description of framebuffer and potentially even a
> > > communication channel to synchronize framebuffer captures.  This would
> > > be new code for QEMU, but something we could share among all vGPU
> > > implementations.
> > 
> > Now GVT-g already provides an interface to decode framebuffer information,
> > w/ an assumption that the framebuffer will be further composited into 
> > OpenGL APIs. So the format is defined according to OpenGL definition.
> > Does that meet SPICE requirement?
> > 
> > Another thing to be added. Framebuffers are frequently switched in
> > reality. So either Qemu needs to poll or a notification mechanism is 
> > required.
> > And since it's dynamic, having framebuffer page directly exposed in the
> > new region might be tricky. We can just expose framebuffer information
> > (including base, format, etc.) and let Qemu map it separately out of the VFIO
> > interface.
> 
> Sure, we'll need to work out that interface, but it's also possible that
> the framebuffer region is simply remapped to another area of the device
> (ie. multiple interfaces mapping the same thing) by the vfio device
> driver.  Whether it's easier to do that or make the framebuffer region
> reference another region is something we'll need to see.
> 
> > And... this works fine with vGPU model since software knows all the
> > detail about framebuffer. However in pass-through case, who do you expect
> > to provide that information? Is it OK to introduce vGPU specific APIs in
> > VFIO?
> 
> Yes, vGPU may have additional features, like a framebuffer area, that
> aren't present or optional for direct assignment.  Obviously we support
> direct assignment of GPUs for some vendors already without this feature.

For exposing framebuffers for spice/vnc I highly recommend against
anything that looks like a bar/fixed mmio range mapping. First this means
the kernel driver needs to internally fake remapping, which isn't fun.
Second we can't get at the memory in an easy fashion for hw-accelerated
compositing.

My recommendation is to build the actual memory access for underlying
framebuffers on top of dma-buf, so that it can be vacuumed up by e.g. the
host gpu driver again for rendering. For userspace the generic part would
simply be an invalidate-fb signal, with the new dma-buf supplied.

Upsides:
- You can composit stuff with the gpu.
- VRAM and other kinds of resources (even stuff not visible in pci bars)
  can be represented.

Downside: Tracking mapping changes on the guest side won't be any easier.
This is mostly a problem for integrated gpus, since discrete ones usually
require contiguous vram for scanout. I think saying "don't do that" is a
valid option though, i.e. we're assuming that page mappings for an in-use
scanout range never change on the guest side. That is true for at least
all the current linux drivers.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [Intel-gfx] [Announcement] 2015-Q3 release of XenGT - a Mediated Graphics Passthrough Solution from Intel

2015-11-22 Thread Jike Song

On 11/21/2015 01:25 AM, Alex Williamson wrote:

On Fri, 2015-11-20 at 08:10 +, Tian, Kevin wrote:


Here is a more concrete example:

KVMGT doesn't require IOMMU. All DMA targets are already replaced with
HPA thru shadow GTT. So DMA requests from GPU all contain HPAs.

When IOMMU is enabled, one simple approach is to have vGPU IOMMU
driver configure system IOMMU with identity mapping (HPA->HPA). We
can't use (GPA->HPA) since GPAs from multiple VMs are conflicting.

However, we still have host gfx driver running. When IOMMU is enabled,
dma_alloc_*** will return IOVA (drivers/iommu/iova.c) in host gfx driver,
which will have IOVA->HPA programmed to system IOMMU.

One IOMMU device entry can only translate one address space, so here
comes a conflict (HPA->HPA vs. IOVA->HPA). To solve this, vGPU IOMMU
driver needs to allocate IOVA from iova.c for each VM w/ vGPU assigned,
and then KVMGT will program IOVA in shadow GTT accordingly. It adds
one additional mapping layer (GPA->IOVA->HPA). In this way two
requirements can be unified together since only IOVA->HPA mapping
needs to be built.

So unlike existing type1 IOMMU driver which controls IOMMU alone, vGPU
IOMMU driver needs to cooperate with other agent (iova.c here) to
co-manage system IOMMU. This may not impact existing VFIO framework.
Just want to highlight additional work here when implementing the vGPU
IOMMU driver.


Right, so the existing i915 driver needs to use the DMA API and calls
like dma_map_page() to enable translations through the IOMMU.  With
dma_map_page(), the caller provides a page address (~HPA) and is
returned an IOVA.  So unfortunately you don't get to take the shortcut
of having an identity mapping through the IOMMU unless you want to
convert i915 entirely to using the IOMMU API, because we also can't have
the conflict that an HPA could overlap an IOVA for a previously mapped
page.

The double translation, once through the GPU MMU and once through the
system IOMMU is going to happen regardless of whether we can identity
map through the IOMMU.  The only solution to this would be for the GPU
to participate in ATS and provide pre-translated transactions from the
GPU.  All of this is internal to the i915 driver (or vfio extension of
that driver) and needs to be done regardless of what sort of interface
we're using to expose the vGPU to QEMU.  It just seems like VFIO
provides a convenient way of doing this since you'll have ready access
to the HVA-GPA mappings for the user.

I think the key points though are:

   * the VFIO type1 IOMMU stores GPA to HVA translations
   * get_user_pages() on the HVA will pin the page and give you a page
   * dma_map_page() receives that page, programs the system IOMMU and
     provides an IOVA
   * the GPU MMU can then be programmed with the GPA to IOVA translations
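
A compressed sketch of those steps for a single page (kernel-side, using
today's get_user_pages_fast()/dma_map_page() signatures; locking, batching
and error paths are trimmed, and the final shadow-GTT write is only indicated
by a comment):

#include <linux/dma-mapping.h>
#include <linux/mm.h>

/* hva comes from the type1 GPA->HVA lookup; dev is the physical GPU. */
static int vgpu_map_one_page(struct device *dev, unsigned long hva,
                             dma_addr_t *iova_out)
{
    struct page *page;

    /* Pin the backing page of the HVA. */
    if (get_user_pages_fast(hva, 1, FOLL_WRITE, &page) != 1)
        return -EFAULT;

    /* Program the system IOMMU (if present) and get an IOVA back. */
    *iova_out = dma_map_page(dev, page, 0, PAGE_SIZE, DMA_BIDIRECTIONAL);
    if (dma_mapping_error(dev, *iova_out)) {
        put_page(page);
        return -ENOMEM;
    }

    /* The caller would now write *iova_out into the shadow GTT entry
     * for the corresponding GMA, i.e. program the GPU MMU. */
    return 0;
}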


Thanks for such a nice example! I'll do my homework and get back to you
shortly :)



Thanks,
Alex



--
Thanks,
Jike


Re: [Intel-gfx] [Announcement] 2015-Q3 release of XenGT - a Mediated Graphics Passthrough Solution from Intel

2015-11-20 Thread Alex Williamson
On Fri, 2015-11-20 at 08:10 +, Tian, Kevin wrote:
> > From: Tian, Kevin
> > Sent: Friday, November 20, 2015 3:10 PM
> 
> > > > >
> > > > > The proposal is therefore that GPU vendors can expose vGPUs to
> > > > > userspace, and thus to QEMU, using the VFIO API.  For instance, vfio
> > > > > supports modular bus drivers and IOMMU drivers.  An intel-vfio-gvt-d
> > > > > module (or extension of i915) can register as a vfio bus driver, 
> > > > > create
> > > > > a struct device per vGPU, create an IOMMU group for that device, and
> > > > > register that device with the vfio-core.  Since we don't rely on the
> > > > > system IOMMU for GVT-d vGPU assignment, another vGPU vendor driver (or
> > > > > extension of the same module) can register a "type1" compliant IOMMU
> > > > > driver into vfio-core.  From the perspective of QEMU then, all of the
> > > > > existing vfio-pci code is re-used, QEMU remains largely unaware of any
> > > > > specifics of the vGPU being assigned, and the only necessary change so
> > > > > far is how QEMU traverses sysfs to find the device and thus the IOMMU
> > > > > group leading to the vfio group.
> > > >
> > > > GVT-g requires pinning guest memory and querying GPA->HPA information,
> > > > upon which shadow GTTs will be updated accordingly from (GMA->GPA)
> > > > to (GMA->HPA). So yes, here a dummy or simple "type1" compliant IOMMU
> > > > can be introduced just for this requirement.
> > > >
> > > > However there's one tricky point which I'm not sure whether overall
> > > > VFIO concept will be violated. GVT-g doesn't require system IOMMU
> > > > to function, however host system may enable system IOMMU just for
> > > > hardening purpose. This means two-level translations existing (GMA->
> > > > IOVA->HPA), so the dummy IOMMU driver has to request system IOMMU
> > > > driver to allocate IOVA for VMs and then setup IOVA->HPA mapping
> > > > in IOMMU page table. In this case, multiple VM's translations are
> > > > multiplexed in one IOMMU page table.
> > > >
> > > > We might need create some group/sub-group or parent/child concepts
> > > > among those IOMMUs for thorough permission control.
> > >
> > > My thought here is that this is all abstracted through the vGPU IOMMU
> > > and device vfio backends.  It's the GPU driver itself, or some vfio
> > > extension of that driver, mediating access to the device and deciding
> > > when to configure GPU MMU mappings.  That driver has access to the GPA
> > > to HVA translations thanks to the type1 compliant IOMMU it implements
> > > and can pin pages as needed to create GPA to HPA mappings.  That should
> > > give it all the pieces it needs to fully setup mappings for the vGPU.
> > > Whether or not there's a system IOMMU is simply an exercise for that
> > > driver.  It needs to do a DMA mapping operation through the system IOMMU
> > > the same for a vGPU as if it was doing it for itself, because they are
> > > in fact one and the same.  The GMA to IOVA mapping seems like an internal
> > > detail.  I assume the IOVA is some sort of GPA, and the GMA is managed
> > > through mediation of the device.
> > 
> > Sorry I'm not familiar with VFIO internal. My original worry is that system
> > IOMMU for GPU may be already claimed by another vfio driver (e.g. host 
> > kernel
> > wants to harden gfx driver from rest sub-systems, regardless of whether vGPU
> > is created or not). In that case vGPU IOMMU driver shouldn't manage system
> > IOMMU directly.
> > 
> > btw, curious today how VFIO coordinates with system IOMMU driver regarding
> > to whether a IOMMU is used to control device assignment, or used for kernel
> > hardening. Somehow two are conflicting since different address spaces are
> > concerned (GPA vs. IOVA)...
> > 
> 
> Here is a more concrete example:
> 
> KVMGT doesn't require IOMMU. All DMA targets are already replaced with 
> HPA thru shadow GTT. So DMA requests from GPU all contain HPAs.
> 
> When IOMMU is enabled, one simple approach is to have vGPU IOMMU
> driver configure system IOMMU with identity mapping (HPA->HPA). We 
> can't use (GPA->HPA) since GPAs from multiple VMs are conflicting. 
> 
> However, we still have host gfx driver running. When IOMMU is enabled, 
> dma_alloc_*** will return IOVA (drivers/iommu/iova.c) in host gfx driver,
> which will have IOVA->HPA programmed to system IOMMU.
> 
> One IOMMU device entry can only translate one address space, so here
> comes a conflict (HPA->HPA vs. IOVA->HPA). To solve this, vGPU IOMMU
> driver needs to allocate IOVA from iova.c for each VM w/ vGPU assigned,
> and then KVMGT will program IOVA in shadow GTT accordingly. It adds
> one additional mapping layer (GPA->IOVA->HPA). In this way two 
> requirements can be unified together since only IOVA->HPA mapping 
> needs to be built.
> 
> So unlike existing type1 IOMMU driver which controls IOMMU alone, vGPU 
> IOMMU driver needs to cooperate with other agent (iova.c here) to
> co-manage system IOMMU. This may not impact existing VFIO framework.
> Just want to highlight additional work here when implementing the vGPU
> IOMMU driver.

Re: [Intel-gfx] [Announcement] 2015-Q3 release of XenGT - a Mediated Graphics Passthrough Solution from Intel

2015-11-20 Thread Alex Williamson
On Fri, 2015-11-20 at 07:09 +, Tian, Kevin wrote:
> > From: Alex Williamson [mailto:alex.william...@redhat.com]
> > Sent: Friday, November 20, 2015 4:03 AM
> > 
> > > >
> > > > The proposal is therefore that GPU vendors can expose vGPUs to
> > > > userspace, and thus to QEMU, using the VFIO API.  For instance, vfio
> > > > supports modular bus drivers and IOMMU drivers.  An intel-vfio-gvt-d
> > > > module (or extension of i915) can register as a vfio bus driver, create
> > > > a struct device per vGPU, create an IOMMU group for that device, and
> > > > register that device with the vfio-core.  Since we don't rely on the
> > > > system IOMMU for GVT-d vGPU assignment, another vGPU vendor driver (or
> > > > extension of the same module) can register a "type1" compliant IOMMU
> > > > driver into vfio-core.  From the perspective of QEMU then, all of the
> > > > existing vfio-pci code is re-used, QEMU remains largely unaware of any
> > > > specifics of the vGPU being assigned, and the only necessary change so
> > > > far is how QEMU traverses sysfs to find the device and thus the IOMMU
> > > > group leading to the vfio group.
> > >
> > > GVT-g requires pinning guest memory and querying GPA->HPA information,
> > > upon which shadow GTTs will be updated accordingly from (GMA->GPA)
> > > to (GMA->HPA). So yes, here a dummy or simple "type1" compliant IOMMU
> > > can be introduced just for this requirement.
> > >
> > > However there's one tricky point which I'm not sure whether overall
> > > VFIO concept will be violated. GVT-g doesn't require system IOMMU
> > > to function, however host system may enable system IOMMU just for
> > > hardening purpose. This means two-level translations existing (GMA->
> > > IOVA->HPA), so the dummy IOMMU driver has to request system IOMMU
> > > driver to allocate IOVA for VMs and then setup IOVA->HPA mapping
> > > in IOMMU page table. In this case, multiple VM's translations are
> > > multiplexed in one IOMMU page table.
> > >
> > > We might need create some group/sub-group or parent/child concepts
> > > among those IOMMUs for thorough permission control.
> > 
> > My thought here is that this is all abstracted through the vGPU IOMMU
> > and device vfio backends.  It's the GPU driver itself, or some vfio
> > extension of that driver, mediating access to the device and deciding
> > when to configure GPU MMU mappings.  That driver has access to the GPA
> > to HVA translations thanks to the type1 compliant IOMMU it implements
> > and can pin pages as needed to create GPA to HPA mappings.  That should
> > give it all the pieces it needs to fully setup mappings for the vGPU.
> > Whether or not there's a system IOMMU is simply an exercise for that
> > driver.  It needs to do a DMA mapping operation through the system IOMMU
> > the same for a vGPU as if it was doing it for itself, because they are
> > in fact one and the same.  The GMA to IOVA mapping seems like an internal
> > detail.  I assume the IOVA is some sort of GPA, and the GMA is managed
> > through mediation of the device.
> 
> Sorry I'm not familiar with VFIO internal. My original worry is that system 
> IOMMU for GPU may be already claimed by another vfio driver (e.g. host kernel
> wants to harden gfx driver from rest sub-systems, regardless of whether vGPU 
> is created or not). In that case vGPU IOMMU driver shouldn't manage system
> IOMMU directly.

There are different APIs for the IOMMU depending on how it's being use.
If the IOMMU is being used for inter-device isolation in the host, then
the DMA API (ex. dma_map_page) transparently makes use of the IOMMU.
When we're doing device assignment, we make use of the IOMMU API which
allows more explicit control (ex. iommu_domain_alloc,
iommu_attach_device, iommu_map, etc).  A vGPU is not an SR-IOV VF, it
doesn't have a unique requester ID that allows the IOMMU to
differentiate one vGPU from another, or vGPU from GPU.  All mappings for
vGPUs need to occur for the GPU.  It's therefore the responsibility of
the GPU driver, or this vfio extension of that driver, that needs to
perform the IOMMU mapping for the vGPU.
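
To make the two usages concrete, a minimal kernel-side sketch (simplified;
the iommu_map() shown uses the older 5-argument form, and error handling is
reduced to the bare minimum):

#include <linux/dma-mapping.h>
#include <linux/iommu.h>

/* DMA API: any system IOMMU is used transparently; you just get an IOVA. */
static dma_addr_t map_via_dma_api(struct device *dev, struct page *page)
{
    return dma_map_page(dev, page, 0, PAGE_SIZE, DMA_BIDIRECTIONAL);
}

/* IOMMU API: explicit domain control, as used for device assignment. */
static struct iommu_domain *map_via_iommu_api(struct device *dev,
                                              unsigned long iova,
                                              phys_addr_t paddr)
{
    struct iommu_domain *domain = iommu_domain_alloc(dev->bus);

    if (!domain)
        return NULL;
    if (iommu_attach_device(domain, dev) ||
        iommu_map(domain, iova, paddr, PAGE_SIZE,
                  IOMMU_READ | IOMMU_WRITE)) {
        iommu_domain_free(domain);
        return NULL;
    }
    return domain;
}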

My expectation is therefore that once the GMA to IOVA mapping is
configured in the GPU MMU, the IOVA to HPA needs to be programmed, as if
the GPU driver was performing the setup itself, which it is.  Before the
device mediation that triggered the mapping setup is complete, the GPU
MMU and the system IOMMU (if present) should be configured to enable that
DMA.  The GPU MMU provides the isolation of the vGPU, the system IOMMU
enables the DMA to occur.

> btw, curious today how VFIO coordinates with system IOMMU driver regarding
> to whether a IOMMU is used to control device assignment, or used for kernel 
> hardening. Somehow two are conflicting since different address spaces are
> concerned (GPA vs. IOVA)...

When devices unbind from native host drivers, any previous IOMMU
mappings and domains are removed.  These are typically created via the
DMA API above.  The initialization

Re: [Intel-gfx] [Announcement] 2015-Q3 release of XenGT - a Mediated Graphics Passthrough Solution from Intel

2015-11-20 Thread Alex Williamson
On Fri, 2015-11-20 at 13:51 +0800, Jike Song wrote:
> On 11/20/2015 12:22 PM, Alex Williamson wrote:
> > On Fri, 2015-11-20 at 10:58 +0800, Jike Song wrote:
> >> On 11/19/2015 11:52 PM, Alex Williamson wrote:
> >>> On Thu, 2015-11-19 at 15:32 +, Stefano Stabellini wrote:
>  On Thu, 19 Nov 2015, Jike Song wrote:
> > Hi Alex, thanks for the discussion.
> >
> > In addition to Kevin's replies, I have a high-level question: can VFIO
> > be used by QEMU for both KVM and Xen?
> 
>  No. VFIO cannot be used with Xen today. When running on Xen, the IOMMU
>  is owned by Xen.
> >>>
> >>> Right, but in this case we're talking about device MMUs, which are owned
> >>> by the device driver which I think is running in dom0, right?  This
> >>> proposal doesn't require support of the system IOMMU, the dom0 driver
> >>> maps IOVA translations just as it would for itself.  We're largely
> >>> proposing use of the VFIO API to provide a common interface to expose a
> >>> PCI(e) device to QEMU, but what happens in the vGPU vendor device and
> >>> IOMMU backends is specific to the device and perhaps even specific to
> >>> the hypervisor.  Thanks,
> >>
> >> Let me conclude this, and please correct me in case of any misread: the
> >> vGPU interface between kernel and QEMU will be through VFIO, with a new
> >> VFIO backend (instead of the existing type1), for both KVMGT and XenGT?
> >
> > My primary concern is KVM and QEMU upstream, the proposal is not
> > specifically directed at XenGT, but does not exclude it either.  Xen is
> > welcome to adopt this proposal as well, it simply defines the channel
> > through which vGPUs are exposed to QEMU as the VFIO API.  The core VFIO
> > code in the Linux kernel is just as available for use in Xen dom0 as it
> > is for a KVM host. VFIO in QEMU certainly knows about some
> > accelerations for KVM, but these are almost entirely around allowing
> > eventfd based interrupts to be injected through KVM, which is something
> > I'm sure Xen could provide as well.  These accelerations are also not
> > required, VFIO based device assignment in QEMU works with or without
> > KVM.  Likewise, the VFIO kernel interface knows nothing about KVM and
> > has no dependencies on it.
> >
> > There are two components to the VFIO API, one is the type1 compliant
> > IOMMU interface, which for this proposal is really doing nothing more
> > than tracking the HVA to GPA mappings for the VM.  This much seems
> > entirely common regardless of the hypervisor.  The other part is the
> > device interface.  The lifecycle of the virtual device seems like it
> > would be entirely shared, as does much of the emulation components of
> > the device.  When we get to pinning pages, providing direct access to
> > memory ranges for a VM, and accelerating interrupts, the vGPU drivers
> > will likely need some per hypervisor branches, but these are areas where
> > that's true no matter what the interface.  I'm probably over
> > simplifying, but hopefully not too much, correct me if I'm wrong.
> >
> 
> Thanks for the confirmation. For QEMU/KVM, I totally agree with your point;
> however, if we take XenGT into consideration it gets a bit more complex: with
> the Xen hypervisor and the Dom0 kernel running at different levels, it's not
> straightforward for QEMU to do something like mapping a portion of an MMIO
> BAR via VFIO in the Dom0 kernel, instead of calling hypercalls directly.

This would need to be part of the support added for Xen.  To directly
map a device MMIO space to the VM, VFIO provides an mmap, QEMU registers
that mmap with KVM, or Xen.  It's all just MemoryRegions in QEMU.
Perhaps it's even already supported by Xen.

> I don't know if there is a better way to handle this. But I do agree that
> channels between kernel and Qemu via VFIO is a good idea, even though we
> may have to split KVMGT/XenGT in Qemu a bit.  We are currently working on
> moving all of PCI CFG emulation from kernel to Qemu, hopefully we can
> release it by end of this year and work with you guys to adjust it for
> the agreed method.

Well, moving PCI config space emulation from kernel to QEMU is exactly
the wrong direction to take for this proposal.  Config space access to
the vGPU would occur through the VFIO API.  So if you already have
config space emulation in the kernel, that's already one less piece of
work for a VFIO model, it just needs to be "wired up" through the VFIO
API.  Thanks,

Alex



Re: [Intel-gfx] [Announcement] 2015-Q3 release of XenGT - a Mediated Graphics Passthrough Solution from Intel

2015-11-20 Thread Zhiyuan Lv
On Fri, Nov 20, 2015 at 04:36:15PM +0800, Tian, Kevin wrote:
> > From: Gerd Hoffmann [mailto:kra...@redhat.com]
> > Sent: Friday, November 20, 2015 4:26 PM
> > 
> >   Hi,
> > 
> > > > iGVT-g_Setup_Guide.txt mentions a "Indirect Display Mode", but doesn't
> > > > explain how the guest framebuffer can be accessed then.
> > >
> > > You can check "fb_decoder.h". One thing to clarify. Its format is
> > > actually based on drm definition, instead of OpenGL. Sorry for
> > > that.
> > 
> > drm is fine.  That header explains the format, but not how it can be
> > accessed.  Is the guest fb exported as dma-buf?
> 
> Currently not, but per our previous discussion we should move to use
> dma-buf. We have some demo code in user space. Not sure whether
> they're public now. Jike could you help do a check?

Our current implementation does not use dma-buf yet; it is still based on the
DRM_FLINK interface. We will switch to dma-buf. Thanks!

Regards,
-Zhiyuan

> 
> > 
> > > > So, for non-opengl rendering qemu needs the guest framebuffer data so it
> > > > can feed it into the vnc server.  The vfio framebuffer region is meant
> > > > to support this use case.
> > >
> > > what's the format requirement on that framebuffer? If you are familiar
> > > with Intel Graphics, there's a so-called tiling feature applied on frame
> > > buffer so it can't be used as a raw input to vnc server. w/o opengl you
> > > need to do some conversion on the CPU first.
> > 
> > Yes, that conversion needs to happen, qemu can't deal with tiled
> > graphics.  Anything which pixman can handle will work.  Preferred would
> > be PIXMAN_x8r8g8b8 (aka DRM_FORMAT_XRGB8888 on little endian host) which
> > is the format used by the vnc server (and other places in qemu)
> > internally.
> > 
> > qemu can also use the opengl texture for the guest fb, then fetch the
> > data with glReadPixels().  Which will probably do exactly the same
> > conversion.  But it'll add an opengl dependency to the non-opengl
> > rendering path in qemu, would be nice if we can avoid that.
> > 
> > While being at it:  When importing a dma-buf with a tiled framebuffer
> > into opengl (via eglCreateImageKHR + EGL_LINUX_DMA_BUF_EXT) I suspect we
> > have to pass in the tile size as attribute to make it work.  Is that
> > correct?
> > 
> 
> I'd guess so, but need double confirm later when reaching that level of 
> detail. 
> some homework on dma-buf is required first. :-)
> 
> Thanks
> Kevin


RE: [Intel-gfx] [Announcement] 2015-Q3 release of XenGT - a Mediated Graphics Passthrough Solution from Intel

2015-11-20 Thread Tian, Kevin
> From: Gerd Hoffmann [mailto:kra...@redhat.com]
> Sent: Friday, November 20, 2015 4:26 PM
> 
>   Hi,
> 
> > > iGVT-g_Setup_Guide.txt mentions a "Indirect Display Mode", but doesn't
> > > explain how the guest framebuffer can be accessed then.
> >
> > You can check "fb_decoder.h". One thing to clarify. Its format is
> > actually based on drm definition, instead of OpenGL. Sorry for
> > that.
> 
> drm is fine.  That header explains the format, but not how it can be
> accessed.  Is the guest fb exported as dma-buf?

Currently not, but per our previous discussion we should move to use
dma-buf. We have some demo code in user space. Not sure whether
they're public now. Jike could you help do a check?

> 
> > > So, for non-opengl rendering qemu needs the guest framebuffer data so it
> > > can feed it into the vnc server.  The vfio framebuffer region is meant
> > > to support this use case.
> >
> > what's the format requirement on that framebuffer? If you are familiar
> > with Intel Graphics, there's a so-called tiling feature applied on frame
> > buffer so it can't be used as a raw input to vnc server. w/o opengl you
> > need to do some conversion on the CPU first.
> 
> Yes, that conversion needs to happen, qemu can't deal with tiled
> graphics.  Anything which pixman can handle will work.  Preferred would
> be PIXMAN_x8r8g8b8 (aka DRM_FORMAT_XRGB8888 on little endian host) which
> is the format used by the vnc server (and other places in qemu)
> internally.
> 
> qemu can also use the opengl texture for the guest fb, then fetch the
> data with glReadPixels().  Which will probably do exactly the same
> conversion.  But it'll add an opengl dependency to the non-opengl
> rendering path in qemu, would be nice if we can avoid that.
> 
> While being at it:  When importing a dma-buf with a tiled framebuffer
> into opengl (via eglCreateImageKHR + EGL_LINUX_DMA_BUF_EXT) I suspect we
> have to pass in the tile size as attribute to make it work.  Is that
> correct?
> 

I'd guess so, but need double confirm later when reaching that level of detail. 
some homework on dma-buf is required first. :-)

Thanks
Kevin

Re: [Intel-gfx] [Announcement] 2015-Q3 release of XenGT - a Mediated Graphics Passthrough Solution from Intel

2015-11-20 Thread Gerd Hoffmann
  Hi,

> > iGVT-g_Setup_Guide.txt mentions a "Indirect Display Mode", but doesn't
> > explain how the guest framebuffer can be accessed then.
> 
> You can check "fb_decoder.h". One thing to clarify. Its format is
> actually based on drm definition, instead of OpenGL. Sorry for
> that.

drm is fine.  That header explains the format, but not how it can be
accessed.  Is the guest fb exported as dma-buf?

> > So, for non-opengl rendering qemu needs the guest framebuffer data so it
> > can feed it into the vnc server.  The vfio framebuffer region is meant
> > to support this use case.
> 
> what's the format requirement on that framebuffer? If you are familiar
> with Intel Graphics, there's a so-called tiling feature applied on frame
> buffer so it can't be used as a raw input to vnc server. w/o opengl you
> need to do some conversion on the CPU first.

Yes, that conversion needs to happen, qemu can't deal with tiled
graphics.  Anything which pixman can handle will work.  Preferred would
be PIXMAN_x8r8g8b8 (aka DRM_FORMAT_XRGB8888 on little endian host) which
is the format used by the vnc server (and other places in qemu)
internally.

qemu can also use the opengl texture for the guest fb, then fetch the
data with glReadPixels().  Which will probably do exactly the same
conversion.  But it'll add an opengl dependency to the non-opengl
rendering path in qemu, would be nice if we can avoid that.

While being at it:  When importing a dma-buf with a tiled framebuffer
into opengl (via eglCreateImageKHR + EGL_LINUX_DMA_BUF_EXT) I suspect we
have to pass in the tile size as attribute to make it work.  Is that
correct?
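
For reference, such an import could look roughly like this (a sketch; it
assumes EGL_EXT_image_dma_buf_import, the modifier attributes come from the
later EGL_EXT_image_dma_buf_import_modifiers extension, and the fourcc and
modifier values are just examples):

#include <epoxy/egl.h>
#include <drm_fourcc.h>
#include <stdint.h>

/* Import a single-plane dma-buf as an EGLImage.  Tiling is described by a
 * DRM format modifier rather than a raw tile size. */
static EGLImageKHR import_dmabuf(EGLDisplay dpy, int fd, int width,
                                 int height, int stride, uint64_t modifier)
{
    const EGLint attrs[] = {
        EGL_WIDTH, width,
        EGL_HEIGHT, height,
        EGL_LINUX_DRM_FOURCC_EXT, DRM_FORMAT_XRGB8888,
        EGL_DMA_BUF_PLANE0_FD_EXT, fd,
        EGL_DMA_BUF_PLANE0_OFFSET_EXT, 0,
        EGL_DMA_BUF_PLANE0_PITCH_EXT, stride,
        EGL_DMA_BUF_PLANE0_MODIFIER_LO_EXT, (EGLint)(modifier & 0xffffffff),
        EGL_DMA_BUF_PLANE0_MODIFIER_HI_EXT, (EGLint)(modifier >> 32),
        EGL_NONE
    };

    return eglCreateImageKHR(dpy, EGL_NO_CONTEXT, EGL_LINUX_DMA_BUF_EXT,
                             (EGLClientBuffer)NULL, attrs);
}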

cheers,
  Gerd




RE: [Intel-gfx] [Announcement] 2015-Q3 release of XenGT - a Mediated Graphics Passthrough Solution from Intel

2015-11-20 Thread Tian, Kevin
> From: Tian, Kevin
> Sent: Friday, November 20, 2015 3:10 PM

> > > >
> > > > The proposal is therefore that GPU vendors can expose vGPUs to
> > > > userspace, and thus to QEMU, using the VFIO API.  For instance, vfio
> > > > supports modular bus drivers and IOMMU drivers.  An intel-vfio-gvt-d
> > > > module (or extension of i915) can register as a vfio bus driver, create
> > > > a struct device per vGPU, create an IOMMU group for that device, and
> > > > register that device with the vfio-core.  Since we don't rely on the
> > > > system IOMMU for GVT-d vGPU assignment, another vGPU vendor driver (or
> > > > extension of the same module) can register a "type1" compliant IOMMU
> > > > driver into vfio-core.  From the perspective of QEMU then, all of the
> > > > existing vfio-pci code is re-used, QEMU remains largely unaware of any
> > > > specifics of the vGPU being assigned, and the only necessary change so
> > > > far is how QEMU traverses sysfs to find the device and thus the IOMMU
> > > > group leading to the vfio group.
> > >
> > > GVT-g requires pinning guest memory and querying GPA->HPA information,
> > > upon which shadow GTTs will be updated accordingly from (GMA->GPA)
> > > to (GMA->HPA). So yes, here a dummy or simple "type1" compliant IOMMU
> > > can be introduced just for this requirement.
> > >
> > > However there's one tricky point which I'm not sure whether overall
> > > VFIO concept will be violated. GVT-g doesn't require system IOMMU
> > > to function, however host system may enable system IOMMU just for
> > > hardening purpose. This means two-level translations existing (GMA->
> > > IOVA->HPA), so the dummy IOMMU driver has to request system IOMMU
> > > driver to allocate IOVA for VMs and then setup IOVA->HPA mapping
> > > in IOMMU page table. In this case, multiple VM's translations are
> > > multiplexed in one IOMMU page table.
> > >
> > > We might need create some group/sub-group or parent/child concepts
> > > among those IOMMUs for thorough permission control.
> >
> > My thought here is that this is all abstracted through the vGPU IOMMU
> > and device vfio backends.  It's the GPU driver itself, or some vfio
> > extension of that driver, mediating access to the device and deciding
> > when to configure GPU MMU mappings.  That driver has access to the GPA
> > to HVA translations thanks to the type1 compliant IOMMU it implements
> > and can pin pages as needed to create GPA to HPA mappings.  That should
> > give it all the pieces it needs to fully setup mappings for the vGPU.
> > Whether or not there's a system IOMMU is simply an exercise for that
> > driver.  It needs to do a DMA mapping operation through the system IOMMU
> > the same for a vGPU as if it was doing it for itself, because they are
> > in fact one and the same.  The GMA to IOVA mapping seems like an internal
> > detail.  I assume the IOVA is some sort of GPA, and the GMA is managed
> > through mediation of the device.
> 
> Sorry I'm not familiar with VFIO internal. My original worry is that system
> IOMMU for GPU may be already claimed by another vfio driver (e.g. host kernel
> wants to harden gfx driver from rest sub-systems, regardless of whether vGPU
> is created or not). In that case vGPU IOMMU driver shouldn't manage system
> IOMMU directly.
> 
> btw, curious today how VFIO coordinates with system IOMMU driver regarding
> to whether a IOMMU is used to control device assignment, or used for kernel
> hardening. Somehow two are conflicting since different address spaces are
> concerned (GPA vs. IOVA)...
> 

Here is a more concrete example:

KVMGT doesn't require IOMMU. All DMA targets are already replaced with 
HPA thru shadow GTT. So DMA requests from GPU all contain HPAs.

When IOMMU is enabled, one simple approach is to have vGPU IOMMU
driver configure system IOMMU with identity mapping (HPA->HPA). We 
can't use (GPA->HPA) since GPAs from multiple VMs are conflicting. 

However, we still have host gfx driver running. When IOMMU is enabled, 
dma_alloc_*** will return IOVA (drivers/iommu/iova.c) in host gfx driver,
which will have IOVA->HPA programmed to system IOMMU.

One IOMMU device entry can only translate one address space, so here
comes a conflict (HPA->HPA vs. IOVA->HPA). To solve this, vGPU IOMMU
driver needs to allocate IOVA from iova.c for each VM w/ vGPU assigned,
and then KVMGT will program IOVA in shadow GTT accordingly. It adds
one additional mapping layer (GPA->IOVA->HPA). In this way two 
requirements can be unified together since only IOVA->HPA mapping 
needs to be built.

So unlike existing type1 IOMMU driver which controls IOMMU alone, vGPU 
IOMMU driver needs to cooperate with other agent (iova.c here) to
co-manage system IOMMU. This may not impact existing VFIO framework.
Just want to highlight additional work here when implementing the vGPU
IOMMU driver.

Thanks
Kevin
 



RE: [Intel-gfx] [Announcement] 2015-Q3 release of XenGT - a Mediated Graphics Passthrough Solution from Intel

2015-11-19 Thread Tian, Kevin
> From: Alex Williamson [mailto:alex.william...@redhat.com]
> Sent: Friday, November 20, 2015 4:03 AM
> 
> > >
> > > The proposal is therefore that GPU vendors can expose vGPUs to
> > > userspace, and thus to QEMU, using the VFIO API.  For instance, vfio
> > > supports modular bus drivers and IOMMU drivers.  An intel-vfio-gvt-d
> > > module (or extension of i915) can register as a vfio bus driver, create
> > > a struct device per vGPU, create an IOMMU group for that device, and
> > > register that device with the vfio-core.  Since we don't rely on the
> > > system IOMMU for GVT-d vGPU assignment, another vGPU vendor driver (or
> > > extension of the same module) can register a "type1" compliant IOMMU
> > > driver into vfio-core.  From the perspective of QEMU then, all of the
> > > existing vfio-pci code is re-used, QEMU remains largely unaware of any
> > > specifics of the vGPU being assigned, and the only necessary change so
> > > far is how QEMU traverses sysfs to find the device and thus the IOMMU
> > > group leading to the vfio group.
> >
> > GVT-g requires pinning guest memory and querying GPA->HPA information,
> > upon which shadow GTTs will be updated accordingly from (GMA->GPA)
> > to (GMA->HPA). So yes, here a dummy or simple "type1" compliant IOMMU
> > can be introduced just for this requirement.
> >
> > However there's one tricky point which I'm not sure whether overall
> > VFIO concept will be violated. GVT-g doesn't require system IOMMU
> > to function, however host system may enable system IOMMU just for
> > hardening purpose. This means two-level translations existing (GMA->
> > IOVA->HPA), so the dummy IOMMU driver has to request system IOMMU
> > driver to allocate IOVA for VMs and then setup IOVA->HPA mapping
> > in IOMMU page table. In this case, multiple VM's translations are
> > multiplexed in one IOMMU page table.
> >
> > We might need create some group/sub-group or parent/child concepts
> > among those IOMMUs for thorough permission control.
> 
> My thought here is that this is all abstracted through the vGPU IOMMU
> and device vfio backends.  It's the GPU driver itself, or some vfio
> extension of that driver, mediating access to the device and deciding
> when to configure GPU MMU mappings.  That driver has access to the GPA
> to HVA translations thanks to the type1 compliant IOMMU it implements
> and can pin pages as needed to create GPA to HPA mappings.  That should
> give it all the pieces it needs to fully setup mappings for the vGPU.
> Whether or not there's a system IOMMU is simply an exercise for that
> driver.  It needs to do a DMA mapping operation through the system IOMMU
> the same for a vGPU as if it was doing it for itself, because they are
> in fact one and the same.  The GMA to IOVA mapping seems like an internal
> detail.  I assume the IOVA is some sort of GPA, and the GMA is managed
> through mediation of the device.

Sorry I'm not familiar with VFIO internal. My original worry is that system 
IOMMU for GPU may be already claimed by another vfio driver (e.g. host kernel
wants to harden gfx driver from rest sub-systems, regardless of whether vGPU 
is created or not). In that case vGPU IOMMU driver shouldn't manage system
IOMMU directly.

btw, curious today how VFIO coordinates with system IOMMU driver regarding
to whether a IOMMU is used to control device assignment, or used for kernel 
hardening. Somehow two are conflicting since different address spaces are
concerned (GPA vs. IOVA)...

> 
> 
> > > There are a few areas where we know we'll need to extend the VFIO API to
> > > make this work, but it seems like they can all be done generically.  One
> > > is that PCI BARs are described through the VFIO API as regions and each
> > > region has a single flag describing whether mmap (ie. direct mapping) of
> > > that region is possible.  We expect that vGPUs likely need finer
> > > granularity, enabling some areas within a BAR to be trapped and fowarded
> > > as a read or write access for the vGPU-vfio-device module to emulate,
> > > while other regions, like framebuffers or texture regions, are directly
> > > mapped.  I have prototype code to enable this already.
> >
> > Yes in GVT-g one BAR resource might be partitioned among multiple vGPUs.
> > If VFIO can support such partial resource assignment, it'd be great. Similar
> > parent/child concept might also be required here, so any resource enumerated
> > on a vGPU shouldn't break limitations enforced on the physical device.
> 
> To be clear, I'm talking about partitioning of the BAR exposed to the
> guest.  Partitioning of the physical BAR would be managed by the vGPU
> vfio device driver.  For instance when the guest mmap's a section of the
> virtual BAR, the vGPU device driver would map that to a portion of the
> physical device BAR.
> 
> > One unique requirement for GVT-g here, though, is that vGPU device model
> > needs to know the guest BAR configuration for proper emulation (e.g. register
> > IO emulation handler to KVM). S

RE: [Intel-gfx] [Announcement] 2015-Q3 release of XenGT - a Mediated Graphics Passthrough Solution from Intel

2015-11-19 Thread Tian, Kevin
> From: Gerd Hoffmann [mailto:kra...@redhat.com]
> Sent: Thursday, November 19, 2015 4:41 PM
> 
>   Hi,
> 
> > > Another area of extension is how to expose a framebuffer to QEMU for
> > > seamless integration into a SPICE/VNC channel.  For this I believe we
> > > could use a new region, much like we've done to expose VGA access
> > > through a vfio device file descriptor.  An area within this new
> > > framebuffer region could be directly mappable in QEMU while a
> > > non-mappable page, at a standard location with standardized format,
> > > provides a description of framebuffer and potentially even a
> > > communication channel to synchronize framebuffer captures.  This would
> > > be new code for QEMU, but something we could share among all vGPU
> > > implementations.
> >
> > Now GVT-g already provides an interface to decode framebuffer information,
> > w/ an assumption that the framebuffer will be further composited into
> > OpenGL APIs.
> 
> Can I have a pointer to docs / code?
> 
> iGVT-g_Setup_Guide.txt mentions a "Indirect Display Mode", but doesn't
> explain how the guest framebuffer can be accessed then.

You can check "fb_decoder.h". One thing to clarify. Its format is
actually based on drm definition, instead of OpenGL. Sorry for
that.

> 
> > So the format is defined according to OpenGL definition.
> > Does that meet SPICE requirement?
> 
> Yes and no ;)
> 
> Some more background:  We basically have two rendering paths in qemu.
> The classic one, without opengl, and a new, still emerging one, using
> opengl and dma-bufs (gtk support merged for qemu 2.5, sdl2 support will
> land in 2.6, spice support still WIP, hopefully 2.6 too).  For best
> performance you probably want to use the new opengl-based rendering
> whenever possible.  However I do *not* expect the classic rendering path
> to disappear; we'll continue to need it in various cases, the most prominent
> one being vnc support.
> 
> So, for non-opengl rendering qemu needs the guest framebuffer data so it
> can feed it into the vnc server.  The vfio framebuffer region is meant
> to support this use case.

What's the format requirement on that framebuffer? If you are familiar
with Intel Graphics, there's a so-called tiling feature applied to the frame
buffer, so it can't be used as raw input to the vnc server. Without opengl you
need to do some conversion on the CPU first.
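
To make that concrete, below is a purely illustrative sketch of the kind of
CPU-side conversion this implies, assuming an X-tiled surface (4KiB tiles of
512 bytes x 8 rows) and ignoring address swizzling; real code would have to
query the actual tiling mode and handle Y-tiling as well.

#include <stdint.h>
#include <stddef.h>

#define XTILE_W  512                 /* bytes per tile row (X-tiling) */
#define XTILE_H  8                   /* rows per tile                 */
#define XTILE_SZ (XTILE_W * XTILE_H) /* 4KiB                          */

/* Copy an X-tiled framebuffer into a linear buffer usable by a software
 * consumer such as the vnc server.  'tiled_pitch' is the tiled surface
 * pitch in bytes and is assumed to be a multiple of 512. */
static void detile_x(uint8_t *linear, const uint8_t *tiled,
                     uint32_t width_bytes, uint32_t height,
                     uint32_t tiled_pitch)
{
    uint32_t tiles_per_row = tiled_pitch / XTILE_W;
    uint32_t x, y;

    for (y = 0; y < height; y++) {
        for (x = 0; x < width_bytes; x++) {
            uint32_t tile   = (y / XTILE_H) * tiles_per_row + x / XTILE_W;
            uint32_t offset = tile * XTILE_SZ +
                              (y % XTILE_H) * XTILE_W + x % XTILE_W;
            linear[(size_t)y * width_bytes + x] = tiled[offset];
        }
    }
}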

> 
> > Another thing to be added. Framebuffers are frequently switched in
> > reality. So either Qemu needs to poll or a notification mechanism is 
> > required.
> 
> The idea is to have qemu poll (and adapt the poll rate, i.e. without a vnc
> client connected qemu will poll a lot less frequently).
> 
> > And since it's dynamic, having framebuffer page directly exposed in the
> > new region might be tricky.  We can just expose framebuffer information
> > (including base, format, etc.) and let Qemu to map separately out of VFIO
> > interface.
> 
> Allocate some memory, ask gpu to blit the guest framebuffer there, i.e.
> provide a snapshot of the current guest display instead of playing
> mapping tricks?

Yes, that works, but it would be better done at the user level.

> 
> > And... this works fine with vGPU model since software knows all the
> > detail about framebuffer. However in pass-through case, who do you expect
> > to provide that information? Is it OK to introduce vGPU specific APIs in
> > VFIO?
> 
> It will only be used in the vgpu case, not for pass-though.
> 
> We think it is better to extend the vfio interface to improve vgpu
> support rather than inventing something new while vfio can satisfy 90%
> of the vgpu needs already.  We want to avoid vendor-specific extensions
> though; the vgpu extension should work across vendors.

That's fine, as long as a vgpu-specific interface is allowed. :-)

> 
> > Now there is no standard. We expose vGPU life-cycle mgmt. APIs through
> > sysfs (under i915 node), which is very Intel specific. In reality different
> > vendors have quite different capabilities for their own vGPUs, so not sure
> > how standard we can define such a mechanism.
> 
> Agree when it comes to create vGPU instances.
> 
> > But this code should be
> > minor to be maintained in libvirt.
> 
> As far as I know libvirt only needs to discover those devices.  If they
> look like sr/iov devices in sysfs this might work without any changes to
> libvirt.
> 
> cheers,
>   Gerd
> 



RE: [Intel-gfx] [Announcement] 2015-Q3 release of XenGT - a Mediated Graphics Passthrough Solution from Intel

2015-11-19 Thread Tian, Kevin
> From: Song, Jike
> Sent: Friday, November 20, 2015 1:52 PM
> 
> On 11/20/2015 12:22 PM, Alex Williamson wrote:
> > On Fri, 2015-11-20 at 10:58 +0800, Jike Song wrote:
> >> On 11/19/2015 11:52 PM, Alex Williamson wrote:
> >>> On Thu, 2015-11-19 at 15:32 +, Stefano Stabellini wrote:
>  On Thu, 19 Nov 2015, Jike Song wrote:
> > Hi Alex, thanks for the discussion.
> >
> > In addition to Kevin's replies, I have a high-level question: can VFIO
> > be used by QEMU for both KVM and Xen?
> 
>  No. VFIO cannot be used with Xen today. When running on Xen, the IOMMU
>  is owned by Xen.
> >>>
> >>> Right, but in this case we're talking about device MMUs, which are owned
> >>> by the device driver which I think is running in dom0, right?  This
> >>> proposal doesn't require support of the system IOMMU, the dom0 driver
> >>> maps IOVA translations just as it would for itself.  We're largely
> >>> proposing use of the VFIO API to provide a common interface to expose a
> >>> PCI(e) device to QEMU, but what happens in the vGPU vendor device and
> >>> IOMMU backends is specific to the device and perhaps even specific to
> >>> the hypervisor.  Thanks,

As I commented in another thread, let's not include the device MMU in this
discussion; it is purely device-internal and thus not in the scope of VFIO (Qemu
doesn't need to know about it). Let's keep the discussion about a dummy type-1
IOMMU driver for maintaining the G2H mapping.

> >>
> >> Let me conclude this, and please correct me in case of any misread: the
> >> vGPU interface between kernel and QEMU will be through VFIO, with a new
> >> VFIO backend (instead of the existing type1), for both KVMGT and XenGT?
> >
> > My primary concern is KVM and QEMU upstream, the proposal is not
> > specifically directed at XenGT, but does not exclude it either.  Xen is
> > welcome to adopt this proposal as well, it simply defines the channel
> > through which vGPUs are exposed to QEMU as the VFIO API.  The core VFIO
> > code in the Linux kernel is just as available for use in Xen dom0 as it
> > is for a KVM host. VFIO in QEMU certainly knows about some
> > accelerations for KVM, but these are almost entirely around allowing
> > eventfd based interrupts to be injected through KVM, which is something
> > I'm sure Xen could provide as well.  These accelerations are also not
> > required, VFIO based device assignment in QEMU works with or without
> > KVM.  Likewise, the VFIO kernel interface knows nothing about KVM and
> > has no dependencies on it.
> >
> > There are two components to the VFIO API, one is the type1 compliant
> > IOMMU interface, which for this proposal is really doing nothing more
> > than tracking the HVA to GPA mappings for the VM.  This much seems
> > entirely common regardless of the hypervisor.  The other part is the
> > device interface.  The lifecycle of the virtual device seems like it
> > would be entirely shared, as does much of the emulation components of
> > the device.  When we get to pinning pages, providing direct access to
> > memory ranges for a VM, and accelerating interrupts, the vGPU drivers
> > will likely need some per hypervisor branches, but these are areas where
> > that's true no matter what the interface.  I'm probably over
> > simplifying, but hopefully not too much, correct me if I'm wrong.
> >
> 
> Thanks for the confirmation. For QEMU/KVM I totally agree with your point;
> however, if we take XenGT into consideration it gets a bit more complex: with
> the Xen hypervisor and the Dom0 kernel running at different levels, it's not
> straightforward for QEMU to do something like mapping a portion of an MMIO BAR
> via VFIO in the Dom0 kernel instead of calling hypercalls directly.
> 
> I don't know if there is a better way to handle this. But I do agree that
> a channel between the kernel and Qemu via VFIO is a good idea, even though we
> may have to split KVMGT/XenGT in Qemu a bit.  We are currently working on
> moving all of the PCI CFG emulation from the kernel to Qemu; hopefully we can
> release it by the end of this year and work with you guys to adjust it to
> the agreed method.

Currently the pass-through path in Qemu is already different between Xen and KVM.
For now let's keep it simple and focus on how to extend VFIO to manage vGPU. In
the future, if Xen decides to use VFIO, it should not be that difficult to add
a Xen-specific vfio driver there.

> 
> 
> > The benefit of course is that aside from some extensions to the API, the
> > QEMU components are already in place and there's a lot more leverage for
> > getting both QEMU and libvirt support upstream in being able to support
> > multiple vendors, perhaps multiple hypervisors, with the same code.
> > Also, I'm not sure how useful it is, but VFIO is a userspace driver
> > interface, where here we're predominantly talking about that userspace
> > driver being QEMU.  It's not limited to that though.  A userspace
> > compute application could have direct access to a vGPU through this
> > model.  Thanks,
> 

One idea in 

Re: [Intel-gfx] [Announcement] 2015-Q3 release of XenGT - a Mediated Graphics Passthrough Solution from Intel

2015-11-19 Thread Jike Song

On 11/20/2015 12:22 PM, Alex Williamson wrote:

On Fri, 2015-11-20 at 10:58 +0800, Jike Song wrote:

On 11/19/2015 11:52 PM, Alex Williamson wrote:

On Thu, 2015-11-19 at 15:32 +, Stefano Stabellini wrote:

On Thu, 19 Nov 2015, Jike Song wrote:

Hi Alex, thanks for the discussion.

In addition to Kevin's replies, I have a high-level question: can VFIO
be used by QEMU for both KVM and Xen?


No. VFIO cannot be used with Xen today. When running on Xen, the IOMMU
is owned by Xen.


Right, but in this case we're talking about device MMUs, which are owned
by the device driver which I think is running in dom0, right?  This
proposal doesn't require support of the system IOMMU, the dom0 driver
maps IOVA translations just as it would for itself.  We're largely
proposing use of the VFIO API to provide a common interface to expose a
PCI(e) device to QEMU, but what happens in the vGPU vendor device and
IOMMU backends is specific to the device and perhaps even specific to
the hypervisor.  Thanks,


Let me conclude this, and please correct me in case of any misread: the
vGPU interface between kernel and QEMU will be through VFIO, with a new
VFIO backend (instead of the existing type1), for both KVMGT and XenGT?


My primary concern is KVM and QEMU upstream, the proposal is not
specifically directed at XenGT, but does not exclude it either.  Xen is
welcome to adopt this proposal as well, it simply defines the channel
through which vGPUs are exposed to QEMU as the VFIO API.  The core VFIO
code in the Linux kernel is just as available for use in Xen dom0 as it
is for a KVM host. VFIO in QEMU certainly knows about some
accelerations for KVM, but these are almost entirely around allowing
eventfd based interrupts to be injected through KVM, which is something
I'm sure Xen could provide as well.  These accelerations are also not
required, VFIO based device assignment in QEMU works with or without
KVM.  Likewise, the VFIO kernel interface knows nothing about KVM and
has no dependencies on it.

There are two components to the VFIO API, one is the type1 compliant
IOMMU interface, which for this proposal is really doing nothing more
than tracking the HVA to GPA mappings for the VM.  This much seems
entirely common regardless of the hypervisor.  The other part is the
device interface.  The lifecycle of the virtual device seems like it
would be entirely shared, as does much of the emulation components of
the device.  When we get to pinning pages, providing direct access to
memory ranges for a VM, and accelerating interrupts, the vGPU drivers
will likely need some per hypervisor branches, but these are areas where
that's true no matter what the interface.  I'm probably over
simplifying, but hopefully not too much, correct me if I'm wrong.



Thanks for the confirmation. For QEMU/KVM I totally agree with your point;
however, if we take XenGT into consideration it gets a bit more complex: with
the Xen hypervisor and the Dom0 kernel running at different levels, it's not
straightforward for QEMU to do something like mapping a portion of an MMIO BAR
via VFIO in the Dom0 kernel instead of calling hypercalls directly.

I don't know if there is a better way to handle this. But I do agree that
a channel between the kernel and Qemu via VFIO is a good idea, even though we
may have to split KVMGT/XenGT in Qemu a bit.  We are currently working on
moving all of the PCI CFG emulation from the kernel to Qemu; hopefully we can
release it by the end of this year and work with you guys to adjust it to
the agreed method.



The benefit of course is that aside from some extensions to the API, the
QEMU components are already in place and there's a lot more leverage for
getting both QEMU and libvirt support upstream in being able to support
multiple vendors, perhaps multiple hypervisors, with the same code.
Also, I'm not sure how useful it is, but VFIO is a userspace driver
interface, where here we're predominantly talking about that userspace
driver being QEMU.  It's not limited to that though.  A userspace
compute application could have direct access to a vGPU through this
model.  Thanks,





Alex


--
Thanks,
Jike


Re: [Intel-gfx] [Announcement] 2015-Q3 release of XenGT - a Mediated Graphics Passthrough Solution from Intel

2015-11-19 Thread Alex Williamson
On Fri, 2015-11-20 at 10:58 +0800, Jike Song wrote:
> On 11/19/2015 11:52 PM, Alex Williamson wrote:
> > On Thu, 2015-11-19 at 15:32 +, Stefano Stabellini wrote:
> >> On Thu, 19 Nov 2015, Jike Song wrote:
> >>> Hi Alex, thanks for the discussion.
> >>>
> >>> In addition to Kevin's replies, I have a high-level question: can VFIO
> >>> be used by QEMU for both KVM and Xen?
> >>
> >> No. VFIO cannot be used with Xen today. When running on Xen, the IOMMU
> >> is owned by Xen.
> >
> > Right, but in this case we're talking about device MMUs, which are owned
> > by the device driver which I think is running in dom0, right?  This
> > proposal doesn't require support of the system IOMMU, the dom0 driver
> > maps IOVA translations just as it would for itself.  We're largely
> > proposing use of the VFIO API to provide a common interface to expose a
> > PCI(e) device to QEMU, but what happens in the vGPU vendor device and
> > IOMMU backends is specific to the device and perhaps even specific to
> > the hypervisor.  Thanks,
> 
> Let me conclude this, and please correct me in case of any misread: the
> vGPU interface between kernel and QEMU will be through VFIO, with a new
> VFIO backend (instead of the existing type1), for both KVMGT and XenGT?

My primary concern is KVM and QEMU upstream, the proposal is not
specifically directed at XenGT, but does not exclude it either.  Xen is
welcome to adopt this proposal as well, it simply defines the channel
through which vGPUs are exposed to QEMU as the VFIO API.  The core VFIO
code in the Linux kernel is just as available for use in Xen dom0 as it
is for a KVM host.  VFIO in QEMU certainly knows about some
accelerations for KVM, but these are almost entirely around allowing
eventfd based interrupts to be injected through KVM, which is something
I'm sure Xen could provide as well.  These accelerations are also not
required, VFIO based device assignment in QEMU works with or without
KVM.  Likewise, the VFIO kernel interface knows nothing about KVM and
has no dependencies on it.

There are two components to the VFIO API, one is the type1 compliant
IOMMU interface, which for this proposal is really doing nothing more
than tracking the HVA to GPA mappings for the VM.  This much seems
entirely common regardless of the hypervisor.  The other part is the
device interface.  The lifecycle of the virtual device seems like it
would be entirely shared, as does much of the emulation components of
the device.  When we get to pinning pages, providing direct access to
memory ranges for a VM, and accelerating interrupts, the vGPU drivers
will likely need some per hypervisor branches, but these are areas where
that's true no matter what the interface.  I'm probably over
simplifying, but hopefully not too much, correct me if I'm wrong.

The benefit of course is that aside from some extensions to the API, the
QEMU components are already in place and there's a lot more leverage for
getting both QEMU and libvirt support upstream in being able to support
multiple vendors, perhaps multiple hypervisors, with the same code.
Also, I'm not sure how useful it is, but VFIO is a userspace driver
interface, where here we're predominantly talking about that userspace
driver being QEMU.  It's not limited to that though.  A userspace
compute application could have direct access to a vGPU through this
model.  Thanks,

Alex



Re: [Intel-gfx] [Announcement] 2015-Q3 release of XenGT - a Mediated Graphics Passthrough Solution from Intel

2015-11-19 Thread Jike Song

On 11/19/2015 11:52 PM, Alex Williamson wrote:

On Thu, 2015-11-19 at 15:32 +, Stefano Stabellini wrote:

On Thu, 19 Nov 2015, Jike Song wrote:

Hi Alex, thanks for the discussion.

In addition to Kevin's replies, I have a high-level question: can VFIO
be used by QEMU for both KVM and Xen?


No. VFIO cannot be used with Xen today. When running on Xen, the IOMMU
is owned by Xen.


Right, but in this case we're talking about device MMUs, which are owned
by the device driver which I think is running in dom0, right?  This
proposal doesn't require support of the system IOMMU, the dom0 driver
maps IOVA translations just as it would for itself.  We're largely
proposing use of the VFIO API to provide a common interface to expose a
PCI(e) device to QEMU, but what happens in the vGPU vendor device and
IOMMU backends is specific to the device and perhaps even specific to
the hypervisor.  Thanks,


Let me conclude this, and please correct me in case of any misread: the
vGPU interface between kernel and QEMU will be through VFIO, with a new
VFIO backend (instead of the existing type1), for both KVMGT and XenGT?




Alex



--
Thanks,
Jike


Re: [Intel-gfx] [Announcement] 2015-Q3 release of XenGT - a Mediated Graphics Passthrough Solution from Intel

2015-11-19 Thread Jike Song

On 11/19/2015 07:09 PM, Paolo Bonzini wrote:

On 19/11/2015 09:40, Gerd Hoffmann wrote:

But this code should be
minor to be maintained in libvirt.

As far as I know libvirt only needs to discover those devices.  If they
look like sr/iov devices in sysfs this might work without any changes to
libvirt.


I don't think they will look like SR/IOV devices.

The interface may look a little like the sysfs interface that GVT-g is
already using.  However, it should at least be extended to support
multiple vGPUs in a single VM.  This might not be possible for Intel
integrated graphics, but it should definitely be possible for discrete
graphics cards.


I hadn't heard about multiple vGPUs for a single VM before. Yes, if we
expect the same vGPU interface across different vendors, the abstraction and
the vendor-specific pieces need to be implemented.



Another nit is that the VM id should probably be replaced by a UUID
(because it's too easy to stumble on an existing VM id), assuming a VM
id is needed at all.


For the last assumption, yes, a VM id is not necessary for gvt-g; it's
only a temporary implementation.

As long as libvirt is used, a UUID should be enough for gvt-g. However,
isn't a UUID optional? What should we do if the user doesn't specify a UUID
on the QEMU cmdline?



Paolo



--
Thanks,
Jike


Re: [Intel-gfx] [Announcement] 2015-Q3 release of XenGT - a Mediated Graphics Passthrough Solution from Intel

2015-11-19 Thread Alex Williamson
Hi Kevin,

On Thu, 2015-11-19 at 04:06 +, Tian, Kevin wrote:
> > From: Alex Williamson [mailto:alex.william...@redhat.com]
> > Sent: Thursday, November 19, 2015 2:12 AM
> > 
> > [cc +qemu-devel, +paolo, +gerd]
> > 
> > On Tue, 2015-10-27 at 17:25 +0800, Jike Song wrote:
> > > Hi all,
> > >
> > > We are pleased to announce another update of Intel GVT-g for Xen.
> > >
> > > Intel GVT-g is a full GPU virtualization solution with mediated
> > > pass-through, starting from 4th generation Intel Core(TM) processors
> > > with Intel Graphics processors. A virtual GPU instance is maintained
> > > for each VM, with part of performance critical resources directly
> > > assigned. The capability of running native graphics driver inside a
> > > VM, without hypervisor intervention in performance critical paths,
> > > achieves a good balance among performance, feature, and sharing
> > > capability. Xen is currently supported on Intel Processor Graphics
> > > (a.k.a. XenGT); and the core logic can be easily ported to other
> > > hypervisors.
> > >
> > >
> > > Repositories
> > >
> > >  Kernel: https://github.com/01org/igvtg-kernel (2015q3-3.18.0 branch)
> > >  Xen: https://github.com/01org/igvtg-xen (2015q3-4.5 branch)
> > >  Qemu: https://github.com/01org/igvtg-qemu (xengt_public2015q3 branch)
> > >
> > >
> > > This update consists of:
> > >
> > >  - XenGT is now merged with KVMGT in unified repositories(kernel and 
> > > qemu), but
> > currently
> > >different branches for qemu.  XenGT and KVMGT share same iGVT-g 
> > > core logic.
> > 
> > Hi!
> > 
> > At redhat we've been thinking about how to support vGPUs from multiple
> > vendors in a common way within QEMU.  We want to enable code sharing
> > between vendors and give new vendors an easy path to add their own
> > support.  We also have the complication that not all vGPU vendors are as
> > open source friendly as Intel, so being able to abstract the device
> > mediation and access outside of QEMU is a big advantage.
> > 
> > The proposal I'd like to make is that a vGPU, whether it is from Intel
> > or another vendor, is predominantly a PCI(e) device.  We have an
> > interface in QEMU already for exposing arbitrary PCI devices, vfio-pci.
> > Currently vfio-pci uses the VFIO API to interact with "physical" devices
> > and system IOMMUs.  I highlight /physical/ there because some of these
> > physical devices are SR-IOV VFs, which is somewhat of a fuzzy concept,
> > somewhere between fixed hardware and a virtual device implemented in
> > software.  That software just happens to be running on the physical
> > endpoint.
> 
> Agree. 
> 
> One clarification for the rest of the discussion: we're talking about GVT-g vGPU
> here, which is a pure software GPU virtualization technique. GVT-d (note
> some uses in the text) refers to passing through the whole GPU or a specific
> VF. GVT-d already fits into the existing VFIO APIs nicely (though there is some
> ongoing effort to remove Intel-specific platform stickiness from the gfx driver). :-)
> 
> > 
> > vGPUs are similar, with the virtual device created at a different point,
> > host software.  They also rely on different IOMMU constructs, making use
> > of the MMU capabilities of the GPU (GTTs and such), but really having
> > similar requirements.
> 
> One important difference between the system IOMMU and the GPU MMU here:
> the system IOMMU is very much about translation from a DMA target
> (IOVA on native, or GPA in the virtualization case) to HPA. The GPU's
> internal MMU, however, translates from a Graphics Memory Address (GMA)
> to a DMA target (HPA if the system IOMMU is disabled, or IOVA/GPA if the system
> IOMMU is enabled). GMA is an internal address space within the GPU, not
> exposed to Qemu and fully managed by the GVT-g device model. Since it's
> not a standard PCI-defined resource, we don't need to abstract this capability
> in the VFIO interface.
> 
> > 
> > The proposal is therefore that GPU vendors can expose vGPUs to
> > userspace, and thus to QEMU, using the VFIO API.  For instance, vfio
> > supports modular bus drivers and IOMMU drivers.  An intel-vfio-gvt-d
> > module (or extension of i915) can register as a vfio bus driver, create
> > a struct device per vGPU, create an IOMMU group for that device, and
> > register that device with the vfio-core.  Since we don't rely on the
> > system IOMMU for GVT-d vGPU assignment, another vGPU vendor driver (or
> > extension of the same module) can register a "type1" compliant IOMMU
> > driver into vfio-core.  From the perspective of QEMU then, all of the
> > existing vfio-pci code is re-used, QEMU remains largely unaware of any
> > specifics of the vGPU being assigned, and the only necessary change so
> > far is how QEMU traverses sysfs to find the device and thus the IOMMU
> > group leading to the vfio group.
> 
> GVT-g requires pinning guest memory and querying GPA->HPA information,
> based on which the shadow GTTs will be updated from (GMA->GPA)
> to (GMA->HPA). So yes, a dummy or simple "type1"-compliant IOMMU can be
> introduced just for this requirement.

Re: [Intel-gfx] [Announcement] 2015-Q3 release of XenGT - a Mediated Graphics Passthrough Solution from Intel

2015-11-19 Thread Stefano Stabellini
On Thu, 19 Nov 2015, Paolo Bonzini wrote:
> On 19/11/2015 16:32, Stefano Stabellini wrote:
> > > In addition to Kevin's replies, I have a high-level question: can VFIO
> > > be used by QEMU for both KVM and Xen?
> > 
> > No. VFIO cannot be used with Xen today. When running on Xen, the IOMMU
> > is owned by Xen.
> 
> I don't think QEMU command line compatibility between KVM and Xen should
> be a design goal for GVT-g.

Right, I agree.

In fact I don't want my comment to be taken as "VFIO should not be used
at all". I only meant to reply to the question. I think it is unlikely
to be the best path for Xen, but it could very well be the right answer
for KVM.


> Nevertheless, it shouldn't be a problem to use a "virtual" VFIO (which
> doesn't need the IOMMU, because it uses the MMU in the physical GPU)
> even under Xen.

That could be true, but I would expect some extra work to be needed to
make use of VFIO on Xen. Also it might cause some duplication of
functionalities with the current Xen passthrough code base.


Re: [Intel-gfx] [Announcement] 2015-Q3 release of XenGT - a Mediated Graphics Passthrough Solution from Intel

2015-11-19 Thread Alex Williamson
On Thu, 2015-11-19 at 15:32 +, Stefano Stabellini wrote:
> On Thu, 19 Nov 2015, Jike Song wrote:
> > Hi Alex, thanks for the discussion.
> > 
> > In addition to Kevin's replies, I have a high-level question: can VFIO
> > be used by QEMU for both KVM and Xen?
> 
> No. VFIO cannot be used with Xen today. When running on Xen, the IOMMU
> is owned by Xen.

Right, but in this case we're talking about device MMUs, which are owned
by the device driver which I think is running in dom0, right?  This
proposal doesn't require support of the system IOMMU, the dom0 driver
maps IOVA translations just as it would for itself.  We're largely
proposing use of the VFIO API to provide a common interface to expose a
PCI(e) device to QEMU, but what happens in the vGPU vendor device and
IOMMU backends is specific to the device and perhaps even specific to
the hypervisor.  Thanks,

Alex



Re: [Intel-gfx] [Announcement] 2015-Q3 release of XenGT - a Mediated Graphics Passthrough Solution from Intel

2015-11-19 Thread Paolo Bonzini


On 19/11/2015 16:32, Stefano Stabellini wrote:
> > In addition to Kevin's replies, I have a high-level question: can VFIO
> > be used by QEMU for both KVM and Xen?
> 
> No. VFIO cannot be used with Xen today. When running on Xen, the IOMMU
> is owned by Xen.

I don't think QEMU command line compatibility between KVM and Xen should
be a design goal for GVT-g.

Nevertheless, it shouldn't be a problem to use a "virtual" VFIO (which
doesn't need the IOMMU, because it uses the MMU in the physical GPU)
even under Xen.

Paolo


Re: [Intel-gfx] [Announcement] 2015-Q3 release of XenGT - a Mediated Graphics Passthrough Solution from Intel

2015-11-19 Thread Stefano Stabellini
On Thu, 19 Nov 2015, Jike Song wrote:
> Hi Alex, thanks for the discussion.
> 
> In addition to Kevin's replies, I have a high-level question: can VFIO
> be used by QEMU for both KVM and Xen?

No. VFIO cannot be used with Xen today. When running on Xen, the IOMMU
is owned by Xen.


Re: [Intel-gfx] [Announcement] 2015-Q3 release of XenGT - a Mediated Graphics Passthrough Solution from Intel

2015-11-19 Thread Paolo Bonzini


On 19/11/2015 09:40, Gerd Hoffmann wrote:
>> > But this code should be
>> > minor to be maintained in libvirt.
> As far as I know libvirt only needs to discover those devices.  If they
> look like sr/iov devices in sysfs this might work without any changes to
> libvirt.

I don't think they will look like SR/IOV devices.

The interface may look a little like the sysfs interface that GVT-g is
already using.  However, it should at least be extended to support
multiple vGPUs in a single VM.  This might not be possible for Intel
integrated graphics, but it should definitely be possible for discrete
graphics cards.

Another nit is that the VM id should probably be replaced by a UUID
(because it's too easy to stumble on an existing VM id), assuming a VM
id is needed at all.

Paolo


Re: [Intel-gfx] [Announcement] 2015-Q3 release of XenGT - a Mediated Graphics Passthrough Solution from Intel

2015-11-19 Thread Gerd Hoffmann
  Hi,

> > Another area of extension is how to expose a framebuffer to QEMU for
> > seamless integration into a SPICE/VNC channel.  For this I believe we
> > could use a new region, much like we've done to expose VGA access
> > through a vfio device file descriptor.  An area within this new
> > framebuffer region could be directly mappable in QEMU while a
> > non-mappable page, at a standard location with standardized format,
> > provides a description of framebuffer and potentially even a
> > communication channel to synchronize framebuffer captures.  This would
> > be new code for QEMU, but something we could share among all vGPU
> > implementations.
> 
> Now GVT-g already provides an interface to decode framebuffer information,
> w/ an assumption that the framebuffer will be further composited into 
> OpenGL APIs.

Can I have a pointer to docs / code?

iGVT-g_Setup_Guide.txt mentions a "Indirect Display Mode", but doesn't
explain how the guest framebuffer can be accessed then.

> So the format is defined according to OpenGL definition.
> Does that meet SPICE requirement?

Yes and no ;)

Some more background:  We basically have two rendering paths in qemu.
The classic one, without opengl, and a new, still emerging one, using
opengl and dma-bufs (gtk support merged for qemu 2.5, sdl2 support will
land in 2.6, spice support still WIP, hopefully 2.6 too).  For best
performance you probably want to use the new opengl-based rendering
whenever possible.  However I do *not* expect the classic rendering path
to disappear; we'll continue to need it in various cases, the most prominent
one being vnc support.

So, for non-opengl rendering qemu needs the guest framebuffer data so it
can feed it into the vnc server.  The vfio framebuffer region is meant
to support this use case.

> Another thing to be added. Framebuffers are frequently switched in
> reality. So either Qemu needs to poll or a notification mechanism is required.

The idea is to have qemu poll (and adapt the poll rate, i.e. without a vnc
client connected qemu will poll a lot less frequently).
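
A rough sketch of what such adaptive polling could look like on the qemu side,
assuming qemu's timer API; vgpu_refresh_display() and display_has_client() are
hypothetical placeholders for whatever the vgpu display code ends up providing:

#include <stdbool.h>
#include "qemu/timer.h"

#define FB_POLL_FAST_MS   33    /* ~30 Hz while a client is connected   */
#define FB_POLL_SLOW_MS 1000    /* relaxed rate with no client attached */

bool display_has_client(void);           /* hypothetical */
void vgpu_refresh_display(void *opaque); /* hypothetical: capture guest fb */

static QEMUTimer *fb_timer;

static void fb_poll(void *opaque)
{
    int interval = display_has_client() ? FB_POLL_FAST_MS : FB_POLL_SLOW_MS;

    vgpu_refresh_display(opaque);
    timer_mod(fb_timer, qemu_clock_get_ms(QEMU_CLOCK_REALTIME) + interval);
}

/* called once at device init */
static void fb_poll_start(void *opaque)
{
    fb_timer = timer_new_ms(QEMU_CLOCK_REALTIME, fb_poll, opaque);
    timer_mod(fb_timer, qemu_clock_get_ms(QEMU_CLOCK_REALTIME) + FB_POLL_SLOW_MS);
}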

> And since it's dynamic, having framebuffer page directly exposed in the
> new region might be tricky.  We can just expose framebuffer information
> (including base, format, etc.) and let Qemu to map separately out of VFIO
> interface.

Allocate some memory, ask gpu to blit the guest framebuffer there, i.e.
provide a snapshot of the current guest display instead of playing
mapping tricks?

> And... this works fine with vGPU model since software knows all the
> detail about framebuffer. However in pass-through case, who do you expect
> to provide that information? Is it OK to introduce vGPU specific APIs in
> VFIO?

It will only be used in the vgpu case, not for pass-though.

We think it is better to extend the vfio interface to improve vgpu
support rather than inventing something new while vfio can satisfy 90%
of the vgpu needs already.  We want to avoid vendor-specific extensions
though; the vgpu extension should work across vendors.

> Now there is no standard. We expose vGPU life-cycle mgmt. APIs through
> sysfs (under i915 node), which is very Intel specific. In reality different
> vendors have quite different capabilities for their own vGPUs, so not sure
> how standard we can define such a mechanism.

Agree when it comes to create vGPU instances.

> But this code should be
> minor to be maintained in libvirt.

As far as I know libvirt only needs to discover those devices.  If they
look like sr/iov devices in sysfs this might work without any changes to
libvirt.

cheers,
  Gerd




Re: [Intel-gfx] [Announcement] 2015-Q3 release of XenGT - a Mediated Graphics Passthrough Solution from Intel

2015-11-18 Thread Jike Song

Hi Alex,
On 11/19/2015 12:06 PM, Tian, Kevin wrote:

From: Alex Williamson [mailto:alex.william...@redhat.com]
Sent: Thursday, November 19, 2015 2:12 AM

[cc +qemu-devel, +paolo, +gerd]

On Tue, 2015-10-27 at 17:25 +0800, Jike Song wrote:

{snip}


Hi!

At redhat we've been thinking about how to support vGPUs from multiple
vendors in a common way within QEMU.  We want to enable code sharing
between vendors and give new vendors an easy path to add their own
support.  We also have the complication that not all vGPU vendors are as
open source friendly as Intel, so being able to abstract the device
mediation and access outside of QEMU is a big advantage.

The proposal I'd like to make is that a vGPU, whether it is from Intel
or another vendor, is predominantly a PCI(e) device.  We have an
interface in QEMU already for exposing arbitrary PCI devices, vfio-pci.
Currently vfio-pci uses the VFIO API to interact with "physical" devices
and system IOMMUs.  I highlight /physical/ there because some of these
physical devices are SR-IOV VFs, which is somewhat of a fuzzy concept,
somewhere between fixed hardware and a virtual device implemented in
software.  That software just happens to be running on the physical
endpoint.


Agree.

One clarification for the rest of the discussion: we're talking about GVT-g vGPU
here, which is a pure software GPU virtualization technique. GVT-d (note
some uses in the text) refers to passing through the whole GPU or a specific
VF. GVT-d already fits into the existing VFIO APIs nicely (though there is some
ongoing effort to remove Intel-specific platform stickiness from the gfx driver). :-)



Hi Alex, thanks for the discussion.

In addition to Kevin's replies, I have a high-level question: can VFIO
be used by QEMU for both KVM and Xen?

--
Thanks,
Jike

 


vGPUs are similar, with the virtual device created at a different point,
host software.  They also rely on different IOMMU constructs, making use
of the MMU capabilities of the GPU (GTTs and such), but really having
similar requirements.


One important difference between the system IOMMU and the GPU MMU here:
the system IOMMU is very much about translation from a DMA target
(IOVA on native, or GPA in the virtualization case) to HPA. The GPU's
internal MMU, however, translates from a Graphics Memory Address (GMA)
to a DMA target (HPA if the system IOMMU is disabled, or IOVA/GPA if the system
IOMMU is enabled). GMA is an internal address space within the GPU, not
exposed to Qemu and fully managed by the GVT-g device model. Since it's
not a standard PCI-defined resource, we don't need to abstract this capability
in the VFIO interface.



The proposal is therefore that GPU vendors can expose vGPUs to
userspace, and thus to QEMU, using the VFIO API.  For instance, vfio
supports modular bus drivers and IOMMU drivers.  An intel-vfio-gvt-d
module (or extension of i915) can register as a vfio bus driver, create
a struct device per vGPU, create an IOMMU group for that device, and
register that device with the vfio-core.  Since we don't rely on the
system IOMMU for GVT-d vGPU assignment, another vGPU vendor driver (or
extension of the same module) can register a "type1" compliant IOMMU
driver into vfio-core.  From the perspective of QEMU then, all of the
existing vfio-pci code is re-used, QEMU remains largely unaware of any
specifics of the vGPU being assigned, and the only necessary change so
far is how QEMU traverses sysfs to find the device and thus the IOMMU
group leading to the vfio group.


GVT-g requires pinning guest memory and querying GPA->HPA information,
based on which the shadow GTTs will be updated from (GMA->GPA)
to (GMA->HPA). So yes, a dummy or simple "type1"-compliant IOMMU
can be introduced just for this requirement.

However, there's one tricky point where I'm not sure whether the overall
VFIO concept would be violated. GVT-g doesn't require the system IOMMU
to function; however, the host system may enable the system IOMMU just for
hardening purposes. This means two levels of translation exist (GMA->
IOVA->HPA), so the dummy IOMMU driver has to ask the system IOMMU
driver to allocate IOVA for the VMs and then set up the IOVA->HPA mapping
in the IOMMU page table. In this case, multiple VMs' translations are
multiplexed in one IOMMU page table.

We might need to create some group/sub-group or parent/child concepts
among those IOMMUs for thorough permission control.



There are a few areas where we know we'll need to extend the VFIO API to
make this work, but it seems like they can all be done generically.  One
is that PCI BARs are described through the VFIO API as regions and each
region has a single flag describing whether mmap (ie. direct mapping) of
that region is possible.  We expect that vGPUs likely need finer
granularity, enabling some areas within a BAR to be trapped and forwarded
as a read or write access for the vGPU-vfio-device module to emulate,
while other regions, like framebuffers or texture regions, are directly
mapped.  I have prototype code to enable this already.


Yes, in GVT-g one BAR resource might be partitioned among multiple vGPUs.

RE: [Intel-gfx] [Announcement] 2015-Q3 release of XenGT - a Mediated Graphics Passthrough Solution from Intel

2015-11-18 Thread Tian, Kevin
> From: Alex Williamson [mailto:alex.william...@redhat.com]
> Sent: Thursday, November 19, 2015 2:12 AM
> 
> [cc +qemu-devel, +paolo, +gerd]
> 
> On Tue, 2015-10-27 at 17:25 +0800, Jike Song wrote:
> > Hi all,
> >
> > We are pleased to announce another update of Intel GVT-g for Xen.
> >
> > Intel GVT-g is a full GPU virtualization solution with mediated
> > pass-through, starting from 4th generation Intel Core(TM) processors
> > with Intel Graphics processors. A virtual GPU instance is maintained
> > for each VM, with part of performance critical resources directly
> > assigned. The capability of running native graphics driver inside a
> > VM, without hypervisor intervention in performance critical paths,
> > achieves a good balance among performance, feature, and sharing
> > capability. Xen is currently supported on Intel Processor Graphics
> > (a.k.a. XenGT); and the core logic can be easily ported to other
> > hypervisors.
> >
> >
> > Repositories
> >
> >  Kernel: https://github.com/01org/igvtg-kernel (2015q3-3.18.0 branch)
> >  Xen: https://github.com/01org/igvtg-xen (2015q3-4.5 branch)
> >  Qemu: https://github.com/01org/igvtg-qemu (xengt_public2015q3 branch)
> >
> >
> > This update consists of:
> >
> >  - XenGT is now merged with KVMGT in unified repositories(kernel and 
> > qemu), but
> currently
> >different branches for qemu.  XenGT and KVMGT share same iGVT-g core 
> > logic.
> 
> Hi!
> 
> At redhat we've been thinking about how to support vGPUs from multiple
> vendors in a common way within QEMU.  We want to enable code sharing
> between vendors and give new vendors an easy path to add their own
> support.  We also have the complication that not all vGPU vendors are as
> open source friendly as Intel, so being able to abstract the device
> mediation and access outside of QEMU is a big advantage.
> 
> The proposal I'd like to make is that a vGPU, whether it is from Intel
> or another vendor, is predominantly a PCI(e) device.  We have an
> interface in QEMU already for exposing arbitrary PCI devices, vfio-pci.
> Currently vfio-pci uses the VFIO API to interact with "physical" devices
> and system IOMMUs.  I highlight /physical/ there because some of these
> physical devices are SR-IOV VFs, which is somewhat of a fuzzy concept,
> somewhere between fixed hardware and a virtual device implemented in
> software.  That software just happens to be running on the physical
> endpoint.

Agree. 

One clarification for the rest of the discussion: we're talking about GVT-g vGPU
here, which is a pure software GPU virtualization technique. GVT-d (note
some uses in the text) refers to passing through the whole GPU or a specific
VF. GVT-d already fits into the existing VFIO APIs nicely (though there is some
ongoing effort to remove Intel-specific platform stickiness from the gfx driver). :-)

> 
> vGPUs are similar, with the virtual device created at a different point,
> host software.  They also rely on different IOMMU constructs, making use
> of the MMU capabilities of the GPU (GTTs and such), but really having
> similar requirements.

One important difference between the system IOMMU and the GPU MMU here:
the system IOMMU is very much about translation from a DMA target
(IOVA on native, or GPA in the virtualization case) to HPA. The GPU's
internal MMU, however, translates from a Graphics Memory Address (GMA)
to a DMA target (HPA if the system IOMMU is disabled, or IOVA/GPA if the system
IOMMU is enabled). GMA is an internal address space within the GPU, not
exposed to Qemu and fully managed by the GVT-g device model. Since it's
not a standard PCI-defined resource, we don't need to abstract this capability
in the VFIO interface.
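
Purely as a conceptual illustration of the two translation steps involved (the
guest GTT gives GMA->GPA, and the host-side GPA->HPA lookup feeds the shadow
GTT update mentioned further down), here is a sketch with hypothetical helper
names; the real work happens entirely inside the GVT-g device model:

#include <stdint.h>

typedef uint64_t gma_t;   /* graphics memory address (GPU-internal) */
typedef uint64_t gpa_t;   /* guest physical address                 */
typedef uint64_t hpa_t;   /* host physical address (or IOVA)        */

/* Hypothetical helpers, for illustration only. */
gpa_t guest_gtt_read(gma_t gma);              /* guest GTT entry: GMA -> GPA */
hpa_t gpa_to_hpa(gpa_t gpa);                  /* pinned mapping:  GPA -> HPA */
void  shadow_gtt_write(gma_t gma, hpa_t hpa); /* entry used by the hardware  */

static void shadow_gtt_update(gma_t gma)
{
    gpa_t gpa = guest_gtt_read(gma);  /* what the guest driver programmed */
    hpa_t hpa = gpa_to_hpa(gpa);      /* host-side translation            */

    shadow_gtt_write(gma, hpa);
}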

> 
> The proposal is therefore that GPU vendors can expose vGPUs to
> userspace, and thus to QEMU, using the VFIO API.  For instance, vfio
> supports modular bus drivers and IOMMU drivers.  An intel-vfio-gvt-d
> module (or extension of i915) can register as a vfio bus driver, create
> a struct device per vGPU, create an IOMMU group for that device, and
> register that device with the vfio-core.  Since we don't rely on the
> system IOMMU for GVT-d vGPU assignment, another vGPU vendor driver (or
> extension of the same module) can register a "type1" compliant IOMMU
> driver into vfio-core.  From the perspective of QEMU then, all of the
> existing vfio-pci code is re-used, QEMU remains largely unaware of any
> specifics of the vGPU being assigned, and the only necessary change so
> far is how QEMU traverses sysfs to find the device and thus the IOMMU
> group leading to the vfio group.

GVT-g requires pinning guest memory and querying GPA->HPA information,
based on which the shadow GTTs will be updated from (GMA->GPA)
to (GMA->HPA). So yes, a dummy or simple "type1"-compliant IOMMU
can be introduced just for this requirement.

However, there's one tricky point where I'm not sure whether the overall
VFIO concept would be violated. GVT-g doesn't require the system IOMMU
to function; however, the host system may enable the system IOMMU just for
hardening purposes.

Re: [Intel-gfx] [Announcement] 2015-Q3 release of XenGT - a Mediated Graphics Passthrough Solution from Intel

2015-11-18 Thread Alex Williamson
[cc +qemu-devel, +paolo, +gerd]

On Tue, 2015-10-27 at 17:25 +0800, Jike Song wrote:
> Hi all,
> 
> We are pleased to announce another update of Intel GVT-g for Xen.
> 
> Intel GVT-g is a full GPU virtualization solution with mediated
> pass-through, starting from 4th generation Intel Core(TM) processors
> with Intel Graphics processors. A virtual GPU instance is maintained
> for each VM, with part of performance critical resources directly
> assigned. The capability of running native graphics driver inside a
> VM, without hypervisor intervention in performance critical paths,
> achieves a good balance among performance, feature, and sharing
> capability. Xen is currently supported on Intel Processor Graphics
> (a.k.a. XenGT); and the core logic can be easily ported to other
> hypervisors.
> 
> 
> Repositories
> 
>  Kernel: https://github.com/01org/igvtg-kernel (2015q3-3.18.0 branch)
>  Xen: https://github.com/01org/igvtg-xen (2015q3-4.5 branch)
>  Qemu: https://github.com/01org/igvtg-qemu (xengt_public2015q3 branch)
> 
> 
> This update consists of:
> 
>  - XenGT is now merged with KVMGT in unified repositories(kernel and 
> qemu), but currently
>different branches for qemu.  XenGT and KVMGT share same iGVT-g core 
> logic.

Hi!

At redhat we've been thinking about how to support vGPUs from multiple
vendors in a common way within QEMU.  We want to enable code sharing
between vendors and give new vendors an easy path to add their own
support.  We also have the complication that not all vGPU vendors are as
open source friendly as Intel, so being able to abstract the device
mediation and access outside of QEMU is a big advantage.

The proposal I'd like to make is that a vGPU, whether it is from Intel
or another vendor, is predominantly a PCI(e) device.  We have an
interface in QEMU already for exposing arbitrary PCI devices, vfio-pci.
Currently vfio-pci uses the VFIO API to interact with "physical" devices
and system IOMMUs.  I highlight /physical/ there because some of these
physical devices are SR-IOV VFs, which is somewhat of a fuzzy concept,
somewhere between fixed hardware and a virtual device implemented in
software.  That software just happens to be running on the physical
endpoint.

vGPUs are similar, with the virtual device created at a different point,
host software.  They also rely on different IOMMU constructs, making use
of the MMU capabilities of the GPU (GTTs and such), but really having
similar requirements.

The proposal is therefore that GPU vendors can expose vGPUs to
userspace, and thus to QEMU, using the VFIO API.  For instance, vfio
supports modular bus drivers and IOMMU drivers.  An intel-vfio-gvt-d
module (or extension of i915) can register as a vfio bus driver, create
a struct device per vGPU, create an IOMMU group for that device, and
register that device with the vfio-core.  Since we don't rely on the
system IOMMU for GVT-d vGPU assignment, another vGPU vendor driver (or
extension of the same module) can register a "type1" compliant IOMMU
driver into vfio-core.  From the perspective of QEMU then, all of the
existing vfio-pci code is re-used, QEMU remains largely unaware of any
specifics of the vGPU being assigned, and the only necessary change so
far is how QEMU traverses sysfs to find the device and thus the IOMMU
group leading to the vfio group.

There are a few areas where we know we'll need to extend the VFIO API to
make this work, but it seems like they can all be done generically.  One
is that PCI BARs are described through the VFIO API as regions and each
region has a single flag describing whether mmap (ie. direct mapping) of
that region is possible.  We expect that vGPUs likely need finer
granularity, enabling some areas within a BAR to be trapped and forwarded
as a read or write access for the vGPU-vfio-device module to emulate,
while other regions, like framebuffers or texture regions, are directly
mapped.  I have prototype code to enable this already.
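
Just to illustrate the idea (this is not an existing interface, only a sketch
of one possible shape for it): the region info could carry a capability that
lists which sub-ranges of a BAR may be mmap'ed, with everything else trapped
and forwarded to the vGPU module for emulation.

/* Hypothetical sketch only, not a proposal of the final API. */
struct vfio_region_sparse_mmap_area {
    __u64 offset;   /* offset of the mmap-able area within the region */
    __u64 size;     /* size of the mmap-able area */
};

struct vfio_region_info_cap_sparse_mmap {
    __u32 nr_areas; /* number of mmap-able sub-areas */
    __u32 reserved;
    struct vfio_region_sparse_mmap_area areas[];
};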

Another area is that we really don't want to proliferate each vGPU
needing a new IOMMU type within vfio.  The existing type1 IOMMU provides
potentially the most simple mapping and unmapping interface possible.
We'd therefore need to allow multiple "type1" IOMMU drivers for vfio,
making type1 be more of an interface specification rather than a single
implementation.  This is a trivial change to make within vfio and one
that I believe is compatible with the existing API.  Note that
implementing a type1-compliant vfio IOMMU does not imply pinning and
mapping every registered page.  A vGPU, with mediated device access, may
use this only to track the current HVA to GPA mappings for a VM.  Only
when a DMA is enabled for the vGPU instance is that HVA pinned and an
HPA to GPA translation programmed into the GPU MMU.
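
To sketch what "multiple type1 implementations" could look like on the kernel
side (assuming the existing vfio_register_iommu_driver() hook; all callback
bodies here are placeholders), a vGPU vendor module would register its own
backend that speaks the type1 ioctls but only tracks the registered ranges,
pinning pages later when the device model actually enables DMA:

#include <linux/module.h>
#include <linux/vfio.h>

static void *vgpu_iommu_open(unsigned long arg)
{
    return NULL;                 /* placeholder: allocate per-container state */
}

static void vgpu_iommu_release(void *iommu_data)
{
}

static long vgpu_iommu_ioctl(void *iommu_data, unsigned int cmd,
                             unsigned long arg)
{
    /* Handle VFIO_IOMMU_MAP_DMA / VFIO_IOMMU_UNMAP_DMA by recording the
     * HVA<->GPA ranges instead of pinning them immediately. */
    return -ENOTTY;              /* placeholder */
}

static int vgpu_iommu_attach_group(void *iommu_data, struct iommu_group *group)
{
    return 0;                    /* placeholder */
}

static void vgpu_iommu_detach_group(void *iommu_data, struct iommu_group *group)
{
}

static const struct vfio_iommu_driver_ops vgpu_iommu_ops = {
    .name         = "vgpu-type1",
    .owner        = THIS_MODULE,
    .open         = vgpu_iommu_open,
    .release      = vgpu_iommu_release,
    .ioctl        = vgpu_iommu_ioctl,
    .attach_group = vgpu_iommu_attach_group,
    .detach_group = vgpu_iommu_detach_group,
};

static int __init vgpu_iommu_init(void)
{
    return vfio_register_iommu_driver(&vgpu_iommu_ops);
}
module_init(vgpu_iommu_init);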

Another area of extension is how to expose a framebuffer to QEMU for
seamless integration into a SPICE/VNC channel.  For this I believe we
could use a new region, much like we've done to expose V

Re: [Intel-gfx] [Announcement] 2015-Q3 release of XenGT - a Mediated Graphics Passthrough Solution from Intel

2015-10-27 Thread Jike Song

Hi all,

We are pleased to announce another update of Intel GVT-g for Xen.

Intel GVT-g is a full GPU virtualization solution with mediated pass-through, 
starting from 4th generation Intel Core(TM) processors with Intel Graphics 
processors. A virtual GPU instance is maintained for each VM, with part of 
performance critical resources directly assigned. The capability of running 
native graphics driver inside a VM, without hypervisor intervention in 
performance critical paths, achieves a good balance among performance, feature, 
and sharing capability. Xen is currently supported on Intel Processor Graphics 
(a.k.a. XenGT); and the core logic can be easily ported to other hypervisors.


Repositories

Kernel: https://github.com/01org/igvtg-kernel (2015q3-3.18.0 branch)
Xen: https://github.com/01org/igvtg-xen (2015q3-4.5 branch)
Qemu: https://github.com/01org/igvtg-qemu (xengt_public2015q3 branch)


This update consists of:

- XenGT is now merged with KVMGT in unified repositories (kernel and qemu), but
  currently with different branches for qemu.  XenGT and KVMGT share the same
  iGVT-g core logic.
- fix sysfs/debugfs access seldom crash issue
- fix a BUG in XenGT I/O emulation logic
- improve 3d workload stability

Next update will be around early Jan, 2016.


Known issues:

- At least 2GB of memory is suggested for the VM to run most 3D workloads.
- The keymap might be incorrect in the guest. The config file may need to explicitly
specify "keymap='en-us'". Although it looks like the default value, we earlier saw
wrong keymap codes when it is not explicitly set.
- When using three monitors, hotplugging between guest pause/unpause may not
light up all monitors automatically. There are some monitor-specific issues.
- The mouse pointer cannot be moved smoothly in the guest when launched in VNC mode
by default. The configuration file needs to explicitly specify "usb=1" to enable a
USB bus, and "usbdevice='tablet'" to add a pointer device using absolute coordinates.
- Resuming dom0 from S3 may cause some error messages.
- i915 unload/reload does not work well with fewer than 3 vcpus when the upowerd
service is running.
- Running Unigine Tropics in multiple guests will cause dom0 and the guests to TDR.


Please subscribe the mailing list to report BUGs, discuss, and/or contribute:

https://lists.01.org/mailman/listinfo/igvt-g


More information about Intel GVT-g background, architecture, etc. can be found 
at (may not be up-to-date):

https://01.org/igvt-g
https://www.usenix.org/conference/atc14/technical-sessions/presentation/tian

http://events.linuxfoundation.org/sites/events/files/slides/XenGT-Xen%20Summit-v7_0.pdf

http://events.linuxfoundation.org/sites/events/files/slides/XenGT-Xen%20Summit-REWRITE%203RD%20v4.pdf
https://01.org/xen/blogs/srclarkx/2013/graphics-virtualization-xengt


Note:

   The XenGT project should be considered a work in progress. As such it is not 
a complete product nor should it be considered one. Extra care should be taken 
when testing and configuring a system to use the XenGT project.


--
Thanks,
Jike

On 07/07/2015 10:49 AM, Jike Song wrote:

Hi all,

We're pleased to announce a public update to Intel Graphics Virtualization 
Technology(Intel GVT-g, formerly known as XenGT).

Intel GVT-g is a full GPU virtualization solution with mediated pass-through, 
starting from 4th generation Intel Core(TM) processors with Intel Graphics 
processors. A virtual GPU instance is maintained for each VM, with part of 
performance critical resources directly assigned. The capability of running 
native graphics driver inside a VM, without hypervisor intervention in 
performance critical paths, achieves a good balance among performance, feature, 
and sharing capability. Xen is currently supported on Intel Processor Graphics 
(a.k.a. XenGT); and the core logic can be easily ported to other hypervisors, 
for example, the experimental code has been released to support GVT-g running 
on a KVM hypervisor (a.k.a KVMGT).

Tip of repositories
-

  Kernel: 5b73653d5ca, Branch: master-2015Q2-3.18.0
  Qemu: 2a75bbff62c1, Branch: master
  Xen: 38c36f0f511b1, Branch: master-2015Q2-4.5

This update consists of:
  - Change the time-based scheduler timer to be configurable, to enhance
stability
  - Fix stability issues where VM/Dom0 got a TDR when hanging at specific
instructions on BDW
  - Optimize the emulation of the el_status register to enhance stability
  - 2D/3D performance in linux VMs has been improved by about 50% on BDW
  - Fix an abnormal idle power consumption issue due to a wrong forcewake policy
  - Fix a TDR issue when running 2D/3D/Media workloads in Windows VMs
simultaneously
  - KVM support is still in a separate branch as prototype work. We plan to
integrate KVM/Xen support together in future releases
  - Next update will be around early Oct, 2015

Notice that this release can support both Intel 4th ge