Re: [PATCH V4 5/5] vdpasim: vDPA device simulator

2020-02-20 Thread Jason Wang


On 2020/2/20 下午11:12, Jason Gunthorpe wrote:

On Thu, Feb 20, 2020 at 02:11:41PM +0800, Jason Wang wrote:

+static void vdpasim_device_release(struct device *dev)
+{
+   struct vdpasim *vdpasim = dev_to_sim(dev);
+
+   cancel_work_sync(>work);
+   kfree(vdpasim->buffer);
+   vhost_iotlb_free(vdpasim->iommu);
+   kfree(vdpasim);
+}
+
+static struct vdpasim *vdpasim_create(void)
+{
+   struct virtio_net_config *config;
+   struct vhost_iotlb *iommu;
+   struct vdpasim *vdpasim;
+   struct device *dev;
+   void *buffer;
+   int ret = -ENOMEM;
+
+   iommu = vhost_iotlb_alloc(2048, 0);
+   if (!iommu)
+   goto err;
+
+   buffer = kmalloc(PAGE_SIZE, GFP_KERNEL);
+   if (!buffer)
+   goto err_buffer;
+
+   vdpasim = kzalloc(sizeof(*vdpasim), GFP_KERNEL);
+   if (!vdpasim)
+   goto err_alloc;
+
+   vdpasim->buffer = buffer;
+   vdpasim->iommu = iommu;
+
+   config = >config;
+   config->mtu = 1500;
+   config->status = VIRTIO_NET_S_LINK_UP;
+   eth_random_addr(config->mac);
+
+   INIT_WORK(>work, vdpasim_work);
+   spin_lock_init(>lock);
+
+   vringh_set_iotlb(>vqs[0].vring, vdpasim->iommu);
+   vringh_set_iotlb(>vqs[1].vring, vdpasim->iommu);
+
+   dev = >dev;
+   dev->release = vdpasim_device_release;
+   dev->coherent_dma_mask = DMA_BIT_MASK(64);
+   set_dma_ops(dev, _dma_ops);
+   dev_set_name(dev, "%s", VDPASIM_NAME);
+
+   ret = device_register(>dev);
+   if (ret)
+   goto err_init;

It is a bit weird to be creating this dummy parent, couldn't this be
done by just passing a NULL parent to vdpa_alloc_device, doing
set_dma_ops() on the vdpasim->vdpa->dev and setting dma_device to
vdpasim->vdpa->dev ?



I think it works.



+   vdpasim->vdpa = vdpa_alloc_device(dev, dev, _net_config_ops);
+   if (ret)
+   goto err_vdpa;
+   ret = vdpa_register_device(vdpasim->vdpa);
+   if (ret)
+   goto err_register;
+
+   return vdpasim;
+
+err_register:
+   put_device(>vdpa->dev);
+err_vdpa:
+   device_del(>dev);
+   goto err;
+err_init:
+   put_device(>dev);
+   goto err;

If you do the vdmasim alloc first, and immediately do
device_initialize() then all the failure paths can do put_device
instead of having this ugly goto unwind split. Just check for
vdpasim->iommu == NULL during release.



Yes, that looks simpler.





+static int __init vdpasim_dev_init(void)
+{
+   vdpasim_dev = vdpasim_create();
+
+   if (!IS_ERR(vdpasim_dev))
+   return 0;
+
+   return PTR_ERR(vdpasim_dev);
+}
+
+static int vdpasim_device_remove_cb(struct device *dev, void *data)
+{
+   struct vdpa_device *vdpa = dev_to_vdpa(dev);
+
+   vdpa_unregister_device(vdpa);
+
+   return 0;
+}
+
+static void __exit vdpasim_dev_exit(void)
+{
+   device_for_each_child(_dev->dev, NULL,
+ vdpasim_device_remove_cb);

Why the loop? There is only one device, and it is in the global
varaible vdmasim_dev ?



Not necessary but doesn't harm, will remove this.

Thanks




Jason



___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH V4 3/5] vDPA: introduce vDPA bus

2020-02-20 Thread Jason Wang


On 2020/2/20 下午11:14, Jason Gunthorpe wrote:

On Thu, Feb 20, 2020 at 02:11:39PM +0800, Jason Wang wrote:

vDPA device is a device that uses a datapath which complies with the
virtio specifications with vendor specific control path. vDPA devices
can be both physically located on the hardware or emulated by
software. vDPA hardware devices are usually implemented through PCIE
with the following types:

- PF (Physical Function) - A single Physical Function
- VF (Virtual Function) - Device that supports single root I/O
   virtualization (SR-IOV). Its Virtual Function (VF) represents a
   virtualized instance of the device that can be assigned to different
   partitions
- ADI (Assignable Device Interface) and its equivalents - With
   technologies such as Intel Scalable IOV, a virtual device (VDEV)
   composed by host OS utilizing one or more ADIs. Or its equivalent
   like SF (Sub function) from Mellanox.

 From a driver's perspective, depends on how and where the DMA
translation is done, vDPA devices are split into two types:

- Platform specific DMA translation - From the driver's perspective,
   the device can be used on a platform where device access to data in
   memory is limited and/or translated. An example is a PCIE vDPA whose
   DMA request was tagged via a bus (e.g PCIE) specific way. DMA
   translation and protection are done at PCIE bus IOMMU level.
- Device specific DMA translation - The device implements DMA
   isolation and protection through its own logic. An example is a vDPA
   device which uses on-chip IOMMU.

To hide the differences and complexity of the above types for a vDPA
device/IOMMU options and in order to present a generic virtio device
to the upper layer, a device agnostic framework is required.

This patch introduces a software vDPA bus which abstracts the
common attributes of vDPA device, vDPA bus driver and the
communication method (vdpa_config_ops) between the vDPA device
abstraction and the vDPA bus driver. This allows multiple types of
drivers to be used for vDPA device like the virtio_vdpa and vhost_vdpa
driver to operate on the bus and allow vDPA device could be used by
either kernel virtio driver or userspace vhost drivers as:

virtio drivers  vhost drivers
   | |
 [virtio bus]   [vhost uAPI]
   | |
virtio device   vhost device
virtio_vdpa drv vhost_vdpa drv
  \   /
 [vDPA bus]
  |
 vDPA device
 hardware drv
  |
 [hardware bus]
  |
 vDPA hardware

I still don't like this strange complexity, vhost should have been
layered on top of the virtio device instead of adding an extra bus
just for vdpa.



We've considered such method and I think why we choose a bus is:

- vDPA device was originally named as "vhost Datapath Acceleration" 
which means the datapath complies virtio specification but not control 
path. This means the device should behave like vhost. And in order to 
support vhost, vDPA device requires more function than virtio. E.g the 
ability to query the device state (virtqueue indices, counters etc) and 
track dirty pages. This mean even a pure virtio hardware may not work 
for vhost. That's why a multi inheritance is used for a new type of vDPA 
device.


- As we've already discussed, virtio bus is designed for kernel driver 
and a brunches of devices, drivers or even buses have been implemented 
around that. It requires a major refactoring not only with the virtio 
bus but also with the drivers and devices to make it behave more like a 
vhost. Abstract vDPA as a kind of transport for virtio greatly simplify 
the work and have almost zero impact on the exist virtio core. VOP 
(vop_bus) use similar design.





However, I don't see any technical problems with this patch now.



Thanks, your review is greatly appreciated.




Thanks,
Jason



___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH v2 3/4] drm/mgag200: Use simple encoder

2020-02-20 Thread Thomas Zimmermann
Hi Sam

thanks for reviewing the patch set.

Am 20.02.20 um 19:56 schrieb Sam Ravnborg:
> Hi Thomas.
> 
> On Tue, Feb 18, 2020 at 09:48:14AM +0100, Thomas Zimmermann wrote:
>> The mgag200 driver uses an empty implementation for its encoder. Replace
>> the code with the generic simple encoder.
>>
>> v2:
>>  * rebase onto new simple-encoder interface
>>
>> Signed-off-by: Thomas Zimmermann 
>> ---
>>  drivers/gpu/drm/mgag200/mgag200_drv.h  |  7 ---
>>  drivers/gpu/drm/mgag200/mgag200_mode.c | 61 ++
>>  2 files changed, 3 insertions(+), 65 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/mgag200/mgag200_drv.h 
>> b/drivers/gpu/drm/mgag200/mgag200_drv.h
>> index aa32aad222c2..9bb9e8e14539 100644
>> --- a/drivers/gpu/drm/mgag200/mgag200_drv.h
>> +++ b/drivers/gpu/drm/mgag200/mgag200_drv.h
>> @@ -95,7 +95,6 @@
>>  #define MATROX_DPMS_CLEARED (-1)
>>  
>>  #define to_mga_crtc(x) container_of(x, struct mga_crtc, base)
>> -#define to_mga_encoder(x) container_of(x, struct mga_encoder, base)
>>  #define to_mga_connector(x) container_of(x, struct mga_connector, base)
>>  
>>  struct mga_crtc {
>> @@ -110,12 +109,6 @@ struct mga_mode_info {
>>  struct mga_crtc *crtc;
>>  };
>>  
>> -struct mga_encoder {
>> -struct drm_encoder base;
>> -int last_dpms;
>> -};
>> -
>> -
>>  struct mga_i2c_chan {
>>  struct i2c_adapter adapter;
>>  struct drm_device *dev;
> 
> Any particular reason why the drm_encoder is not embedded in struct
> mga_device?
> 
> I found it more elegant - like you did it for ast in the previous patch.

I think I wanted something that uses drm_simple_encoder_create(). But I
can change that. The embedded variant is indeed better.

Best regards
Thomas

> 
> I also noted there is one more unused "last_dpms" - but it is outside
> the scope of this patch to remove it.
> 
>   Sam
> 
>> diff --git a/drivers/gpu/drm/mgag200/mgag200_mode.c 
>> b/drivers/gpu/drm/mgag200/mgag200_mode.c
>> index 62a8e9ccb16d..957ea1057b6c 100644
>> --- a/drivers/gpu/drm/mgag200/mgag200_mode.c
>> +++ b/drivers/gpu/drm/mgag200/mgag200_mode.c
>> @@ -15,6 +15,7 @@
>>  #include 
>>  #include 
>>  #include 
>> +#include 
>>  
>>  #include "mgag200_drv.h"
>>  
>> @@ -1449,72 +1450,16 @@ static void mga_crtc_init(struct mga_device *mdev)
>>  drm_crtc_helper_add(_crtc->base, _helper_funcs);
>>  }
>>  
>> -/*
>> - * The encoder comes after the CRTC in the output pipeline, but before
>> - * the connector. It's responsible for ensuring that the digital
>> - * stream is appropriately converted into the output format. Setup is
>> - * very simple in this case - all we have to do is inform qemu of the
>> - * colour depth in order to ensure that it displays appropriately
>> - */
>> -
>> -/*
>> - * These functions are analagous to those in the CRTC code, but are intended
>> - * to handle any encoder-specific limitations
>> - */
>> -static void mga_encoder_mode_set(struct drm_encoder *encoder,
>> -struct drm_display_mode *mode,
>> -struct drm_display_mode *adjusted_mode)
>> -{
>> -
>> -}
>> -
>> -static void mga_encoder_dpms(struct drm_encoder *encoder, int state)
>> -{
>> -return;
>> -}
>> -
>> -static void mga_encoder_prepare(struct drm_encoder *encoder)
>> -{
>> -}
>> -
>> -static void mga_encoder_commit(struct drm_encoder *encoder)
>> -{
>> -}
>> -
>> -static void mga_encoder_destroy(struct drm_encoder *encoder)
>> -{
>> -struct mga_encoder *mga_encoder = to_mga_encoder(encoder);
>> -drm_encoder_cleanup(encoder);
>> -kfree(mga_encoder);
>> -}
>> -
>> -static const struct drm_encoder_helper_funcs mga_encoder_helper_funcs = {
>> -.dpms = mga_encoder_dpms,
>> -.mode_set = mga_encoder_mode_set,
>> -.prepare = mga_encoder_prepare,
>> -.commit = mga_encoder_commit,
>> -};
>> -
>> -static const struct drm_encoder_funcs mga_encoder_encoder_funcs = {
>> -.destroy = mga_encoder_destroy,
>> -};
>> -
>>  static struct drm_encoder *mga_encoder_init(struct drm_device *dev)
>>  {
>>  struct drm_encoder *encoder;
>> -struct mga_encoder *mga_encoder;
>>  
>> -mga_encoder = kzalloc(sizeof(struct mga_encoder), GFP_KERNEL);
>> -if (!mga_encoder)
>> +encoder = drm_simple_encoder_create(dev, DRM_MODE_ENCODER_DAC);
>> +if (IS_ERR(encoder))
>>  return NULL;
>>  
>> -encoder = _encoder->base;
>>  encoder->possible_crtcs = 0x1;
>>  
>> -drm_encoder_init(dev, encoder, _encoder_encoder_funcs,
>> - DRM_MODE_ENCODER_DAC, NULL);
>> -drm_encoder_helper_add(encoder, _encoder_helper_funcs);
>> -
>>  return encoder;
>>  }
>>  
>> -- 
>> 2.25.0

-- 
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Felix Imendörffer



signature.asc
Description: OpenPGP digital signature
___
Virtualization mailing list

Re: [PATCH 0/2] virtio: decouple protected guest RAM form VIRTIO_F_IOMMU_PLATFORM

2020-02-20 Thread Jason Wang


On 2020/2/21 上午12:06, Halil Pasic wrote:

Currently if one intends to run a memory protection enabled VM with
virtio devices and linux as the guest OS, one needs to specify the
VIRTIO_F_IOMMU_PLATFORM flag for each virtio device to make the guest
linux use the DMA API, which in turn handles the memory
encryption/protection stuff if the guest decides to turn itself into
a protected one. This however makes no sense due to multiple reasons:
* The device is not changed by the fact that the guest RAM is
protected. The so called IOMMU bypass quirk is not affected.
* This usage is not congruent with  standardised semantics of
VIRTIO_F_IOMMU_PLATFORM. Guest memory protected is an orthogonal reason
for using DMA API in virtio (orthogonal with respect to what is
expressed by VIRTIO_F_IOMMU_PLATFORM).

This series aims to decouple 'have to use DMA API because my (guest) RAM
is protected' and 'have to use DMA API because the device told me
VIRTIO_F_IOMMU_PLATFORM'.

Please find more detailed explanations about the conceptual aspects in
the individual patches. There is however also a very practical problem
that is addressed by this series.

For vhost-net the feature VIRTIO_F_IOMMU_PLATFORM has the following side
effect The vhost code assumes it the addresses on the virtio descriptor
ring are not guest physical addresses but iova's, and insists on doing a
translation of these regardless of what transport is used (e.g. whether
we emulate a PCI or a CCW device). (For details see commit 6b1e6cc7855b
"vhost: new device IOTLB API".) On s390 this results in severe
performance degradation (c.a. factor 10).



Do you see a consistent degradation on the performance, or it only 
happen when for during the beginning of the test?




BTW with ccw I/O there is
(architecturally) no IOMMU, so the whole address translation makes no
sense in the context of virtio-ccw.



I suspect we can do optimization in qemu side.

E.g send memtable entry via IOTLB API when vIOMMU is not enabled.

If this makes sense, I can draft patch to see if there's any difference.

Thanks




Halil Pasic (2):
   mm: move force_dma_unencrypted() to mem_encrypt.h
   virtio: let virtio use DMA API when guest RAM is protected

  drivers/virtio/virtio_ring.c |  3 +++
  include/linux/dma-direct.h   |  9 -
  include/linux/mem_encrypt.h  | 10 ++
  3 files changed, 13 insertions(+), 9 deletions(-)


base-commit: ca7e1fd1026c5af6a533b4b5447e1d2f153e28f2


___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH 2/2] virtio: let virtio use DMA API when guest RAM is protected

2020-02-20 Thread Jason Wang


On 2020/2/21 上午10:59, David Gibson wrote:

On Thu, Feb 20, 2020 at 05:13:09PM +0100, Christoph Hellwig wrote:

On Thu, Feb 20, 2020 at 05:06:06PM +0100, Halil Pasic wrote:

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index 867c7ebd3f10..fafc8f924955 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -243,6 +243,9 @@ static bool vring_use_dma_api(struct virtio_device *vdev)
if (!virtio_has_iommu_quirk(vdev))
return true;
  
+	if (force_dma_unencrypted(>dev))

+   return true;

Hell no.  This is a detail of the platform DMA direct implementation.
Drivers have no business looking at this flag, and virtio finally needs
to be fixed to use the DMA API properly for everything but legacy devices.

So, this patch definitely isn't right as it stands, but I'm struggling
to understand what it is you're saying is the right way.

By "legacy devices" I assume you mean pre-virtio-1.0 devices, that
lack the F_VERSION_1 feature flag.  Is that right?  Because I don't
see how being a legacy device or not relates to use of the DMA API.

I *think* what you are suggesting here is that virtio devices that
have !F_IOMMU_PLATFORM should have their dma_ops set up so that the
DMA API treats IOVA==PA, which will satisfy what the device expects.



Can this work for swiotlb?

Thanks



Then the virtio driver can use the DMA API the same way for both
F_IOMMU_PLATFORM and !F_IOMMU_PLATFORM devices.

But if that works for !F_IOMMU_PLATFORM_DEVICES+F_VERSION_1 devices,
then AFAICT it will work equally well for legacy devices.

Using the DMA API for *everything* in virtio, legacy or not, seems
like a reasonable approach to me.  But, AFAICT, that does require the
DMA layer to have some kind of explicit call to turn on this
behaviour, which the virtio driver would call during initializsation.
I don't think we can do it 100% within the DMA layer, because only the
driver can reasonably know when a device has this weird non-standard
DMA behaviour.



___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH 2/2] virtio: let virtio use DMA API when guest RAM is protected

2020-02-20 Thread David Gibson
On Thu, Feb 20, 2020 at 05:17:48PM -0800, Ram Pai wrote:
> On Thu, Feb 20, 2020 at 03:55:14PM -0500, Michael S. Tsirkin wrote:
> > On Thu, Feb 20, 2020 at 05:06:06PM +0100, Halil Pasic wrote:
> > > Currently the advanced guest memory protection technologies (AMD SEV,
> > > powerpc secure guest technology and s390 Protected VMs) abuse the
> > > VIRTIO_F_IOMMU_PLATFORM flag to make virtio core use the DMA API, which
> > > is in turn necessary, to make IO work with guest memory protection.
> > > 
> > > But VIRTIO_F_IOMMU_PLATFORM a.k.a. VIRTIO_F_ACCESS_PLATFORM is really a
> > > different beast: with virtio devices whose implementation runs on an SMP
> > > CPU we are still fine with doing all the usual optimizations, it is just
> > > that we need to make sure that the memory protection mechanism does not
> > > get in the way. The VIRTIO_F_ACCESS_PLATFORM mandates more work on the
> > > side of the guest (and possibly he host side as well) than we actually
> > > need.
> > > 
> > > An additional benefit of teaching the guest to make the right decision
> > > (and use DMA API) on it's own is: removing the need, to mandate special
> > > VM configuration for guests that may run with protection. This is
> > > especially interesting for s390 as VIRTIO_F_IOMMU_PLATFORM pushes all
> > > the virtio control structures into the first 2G of guest memory:
> > > something we don't necessarily want to do per-default.
> > > 
> > > Signed-off-by: Halil Pasic 
> > > Tested-by: Ram Pai 
> > > Tested-by: Michael Mueller 
> > 
> > This might work for you but it's fragile, since without
> > VIRTIO_F_ACCESS_PLATFORM hypervisor assumes it gets
> > GPA's, not DMA addresses.
> > 
> > 
> > 
> > IOW this looks like another iteration of:
> > 
> > virtio: Support encrypted memory on powerpc secure guests
> > 
> > which I was under the impression was abandoned as unnecessary.
> 
> It has been abondoned on powerpc. We enabled VIRTIO_F_ACCESS_PLATFORM;
> by default, flag on powerpc.

Uh... we haven't yet, though we're working on it.

> We would like to enable secure guests on powerpc without this flag
> aswell enabled, but past experience has educated us that its not a easy
> path.  However if Halil makes some inroads in this path for s390, we
> will like to support him.
> 
> 
> RP
> 

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH v1 3/3] virtio-balloon: Switch back to OOM handler for VIRTIO_BALLOON_F_DEFLATE_ON_OOM

2020-02-20 Thread Tyler Sanderson via Virtualization
Testing this patch is on my short-term TODO list, but I wasn't able to get
to it this week. It is prioritized.

In the meantime, I can anecdotally vouch that kernels before 4.19, the ones
using the OOM notifier callback, have roughly 10x faster balloon inflation
when pressuring the cache. So I anticipate this patch will return to that
state and help my use case.

I will try to post official measurements of this patch next week.

On Sun, Feb 16, 2020 at 1:47 AM Michael S. Tsirkin  wrote:

> On Fri, Feb 14, 2020 at 10:51:43AM +0100, David Hildenbrand wrote:
> > On 05.02.20 17:34, David Hildenbrand wrote:
> > > Commit 71994620bb25 ("virtio_balloon: replace oom notifier with
> shrinker")
> > > changed the behavior when deflation happens automatically. Instead of
> > > deflating when called by the OOM handler, the shrinker is used.
> > >
> > > However, the balloon is not simply some slab cache that should be
> > > shrunk when under memory pressure. The shrinker does not have a
> concept of
> > > priorities, so this behavior cannot be configured.
> > >
> > > There was a report that this results in undesired side effects when
> > > inflating the balloon to shrink the page cache. [1]
> > > "When inflating the balloon against page cache (i.e. no free memory
> > >  remains) vmscan.c will both shrink page cache, but also invoke the
> > >  shrinkers -- including the balloon's shrinker. So the balloon
> > >  driver allocates memory which requires reclaim, vmscan gets this
> > >  memory by shrinking the balloon, and then the driver adds the
> > >  memory back to the balloon. Basically a busy no-op."
> > >
> > > The name "deflate on OOM" makes it pretty clear when deflation should
> > > happen - after other approaches to reclaim memory failed, not while
> > > reclaiming. This allows to minimize the footprint of a guest - memory
> > > will only be taken out of the balloon when really needed.
> > >
> > > Especially, a drop_slab() will result in the whole balloon getting
> > > deflated - undesired. While handling it via the OOM handler might not
> be
> > > perfect, it keeps existing behavior. If we want a different behavior,
> then
> > > we need a new feature bit and document it properly (although, there
> should
> > > be a clear use case and the intended effects should be well described).
> > >
> > > Keep using the shrinker for VIRTIO_BALLOON_F_FREE_PAGE_HINT, because
> > > this has no such side effects. Always register the shrinker with
> > > VIRTIO_BALLOON_F_FREE_PAGE_HINT now. We are always allowed to reuse
> free
> > > pages that are still to be processed by the guest. The hypervisor takes
> > > care of identifying and resolving possible races between processing a
> > > hinting request and the guest reusing a page.
> > >
> > > In contrast to pre commit 71994620bb25 ("virtio_balloon: replace oom
> > > notifier with shrinker"), don't add a moodule parameter to configure
> the
> > > number of pages to deflate on OOM. Can be re-added if really needed.
> > > Also, pay attention that leak_balloon() returns the number of 4k pages
> -
> > > convert it properly in virtio_balloon_oom_notify().
> > >
> > > Note1: using the OOM handler is frowned upon, but it really is what we
> > >need for this feature.
> > >
> > > Note2: without VIRTIO_BALLOON_F_MUST_TELL_HOST (iow, always with QEMU)
> we
> > >could actually skip sending deflation requests to our
> hypervisor,
> > >making the OOM path *very* simple. Besically freeing pages and
> > >updating the balloon. If the communication with the host ever
> > >becomes a problem on this call path.
> > >
> >
> > @Michael, how to proceed with this?
> >
>
> I'd like to see some reports that this helps people.
> e.g. a tested-by tag.
>
> > --
> > Thanks,
> >
> > David / dhildenb
>
>
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH 2/2] virtio: let virtio use DMA API when guest RAM is protected

2020-02-20 Thread David Gibson
On Thu, Feb 20, 2020 at 05:13:09PM +0100, Christoph Hellwig wrote:
> On Thu, Feb 20, 2020 at 05:06:06PM +0100, Halil Pasic wrote:
> > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > index 867c7ebd3f10..fafc8f924955 100644
> > --- a/drivers/virtio/virtio_ring.c
> > +++ b/drivers/virtio/virtio_ring.c
> > @@ -243,6 +243,9 @@ static bool vring_use_dma_api(struct virtio_device 
> > *vdev)
> > if (!virtio_has_iommu_quirk(vdev))
> > return true;
> >  
> > +   if (force_dma_unencrypted(>dev))
> > +   return true;
> 
> Hell no.  This is a detail of the platform DMA direct implementation.
> Drivers have no business looking at this flag, and virtio finally needs
> to be fixed to use the DMA API properly for everything but legacy devices.

So, this patch definitely isn't right as it stands, but I'm struggling
to understand what it is you're saying is the right way.

By "legacy devices" I assume you mean pre-virtio-1.0 devices, that
lack the F_VERSION_1 feature flag.  Is that right?  Because I don't
see how being a legacy device or not relates to use of the DMA API.

I *think* what you are suggesting here is that virtio devices that
have !F_IOMMU_PLATFORM should have their dma_ops set up so that the
DMA API treats IOVA==PA, which will satisfy what the device expects.
Then the virtio driver can use the DMA API the same way for both
F_IOMMU_PLATFORM and !F_IOMMU_PLATFORM devices.

But if that works for !F_IOMMU_PLATFORM_DEVICES+F_VERSION_1 devices,
then AFAICT it will work equally well for legacy devices.

Using the DMA API for *everything* in virtio, legacy or not, seems
like a reasonable approach to me.  But, AFAICT, that does require the
DMA layer to have some kind of explicit call to turn on this
behaviour, which the virtio driver would call during initializsation.
I don't think we can do it 100% within the DMA layer, because only the
driver can reasonably know when a device has this weird non-standard
DMA behaviour.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH 1/2] mm: move force_dma_unencrypted() to mem_encrypt.h

2020-02-20 Thread David Gibson
On Thu, Feb 20, 2020 at 05:31:35PM +0100, Christoph Hellwig wrote:
> On Thu, Feb 20, 2020 at 05:23:20PM +0100, Christian Borntraeger wrote:
> > >From a users perspective it makes absolutely perfect sense to use the
> > bounce buffers when they are NEEDED. 
> > Forcing the user to specify iommu_platform just because you need bounce 
> > buffers
> > really feels wrong. And obviously we have a severe performance issue
> > because of the indirections.
> 
> The point is that the user should not have to specify iommu_platform.
> We need to make sure any new hypervisor (especially one that might require
> bounce buffering) always sets it,

So, I have draft qemu patches which enable iommu_platform by default.
But that's really because of other problems with !iommu_platform, not
anything to do with bounce buffering or secure VMs.

The thing is that the hypervisor *doesn't* require bounce buffering.
In the POWER (and maybe s390 as well) models for Secure VMs, it's the
*guest*'s choice to enter secure mode, so the hypervisor has no reason
to know whether the guest needs bounce buffering.  As far as the
hypervisor and qemu are concerned that's a guest internal detail, it
just expects to get addresses it can access whether those are GPAs
(iommu_platform=off) or IOVAs (iommu_platform=on).

> as was a rather bogus legacy hack

It was certainly a bad idea, but it was a bad idea that went into a
public spec and has been widely deployed for many years.  We can't
just pretend it didn't happen and move on.

Turning iommu_platform=on by default breaks old guests, some of which
we still care about.  We can't (automatically) do it only for guests
that need bounce buffering, because the hypervisor doesn't know that
ahead of time.

> that isn't extensibe for cases that for example require bounce buffering.

In fact bounce buffering isn't really the issue from the hypervisor
(or spec's) point of view.  It's the fact that not all of guest memory
is accessible to the hypervisor.  Bounce buffering is just one way the
guest might deal with that.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

RE: [PATCH 2/2] virtio: let virtio use DMA API when guest RAM is protected

2020-02-20 Thread Ram Pai
On Thu, Feb 20, 2020 at 03:55:14PM -0500, Michael S. Tsirkin wrote:
> On Thu, Feb 20, 2020 at 05:06:06PM +0100, Halil Pasic wrote:
> > Currently the advanced guest memory protection technologies (AMD SEV,
> > powerpc secure guest technology and s390 Protected VMs) abuse the
> > VIRTIO_F_IOMMU_PLATFORM flag to make virtio core use the DMA API, which
> > is in turn necessary, to make IO work with guest memory protection.
> > 
> > But VIRTIO_F_IOMMU_PLATFORM a.k.a. VIRTIO_F_ACCESS_PLATFORM is really a
> > different beast: with virtio devices whose implementation runs on an SMP
> > CPU we are still fine with doing all the usual optimizations, it is just
> > that we need to make sure that the memory protection mechanism does not
> > get in the way. The VIRTIO_F_ACCESS_PLATFORM mandates more work on the
> > side of the guest (and possibly he host side as well) than we actually
> > need.
> > 
> > An additional benefit of teaching the guest to make the right decision
> > (and use DMA API) on it's own is: removing the need, to mandate special
> > VM configuration for guests that may run with protection. This is
> > especially interesting for s390 as VIRTIO_F_IOMMU_PLATFORM pushes all
> > the virtio control structures into the first 2G of guest memory:
> > something we don't necessarily want to do per-default.
> > 
> > Signed-off-by: Halil Pasic 
> > Tested-by: Ram Pai 
> > Tested-by: Michael Mueller 
> 
> This might work for you but it's fragile, since without
> VIRTIO_F_ACCESS_PLATFORM hypervisor assumes it gets
> GPA's, not DMA addresses.
> 
> 
> 
> IOW this looks like another iteration of:
> 
>   virtio: Support encrypted memory on powerpc secure guests
> 
> which I was under the impression was abandoned as unnecessary.

It has been abondoned on powerpc. We enabled VIRTIO_F_ACCESS_PLATFORM;
by default, flag on powerpc.

We would like to enable secure guests on powerpc without this flag
aswell enabled, but past experience has educated us that its not a easy
path.  However if Halil makes some inroads in this path for s390, we
will like to support him.


RP

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH 0/2] virtio: decouple protected guest RAM form VIRTIO_F_IOMMU_PLATFORM

2020-02-20 Thread Michael S. Tsirkin
On Thu, Feb 20, 2020 at 05:06:04PM +0100, Halil Pasic wrote:
> For vhost-net the feature VIRTIO_F_IOMMU_PLATFORM has the following side
> effect The vhost code assumes it the addresses on the virtio descriptor
> ring are not guest physical addresses but iova's, and insists on doing a
> translation of these regardless of what transport is used (e.g. whether
> we emulate a PCI or a CCW device). (For details see commit 6b1e6cc7855b
> "vhost: new device IOTLB API".) On s390 this results in severe
> performance degradation (c.a. factor 10). BTW with ccw I/O there is
> (architecturally) no IOMMU, so the whole address translation makes no
> sense in the context of virtio-ccw.

So it sounds like a host issue: the emulation of s390 unnecessarily complicated.
Working around it by the guest looks wrong ...

> Halil Pasic (2):
>   mm: move force_dma_unencrypted() to mem_encrypt.h
>   virtio: let virtio use DMA API when guest RAM is protected
> 
>  drivers/virtio/virtio_ring.c |  3 +++
>  include/linux/dma-direct.h   |  9 -
>  include/linux/mem_encrypt.h  | 10 ++
>  3 files changed, 13 insertions(+), 9 deletions(-)
> 
> 
> base-commit: ca7e1fd1026c5af6a533b4b5447e1d2f153e28f2
> -- 
> 2.17.1

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH 0/2] virtio: decouple protected guest RAM form VIRTIO_F_IOMMU_PLATFORM

2020-02-20 Thread Michael S. Tsirkin
On Thu, Feb 20, 2020 at 05:06:04PM +0100, Halil Pasic wrote:
> * This usage is not congruent with  standardised semantics of
> VIRTIO_F_IOMMU_PLATFORM. Guest memory protected is an orthogonal reason
> for using DMA API in virtio (orthogonal with respect to what is
> expressed by VIRTIO_F_IOMMU_PLATFORM). 

Quoting the spec:

  \item[VIRTIO_F_ACCESS_PLATFORM(33)] This feature indicates that
  the device can be used on a platform where device access to data
  in memory is limited and/or translated. E.g. this is the case if the device 
can be located
  behind an IOMMU that translates bus addresses from the device into physical
  addresses in memory, if the device can be limited to only access
  certain memory addresses or if special commands such as
  a cache flush can be needed to synchronise data in memory with
  the device. Whether accesses are actually limited or translated
  is described by platform-specific means.
  If this feature bit is set to 0, then the device
  has same access to memory addresses supplied to it as the
  driver has.
  In particular, the device will always use physical addresses
  matching addresses used by the driver (typically meaning
  physical addresses used by the CPU)
  and not translated further, and can access any address supplied to it by
  the driver. When clear, this overrides any platform-specific description of
  whether device access is limited or translated in any way, e.g.
  whether an IOMMU may be present.

since device can't access encrypted memory,
this seems to match your case reasonably well.

-- 
MST

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH 2/2] virtio: let virtio use DMA API when guest RAM is protected

2020-02-20 Thread Michael S. Tsirkin
On Thu, Feb 20, 2020 at 05:06:06PM +0100, Halil Pasic wrote:
> Currently the advanced guest memory protection technologies (AMD SEV,
> powerpc secure guest technology and s390 Protected VMs) abuse the
> VIRTIO_F_IOMMU_PLATFORM flag to make virtio core use the DMA API, which
> is in turn necessary, to make IO work with guest memory protection.
> 
> But VIRTIO_F_IOMMU_PLATFORM a.k.a. VIRTIO_F_ACCESS_PLATFORM is really a
> different beast: with virtio devices whose implementation runs on an SMP
> CPU we are still fine with doing all the usual optimizations, it is just
> that we need to make sure that the memory protection mechanism does not
> get in the way. The VIRTIO_F_ACCESS_PLATFORM mandates more work on the
> side of the guest (and possibly he host side as well) than we actually
> need.
> 
> An additional benefit of teaching the guest to make the right decision
> (and use DMA API) on it's own is: removing the need, to mandate special
> VM configuration for guests that may run with protection. This is
> especially interesting for s390 as VIRTIO_F_IOMMU_PLATFORM pushes all
> the virtio control structures into the first 2G of guest memory:
> something we don't necessarily want to do per-default.
> 
> Signed-off-by: Halil Pasic 
> Tested-by: Ram Pai 
> Tested-by: Michael Mueller 

This might work for you but it's fragile, since without
VIRTIO_F_ACCESS_PLATFORM hypervisor assumes it gets
GPA's, not DMA addresses.



IOW this looks like another iteration of:

virtio: Support encrypted memory on powerpc secure guests

which I was under the impression was abandoned as unnecessary.


To summarize, the necessary conditions for a hack along these lines
(using DMA API without VIRTIO_F_ACCESS_PLATFORM) are that we detect that:

  - secure guest mode is enabled - so we know that since we don't share
most memory regular virtio code won't
work, even though the buggy hypervisor didn't set VIRTIO_F_ACCESS_PLATFORM
  - DMA API is giving us addresses that are actually also physical
addresses
  - Hypervisor is buggy and didn't enable VIRTIO_F_ACCESS_PLATFORM
 
I don't see how this patch does this.


> ---
>  drivers/virtio/virtio_ring.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> index 867c7ebd3f10..fafc8f924955 100644
> --- a/drivers/virtio/virtio_ring.c
> +++ b/drivers/virtio/virtio_ring.c
> @@ -243,6 +243,9 @@ static bool vring_use_dma_api(struct virtio_device *vdev)
>   if (!virtio_has_iommu_quirk(vdev))
>   return true;
>  
> + if (force_dma_unencrypted(>dev))
> + return true;
> +
>   /* Otherwise, we are left to guess. */
>   /*
>* In theory, it's possible to have a buggy QEMU-supposed
> -- 
> 2.17.1

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH 0/2] virtio: decouple protected guest RAM form VIRTIO_F_IOMMU_PLATFORM

2020-02-20 Thread Michael S. Tsirkin
On Thu, Feb 20, 2020 at 05:06:04PM +0100, Halil Pasic wrote:
> Currently if one intends to run a memory protection enabled VM with
> virtio devices and linux as the guest OS, one needs to specify the
> VIRTIO_F_IOMMU_PLATFORM flag for each virtio device to make the guest
> linux use the DMA API, which in turn handles the memory
> encryption/protection stuff if the guest decides to turn itself into
> a protected one. This however makes no sense due to multiple reasons:
> * The device is not changed by the fact that the guest RAM is
> protected. The so called IOMMU bypass quirk is not affected.
> * This usage is not congruent with  standardised semantics of
> VIRTIO_F_IOMMU_PLATFORM. Guest memory protected is an orthogonal reason
> for using DMA API in virtio (orthogonal with respect to what is
> expressed by VIRTIO_F_IOMMU_PLATFORM). 
> 
> This series aims to decouple 'have to use DMA API because my (guest) RAM
> is protected' and 'have to use DMA API because the device told me
> VIRTIO_F_IOMMU_PLATFORM'.
> 
> Please find more detailed explanations about the conceptual aspects in
> the individual patches. There is however also a very practical problem
> that is addressed by this series. 
> 
> For vhost-net the feature VIRTIO_F_IOMMU_PLATFORM has the following side
> effect The vhost code assumes it the addresses on the virtio descriptor
> ring are not guest physical addresses but iova's, and insists on doing a
> translation of these regardless of what transport is used (e.g. whether
> we emulate a PCI or a CCW device). (For details see commit 6b1e6cc7855b
> "vhost: new device IOTLB API".) On s390 this results in severe
> performance degradation (c.a. factor 10). BTW with ccw I/O there is
> (architecturally) no IOMMU, so the whole address translation makes no
> sense in the context of virtio-ccw.

That's just a QEMU thing. It uses the same flag for VIRTIO_F_ACCESS_PLATFORM
and vhost IOTLB. QEMU can separate them, no need to change linux.


> Halil Pasic (2):
>   mm: move force_dma_unencrypted() to mem_encrypt.h
>   virtio: let virtio use DMA API when guest RAM is protected
> 
>  drivers/virtio/virtio_ring.c |  3 +++
>  include/linux/dma-direct.h   |  9 -
>  include/linux/mem_encrypt.h  | 10 ++
>  3 files changed, 13 insertions(+), 9 deletions(-)
> 
> 
> base-commit: ca7e1fd1026c5af6a533b4b5447e1d2f153e28f2
> -- 
> 2.17.1

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH] virtio: Work around frames incorrectly marked as gso

2020-02-20 Thread Anton Ivanov


On 20/02/2020 07:58, Michael S. Tsirkin wrote:

On Thu, Feb 13, 2020 at 04:23:24PM +, Anton Ivanov wrote:

On 13/02/2020 15:53, Michael S. Tsirkin wrote:

On Thu, Feb 13, 2020 at 07:44:06AM -0800, Eric Dumazet wrote:

On 2/13/20 2:00 AM, Michael S. Tsirkin wrote:

On Wed, Feb 12, 2020 at 05:38:09PM +, Anton Ivanov wrote:

On 11/02/2020 10:37, Michael S. Tsirkin wrote:

On Tue, Feb 11, 2020 at 07:42:37AM +, Anton Ivanov wrote:

On 11/02/2020 02:51, Jason Wang wrote:

On 2020/2/11 上午12:55, Anton Ivanov wrote:

On 09/12/2019 10:48, anton.iva...@cambridgegreys.com wrote:

From: Anton Ivanov 

Some of the frames marked as GSO which arrive at
virtio_net_hdr_from_skb() have no GSO_TYPE, no
fragments (data_len = 0) and length significantly shorter
than the MTU (752 in my experiments).

This is observed on raw sockets reading off vEth interfaces
in all 4.x and 5.x kernels I tested.

These frames are reported as invalid while they are in fact
gso-less frames.

This patch marks the vnet header as no-GSO for them instead
of reporting it as invalid.

Signed-off-by: Anton Ivanov 
---
     include/linux/virtio_net.h | 8 ++--
     1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/include/linux/virtio_net.h b/include/linux/virtio_net.h
index 0d1fe9297ac6..d90d5cff1b9a 100644
--- a/include/linux/virtio_net.h
+++ b/include/linux/virtio_net.h
@@ -112,8 +112,12 @@ static inline int
virtio_net_hdr_from_skb(const struct sk_buff *skb,
     hdr->gso_type = VIRTIO_NET_HDR_GSO_TCPV4;
     else if (sinfo->gso_type & SKB_GSO_TCPV6)
     hdr->gso_type = VIRTIO_NET_HDR_GSO_TCPV6;
-    else
-    return -EINVAL;
+    else {
+    if (skb->data_len == 0)
+    hdr->gso_type = VIRTIO_NET_HDR_GSO_NONE;
+    else
+    return -EINVAL;
+    }
     if (sinfo->gso_type & SKB_GSO_TCP_ECN)
     hdr->gso_type |= VIRTIO_NET_HDR_GSO_ECN;
     } else


ping.


Do you mean gso_size is set but gso_type is not? Looks like a bug
elsewhere.

Thanks



Yes.

I could not trace it where it is coming from.

I see it when doing recvmmsg on raw sockets in the UML vector network
drivers.


I think we need to find the culprit and fix it there, lots of other things
can break otherwise.
Just printing out skb->dev->name should do the trick, no?

The printk in virtio_net_hdr_from_skb says NULL.

That is probably normal for a locally originated frame.

I cannot reproduce this with network traffic by the way - it happens only if 
the traffic is locally originated on the host.

A,

OK so is it code in __tcp_transmit_skb that sets gso_size to non-null
when gso_type is 0?


Correct way to determine if a packet is a gso one is by looking at gso_size.
Then only it is legal looking at gso_type


static inline bool skb_is_gso(const struct sk_buff *skb)
{
  return skb_shinfo(skb)->gso_size;
}

/* Note: Should be called only if skb_is_gso(skb) is true */
static inline bool skb_is_gso_v6(const struct sk_buff *skb)
...


There is absolutely no relation between GSO and skb->data_len, skb can be 
linearized
for various orthogonal reasons.

The reported problem is that virtio gets a packet where gso_size
is !0 but gso_type is 0.

It currently drops these on the assumption that it's some type
of a gso packet it does not know how to handle.


So you are saying if skb_is_gso we can still have gso_type set to 0,
and that's an expected configuration?

So the patch should just be:


-if (skb_is_gso(skb)) {
+if (skb_is_gso(skb) && sinfo->gso_type) {


Yes, provided that skb_is_gso(skb) and sinfo->gso_type == 0 is a valid state.

I agree with Jason, there may be something wrong going on here and we need to 
find the source which creates these packets.

A.


Want to submit a patch to address this for now?


I can do that on Monday - traveling till then.

A.





?


___
linux-um mailing list
linux...@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-um

--
Anton R. Ivanov
Cambridgegreys Limited. Registered in England. Company Number 10273661
https://www.cambridgegreys.com/


___
linux-um mailing list
linux...@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-um


--
Anton R. Ivanov

Cambridge Greys Limited, England and Wales company No 10273661
http://www.cambridgegreys.com/

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH v2 4/4] drm/qxl: Use simple encoder

2020-02-20 Thread Sam Ravnborg
Hi Thomas.

On Tue, Feb 18, 2020 at 09:48:15AM +0100, Thomas Zimmermann wrote:
> The qxl driver uses an empty implementation for its encoder. Replace
> the code with the generic simple encoder.
> 
> v2:
>   * rebase onto new simple-encoder interface
> 
> Signed-off-by: Thomas Zimmermann 
I looked at best_encoder - but could not see we could do anything.
So from browsing the code:
Acked-by: Sam Ravnborg 

Sam

> ---
>  drivers/gpu/drm/qxl/qxl_display.c | 18 +++---
>  1 file changed, 3 insertions(+), 15 deletions(-)
> 
> diff --git a/drivers/gpu/drm/qxl/qxl_display.c 
> b/drivers/gpu/drm/qxl/qxl_display.c
> index ab4f8dd00400..9c0e1add59fb 100644
> --- a/drivers/gpu/drm/qxl/qxl_display.c
> +++ b/drivers/gpu/drm/qxl/qxl_display.c
> @@ -31,6 +31,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #include "qxl_drv.h"
>  #include "qxl_object.h"
> @@ -1007,9 +1008,6 @@ static struct drm_encoder *qxl_best_encoder(struct 
> drm_connector *connector)
>   return _output->enc;
>  }
>  
> -static const struct drm_encoder_helper_funcs qxl_enc_helper_funcs = {
> -};
> -
>  static const struct drm_connector_helper_funcs qxl_connector_helper_funcs = {
>   .get_modes = qxl_conn_get_modes,
>   .mode_valid = qxl_conn_mode_valid,
> @@ -1059,15 +1057,6 @@ static const struct drm_connector_funcs 
> qxl_connector_funcs = {
>   .atomic_destroy_state = drm_atomic_helper_connector_destroy_state,
>  };
>  
> -static void qxl_enc_destroy(struct drm_encoder *encoder)
> -{
> - drm_encoder_cleanup(encoder);
> -}
> -
> -static const struct drm_encoder_funcs qxl_enc_funcs = {
> - .destroy = qxl_enc_destroy,
> -};
> -
>  static int qxl_mode_create_hotplug_mode_update_property(struct qxl_device 
> *qdev)
>  {
>   if (qdev->hotplug_mode_update_property)
> @@ -1098,15 +1087,14 @@ static int qdev_output_init(struct drm_device *dev, 
> int num_output)
>   drm_connector_init(dev, _output->base,
>  _connector_funcs, DRM_MODE_CONNECTOR_VIRTUAL);
>  
> - drm_encoder_init(dev, _output->enc, _enc_funcs,
> -  DRM_MODE_ENCODER_VIRTUAL, NULL);
> + drm_simple_encoder_init(dev, _output->enc,
> + DRM_MODE_ENCODER_VIRTUAL);
>  
>   /* we get HPD via client monitors config */
>   connector->polled = DRM_CONNECTOR_POLL_HPD;
>   encoder->possible_crtcs = 1 << num_output;
>   drm_connector_attach_encoder(_output->base,
> _output->enc);
> - drm_encoder_helper_add(encoder, _enc_helper_funcs);
>   drm_connector_helper_add(connector, _connector_helper_funcs);
>  
>   drm_object_attach_property(>base,
> -- 
> 2.25.0
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH v2 2/4] drm/ast: Use simple encoder

2020-02-20 Thread Sam Ravnborg
Hi Thomas.

On Tue, Feb 18, 2020 at 09:48:13AM +0100, Thomas Zimmermann wrote:
> The ast driver uses an empty implementation for its encoder. Replace
> the code with the generic simple encoder.
> 
> v2:
>   * rebase onto new simple-encoder interface
> 
> Signed-off-by: Thomas Zimmermann 
>From browsign the code - looks good:
Acked-by: Sam Ravnborg 

Sam

> ---
>  drivers/gpu/drm/ast/ast_drv.h  |  6 +-
>  drivers/gpu/drm/ast/ast_mode.c | 25 -
>  2 files changed, 9 insertions(+), 22 deletions(-)
> 
> diff --git a/drivers/gpu/drm/ast/ast_drv.h b/drivers/gpu/drm/ast/ast_drv.h
> index f5d8780776ae..656d591b154b 100644
> --- a/drivers/gpu/drm/ast/ast_drv.h
> +++ b/drivers/gpu/drm/ast/ast_drv.h
> @@ -121,6 +121,7 @@ struct ast_private {
>   unsigned int next_index;
>   } cursor;
>  
> + struct drm_encoder encoder;
>   struct drm_plane primary_plane;
>   struct drm_plane cursor_plane;
>  
> @@ -238,13 +239,8 @@ struct ast_crtc {
>   u8 offset_x, offset_y;
>  };
>  
> -struct ast_encoder {
> - struct drm_encoder base;
> -};
> -
>  #define to_ast_crtc(x) container_of(x, struct ast_crtc, base)
>  #define to_ast_connector(x) container_of(x, struct ast_connector, base)
> -#define to_ast_encoder(x) container_of(x, struct ast_encoder, base)
>  
>  struct ast_vbios_stdtable {
>   u8 misc;
> diff --git a/drivers/gpu/drm/ast/ast_mode.c b/drivers/gpu/drm/ast/ast_mode.c
> index 562ea6d9df13..7a9f20a2fd30 100644
> --- a/drivers/gpu/drm/ast/ast_mode.c
> +++ b/drivers/gpu/drm/ast/ast_mode.c
> @@ -40,6 +40,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #include "ast_drv.h"
>  #include "ast_tables.h"
> @@ -968,28 +969,18 @@ static int ast_crtc_init(struct drm_device *dev)
>   * Encoder
>   */
>  
> -static void ast_encoder_destroy(struct drm_encoder *encoder)
> -{
> - drm_encoder_cleanup(encoder);
> - kfree(encoder);
> -}
> -
> -static const struct drm_encoder_funcs ast_enc_funcs = {
> - .destroy = ast_encoder_destroy,
> -};
> -
>  static int ast_encoder_init(struct drm_device *dev)
>  {
> - struct ast_encoder *ast_encoder;
> + struct ast_private *ast = dev->dev_private;
> + struct drm_encoder *encoder = >encoder;
> + int ret;
>  
> - ast_encoder = kzalloc(sizeof(struct ast_encoder), GFP_KERNEL);
> - if (!ast_encoder)
> - return -ENOMEM;
> + ret = drm_simple_encoder_init(dev, encoder, DRM_MODE_ENCODER_DAC);
> + if (ret)
> + return ret;
>  
> - drm_encoder_init(dev, _encoder->base, _enc_funcs,
> -  DRM_MODE_ENCODER_DAC, NULL);
> + encoder->possible_crtcs = 1;
>  
> - ast_encoder->base.possible_crtcs = 1;
>   return 0;
>  }
>  
> -- 
> 2.25.0
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH v2 3/4] drm/mgag200: Use simple encoder

2020-02-20 Thread Sam Ravnborg
Hi Thomas.

On Tue, Feb 18, 2020 at 09:48:14AM +0100, Thomas Zimmermann wrote:
> The mgag200 driver uses an empty implementation for its encoder. Replace
> the code with the generic simple encoder.
> 
> v2:
>   * rebase onto new simple-encoder interface
> 
> Signed-off-by: Thomas Zimmermann 
> ---
>  drivers/gpu/drm/mgag200/mgag200_drv.h  |  7 ---
>  drivers/gpu/drm/mgag200/mgag200_mode.c | 61 ++
>  2 files changed, 3 insertions(+), 65 deletions(-)
> 
> diff --git a/drivers/gpu/drm/mgag200/mgag200_drv.h 
> b/drivers/gpu/drm/mgag200/mgag200_drv.h
> index aa32aad222c2..9bb9e8e14539 100644
> --- a/drivers/gpu/drm/mgag200/mgag200_drv.h
> +++ b/drivers/gpu/drm/mgag200/mgag200_drv.h
> @@ -95,7 +95,6 @@
>  #define MATROX_DPMS_CLEARED (-1)
>  
>  #define to_mga_crtc(x) container_of(x, struct mga_crtc, base)
> -#define to_mga_encoder(x) container_of(x, struct mga_encoder, base)
>  #define to_mga_connector(x) container_of(x, struct mga_connector, base)
>  
>  struct mga_crtc {
> @@ -110,12 +109,6 @@ struct mga_mode_info {
>   struct mga_crtc *crtc;
>  };
>  
> -struct mga_encoder {
> - struct drm_encoder base;
> - int last_dpms;
> -};
> -
> -
>  struct mga_i2c_chan {
>   struct i2c_adapter adapter;
>   struct drm_device *dev;

Any particular reason why the drm_encoder is not embedded in struct
mga_device?

I found it more elegant - like you did it for ast in the previous patch.

I also noted there is one more unused "last_dpms" - but it is outside
the scope of this patch to remove it.

Sam

> diff --git a/drivers/gpu/drm/mgag200/mgag200_mode.c 
> b/drivers/gpu/drm/mgag200/mgag200_mode.c
> index 62a8e9ccb16d..957ea1057b6c 100644
> --- a/drivers/gpu/drm/mgag200/mgag200_mode.c
> +++ b/drivers/gpu/drm/mgag200/mgag200_mode.c
> @@ -15,6 +15,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #include "mgag200_drv.h"
>  
> @@ -1449,72 +1450,16 @@ static void mga_crtc_init(struct mga_device *mdev)
>   drm_crtc_helper_add(_crtc->base, _helper_funcs);
>  }
>  
> -/*
> - * The encoder comes after the CRTC in the output pipeline, but before
> - * the connector. It's responsible for ensuring that the digital
> - * stream is appropriately converted into the output format. Setup is
> - * very simple in this case - all we have to do is inform qemu of the
> - * colour depth in order to ensure that it displays appropriately
> - */
> -
> -/*
> - * These functions are analagous to those in the CRTC code, but are intended
> - * to handle any encoder-specific limitations
> - */
> -static void mga_encoder_mode_set(struct drm_encoder *encoder,
> - struct drm_display_mode *mode,
> - struct drm_display_mode *adjusted_mode)
> -{
> -
> -}
> -
> -static void mga_encoder_dpms(struct drm_encoder *encoder, int state)
> -{
> - return;
> -}
> -
> -static void mga_encoder_prepare(struct drm_encoder *encoder)
> -{
> -}
> -
> -static void mga_encoder_commit(struct drm_encoder *encoder)
> -{
> -}
> -
> -static void mga_encoder_destroy(struct drm_encoder *encoder)
> -{
> - struct mga_encoder *mga_encoder = to_mga_encoder(encoder);
> - drm_encoder_cleanup(encoder);
> - kfree(mga_encoder);
> -}
> -
> -static const struct drm_encoder_helper_funcs mga_encoder_helper_funcs = {
> - .dpms = mga_encoder_dpms,
> - .mode_set = mga_encoder_mode_set,
> - .prepare = mga_encoder_prepare,
> - .commit = mga_encoder_commit,
> -};
> -
> -static const struct drm_encoder_funcs mga_encoder_encoder_funcs = {
> - .destroy = mga_encoder_destroy,
> -};
> -
>  static struct drm_encoder *mga_encoder_init(struct drm_device *dev)
>  {
>   struct drm_encoder *encoder;
> - struct mga_encoder *mga_encoder;
>  
> - mga_encoder = kzalloc(sizeof(struct mga_encoder), GFP_KERNEL);
> - if (!mga_encoder)
> + encoder = drm_simple_encoder_create(dev, DRM_MODE_ENCODER_DAC);
> + if (IS_ERR(encoder))
>   return NULL;
>  
> - encoder = _encoder->base;
>   encoder->possible_crtcs = 0x1;
>  
> - drm_encoder_init(dev, encoder, _encoder_encoder_funcs,
> -  DRM_MODE_ENCODER_DAC, NULL);
> - drm_encoder_helper_add(encoder, _encoder_helper_funcs);
> -
>   return encoder;
>  }
>  
> -- 
> 2.25.0
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH v2 1/4] drm/simple-kms: Add drm_simple_encoder_{init, create}()

2020-02-20 Thread Sam Ravnborg
Hi Thomas.

On Tue, Feb 18, 2020 at 09:48:12AM +0100, Thomas Zimmermann wrote:
> This patch makes the internal encoder implementation of the simple
> KMS helpers available to drivers.
> 
> These simple-encoder helpers initialize an encoder with an empty
> implementation. This covers the requirements of most of the existing
> DRM drivers. A call to drm_simple_encoder_create() allocates and
> initializes an encoder instance, a call to drm_simple_encoder_init()
> initializes a pre-allocated instance.
> 
> v2:
>   * move simple encoder to KMS helpers
>   * remove name argument; simplifies implementation
>   * don't allocate with devm_ interfaces; unsafe with DRM
> 
> Signed-off-by: Thomas Zimmermann 
> ---
>  drivers/gpu/drm/drm_simple_kms_helper.c | 83 -
>  include/drm/drm_simple_kms_helper.h |  7 +++
>  2 files changed, 87 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_simple_kms_helper.c 
> b/drivers/gpu/drm/drm_simple_kms_helper.c
> index 15fb516ae2d8..745c2f34c42b 100644
> --- a/drivers/gpu/drm/drm_simple_kms_helper.c
> +++ b/drivers/gpu/drm/drm_simple_kms_helper.c
> @@ -26,12 +26,90 @@
>   * entity. Some flexibility for code reuse is provided through a separately
>   * allocated _connector object and supporting optional _bridge
>   * encoder drivers.
> + *
> + * Many drivers use an encoder with an empty implementation. Such encoders
> + * fulfill the minimum requirements of the display pipeline, but don't add
> + * additional functionality. The simple-encoder functions
> + * drm_simple_encoder_init() and drm_simple_encoder_create() provide an
> + * appropriate implementation.
>   */
>  
> -static const struct drm_encoder_funcs drm_simple_kms_encoder_funcs = {
> +static const struct drm_encoder_funcs drm_simple_encoder_funcs_cleanup = {
>   .destroy = drm_encoder_cleanup,
>  };
>  
> +/**
> + * drm_simple_encoder_init - Initialize a preallocated encoder
> + * @dev: drm device
> + * @funcs: callbacks for this encoder
> + * @encoder_type: user visible type of the encoder
> + *
> + * Initialises a preallocated encoder that has no further functionality. The
> + * encoder will be released automatically.
I got confused here. The comment says the encoder is automatically
released. But in this case we have a preallocated encoder (maybe
embedded in ast_private or something.
So the encoder is - as I understnad it - not released. But it is cleaned
up as it is also documented in the next sentences.

Sorry if I am dense, just returned from some travelling...

Sam


Settings for possible CRTC and
> + * clones are left to their initial values. The encoder will be cleaned up
> + * automatically as part of the mode-setting cleanup.
> + *
> + * Also see drm_simple_encoder_create().
> + *
> + * Returns:
> + * Zero on success, error code on failure.
> + */
> +int drm_simple_encoder_init(struct drm_device *dev,
> + struct drm_encoder *encoder,
> + int encoder_type)
> +{
> + return drm_encoder_init(dev, encoder,
> + _simple_encoder_funcs_cleanup,
> + encoder_type, NULL);
> +}
> +EXPORT_SYMBOL(drm_simple_encoder_init);
> +
> +static void drm_encoder_destroy(struct drm_encoder *encoder)
> +{
> + drm_encoder_cleanup(encoder);
> + kfree(encoder);
> +}
> +
> +static const struct drm_encoder_funcs drm_simple_encoder_funcs_destroy = {
> + .destroy = drm_encoder_destroy,
> +};
> +
> +/**
> + * drm_simple_encoder_create - Allocate and initialize an encoder
> + * @dev: drm device
> + * @encoder_type: user visible type of the encoder
> + *
> + * Allocates and initialises an encoder that has no further functionality. 
> The
> + * encoder will be destroyed automatically as part of the mode-setting 
> cleanup.
> + *
> + * See drm_simple_encoder_init() for more information.
> + *
> + * Returns:
> + * The encoder on success, a pointer-encoder error code on failure.
> + */
> +struct drm_encoder *drm_simple_encoder_create(struct drm_device *dev,
> +   int encoder_type)
> +{
> + struct drm_encoder *encoder;
> + int ret;
> +
> + encoder = kzalloc(sizeof(*encoder), GFP_KERNEL);
> + if (!encoder)
> + return ERR_PTR(-ENOMEM);
> + ret = drm_encoder_init(dev, encoder,
> +_simple_encoder_funcs_destroy,
> +encoder_type, NULL);
> + if (ret)
> + goto err_kfree;
> +
> + return encoder;
> +
> +err_kfree:
> + kfree(encoder);
> + return ERR_PTR(ret);
> +}
> +EXPORT_SYMBOL(drm_simple_encoder_create);
> +
>  static enum drm_mode_status
>  drm_simple_kms_crtc_mode_valid(struct drm_crtc *crtc,
>  const struct drm_display_mode *mode)
> @@ -288,8 +366,7 @@ int drm_simple_display_pipe_init(struct drm_device *dev,
>   return ret;
>  
>   encoder->possible_crtcs = 

Re: [PATCH v2 1/4] drm/simple-kms: Add drm_simple_encoder_{init, create}()

2020-02-20 Thread Sam Ravnborg
Hi Thomas.

On Tue, Feb 18, 2020 at 09:48:12AM +0100, Thomas Zimmermann wrote:
> This patch makes the internal encoder implementation of the simple
> KMS helpers available to drivers.
> 
> These simple-encoder helpers initialize an encoder with an empty
> implementation. This covers the requirements of most of the existing
> DRM drivers. A call to drm_simple_encoder_create() allocates and
> initializes an encoder instance, a call to drm_simple_encoder_init()
> initializes a pre-allocated instance.
> 
> v2:
>   * move simple encoder to KMS helpers
>   * remove name argument; simplifies implementation
>   * don't allocate with devm_ interfaces; unsafe with DRM
> 
> Signed-off-by: Thomas Zimmermann 
> ---
>  drivers/gpu/drm/drm_simple_kms_helper.c | 83 -
>  include/drm/drm_simple_kms_helper.h |  7 +++
>  2 files changed, 87 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_simple_kms_helper.c 
> b/drivers/gpu/drm/drm_simple_kms_helper.c
> index 15fb516ae2d8..745c2f34c42b 100644
> --- a/drivers/gpu/drm/drm_simple_kms_helper.c
> +++ b/drivers/gpu/drm/drm_simple_kms_helper.c
> @@ -26,12 +26,90 @@
>   * entity. Some flexibility for code reuse is provided through a separately
>   * allocated _connector object and supporting optional _bridge
>   * encoder drivers.
> + *
> + * Many drivers use an encoder with an empty implementation. Such encoders
> + * fulfill the minimum requirements of the display pipeline, but don't add
> + * additional functionality. The simple-encoder functions
> + * drm_simple_encoder_init() and drm_simple_encoder_create() provide an
> + * appropriate implementation.
This paragraph reads a bit strange to me - I read this as a
justification for addding a generic encoded that can be used by exisitg
drivers.

How about something like this:

Many drivers requires a very simple encoder that only fullfill
the minimum requirements of the display pipeline and do not add
any extra functionslity.
The simple-encoder functions drm_simple_encoder_init() and
drm_simple_encoder_create() provides an impålmentation of such
a simple encoder.
The simple encoder includes automatically release of resources.

And then leave it to the changelog to tell what should be done in
existing drivers.



>   */
>  
> -static const struct drm_encoder_funcs drm_simple_kms_encoder_funcs = {
> +static const struct drm_encoder_funcs drm_simple_encoder_funcs_cleanup = {
>   .destroy = drm_encoder_cleanup,
>  };
>  
> +/**
> + * drm_simple_encoder_init - Initialize a preallocated encoder
> + * @dev: drm device
> + * @funcs: callbacks for this encoder
> + * @encoder_type: user visible type of the encoder
> + *
> + * Initialises a preallocated encoder that has no further functionality. The
> + * encoder will be released automatically. Settings for possible CRTC and
> + * clones are left to their initial values. The encoder will be cleaned up
> + * automatically as part of the mode-setting cleanup.
> + *
> + * Also see drm_simple_encoder_create().
s/Also see/See also/??


> + *
> + * Returns:
> + * Zero on success, error code on failure.
> + */
> +int drm_simple_encoder_init(struct drm_device *dev,
> + struct drm_encoder *encoder,
> + int encoder_type)
> +{
> + return drm_encoder_init(dev, encoder,
> + _simple_encoder_funcs_cleanup,
> + encoder_type, NULL);
> +}
> +EXPORT_SYMBOL(drm_simple_encoder_init);
> +
> +static void drm_encoder_destroy(struct drm_encoder *encoder)
> +{
> + drm_encoder_cleanup(encoder);
> + kfree(encoder);
> +}
> +
> +static const struct drm_encoder_funcs drm_simple_encoder_funcs_destroy = {
> + .destroy = drm_encoder_destroy,
> +};
> +
> +/**
> + * drm_simple_encoder_create - Allocate and initialize an encoder
> + * @dev: drm device
> + * @encoder_type: user visible type of the encoder
> + *
> + * Allocates and initialises an encoder that has no further functionality. 
> The
> + * encoder will be destroyed automatically as part of the mode-setting 
> cleanup.
> + *
> + * See drm_simple_encoder_init() for more information.
> + *
> + * Returns:
> + * The encoder on success, a pointer-encoder error code on failure.
pointer-encoded?



> + */
> +struct drm_encoder *drm_simple_encoder_create(struct drm_device *dev,
> +   int encoder_type)
> +{
> + struct drm_encoder *encoder;
> + int ret;
> +
> + encoder = kzalloc(sizeof(*encoder), GFP_KERNEL);
> + if (!encoder)
> + return ERR_PTR(-ENOMEM);
> + ret = drm_encoder_init(dev, encoder,
> +_simple_encoder_funcs_destroy,
> +encoder_type, NULL);
> + if (ret)
> + goto err_kfree;
> +
> + return encoder;
> +
> +err_kfree:
> + kfree(encoder);
> + return ERR_PTR(ret);
> +}
> 

Re: [PATCH 1/2] mm: move force_dma_unencrypted() to mem_encrypt.h

2020-02-20 Thread Christian Borntraeger



On 20.02.20 17:31, Christoph Hellwig wrote:
> On Thu, Feb 20, 2020 at 05:23:20PM +0100, Christian Borntraeger wrote:
>> >From a users perspective it makes absolutely perfect sense to use the
>> bounce buffers when they are NEEDED. 
>> Forcing the user to specify iommu_platform just because you need bounce 
>> buffers
>> really feels wrong. And obviously we have a severe performance issue
>> because of the indirections.
> 
> The point is that the user should not have to specify iommu_platform.

I (and Halil) agree on that. From a user perspective we want to
have the right thing in the guest without any command option. The iommu_plaform
thing was indeed a first step to make things work. 

I might mis-read Halils patch, but isnt one effect of this patch set 
that things WILL work without iommu_platform?


> We need to make sure any new hypervisor (especially one that might require
> bounce buffering) always sets it, as was a rather bogus legacy hack
> that isn't extensibe for cases that for example require bounce buffering.

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH 1/2] mm: move force_dma_unencrypted() to mem_encrypt.h

2020-02-20 Thread Christoph Hellwig
On Thu, Feb 20, 2020 at 05:23:20PM +0100, Christian Borntraeger wrote:
> >From a users perspective it makes absolutely perfect sense to use the
> bounce buffers when they are NEEDED. 
> Forcing the user to specify iommu_platform just because you need bounce 
> buffers
> really feels wrong. And obviously we have a severe performance issue
> because of the indirections.

The point is that the user should not have to specify iommu_platform.
We need to make sure any new hypervisor (especially one that might require
bounce buffering) always sets it, as was a rather bogus legacy hack
that isn't extensibe for cases that for example require bounce buffering.
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH 1/2] mm: move force_dma_unencrypted() to mem_encrypt.h

2020-02-20 Thread Christian Borntraeger



On 20.02.20 17:11, Christoph Hellwig wrote:
> On Thu, Feb 20, 2020 at 05:06:05PM +0100, Halil Pasic wrote:
>> Currently force_dma_unencrypted() is only used by the direct
>> implementation of the DMA API, and thus resides in dma-direct.h. But
>> there is nothing dma-direct specific about it: if one was -- for
>> whatever reason -- to implement custom DMA ops that have to in the
>> encrypted/protected scenarios dma-direct currently deals with, one would
>> need exactly this kind of information.
> 
> I really don't think it has business being anywhre else, and your completely
> bogus second patch just proves the point.

>From a users perspective it makes absolutely perfect sense to use the
bounce buffers when they are NEEDED. 
Forcing the user to specify iommu_platform just because you need bounce buffers
really feels wrong. And obviously we have a severe performance issue
because of the indirections.

Now: I understand that you want to get this fixes differently, but maybe you 
could help to outline how this could be fixed proper. 

Christian

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH 2/2] virtio: let virtio use DMA API when guest RAM is protected

2020-02-20 Thread Christoph Hellwig
On Thu, Feb 20, 2020 at 05:06:06PM +0100, Halil Pasic wrote:
> diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> index 867c7ebd3f10..fafc8f924955 100644
> --- a/drivers/virtio/virtio_ring.c
> +++ b/drivers/virtio/virtio_ring.c
> @@ -243,6 +243,9 @@ static bool vring_use_dma_api(struct virtio_device *vdev)
>   if (!virtio_has_iommu_quirk(vdev))
>   return true;
>  
> + if (force_dma_unencrypted(>dev))
> + return true;

Hell no.  This is a detail of the platform DMA direct implementation.
Drivers have no business looking at this flag, and virtio finally needs
to be fixed to use the DMA API properly for everything but legacy devices.

No amount of desparate hacks is going to fix that fundamental problem,
and I'm starting to get really sick and tired of all the crap patches
published in this area.
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH 1/2] mm: move force_dma_unencrypted() to mem_encrypt.h

2020-02-20 Thread Christoph Hellwig
On Thu, Feb 20, 2020 at 05:06:05PM +0100, Halil Pasic wrote:
> Currently force_dma_unencrypted() is only used by the direct
> implementation of the DMA API, and thus resides in dma-direct.h. But
> there is nothing dma-direct specific about it: if one was -- for
> whatever reason -- to implement custom DMA ops that have to in the
> encrypted/protected scenarios dma-direct currently deals with, one would
> need exactly this kind of information.

I really don't think it has business being anywhre else, and your completely
bogus second patch just proves the point.
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH 1/2] mm: move force_dma_unencrypted() to mem_encrypt.h

2020-02-20 Thread Halil Pasic
Currently force_dma_unencrypted() is only used by the direct
implementation of the DMA API, and thus resides in dma-direct.h. But
there is nothing dma-direct specific about it: if one was -- for
whatever reason -- to implement custom DMA ops that have to in the
encrypted/protected scenarios dma-direct currently deals with, one would
need exactly this kind of information.

More importantly, virtio has to use DMA API (so that the memory
encryption (x86) or protection (power, s390) is handled) under the very
same circumstances force_dma_unencrypted() returns true. Furthermore,
the in these cases the reason to go via the DMA API is distinct,
compared to the reason indicated by VIRTIO_F_IOMMU_PLATFORM: we need to
use DMA API independently of the device's properties with regards to
access to memory. I.e. the addresses in the descriptors are still guest
physical addresses, the device may still be implemented by a SMP CPU,
and thus the device shall use those without any further translation. See
[1].

Let's move force_dma_unencrypted() the so virtio, or other
implementations of DMA ops can make the right decisions.

[1] 
https://docs.oasis-open.org/virtio/virtio/v1.1/cs01/virtio-v1.1-cs01.html#x1-416
(In the spec VIRTIO_F_IOMMU_PLATFORM is called
VIRTIO_F_ACCESS_PLATFORM).

Signed-off-by: Halil Pasic 
Tested-by: Ram Pai 
Tested-by: Michael Mueller 
---
 include/linux/dma-direct.h  |  9 -
 include/linux/mem_encrypt.h | 10 ++
 2 files changed, 10 insertions(+), 9 deletions(-)

diff --git a/include/linux/dma-direct.h b/include/linux/dma-direct.h
index 24b8684aa21d..590b15d881b0 100644
--- a/include/linux/dma-direct.h
+++ b/include/linux/dma-direct.h
@@ -26,15 +26,6 @@ static inline phys_addr_t __dma_to_phys(struct device *dev, 
dma_addr_t dev_addr)
 }
 #endif /* !CONFIG_ARCH_HAS_PHYS_TO_DMA */
 
-#ifdef CONFIG_ARCH_HAS_FORCE_DMA_UNENCRYPTED
-bool force_dma_unencrypted(struct device *dev);
-#else
-static inline bool force_dma_unencrypted(struct device *dev)
-{
-   return false;
-}
-#endif /* CONFIG_ARCH_HAS_FORCE_DMA_UNENCRYPTED */
-
 /*
  * If memory encryption is supported, phys_to_dma will set the memory 
encryption
  * bit in the DMA address, and dma_to_phys will clear it.  The raw 
__phys_to_dma
diff --git a/include/linux/mem_encrypt.h b/include/linux/mem_encrypt.h
index 5c4a18a91f89..64a48c4b01a2 100644
--- a/include/linux/mem_encrypt.h
+++ b/include/linux/mem_encrypt.h
@@ -22,6 +22,16 @@ static inline bool mem_encrypt_active(void) { return false; }
 
 #endif /* CONFIG_ARCH_HAS_MEM_ENCRYPT */
 
+struct device;
+#ifdef CONFIG_ARCH_HAS_FORCE_DMA_UNENCRYPTED
+bool force_dma_unencrypted(struct device *dev);
+#else
+static inline bool force_dma_unencrypted(struct device *dev)
+{
+   return false;
+}
+#endif /* CONFIG_ARCH_HAS_FORCE_DMA_UNENCRYPTED */
+
 #ifdef CONFIG_AMD_MEM_ENCRYPT
 /*
  * The __sme_set() and __sme_clr() macros are useful for adding or removing
-- 
2.17.1

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH 0/2] virtio: decouple protected guest RAM form VIRTIO_F_IOMMU_PLATFORM

2020-02-20 Thread Halil Pasic
Currently if one intends to run a memory protection enabled VM with
virtio devices and linux as the guest OS, one needs to specify the
VIRTIO_F_IOMMU_PLATFORM flag for each virtio device to make the guest
linux use the DMA API, which in turn handles the memory
encryption/protection stuff if the guest decides to turn itself into
a protected one. This however makes no sense due to multiple reasons:
* The device is not changed by the fact that the guest RAM is
protected. The so called IOMMU bypass quirk is not affected.
* This usage is not congruent with  standardised semantics of
VIRTIO_F_IOMMU_PLATFORM. Guest memory protected is an orthogonal reason
for using DMA API in virtio (orthogonal with respect to what is
expressed by VIRTIO_F_IOMMU_PLATFORM). 

This series aims to decouple 'have to use DMA API because my (guest) RAM
is protected' and 'have to use DMA API because the device told me
VIRTIO_F_IOMMU_PLATFORM'.

Please find more detailed explanations about the conceptual aspects in
the individual patches. There is however also a very practical problem
that is addressed by this series. 

For vhost-net the feature VIRTIO_F_IOMMU_PLATFORM has the following side
effect The vhost code assumes it the addresses on the virtio descriptor
ring are not guest physical addresses but iova's, and insists on doing a
translation of these regardless of what transport is used (e.g. whether
we emulate a PCI or a CCW device). (For details see commit 6b1e6cc7855b
"vhost: new device IOTLB API".) On s390 this results in severe
performance degradation (c.a. factor 10). BTW with ccw I/O there is
(architecturally) no IOMMU, so the whole address translation makes no
sense in the context of virtio-ccw.

Halil Pasic (2):
  mm: move force_dma_unencrypted() to mem_encrypt.h
  virtio: let virtio use DMA API when guest RAM is protected

 drivers/virtio/virtio_ring.c |  3 +++
 include/linux/dma-direct.h   |  9 -
 include/linux/mem_encrypt.h  | 10 ++
 3 files changed, 13 insertions(+), 9 deletions(-)


base-commit: ca7e1fd1026c5af6a533b4b5447e1d2f153e28f2
-- 
2.17.1

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH 2/2] virtio: let virtio use DMA API when guest RAM is protected

2020-02-20 Thread Halil Pasic
Currently the advanced guest memory protection technologies (AMD SEV,
powerpc secure guest technology and s390 Protected VMs) abuse the
VIRTIO_F_IOMMU_PLATFORM flag to make virtio core use the DMA API, which
is in turn necessary, to make IO work with guest memory protection.

But VIRTIO_F_IOMMU_PLATFORM a.k.a. VIRTIO_F_ACCESS_PLATFORM is really a
different beast: with virtio devices whose implementation runs on an SMP
CPU we are still fine with doing all the usual optimizations, it is just
that we need to make sure that the memory protection mechanism does not
get in the way. The VIRTIO_F_ACCESS_PLATFORM mandates more work on the
side of the guest (and possibly he host side as well) than we actually
need.

An additional benefit of teaching the guest to make the right decision
(and use DMA API) on it's own is: removing the need, to mandate special
VM configuration for guests that may run with protection. This is
especially interesting for s390 as VIRTIO_F_IOMMU_PLATFORM pushes all
the virtio control structures into the first 2G of guest memory:
something we don't necessarily want to do per-default.

Signed-off-by: Halil Pasic 
Tested-by: Ram Pai 
Tested-by: Michael Mueller 
---
 drivers/virtio/virtio_ring.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index 867c7ebd3f10..fafc8f924955 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -243,6 +243,9 @@ static bool vring_use_dma_api(struct virtio_device *vdev)
if (!virtio_has_iommu_quirk(vdev))
return true;
 
+   if (force_dma_unencrypted(>dev))
+   return true;
+
/* Otherwise, we are left to guess. */
/*
 * In theory, it's possible to have a buggy QEMU-supposed
-- 
2.17.1

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH V4 3/5] vDPA: introduce vDPA bus

2020-02-20 Thread Jason Gunthorpe
On Thu, Feb 20, 2020 at 02:11:39PM +0800, Jason Wang wrote:
> vDPA device is a device that uses a datapath which complies with the
> virtio specifications with vendor specific control path. vDPA devices
> can be both physically located on the hardware or emulated by
> software. vDPA hardware devices are usually implemented through PCIE
> with the following types:
> 
> - PF (Physical Function) - A single Physical Function
> - VF (Virtual Function) - Device that supports single root I/O
>   virtualization (SR-IOV). Its Virtual Function (VF) represents a
>   virtualized instance of the device that can be assigned to different
>   partitions
> - ADI (Assignable Device Interface) and its equivalents - With
>   technologies such as Intel Scalable IOV, a virtual device (VDEV)
>   composed by host OS utilizing one or more ADIs. Or its equivalent
>   like SF (Sub function) from Mellanox.
> 
> From a driver's perspective, depends on how and where the DMA
> translation is done, vDPA devices are split into two types:
> 
> - Platform specific DMA translation - From the driver's perspective,
>   the device can be used on a platform where device access to data in
>   memory is limited and/or translated. An example is a PCIE vDPA whose
>   DMA request was tagged via a bus (e.g PCIE) specific way. DMA
>   translation and protection are done at PCIE bus IOMMU level.
> - Device specific DMA translation - The device implements DMA
>   isolation and protection through its own logic. An example is a vDPA
>   device which uses on-chip IOMMU.
> 
> To hide the differences and complexity of the above types for a vDPA
> device/IOMMU options and in order to present a generic virtio device
> to the upper layer, a device agnostic framework is required.
> 
> This patch introduces a software vDPA bus which abstracts the
> common attributes of vDPA device, vDPA bus driver and the
> communication method (vdpa_config_ops) between the vDPA device
> abstraction and the vDPA bus driver. This allows multiple types of
> drivers to be used for vDPA device like the virtio_vdpa and vhost_vdpa
> driver to operate on the bus and allow vDPA device could be used by
> either kernel virtio driver or userspace vhost drivers as:
> 
>virtio drivers  vhost drivers
>   | |
> [virtio bus]   [vhost uAPI]
>   | |
>virtio device   vhost device
>virtio_vdpa drv vhost_vdpa drv
>  \   /
> [vDPA bus]
>  |
> vDPA device
> hardware drv
>  |
> [hardware bus]
>  |
> vDPA hardware

I still don't like this strange complexity, vhost should have been
layered on top of the virtio device instead of adding an extra bus
just for vdpa.

However, I don't see any technical problems with this patch now.

Thanks,
Jason
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH V4 5/5] vdpasim: vDPA device simulator

2020-02-20 Thread Jason Gunthorpe
On Thu, Feb 20, 2020 at 02:11:41PM +0800, Jason Wang wrote:
> +static void vdpasim_device_release(struct device *dev)
> +{
> + struct vdpasim *vdpasim = dev_to_sim(dev);
> +
> + cancel_work_sync(>work);
> + kfree(vdpasim->buffer);
> + vhost_iotlb_free(vdpasim->iommu);
> + kfree(vdpasim);
> +}
> +
> +static struct vdpasim *vdpasim_create(void)
> +{
> + struct virtio_net_config *config;
> + struct vhost_iotlb *iommu;
> + struct vdpasim *vdpasim;
> + struct device *dev;
> + void *buffer;
> + int ret = -ENOMEM;
> +
> + iommu = vhost_iotlb_alloc(2048, 0);
> + if (!iommu)
> + goto err;
> +
> + buffer = kmalloc(PAGE_SIZE, GFP_KERNEL);
> + if (!buffer)
> + goto err_buffer;
> +
> + vdpasim = kzalloc(sizeof(*vdpasim), GFP_KERNEL);
> + if (!vdpasim)
> + goto err_alloc;
> +
> + vdpasim->buffer = buffer;
> + vdpasim->iommu = iommu;
> +
> + config = >config;
> + config->mtu = 1500;
> + config->status = VIRTIO_NET_S_LINK_UP;
> + eth_random_addr(config->mac);
> +
> + INIT_WORK(>work, vdpasim_work);
> + spin_lock_init(>lock);
> +
> + vringh_set_iotlb(>vqs[0].vring, vdpasim->iommu);
> + vringh_set_iotlb(>vqs[1].vring, vdpasim->iommu);
> +
> + dev = >dev;
> + dev->release = vdpasim_device_release;
> + dev->coherent_dma_mask = DMA_BIT_MASK(64);
> + set_dma_ops(dev, _dma_ops);
> + dev_set_name(dev, "%s", VDPASIM_NAME);
> +
> + ret = device_register(>dev);
> + if (ret)
> + goto err_init;

It is a bit weird to be creating this dummy parent, couldn't this be
done by just passing a NULL parent to vdpa_alloc_device, doing
set_dma_ops() on the vdpasim->vdpa->dev and setting dma_device to
vdpasim->vdpa->dev ?

> + vdpasim->vdpa = vdpa_alloc_device(dev, dev, _net_config_ops);
> + if (ret)
> + goto err_vdpa;

> + ret = vdpa_register_device(vdpasim->vdpa);
> + if (ret)
> + goto err_register;
> +
> + return vdpasim;
> +
> +err_register:
> + put_device(>vdpa->dev);
> +err_vdpa:
> + device_del(>dev);
> + goto err;
> +err_init:
> + put_device(>dev);
> + goto err;

If you do the vdmasim alloc first, and immediately do
device_initialize() then all the failure paths can do put_device
instead of having this ugly goto unwind split. Just check for
vdpasim->iommu == NULL during release.

> +static int __init vdpasim_dev_init(void)
> +{
> + vdpasim_dev = vdpasim_create();
> +
> + if (!IS_ERR(vdpasim_dev))
> + return 0;
> +
> + return PTR_ERR(vdpasim_dev);
> +}
> +
> +static int vdpasim_device_remove_cb(struct device *dev, void *data)
> +{
> + struct vdpa_device *vdpa = dev_to_vdpa(dev);
> +
> + vdpa_unregister_device(vdpa);
> +
> + return 0;
> +}
> +
> +static void __exit vdpasim_dev_exit(void)
> +{
> + device_for_each_child(_dev->dev, NULL,
> +   vdpasim_device_remove_cb);

Why the loop? There is only one device, and it is in the global
varaible vdmasim_dev ?

Jason
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH V4 4/5] virtio: introduce a vDPA based transport

2020-02-20 Thread Jason Gunthorpe
On Thu, Feb 20, 2020 at 02:11:40PM +0800, Jason Wang wrote:
> +static int virtio_vdpa_probe(struct vdpa_device *vdpa)
> +{
> + const struct vdpa_config_ops *ops = vdpa->config;
> + struct virtio_vdpa_device *vd_dev;
> + int ret = -EINVAL;
> +
> + vd_dev = kzalloc(sizeof(*vd_dev), GFP_KERNEL);
> + if (!vd_dev)
> + return -ENOMEM;
> +
> + vd_dev->vdev.dev.parent = vdpa_get_dma_dev(vdpa);
> + vd_dev->vdev.dev.release = virtio_vdpa_release_dev;
> + vd_dev->vdev.config = _vdpa_config_ops;
> + vd_dev->vdpa = vdpa;
> + INIT_LIST_HEAD(_dev->virtqueues);
> + spin_lock_init(_dev->lock);
> +
> + vd_dev->vdev.id.device = ops->get_device_id(vdpa);
> + if (vd_dev->vdev.id.device == 0)
> + goto err;
> +
> + vd_dev->vdev.id.vendor = ops->get_vendor_id(vdpa);
> + ret = register_virtio_device(_dev->vdev);
> + if (ret)
> + goto err;

This error unwind is wrong. register_virtio_device() does
device_initialize() as it's first action. After that point error
unwind must be done with put_device() - particularly calling
kfree(vd_dev) after doing dev_set_name() leaks memory.

Looks like about half of the register_virtio_device() users did this
right, the others not. Perhaps you should fix them too...

Jason
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [RESEND PATCH v2 9/9] ath5k: Constify ioreadX() iomem argument (as in generic implementation)

2020-02-20 Thread Jiri Slaby
On 19. 02. 20, 18:50, Krzysztof Kozlowski wrote:
> The ioreadX() helpers have inconsistent interface.  On some architectures
> void *__iomem address argument is a pointer to const, on some not.
> 
> Implementations of ioreadX() do not modify the memory under the address
> so they can be converted to a "const" version for const-safety and
> consistency among architectures.
> 
> Signed-off-by: Krzysztof Kozlowski 
> Acked-by: Kalle Valo 
> ---
>  drivers/net/wireless/ath/ath5k/ahb.c | 10 +-
>  1 file changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/net/wireless/ath/ath5k/ahb.c 
> b/drivers/net/wireless/ath/ath5k/ahb.c
> index 2c9cec8b53d9..8bd01df369fb 100644
> --- a/drivers/net/wireless/ath/ath5k/ahb.c
> +++ b/drivers/net/wireless/ath/ath5k/ahb.c
> @@ -138,18 +138,18 @@ static int ath_ahb_probe(struct platform_device *pdev)
>  
>   if (bcfg->devid >= AR5K_SREV_AR2315_R6) {
>   /* Enable WMAC AHB arbitration */
> - reg = ioread32((void __iomem *) AR5K_AR2315_AHB_ARB_CTL);
> + reg = ioread32((const void __iomem *) AR5K_AR2315_AHB_ARB_CTL);

While I understand why the parameter of ioread32 should be const, I
don't see a reason for these casts on the users' side. What does it
bring except longer code to read?

thanks,
-- 
js
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH] virtio_balloon: Fix build error seen with CONFIG_BALLOON_COMPACTION=n

2020-02-20 Thread David Hildenbrand


> Am 20.02.2020 um 03:32 schrieb Guenter Roeck :
> 
> 0day reports:
> 
> drivers//virtio/virtio_balloon.c: In function 'virtballoon_probe':
> drivers//virtio/virtio_balloon.c:960:1: error:
>label 'out_del_vqs' defined but not used [-Werror=unused-label]
> 
> This is seen with CONFIG_BALLOON_COMPACTION=n.
> 
> Reported-by: kbuild test robot 
> Fixes: 1ad6f58ea936 ("virtio_balloon: Fix memory leaks on errors in 
> virtballoon_probe()")
> Cc: David Hildenbrand 
> Cc: Michael S. Tsirkin 
> Signed-off-by: Guenter Roeck 
> ---

There is already a patch on the list to fix that (and it looks exactly like 
this one :) ). Unfortunately, not picked up yet.

Thanks!

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization