date:20220505

Prior series have transformed other parts of VFIO from working on struct
device or struct vfio_group into working directly on struct
vfio_device. Based on that work we now have vfio_devices readily
available in all the drivers.

Update the rest of the driver facing API to use vfio_device as an input.

The following are switched from struct device to struct vfio_device:
  vfio_register_notifier()
  vfio_unregister_notifier()
  vfio_pin_pages()
  vfio_unpin_pages()
  vfio_dma_rw()

The following group APIs are obsoleted and removed by just using struct
vfio_device with the above:
  vfio_group_pin_pages()
  vfio_group_unpin_pages()
  vfio_group_iommu_domain()
  vfio_group_get_external_user_from_dev()

To retain the performance of the new device APIs relative to their group
versions, optimize how vfio_group_add_container_user() is used: avoid
calling it when the driver must already guarantee that the device is open
and container_users has been incremented.

The remaining exported VFIO group interfaces are only used by kvm, and are
addressed by a parallel series.

This series is based on Christoph's gvt rework here:

 https://lore.kernel.org/all/5a8b9f48-2c32-8177-1c18-e3bd7bfde...@intel.com/

and so will need the PR merged first.

I have a followup series that needs this.

This is also part of the iommufd work - moving the driver facing interface
to vfio_device provides a much cleaner path to integrate with iommufd.

v4:
 - Use 'device' as the argument name for a struct vfio_device in vfio.c
v3: https://lore.kernel.org/r/0-v3-e131a9b6b467+14b6-vfio_mdev_no_group_...@nvidia.com
 - Based on VFIO's gvt/iommu merge
 - Remove mention of mdev_legacy_get_vfio_device() from commit message
 - Clarify commit message for vfio_dma_rw() conversion
 - Talk about the open_count change in the commit message
 - No code change
v2: https://lore.kernel.org/r/0-v2-6011bde8e0a1+5f-vfio_mdev_no_group_...@nvidia.com
 - Based on Christoph's series so mdev_legacy_get_vfio_device() is removed
 - Reflow indenting
 - Use vfio_assert_device_open() and WARN_ON_ONCE instead of open coding
   the assertion
v1: https://lore.kernel.org/r/0-v1-a8faf768d202+125dd-vfio_mdev_no_group_...@nvidia.com

Jason Gunthorpe (7):
  vfio: Make vfio_(un)register_notifier accept a vfio_device
  vfio/ccw: Remove mdev from struct channel_program
  vfio/mdev: Pass in a struct vfio_device * to vfio_pin/unpin_pages()
  vfio/mdev: Pass in a struct vfio_device * to vfio_dma_rw()
  drm/i915/gvt: Change from vfio_group_(un)pin_pages to
    vfio_(un)pin_pages
  vfio: Remove dead code
  vfio: Remove calls to vfio_group_add_container_user()

 .../driver-api/vfio-mediated-device.rst |   4 +-
 drivers/gpu/drm/i915/gvt/gvt.h          |   5 +-
 drivers/gpu/drm/i915/gvt/kvmgt.c        |  51 ++-
 drivers/s390/cio/vfio_ccw_cp.c          |  47 +--
 drivers/s390/cio/vfio_ccw_cp.h          |   4 +-
 drivers/s390/cio/vfio_ccw_fsm.c         |   3 +-
 drivers/s390/cio/vfio_ccw_ops.c         |   7 +-
 drivers/s390/crypto/vfio_ap_ops.c       |  23 +-
 drivers/vfio/vfio.c                     | 299 +++---
 include/linux/vfio.h                    |  21 +-
 10 files changed, 109 insertions(+), 355 deletions(-)


base-commit: 676d7cda1a3c19872428a9bc818577a1aafafdd5
-- 
2.36.0

[PATCH v4 4/7] vfio/mdev: Pass in a struct vfio_device * to vfio_dma_rw()

Every caller has a readily available vfio_device pointer, use that instead
of passing in a generic struct device. Change vfio_dma_rw() to take in the
struct vfio_device and move the container users that would have been held
by vfio_group_get_external_user_from_dev() to vfio_dma_rw() directly, like
vfio_pin/unpin_pages().
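
As a rough illustration of the pattern described above, the following
userspace sketch models taking the container user inside the access
function itself, so the caller no longer holds a group reference. All
dma_model_* names and the container_users/has_container fields are
hypothetical stand-ins, not the kernel structures, and the error value -1
stands in for the kernel's negative errno.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, trimmed models; these are not the kernel structures. */
struct dma_model_group {
	int has_container;    /* non-zero once a container is attached */
	int container_users;  /* users currently relying on the container */
};

struct dma_model_device {
	struct dma_model_group *group; /* the device already knows its group */
};

/* Models vfio_group_add_container_user(): fails without a container. */
static int dma_model_add_container_user(struct dma_model_group *group)
{
	if (!group->has_container)
		return -1; /* the kernel returns a negative errno here */
	group->container_users++;
	return 0;
}

/* Models dropping the user again once the access is finished. */
static void dma_model_drop_container_user(struct dma_model_group *group)
{
	group->container_users--;
}

/* Take the container user only for the duration of the access. */
static int dma_model_rw(struct dma_model_device *device, void *data,
			size_t len)
{
	int ret;

	if (!data || !len)
		return -1;

	ret = dma_model_add_container_user(device->group);
	if (ret)
		return ret;

	/* ... the IOMMU driver's dma_rw op would run here ... */

	dma_model_drop_container_user(device->group);
	return 0;
}
```

In this model the user count is unchanged after a successful call, which
is the property the caller used to have to maintain by hand.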

Reviewed-by: Christoph Hellwig 
Reviewed-by: Kevin Tian 
Signed-off-by: Jason Gunthorpe 
---
 drivers/gpu/drm/i915/gvt/gvt.h |  4 ++--
 drivers/vfio/vfio.c            | 24 +++-
 include/linux/vfio.h           |  2 +-
 3 files changed, 14 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/i915/gvt/gvt.h b/drivers/gpu/drm/i915/gvt/gvt.h
index 03ecffc2ba56a9..5a28ee965b7f3e 100644
--- a/drivers/gpu/drm/i915/gvt/gvt.h
+++ b/drivers/gpu/drm/i915/gvt/gvt.h
@@ -732,7 +732,7 @@ static inline int intel_gvt_read_gpa(struct intel_vgpu *vgpu, unsigned long gpa,
 {
if (!vgpu->attached)
return -ESRCH;
-   return vfio_dma_rw(vgpu->vfio_group, gpa, buf, len, false);
+   return vfio_dma_rw(&vgpu->vfio_device, gpa, buf, len, false);
 }
 
 /**
@@ -750,7 +750,7 @@ static inline int intel_gvt_write_gpa(struct intel_vgpu *vgpu,
 {
if (!vgpu->attached)
return -ESRCH;
-   return vfio_dma_rw(vgpu->vfio_group, gpa, buf, len, true);
+   return vfio_dma_rw(&vgpu->vfio_device, gpa, buf, len, true);
 }
 
 void intel_gvt_debugfs_remove_vgpu(struct intel_vgpu *vgpu);
diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
index 93caab1f29dbd7..85e1304099b8a5 100644
--- a/drivers/vfio/vfio.c
+++ b/drivers/vfio/vfio.c
@@ -2108,32 +2108,28 @@ EXPORT_SYMBOL(vfio_group_unpin_pages);
  * As the read/write of user space memory is conducted via the CPUs and is
  * not a real device DMA, it is not necessary to pin the user space memory.
  *
- * The caller needs to call vfio_group_get_external_user() or
- * vfio_group_get_external_user_from_dev() prior to calling this interface,
- * so as to prevent the VFIO group from disposal in the middle of the call.
- * But it can keep the reference to the VFIO group for several calls into
- * this interface.
- * After finishing using of the VFIO group, the caller needs to release the
- * VFIO group by calling vfio_group_put_external_user().
- *
- * @group [in] : VFIO group
+ * @device [in]: VFIO device
  * @user_iova [in] : base IOVA of a user space buffer
  * @data [in]  : pointer to kernel buffer
  * @len [in]   : kernel buffer length
  * @write  : indicate read or write
  * Return error code on failure or 0 on success.
  */
-int vfio_dma_rw(struct vfio_group *group, dma_addr_t user_iova,
-   void *data, size_t len, bool write)
+int vfio_dma_rw(struct vfio_device *device, dma_addr_t user_iova, void *data,
+   size_t len, bool write)
 {
struct vfio_container *container;
struct vfio_iommu_driver *driver;
int ret = 0;
 
-   if (!group || !data || len <= 0)
+   if (!data || len <= 0)
return -EINVAL;
 
-   container = group->container;
+   ret = vfio_group_add_container_user(device->group);
+   if (ret)
+   return ret;
+
+   container = device->group->container;
driver = container->iommu_driver;
 
if (likely(driver && driver->ops->dma_rw))
@@ -2142,6 +2138,8 @@ int vfio_dma_rw(struct vfio_group *group, dma_addr_t user_iova,
else
ret = -ENOTTY;
 
+   vfio_group_try_dissolve_container(device->group);
+
return ret;
 }
 EXPORT_SYMBOL(vfio_dma_rw);
diff --git a/include/linux/vfio.h b/include/linux/vfio.h
index bddc70f88899c3..8a151025871776 100644
--- a/include/linux/vfio.h
+++ b/include/linux/vfio.h
@@ -161,7 +161,7 @@ extern int vfio_group_pin_pages(struct vfio_group *group,
 extern int vfio_group_unpin_pages(struct vfio_group *group,
  unsigned long *user_iova_pfn, int npage);
 
-extern int vfio_dma_rw(struct vfio_group *group, dma_addr_t user_iova,
+extern int vfio_dma_rw(struct vfio_device *device, dma_addr_t user_iova,
   void *data, size_t len, bool write);
 
 extern struct iommu_domain *vfio_group_iommu_domain(struct vfio_group *group);
-- 
2.36.0

[PATCH v4 6/7] vfio: Remove dead code

Now that callers have been updated to use the vfio_device APIs the driver
facing group interface is no longer used, delete it:

- vfio_group_get_external_user_from_dev()
- vfio_group_pin_pages()
- vfio_group_unpin_pages()
- vfio_group_iommu_domain()

Reviewed-by: Christoph Hellwig 
Reviewed-by: Kevin Tian 
Signed-off-by: Jason Gunthorpe 
--
FIXME: vfio_group_put_external_user() is removable too when combined with the 
KVM series
---
 drivers/vfio/vfio.c  | 151 ---
 include/linux/vfio.h |  11 
 2 files changed, 162 deletions(-)

diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
index 85e1304099b8a5..c651c4805acd59 100644
--- a/drivers/vfio/vfio.c
+++ b/drivers/vfio/vfio.c
@@ -1732,44 +1732,6 @@ struct vfio_group *vfio_group_get_external_user(struct file *filep)
 }
 EXPORT_SYMBOL_GPL(vfio_group_get_external_user);
 
-/*
- * External user API, exported by symbols to be linked dynamically.
- * The external user passes in a device pointer
- * to verify that:
- * - A VFIO group is assiciated with the device;
- * - IOMMU is set for the group.
- * If both checks passed, vfio_group_get_external_user_from_dev()
- * increments the container user counter to prevent the VFIO group
- * from disposal before external user exits and returns the pointer
- * to the VFIO group.
- *
- * When the external user finishes using the VFIO group, it calls
- * vfio_group_put_external_user() to release the VFIO group and
- * decrement the container user counter.
- *
- * @dev [in]   : device
- * Return error PTR or pointer to VFIO group.
- */
-
-struct vfio_group *vfio_group_get_external_user_from_dev(struct device *dev)
-{
-   struct vfio_group *group;
-   int ret;
-
-   group = vfio_group_get_from_dev(dev);
-   if (!group)
-   return ERR_PTR(-ENODEV);
-
-   ret = vfio_group_add_container_user(group);
-   if (ret) {
-   vfio_group_put(group);
-   return ERR_PTR(ret);
-   }
-
-   return group;
-}
-EXPORT_SYMBOL_GPL(vfio_group_get_external_user_from_dev);
-
 void vfio_group_put_external_user(struct vfio_group *group)
 {
vfio_group_try_dissolve_container(group);
@@ -2003,101 +1965,6 @@ int vfio_unpin_pages(struct vfio_device *device, unsigned long *user_pfn,
 }
 EXPORT_SYMBOL(vfio_unpin_pages);
 
-/*
- * Pin a set of guest IOVA PFNs and return their associated host PFNs for a
- * VFIO group.
- *
- * The caller needs to call vfio_group_get_external_user() or
- * vfio_group_get_external_user_from_dev() prior to calling this interface,
- * so as to prevent the VFIO group from disposal in the middle of the call.
- * But it can keep the reference to the VFIO group for several calls into
- * this interface.
- * After finishing using of the VFIO group, the caller needs to release the
- * VFIO group by calling vfio_group_put_external_user().
- *
- * @group [in] : VFIO group
- * @user_iova_pfn [in] : array of user/guest IOVA PFNs to be pinned.
- * @npage [in] : count of elements in user_iova_pfn array.
- *   This count should not be greater
- *   VFIO_PIN_PAGES_MAX_ENTRIES.
- * @prot [in]  : protection flags
- * @phys_pfn [out] : array of host PFNs
- * Return error or number of pages pinned.
- */
-int vfio_group_pin_pages(struct vfio_group *group,
-unsigned long *user_iova_pfn, int npage,
-int prot, unsigned long *phys_pfn)
-{
-   struct vfio_container *container;
-   struct vfio_iommu_driver *driver;
-   int ret;
-
-   if (!group || !user_iova_pfn || !phys_pfn || !npage)
-   return -EINVAL;
-
-   if (group->dev_counter > 1)
-   return -EINVAL;
-
-   if (npage > VFIO_PIN_PAGES_MAX_ENTRIES)
-   return -E2BIG;
-
-   container = group->container;
-   driver = container->iommu_driver;
-   if (likely(driver && driver->ops->pin_pages))
-   ret = driver->ops->pin_pages(container->iommu_data,
-group->iommu_group, user_iova_pfn,
-npage, prot, phys_pfn);
-   else
-   ret = -ENOTTY;
-
-   return ret;
-}
-EXPORT_SYMBOL(vfio_group_pin_pages);
-
-/*
- * Unpin a set of guest IOVA PFNs for a VFIO group.
- *
- * The caller needs to call vfio_group_get_external_user() or
- * vfio_group_get_external_user_from_dev() prior to calling this interface,
- * so as to prevent the VFIO group from disposal in the middle of the call.
- * But it can keep the reference to the VFIO group for several calls into
- * this interface.
- * After finishing using of the VFIO group, the caller needs to release the
- * VFIO group by calling vfio_group_put_external_user().
- *
- * @group [in] : vfio group
- * @user_iova_pfn [in] : array of user/guest IOVA PFNs to be unpinned.
- * @npage [in] : count of elements in user_iova_pfn array.
- *

[PATCH v4 2/7] vfio/ccw: Remove mdev from struct channel_program

The next patch wants the vfio_device instead. There is no reason to store
a pointer here since we can container_of back to the vfio_device.
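
The container_of step can be sketched in userspace as follows. This is
only an illustration: the structures are heavily trimmed, hypothetical
versions of the real ones, the container_of() macro is a simplified
stand-in for the kernel's, and cp_to_vdev() is a helper name invented for
this example.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace stand-in for the kernel's container_of() macro. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical, heavily trimmed versions of the real structures. */
struct vfio_device {
	int id;
};

struct channel_program {
	int initialized;
};

/* Both members are embedded, so either one locates the other. */
struct vfio_ccw_private {
	struct vfio_device vdev;
	struct channel_program cp;
};

/* Recover the vfio_device from a pointer to the embedded channel_program. */
static struct vfio_device *cp_to_vdev(struct channel_program *cp)
{
	return &container_of(cp, struct vfio_ccw_private, cp)->vdev;
}
```

Because the channel_program is embedded in the private structure, the
lookup is plain pointer arithmetic and no stored back-pointer is needed.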

Reviewed-by: Eric Farman 
Signed-off-by: Jason Gunthorpe 
---
 drivers/s390/cio/vfio_ccw_cp.c  | 47 -
 drivers/s390/cio/vfio_ccw_cp.h  |  4 +--
 drivers/s390/cio/vfio_ccw_fsm.c |  3 +--
 3 files changed, 30 insertions(+), 24 deletions(-)

diff --git a/drivers/s390/cio/vfio_ccw_cp.c b/drivers/s390/cio/vfio_ccw_cp.c
index 8d1b2771c1aa02..7a1cf3091cd647 100644
--- a/drivers/s390/cio/vfio_ccw_cp.c
+++ b/drivers/s390/cio/vfio_ccw_cp.c
@@ -16,6 +16,7 @@
 #include 
 
 #include "vfio_ccw_cp.h"
+#include "vfio_ccw_private.h"
 
 struct pfn_array {
/* Starting guest physical I/O address. */
@@ -98,17 +99,17 @@ static int pfn_array_alloc(struct pfn_array *pa, u64 iova, unsigned int len)
  * If the pin request partially succeeds, or fails completely,
  * all pages are left unpinned and a negative error value is returned.
  */
-static int pfn_array_pin(struct pfn_array *pa, struct device *mdev)
+static int pfn_array_pin(struct pfn_array *pa, struct vfio_device *vdev)
 {
int ret = 0;
 
-   ret = vfio_pin_pages(mdev, pa->pa_iova_pfn, pa->pa_nr,
+   ret = vfio_pin_pages(vdev->dev, pa->pa_iova_pfn, pa->pa_nr,
 IOMMU_READ | IOMMU_WRITE, pa->pa_pfn);
 
if (ret < 0) {
goto err_out;
} else if (ret > 0 && ret != pa->pa_nr) {
-   vfio_unpin_pages(mdev, pa->pa_iova_pfn, ret);
+   vfio_unpin_pages(vdev->dev, pa->pa_iova_pfn, ret);
ret = -EINVAL;
goto err_out;
}
@@ -122,11 +123,11 @@ static int pfn_array_pin(struct pfn_array *pa, struct device *mdev)
 }
 
 /* Unpin the pages before releasing the memory. */
-static void pfn_array_unpin_free(struct pfn_array *pa, struct device *mdev)
+static void pfn_array_unpin_free(struct pfn_array *pa, struct vfio_device *vdev)
 {
/* Only unpin if any pages were pinned to begin with */
if (pa->pa_nr)
-   vfio_unpin_pages(mdev, pa->pa_iova_pfn, pa->pa_nr);
+   vfio_unpin_pages(vdev->dev, pa->pa_iova_pfn, pa->pa_nr);
pa->pa_nr = 0;
kfree(pa->pa_iova_pfn);
 }
@@ -190,8 +191,7 @@ static void convert_ccw0_to_ccw1(struct ccw1 *source, unsigned long len)
  * Within the domain (@mdev), copy @n bytes from a guest physical
  * address (@iova) to a host physical address (@to).
  */
-static long copy_from_iova(struct device *mdev,
-  void *to, u64 iova,
+static long copy_from_iova(struct vfio_device *vdev, void *to, u64 iova,
   unsigned long n)
 {
struct pfn_array pa = {0};
@@ -203,9 +203,9 @@ static long copy_from_iova(struct device *mdev,
if (ret < 0)
return ret;
 
-   ret = pfn_array_pin(&pa, mdev);
+   ret = pfn_array_pin(&pa, vdev);
if (ret < 0) {
-   pfn_array_unpin_free(&pa, mdev);
+   pfn_array_unpin_free(&pa, vdev);
return ret;
}
 
@@ -226,7 +226,7 @@ static long copy_from_iova(struct device *mdev,
break;
}
 
-   pfn_array_unpin_free(&pa, mdev);
+   pfn_array_unpin_free(&pa, vdev);
 
return l;
 }
@@ -423,11 +423,13 @@ static int ccwchain_loop_tic(struct ccwchain *chain,
 
 static int ccwchain_handle_ccw(u32 cda, struct channel_program *cp)
 {
+   struct vfio_device *vdev =
+   &container_of(cp, struct vfio_ccw_private, cp)->vdev;
struct ccwchain *chain;
int len, ret;
 
/* Copy 2K (the most we support today) of possible CCWs */
-   len = copy_from_iova(cp->mdev, cp->guest_cp, cda,
+   len = copy_from_iova(vdev, cp->guest_cp, cda,
 CCWCHAIN_LEN_MAX * sizeof(struct ccw1));
if (len)
return len;
@@ -508,6 +510,8 @@ static int ccwchain_fetch_direct(struct ccwchain *chain,
 int idx,
 struct channel_program *cp)
 {
+   struct vfio_device *vdev =
+   &container_of(cp, struct vfio_ccw_private, cp)->vdev;
struct ccw1 *ccw;
struct pfn_array *pa;
u64 iova;
@@ -526,7 +530,7 @@ static int ccwchain_fetch_direct(struct ccwchain *chain,
if (ccw_is_idal(ccw)) {
/* Read first IDAW to see if it's 4K-aligned or not. */
/* All subsequent IDAws will be 4K-aligned. */
-   ret = copy_from_iova(cp->mdev, &iova, ccw->cda, sizeof(iova));
+   ret = copy_from_iova(vdev, &iova, ccw->cda, sizeof(iova));
if (ret)
return ret;
} else {
@@ -555,7 +559,7 @@ static int ccwchain_fetch_direct(struct ccwchain *chain,
 
if (ccw_is_idal(ccw)) {
/* Copy guest IDAL into host IDAL */
-   ret = copy_from_iova(cp->mdev, idaws, ccw->cda, idal_len);
+   ret =

[PATCH v4 7/7] vfio: Remove calls to vfio_group_add_container_user()

When the open_device() op is called the container_users is incremented and
held incremented until close_device(). Thus, so long as drivers call
functions within their open_device()/close_device() region they do not
need to worry about the container_users.

These functions can all only be called between open_device() and
close_device():

  vfio_pin_pages()
  vfio_unpin_pages()
  vfio_dma_rw()
  vfio_register_notifier()
  vfio_unregister_notifier()

Eliminate the calls to vfio_group_add_container_user() and add
vfio_assert_device_open() to detect driver mis-use. This causes the
close_device() op to check device->open_count so always leave it elevated
while calling the op.
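
The reason open_count has to stay elevated while close_device() runs can
be sketched with a small userspace model. Everything here is hypothetical
and simplified: struct model_device, the model_* helpers and the -1 error
value are stand-ins for illustration only, and the model omits the
WARN_ON_ONCE() and locking of the real code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, trimmed model of a vfio_device; not the kernel structure. */
struct model_device {
	unsigned int open_count;
	unsigned int count_seen_in_close; /* what close_device() observed */
};

/* Mirrors the idea of vfio_assert_device_open(): true only while open. */
static bool model_device_is_open(const struct model_device *dev)
{
	return dev->open_count != 0;
}

/* Stand-in for a driver facing API such as vfio_unpin_pages(). */
static int model_unpin(struct model_device *dev)
{
	if (!model_device_is_open(dev))
		return -1; /* the kernel code returns -EINVAL */
	return 0;
}

/* Stand-in for a driver's close_device() op, which may still unpin. */
static int model_close_device(struct model_device *dev)
{
	dev->count_seen_in_close = dev->open_count;
	return model_unpin(dev);
}

/* Old ordering: decrement first, then call the op on the last close. */
static int release_decrement_first(struct model_device *dev)
{
	int ret = 0;

	if (!--dev->open_count)
		ret = model_close_device(dev);
	return ret;
}

/* New ordering: call the op on the last close, then decrement. */
static int release_decrement_last(struct model_device *dev)
{
	int ret = 0;

	if (dev->open_count == 1)
		ret = model_close_device(dev);
	dev->open_count--;
	return ret;
}
```

With the old ordering the op sees a count of zero, so any assertion-guarded
call made from it would fail; with the new ordering the op still sees the
device as open.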

Reviewed-by: Christoph Hellwig 
Reviewed-by: Kevin Tian 
Signed-off-by: Jason Gunthorpe 
---
 drivers/vfio/vfio.c | 80 ++---
 1 file changed, 17 insertions(+), 63 deletions(-)

diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
index c651c4805acd59..8bb38941c1dfd8 100644
--- a/drivers/vfio/vfio.c
+++ b/drivers/vfio/vfio.c
@@ -1115,6 +1115,12 @@ static int vfio_group_add_container_user(struct vfio_group *group)
 
 static const struct file_operations vfio_device_fops;
 
+/* true if the vfio_device has open_device() called but not close_device() */
+static bool vfio_assert_device_open(struct vfio_device *device)
+{
+   return !WARN_ON_ONCE(!READ_ONCE(device->open_count));
+}
+
 static int vfio_group_get_device_fd(struct vfio_group *group, char *buf)
 {
struct vfio_device *device;
@@ -1329,8 +1335,10 @@ static int vfio_device_fops_release(struct inode *inode, struct file *filep)
struct vfio_device *device = filep->private_data;
 
mutex_lock(&device->dev_set->lock);
-   if (!--device->open_count && device->ops->close_device)
+   vfio_assert_device_open(device);
+   if (device->open_count == 1 && device->ops->close_device)
device->ops->close_device(device);
+   device->open_count--;
mutex_unlock(&device->dev_set->lock);
 
module_put(device->dev->driver->owner);
@@ -1897,7 +1905,8 @@ int vfio_pin_pages(struct vfio_device *device, unsigned long *user_pfn,
struct vfio_iommu_driver *driver;
int ret;
 
-   if (!user_pfn || !phys_pfn || !npage)
+   if (!user_pfn || !phys_pfn || !npage ||
+   !vfio_assert_device_open(device))
return -EINVAL;
 
if (npage > VFIO_PIN_PAGES_MAX_ENTRIES)
@@ -1906,10 +1915,6 @@ int vfio_pin_pages(struct vfio_device *device, unsigned long *user_pfn,
if (group->dev_counter > 1)
return -EINVAL;
 
-   ret = vfio_group_add_container_user(group);
-   if (ret)
-   return ret;
-
container = group->container;
driver = container->iommu_driver;
if (likely(driver && driver->ops->pin_pages))
@@ -1919,8 +1924,6 @@ int vfio_pin_pages(struct vfio_device *device, unsigned long *user_pfn,
else
ret = -ENOTTY;
 
-   vfio_group_try_dissolve_container(group);
-
return ret;
 }
 EXPORT_SYMBOL(vfio_pin_pages);
@@ -1941,16 +1944,12 @@ int vfio_unpin_pages(struct vfio_device *device, unsigned long *user_pfn,
struct vfio_iommu_driver *driver;
int ret;
 
-   if (!user_pfn || !npage)
+   if (!user_pfn || !npage || !vfio_assert_device_open(device))
return -EINVAL;
 
if (npage > VFIO_PIN_PAGES_MAX_ENTRIES)
return -E2BIG;
 
-   ret = vfio_group_add_container_user(device->group);
-   if (ret)
-   return ret;
-
container = device->group->container;
driver = container->iommu_driver;
if (likely(driver && driver->ops->unpin_pages))
@@ -1959,8 +1958,6 @@ int vfio_unpin_pages(struct vfio_device *device, unsigned long *user_pfn,
else
ret = -ENOTTY;
 
-   vfio_group_try_dissolve_container(device->group);
-
return ret;
 }
 EXPORT_SYMBOL(vfio_unpin_pages);
@@ -1989,13 +1986,9 @@ int vfio_dma_rw(struct vfio_device *device, dma_addr_t user_iova, void *data,
struct vfio_iommu_driver *driver;
int ret = 0;
 
-   if (!data || len <= 0)
+   if (!data || len <= 0 || !vfio_assert_device_open(device))
return -EINVAL;
 
-   ret = vfio_group_add_container_user(device->group);
-   if (ret)
-   return ret;
-
container = device->group->container;
driver = container->iommu_driver;
 
@@ -2004,9 +1997,6 @@ int vfio_dma_rw(struct vfio_device *device, dma_addr_t user_iova, void *data,
  user_iova, data, len, write);
else
ret = -ENOTTY;
-
-   vfio_group_try_dissolve_container(device->group);
-
return ret;
 }
 EXPORT_SYMBOL(vfio_dma_rw);
@@ -2019,10 +2009,6 @@ static int vfio_register_iommu_notifier(struct vfio_group *group,
struct vfio_iommu_driver *driver;
int ret;
 
-   ret =

[PATCH v4 1/7] vfio: Make vfio_(un)register_notifier accept a vfio_device

All callers have a struct vfio_device trivially available, pass it in
directly and avoid calling the expensive vfio_group_get_from_dev().

Acked-by: Eric Farman 
Reviewed-by: Jason J. Herne 
Reviewed-by: Tony Krowiak 
Reviewed-by: Kevin Tian 
Reviewed-by: Christoph Hellwig 
Signed-off-by: Jason Gunthorpe 
---
 drivers/gpu/drm/i915/gvt/kvmgt.c  | 24
 drivers/s390/cio/vfio_ccw_ops.c   |  7 +++
 drivers/s390/crypto/vfio_ap_ops.c | 14 +++---
 drivers/vfio/vfio.c               | 28 +---
 include/linux/vfio.h              |  4 ++--
 5 files changed, 33 insertions(+), 44 deletions(-)

diff --git a/drivers/gpu/drm/i915/gvt/kvmgt.c b/drivers/gpu/drm/i915/gvt/kvmgt.c
index 0787ba5c301f5e..1cec4f1fdfaced 100644
--- a/drivers/gpu/drm/i915/gvt/kvmgt.c
+++ b/drivers/gpu/drm/i915/gvt/kvmgt.c
@@ -810,8 +810,8 @@ static int intel_vgpu_open_device(struct vfio_device *vfio_dev)
vgpu->group_notifier.notifier_call = intel_vgpu_group_notifier;
 
events = VFIO_IOMMU_NOTIFY_DMA_UNMAP;
-   ret = vfio_register_notifier(vfio_dev->dev, VFIO_IOMMU_NOTIFY, &events,
-   &vgpu->iommu_notifier);
+   ret = vfio_register_notifier(vfio_dev, VFIO_IOMMU_NOTIFY, &events,
+   &vgpu->iommu_notifier);
if (ret != 0) {
gvt_vgpu_err("vfio_register_notifier for iommu failed: %d\n",
ret);
@@ -819,8 +819,8 @@ static int intel_vgpu_open_device(struct vfio_device *vfio_dev)
}
 
events = VFIO_GROUP_NOTIFY_SET_KVM;
-   ret = vfio_register_notifier(vfio_dev->dev, VFIO_GROUP_NOTIFY, &events,
-   &vgpu->group_notifier);
+   ret = vfio_register_notifier(vfio_dev, VFIO_GROUP_NOTIFY, &events,
+   &vgpu->group_notifier);
if (ret != 0) {
gvt_vgpu_err("vfio_register_notifier for group failed: %d\n",
ret);
@@ -873,12 +873,12 @@ static int intel_vgpu_open_device(struct vfio_device *vfio_dev)
vgpu->vfio_group = NULL;
 
 undo_register:
-   vfio_unregister_notifier(vfio_dev->dev, VFIO_GROUP_NOTIFY,
-   &vgpu->group_notifier);
+   vfio_unregister_notifier(vfio_dev, VFIO_GROUP_NOTIFY,
+   &vgpu->group_notifier);
 
 undo_iommu:
-   vfio_unregister_notifier(vfio_dev->dev, VFIO_IOMMU_NOTIFY,
-   &vgpu->iommu_notifier);
+   vfio_unregister_notifier(vfio_dev, VFIO_IOMMU_NOTIFY,
+   &vgpu->iommu_notifier);
 out:
return ret;
 }
@@ -907,13 +907,13 @@ static void __intel_vgpu_release(struct intel_vgpu *vgpu)
 
intel_gvt_release_vgpu(vgpu);
 
-   ret = vfio_unregister_notifier(vgpu->vfio_device.dev, VFIO_IOMMU_NOTIFY,
-   &vgpu->iommu_notifier);
+   ret = vfio_unregister_notifier(&vgpu->vfio_device, VFIO_IOMMU_NOTIFY,
+   &vgpu->iommu_notifier);
drm_WARN(>drm, ret,
 "vfio_unregister_notifier for iommu failed: %d\n", ret);
 
-   ret = vfio_unregister_notifier(vgpu->vfio_device.dev, VFIO_GROUP_NOTIFY,
-   &vgpu->group_notifier);
+   ret = vfio_unregister_notifier(&vgpu->vfio_device, VFIO_GROUP_NOTIFY,
+   &vgpu->group_notifier);
drm_WARN(>drm, ret,
 "vfio_unregister_notifier for group failed: %d\n", ret);
 
diff --git a/drivers/s390/cio/vfio_ccw_ops.c b/drivers/s390/cio/vfio_ccw_ops.c
index c4d60cdbf247bf..b49e2e9db2dc6f 100644
--- a/drivers/s390/cio/vfio_ccw_ops.c
+++ b/drivers/s390/cio/vfio_ccw_ops.c
@@ -183,7 +183,7 @@ static int vfio_ccw_mdev_open_device(struct vfio_device *vdev)
 
private->nb.notifier_call = vfio_ccw_mdev_notifier;
 
-   ret = vfio_register_notifier(vdev->dev, VFIO_IOMMU_NOTIFY,
+   ret = vfio_register_notifier(vdev, VFIO_IOMMU_NOTIFY,
 &events, &private->nb);
if (ret)
return ret;
@@ -204,8 +204,7 @@ static int vfio_ccw_mdev_open_device(struct vfio_device *vdev)
 
 out_unregister:
vfio_ccw_unregister_dev_regions(private);
-   vfio_unregister_notifier(vdev->dev, VFIO_IOMMU_NOTIFY,
-   &private->nb);
+   vfio_unregister_notifier(vdev, VFIO_IOMMU_NOTIFY, &private->nb);
return ret;
 }
 
@@ -223,7 +222,7 @@ static void vfio_ccw_mdev_close_device(struct vfio_device *vdev)
 
cp_free(&private->cp);
vfio_ccw_unregister_dev_regions(private);
-   vfio_unregister_notifier(vdev->dev, VFIO_IOMMU_NOTIFY, &private->nb);
+   vfio_unregister_notifier(vdev, VFIO_IOMMU_NOTIFY, &private->nb);
 }
 
 static ssize_t vfio_ccw_mdev_read_io_region(struct vfio_ccw_private *private,
diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c
index ee0a3bf8f476ca..bfa7ee6ef532d9 100644
--- a/drivers/s390/crypto/vfio_ap_ops.c
+++ b/drivers/s390/crypto/vfio_ap_ops.c
@@ -1406,21 +1406,21 @@

[PATCH v4 5/7] drm/i915/gvt: Change from vfio_group_(un)pin_pages to vfio_(un)pin_pages

Use the existing vfio_device versions of vfio_(un)pin_pages(). There is no
reason to use a group interface here, kvmgt has easy access to a
vfio_device.

Delete kvmgt_vdev::vfio_group since these calls were the last users.

Reviewed-by: Kevin Tian 
Reviewed-by: Christoph Hellwig 
Signed-off-by: Jason Gunthorpe 
---
 drivers/gpu/drm/i915/gvt/gvt.h   |  1 -
 drivers/gpu/drm/i915/gvt/kvmgt.c | 27 ++-
 2 files changed, 6 insertions(+), 22 deletions(-)

diff --git a/drivers/gpu/drm/i915/gvt/gvt.h b/drivers/gpu/drm/i915/gvt/gvt.h
index 5a28ee965b7f3e..2af4c83e733c6c 100644
--- a/drivers/gpu/drm/i915/gvt/gvt.h
+++ b/drivers/gpu/drm/i915/gvt/gvt.h
@@ -231,7 +231,6 @@ struct intel_vgpu {
struct kvm *kvm;
struct work_struct release_work;
atomic_t released;
-   struct vfio_group *vfio_group;
 
struct kvm_page_track_notifier_node track_node;
 #define NR_BKT (1 << 18)
diff --git a/drivers/gpu/drm/i915/gvt/kvmgt.c b/drivers/gpu/drm/i915/gvt/kvmgt.c
index 1cec4f1fdfaced..7655ffa97d5116 100644
--- a/drivers/gpu/drm/i915/gvt/kvmgt.c
+++ b/drivers/gpu/drm/i915/gvt/kvmgt.c
@@ -243,7 +243,7 @@ static void gvt_unpin_guest_page(struct intel_vgpu *vgpu, unsigned long gfn,
for (npage = 0; npage < total_pages; npage++) {
unsigned long cur_gfn = gfn + npage;
 
-   ret = vfio_group_unpin_pages(vgpu->vfio_group, &cur_gfn, 1);
+   ret = vfio_unpin_pages(&vgpu->vfio_device, &cur_gfn, 1);
drm_WARN_ON(>drm, ret != 1);
}
 }
@@ -266,8 +266,8 @@ static int gvt_pin_guest_page(struct intel_vgpu *vgpu, unsigned long gfn,
unsigned long cur_gfn = gfn + npage;
unsigned long pfn;
 
-   ret = vfio_group_pin_pages(vgpu->vfio_group, &cur_gfn, 1,
-  IOMMU_READ | IOMMU_WRITE, &pfn);
+   ret = vfio_pin_pages(&vgpu->vfio_device, &cur_gfn, 1,
+IOMMU_READ | IOMMU_WRITE, &pfn);
if (ret != 1) {
gvt_vgpu_err("vfio_pin_pages failed for gfn 0x%lx, ret %d\n",
 cur_gfn, ret);
@@ -804,7 +804,6 @@ static int intel_vgpu_open_device(struct vfio_device *vfio_dev)
struct intel_vgpu *vgpu = vfio_dev_to_vgpu(vfio_dev);
unsigned long events;
int ret;
-   struct vfio_group *vfio_group;
 
vgpu->iommu_notifier.notifier_call = intel_vgpu_iommu_notifier;
vgpu->group_notifier.notifier_call = intel_vgpu_group_notifier;
@@ -827,28 +826,19 @@ static int intel_vgpu_open_device(struct vfio_device *vfio_dev)
goto undo_iommu;
}
 
-   vfio_group =
-   vfio_group_get_external_user_from_dev(vgpu->vfio_device.dev);
-   if (IS_ERR_OR_NULL(vfio_group)) {
-   ret = !vfio_group ? -EFAULT : PTR_ERR(vfio_group);
-   gvt_vgpu_err("vfio_group_get_external_user_from_dev failed\n");
-   goto undo_register;
-   }
-   vgpu->vfio_group = vfio_group;
-
ret = -EEXIST;
if (vgpu->attached)
-   goto undo_group;
+   goto undo_register;
 
ret = -ESRCH;
if (!vgpu->kvm || vgpu->kvm->mm != current->mm) {
gvt_vgpu_err("KVM is required to use Intel vGPU\n");
-   goto undo_group;
+   goto undo_register;
}
 
ret = -EEXIST;
if (__kvmgt_vgpu_exist(vgpu))
-   goto undo_group;
+   goto undo_register;
 
vgpu->attached = true;
kvm_get_kvm(vgpu->kvm);
@@ -868,10 +858,6 @@ static int intel_vgpu_open_device(struct vfio_device *vfio_dev)
atomic_set(&vgpu->released, 0);
return 0;
 
-undo_group:
-   vfio_group_put_external_user(vgpu->vfio_group);
-   vgpu->vfio_group = NULL;
-
 undo_register:
vfio_unregister_notifier(vfio_dev, VFIO_GROUP_NOTIFY,
 &vgpu->group_notifier);
@@ -925,7 +911,6 @@ static void __intel_vgpu_release(struct intel_vgpu *vgpu)
gvt_cache_destroy(vgpu);
 
intel_vgpu_release_msi_eventfd_ctx(vgpu);
-   vfio_group_put_external_user(vgpu->vfio_group);
 
vgpu->kvm = NULL;
vgpu->attached = false;
-- 
2.36.0

[PATCH v4 3/7] vfio/mdev: Pass in a struct vfio_device * to vfio_pin/unpin_pages()

Every caller has a readily available vfio_device pointer, use that instead
of passing in a generic struct device. The struct vfio_device already
contains the group we need so this avoids complexity, extra refcountings,
and a confusing lifecycle model.

Reviewed-by: Christoph Hellwig 
Acked-by: Eric Farman 
Reviewed-by: Jason J. Herne 
Reviewed-by: Tony Krowiak 
Reviewed-by: Kevin Tian 
Signed-off-by: Jason Gunthorpe 
---
 .../driver-api/vfio-mediated-device.rst |  4 +-
 drivers/s390/cio/vfio_ccw_cp.c          |  6 +--
 drivers/s390/crypto/vfio_ap_ops.c       |  9 ++--
 drivers/vfio/vfio.c                     | 46 +++
 include/linux/vfio.h                    |  4 +-
 5 files changed, 27 insertions(+), 42 deletions(-)

diff --git a/Documentation/driver-api/vfio-mediated-device.rst b/Documentation/driver-api/vfio-mediated-device.rst
index 784bbeb22adcf5..eded8719180fba 100644
--- a/Documentation/driver-api/vfio-mediated-device.rst
+++ b/Documentation/driver-api/vfio-mediated-device.rst
@@ -262,10 +262,10 @@ Translation APIs for Mediated Devices
 The following APIs are provided for translating user pfn to host pfn in a VFIO
 driver::
 
-   extern int vfio_pin_pages(struct device *dev, unsigned long *user_pfn,
+   int vfio_pin_pages(struct vfio_device *device, unsigned long *user_pfn,
  int npage, int prot, unsigned long *phys_pfn);
 
-   extern int vfio_unpin_pages(struct device *dev, unsigned long *user_pfn,
+   int vfio_unpin_pages(struct vfio_device *device, unsigned long *user_pfn,
int npage);
 
 These functions call back into the back-end IOMMU module by using the pin_pages
diff --git a/drivers/s390/cio/vfio_ccw_cp.c b/drivers/s390/cio/vfio_ccw_cp.c
index 7a1cf3091cd647..0c2be9421ab78f 100644
--- a/drivers/s390/cio/vfio_ccw_cp.c
+++ b/drivers/s390/cio/vfio_ccw_cp.c
@@ -103,13 +103,13 @@ static int pfn_array_pin(struct pfn_array *pa, struct vfio_device *vdev)
 {
int ret = 0;
 
-   ret = vfio_pin_pages(vdev->dev, pa->pa_iova_pfn, pa->pa_nr,
+   ret = vfio_pin_pages(vdev, pa->pa_iova_pfn, pa->pa_nr,
 IOMMU_READ | IOMMU_WRITE, pa->pa_pfn);
 
if (ret < 0) {
goto err_out;
} else if (ret > 0 && ret != pa->pa_nr) {
-   vfio_unpin_pages(vdev->dev, pa->pa_iova_pfn, ret);
+   vfio_unpin_pages(vdev, pa->pa_iova_pfn, ret);
ret = -EINVAL;
goto err_out;
}
@@ -127,7 +127,7 @@ static void pfn_array_unpin_free(struct pfn_array *pa, struct vfio_device *vdev)
 {
/* Only unpin if any pages were pinned to begin with */
if (pa->pa_nr)
-   vfio_unpin_pages(vdev->dev, pa->pa_iova_pfn, pa->pa_nr);
+   vfio_unpin_pages(vdev, pa->pa_iova_pfn, pa->pa_nr);
pa->pa_nr = 0;
kfree(pa->pa_iova_pfn);
 }
diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c
index bfa7ee6ef532d9..e8914024f5b1af 100644
--- a/drivers/s390/crypto/vfio_ap_ops.c
+++ b/drivers/s390/crypto/vfio_ap_ops.c
@@ -124,8 +124,7 @@ static void vfio_ap_free_aqic_resources(struct vfio_ap_queue *q)
q->saved_isc = VFIO_AP_ISC_INVALID;
}
if (q->saved_pfn && !WARN_ON(!q->matrix_mdev)) {
-   vfio_unpin_pages(mdev_dev(q->matrix_mdev->mdev),
-   &q->saved_pfn, 1);
+   vfio_unpin_pages(&q->matrix_mdev->vdev, &q->saved_pfn, 1);
q->saved_pfn = 0;
}
 }
@@ -258,7 +257,7 @@ static struct ap_queue_status vfio_ap_irq_enable(struct vfio_ap_queue *q,
return status;
}
 
-   ret = vfio_pin_pages(mdev_dev(q->matrix_mdev->mdev), &g_pfn, 1,
+   ret = vfio_pin_pages(&q->matrix_mdev->vdev, &g_pfn, 1,
 IOMMU_READ | IOMMU_WRITE, &h_pfn);
switch (ret) {
case 1:
@@ -301,7 +300,7 @@ static struct ap_queue_status vfio_ap_irq_enable(struct vfio_ap_queue *q,
break;
case AP_RESPONSE_OTHERWISE_CHANGED:
/* We could not modify IRQ setings: clear new configuration */
-   vfio_unpin_pages(mdev_dev(q->matrix_mdev->mdev), &g_pfn, 1);
+   vfio_unpin_pages(&q->matrix_mdev->vdev, &g_pfn, 1);
kvm_s390_gisc_unregister(kvm, isc);
break;
default:
@@ -1250,7 +1249,7 @@ static int vfio_ap_mdev_iommu_notifier(struct 
notifier_block *nb,
struct vfio_iommu_type1_dma_unmap *unmap = data;
unsigned long g_pfn = unmap->iova >> PAGE_SHIFT;
 
-   vfio_unpin_pages(mdev_dev(matrix_mdev->mdev), &g_pfn, 1);
+   vfio_unpin_pages(&matrix_mdev->vdev, &g_pfn, 1);
return NOTIFY_OK;
}
 
diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
index d70890ab9cea92..93caab1f29dbd7 100644
--- a/drivers/vfio/vfio.c
+++ b/drivers/vfio/vfio.c
@@ -1919,7 +1919,7 @@

Re: (subset) [PATCH 1/2] [RFC] regmap: Add bulk read/write callbacks into regmap_config

2022-05-05 Thread Marek Vasut


On 5/5/22 23:08, Mark Brown wrote:
> On Thu, May 05, 2022 at 07:32:23PM +0200, Marek Vasut wrote:
>> On 5/5/22 17:12, Mark Brown wrote:
>>> On Sat, 30 Apr 2022 04:51:44 +0200, Marek Vasut wrote:
>>>> Currently the regmap_config structure only allows the user to implement
>>>> single element register read/write using .reg_read/.reg_write callbacks.
>>>
>>> [1/2] regmap: Add bulk read/write callbacks into regmap_config
>>> commit: d77e745613680c54708470402e2b623dcd769681
>>
>> I was really hoping this would get a lot more review / comments before this
>> is applied.
>
> I can easily punt for this release, though TBH I'm not anticipating huge
> numbers of comments on a regmap patch unless it breaks things for
> people, they tend to be very quiet.

In that case, let's wait and see ...

> I did go through it and didn't spot
> any issues so it seemed like the testing coverage would be useful here.
> Are there specific things you're worried about that you'd like feedback
> on?

Plumbing on core code is worrying.

[PATCH -next 2/2] drm/amdkfd: Return true/false (not 1/0) from bool functions

2022-05-05 Thread Yang Li

Return boolean values ("true" or "false") instead of 1 or 0 from bool
functions. This fixes the following warnings from coccicheck:

./drivers/gpu/drm/amd/amdkfd/kfd_int_process_v11.c:244:9-10: WARNING:
return of 0/1 in function 'event_interrupt_isr_v11' with return type
bool

Reported-by: Abaci Robot 
Signed-off-by: Yang Li 
---
 drivers/gpu/drm/amd/amdkfd/kfd_int_process_v11.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_int_process_v11.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_int_process_v11.c
index c3919aaa76e6..1431f0961769 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_int_process_v11.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_int_process_v11.c
@@ -241,14 +241,14 @@ static bool event_interrupt_isr_v11(struct kfd_dev *dev,
if (/*!KFD_IRQ_IS_FENCE(client_id, source_id) &&*/
(vmid < dev->vm_info.first_vmid_kfd ||
vmid > dev->vm_info.last_vmid_kfd))
-   return 0;
+   return false;
 
pasid = SOC15_PASID_FROM_IH_ENTRY(ih_ring_entry);
context_id0 = SOC15_CONTEXT_ID0_FROM_IH_ENTRY(ih_ring_entry);
 
if ((source_id == SOC15_INTSRC_CP_END_OF_PIPE) &&
(context_id0 & AMDGPU_FENCE_MES_QUEUE_FLAG))
-   return 0;
+   return false;
 
pr_debug("client id 0x%x, source id %d, vmid %d, pasid 0x%x. raw 
data:\n",
 client_id, source_id, vmid, pasid);
@@ -258,7 +258,7 @@ static bool event_interrupt_isr_v11(struct kfd_dev *dev,
 
/* If there is no valid PASID, it's likely a bug */
if (WARN_ONCE(pasid == 0, "Bug: No PASID in KFD interrupt"))
-   return 0;
+   return false;
 
/* Interrupt types we care about: various signals and faults.
 * They will be forwarded to a work queue (see below).
-- 
2.20.1.7.g153144c

[PATCH -next] drm/rockchip: remove unneeded semicolon

2022-05-05 Thread Yang Li

Eliminate the following coccicheck warning:
./drivers/gpu/drm/rockchip/rockchip_drm_vop2.c:1476:2-3: Unneeded
semicolon

Reported-by: Abaci Robot 
Signed-off-by: Yang Li 
---
 drivers/gpu/drm/rockchip/rockchip_drm_vop2.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/rockchip/rockchip_drm_vop2.c 
b/drivers/gpu/drm/rockchip/rockchip_drm_vop2.c
index 0b49fed16535..7f9d88634a77 100644
--- a/drivers/gpu/drm/rockchip/rockchip_drm_vop2.c
+++ b/drivers/gpu/drm/rockchip/rockchip_drm_vop2.c
@@ -1473,7 +1473,7 @@ static void rk3568_set_intf_mux(struct vop2_video_port 
*vp, int id,
default:
drm_err(vop2->drm, "Invalid interface id %d on vp%d\n", id, 
vp->id);
return;
-   };
+   }
 
dip |= RK3568_DSP_IF_POL__CFG_DONE_IMD;
 
-- 
2.20.1.7.g153144c

[PATCH -next 1/2] drm/amdgpu/gfx11: remove unneeded semicolon

2022-05-05 Thread Yang Li

Eliminate the following coccicheck warning:
./drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c:1222:2-3: Unneeded semicolon

Reported-by: Abaci Robot 
Signed-off-by: Yang Li 
---
 drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
index 184bf554acca..d2ab325e40b4 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
@@ -1219,7 +1219,7 @@ static void gfx_v11_0_parse_rlc_toc(struct amdgpu_device 
*adev, void *rlc_toc)
rlc_autoload_info[ucode->id].size = ucode->size * 4;
 
ucode++;
-   };
+   }
 }
 
 static uint32_t gfx_v11_0_calc_toc_total_size(struct amdgpu_device *adev)
-- 
2.20.1.7.g153144c

Re: [PATCH v1 04/15] mm: add device coherent checker to remove migration pte

2022-05-05 Thread Alistair Popple

"Sierra Guiza, Alejandro (Alex)"  writes:

> @apop...@nvidia.com Could you please check this patch? It's somehow related to
> migrate_device_page() for long term device coherent pages.

Sure thing. This whole series is in my queue of things to review once I make it 
home from LSF/MM.

- Alistair

> Regards,
> Alex Sierra
>> -Original Message-
>> From: amd-gfx  On Behalf Of Alex
>> Sierra
>> Sent: Thursday, May 5, 2022 4:34 PM
>> To: j...@nvidia.com
>> Cc: rcampb...@nvidia.com; wi...@infradead.org; da...@redhat.com;
>> Kuehling, Felix ; apop...@nvidia.com; amd-
>> g...@lists.freedesktop.org; linux-...@vger.kernel.org; linux...@kvack.org;
>> jgli...@redhat.com; dri-devel@lists.freedesktop.org; akpm@linux-
>> foundation.org; linux-e...@vger.kernel.org; h...@lst.de
>> Subject: [PATCH v1 04/15] mm: add device coherent checker to remove
>> migration pte
>>
>> During remove_migration_pte(), entries for device coherent type pages that
>> were not created through special migration ptes, ignore _PAGE_RW flag. This
>> path can be found at migrate_device_page(), where valid vma is not
>> required. In this case, migrate_vma_collect_pmd() is not called and special
>> migration ptes are not set.
>>
>> Signed-off-by: Alex Sierra 
>> ---
>>  mm/migrate.c | 3 ++-
>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/mm/migrate.c b/mm/migrate.c index
>> 6c31ee1e1c9b..e18ddee56f37 100644
>> --- a/mm/migrate.c
>> +++ b/mm/migrate.c
>> @@ -206,7 +206,8 @@ static bool remove_migration_pte(struct folio *folio,
>>   * Recheck VMA as permissions can change since migration
>> started
>>   */
>>  entry = pte_to_swp_entry(*pvmw.pte);
>> -if (is_writable_migration_entry(entry))
>> +if (is_writable_migration_entry(entry) ||
>> +is_device_coherent_page(pfn_to_page(pvmw.pfn)))
>>  pte = maybe_mkwrite(pte, vma);
>>  else if (pte_swp_uffd_wp(*pvmw.pte))
>>  pte = pte_mkuffd_wp(pte);
>> --
>> 2.32.0

Re: [PATCH v4 10/15] drm/shmem-helper: Take reservation lock instead of drm_gem_shmem locks

2022-05-05 Thread Dmitry Osipenko

On 5/5/22 11:12, Daniel Vetter wrote:
> On Wed, May 04, 2022 at 06:56:09PM +0300, Dmitry Osipenko wrote:
>> On 5/4/22 11:21, Daniel Vetter wrote:
>> ...
> - Maybe also do what you suggest and keep a separate lock for this, but
>   the fundamental issue is that this doesn't really work - if you share
>   buffers both ways with two drivers using shmem helpers, then the
>   ordering of this vmap_count_mutex vs dma_resv_lock is inconsistent and
>   you can get some nice deadlocks. So not a great approach (and also the
>   reason why we really need to get everyone to move towards dma_resv_lock
>   as _the_ buffer object lock, since otherwise we'll never get a
>   consistent lock nesting hierarchy).

 The separate locks should work okay because it will be always the
 exporter that takes the dma_resv_lock. But I agree that it's less ideal
 than defining the new rules for dma-bufs since sometime you will take
 the resv lock and sometime not, potentially hiding bugs related to 
 lockings.
>>>
>>> That's the issue, some importers need to take the dma_resv_lock for
>>> dma_buf_vmap too (e.g. to first nail the buffer in place when it's a
>>> dynamic memory manager). In practice it'll work as well as what we have
>>> currently, which is similarly inconsistent, except with per-driver locks
>>> instead of shared locks from shmem helpers or dma-buf, so less obvious
>>> that things are inconsistent.
>>>
>>> So yeah if it's too messy maybe the approach is to have a separate lock
>>> for vmap for now, land things, and then fix up dma_buf_vmap in a follow up
>>> series.
>>
>> The amdgpu driver was the first to introduce the concept of movable
>> memory for dma-bufs. Now we want to support it for DRM SHMEM too. For
>> both amdgpu ttm and shmem drivers we will want to hold the reservation
>> lock when we're touching moveable buffers. The current way of denoting
>> that dma-buf is movable is to implement the pin/unpin callbacks of the
>> dma-buf ops, should be doable for shmem.
> 
> Hm that sounds like a bridge too far? I don't think we want to start
> adding moveable dma-bufs for shmem, thus far at least no one asked for
> that. Goal here is just to streamline the locking a bit and align across
> all the different ways of doing buffers in drm.
> 
> Or do you mean something else and I'm just completely lost?

I'm talking about aligning DRM locks with the dma-buf locks. The problem
is that the convention of dma-bufs isn't specified yet. In particular
there is no convention for the mapping operations.

If we want to switch vmapping of shmem to use reservation lock, then
somebody will have to hold this lock for dma_buf_vmap() and the locking
convention needs to be specified firmly.

In case of dynamic buffers, we will also need to specify whether
dma_buf_vmap() should imply the implicit pinning by exporter or the
buffer must be pinned explicitly by importer before dma_buf_vmap() is
invoked.

Perhaps I indeed shouldn't care about this for this patchset. The
complete locking model of dma-bufs must be specified first.
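
To make the question of "who holds the lock" concrete, here is a minimal userspace sketch of one possible convention: a single per-object lock protects the vmap state, a `_locked` variant requires the caller to hold it, and a wrapper takes it for callers that do not. Everything here is hypothetical (`struct demo_bo`, `demo_bo_vmap()` and friends are not DRM or dma-buf APIs, and a pthread mutex only stands in for the reservation lock); it illustrates the shape of a convention, not what dma_buf_vmap() does today.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical buffer object; 'resv' stands in for the reservation lock. */
struct demo_bo {
	pthread_mutex_t resv;   /* the one lock protecting all fields below */
	int vmap_use_count;     /* number of active kernel mappings */
	void *vaddr;            /* valid only while vmap_use_count > 0 */
	char backing[64];       /* pretend backing storage */
};

static void demo_bo_init(struct demo_bo *bo)
{
	pthread_mutex_init(&bo->resv, NULL);
	bo->vmap_use_count = 0;
	bo->vaddr = NULL;
}

/* Convention: the caller must already hold bo->resv. */
static void *demo_bo_vmap_locked(struct demo_bo *bo)
{
	if (bo->vmap_use_count++ == 0)
		bo->vaddr = bo->backing;
	return bo->vaddr;
}

/* Convention: the caller must already hold bo->resv. */
static void demo_bo_vunmap_locked(struct demo_bo *bo)
{
	assert(bo->vmap_use_count > 0);
	if (--bo->vmap_use_count == 0)
		bo->vaddr = NULL;
}

/* Wrapper for callers that do not hold the lock yet. */
static void *demo_bo_vmap(struct demo_bo *bo)
{
	void *vaddr;

	pthread_mutex_lock(&bo->resv);
	vaddr = demo_bo_vmap_locked(bo);
	pthread_mutex_unlock(&bo->resv);
	return vaddr;
}

/* Wrapper for callers that do not hold the lock yet. */
static void demo_bo_vunmap(struct demo_bo *bo)
{
	pthread_mutex_lock(&bo->resv);
	demo_bo_vunmap_locked(bo);
	pthread_mutex_unlock(&bo->resv);
}
```

With only one lock per object there is no second mutex whose ordering against it could differ between two drivers; the remaining open point is exactly the one raised above, namely which side is expected to call the `_locked` variant.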

>> A day ago I found that mapping of imported dma-bufs is broken at least
>> for the Tegra DRM driver (and likely for others too) because driver
>> doesn't assume that anyone will try to mmap imported buffer and just
>> doesn't handle this case at all, so we're getting a hard lockup on
>> touching mapped memory because we're mapping something else than the
>> dma-buf.
> 
> Huh that sounds bad, how does this happen? Pretty much all pieces of
> dma-buf (cpu vmap, userspace mmap, heck even dma_buf_attach) are optional
> or at least can fail for various reasons. So exporters not providing mmap
> support is fine, but importers then dying is not.

Those drivers that die don't have userspace that uses dma-bufs
extensively. I noticed it only because I was looking at this code too much
for the last few days.

Drivers that don't die either map imported BOs properly or don't allow
mapping at all.

>> My plan is to move the dma-buf management code to the level of DRM core
>> and make it aware of the reservation locks for the dynamic dma-bufs.
>> This way we will get the proper locking for dma-bufs and fix mapping of
>> imported dma-bufs for Tegra and other drivers.
> 
> So maybe we're completely talking past each another, or coffee is not
> working here on my end, but I've no idea what you mean.
> 
> We do have some helpers for taking care of the dma_resv_lock dance, and
> Christian König has an rfc patch set to maybe unify this further. But that
> should be fairly orthogonal to reworking shmem (it might help a bit with
> reworking shmem though).

The reservation lock itself doesn't help shmem much, IMO. It should help
only in the context of dynamic dma-bufs and today we don't have a need
for dynamic shmem dma-bufs.

You were talking about making DRM locks consistent with dma-buf locks,
so I thought that your main point of making use of reservation locks
for shmem is to

Re: [PATCH] drm: Document that power requirements for DP AUX transfers

2022-05-05 Thread Doug Anderson

Hi,

On Thu, May 5, 2022 at 3:15 PM Dmitry Baryshkov
 wrote:
>
> On 06/05/2022 00:24, Doug Anderson wrote:
> > Hi,
> >
> > On Thu, May 5, 2022 at 1:56 PM Dmitry Baryshkov
> >  wrote:
> >>
> >> On Thu, 5 May 2022 at 23:21, Doug Anderson  wrote:
> >>>
> >>> Hi,
> >>>
> >>> On Thu, May 5, 2022 at 1:10 PM Dmitry Baryshkov
> >>>  wrote:
> 
>  On Thu, 5 May 2022 at 18:53, Doug Anderson  wrote:
> >
> > Hi,
> >
> > On Thu, May 5, 2022 at 8:29 AM Ville Syrjälä
> >  wrote:
> >>
> >> On Thu, May 05, 2022 at 08:00:20AM -0700, Doug Anderson wrote:
> >>> Hi,
> >>>
> >>> On Thu, May 5, 2022 at 7:46 AM Ville Syrjälä
> >>>  wrote:
> 
>  On Wed, May 04, 2022 at 02:10:08PM -0400, Lyude Paul wrote:
> > On Wed, 2022-05-04 at 09:04 -0700, Doug Anderson wrote:
> >> Hi,
> >>
> >> On Wed, May 4, 2022 at 5:21 AM Ville Syrjälä
> >>  wrote:
> >>>
> >>> On Tue, May 03, 2022 at 04:21:08PM -0700, Douglas Anderson wrote:
>  When doing DP AUX transfers there are two actors that need to be
>  powered in order for the DP AUX transfer to work: the DP source 
>  and
>  the DP sink. Commit bacbab58f09d ("drm: Mention the power state
>  requirement on side-channel operations") added some documentation
>  saying that the DP source is required to power itself up (if 
>  needed)
>  to do AUX transfers. However, that commit doesn't say anything 
>  about
>  the DP sink.
> 
>  For full fledged DP the sink isn't really a problem. It's 
>  expected
>  that if an external DP monitor isn't plugged in that attempting 
>  to do
>  AUX transfers won't work. It's also expected that if a DP 
>  monitor is
>  plugged in (and thus asserting HPD) that AUX transfers will 
>  work.
> 
>  When we're looking at eDP, however, things are less obvious. 
>  Let's add
>  some documentation about expectations. Here's what we'll say:
> 
>  1. We don't expect the DP AUX transfer function to power on an 
>  eDP
>  panel. If an eDP panel is physically connected but powered off 
>  then it
>  makes sense for the transfer to fail.
> >>>
> >>> I don't agree with this. I think the panel should just get powered 
> >>> up
> >>> for AUX transfers.
> >>
> >> That's definitely a fair thing to think about and I have at times
> >> thought about trying to make it work that way. It always ends up
> >> hitting a roadblock.
> 
>  How do you even probe the panel initially if you can't power it on
>  without doing some kind of full modeset/etc.?
> >>>
> >>> It's not that we can't power it on without a full modeset. It's that at
> >>> panel probe time all the DRM components haven't been hooked together
> >>> yet, so the bridge chain isn't available yet. The panel can power
> >>> itself on, though. This is why the documentation I added says: "if a
> >>> panel driver is initiating a DP AUX transfer it may power itself up
> >>> however it wants"
> >>>
> >>>
> >> The biggest roadblock that I recall is that to make this work then
> >> you'd have to somehow ensure that the bridge chain's pre_enable() 
> >> call
> >> was made as part of the AUX transfer, right? Since the transfer
> >> function can be called in any context at all, we have to coordinate
> >> this with DRM. If, for instance, DRM is mid way through powering 
> >> the
> >> panel down then we need to wait for DRM to fully finish powering 
> >> down,
> >> then we need to power the panel back up. I don't believe that we 
> >> can
> >> just force the panel to stay on if DRM is turning it off because of
> >> panel power sequencing requirements. At least I know it would have 
> >> the
> >> potential to break "samsung-atna33xc20.c" which absolutely needs to
> >> see the panel power off after it's been disabled.
> >>
> >> We also, I believe, need to handle the fact that the bridge chain 
> >> may
> >> not have even been created yet. We do AUX transfers to read the 
> >> EDID
> >> and also to setup the backlight in the probe function of 
> >> panel-edp. At
> >> that point the panel hasn't been linked into the chain. We had 
> >> _long_
> >> discussions [1] about moving these out of probe and decided that we
> >> could move the EDID read to be later but that it was going to be 
> >> really
> >> ugly to move the AUX backlight later. The

Re: [PATCH v10, 16/17] media: mediatek: vcodec: support stateless VP9 decoding

2022-05-05 Thread Nícolas F . R . A . Prado

On Tue, Apr 26, 2022 at 06:08:27PM +0800, Yunfei Dong wrote:
> Add support for VP9 decoding using the stateless API,
> as supported by MT8192. The driver uses the lat and core architecture.
> 
> Signed-off-by: Yunfei Dong 
> Signed-off-by: George Sun 
> Reviewed-by: AngeloGioacchino Del Regno 
> 
> ---
>  .../media/platform/mediatek/vcodec/Makefile   |1 +
>  .../vcodec/mtk_vcodec_dec_stateless.c |   26 +-
>  .../platform/mediatek/vcodec/mtk_vcodec_drv.h |1 +
>  .../vcodec/vdec/vdec_vp9_req_lat_if.c | 2031 +
>  .../platform/mediatek/vcodec/vdec_drv_if.c|4 +
>  .../platform/mediatek/vcodec/vdec_drv_if.h|1 +
>  6 files changed, 2061 insertions(+), 3 deletions(-)
>  create mode 100644 
> drivers/media/platform/mediatek/vcodec/vdec/vdec_vp9_req_lat_if.c
> 

...

> + use_128 = !frame_is_intra && !vsi->frame.uh.last_frame_type;
> + v4l2_vp9_adapt_coef_probs(pre_frame_ctx_helper,
> +   counts_helper,
> +   use_128,
> +   frame_is_intra);

Hi Yunfei,

I'm getting

ERROR: modpost: "v4l2_vp9_adapt_noncoef_probs" 
[drivers/media/platform/mediatek/vcodec/mtk-vcodec-dec.ko] undefined!
ERROR: modpost: "v4l2_vp9_adapt_coef_probs" 
[drivers/media/platform/mediatek/vcodec/mtk-vcodec-dec.ko] undefined!

when building this series.

Adding
select V4L2_VP9
to
config VIDEO_MEDIATEK_VCODEC
solved the issue.

Thanks,
Nícolas

Re: [PATCH] drm: Document that power requirements for DP AUX transfers

2022-05-05 Thread Dmitry Baryshkov

On 06/05/2022 00:24, Doug Anderson wrote:

Hi,

On Thu, May 5, 2022 at 1:56 PM Dmitry Baryshkov
wrote:

On Thu, 5 May 2022 at 23:21, Doug Anderson wrote:

Hi,

On Thu, May 5, 2022 at 1:10 PM Dmitry Baryshkov
wrote:

On Thu, 5 May 2022 at 18:53, Doug Anderson wrote:

Hi,

On Thu, May 5, 2022 at 8:29 AM Ville Syrjälä
wrote:

On Thu, May 05, 2022 at 08:00:20AM -0700, Doug Anderson wrote:

Hi,

On Thu, May 5, 2022 at 7:46 AM Ville Syrjälä
wrote:

On Wed, May 04, 2022 at 02:10:08PM -0400, Lyude Paul wrote:

On Wed, 2022-05-04 at 09:04 -0700, Doug Anderson wrote:

Hi,

On Wed, May 4, 2022 at 5:21 AM Ville Syrjälä
wrote:

On Tue, May 03, 2022 at 04:21:08PM -0700, Douglas Anderson wrote:

When doing DP AUX transfers there are two actors that need to be
powered in order for the DP AUX transfer to work: the DP source and
the DP sink. Commit bacbab58f09d ("drm: Mention the power state
requirement on side-channel operations") added some documentation
saying that the DP source is required to power itself up (if needed)
to do AUX transfers. However, that commit doesn't say anything about
the DP sink.

For full fledged DP the sink isn't really a problem. It's expected
that if an external DP monitor isn't plugged in that attempting to do
AUX transfers won't work. It's also expected that if a DP monitor is
plugged in (and thus asserting HPD) that AUX transfers will work.

When we're looking at eDP, however, things are less obvious. Let's add
some documentation about expectations. Here's what we'll say:

1. We don't expect the DP AUX transfer function to power on an eDP
panel. If an eDP panel is physically connected but powered off then it
makes sense for the transfer to fail.

I don't agree with this. I think the panel should just get powered up
for AUX transfers.

That's definitely a fair thing to think about and I have at times
thought about trying to make it work that way. It always ends up
hitting a roadblock.

How do you even probe the panel initially if you can't power it on
without doing some kind of full modeset/etc.?

It's not that we can't power it on without a full modeset. It's that at
panel probe time all the DRM components haven't been hooked together
yet, so the bridge chain isn't available yet. The panel can power
itself on, though. This is why the documentation I added says: "if a
panel driver is initiating a DP AUX transfer it may power itself up
however it wants"

The biggest roadblock that I recall is that to make this work then
you'd have to somehow ensure that the bridge chain's pre_enable() call
was made as part of the AUX transfer, right? Since the transfer
function can be called in any context at all, we have to coordinate
this with DRM. If, for instance, DRM is mid way through powering the
panel down then we need to wait for DRM to fully finish powering down,
then we need to power the panel back up. I don't believe that we can
just force the panel to stay on if DRM is turning it off because of
panel power sequencing requirements. At least I know it would have the
potential to break "samsung-atna33xc20.c" which absolutely needs to
see the panel power off after it's been disabled.

We also, I believe, need to handle the fact that the bridge chain may
not have even been created yet. We do AUX transfers to read the EDID
and also to setup the backlight in the probe function of panel-edp. At
that point the panel hasn't been linked into the chain. We had _long_
discussions [1] about moving these out of probe and decided that we
could move the EDID read to be later but that it was going to be really
ugly to move the AUX backlight later. The backlight would end up
popping up at some point in time later (the first call to panel
prepare() or maybe get_modes()) and that seemed weird.

[1]
https://lore.kernel.org/lkml/CAD=FV=u5-stdlydkejwlaog-0wgxr49vxtwuyuo7z2puibl...@mail.gmail.com/

Otherwise you can't trust that eg. the /dev/aux
stuff is actually usable.

Yeah, it's been on my mind to talk more about /dev/aux. I think
/dev/aux has some problems, at least with eDP. Specifically:

1. Even if we somehow figure out how to power the panel on as part of
the aux transfer, we're actually _still_ not guaranteed to be able to
talk to it as far as I understand. My colleague reported to me that on
a system he was working with that had PSR (panel self refresh) that
when the panel was powered on but in PSR mode that it wouldn't talk
over AUX. Assuming that this is correct then I guess we'd also have to
do even more coordination with DRM to exit PSR and block future
transitions of PSR. (NOTE: it's always possible that my colleague ran
into some other bug and that panels are _supposed_ to be able to talk
in PSR. If you think this is the case, I can always try to dig more).

TBH - the coordination with drm I don't think would be the difficult part, as
we'd just need to add some sort of property (ideally invisible to userspace)
that can be used in an atomic commit to disable PSR - similar to how we

Re: AMD display drivers handling DRM CRTC color mgmt props

2022-05-05 Thread Melissa Wen

On 04/22, Harry Wentland wrote:
> 
> 
> On 2022-04-22 10:28, Melissa Wen wrote:
> > On 04/21, Harry Wentland wrote:
> > > 
> > > 
> > > On 2022-04-21 15:20, Melissa Wen wrote:
> > > > On 04/21, Harry Wentland wrote:
> > > > > 
> > > > > 
> > > > > On 2022-04-21 10:37, Melissa Wen wrote:
> > > > > > Hi all,
> > > > > > 
> > > > > > I'm examining how DRM color management properties (degamma, ctm, 
> > > > > > gamma)
> > > > > > > are applied to AMD display drivers. As far as I could understand, thanks to
> > > > > > > Nicholas' documentation on amdgpu_dm/amdgpu_dm_color, DC drivers have
> > > > > > per-plane color correction features:
> > > > > > 
> > > > Hi Harry,
> > > > 
> > > > Wow, thanks so much for all details!
> > > > > 
> > > > > DC programs some of the color correction features pre-blending but
> > > > > > DRM/KMS has no per-plane color correction properties.
> > > > > 
> > > > > See this series from Uma Shankar for an RFC on how to introduce those
> > > > > properties for 1D LUTs and CSC matrix:
> > > > > https://patchwork.freedesktop.org/series/90826/
> > > > > 
> > > > > Bhanuprakash has a series of IGT tests for these properties:
> > > > > https://patchwork.freedesktop.org/series/96895/
> > > > > 
> > > > > I've rebased these on amd-staging-drm-next and maintain a kernel and 
> > > > > IGT
> > > > > branch with these patches:
> > > > > https://gitlab.freedesktop.org/hwentland/linux/-/tree/color-and-hdr
> > > > > https://gitlab.freedesktop.org/hwentland/igt-gpu-tools/-/tree/color-and-hdr
> > > > > 
> > > > > > We've had many discussions with Weston guys on this. In order to merge the
> > > > > > kernel properties we need a canonical userspace implementation that is
> > > > > > using them. Weston guys are working towards that but if you want to suggest a
> > > > > > different userspace to serve as that vehicle I'd be all ears. :)
> > > > > 
> > > > > Note that in order to show this all working we also need a Wayland 
> > > > > Protocol
> > > > > update.
> > > > > 
> > > > > See
> > > > > https://gitlab.freedesktop.org/pq/color-and-hdr
> > > > > https://gitlab.freedesktop.org/swick/wayland-protocols
> > > > > https://gitlab.freedesktop.org/wayland/weston/-/issues/467
> > > > 
> > > > So, I've followed these discussions (until the issue on naming) because
> > > > initially I considered it addresses our current goals for color
> > > > correction. But after some discussions, what we are targeting is a 3D
> > > > LUT after blending (per-CRTC). I found past proposals on dri-devel
> > > > [1][2] to extend the DRM CRTC color management properties, but they
> > > > didn't move forward and were never applied.
> > > > 
> > > 
> > > They're stuck in limbo until we have an upstream userspace
> > > implementation that's making use of them.
> > 
> > Yes... afaiu, the basic requirements for all of these changes are IGT
> > tests + open userspace usage, right?
> > 
> 
> Correct. See [1] and [2].
> 
> [1] https://www.kernel.org/doc/html/latest/gpu/drm-kms.html#requirements
> [2] 
> https://www.kernel.org/doc/html/latest/gpu/drm-uapi.html#open-source-userspace-requirements
> 
> > > 
> > > > > 
> > > > > > * - Input gamma LUT (de-normalized)
> > > > > > * - Input CSC (normalized)
> > > > > > * - Surface degamma LUT (normalized)
> > > > > > * - Surface CSC (normalized)
> > > > > > * - Surface regamma LUT (normalized)
> > > > > > * - Output CSC (normalized)
> > > > > > so DM is "adapting" those DRM per-CRTC properties to fit into three 
> > > > > > of
> > > > > > these color correction stages, which I guess are the surface stages:
> > > > > > 
> > > > > > * - Surface degamma LUT (normalized)
> > > > > > * - Surface CSC (normalized)
> > > > > > * - Surface regamma LUT (normalized)
> > > > > > 
> > > > > > > I'm trying to understand what this mapping is doing. A comment mentions
> > > > > > > that it is not possible to do these color corrections after blending, so
> > > > > > > the same color correction pipe is performed on every plane before
> > > > > > > blending?  (is the surface the plane?) Does this adaptation affect the
> > > > > > > expected output?  Moreover, is there something that I misunderstood? :)
> > > > > > 
> > > > > 
> > > > > What's possible to do before and after blending has changed quite a 
> > > > > bit
> > > > > between DCN generations. We program the CRTC Gamma and CTM after 
> > > > > blending.
> > > > > See attached picture for a view relating the color bits between the 
> > > > > DRM
> > > > > interface, DC interface and DCN 3.0 HW blocks.
> > > > 
> > > > This picture is really enlightening, thanks!
> > > > You said it changes between generations, therefore, I can't consider the
> > > > DCN 2.x family follow the same mapping, right? If so, can you share the
> > > > main differences for a DCN 2.x regarding per-CRTC properties?
> > > > 
> > > 
> > > See attached diagram for DCN 2.0.
> > 
> > Thanks again!
> > 
> > > 
> > > > > 
> > > > > >

[PATCH v3 4/4] fbdev: vesafb: Cleanup fb_info in .fb_destroy rather than .remove

The driver is calling framebuffer_release() in its .remove callback, but
this will cause the struct fb_info to be freed too early, since a reference
to it could still be held if user-space opened the fbdev.

This would lead to a use-after-free error if the framebuffer device was
unregistered but later a user-space process tries to close the fbdev fd.

To prevent this, move the framebuffer_release() call to fb_ops.fb_destroy
instead of doing it in the driver's .remove callback.

Strictly speaking, the code flow in the driver is still wrong because all
the hardware cleanup (i.e. iounmap) should be done in .remove while the
software cleanup (i.e. releasing the framebuffer) should be done in the
.fb_destroy handler. But this at least makes it match the behavior before
commit 27599aacbaef ("fbdev: Hot-unplug firmware fb devices on forced removal").

Fixes: 27599aacbaef ("fbdev: Hot-unplug firmware fb devices on forced removal")
Suggested-by: Daniel Vetter 
Signed-off-by: Javier Martinez Canillas 
Reviewed-by: Thomas Zimmermann 
Reviewed-by: Daniel Vetter 
---

Changes in v3:
- Only move framebuffer_release() and don't do any other change
  (Daniel Vetter).

Changes in v2:
- Also do the change for vesafb (Thomas Zimmermann).

 drivers/video/fbdev/vesafb.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/video/fbdev/vesafb.c b/drivers/video/fbdev/vesafb.c
index df6de5a9dd4c..e25e8de5ff67 100644
--- a/drivers/video/fbdev/vesafb.c
+++ b/drivers/video/fbdev/vesafb.c
@@ -179,6 +179,10 @@ static int vesafb_setcolreg(unsigned regno, unsigned red, 
unsigned green,
return err;
 }
 
+/*
+ * fb_ops.fb_destroy is called by the last put_fb_info() call at the end
+ * of unregister_framebuffer() or fb_release(). Do any cleanup here.
+ */
 static void vesafb_destroy(struct fb_info *info)
 {
struct vesafb_par *par = info->par;
@@ -188,6 +192,8 @@ static void vesafb_destroy(struct fb_info *info)
if (info->screen_base)
iounmap(info->screen_base);
release_mem_region(info->apertures->ranges[0].base, 
info->apertures->ranges[0].size);
+
+   framebuffer_release(info);
 }
 
 static struct fb_ops vesafb_ops = {
@@ -484,10 +490,10 @@ static int vesafb_remove(struct platform_device *pdev)
 {
struct fb_info *info = platform_get_drvdata(pdev);
 
+   /* vesafb_destroy takes care of info cleanup */
unregister_framebuffer(info);
if (((struct vesafb_par *)(info->par))->region)
release_region(0x3c0, 32);
-   framebuffer_release(info);
 
return 0;
 }
-- 
2.35.1
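
The lifetime rule this patch relies on can be illustrated outside the kernel. The following is a minimal userspace sketch, assuming a plain integer reference count: the object is freed only from a destroy callback that runs on the last put, so a remove step that merely drops the registration reference cannot free it while an open file still holds one. All names (`struct demo_fb`, `demo_fb_put()`, and so on) are hypothetical stand-ins, not the fbdev API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct fb_info; not the kernel structure. */
struct demo_fb {
	int refcount;                         /* one for registration, one per open fd */
	bool *released;                       /* lets the caller observe the free */
	void (*destroy)(struct demo_fb *fb);  /* plays the role of fb_ops.fb_destroy */
};

/* Destroy callback: the only place where the object is freed. */
static void demo_fb_destroy(struct demo_fb *fb)
{
	*fb->released = true;
	free(fb);
}

static struct demo_fb *demo_fb_register(bool *released)
{
	struct demo_fb *fb = calloc(1, sizeof(*fb));

	if (!fb)
		return NULL;
	fb->refcount = 1;                     /* the registration reference */
	fb->released = released;
	fb->destroy = demo_fb_destroy;
	*released = false;
	return fb;
}

static void demo_fb_get(struct demo_fb *fb)
{
	fb->refcount++;
}

static void demo_fb_put(struct demo_fb *fb)
{
	assert(fb->refcount > 0);
	if (--fb->refcount == 0)
		fb->destroy(fb);
}

/* Models .remove: drops the registration reference and frees nothing itself. */
static void demo_fb_remove(struct demo_fb *fb)
{
	demo_fb_put(fb);
}

/* Returns true if the object outlived remove and was freed on the last put. */
static bool demo_fb_lifetime_ok(void)
{
	bool released;
	bool alive_after_remove;
	struct demo_fb *fb = demo_fb_register(&released);

	if (!fb)
		return false;
	demo_fb_get(fb);                      /* user space opens the device */
	demo_fb_remove(fb);                   /* driver is unbound */
	alive_after_remove = !released;
	demo_fb_put(fb);                      /* user space closes the fd */
	return alive_after_remove && released;
}
```

Freeing the object directly in `demo_fb_remove()` instead would leave the later `demo_fb_put()` touching freed memory, which is the use-after-free the commit message describes.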

[PATCH v3 3/4] fbdev: efifb: Cleanup fb_info in .fb_destroy rather than .remove

The driver is calling framebuffer_release() in its .remove callback, but
this will cause the struct fb_info to be freed too early, since a reference
to it could still be held if user-space opened the fbdev.

This would lead to a use-after-free error if the framebuffer device was
unregistered but later a user-space process tries to close the fbdev fd.

To prevent this, move the framebuffer_release() call to fb_ops.fb_destroy
instead of doing it in the driver's .remove callback.

Strictly speaking, the code flow in the driver is still wrong because all
the hardware cleanup (i.e. iounmap) should be done in .remove while the
software cleanup (i.e. releasing the framebuffer) should be done in the
.fb_destroy handler. But this at least makes it match the behavior before
commit 27599aacbaef ("fbdev: Hot-unplug firmware fb devices on forced removal").

Fixes: 27599aacbaef ("fbdev: Hot-unplug firmware fb devices on forced removal")
Suggested-by: Daniel Vetter 
Signed-off-by: Javier Martinez Canillas 
Reviewed-by: Thomas Zimmermann 
Reviewed-by: Daniel Vetter 
---

(no changes since v1)

 drivers/video/fbdev/efifb.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/drivers/video/fbdev/efifb.c b/drivers/video/fbdev/efifb.c
index ea42ba6445b2..cfa3dc0b4eee 100644
--- a/drivers/video/fbdev/efifb.c
+++ b/drivers/video/fbdev/efifb.c
@@ -243,6 +243,10 @@ static void efifb_show_boot_graphics(struct fb_info *info)
 static inline void efifb_show_boot_graphics(struct fb_info *info) {}
 #endif
 
+/*
+ * fb_ops.fb_destroy is called by the last put_fb_info() call at the end
+ * of unregister_framebuffer() or fb_release(). Do any cleanup here.
+ */
 static void efifb_destroy(struct fb_info *info)
 {
if (efifb_pci_dev)
@@ -254,6 +258,9 @@ static void efifb_destroy(struct fb_info *info)
else
memunmap(info->screen_base);
}
+
+   framebuffer_release(info);
+
if (request_mem_succeeded)
release_mem_region(info->apertures->ranges[0].base,
   info->apertures->ranges[0].size);
@@ -620,9 +627,9 @@ static int efifb_remove(struct platform_device *pdev)
 {
struct fb_info *info = platform_get_drvdata(pdev);
 
+   /* efifb_destroy takes care of info cleanup */
unregister_framebuffer(info);
sysfs_remove_groups(&pdev->dev.kobj, efifb_groups);
-   framebuffer_release(info);
 
return 0;
 }
-- 
2.35.1

[PATCH v3 2/4] fbdev: simplefb: Cleanup fb_info in .fb_destroy rather than .remove

The driver calls framebuffer_release() in its .remove callback, but this
frees the struct fb_info too early, since a reference to it may still be
held if user-space opened the fbdev.

This would lead to a use-after-free error if the framebuffer device was
unregistered but later a user-space process tries to close the fbdev fd.

To prevent this, move the framebuffer_release() call to fb_ops.fb_destroy
instead of doing it in the driver's .remove callback.

Strictly speaking, the code flow in the driver is still wrong because all
the hardware cleanup (e.g. iounmap) should be done in .remove while the
software cleanup (e.g. releasing the framebuffer) should be done in the
.fb_destroy handler. But this at least makes it match the behavior before
commit 27599aacbaef ("fbdev: Hot-unplug firmware fb devices on forced removal").

Fixes: 27599aacbaef ("fbdev: Hot-unplug firmware fb devices on forced removal")
Suggested-by: Daniel Vetter 
Signed-off-by: Javier Martinez Canillas 
Reviewed-by: Thomas Zimmermann 
Reviewed-by: Daniel Vetter 
---

(no changes since v1)

 drivers/video/fbdev/simplefb.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/video/fbdev/simplefb.c b/drivers/video/fbdev/simplefb.c
index 94fc9c6d0411..2c198561c338 100644
--- a/drivers/video/fbdev/simplefb.c
+++ b/drivers/video/fbdev/simplefb.c
@@ -84,6 +84,10 @@ struct simplefb_par {
 static void simplefb_clocks_destroy(struct simplefb_par *par);
 static void simplefb_regulators_destroy(struct simplefb_par *par);
 
+/*
+ * fb_ops.fb_destroy is called by the last put_fb_info() call at the end
+ * of unregister_framebuffer() or fb_release(). Do any cleanup here.
+ */
 static void simplefb_destroy(struct fb_info *info)
 {
struct simplefb_par *par = info->par;
@@ -94,6 +98,8 @@ static void simplefb_destroy(struct fb_info *info)
if (info->screen_base)
iounmap(info->screen_base);
 
+   framebuffer_release(info);
+
if (mem)
release_mem_region(mem->start, resource_size(mem));
 }
@@ -545,8 +551,8 @@ static int simplefb_remove(struct platform_device *pdev)
 {
struct fb_info *info = platform_get_drvdata(pdev);
 
+   /* simplefb_destroy takes care of info cleanup */
unregister_framebuffer(info);
-   framebuffer_release(info);
 
return 0;
 }
-- 
2.35.1

[PATCH v3 1/4] fbdev: Prevent possible use-after-free in fb_release()

From: Daniel Vetter 

Most fbdev drivers have issues with the fb_info lifetime, because they call
framebuffer_release() from their driver's .remove callback, rather than
doing it from the fbops.fb_destroy callback.

Doing that will destroy the fb_info too early, while references to it may
still exist, leading to a use-after-free error.

To prevent this, check the fb_info reference counter when attempting to
kfree the data structure in framebuffer_release(). That will leak it but
at least will prevent the mentioned error.
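
The guard described above can be illustrated outside the kernel. The
following is a minimal userspace sketch, not the fbdev code: struct
demo_info, demo_alloc() and demo_release() are hypothetical names, and a
plain int stands in for the kernel's refcount_t. It only shows the idea of
refusing to free an object while references remain, accepting a leak in
exchange for avoiding a use-after-free.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct fb_info; not the kernel structure. */
struct demo_info {
	int refs;      /* outstanding references, e.g. open file descriptors */
	void *payload; /* stands in for data owned by the object */
};

/* Allocate an object that starts with one reference. */
static struct demo_info *demo_alloc(void)
{
	struct demo_info *info = calloc(1, sizeof(*info));

	if (!info)
		return NULL;
	info->refs = 1;
	info->payload = malloc(16);
	return info;
}

/*
 * Free the object only when nothing references it any more.
 * Returns true if the memory was freed, false if it was left alone.
 */
static bool demo_release(struct demo_info *info)
{
	if (!info)
		return false;

	assert(info->refs >= 0);
	if (info->refs != 0)
		return false; /* leak rather than risk a use-after-free */

	free(info->payload);
	free(info);
	return true;
}
```

Calling demo_release() on a freshly allocated object returns false because
one reference is still outstanding; once the count reaches zero the same
call frees the memory.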

Signed-off-by: Daniel Vetter 
Signed-off-by: Javier Martinez Canillas 
Reviewed-by: Thomas Zimmermann 
---

(no changes since v1)

 drivers/video/fbdev/core/fbsysfs.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/video/fbdev/core/fbsysfs.c 
b/drivers/video/fbdev/core/fbsysfs.c
index 8c1ee9ecec3d..c2a60b187467 100644
--- a/drivers/video/fbdev/core/fbsysfs.c
+++ b/drivers/video/fbdev/core/fbsysfs.c
@@ -80,6 +80,10 @@ void framebuffer_release(struct fb_info *info)
 {
if (!info)
return;
+
+   if (WARN_ON(refcount_read(&info->count)))
+   return;
+
kfree(info->apertures);
kfree(info);
 }
-- 
2.35.1

Re: [PATCH v3 2/2] drm/msm/mdp5: Return error code in mdp5_mixer_release when deadlock is detected

2022-05-05 Thread Rob Clark

On Thu, May 5, 2022 at 2:41 PM Jessica Zhang  wrote:
>
> mdp5_get_global_state can return -EDEADLK when acquiring the modeset
> lock, but currently mdp5_mixer_release doesn't check whether
> global_state holds an error.
>
> To avoid a NULL dereference error, let's have mdp5_mixer_release
> check if an error is returned and propagate that error.
>
> Reported-by: Tomeu Vizoso 
> Signed-off-by: Jessica Zhang 

Fixes: 7907a0d77cb4 ("drm/msm/mdp5: Use the new private_obj state")
Reviewed-by: Rob Clark 

> ---
>  drivers/gpu/drm/msm/disp/mdp5/mdp5_crtc.c  | 10 --
>  drivers/gpu/drm/msm/disp/mdp5/mdp5_mixer.c | 15 +++
>  drivers/gpu/drm/msm/disp/mdp5/mdp5_mixer.h |  4 ++--
>  3 files changed, 21 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/gpu/drm/msm/disp/mdp5/mdp5_crtc.c 
> b/drivers/gpu/drm/msm/disp/mdp5/mdp5_crtc.c
> index b966cd69f99d..fe2922c8d21b 100644
> --- a/drivers/gpu/drm/msm/disp/mdp5/mdp5_crtc.c
> +++ b/drivers/gpu/drm/msm/disp/mdp5/mdp5_crtc.c
> @@ -612,9 +612,15 @@ static int mdp5_crtc_setup_pipeline(struct drm_crtc 
> *crtc,
> if (ret)
> return ret;
>
> -   mdp5_mixer_release(new_crtc_state->state, old_mixer);
> +   ret = mdp5_mixer_release(new_crtc_state->state, old_mixer);
> +   if (ret)
> +   return ret;
> +
> if (old_r_mixer) {
> -   mdp5_mixer_release(new_crtc_state->state, 
> old_r_mixer);
> +   ret = mdp5_mixer_release(new_crtc_state->state, 
> old_r_mixer);
> +   if (ret)
> +   return ret;
> +
> if (!need_right_mixer)
> pipeline->r_mixer = NULL;
> }
> diff --git a/drivers/gpu/drm/msm/disp/mdp5/mdp5_mixer.c 
> b/drivers/gpu/drm/msm/disp/mdp5/mdp5_mixer.c
> index 954db683ae44..2536def2a000 100644
> --- a/drivers/gpu/drm/msm/disp/mdp5/mdp5_mixer.c
> +++ b/drivers/gpu/drm/msm/disp/mdp5/mdp5_mixer.c
> @@ -116,21 +116,28 @@ int mdp5_mixer_assign(struct drm_atomic_state *s, 
> struct drm_crtc *crtc,
> return 0;
>  }
>
> -void mdp5_mixer_release(struct drm_atomic_state *s, struct mdp5_hw_mixer 
> *mixer)
> +int mdp5_mixer_release(struct drm_atomic_state *s, struct mdp5_hw_mixer 
> *mixer)
>  {
> struct mdp5_global_state *global_state = mdp5_get_global_state(s);
> -   struct mdp5_hw_mixer_state *new_state = &global_state->hwmixer;
> +   struct mdp5_hw_mixer_state *new_state;
>
> if (!mixer)
> -   return;
> +   return 0;
> +
> +   if (IS_ERR(global_state))
> +   return PTR_ERR(global_state);
> +
> +   new_state = &global_state->hwmixer;
>
> if (WARN_ON(!new_state->hwmixer_to_crtc[mixer->idx]))
> -   return;
> +   return -EINVAL;
>
> DBG("%s: release from crtc %s", mixer->name,
> new_state->hwmixer_to_crtc[mixer->idx]->name);
>
> new_state->hwmixer_to_crtc[mixer->idx] = NULL;
> +
> +   return 0;
>  }
>
>  void mdp5_mixer_destroy(struct mdp5_hw_mixer *mixer)
> diff --git a/drivers/gpu/drm/msm/disp/mdp5/mdp5_mixer.h 
> b/drivers/gpu/drm/msm/disp/mdp5/mdp5_mixer.h
> index 43c9ba43ce18..545ee223b9d7 100644
> --- a/drivers/gpu/drm/msm/disp/mdp5/mdp5_mixer.h
> +++ b/drivers/gpu/drm/msm/disp/mdp5/mdp5_mixer.h
> @@ -30,7 +30,7 @@ void mdp5_mixer_destroy(struct mdp5_hw_mixer *lm);
>  int mdp5_mixer_assign(struct drm_atomic_state *s, struct drm_crtc *crtc,
>   uint32_t caps, struct mdp5_hw_mixer **mixer,
>   struct mdp5_hw_mixer **r_mixer);
> -void mdp5_mixer_release(struct drm_atomic_state *s,
> -   struct mdp5_hw_mixer *mixer);
> +int mdp5_mixer_release(struct drm_atomic_state *s,
> +  struct mdp5_hw_mixer *mixer);
>
>  #endif /* __MDP5_LM_H__ */
> --
> 2.35.1
>

Re: [PATCH v3 1/2] drm/msm/mdp5: Return error code in mdp5_pipe_release when deadlock is detected

2022-05-05 Thread Rob Clark

On Thu, May 5, 2022 at 2:41 PM Jessica Zhang  wrote:
>
> mdp5_get_global_state runs the risk of hitting a -EDEADLK when acquiring
> the modeset lock, but currently mdp5_pipe_release doesn't check whether
> an error is returned. Because of this, there is a possibility of
> mdp5_pipe_release hitting a NULL dereference error.
>
> To avoid this, let's have mdp5_pipe_release check if
> mdp5_get_global_state returns an error and propagate that error.
>
> Changes since v1:
> - Separated declaration and initialization of *new_state to avoid
>   compiler warning
> - Fixed some spelling mistakes in commit message
>
> Changes since v2:
> - Return 0 in case where hwpipe is NULL as this is considered normal
>   behavior
> - Added 2nd patch in series to fix a similar NULL dereference issue in
>   mdp5_mixer_release
>
> Reported-by: Tomeu Vizoso 
> Signed-off-by: Jessica Zhang 

Fixes: 7907a0d77cb4 ("drm/msm/mdp5: Use the new private_obj state")
Reviewed-by: Rob Clark 

> ---
>  drivers/gpu/drm/msm/disp/mdp5/mdp5_pipe.c  | 15 +++
>  drivers/gpu/drm/msm/disp/mdp5/mdp5_pipe.h  |  2 +-
>  drivers/gpu/drm/msm/disp/mdp5/mdp5_plane.c | 20 
>  3 files changed, 28 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/gpu/drm/msm/disp/mdp5/mdp5_pipe.c 
> b/drivers/gpu/drm/msm/disp/mdp5/mdp5_pipe.c
> index ba6695963aa6..a4f5cb90f3e8 100644
> --- a/drivers/gpu/drm/msm/disp/mdp5/mdp5_pipe.c
> +++ b/drivers/gpu/drm/msm/disp/mdp5/mdp5_pipe.c
> @@ -119,18 +119,23 @@ int mdp5_pipe_assign(struct drm_atomic_state *s, struct 
> drm_plane *plane,
> return 0;
>  }
>
> -void mdp5_pipe_release(struct drm_atomic_state *s, struct mdp5_hw_pipe 
> *hwpipe)
> +int mdp5_pipe_release(struct drm_atomic_state *s, struct mdp5_hw_pipe 
> *hwpipe)
>  {
> struct msm_drm_private *priv = s->dev->dev_private;
> struct mdp5_kms *mdp5_kms = to_mdp5_kms(to_mdp_kms(priv->kms));
> struct mdp5_global_state *state = mdp5_get_global_state(s);
> -   struct mdp5_hw_pipe_state *new_state = &state->hwpipe;
> +   struct mdp5_hw_pipe_state *new_state;
>
> if (!hwpipe)
> -   return;
> +   return 0;
> +
> +   if (IS_ERR(state))
> +   return PTR_ERR(state);
> +
> +   new_state = &state->hwpipe;
>
> if (WARN_ON(!new_state->hwpipe_to_plane[hwpipe->idx]))
> -   return;
> +   return -EINVAL;
>
> DBG("%s: release from plane %s", hwpipe->name,
> new_state->hwpipe_to_plane[hwpipe->idx]->name);
> @@ -141,6 +146,8 @@ void mdp5_pipe_release(struct drm_atomic_state *s, struct 
> mdp5_hw_pipe *hwpipe)
> }
>
> new_state->hwpipe_to_plane[hwpipe->idx] = NULL;
> +
> +   return 0;
>  }
>
>  void mdp5_pipe_destroy(struct mdp5_hw_pipe *hwpipe)
> diff --git a/drivers/gpu/drm/msm/disp/mdp5/mdp5_pipe.h 
> b/drivers/gpu/drm/msm/disp/mdp5/mdp5_pipe.h
> index 9b26d0761bd4..cca67938cab2 100644
> --- a/drivers/gpu/drm/msm/disp/mdp5/mdp5_pipe.h
> +++ b/drivers/gpu/drm/msm/disp/mdp5/mdp5_pipe.h
> @@ -37,7 +37,7 @@ int mdp5_pipe_assign(struct drm_atomic_state *s, struct 
> drm_plane *plane,
>  uint32_t caps, uint32_t blkcfg,
>  struct mdp5_hw_pipe **hwpipe,
>  struct mdp5_hw_pipe **r_hwpipe);
> -void mdp5_pipe_release(struct drm_atomic_state *s, struct mdp5_hw_pipe 
> *hwpipe);
> +int mdp5_pipe_release(struct drm_atomic_state *s, struct mdp5_hw_pipe 
> *hwpipe);
>
>  struct mdp5_hw_pipe *mdp5_pipe_init(enum mdp5_pipe pipe,
> uint32_t reg_offset, uint32_t caps);
> diff --git a/drivers/gpu/drm/msm/disp/mdp5/mdp5_plane.c 
> b/drivers/gpu/drm/msm/disp/mdp5/mdp5_plane.c
> index 228b22830970..979458482841 100644
> --- a/drivers/gpu/drm/msm/disp/mdp5/mdp5_plane.c
> +++ b/drivers/gpu/drm/msm/disp/mdp5/mdp5_plane.c
> @@ -311,12 +311,24 @@ static int mdp5_plane_atomic_check_with_state(struct 
> drm_crtc_state *crtc_state,
> mdp5_state->r_hwpipe = NULL;
>
>
> -   mdp5_pipe_release(state->state, old_hwpipe);
> -   mdp5_pipe_release(state->state, old_right_hwpipe);
> +   ret = mdp5_pipe_release(state->state, old_hwpipe);
> +   if (ret)
> +   return ret;
> +
> +   ret = mdp5_pipe_release(state->state, 
> old_right_hwpipe);
> +   if (ret)
> +   return ret;
> +
> }
> } else {
> -   mdp5_pipe_release(state->state, mdp5_state->hwpipe);
> -   mdp5_pipe_release(state->state, mdp5_state->r_hwpipe);
> +   ret = mdp5_pipe_release(state->state, mdp5_state->hwpipe);
> +   if (ret)
> +   return ret;
> +
> +   ret = mdp5_pipe_release(state->state, mdp5_state->r_hwpipe);
> +   if (ret)
> +   return ret;
> +
>

[PATCH v3 0/4] fbdev: Fix use-after-free caused by wrong fb_info cleanup in drivers

Hello,

This series contains patches suggested by Daniel Vetter to fix a use-after-free
error in the fb_release() function, due to the fb_info associated with an fbdev
being freed too early while a user-space process still has the fbdev dev node
open.

That is caused by a wrong management of the struct fb_info lifetime in drivers,
but the fbdev core can also be made more resilient about it an leak

This can easily be reproduced with the simplefb driver doing the following:

$ cat < /dev/fb0 &
$ echo simple-framebuffer.0 > 
/sys/bus/platform/drivers/simple-framebuffer/unbind
$ kill %1

[  257.490471] [ cut here ]
...
[  257.495125] refcount_t: underflow; use-after-free.
[  257.495222] WARNING: CPU: 0 PID: 975 at lib/refcount.c:28 
refcount_warn_saturate+0xf4/0x144
...
[  257.637482] pstate: 8045 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[  257.61] pc : refcount_warn_saturate+0xf4/0x144
[  257.649226] lr : refcount_warn_saturate+0xf4/0x144
[  257.654009] sp : 8a06bbf0
[  257.657315] x29: 8a06bbf0 x28: 000a x27: 000a
[  257.664448] x26:  x25: 470b88c6a180 x24: 000a
[  257.671581] x23: 470b81706480 x22: 470b808c2160 x21: 470b8922ba20
[  257.678713] x20: 470b891f5810 x19: 470b891f5800 x18: 
[  257.685846] x17: 3a725f7463656a62 x16: bb18c6465fd4 x15: 0720072007200720
[  257.692978] x14: 0720072d072d072d x13: 0a2e656572662d72 x12: 657466612d657375
[  257.700110] x11: 203b776f6c667265 x10: 646e75203a745f74 x9 : bb18c58f6c90
[  257.707242] x8 : 75203b776f6c6672 x7 : 65646e75203a745f x6 : 0001
[  257.714373] x5 : 470bff8ec418 x4 :  x3 : 0027
[  257.721506] x2 :  x1 : 0027 x0 : 0026
[  257.728638] Call trace:
[  257.731075]  refcount_warn_saturate+0xf4/0x144
[  257.735513]  put_fb_info+0x70/0x7c
[  257.738916]  fb_release+0x60/0x74
[  257.742225]  __fput+0x88/0x240
[  257.745276]  fput+0x1c/0x30
[  257.748410]  task_work_run+0xc4/0x21c
[  257.752066]  do_exit+0x170/0x370
[  257.755288]  do_group_exit+0x40/0xb4
[  257.758858]  get_signal+0x8e0/0x90c
[  257.762339]  do_signal+0x1a0/0x280
[  257.765733]  do_notify_resume+0xc8/0x390
[  257.769650]  el0_da+0xe8/0xf0
[  257.772613]  el0t_64_sync_handler+0xe8/0x130
[  257.776877]  el0t_64_sync+0x190/0x194
[  257.780534] ---[ end trace  ]---

Patch #1 adds a WARN_ON() to framebuffer_release() to prevent the use-after-free
from happening.

Patch #2, #3 and #4 fix the simplefb, efifb and vesafb drivers respectively, to
free the resources at the correct place.
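
The lifetime rule behind patches #2 to #4 can be sketched in plain C. This
is an illustrative userspace model only: struct demo_fb, demo_get(),
demo_put(), demo_remove() and the destroyed counter are hypothetical names
that mimic the roles of fb_info, the open/close reference, put_fb_info(),
the driver's .remove callback and .fb_destroy. The point it shows is that
the final free happens in the destroy callback, which runs on the last put,
whichever of "unbind" or "close" comes last.

```c
#include <assert.h>
#include <stdlib.h>

struct demo_fb;

/* Hypothetical ops table; only the destroy hook is modelled. */
struct demo_ops {
	void (*destroy)(struct demo_fb *fb);
};

struct demo_fb {
	int refs;                   /* registration + open file references */
	const struct demo_ops *ops;
	int *destroyed;             /* counter so callers can observe the free */
};

/* Take a reference, as opening the device node would. */
static void demo_get(struct demo_fb *fb)
{
	fb->refs++;
}

/* Drop a reference; the last one triggers the destroy callback. */
static void demo_put(struct demo_fb *fb)
{
	assert(fb->refs > 0);
	if (--fb->refs == 0)
		fb->ops->destroy(fb);
}

/* Destroy callback: the only place where the object is freed. */
static void demo_destroy(struct demo_fb *fb)
{
	if (fb->destroyed)
		(*fb->destroyed)++;
	free(fb);
}

static const struct demo_ops demo_fb_ops = { .destroy = demo_destroy };

/* Create an object holding the initial "registered" reference. */
static struct demo_fb *demo_register(int *destroyed)
{
	struct demo_fb *fb = calloc(1, sizeof(*fb));

	if (!fb)
		return NULL;
	fb->refs = 1;
	fb->ops = &demo_fb_ops;
	fb->destroyed = destroyed;
	return fb;
}

/* Driver removal: drops the registration reference and frees nothing itself. */
static void demo_remove(struct demo_fb *fb)
{
	demo_put(fb);
}
```

With this shape, a sequence of demo_register(), demo_get(), demo_remove()
and a final demo_put() frees the object exactly once, at the last step,
which mirrors the "cat < /dev/fb0", unbind, kill sequence from the
reproducer.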

Changes in v3:
- Add Fixes: tag (Daniel Vetter).
- Include in commit message a note that drivers are still broken
  but at least reverts to the previous behavior (Daniel Vetter).
- Only move framebuffer_release() and don't do any other change
  (Daniel Vetter).

Changes in v2:
- Also do the change for vesafb (Thomas Zimmermann).

Daniel Vetter (1):
  fbdev: Prevent possible use-after-free in fb_release()

Javier Martinez Canillas (3):
  fbdev: simplefb: Cleanup fb_info in .fb_destroy rather than .remove
  fbdev: efifb: Cleanup fb_info in .fb_destroy rather than .remove
  fbdev: vesafb: Cleanup fb_info in .fb_destroy rather than .remove

 drivers/video/fbdev/core/fbsysfs.c | 4 
 drivers/video/fbdev/efifb.c| 9 -
 drivers/video/fbdev/simplefb.c | 8 +++-
 drivers/video/fbdev/vesafb.c   | 8 +++-
 4 files changed, 26 insertions(+), 3 deletions(-)

-- 
2.35.1

[PATCH] drm/tve200: Fix smatch warning

2022-05-05 Thread Linus Walleij

The "ret" variable is ambiguously returning something that
could be zero in the tve200_modeset_init() function, assign
it an explicit error return code to make this unambiguous.
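
As a rough illustration of the pattern, here is a minimal userspace sketch.
The function demo_modeset_init() and its parameters are hypothetical and
only model the control flow; it is not the tve200 code. The status variable
is left uninitialized and every path that jumps to a cleanup label assigns
an explicit negative errno first, so no failure path can fall through with a
stale zero.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical init routine with goto-based unwinding.
 * *cleaned counts how often the bridge cleanup label was reached.
 */
static int demo_modeset_init(int have_bridge, int bridge_is_panel, int *cleaned)
{
	int ret; /* deliberately not initialized to 0 */

	assert(cleaned != NULL);

	if (!have_bridge) {
		ret = -ENODEV;
		goto out;
	}

	if (!bridge_is_panel) {
		ret = -EINVAL; /* explicit code before jumping to cleanup */
		goto out_bridge;
	}

	return 0;

out_bridge:
	(*cleaned)++; /* stands in for releasing the bridge */
out:
	return ret;
}
```

Leaving ret without an initializer also lets the compiler or a static
checker flag any new error path that forgets to set it.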

Reported-by: Dan Carpenter 
Signed-off-by: Linus Walleij 
---
 drivers/gpu/drm/tve200/tve200_drv.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/tve200/tve200_drv.c 
b/drivers/gpu/drm/tve200/tve200_drv.c
index 6d9d2921abf4..3f4fef574706 100644
--- a/drivers/gpu/drm/tve200/tve200_drv.c
+++ b/drivers/gpu/drm/tve200/tve200_drv.c
@@ -65,7 +65,7 @@ static int tve200_modeset_init(struct drm_device *dev)
struct tve200_drm_dev_private *priv = dev->dev_private;
struct drm_panel *panel;
struct drm_bridge *bridge;
-   int ret = 0;
+   int ret;
 
drm_mode_config_init(dev);
mode_config = &dev->mode_config;
@@ -93,6 +93,7 @@ static int tve200_modeset_init(struct drm_device *dev)
 * method to get the connector out of the bridge.
 */
dev_err(dev->dev, "the bridge is not a panel\n");
+   ret = -EINVAL;
goto out_bridge;
}
 
-- 
2.35.1

[PATCH v3 2/2] drm/msm/mdp5: Return error code in mdp5_mixer_release when deadlock is detected

2022-05-05 Thread Jessica Zhang

mdp5_get_global_state can return -EDEADLK when acquiring the modeset
lock, but currently mdp5_mixer_release doesn't check whether
global_state holds an error.

To avoid a NULL dereference error, let's have mdp5_mixer_release
check if an error is returned and propagate that error.
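
For readers unfamiliar with the error-pointer convention that the IS_ERR()
and PTR_ERR() checks in this patch rely on, the following is a small
userspace sketch. The demo_* helpers are hypothetical re-implementations
written for this example, not the kernel macros, and struct demo_state only
loosely mirrors the mixer-to-crtc table. It shows a lookup that may return
an encoded -EDEADLK, and a release function that checks for it before
touching the state.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_ERRNO 4095
#define DEMO_SLOTS 4

/* Encode a negative errno value as a pointer (assumed convention). */
static inline void *demo_err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

/* Recover the errno value from an encoded pointer. */
static inline long demo_ptr_err(const void *p)
{
	return (long)(intptr_t)p;
}

/* True if the pointer lies in the top DEMO_MAX_ERRNO addresses. */
static inline int demo_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-DEMO_MAX_ERRNO;
}

struct demo_state {
	int slot[DEMO_SLOTS]; /* non-zero means "assigned" */
};

/* Lookup that fails with -EDEADLK when the lock is contended. */
static struct demo_state *demo_get_state(struct demo_state *backing,
					 int contended)
{
	if (contended)
		return demo_err_ptr(-EDEADLK);
	return backing;
}

/* Release slot idx; a negative idx means there is nothing to release. */
static int demo_release(struct demo_state *backing, int contended, int idx)
{
	struct demo_state *state = demo_get_state(backing, contended);

	if (idx < 0)
		return 0;

	if (demo_is_err(state))
		return (int)demo_ptr_err(state); /* propagate, do not dereference */

	assert(idx < DEMO_SLOTS);
	if (!state->slot[idx])
		return -EINVAL;

	state->slot[idx] = 0;
	return 0;
}
```

The ordering matches the patch: the "nothing to release" case returns 0
first, the error check comes next, and only then is the state dereferenced.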

Reported-by: Tomeu Vizoso 
Signed-off-by: Jessica Zhang 
---
 drivers/gpu/drm/msm/disp/mdp5/mdp5_crtc.c  | 10 --
 drivers/gpu/drm/msm/disp/mdp5/mdp5_mixer.c | 15 +++
 drivers/gpu/drm/msm/disp/mdp5/mdp5_mixer.h |  4 ++--
 3 files changed, 21 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/msm/disp/mdp5/mdp5_crtc.c 
b/drivers/gpu/drm/msm/disp/mdp5/mdp5_crtc.c
index b966cd69f99d..fe2922c8d21b 100644
--- a/drivers/gpu/drm/msm/disp/mdp5/mdp5_crtc.c
+++ b/drivers/gpu/drm/msm/disp/mdp5/mdp5_crtc.c
@@ -612,9 +612,15 @@ static int mdp5_crtc_setup_pipeline(struct drm_crtc *crtc,
if (ret)
return ret;
 
-   mdp5_mixer_release(new_crtc_state->state, old_mixer);
+   ret = mdp5_mixer_release(new_crtc_state->state, old_mixer);
+   if (ret)
+   return ret;
+
if (old_r_mixer) {
-   mdp5_mixer_release(new_crtc_state->state, old_r_mixer);
+   ret = mdp5_mixer_release(new_crtc_state->state, 
old_r_mixer);
+   if (ret)
+   return ret;
+
if (!need_right_mixer)
pipeline->r_mixer = NULL;
}
diff --git a/drivers/gpu/drm/msm/disp/mdp5/mdp5_mixer.c 
b/drivers/gpu/drm/msm/disp/mdp5/mdp5_mixer.c
index 954db683ae44..2536def2a000 100644
--- a/drivers/gpu/drm/msm/disp/mdp5/mdp5_mixer.c
+++ b/drivers/gpu/drm/msm/disp/mdp5/mdp5_mixer.c
@@ -116,21 +116,28 @@ int mdp5_mixer_assign(struct drm_atomic_state *s, struct 
drm_crtc *crtc,
return 0;
 }
 
-void mdp5_mixer_release(struct drm_atomic_state *s, struct mdp5_hw_mixer 
*mixer)
+int mdp5_mixer_release(struct drm_atomic_state *s, struct mdp5_hw_mixer *mixer)
 {
struct mdp5_global_state *global_state = mdp5_get_global_state(s);
-   struct mdp5_hw_mixer_state *new_state = &global_state->hwmixer;
+   struct mdp5_hw_mixer_state *new_state;
 
if (!mixer)
-   return;
+   return 0;
+
+   if (IS_ERR(global_state))
+   return PTR_ERR(global_state);
+
+   new_state = &global_state->hwmixer;
 
if (WARN_ON(!new_state->hwmixer_to_crtc[mixer->idx]))
-   return;
+   return -EINVAL;
 
DBG("%s: release from crtc %s", mixer->name,
new_state->hwmixer_to_crtc[mixer->idx]->name);
 
new_state->hwmixer_to_crtc[mixer->idx] = NULL;
+
+   return 0;
 }
 
 void mdp5_mixer_destroy(struct mdp5_hw_mixer *mixer)
diff --git a/drivers/gpu/drm/msm/disp/mdp5/mdp5_mixer.h 
b/drivers/gpu/drm/msm/disp/mdp5/mdp5_mixer.h
index 43c9ba43ce18..545ee223b9d7 100644
--- a/drivers/gpu/drm/msm/disp/mdp5/mdp5_mixer.h
+++ b/drivers/gpu/drm/msm/disp/mdp5/mdp5_mixer.h
@@ -30,7 +30,7 @@ void mdp5_mixer_destroy(struct mdp5_hw_mixer *lm);
 int mdp5_mixer_assign(struct drm_atomic_state *s, struct drm_crtc *crtc,
  uint32_t caps, struct mdp5_hw_mixer **mixer,
  struct mdp5_hw_mixer **r_mixer);
-void mdp5_mixer_release(struct drm_atomic_state *s,
-   struct mdp5_hw_mixer *mixer);
+int mdp5_mixer_release(struct drm_atomic_state *s,
+  struct mdp5_hw_mixer *mixer);
 
 #endif /* __MDP5_LM_H__ */
-- 
2.35.1

[PATCH v3 1/2] drm/msm/mdp5: Return error code in mdp5_pipe_release when deadlock is detected

2022-05-05 Thread Jessica Zhang

mdp5_get_global_state runs the risk of hitting a -EDEADLK when acquiring
the modeset lock, but currently mdp5_pipe_release doesn't check whether
an error is returned. Because of this, there is a possibility of
mdp5_pipe_release hitting a NULL dereference error.

To avoid this, let's have mdp5_pipe_release check if
mdp5_get_global_state returns an error and propagate that error.

Changes since v1:
- Separated declaration and initialization of *new_state to avoid
  compiler warning
- Fixed some spelling mistakes in commit message

Changes since v2:
- Return 0 in case where hwpipe is NULL as this is considered normal
  behavior
- Added 2nd patch in series to fix a similar NULL dereference issue in
  mdp5_mixer_release

Reported-by: Tomeu Vizoso 
Signed-off-by: Jessica Zhang 
---
 drivers/gpu/drm/msm/disp/mdp5/mdp5_pipe.c  | 15 +++
 drivers/gpu/drm/msm/disp/mdp5/mdp5_pipe.h  |  2 +-
 drivers/gpu/drm/msm/disp/mdp5/mdp5_plane.c | 20 
 3 files changed, 28 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/msm/disp/mdp5/mdp5_pipe.c 
b/drivers/gpu/drm/msm/disp/mdp5/mdp5_pipe.c
index ba6695963aa6..a4f5cb90f3e8 100644
--- a/drivers/gpu/drm/msm/disp/mdp5/mdp5_pipe.c
+++ b/drivers/gpu/drm/msm/disp/mdp5/mdp5_pipe.c
@@ -119,18 +119,23 @@ int mdp5_pipe_assign(struct drm_atomic_state *s, struct 
drm_plane *plane,
return 0;
 }
 
-void mdp5_pipe_release(struct drm_atomic_state *s, struct mdp5_hw_pipe *hwpipe)
+int mdp5_pipe_release(struct drm_atomic_state *s, struct mdp5_hw_pipe *hwpipe)
 {
struct msm_drm_private *priv = s->dev->dev_private;
struct mdp5_kms *mdp5_kms = to_mdp5_kms(to_mdp_kms(priv->kms));
struct mdp5_global_state *state = mdp5_get_global_state(s);
-   struct mdp5_hw_pipe_state *new_state = &state->hwpipe;
+   struct mdp5_hw_pipe_state *new_state;
 
if (!hwpipe)
-   return;
+   return 0;
+
+   if (IS_ERR(state))
+   return PTR_ERR(state);
+
+   new_state = &state->hwpipe;
 
if (WARN_ON(!new_state->hwpipe_to_plane[hwpipe->idx]))
-   return;
+   return -EINVAL;
 
DBG("%s: release from plane %s", hwpipe->name,
new_state->hwpipe_to_plane[hwpipe->idx]->name);
@@ -141,6 +146,8 @@ void mdp5_pipe_release(struct drm_atomic_state *s, struct 
mdp5_hw_pipe *hwpipe)
}
 
new_state->hwpipe_to_plane[hwpipe->idx] = NULL;
+
+   return 0;
 }
 
 void mdp5_pipe_destroy(struct mdp5_hw_pipe *hwpipe)
diff --git a/drivers/gpu/drm/msm/disp/mdp5/mdp5_pipe.h 
b/drivers/gpu/drm/msm/disp/mdp5/mdp5_pipe.h
index 9b26d0761bd4..cca67938cab2 100644
--- a/drivers/gpu/drm/msm/disp/mdp5/mdp5_pipe.h
+++ b/drivers/gpu/drm/msm/disp/mdp5/mdp5_pipe.h
@@ -37,7 +37,7 @@ int mdp5_pipe_assign(struct drm_atomic_state *s, struct 
drm_plane *plane,
 uint32_t caps, uint32_t blkcfg,
 struct mdp5_hw_pipe **hwpipe,
 struct mdp5_hw_pipe **r_hwpipe);
-void mdp5_pipe_release(struct drm_atomic_state *s, struct mdp5_hw_pipe 
*hwpipe);
+int mdp5_pipe_release(struct drm_atomic_state *s, struct mdp5_hw_pipe *hwpipe);
 
 struct mdp5_hw_pipe *mdp5_pipe_init(enum mdp5_pipe pipe,
uint32_t reg_offset, uint32_t caps);
diff --git a/drivers/gpu/drm/msm/disp/mdp5/mdp5_plane.c 
b/drivers/gpu/drm/msm/disp/mdp5/mdp5_plane.c
index 228b22830970..979458482841 100644
--- a/drivers/gpu/drm/msm/disp/mdp5/mdp5_plane.c
+++ b/drivers/gpu/drm/msm/disp/mdp5/mdp5_plane.c
@@ -311,12 +311,24 @@ static int mdp5_plane_atomic_check_with_state(struct 
drm_crtc_state *crtc_state,
mdp5_state->r_hwpipe = NULL;
 
 
-   mdp5_pipe_release(state->state, old_hwpipe);
-   mdp5_pipe_release(state->state, old_right_hwpipe);
+   ret = mdp5_pipe_release(state->state, old_hwpipe);
+   if (ret)
+   return ret;
+
+   ret = mdp5_pipe_release(state->state, old_right_hwpipe);
+   if (ret)
+   return ret;
+
}
} else {
-   mdp5_pipe_release(state->state, mdp5_state->hwpipe);
-   mdp5_pipe_release(state->state, mdp5_state->r_hwpipe);
+   ret = mdp5_pipe_release(state->state, mdp5_state->hwpipe);
+   if (ret)
+   return ret;
+
+   ret = mdp5_pipe_release(state->state, mdp5_state->r_hwpipe);
+   if (ret)
+   return ret;
+
mdp5_state->hwpipe = mdp5_state->r_hwpipe = NULL;
}
 
-- 
2.35.1

[PATCH v2 06/12] drm/i915/pvc: Reduce stack usage in reset selftest with extra blitter engine

From: John Harrison 

PVC adds extra blitter engines (in the following patch). The reset
selftest has a local array on the stack which is sized by the number
of engines. The increase pushes the size of this array to the point
where it trips the 'stack too large' compile warning. This patch takes
the allocation off the stack and makes it dynamic instead.
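
The change can be pictured with a small userspace sketch. Everything here
is hypothetical (DEMO_NUM_ENGINES, struct demo_worker, demo_run_all()) and
uses malloc()/free() in place of kmalloc_array()/kfree(). It shows the two
details that matter when a stack array becomes a heap pointer: the
allocation needs an -ENOMEM path, and sizeof(pointer) no longer gives the
array size, so the memset() length has to be computed from the element size
and count.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_NUM_ENGINES 8 /* hypothetical engine count */

/* Hypothetical per-engine bookkeeping. */
struct demo_worker {
	void *task;
	unsigned long count;
};

/* Run several rounds, reusing one heap-allocated scratch array. */
static int demo_run_all(int rounds, unsigned long *total)
{
	struct demo_worker *workers;
	int r, i;

	assert(total != NULL);

	/* Allocate once, outside the loop, instead of a per-iteration stack array. */
	workers = malloc(DEMO_NUM_ENGINES * sizeof(*workers));
	if (!workers)
		return -ENOMEM;

	*total = 0;
	for (r = 0; r < rounds; r++) {
		/* sizeof(workers) would be the pointer size, so spell out the length. */
		memset(workers, 0, sizeof(*workers) * DEMO_NUM_ENGINES);

		for (i = 0; i < DEMO_NUM_ENGINES; i++)
			workers[i].count = (unsigned long)i + 1;
		for (i = 0; i < DEMO_NUM_ENGINES; i++)
			*total += workers[i].count;
	}

	free(workers);
	return 0;
}
```

The trade-off is a single allocation that can fail in exchange for a stack
frame whose size no longer grows with the number of engines.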

v2 (MattR):
 - Minor cosmetic changes:  re-sort definition and allocate using
   kmalloc_array().  (Tvrtko)

Cc: Tvrtko Ursulin 
Signed-off-by: John Harrison 
Signed-off-by: Matt Roper 
Reviewed-by: José Roberto de Souza 
---
 drivers/gpu/drm/i915/gt/selftest_hangcheck.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c 
b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
index 83ff4c2e57c5..6493265d5f64 100644
--- a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
+++ b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
@@ -976,6 +976,7 @@ static int __igt_reset_engines(struct intel_gt *gt,
 {
struct i915_gpu_error *global = &gt->i915->gpu_error;
struct intel_engine_cs *engine, *other;
+   struct active_engine *threads;
enum intel_engine_id id, tmp;
struct hang h;
int err = 0;
@@ -996,8 +997,11 @@ static int __igt_reset_engines(struct intel_gt *gt,
h.ctx->sched.priority = 1024;
}
 
+   threads = kmalloc_array(I915_NUM_ENGINES, sizeof(*threads), GFP_KERNEL);
+   if (!threads)
+   return -ENOMEM;
+
for_each_engine(engine, gt, id) {
-   struct active_engine threads[I915_NUM_ENGINES] = {};
unsigned long device = i915_reset_count(global);
unsigned long count = 0, reported;
bool using_guc = intel_engine_uses_guc(engine);
@@ -1016,7 +1020,7 @@ static int __igt_reset_engines(struct intel_gt *gt,
break;
}
 
-   memset(threads, 0, sizeof(threads));
+   memset(threads, 0, sizeof(*threads) * I915_NUM_ENGINES);
for_each_engine(other, gt, tmp) {
struct task_struct *tsk;
 
@@ -1236,6 +1240,7 @@ static int __igt_reset_engines(struct intel_gt *gt,
break;
}
}
+   kfree(threads);
 
if (intel_gt_is_wedged(gt))
err = -EIO;
-- 
2.35.1

[PATCH v2 01/12] drm/i915/uncore: Reorganize and document shadow and forcewake tables

Let's reorganize some of the forcewake/shadow handling in intel_uncore.c
and consolidate the cargo-cult comments on each table into more general
comments that apply to all tables.

We'll probably move forcewake handling to its own dedicated file in the
near future and further enhance this with true kerneldoc.  But this is a
good intermediate step to help clarify the behavior a bit.

Cc: Stuart Summers 
Signed-off-by: Matt Roper 
---
 drivers/gpu/drm/i915/intel_uncore.c | 125 ++--
 1 file changed, 80 insertions(+), 45 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_uncore.c 
b/drivers/gpu/drm/i915/intel_uncore.c
index 83517a703eb6..095e071e4053 100644
--- a/drivers/gpu/drm/i915/intel_uncore.c
+++ b/drivers/gpu/drm/i915/intel_uncore.c
@@ -938,36 +938,32 @@ find_fw_domain(struct intel_uncore *uncore, u32 offset)
return entry->domains;
 }
 
-#define GEN_FW_RANGE(s, e, d) \
-   { .start = (s), .end = (e), .domains = (d) }
-
-/* *Must* be sorted by offset ranges! See intel_fw_table_check(). */
-static const struct intel_forcewake_range __vlv_fw_ranges[] = {
-   GEN_FW_RANGE(0x2000, 0x3fff, FORCEWAKE_RENDER),
-   GEN_FW_RANGE(0x5000, 0x7fff, FORCEWAKE_RENDER),
-   GEN_FW_RANGE(0xb000, 0x11fff, FORCEWAKE_RENDER),
-   GEN_FW_RANGE(0x12000, 0x13fff, FORCEWAKE_MEDIA),
-   GEN_FW_RANGE(0x22000, 0x23fff, FORCEWAKE_MEDIA),
-   GEN_FW_RANGE(0x2e000, 0x2, FORCEWAKE_RENDER),
-   GEN_FW_RANGE(0x3, 0x3, FORCEWAKE_MEDIA),
-};
-
-#define __fwtable_reg_read_fw_domains(uncore, offset) \
-({ \
-   enum forcewake_domains __fwd = 0; \
-   if (NEEDS_FORCE_WAKE((offset))) \
-   __fwd = find_fw_domain(uncore, offset); \
-   __fwd; \
-})
+/*
+ * Shadowed register tables describe special register ranges that i915 is
+ * allowed to write to without acquiring forcewake.  If these registers' power
+ * wells are down, the hardware will save values written by i915 to a shadow
+ * copy and automatically transfer them into the real register the next time
+ * the power well is woken up.  Shadowing only applies to writes; forcewake
+ * must still be acquired when reading from registers in these ranges.
+ *
+ * The documentation for shadowed registers is somewhat spotty on older
+ * platforms.  However missing registers from these lists is non-fatal; it just
+ * means we'll wake up the hardware for some register accesses where we didn't
+ * really need to.
+ *
+ * The ranges listed in these tables must be sorted by offset.
+ *
+ * When adding new tables here, please also add them to
+ * intel_shadow_table_check() in selftests/intel_uncore.c so that they will be
+ * scanned for obvious mistakes or typos by the selftests.
+ */
 
-/* *Must* be sorted by offset! See intel_shadow_table_check(). */
 static const struct i915_range gen8_shadowed_regs[] = {
{ .start =  0x2030, .end =  0x2030 },
{ .start =  0xA008, .end =  0xA00C },
{ .start = 0x12030, .end = 0x12030 },
{ .start = 0x1a030, .end = 0x1a030 },
{ .start = 0x22030, .end = 0x22030 },
-   /* TODO: Other registers are not yet used */
 };
 
 static const struct i915_range gen11_shadowed_regs[] = {
@@ -1107,11 +1103,71 @@ gen6_reg_write_fw_domains(struct intel_uncore *uncore, 
i915_reg_t reg)
return FORCEWAKE_RENDER;
 }
 
+#define __fwtable_reg_read_fw_domains(uncore, offset) \
+({ \
+   enum forcewake_domains __fwd = 0; \
+   if (NEEDS_FORCE_WAKE((offset))) \
+   __fwd = find_fw_domain(uncore, offset); \
+   __fwd; \
+})
+
+#define __fwtable_reg_write_fw_domains(uncore, offset) \
+({ \
+   enum forcewake_domains __fwd = 0; \
+   const u32 __offset = (offset); \
+   if (NEEDS_FORCE_WAKE((__offset)) && !is_shadowed(uncore, __offset)) \
+   __fwd = find_fw_domain(uncore, __offset); \
+   __fwd; \
+})
+
+#define GEN_FW_RANGE(s, e, d) \
+   { .start = (s), .end = (e), .domains = (d) }
+
+/*
+ * All platforms' forcewake tables below must be sorted by offset ranges.
+ * Furthermore, new forcewake tables added should be "watertight" and have
+ * no gaps between ranges.
+ *
+ * When there are multiple consecutive ranges listed in the bspec with
+ * the same forcewake domain, it is customary to combine them into a single
+ * row in the tables below to keep the tables small and lookups fast.
+ * Likewise, reserved/unused ranges may be combined with the preceding and/or
+ * following ranges since the driver will never be making MMIO accesses in
+ * those ranges.
+ *
+ * For example, if the bspec were to list:
+ *
+ *...
+ *0x1000 - 0x1fff:  GT
+ *0x2000 - 0x2cff:  GT
+ *0x2d00 - 0x2fff:  unused/reserved
+ *0x3000 - 0x:  GT
+ *...
+ *
+ * these could all be represented by a single line in the code:
+ *
+ *   GEN_FW_RANGE(0x1000, 0x, FORCEWAKE_GT)
+ *
+ * When adding new forcewake tables here, please also add them to
+ * intel_uncore_mock_selftests in

[PATCH v2 00/12] i915: Introduce Ponte Vecchio

Ponte Vecchio (PVC) is a new GPU based on the Xe_HPC architecture.  As a
compute-focused platform, PVC has compute engines and enhanced copy
engines, but no render engine (there is no geometry pipeline) and no
display.

This is just a handful of early enablement patches, including some
initial support for the new copy engines (although we're not adding
those to the platform's engine list or exposing them to userspace just
yet).

v2:
 - Drop replicated comment from forcewake patch completely and add an
   additional commit to provide better documentation for forcewake and
   shadowed register tables in a way that's clear for all platforms.
 - Move gvt build fix to its own patch.
 - Address various minor review feedback from Lucas, Tvrtko, and
   Prathap.


Ayaz A Siddiqui (1):
  drm/i915/pvc: Define MOCS table for PVC

John Harrison (1):
  drm/i915/pvc: Reduce stack usage in reset selftest with extra blitter engine

Lucas De Marchi (2):
  drm/i915/pvc: skip all copy engines from aux table invalidate
  drm/i915/pvc: read fuses for link copy engines

Matt Roper (7):
  drm/i915/uncore: Reorganize and document shadow and forcewake tables
  drm/i915/pvc: Add forcewake support
  drm/i915/pvc: Read correct RP_STATE_CAP register
  drm/i915/gvt: Use intel_engine_mask_t for ring mask
  drm/i915/pvc: Engine definitions for new copy engines
  drm/i915/pvc: Interrupt support for new copy engines
  drm/i915/pvc: Reset support for new copy engines

Stuart Summers (1):
  drm/i915/pvc: Remove additional 3D flags from PIPE_CONTROL

 drivers/gpu/drm/i915/gt/gen8_engine_cs.c  |  21 +-
 drivers/gpu/drm/i915/gt/intel_engine_cs.c |  93 ++
 drivers/gpu/drm/i915/gt/intel_engine_types.h  |  12 +-
 drivers/gpu/drm/i915/gt/intel_gpu_commands.h  |  12 +-
 drivers/gpu/drm/i915/gt/intel_gt_irq.c        |  16 ++
 drivers/gpu/drm/i915/gt/intel_gt_regs.h   |  56 ++--
 drivers/gpu/drm/i915/gt/intel_gt_types.h  |   1 +
 drivers/gpu/drm/i915/gt/intel_mocs.c  |  24 +-
 drivers/gpu/drm/i915/gt/intel_rps.c   |   4 +-
 drivers/gpu/drm/i915/gt/intel_workarounds.c   |  30 +-
 drivers/gpu/drm/i915/gt/selftest_hangcheck.c  |   9 +-
 drivers/gpu/drm/i915/gvt/cmd_parser.c |   2 +-
 drivers/gpu/drm/i915/i915_drv.h   |   4 +
 drivers/gpu/drm/i915/i915_pci.c   |   4 +-
 drivers/gpu/drm/i915/i915_reg.h   |   9 +
 drivers/gpu/drm/i915/intel_device_info.h  |   4 +-
 drivers/gpu/drm/i915/intel_uncore.c   | 267 +++---
 drivers/gpu/drm/i915/selftests/intel_uncore.c |   2 +
 18 files changed, 483 insertions(+), 87 deletions(-)

-- 
2.35.1

[PATCH v2 09/12] drm/i915/pvc: Interrupt support for new copy engines

Add the interrupt handler support for new copy engines.

Bspec: 54030
Original-author: CQ Tang
Signed-off-by: Matt Roper 
Reviewed-by: Stuart Summers 
---
 drivers/gpu/drm/i915/gt/intel_gt_irq.c  | 16 
 drivers/gpu/drm/i915/gt/intel_gt_regs.h |  4 
 2 files changed, 20 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/intel_gt_irq.c b/drivers/gpu/drm/i915/gt/intel_gt_irq.c
index 88b4becfcb17..3a72d4fd0214 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_irq.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt_irq.c
@@ -193,6 +193,14 @@ void gen11_gt_irq_reset(struct intel_gt *gt)
/* Restore masks irqs on RCS, BCS, VCS and VECS engines. */
intel_uncore_write(uncore, GEN11_RCS0_RSVD_INTR_MASK,   ~0);
intel_uncore_write(uncore, GEN11_BCS_RSVD_INTR_MASK,    ~0);
+   if (HAS_ENGINE(gt, BCS1) || HAS_ENGINE(gt, BCS2))
+   intel_uncore_write(uncore, XEHPC_BCS1_BCS2_INTR_MASK, ~0);
+   if (HAS_ENGINE(gt, BCS3) || HAS_ENGINE(gt, BCS4))
+   intel_uncore_write(uncore, XEHPC_BCS3_BCS4_INTR_MASK, ~0);
+   if (HAS_ENGINE(gt, BCS5) || HAS_ENGINE(gt, BCS6))
+   intel_uncore_write(uncore, XEHPC_BCS5_BCS6_INTR_MASK, ~0);
+   if (HAS_ENGINE(gt, BCS7) || HAS_ENGINE(gt, BCS8))
+   intel_uncore_write(uncore, XEHPC_BCS7_BCS8_INTR_MASK, ~0);
intel_uncore_write(uncore, GEN11_VCS0_VCS1_INTR_MASK,   ~0);
intel_uncore_write(uncore, GEN11_VCS2_VCS3_INTR_MASK,   ~0);
if (HAS_ENGINE(gt, VCS4) || HAS_ENGINE(gt, VCS5))
@@ -248,6 +256,14 @@ void gen11_gt_irq_postinstall(struct intel_gt *gt)
/* Unmask irqs on RCS, BCS, VCS and VECS engines. */
intel_uncore_write(uncore, GEN11_RCS0_RSVD_INTR_MASK, ~smask);
intel_uncore_write(uncore, GEN11_BCS_RSVD_INTR_MASK, ~smask);
+   if (HAS_ENGINE(gt, BCS1) || HAS_ENGINE(gt, BCS2))
+   intel_uncore_write(uncore, XEHPC_BCS1_BCS2_INTR_MASK, ~dmask);
+   if (HAS_ENGINE(gt, BCS3) || HAS_ENGINE(gt, BCS4))
+   intel_uncore_write(uncore, XEHPC_BCS3_BCS4_INTR_MASK, ~dmask);
+   if (HAS_ENGINE(gt, BCS5) || HAS_ENGINE(gt, BCS6))
+   intel_uncore_write(uncore, XEHPC_BCS5_BCS6_INTR_MASK, ~dmask);
+   if (HAS_ENGINE(gt, BCS7) || HAS_ENGINE(gt, BCS8))
+   intel_uncore_write(uncore, XEHPC_BCS7_BCS8_INTR_MASK, ~dmask);
intel_uncore_write(uncore, GEN11_VCS0_VCS1_INTR_MASK, ~dmask);
intel_uncore_write(uncore, GEN11_VCS2_VCS3_INTR_MASK, ~dmask);
if (HAS_ENGINE(gt, VCS4) || HAS_ENGINE(gt, VCS5))
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_regs.h b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
index aa2c0974b02c..fe09288a3145 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_regs.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
@@ -1529,6 +1529,10 @@
 #define GEN11_GUNIT_CSME_INTR_MASK _MMIO(0x1900f4)
 #define GEN12_CCS0_CCS1_INTR_MASK  _MMIO(0x190100)
 #define GEN12_CCS2_CCS3_INTR_MASK  _MMIO(0x190104)
+#define XEHPC_BCS1_BCS2_INTR_MASK  _MMIO(0x190110)
+#define XEHPC_BCS3_BCS4_INTR_MASK  _MMIO(0x190114)
+#define XEHPC_BCS5_BCS6_INTR_MASK  _MMIO(0x190118)
+#define XEHPC_BCS7_BCS8_INTR_MASK  _MMIO(0x19011c)
 
 #define GEN12_SFC_DONE(n)  _MMIO(0x1cc000 + (n) * 0x1000)
 
-- 
2.35.1
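
Editorial illustration, not part of the patch: the four new mask registers
are each shared by a pair of copy engines and sit at consecutive 4-byte
offsets starting at 0x190110. A minimal sketch of that pairing, assuming a
hypothetical helper name and computing the offset arithmetically (the patch
itself uses explicit per-register defines and HAS_ENGINE() checks instead):

```c
#include <assert.h>
#include <stdint.h>

/* Offset of XEHPC_BCS1_BCS2_INTR_MASK, taken from the patch. */
#define XEHPC_BCS1_BCS2_INTR_MASK_OFFSET	0x190110u

/*
 * Hypothetical helper: offset of the interrupt mask register shared by
 * copy engine 'instance' (1..8). BCS1/BCS2 share the first register,
 * BCS3/BCS4 the second, and so on.
 */
static uint32_t xehpc_bcs_intr_mask_offset(unsigned int instance)
{
	assert(instance >= 1 && instance <= 8);
	return XEHPC_BCS1_BCS2_INTR_MASK_OFFSET + 4 * ((instance - 1) / 2);
}
```

BCS0 is not covered here; it keeps using GEN11_BCS_RSVD_INTR_MASK.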

[PATCH v2 12/12] drm/i915/pvc: read fuses for link copy engines

From: Lucas De Marchi 

The new Link Copy engines in PVC may be fused off according to the
mslice_mask. Each clear bit in the MEML3_EN_MASK field we read from the
GEN10_MIRROR_FUSE3 register disables a pair of link copy engines.

v2 (Tvrtko):
 - Minor cosmetic changes: s/u8/unsigned long/, use instance local
   variable.  (Tvrtko)

Bspec: 44483
Cc: Matt Roper 
Cc: Tvrtko Ursulin 
Signed-off-by: Lucas De Marchi 
Signed-off-by: Matt Roper 
Reviewed-by: José Roberto de Souza 
---
 drivers/gpu/drm/i915/gt/intel_engine_cs.c | 29 +++
 1 file changed, 29 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index c6e93db134b1..1adbf34c3632 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -686,6 +686,34 @@ static void engine_mask_apply_compute_fuses(struct intel_gt *gt)
}
 }
 
+static void engine_mask_apply_copy_fuses(struct intel_gt *gt)
+{
+   struct drm_i915_private *i915 = gt->i915;
+   struct intel_gt_info *info = &gt->info;
+   unsigned long meml3_mask;
+   unsigned long quad;
+
+   meml3_mask = intel_uncore_read(gt->uncore, GEN10_MIRROR_FUSE3);
+   meml3_mask = REG_FIELD_GET(GEN12_MEML3_EN_MASK, meml3_mask);
+
+   /*
+* Link Copy engines may be fused off according to meml3_mask. Each
+* bit is a quad that houses 2 Link Copy and two Sub Copy engines.
+*/
+   for_each_clear_bit(quad, &meml3_mask, GEN12_MAX_MSLICES) {
+   unsigned int instance = quad * 2 + 1;
+   intel_engine_mask_t mask = GENMASK(_BCS(instance + 1),
+  _BCS(instance));
+
+   if (mask & info->engine_mask) {
+   drm_dbg(&i915->drm, "bcs%u fused off\n", instance);
+   drm_dbg(&i915->drm, "bcs%u fused off\n", instance + 1);
+
+   info->engine_mask &= ~mask;
+   }
+   }
+}
+
 /*
  * Determine which engines are fused off in our particular hardware.
  * Note that we have a catch-22 situation where we need to be able to access
@@ -768,6 +796,7 @@ static intel_engine_mask_t init_engine_mask(struct intel_gt *gt)
GEM_BUG_ON(vebox_mask != VEBOX_MASK(gt));
 
engine_mask_apply_compute_fuses(gt);
+   engine_mask_apply_copy_fuses(gt);
 
return info->engine_mask;
 }
-- 
2.35.1
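
Editorial illustration, not part of the patch: the fuse handling above can
be exercised outside the kernel with a small sketch. Everything here is an
assumption made for illustration: genmask32() stands in for the kernel's
GENMASK(), BCS_ID() mirrors the series' _BCS(), MAX_MSLICES is assumed to
match GEN12_MAX_MSLICES, and the engine enum only reproduces the RCS0 and
BCS0..BCS8 ordering introduced in patch 08.

```c
#include <assert.h>
#include <stdint.h>

/* Engine IDs in the order patch 08 defines them (RCS0, then BCS0..BCS8). */
enum engine_id {
	RCS0 = 0, BCS0, BCS1, BCS2, BCS3, BCS4, BCS5, BCS6, BCS7, BCS8
};
#define BCS_ID(n)	(BCS0 + (n))
#define MAX_MSLICES	4	/* assumed value of GEN12_MAX_MSLICES */

/* Bits h..l set, inclusive: userspace stand-in for the kernel's GENMASK(). */
static uint32_t genmask32(unsigned int h, unsigned int l)
{
	assert(l <= h && h < 32);
	return (UINT32_MAX >> (31 - h)) & (UINT32_MAX << l);
}

/* Drop the pair of copy engines for every quad whose enable bit is clear. */
static uint32_t apply_copy_fuses(uint32_t engine_mask, uint32_t meml3_en)
{
	for (unsigned int quad = 0; quad < MAX_MSLICES; quad++) {
		unsigned int instance = quad * 2 + 1;

		if (meml3_en & (1u << quad))
			continue;	/* quad present, keep its engines */

		engine_mask &= ~genmask32(BCS_ID(instance + 1),
					  BCS_ID(instance));
	}
	return engine_mask;
}
```

With quad 1 fused off (enable mask 0xd), the sketch removes BCS3 and BCS4
and leaves every other engine bit untouched.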

[PATCH v2 10/12] drm/i915/pvc: Reset support for new copy engines

Add the reset support for new copy engines in PVC.

Bspec: 52549
Original-author: CQ Tang
Signed-off-by: Matt Roper 
Reviewed-by: José Roberto de Souza 
Reviewed-by: Stuart Summers 
---
 drivers/gpu/drm/i915/gt/intel_engine_cs.c |  8 +
 drivers/gpu/drm/i915/gt/intel_gt_regs.h   | 44 +--
 2 files changed, 34 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 4532c3ea9ace..c6e93db134b1 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -390,6 +390,14 @@ static u32 get_reset_domain(u8 ver, enum intel_engine_id id)
static const u32 engine_reset_domains[] = {
[RCS0]  = GEN11_GRDOM_RENDER,
[BCS0]  = GEN11_GRDOM_BLT,
+   [BCS1]  = XEHPC_GRDOM_BLT1,
+   [BCS2]  = XEHPC_GRDOM_BLT2,
+   [BCS3]  = XEHPC_GRDOM_BLT3,
+   [BCS4]  = XEHPC_GRDOM_BLT4,
+   [BCS5]  = XEHPC_GRDOM_BLT5,
+   [BCS6]  = XEHPC_GRDOM_BLT6,
+   [BCS7]  = XEHPC_GRDOM_BLT7,
+   [BCS8]  = XEHPC_GRDOM_BLT8,
[VCS0]  = GEN11_GRDOM_MEDIA,
[VCS1]  = GEN11_GRDOM_MEDIA2,
[VCS2]  = GEN11_GRDOM_MEDIA3,
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_regs.h b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
index fe09288a3145..98ede9c93f00 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_regs.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
@@ -597,24 +597,32 @@
 /* GEN11 changed all bit defs except for FULL & RENDER */
 #define   GEN11_GRDOM_FULL GEN6_GRDOM_FULL
 #define   GEN11_GRDOM_RENDER   GEN6_GRDOM_RENDER
-#define   GEN11_GRDOM_BLT  (1 << 2)
-#define   GEN11_GRDOM_GUC  (1 << 3)
-#define   GEN11_GRDOM_MEDIA    (1 << 5)
-#define   GEN11_GRDOM_MEDIA2   (1 << 6)
-#define   GEN11_GRDOM_MEDIA3   (1 << 7)
-#define   GEN11_GRDOM_MEDIA4   (1 << 8)
-#define   GEN11_GRDOM_MEDIA5   (1 << 9)
-#define   GEN11_GRDOM_MEDIA6   (1 << 10)
-#define   GEN11_GRDOM_MEDIA7   (1 << 11)
-#define   GEN11_GRDOM_MEDIA8   (1 << 12)
-#define   GEN11_GRDOM_VECS (1 << 13)
-#define   GEN11_GRDOM_VECS2    (1 << 14)
-#define   GEN11_GRDOM_VECS3    (1 << 15)
-#define   GEN11_GRDOM_VECS4    (1 << 16)
-#define   GEN11_GRDOM_SFC0 (1 << 17)
-#define   GEN11_GRDOM_SFC1 (1 << 18)
-#define   GEN11_GRDOM_SFC2 (1 << 19)
-#define   GEN11_GRDOM_SFC3 (1 << 20)
+#define   XEHPC_GRDOM_BLT8 REG_BIT(31)
+#define   XEHPC_GRDOM_BLT7 REG_BIT(30)
+#define   XEHPC_GRDOM_BLT6 REG_BIT(29)
+#define   XEHPC_GRDOM_BLT5 REG_BIT(28)
+#define   XEHPC_GRDOM_BLT4 REG_BIT(27)
+#define   XEHPC_GRDOM_BLT3 REG_BIT(26)
+#define   XEHPC_GRDOM_BLT2 REG_BIT(25)
+#define   XEHPC_GRDOM_BLT1 REG_BIT(24)
+#define   GEN11_GRDOM_SFC3 REG_BIT(20)
+#define   GEN11_GRDOM_SFC2 REG_BIT(19)
+#define   GEN11_GRDOM_SFC1 REG_BIT(18)
+#define   GEN11_GRDOM_SFC0 REG_BIT(17)
+#define   GEN11_GRDOM_VECS4    REG_BIT(16)
+#define   GEN11_GRDOM_VECS3    REG_BIT(15)
+#define   GEN11_GRDOM_VECS2    REG_BIT(14)
+#define   GEN11_GRDOM_VECS REG_BIT(13)
+#define   GEN11_GRDOM_MEDIA8   REG_BIT(12)
+#define   GEN11_GRDOM_MEDIA7   REG_BIT(11)
+#define   GEN11_GRDOM_MEDIA6   REG_BIT(10)
+#define   GEN11_GRDOM_MEDIA5   REG_BIT(9)
+#define   GEN11_GRDOM_MEDIA4   REG_BIT(8)
+#define   GEN11_GRDOM_MEDIA3   REG_BIT(7)
+#define   GEN11_GRDOM_MEDIA2   REG_BIT(6)
+#define   GEN11_GRDOM_MEDIA    REG_BIT(5)
+#define   GEN11_GRDOM_GUC  REG_BIT(3)
+#define   GEN11_GRDOM_BLT  REG_BIT(2)
 #define   GEN11_VCS_SFC_RESET_BIT(instance)    (GEN11_GRDOM_SFC0 << ((instance) >> 1))
 #define   GEN11_VECS_SFC_RESET_BIT(instance)   (GEN11_GRDOM_SFC0 << (instance))
 
-- 
2.35.1
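
Editorial illustration, not part of the patch: the new reset domains occupy
bits 24..31 of the reset register, one per additional copy engine, while
BCS0 keeps the existing GEN11_GRDOM_BLT bit. A minimal lookup-table sketch
in the same designated-initializer style as engine_reset_domains[] follows.
The table is indexed by copy-engine instance rather than by engine ID, and
REG_BIT32() and bcs_reset_domain() are hypothetical names for this sketch.

```c
#include <assert.h>
#include <stdint.h>

#define REG_BIT32(n)	(UINT32_C(1) << (n))	/* stand-in for REG_BIT() */

/* Indexed by copy-engine instance; bit positions are taken from the patch. */
static const uint32_t bcs_reset_domains[] = {
	[0] = REG_BIT32(2),	/* GEN11_GRDOM_BLT  */
	[1] = REG_BIT32(24),	/* XEHPC_GRDOM_BLT1 */
	[2] = REG_BIT32(25),	/* XEHPC_GRDOM_BLT2 */
	[3] = REG_BIT32(26),	/* XEHPC_GRDOM_BLT3 */
	[4] = REG_BIT32(27),	/* XEHPC_GRDOM_BLT4 */
	[5] = REG_BIT32(28),	/* XEHPC_GRDOM_BLT5 */
	[6] = REG_BIT32(29),	/* XEHPC_GRDOM_BLT6 */
	[7] = REG_BIT32(30),	/* XEHPC_GRDOM_BLT7 */
	[8] = REG_BIT32(31),	/* XEHPC_GRDOM_BLT8 */
};

/* Bounds-checked lookup of the reset bit for a copy-engine instance. */
static uint32_t bcs_reset_domain(unsigned int instance)
{
	assert(instance <
	       sizeof(bcs_reset_domains) / sizeof(bcs_reset_domains[0]));
	return bcs_reset_domains[instance];
}
```

Building the bits from an unsigned 32-bit constant matters for bit 31: a
plain (1 << 31) on a signed int would overflow, which is presumably one
reason the patch moves these defines to REG_BIT().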

[PATCH v2 11/12] drm/i915/pvc: skip all copy engines from aux table invalidate

From: Lucas De Marchi 

As we have more copy engines now, mask all of them from aux table
invalidate.

v2 (MattR):
 - Use I915_MAX_BCS to determine mask rather than hardcoding BCS8.
   (Prathap)

Cc: Prathap Kumar Valsan 
Signed-off-by: Lucas De Marchi 
Signed-off-by: Matt Roper 
Reviewed-by: José Roberto de Souza 
Reviewed-by: Prathap Kumar Valsan 
---
 drivers/gpu/drm/i915/gt/gen8_engine_cs.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
index 11c72792573d..0f5ab2c99ffc 100644
--- a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
@@ -276,7 +276,8 @@ int gen12_emit_flush_xcs(struct i915_request *rq, u32 mode)
if (!HAS_FLAT_CCS(rq->engine->i915) &&
(rq->engine->class == VIDEO_DECODE_CLASS ||
 rq->engine->class == VIDEO_ENHANCEMENT_CLASS)) {
-   aux_inv = rq->engine->mask & ~BIT(BCS0);
+   aux_inv = rq->engine->mask &
+   ~GENMASK(_BCS(I915_MAX_BCS - 1), BCS0);
if (aux_inv)
cmd += 4;
}
-- 
2.35.1

[PATCH v2 08/12] drm/i915/pvc: Engine definitions for new copy engines

This patch adds the basic definitions needed to support the
new copy engines. It also updates cmd_info to accommodate the
new engines, since the engine IDs of the legacy engines have
changed.

v2:
 - Add _BCS(n) definition, similar to other engines.  (Tvrtko)
 - Add I915_MAX_BCS definition, similar to other engines.  (Prathap)
 - Move GVT change to avoid u16 overflow to its own patch.  (Tvrtko)

Original-author: CQ Tang
Cc: Tvrtko Ursulin 
Cc: Prathap Kumar Valsan 
Signed-off-by: Matt Roper 
Reviewed-by: José Roberto de Souza 
---
 drivers/gpu/drm/i915/gt/intel_engine_cs.c| 56 
 drivers/gpu/drm/i915/gt/intel_engine_types.h | 12 -
 drivers/gpu/drm/i915/gt/intel_gt_regs.h  |  8 +++
 drivers/gpu/drm/i915/i915_reg.h  |  8 +++
 4 files changed, 83 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 14c6ddbbfde8..4532c3ea9ace 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -71,6 +71,62 @@ static const struct engine_info intel_engines[] = {
{ .graphics_ver = 6, .base = BLT_RING_BASE }
},
},
+   [BCS1] = {
+   .class = COPY_ENGINE_CLASS,
+   .instance = 1,
+   .mmio_bases = {
+   { .graphics_ver = 12, .base = XEHPC_BCS1_RING_BASE }
+   },
+   },
+   [BCS2] = {
+   .class = COPY_ENGINE_CLASS,
+   .instance = 2,
+   .mmio_bases = {
+   { .graphics_ver = 12, .base = XEHPC_BCS2_RING_BASE }
+   },
+   },
+   [BCS3] = {
+   .class = COPY_ENGINE_CLASS,
+   .instance = 3,
+   .mmio_bases = {
+   { .graphics_ver = 12, .base = XEHPC_BCS3_RING_BASE }
+   },
+   },
+   [BCS4] = {
+   .class = COPY_ENGINE_CLASS,
+   .instance = 4,
+   .mmio_bases = {
+   { .graphics_ver = 12, .base = XEHPC_BCS4_RING_BASE }
+   },
+   },
+   [BCS5] = {
+   .class = COPY_ENGINE_CLASS,
+   .instance = 5,
+   .mmio_bases = {
+   { .graphics_ver = 12, .base = XEHPC_BCS5_RING_BASE }
+   },
+   },
+   [BCS6] = {
+   .class = COPY_ENGINE_CLASS,
+   .instance = 6,
+   .mmio_bases = {
+   { .graphics_ver = 12, .base = XEHPC_BCS6_RING_BASE }
+   },
+   },
+   [BCS7] = {
+   .class = COPY_ENGINE_CLASS,
+   .instance = 7,
+   .mmio_bases = {
+   { .graphics_ver = 12, .base = XEHPC_BCS7_RING_BASE }
+   },
+   },
+   [BCS8] = {
+   .class = COPY_ENGINE_CLASS,
+   .instance = 8,
+   .mmio_bases = {
+   { .graphics_ver = 12, .base = XEHPC_BCS8_RING_BASE }
+   },
+   },
[VCS0] = {
.class = VIDEO_DECODE_CLASS,
.instance = 0,
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h
index 298f2cc7a879..2286f96f5f87 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
@@ -35,7 +35,7 @@
 #define OTHER_CLASS            4
 #define COMPUTE_CLASS          5
 #define MAX_ENGINE_CLASS       5
-#define MAX_ENGINE_INSTANCE    7
+#define MAX_ENGINE_INSTANCE    8
 
 #define I915_MAX_SLICES        3
 #define I915_MAX_SUBSLICES 8
@@ -99,6 +99,7 @@ struct i915_ctx_workarounds {
 #define I915_MAX_SFC   (I915_MAX_VCS / 2)
 #define I915_MAX_CCS   4
 #define I915_MAX_RCS   1
+#define I915_MAX_BCS   9
 
 /*
  * Engine IDs definitions.
@@ -107,6 +108,15 @@ struct i915_ctx_workarounds {
 enum intel_engine_id {
RCS0 = 0,
BCS0,
+   BCS1,
+   BCS2,
+   BCS3,
+   BCS4,
+   BCS5,
+   BCS6,
+   BCS7,
+   BCS8,
+#define _BCS(n) (BCS0 + (n))
VCS0,
VCS1,
VCS2,
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_regs.h b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
index a0a49c16babd..aa2c0974b02c 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_regs.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
@@ -1476,6 +1476,14 @@
 #define   GEN11_KCR    (19)
 #define   GEN11_GTPM   (16)
 #define   GEN11_BCS    (15)
+#define   XEHPC_BCS1   (14)
+#define   XEHPC_BCS2   (13)
+#define   XEHPC_BCS3   (12)
+#define   XEHPC_BCS4   (11)
+#define   XEHPC_BCS5   (10)
+#define   XEHPC_BCS6   (9)
+#define   XEHPC_BCS7   (8)
+#define   XEHPC_BCS8

[PATCH v2 05/12] drm/i915/pvc: Remove additional 3D flags from PIPE_CONTROL

From: Stuart Summers 

Although we already strip 3D-specific flags from PIPE_CONTROL
instructions when submitting to a compute engine, there are some
additional flags that need to be removed when the platform as a whole
lacks a 3D pipeline.  Add those restrictions here.

Bspec: 47112
Signed-off-by: Stuart Summers 
Signed-off-by: Matt Roper 
---
 drivers/gpu/drm/i915/gt/gen8_engine_cs.c | 18 --
 drivers/gpu/drm/i915/gt/intel_gpu_commands.h | 12 ++--
 drivers/gpu/drm/i915/i915_drv.h  |  2 ++
 drivers/gpu/drm/i915/i915_pci.c  |  3 ++-
 drivers/gpu/drm/i915/intel_device_info.h |  3 ++-
 5 files changed, 28 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
index 3e13960615bd..11c72792573d 100644
--- a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
@@ -197,8 +197,10 @@ int gen12_emit_flush_rcs(struct i915_request *rq, u32 mode)
 
flags |= PIPE_CONTROL_CS_STALL;
 
-   if (engine->class == COMPUTE_CLASS)
-   flags &= ~PIPE_CONTROL_3D_FLAGS;
+   if (LACKS_3D_PIPELINE(engine->i915))
+   flags &= ~PIPE_CONTROL_3D_ARCH_FLAGS;
+   else if (engine->class == COMPUTE_CLASS)
+   flags &= ~PIPE_CONTROL_3D_ENGINE_FLAGS;
 
cs = intel_ring_begin(rq, 6);
if (IS_ERR(cs))
@@ -227,8 +229,10 @@ int gen12_emit_flush_rcs(struct i915_request *rq, u32 mode)
 
flags |= PIPE_CONTROL_CS_STALL;
 
-   if (engine->class == COMPUTE_CLASS)
-   flags &= ~PIPE_CONTROL_3D_FLAGS;
+   if (LACKS_3D_PIPELINE(engine->i915))
+   flags &= ~PIPE_CONTROL_3D_ARCH_FLAGS;
+   else if (engine->class == COMPUTE_CLASS)
+   flags &= ~PIPE_CONTROL_3D_ENGINE_FLAGS;
 
if (!HAS_FLAT_CCS(rq->engine->i915))
count = 8 + 4;
@@ -716,8 +720,10 @@ u32 *gen12_emit_fini_breadcrumb_rcs(struct i915_request *rq, u32 *cs)
/* Wa_1409600907 */
flags |= PIPE_CONTROL_DEPTH_STALL;
 
-   if (rq->engine->class == COMPUTE_CLASS)
-   flags &= ~PIPE_CONTROL_3D_FLAGS;
+   if (LACKS_3D_PIPELINE(rq->engine->i915))
+   flags &= ~PIPE_CONTROL_3D_ARCH_FLAGS;
+   else if (rq->engine->class == COMPUTE_CLASS)
+   flags &= ~PIPE_CONTROL_3D_ENGINE_FLAGS;
 
cs = gen12_emit_ggtt_write_rcs(cs,
   rq->fence.seqno,
diff --git a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
index 556bca3be804..900755f4b787 100644
--- a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
+++ b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
@@ -288,8 +288,8 @@
 #define   PIPE_CONTROL_DEPTH_CACHE_FLUSH   (1<<0)
 #define   PIPE_CONTROL_GLOBAL_GTT (1<<2) /* in addr dword */
 
-/* 3D-related flags can't be set on compute engine */
-#define PIPE_CONTROL_3D_FLAGS (\
+/* 3D-related flags that can't be set on _engines_ that lack a 3D pipeline */
+#define PIPE_CONTROL_3D_ENGINE_FLAGS (\
PIPE_CONTROL_RENDER_TARGET_CACHE_FLUSH | \
PIPE_CONTROL_DEPTH_CACHE_FLUSH | \
PIPE_CONTROL_TILE_CACHE_FLUSH | \
@@ -300,6 +300,14 @@
PIPE_CONTROL_VF_CACHE_INVALIDATE | \
PIPE_CONTROL_GLOBAL_SNAPSHOT_RESET)
 
+/* 3D-related flags that can't be set on _platforms_ that lack a 3D pipeline */
+#define PIPE_CONTROL_3D_ARCH_FLAGS ( \
+   PIPE_CONTROL_3D_ENGINE_FLAGS | \
+   PIPE_CONTROL_INDIRECT_STATE_DISABLE | \
+   PIPE_CONTROL_FLUSH_ENABLE | \
+   PIPE_CONTROL_TEXTURE_CACHE_INVALIDATE | \
+   PIPE_CONTROL_DC_FLUSH_ENABLE)
+
 #define MI_MATH(x) MI_INSTR(0x1a, (x) - 1)
 #define MI_MATH_INSTR(opcode, op1, op2) ((opcode) << 20 | (op1) << 10 | (op2))
 /* Opcodes for MI_MATH_INSTR */
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index b389674b5210..1e153cefc92e 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1403,6 +1403,8 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915,
 
 #define HAS_MBUS_JOINING(i915) (IS_ALDERLAKE_P(i915))
 
+#define LACKS_3D_PIPELINE(i915)        (INTEL_INFO(i915)->lacks_3d_pipeline)
+
 /* i915_gem.c */
 void i915_gem_init_early(struct drm_i915_private *dev_priv);
 void i915_gem_cleanup_early(struct drm_i915_private *dev_priv);
diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
index 07722cdf63ac..14e0e8225324 100644
--- a/drivers/gpu/drm/i915/i915_pci.c
+++ b/drivers/gpu/drm/i915/i915_pci.c
@@ -1077,7 +1077,8 @@ static const struct intel_device_info ats_m_info = {
 #define XE_HPC_FEATURES \
XE_HP_FEATURES, \
.dma_mask_size =
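
Editorial illustration, not part of the patch: the flag handling this patch
introduces is a two-level strip. A platform without any 3D pipeline drops
the larger "arch" set, and otherwise a compute engine drops only the
"engine" set. A minimal sketch of that decision follows. The PC_* names are
shortened stand-ins, the bit values are placeholders chosen for the example,
and only a subset of the real flag lists is represented.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative flag bits; names are shortened, values are placeholders. */
#define PC_DEPTH_CACHE_FLUSH		(UINT32_C(1) << 0)
#define PC_DC_FLUSH_ENABLE		(UINT32_C(1) << 5)
#define PC_FLUSH_ENABLE			(UINT32_C(1) << 7)
#define PC_RENDER_TARGET_CACHE_FLUSH	(UINT32_C(1) << 12)
#define PC_CS_STALL			(UINT32_C(1) << 20)

/* Flags invalid on an engine without a 3D pipeline (subset for the sketch). */
#define PC_3D_ENGINE_FLAGS \
	(PC_RENDER_TARGET_CACHE_FLUSH | PC_DEPTH_CACHE_FLUSH)

/* Flags invalid on a platform without a 3D pipeline: a strict superset. */
#define PC_3D_ARCH_FLAGS \
	(PC_3D_ENGINE_FLAGS | PC_FLUSH_ENABLE | PC_DC_FLUSH_ENABLE)

#define PC_ALL_SAMPLE_FLAGS \
	(PC_3D_ARCH_FLAGS | PC_CS_STALL)

/* Platform-wide restriction wins; otherwise fall back to the engine check. */
static uint32_t strip_3d_flags(uint32_t flags, bool lacks_3d_pipeline,
			       bool is_compute_engine)
{
	assert((PC_3D_ARCH_FLAGS & PC_3D_ENGINE_FLAGS) == PC_3D_ENGINE_FLAGS);

	if (lacks_3d_pipeline)
		flags &= ~PC_3D_ARCH_FLAGS;
	else if (is_compute_engine)
		flags &= ~PC_3D_ENGINE_FLAGS;

	return flags;
}
```

Because the arch set includes the engine set, the platform check can come
first without a separate engine check on PVC-like platforms.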

[PATCH v2 07/12] drm/i915/gvt: Use intel_engine_mask_t for ring mask

When i915 adds additional PVC blitter instances (in an upcoming patch),
the definition of VECS0 will change from bit(10) to bit(18), causing
GVT's R_ALL mask to overflow the u16 storage that's currently used.
Let's replace the u16 with an intel_engine_mask_t to ensure we avoid
this.

Cc: Tvrtko Ursulin 
Cc: Zhi Wang 
Signed-off-by: Matt Roper 
---
 drivers/gpu/drm/i915/gvt/cmd_parser.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gvt/cmd_parser.c b/drivers/gpu/drm/i915/gvt/cmd_parser.c
index b9eb75a2b400..0ba2a3455d99 100644
--- a/drivers/gpu/drm/i915/gvt/cmd_parser.c
+++ b/drivers/gpu/drm/i915/gvt/cmd_parser.c
@@ -428,7 +428,7 @@ struct cmd_info {
 #define R_VECS BIT(VECS0)
 #define R_ALL (R_RCS | R_VCS | R_BCS | R_VECS)
/* rings that support this cmd: BLT/RCS/VCS/VECS */
-   u16 rings;
+   intel_engine_mask_t rings;
 
/* devices that support this cmd: SNB/IVB/HSW/... */
u16 devices;
-- 
2.35.1
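
Editorial illustration, not part of the patch: the overflow described in
the commit message is easy to reproduce in isolation. The sketch below uses
the two VECS0 positions quoted above (bit 10 before the series, bit 18
after) and assumes intel_engine_mask_t is a 32-bit type. The helper names
are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* VECS0's bit position before and after the series, per the commit message. */
#define VECS0_OLD	10
#define VECS0_NEW	18

/* Store BIT(engine) in 16-bit storage, as the old 'u16 rings' field did. */
static uint16_t ring_bit_u16(unsigned int engine)
{
	assert(engine < 32);
	return (uint16_t)(UINT32_C(1) << engine);	/* bits >= 16 are lost */
}

/* Same bit in 32-bit storage (assumed width of intel_engine_mask_t). */
static uint32_t ring_bit_u32(unsigned int engine)
{
	assert(engine < 32);
	return UINT32_C(1) << engine;
}
```

With the old layout the 16-bit field still holds R_VECS. With the new
layout the bit is silently truncated to zero, so a command's rings mask
would no longer include VECS.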

[PATCH v2 03/12] drm/i915/pvc: Define MOCS table for PVC

From: Ayaz A Siddiqui 

v2 (MattR):
 - Clarify comment above RING_CMD_CCTL programming.
 - Remove bspec reference from field definition.  (Lucas)
 - Add WARN if we try to use a (presumably uninitialized) wb_index of 0.
   On most platforms 0 is an invalid MOCS entry and even on the ones
   where it isn't, it isn't the right setting for wb_index.  (Lucas)

Bspec: 45101, 72161
Cc: Lucas De Marchi 
Signed-off-by: Ayaz A Siddiqui 
Signed-off-by: Fei Yang 
Signed-off-by: Matt Roper 
---
 drivers/gpu/drm/i915/gt/intel_gt_types.h|  1 +
 drivers/gpu/drm/i915/gt/intel_mocs.c| 24 -
 drivers/gpu/drm/i915/gt/intel_workarounds.c | 30 -
 drivers/gpu/drm/i915/i915_drv.h |  2 ++
 drivers/gpu/drm/i915/i915_pci.c |  3 ++-
 drivers/gpu/drm/i915/intel_device_info.h|  1 +
 6 files changed, 53 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_gt_types.h b/drivers/gpu/drm/i915/gt/intel_gt_types.h
index b06611c1d4ad..7853ea194ea6 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_types.h
@@ -221,6 +221,7 @@ struct intel_gt {
 
struct {
u8 uc_index;
+   u8 wb_index; /* Only for platforms listed in Bspec: 72161 */
} mocs;
 
struct intel_pxp pxp;
diff --git a/drivers/gpu/drm/i915/gt/intel_mocs.c b/drivers/gpu/drm/i915/gt/intel_mocs.c
index c4c37585ae8c..c6ebe2781076 100644
--- a/drivers/gpu/drm/i915/gt/intel_mocs.c
+++ b/drivers/gpu/drm/i915/gt/intel_mocs.c
@@ -23,6 +23,7 @@ struct drm_i915_mocs_table {
unsigned int n_entries;
const struct drm_i915_mocs_entry *table;
u8 uc_index;
+   u8 wb_index; /* Only used on HAS_L3_CCS_READ() platforms */
u8 unused_entries_index;
 };
 
@@ -47,6 +48,7 @@ struct drm_i915_mocs_table {
 
 /* Helper defines */
 #define GEN9_NUM_MOCS_ENTRIES  64  /* 63-64 are reserved, but configured. */
+#define PVC_NUM_MOCS_ENTRIES   3
 
 /* (e)LLC caching options */
 /*
@@ -394,6 +396,17 @@ static const struct drm_i915_mocs_entry dg2_mocs_table_g10_ax[] = {
MOCS_ENTRY(3, 0, L3_3_WB | L3_LKUP(1)),
 };
 
+static const struct drm_i915_mocs_entry pvc_mocs_table[] = {
+   /* Error */
+   MOCS_ENTRY(0, 0, L3_3_WB),
+
+   /* UC */
+   MOCS_ENTRY(1, 0, L3_1_UC),
+
+   /* WB */
+   MOCS_ENTRY(2, 0, L3_3_WB),
+};
+
 enum {
HAS_GLOBAL_MOCS = BIT(0),
HAS_ENGINE_MOCS = BIT(1),
@@ -423,7 +436,14 @@ static unsigned int get_mocs_settings(const struct drm_i915_private *i915,
memset(table, 0, sizeof(struct drm_i915_mocs_table));
 
table->unused_entries_index = I915_MOCS_PTE;
-   if (IS_DG2(i915)) {
+   if (IS_PONTEVECCHIO(i915)) {
+   table->size = ARRAY_SIZE(pvc_mocs_table);
+   table->table = pvc_mocs_table;
+   table->n_entries = PVC_NUM_MOCS_ENTRIES;
+   table->uc_index = 1;
+   table->wb_index = 2;
+   table->unused_entries_index = 2;
+   } else if (IS_DG2(i915)) {
if (IS_DG2_GRAPHICS_STEP(i915, G10, STEP_A0, STEP_B0)) {
table->size = ARRAY_SIZE(dg2_mocs_table_g10_ax);
table->table = dg2_mocs_table_g10_ax;
@@ -622,6 +642,8 @@ void intel_set_mocs_index(struct intel_gt *gt)
 
get_mocs_settings(gt->i915, &table);
gt->mocs.uc_index = table.uc_index;
+   if (HAS_L3_CCS_READ(gt->i915))
+   gt->mocs.wb_index = table.wb_index;
 }
 
 void intel_mocs_init(struct intel_gt *gt)
diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c
index a05c4b99b3fb..756807c4b405 100644
--- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
@@ -1994,19 +1994,37 @@ void intel_engine_apply_whitelist(struct intel_engine_cs *engine)
 static void
 engine_fake_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal)
 {
-   u8 mocs;
+   u8 mocs_w, mocs_r;
 
/*
-* RING_CMD_CCTL are need to be programed to un-cached
-* for memory writes and reads outputted by Command
-* Streamers on Gen12 onward platforms.
+* RING_CMD_CCTL specifies the default MOCS entry that will be used
+* by the command streamer when executing commands that don't have
+* a way to explicitly specify a MOCS setting.  The default should
+* usually reference whichever MOCS entry corresponds to uncached
+* behavior, although use of a WB cached entry is recommended by the
+* spec in certain circumstances on specific platforms.
 */
if (GRAPHICS_VER(engine->i915) >= 12) {
-   mocs = engine->gt->mocs.uc_index;
+   mocs_r = engine->gt->mocs.uc_index;
+   mocs_w = engine->gt->mocs.uc_index;
+
+   if (HAS_L3_CCS_READ(engine->i915) &&
+   engine->class == COMPUTE_CLASS) {
+

[PATCH v2 04/12] drm/i915/pvc: Read correct RP_STATE_CAP register

The SoC registers, including RP_STATE_CAP, have moved to a new location
in GTTMMADR on Ponte Vecchio.  We need to update the register offset
accordingly.

Cc: Rodrigo Vivi 
Signed-off-by: Matt Roper 
Reviewed-by: Rodrigo Vivi 
---
 drivers/gpu/drm/i915/gt/intel_rps.c | 4 +++-
 drivers/gpu/drm/i915/i915_reg.h | 1 +
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_rps.c b/drivers/gpu/drm/i915/gt/intel_rps.c
index 3476a11f294c..3bd8415a0f1b 100644
--- a/drivers/gpu/drm/i915/gt/intel_rps.c
+++ b/drivers/gpu/drm/i915/gt/intel_rps.c
@@ -1075,7 +1075,9 @@ static u32 intel_rps_read_state_cap(struct intel_rps *rps)
struct drm_i915_private *i915 = rps_to_i915(rps);
struct intel_uncore *uncore = rps_to_uncore(rps);
 
-   if (IS_XEHPSDV(i915))
+   if (IS_PONTEVECCHIO(i915))
+   return intel_uncore_read(uncore, PVC_RP_STATE_CAP);
+   else if (IS_XEHPSDV(i915))
return intel_uncore_read(uncore, XEHPSDV_RP_STATE_CAP);
else if (IS_GEN9_LP(i915))
return intel_uncore_read(uncore, BXT_RP_STATE_CAP);
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 9ccb67eec1bd..4a3d7b96ef43 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -1846,6 +1846,7 @@
 #define BXT_RP_STATE_CAP       _MMIO(0x138170)
 #define GEN9_RP_STATE_LIMITS   _MMIO(0x138148)
 #define XEHPSDV_RP_STATE_CAP   _MMIO(0x250014)
+#define PVC_RP_STATE_CAP   _MMIO(0x281014)
 
 #define GT0_PERF_LIMIT_REASONS _MMIO(0x1381a8)
 #define   GT0_PERF_LIMIT_REASONS_MASK  0xde3
-- 
2.35.1

[PATCH v2 02/12] drm/i915/pvc: Add forcewake support