Re: [PATCH v5 00/13] Support DEVICE_GENERIC memory in migrate_vma_*

2021-08-12 Thread Sierra Guiza, Alejandro (Alex)



On 8/12/2021 1:34 AM, Christoph Hellwig wrote:

Do you have a pointer to a git branch with this series and all dependencies
to ease testing?


Hi Christoph,

Yes, actually tomorrow we're planning to send a new version with detailed
instructions on how to set up and run each of the tests. It will also contain
a link to download our kernel repo with all these patches.

Regards,

Alejandro S.



On Thu, Aug 12, 2021 at 01:30:47AM -0500, Alex Sierra wrote:

v1:
AMD is building a system architecture for the Frontier supercomputer with a
coherent interconnect between CPUs and GPUs. This hardware architecture allows
the CPUs to coherently access GPU device memory. We have hardware in our labs
and we are working with our partner HPE on the BIOS, firmware and software
for delivery to the DOE.

The system BIOS advertises the GPU device memory (aka VRAM) as SPM
(special purpose memory) in the UEFI system address map. The amdgpu driver looks
it up with lookup_resource and registers it with devmap as MEMORY_DEVICE_GENERIC
using devm_memremap_pages.
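
As a rough illustration of that registration step, here is a minimal sketch.
It is not the actual amdgpu code; the example_register_vram() name, the
resource lookup and the error handling are simplified placeholders.

/*
 * Hedged sketch: register a CPU-coherent VRAM range, advertised as SPM by
 * the BIOS, so it gets ZONE_DEVICE struct pages of type MEMORY_DEVICE_GENERIC.
 */
#include <linux/device.h>
#include <linux/err.h>
#include <linux/memremap.h>

static int example_register_vram(struct device *dev,
                                 resource_size_t start, resource_size_t size)
{
        struct dev_pagemap *pgmap;
        void *addr;

        pgmap = devm_kzalloc(dev, sizeof(*pgmap), GFP_KERNEL);
        if (!pgmap)
                return -ENOMEM;

        pgmap->type = MEMORY_DEVICE_GENERIC;    /* coherent, CPU-accessible device memory */
        pgmap->range.start = start;
        pgmap->range.end = start + size - 1;
        pgmap->nr_range = 1;

        /* Creates struct pages for the range so the migrate_vma_* helpers can use them. */
        addr = devm_memremap_pages(dev, pgmap);
        return IS_ERR(addr) ? PTR_ERR(addr) : 0;
}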

Now we're trying to migrate data to and from that memory using the migrate_vma_*
helpers so we can support page-based migration in our unified memory 
allocations,
while also supporting CPU access to those pages.

This patch series makes a few changes to make MEMORY_DEVICE_GENERIC pages behave
correctly in the migrate_vma_* helpers. We are looking for feedback about this
approach. If we're close, what's needed to make our patches acceptable upstream?
If we're not close, any suggestions how else to achieve what we are trying to do
(i.e. page migration and coherent CPU access to VRAM)?
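
To make the intended flow concrete, below is a hedged skeleton of a
migrate_vma_* based migration from system memory to device memory. It is not
our SVM implementation: the alloc_device_page() and copy_to_vram() helpers
are placeholders, error handling is elided, and a real driver would size the
src/dst arrays from the actual range.

#include <linux/migrate.h>
#include <linux/mm.h>

static int example_migrate_to_vram(struct vm_area_struct *vma,
                                   unsigned long start, unsigned long end,
                                   void *pgmap_owner)
{
        unsigned long npages = (end - start) >> PAGE_SHIFT;
        unsigned long src[16] = { 0 }, dst[16] = { 0 }; /* sketch assumes npages <= 16 */
        struct migrate_vma migrate = {
                .vma         = vma,
                .start       = start,
                .end         = end,
                .src         = src,
                .dst         = dst,
                .pgmap_owner = pgmap_owner,
                .flags       = MIGRATE_VMA_SELECT_SYSTEM,
        };
        unsigned long i;
        int ret;

        ret = migrate_vma_setup(&migrate);      /* collect and isolate the source pages */
        if (ret)
                return ret;

        for (i = 0; i < npages; i++) {
                struct page *dpage;

                if (!(src[i] & MIGRATE_PFN_MIGRATE))
                        continue;
                dpage = alloc_device_page();                      /* placeholder */
                lock_page(dpage);
                copy_to_vram(dpage, migrate_pfn_to_page(src[i])); /* placeholder */
                dst[i] = migrate_pfn(page_to_pfn(dpage)) | MIGRATE_PFN_LOCKED;
        }

        migrate_vma_pages(&migrate);    /* install the device pages in the page tables */
        migrate_vma_finalize(&migrate); /* unlock/put pages, restore non-migrated entries */
        return 0;
}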

This work is based on HMM and our SVM memory manager that was recently
upstreamed to Dave Airlie's drm-next branch
https://cgit.freedesktop.org/drm/drm/log/?h=drm-next
On top of that we did some rework of our VRAM management for migrations to
remove some incorrect assumptions, allow partially successful migrations and
GPU memory mappings that mix pages in VRAM and system memory.
https://lore.kernel.org/dri-devel/20210527205606.2660-6-Felix.Kuehling@amd.com/T/#r996356015e295780eb50453e7dbd5d0d68b47cbc

v2:
This version of the patch series has merged the "[RFC PATCH v3 0/2]
mm: remove extra ZONE_DEVICE struct page refcount" patch series from
Ralph Campbell. Our changes to support the device generic type in the
migrate_vma helpers apply on top of that series.
This has been tested on systems with device memory that is coherently
accessible by the CPU.

It also addresses the following feedback from v1:
- Isolate the kernel/resource.c modification in its own patch, based
on Christoph's feedback.
- Add helpers that check for the generic and private types to avoid
duplicated long lines.

v3:
- Include cover letter from v1.
- Rename the dax_layout_is_idle_page function to dax_page_unused in the
patch "ext4/xfs: add page refcount helper".

v4:
- Add support for the zone device generic type in lib/test_hmm and
tools/testing/selftests/vm/hmm-tests.
- Add missing page refcount helper to fuse/dax.c. This was included in
one of Ralph Campbell's patches.

v5:
- Cosmetic changes in patches 3, 5 and 13.
- Fixed a bug in test_hmm: removed the devmem->pagemap.type = MEMORY_DEVICE_PRIVATE
assignment in dmirror_allocate_chunk, which was forcing pagemap.type to
MEMORY_DEVICE_PRIVATE.
- A bug was found while running one of the xfstests (generic/413) used to
validate the fs_dax device type. It was introduced by the patch "mm: remove
extra ZONE_DEVICE struct page refcount", which is part of this patch series.
The bug appeared as a WARNING at the try_grab_page call, due to a page
refcount equal to zero; part of the "mm: remove extra ZONE_DEVICE struct
page refcount" changes was to initialize the page refcount to zero. Therefore,
a special condition was added to try_grab_page in this v5, where it also checks
for device zone pages. It is included in the same patch.

This is how the mm changes from this patch series have been validated:
- hmm-tests were run using the device private and device generic types; the
latter was just added in this series. efi_fake_mem was used to mimic SPM memory
for device generic.
- The xfstests tool was used to validate the fs-dax device type and the page
refcount changes. DAX configuration was used along with emulated 

Re: [RFC PATCH v3 1/6] drm/doc: Color Management and HDR10 RFC

2021-08-12 Thread Sharma, Shashank

Hello Brian,
(+Uma in cc)

Thanks for your comments. Let me try to fill in for Harry to keep the 
design discussion going. Please find my comments inline.


On 8/2/2021 10:00 PM, Brian Starkey wrote:

Hi,

Thanks for having a stab at this, it's a massively complex topic to
solve.

Do you have the HTML rendered somewhere for convenience?

On Fri, Jul 30, 2021 at 04:41:29PM -0400, Harry Wentland wrote:

Use the new DRM RFC doc section to capture the RFC previously only
described in the cover letter at
https://patchwork.freedesktop.org/series/89506/

v3:
  * Add sections on single-plane and multi-plane HDR
  * Describe approach to define HW details vs approach to define SW intentions
  * Link Jeremy Cline's excellent HDR summaries
  * Outline intention behind overly verbose doc
  * Describe FP16 use-case
  * Clean up links

v2: create this doc

v1: n/a

Signed-off-by: Harry Wentland 
---
  Documentation/gpu/rfc/color_intentions.drawio |   1 +
  Documentation/gpu/rfc/color_intentions.svg|   3 +
  Documentation/gpu/rfc/colorpipe   |   1 +
  Documentation/gpu/rfc/colorpipe.svg   |   3 +
  Documentation/gpu/rfc/hdr-wide-gamut.rst  | 580 ++
  Documentation/gpu/rfc/index.rst   |   1 +
  6 files changed, 589 insertions(+)
  create mode 100644 Documentation/gpu/rfc/color_intentions.drawio
  create mode 100644 Documentation/gpu/rfc/color_intentions.svg
  create mode 100644 Documentation/gpu/rfc/colorpipe
  create mode 100644 Documentation/gpu/rfc/colorpipe.svg
  create mode 100644 Documentation/gpu/rfc/hdr-wide-gamut.rst



-- snip --


+
+Mastering Luminances
+====================
+
+Even though we are able to describe the absolute luminance of a pixel
+using the PQ 2084 EOTF we are presented with physical limitations of the
+display technologies on the market today. Here are a few examples of
+luminance ranges of displays.
+
+.. flat-table::
+   :header-rows: 1
+
+   * - Display
+ - Luminance range in nits
+
+   *  - Typical PC display
+  - 0.3 - 200
+
+   *  - Excellent LCD HDTV
+  - 0.3 - 400
+
+   *  - HDR LCD w/ local dimming
+  - 0.05 - 1,500
+
+Since no display can currently show the full 0.0005 to 10,000 nits
+luminance range of PQ, the display will need to tone-map the HDR content,
+i.e. to fit the content within the display's capabilities. To assist
+with tone-mapping, HDR content is usually accompanied by metadata
+that describes (among other things) the minimum and maximum mastering
+luminance, i.e. the maximum and minimum luminance of the display that
+was used to master the HDR content.
+
+The HDR metadata is currently defined on the drm_connector via the
+hdr_output_metadata blob property.
+
+It might be useful to define per-plane hdr metadata, as different planes
+might have been mastered differently.


I think this only applies to the approach where all the processing is
decided in the kernel right?

If we directly expose each pipeline stage, and userspace controls
everything, there's no need for the kernel to know the mastering
luminance of any of the input content. The kernel would only need to
know the eventual *output* luminance range, which might not even match
any of the input content!


Yes, you are right. In an approach where a compositor controls 
everything, we might not need this property, as the compositor can 
directly prepare the color correction pipeline with the required 
matrices and the kernel can just follow it.


The reason we introduced this property here is that there may be 
hardware which implements a fixed-function degamma or tone-mapping 
unit, and this property might make it easier for those drivers 
to implement.


So the whole idea was to plant a seed for thought for those drivers, and 
see if it makes sense to have such a property.



...


+
+How are we solving the problem?
+===============================
+
+Single-plane
+------------
+
+If a single drm_plane is used no further work is required. The compositor
+will provide one HDR plane alongside a drm_connector's hdr_output_metadata
+and the display HW will output this plane without further processing if
+no CRTC LUTs are provided.
+
+If desired a compositor can use the CRTC LUTs for HDR content but without
+support for PWL or multi-segmented LUTs the quality of the operation is
+expected to be subpar for HDR content.
+
+
+Multi-plane
+-----------
+
+In multi-plane configurations we need to solve the problem of blending
+HDR and SDR content. This blending should be done in linear space and
+therefore requires framebuffer data that is presented in linear space
+or a way to convert 

Re: [PATCH 2/2] drm/amdgpu: Use mod_delayed_work in JPEG/UVD/VCE/VCN ring_end_use hooks

2021-08-12 Thread Lazar, Lijo




On 8/12/2021 10:24 PM, Michel Dänzer wrote:

On 2021-08-12 1:33 p.m., Lazar, Lijo wrote:

On 8/12/2021 1:41 PM, Michel Dänzer wrote:

On 2021-08-12 7:55 a.m., Koenig, Christian wrote:

Hi James,

Evan seems to have understood how this all works together.

See, while any begin/end use critical section is active, the work should not
be active.

When you handle only one ring you can just call cancel in begin use and
schedule in end use. But when you have more than one ring you need a lock or
counter to prevent concurrent work items from being started.
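
For illustration, a hedged sketch of that pattern with a shared counter
protected by a mutex (names are made up, work item initialization is
omitted; this is not the amdgpu code):

#include <linux/mutex.h>
#include <linux/workqueue.h>

static DEFINE_MUTEX(example_lock);
static unsigned int example_use_count;
static struct delayed_work example_powerdown_work;

static void example_ring_begin_use(void)
{
        mutex_lock(&example_lock);
        if (example_use_count++ == 0)
                /* First active ring: the power-down work must not stay armed. */
                cancel_delayed_work(&example_powerdown_work);
        mutex_unlock(&example_lock);
}

static void example_ring_end_use(void)
{
        mutex_lock(&example_lock);
        if (--example_use_count == 0)
                /* Last ring going idle: arm the power-down work again. */
                schedule_delayed_work(&example_powerdown_work, HZ);
        mutex_unlock(&example_lock);
}

static void example_powerdown(struct work_struct *work)
{
        mutex_lock(&example_lock);
        if (example_use_count == 0) {
                /* Re-check under the lock: a begin_use may have raced with us. */
                /* ... actually power the block down ... */
        }
        mutex_unlock(&example_lock);
}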

Michel's idea to use mod_delayed_work is a bad one because it assumes that
the delayed work is still running.


It merely assumes that the work may already have been scheduled before.

Admittedly, I missed the cancel_delayed_work_sync calls for patch 2. While I 
think it can still have some effect when there's a single work item for 
multiple rings, as described by James, it's probably negligible, since 
presumably the time intervals between ring_begin_use and ring_end_use are 
normally much shorter than a second.

So, while patch 2 is at worst a no-op (since mod_delayed_work is the same as 
schedule_delayed_work if the work hasn't been scheduled yet), I'm fine with 
dropping it.
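
For reference, a hedged reconstruction of what such an end_use hook looks
like with mod_delayed_work (the patch body is not quoted in this thread; the
idle_work name and the one-second delay are illustrative):

#include <linux/workqueue.h>

static struct delayed_work idle_work;   /* powers the block down when it stays idle */

static void example_ring_end_use(void)
{
        /*
         * Unlike schedule_delayed_work(), which leaves an already-queued
         * timer untouched, mod_delayed_work() pushes the expiry out to a
         * full second after the most recent end_use.
         */
        mod_delayed_work(system_wq, &idle_work, msecs_to_jiffies(1000));
}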



Something similar applies to the first patch I think,


There are no cancel work calls in that case, so the commit log is accurate 
TTBOMK.


Curious -

For patch 1, does it make a difference if any delayed work scheduled is 
cancelled in the else part before proceeding?

} else if (!enable && adev->gfx.gfx_off_state) {
cancel_delayed_work(&adev->gfx.gfx_off_delay_work);


I tried the patch below.

While this does seem to fix the problem as well, I see a potential issue:

1. amdgpu_gfx_off_ctrl locks adev->gfx.gfx_off_mutex
2. amdgpu_device_delay_enable_gfx_off runs, blocks in mutex_lock
3. amdgpu_gfx_off_ctrl calls cancel_delayed_work_sync

I'm afraid this would deadlock? (CONFIG_PROVE_LOCKING doesn't complain though)



Should use the cancel_delayed_work instead of the _sync version. As you 
mentioned - at best work is not scheduled yet and cancelled 
successfully, or at worst it's waiting for the mutex. In the worst case, 
if amdgpu_device_delay_enable_gfx_off gets the mutex after 
amdgpu_gfx_off_ctrl unlocks it, there is an extra check as below.


if (!adev->gfx.gfx_off_state && !adev->gfx.gfx_off_req_count)

The count wouldn't be 0 and hence it won't enable GFXOFF.



Maybe it's possible to fix it with cancel_delayed_work_sync somehow, but I'm 
not sure how offhand. (With cancel_delayed_work instead, I'm worried 
amdgpu_device_delay_enable_gfx_off might still enable GFXOFF in the HW 
immediately after amdgpu_gfx_off_ctrl unlocks the mutex. Then again, that might 
happen with mod_delayed_work as well...)


As mentioned earlier, cancel_delayed_work won't cause this issue.

In the mod_delayed_ patch, the mod_ version is called only when req_count is
0. While that is a good thing, it keeps alive one more contender for the
mutex.


The cancel_ version eliminates that contender if it happens to be called at
the right time (more likely if there are multiple requests to disable
gfxoff). On the other hand, I don't know how costly it is to call cancel_
every time in the else part (or maybe call it only once when the count
increments to 1?).


Thanks,
Lijo




diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
index a0be0772c8b3..3e4585ffb9af 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
@@ -570,8 +570,11 @@ void amdgpu_gfx_off_ctrl(struct amdgpu_device *adev, bool enable)
 
 	if (enable && !adev->gfx.gfx_off_state && !adev->gfx.gfx_off_req_count) {
 		schedule_delayed_work(&adev->gfx.gfx_off_delay_work, GFX_OFF_DELAY_ENABLE);
-	} else if (!enable && adev->gfx.gfx_off_state) {
-		if (!amdgpu_dpm_set_powergating_by_smu(adev, AMD_IP_BLOCK_TYPE_GFX, false)) {
+	} else if (!enable) {
+		cancel_delayed_work_sync(&adev->gfx.gfx_off_delay_work);
+
+		if (adev->gfx.gfx_off_state &&
+		    !amdgpu_dpm_set_powergating_by_smu(adev, AMD_IP_BLOCK_TYPE_GFX, false)) {
 			adev->gfx.gfx_off_state = false;
 
 			if (adev->gfx.funcs->init_spm_golden) {





Re: [git pull] drm fixes for 5.14-rc6

2021-08-12 Thread pr-tracker-bot
The pull request you sent on Fri, 13 Aug 2021 07:06:07 +1000:

> git://anongit.freedesktop.org/drm/drm tags/drm-fixes-2021-08-13

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/82cce5f4291e089d44b7b9bc77918cbcd52d429e

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html


[PATCH v2] drm/virtio: support mapping exported vram

2021-08-12 Thread David Stevens
Implement virtgpu specific map_dma_buf callback to support mapping
exported vram object dma-bufs. The dma-buf callback is used directly, as
vram objects don't have backing pages and thus can't implement the
drm_gem_object_funcs.get_sg_table callback.

Signed-off-by: David Stevens 
---
v1 -> v2:
 - reflow line to fix strict checkpatch warning
 - replace else with return for consistency between functions
---
 drivers/gpu/drm/virtio/virtgpu_drv.h   |  8 
 drivers/gpu/drm/virtio/virtgpu_prime.c | 32 +-
 drivers/gpu/drm/virtio/virtgpu_vram.c  | 61 ++
 3 files changed, 99 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/virtio/virtgpu_drv.h 
b/drivers/gpu/drm/virtio/virtgpu_drv.h
index d4e610a44e12..0c4810982530 100644
--- a/drivers/gpu/drm/virtio/virtgpu_drv.h
+++ b/drivers/gpu/drm/virtio/virtgpu_drv.h
@@ -26,6 +26,7 @@
 #ifndef VIRTIO_DRV_H
 #define VIRTIO_DRV_H
 
+#include 
 #include 
 #include 
 #include 
@@ -459,4 +460,11 @@ bool virtio_gpu_is_vram(struct virtio_gpu_object *bo);
 int virtio_gpu_vram_create(struct virtio_gpu_device *vgdev,
   struct virtio_gpu_object_params *params,
   struct virtio_gpu_object **bo_ptr);
+struct sg_table *virtio_gpu_vram_map_dma_buf(struct virtio_gpu_object *bo,
+struct device *dev,
+enum dma_data_direction dir);
+void virtio_gpu_vram_unmap_dma_buf(struct device *dev,
+  struct sg_table *sgt,
+  enum dma_data_direction dir);
+
 #endif
diff --git a/drivers/gpu/drm/virtio/virtgpu_prime.c 
b/drivers/gpu/drm/virtio/virtgpu_prime.c
index 807a27a16365..7b940be3323f 100644
--- a/drivers/gpu/drm/virtio/virtgpu_prime.c
+++ b/drivers/gpu/drm/virtio/virtgpu_prime.c
@@ -43,13 +43,41 @@ static int virtgpu_virtio_get_uuid(struct dma_buf *buf,
return 0;
 }
 
+static struct sg_table *
+virtgpu_gem_map_dma_buf(struct dma_buf_attachment *attach,
+   enum dma_data_direction dir)
+{
+   struct drm_gem_object *obj = attach->dmabuf->priv;
+   struct virtio_gpu_object *bo = gem_to_virtio_gpu_obj(obj);
+
+   if (virtio_gpu_is_vram(bo))
+   return virtio_gpu_vram_map_dma_buf(bo, attach->dev, dir);
+
+   return drm_gem_map_dma_buf(attach, dir);
+}
+
+static void virtgpu_gem_unmap_dma_buf(struct dma_buf_attachment *attach,
+ struct sg_table *sgt,
+ enum dma_data_direction dir)
+{
+   struct drm_gem_object *obj = attach->dmabuf->priv;
+   struct virtio_gpu_object *bo = gem_to_virtio_gpu_obj(obj);
+
+   if (virtio_gpu_is_vram(bo)) {
+   virtio_gpu_vram_unmap_dma_buf(attach->dev, sgt, dir);
+   return;
+   }
+
+   drm_gem_unmap_dma_buf(attach, sgt, dir);
+}
+
 static const struct virtio_dma_buf_ops virtgpu_dmabuf_ops =  {
.ops = {
.cache_sgt_mapping = true,
.attach = virtio_dma_buf_attach,
.detach = drm_gem_map_detach,
-   .map_dma_buf = drm_gem_map_dma_buf,
-   .unmap_dma_buf = drm_gem_unmap_dma_buf,
+   .map_dma_buf = virtgpu_gem_map_dma_buf,
+   .unmap_dma_buf = virtgpu_gem_unmap_dma_buf,
.release = drm_gem_dmabuf_release,
.mmap = drm_gem_dmabuf_mmap,
.vmap = drm_gem_dmabuf_vmap,
diff --git a/drivers/gpu/drm/virtio/virtgpu_vram.c 
b/drivers/gpu/drm/virtio/virtgpu_vram.c
index 5cc34e7330fa..6b45b0429fef 100644
--- a/drivers/gpu/drm/virtio/virtgpu_vram.c
+++ b/drivers/gpu/drm/virtio/virtgpu_vram.c
@@ -1,6 +1,8 @@
 // SPDX-License-Identifier: GPL-2.0
 #include "virtgpu_drv.h"
 
+#include 
+
 static void virtio_gpu_vram_free(struct drm_gem_object *obj)
 {
struct virtio_gpu_object *bo = gem_to_virtio_gpu_obj(obj);
@@ -64,6 +66,65 @@ static int virtio_gpu_vram_mmap(struct drm_gem_object *obj,
return ret;
 }
 
+struct sg_table *virtio_gpu_vram_map_dma_buf(struct virtio_gpu_object *bo,
+struct device *dev,
+enum dma_data_direction dir)
+{
+   struct virtio_gpu_device *vgdev = bo->base.base.dev->dev_private;
+   struct virtio_gpu_object_vram *vram = to_virtio_gpu_vram(bo);
+   struct sg_table *sgt;
+   dma_addr_t addr;
+   int ret;
+
+   sgt = kzalloc(sizeof(*sgt), GFP_KERNEL);
+   if (!sgt)
+   return ERR_PTR(-ENOMEM);
+
+   if (!(bo->blob_flags & VIRTGPU_BLOB_FLAG_USE_MAPPABLE)) {
+   // Virtio devices can access the dma-buf via its UUID. Return a 
stub
+   // sg_table so the dma-buf API still works.
+   if (!is_virtio_device(dev) || !vgdev->has_resource_assign_uuid) 
{
+   ret = -EIO;
+   goto out;
+   }
+  

Re: [PATCH v1 1/2] drm/msm/dp: Add support for SC7280 eDP

2021-08-12 Thread sbillaka

On 2021-08-12 06:11, Stephen Boyd wrote:

Quoting Sankeerth Billakanti (2021-08-11 17:08:01)
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c

index b131fd37..1096c44 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c
@@ -856,9 +856,9 @@ static const struct dpu_intf_cfg sm8150_intf[] = {
 };

 static const struct dpu_intf_cfg sc7280_intf[] = {
-   INTF_BLK("intf_0", INTF_0, 0x34000, INTF_DP, 0, 24, 
INTF_SC7280_MASK, MDP_SSPP_TOP0_INTR, 24, 25),
+   INTF_BLK("intf_0", INTF_0, 0x34000, INTF_DP, 1, 24, 
INTF_SC7280_MASK, MDP_SSPP_TOP0_INTR, 24, 25),
INTF_BLK("intf_1", INTF_1, 0x35000, INTF_DSI, 0, 24, 
INTF_SC7280_MASK, MDP_SSPP_TOP0_INTR, 26, 27),
-   INTF_BLK("intf_5", INTF_5, 0x39000, INTF_EDP, 0, 24, 
INTF_SC7280_MASK, MDP_SSPP_TOP0_INTR, 22, 23),
+   INTF_BLK("intf_5", INTF_5, 0x39000, INTF_DP, 0, 24, 
INTF_SC7280_MASK, MDP_SSPP_TOP0_INTR, 22, 23),

 };

 /*
diff --git a/drivers/gpu/drm/msm/dp/dp_ctrl.c 
b/drivers/gpu/drm/msm/dp/dp_ctrl.c

index d2569da..06d5a2d 100644
--- a/drivers/gpu/drm/msm/dp/dp_ctrl.c
+++ b/drivers/gpu/drm/msm/dp/dp_ctrl.c
@@ -1244,7 +1244,9 @@ static int dp_ctrl_link_train(struct 
dp_ctrl_private *ctrl,

struct dp_cr_status *cr, int *training_step)
 {
int ret = 0;
+   u8 *dpcd = ctrl->panel->dpcd;
u8 encoding = DP_SET_ANSI_8B10B;
+   u8 ssc = 0, assr = 0;


Please don't initialize to zero and then overwrite it before using it.
It hides usage before actual initialization bugs.



Okay. I will change it.


struct dp_link_info link_info = {0};

dp_ctrl_config_ctrl(ctrl);
@@ -1254,9 +1256,21 @@ static int dp_ctrl_link_train(struct 
dp_ctrl_private *ctrl,

link_info.capabilities = DP_LINK_CAP_ENHANCED_FRAMING;

dp_aux_link_configure(ctrl->aux, &link_info);
+
+   if (dpcd[DP_MAX_DOWNSPREAD] & DP_MAX_DOWNSPREAD_0_5) {
+   ssc = DP_SPREAD_AMP_0_5;
+   drm_dp_dpcd_write(ctrl->aux, DP_DOWNSPREAD_CTRL, &ssc, 1);
+   }
+
drm_dp_dpcd_write(ctrl->aux, DP_MAIN_LINK_CHANNEL_CODING_SET,
&encoding, 1);

+   if (dpcd[DP_EDP_CONFIGURATION_CAP] & DP_ALTERNATE_SCRAMBLER_RESET_CAP) {
+   assr = DP_ALTERNATE_SCRAMBLER_RESET_ENABLE;
+   drm_dp_dpcd_write(ctrl->aux, DP_EDP_CONFIGURATION_SET,
+   &assr, 1);
+   }
+
ret = dp_ctrl_link_train_1(ctrl, cr, training_step);
if (ret) {
DRM_ERROR("link training #1 failed. ret=%d\n", ret);
@@ -1328,9 +1342,11 @@ static int 
dp_ctrl_enable_mainlink_clocks(struct dp_ctrl_private *ctrl)

struct dp_io *dp_io = &ctrl->parser->io;
struct phy *phy = dp_io->phy;
struct phy_configure_opts_dp *opts_dp = &dp_io->phy_opts.dp;
+   u8 *dpcd = ctrl->panel->dpcd;


const?



Okay. I will change to const u8 *dpcd at all the required places.



opts_dp->lanes = ctrl->link->link_params.num_lanes;
opts_dp->link_rate = ctrl->link->link_params.rate / 100;
+   opts_dp->ssc = dpcd[DP_MAX_DOWNSPREAD] & 
DP_MAX_DOWNSPREAD_0_5;

dp_ctrl_set_clock_rate(ctrl, DP_CTRL_PM, "ctrl_link",
ctrl->link->link_params.rate * 
1000);


@@ -1760,6 +1776,9 @@ int dp_ctrl_on_stream(struct dp_ctrl *dp_ctrl)
ctrl->link->link_params.num_lanes = 
ctrl->panel->link_info.num_lanes;
ctrl->dp_ctrl.pixel_rate = 
ctrl->panel->dp_mode.drm_mode.clock;


+   if (ctrl->dp_ctrl.pixel_rate == 0)
+   return -EINVAL;
+


Why are we enabling the stream with a zero pixel clk?



This was an error condition I encountered while bringing up sc7280. HPD 
processing was delayed and I got a commit with pixel clock = 0. I will 
recheck why this is happening.



DRM_DEBUG_DP("rate=%d, num_lanes=%d, pixel_rate=%d\n",
ctrl->link->link_params.rate,
ctrl->link->link_params.num_lanes, 
ctrl->dp_ctrl.pixel_rate);
diff --git a/drivers/gpu/drm/msm/dp/dp_display.c 
b/drivers/gpu/drm/msm/dp/dp_display.c

index ee5bf64..a772290 100644
--- a/drivers/gpu/drm/msm/dp/dp_display.c
+++ b/drivers/gpu/drm/msm/dp/dp_display.c
@@ -117,8 +117,36 @@ struct dp_display_private {
struct dp_audio *audio;
 };

+struct msm_dp_config {
+   phys_addr_t io_start[3];
+   size_t num_dp;
+};
+
+static const struct msm_dp_config sc7180_dp_cfg = {
+   .io_start = { 0x0ae9 },
+   .num_dp = 1,
+};
+
+static const struct msm_dp_config sc8180x_dp_cfg = {
+   .io_start = { 0xae9, 0xae98000, 0 },
+   .num_dp = 3,
+};
+
+static const struct msm_dp_config sc8180x_edp_cfg = {
+   .io_start = { 0, 0, 0xae9a000 },
+   .num_dp = 3,
+};
+
+static const struct msm_dp_config sc7280_edp_cfg = {
+   .io_start = { 0xaea, 0 },
+   .num_dp = 2,
+};


Are all 

Re: [PATCH 5/9] drm/i915/guc: Flush the work queue for GuC generated G2H

2021-08-12 Thread Matthew Brost
On Thu, Aug 12, 2021 at 09:47:23PM +0200, Daniel Vetter wrote:
> On Thu, Aug 12, 2021 at 03:23:30PM +, Matthew Brost wrote:
> > On Thu, Aug 12, 2021 at 04:11:28PM +0200, Daniel Vetter wrote:
> > > On Wed, Aug 11, 2021 at 01:16:18AM +, Matthew Brost wrote:
> > > > Flush the work queue for GuC generated G2H messages during a GT reset.
> > > > This is accomplished by spinning on the list of outstanding G2H to
> > > > go empty.
> > > > 
> > > > Fixes: eb5e7da736f3 ("drm/i915/guc: Reset implementation for new GuC 
> > > > interface")
> > > > Signed-off-by: Matthew Brost 
> > > > Cc: 
> > > > ---
> > > >  drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c | 5 +
> > > >  1 file changed, 5 insertions(+)
> > > > 
> > > > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c 
> > > > b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> > > > index 3cd2da6f5c03..e5eb2df11b4a 100644
> > > > --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> > > > +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> > > > @@ -727,6 +727,11 @@ void intel_guc_submission_reset_prepare(struct 
> > > > intel_guc *guc)
> > > > wait_for_reset(guc, &guc->outstanding_submission_g2h);
> > > > } while (!list_empty(&guc->ct.requests.incoming));
> > > > }
> > > > +
> > > > +   /* Flush any GuC generated G2H */
> > > > +   while (!list_empty(&guc->ct.requests.incoming))
> > > > +   msleep(20);
> > > 
> > > flush_work or flush_workqueue, because that comes with lockdep
> > > annotations. Don't hand-roll stuff like this if at all possible.
> > 
> > lockdep puked when used that.
> 
> Lockdep tends to be right ... So we definitely want that, but maybe a
> different flavour, or there's something wrong with the workqueue setup.
> 

Here is a dependency chain that lockdep doesn't like.

fs_reclaim_acquire -> gt->reset.mutex (shrinker)
workqueue -> fs_reclaim_acquire (error capture in workqueue)
gt->reset.mutex -> workqueue (reset)

In practice I don't think we could ever hit this, but lockdep does
look right here. Trying to work out how to fix this. We really need
all G2H to be done being processed before we proceed during a reset or we
have races. I have a few ideas of how to handle this but can't convince
myself any of them are fully safe.
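
For illustration, this is roughly what the flush_work() flavour looks like,
assuming the incoming G2H requests are handled by ct->requests.worker as the
splat below suggests; it is this annotation that records the
gt->reset.mutex -> work dependency lockdep complains about:

/* Hedged sketch, not a proposed fix: flush the CT worker that processes
 * incoming G2H instead of polling the list with msleep(). */
static void example_flush_incoming_g2h(struct intel_guc *guc)
{
	/* Runs ct_incoming_request_worker_func to completion if it is queued. */
	flush_work(&guc->ct.requests.worker);
}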

Splat below:

[  154.625989] ==
[  154.632195] WARNING: possible circular locking dependency detected
[  154.638393] 5.14.0-rc5-guc+ #50 Tainted: G U
[  154.643991] --
[  154.650196] i915_selftest/1673 is trying to acquire lock:
[  154.655621] 8881079cb918 
((work_completion)(&ct->requests.worker)){+.+.}-{0:0}, at: 
__flush_work+0x350/0x4d0
[  154.665826]
   but task is already holding lock:
[  154.671682] 8881079cbfb8 (&gt->reset.mutex){+.+.}-{3:3}, at: 
intel_gt_reset+0xf0/0x300 [i915]
[  154.680659]
   which lock already depends on the new lock.

[  154.688857]
   the existing dependency chain (in reverse order) is:
[  154.696365]
   -> #2 (&gt->reset.mutex){+.+.}-{3:3}:
[  154.702571]lock_acquire+0xd2/0x300
[  154.706695]i915_gem_shrinker_taints_mutex+0x2d/0x50 [i915]
[  154.712959]intel_gt_init_reset+0x61/0x80 [i915]
[  154.718258]intel_gt_init_early+0xe6/0x120 [i915]
[  154.723648]i915_driver_probe+0x592/0xdc0 [i915]
[  154.728942]i915_pci_probe+0x43/0x1c0 [i915]
[  154.733891]pci_device_probe+0x9b/0x110
[  154.738362]really_probe+0x1a6/0x3a0
[  154.742568]__driver_probe_device+0xf9/0x170
[  154.747468]driver_probe_device+0x19/0x90
[  154.752114]__driver_attach+0x99/0x170
[  154.756492]bus_for_each_dev+0x73/0xc0
[  154.760870]bus_add_driver+0x14b/0x1f0
[  154.765248]driver_register+0x67/0xb0
[  154.769542]i915_init+0x18/0x8c [i915]
[  154.773964]do_one_initcall+0x53/0x2e0
[  154.778343]do_init_module+0x56/0x210
[  154.782639]load_module+0x25fc/0x29f0
[  154.786934]__do_sys_finit_module+0xae/0x110
[  154.791835]do_syscall_64+0x38/0xc0
[  154.795958]entry_SYSCALL_64_after_hwframe+0x44/0xae
[  154.801558]
   -> #1 (fs_reclaim){+.+.}-{0:0}:
[  154.807241]lock_acquire+0xd2/0x300
[  154.811361]fs_reclaim_acquire+0x9e/0xd0
[  154.815914]kmem_cache_alloc_trace+0x30/0x790
[  154.820899]i915_gpu_coredump_alloc+0x53/0x1a0 [i915]
[  154.826649]i915_gpu_coredump+0x39/0x560 [i915]
[  154.831866]i915_capture_error_state+0xa/0x70 [i915]
[  154.837513]intel_guc_context_reset_process_msg+0x174/0x1f0 [i915]
[  154.844383]ct_incoming_request_worker_func+0x130/0x1b0 [i915]
[  154.850898]process_one_work+0x264/0x590
[  154.855451]worker_thread+0x4b/0x3a0
[  154.859655]kthread+0x147/0x170
[  154.863428]

Re: [PATCH 2/2] drm/i915: Add pci ids and uapi for DG1

2021-08-12 Thread Jason Ekstrand
On Thu, Aug 12, 2021 at 9:49 AM Daniel Vetter  wrote:
>
> On Thu, Aug 12, 2021 at 2:44 PM Maarten Lankhorst
>  wrote:
> >
> > DG1 has support for local memory, which requires the usage of the
> > lmem placement extension for creating bo's, and memregion queries
> > to obtain the size. Because of this, those parts of the uapi are
> > no longer guarded behind FAKE_LMEM.
> >
> > According to the pull request referenced below, mesa should be mostly
> > ready for DG1. VK_EXT_memory_budget is not hooked up yet, but we
> > should definitely just enable the uapi parts by default.
> >
> > Signed-off-by: Maarten Lankhorst 
> > References: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11584
> > Cc: Jordan Justen jordan.l.jus...@intel.com
> > Cc: Jason Ekstrand ja...@jlekstrand.net
>
> Acked-by: Daniel Vetter 

Acked-by: Jason Ekstrand 

>
> > ---
> >  drivers/gpu/drm/i915/gem/i915_gem_create.c | 3 ---
> >  drivers/gpu/drm/i915/i915_pci.c| 1 +
> >  drivers/gpu/drm/i915/i915_query.c  | 3 ---
> >  3 files changed, 1 insertion(+), 6 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_create.c 
> > b/drivers/gpu/drm/i915/gem/i915_gem_create.c
> > index 23fee13a3384..1d341b8c47c0 100644
> > --- a/drivers/gpu/drm/i915/gem/i915_gem_create.c
> > +++ b/drivers/gpu/drm/i915/gem/i915_gem_create.c
> > @@ -347,9 +347,6 @@ static int ext_set_placements(struct 
> > i915_user_extension __user *base,
> >  {
> > struct drm_i915_gem_create_ext_memory_regions ext;
> >
> > -   if (!IS_ENABLED(CONFIG_DRM_I915_UNSTABLE_FAKE_LMEM))
> > -   return -ENODEV;
> > -
> > if (copy_from_user(&ext, base, sizeof(ext)))
> > return -EFAULT;
> >
> > diff --git a/drivers/gpu/drm/i915/i915_pci.c 
> > b/drivers/gpu/drm/i915/i915_pci.c
> > index 1bbd09ad5287..93ccdc6bbd03 100644
> > --- a/drivers/gpu/drm/i915/i915_pci.c
> > +++ b/drivers/gpu/drm/i915/i915_pci.c
> > @@ -1115,6 +1115,7 @@ static const struct pci_device_id pciidlist[] = {
> > INTEL_RKL_IDS(&rkl_info),
> > INTEL_ADLS_IDS(&adl_s_info),
> > INTEL_ADLP_IDS(&adl_p_info),
> > +   INTEL_DG1_IDS(&dg1_info),
> > {0, 0, 0}
> >  };
> >  MODULE_DEVICE_TABLE(pci, pciidlist);
> > diff --git a/drivers/gpu/drm/i915/i915_query.c 
> > b/drivers/gpu/drm/i915/i915_query.c
> > index e49da36c62fb..5e2b909827f4 100644
> > --- a/drivers/gpu/drm/i915/i915_query.c
> > +++ b/drivers/gpu/drm/i915/i915_query.c
> > @@ -432,9 +432,6 @@ static int query_memregion_info(struct drm_i915_private 
> > *i915,
> > u32 total_length;
> > int ret, id, i;
> >
> > -   if (!IS_ENABLED(CONFIG_DRM_I915_UNSTABLE_FAKE_LMEM))
> > -   return -ENODEV;
> > -
> > if (query_item->flags != 0)
> > return -EINVAL;
> >
> > --
> > 2.32.0
> >
>
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch


[git pull] drm fixes for 5.14-rc6

2021-08-12 Thread Dave Airlie
Hey Linus,

Another week, another set of pretty regular fixes, nothing really
stands out too much.

Dave.

drm-fixes-2021-08-13:
drm fixes for 5.14-rc6

amdgpu:
- Yellow carp update
- RAS EEPROM fixes
- BACO/BOCO fixes
- Fix a memory leak in an error path
- Freesync fix
- VCN harvesting fix
- Display fixes

i915:
- GVT fix for Windows VM hang.
- Display fix of 12 BPC bits for display 12 and newer.
- Don't try to access some media register for fused off domains.
- Fix kerneldoc build warnings.

mediatek:
- Fix dpi bridge bug.
- Fix cursor plane no update.

meson:
- Fix colors when booting with HDR
The following changes since commit 36a21d51725af2ce0700c6ebcb6b9594aac658a6:

  Linux 5.14-rc5 (2021-08-08 13:49:31 -0700)

are available in the Git repository at:

  git://anongit.freedesktop.org/drm/drm tags/drm-fixes-2021-08-13

for you to fetch changes up to a1fa72683166b3c69511d5f2ffb37b9f49f48fea:

  Merge tag 'drm-misc-fixes-2021-08-12' of
git://anongit.freedesktop.org/drm/drm-misc into drm-fixes (2021-08-13
06:37:40 +1000)


drm fixes for 5.14-rc6

amdgpu:
- Yellow carp update
- RAS EEPROM fixes
- BACO/BOCO fixes
- Fix a memory leak in an error path
- Freesync fix
- VCN harvesting fix
- Display fixes

i915:
- GVT fix for Windows VM hang.
- Display fix of 12 BPC bits for display 12 and newer.
- Don't try to access some media register for fused off domains.
- Fix kerneldoc build warnings.

mediatek:
- Fix dpi bridge bug.
- Fix cursor plane no update.

meson:
- Fix colors when booting with HDR


Alex Deucher (2):
  drm/amdgpu: don't enable baco on boco platforms in runpm
  drm/amdgpu: handle VCN instances when harvesting (v2)

Ankit Nautiyal (1):
  drm/i915/display: Fix the 12 BPC bits for PIPE_MISC reg

Anson Jacob (1):
  drm/amd/display: use GFP_ATOMIC in amdgpu_dm_irq_schedule_work

Christian Hewitt (1):
  drm/meson: fix colour distortion from HDR set during vendor u-boot

Christophe JAILLET (1):
  drm/amd/pm: Fix a memory leak in an error handling path in
'vangogh_tables_init()'

Daniel Vetter (1):
  drm/doc/rfc: drop lmem uapi section

Dave Airlie (4):
  Merge tag 'mediatek-drm-fixes-5.14' of
https://git.kernel.org/pub/scm/linux/kernel/git/chunkuang.hu/linux
into drm-fixes
  Merge tag 'amd-drm-fixes-5.14-2021-08-11' of
https://gitlab.freedesktop.org/agd5f/linux into drm-fixes
  Merge tag 'drm-intel-fixes-2021-08-12' of
git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
  Merge tag 'drm-misc-fixes-2021-08-12' of
git://anongit.freedesktop.org/drm/drm-misc into drm-fixes

Eric Bernstein (1):
  drm/amd/display: Remove invalid assert for ODM + MPC case

Frank Wunderlich (1):
  drm/mediatek: dpi: Fix NULL dereference in mtk_dpi_bridge_atomic_check

Hsin-Yi Wang (1):
  drm/mediatek: mtk-dpi: Set out_fmt from config if not the last bridge

John Clements (1):
  drm/amdgpu: set RAS EEPROM address from VBIOS

Kenneth Feng (1):
  drm/amd/pm: bug fix for the runtime pm BACO

Matt Roper (1):
  drm/i915: Only access SFC_DONE when media domain is not fused off

Rodrigo Vivi (1):
  Merge tag 'gvt-fixes-2021-08-10' of
https://github.com/intel/gvt-linux into drm-intel-fixes

Solomon Chiu (1):
  drm/amdgpu: Add preferred mode in modeset when freesync video
mode's enabled.

Xiaomeng Hou (1):
  drm/amd/pm: update smu v13.0.1 firmware header

Zhenyu Wang (1):
  drm/i915/gvt: Fix cached atomics setting for Windows VM

jason-jh.lin (1):
  drm/mediatek: Fix cursor plane no update

 Documentation/gpu/rfc/i915_gem_lmem.rst| 109 -
 drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c   |  40 
 drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.h   |   1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c  |  12 ++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c|   2 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c |   4 +
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c  |   7 +-
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_irq.c  |   2 +-
 .../gpu/drm/amd/display/dc/dcn30/dcn30_resource.c  |   1 -
 drivers/gpu/drm/amd/include/atomfirmware.h |   2 +-
 drivers/gpu/drm/amd/pm/inc/smu_v13_0_1_pmfw.h  |   4 +-
 .../drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c|   3 +-
 drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c   |   2 +-
 drivers/gpu/drm/i915/display/intel_display.c   |  34 +--
 drivers/gpu/drm/i915/gvt/handlers.c|   1 +
 drivers/gpu/drm/i915/gvt/mmio_context.c|   2 +
 drivers/gpu/drm/i915/i915_gpu_error.c  |  19 +++-
 drivers/gpu/drm/i915/i915_reg.h|  16 ++-
 drivers/gpu/drm/mediatek/mtk_dpi.c |   6 +-
 drivers/gpu/drm/mediatek/mtk_drm_crtc.c|   3 -
 drivers/gpu/drm/mediatek/mtk_drm_plane.c   |  60 +++-
 

Re: linux-next: build failure after merge of the driver-core tree

2021-08-12 Thread Doug Anderson
Hi,

On Tue, Aug 10, 2021 at 5:13 AM Geert Uytterhoeven  wrote:
>
> On Fri, Jul 23, 2021 at 7:35 AM Uwe Kleine-König
>  wrote:
> > On Fri, Jul 23, 2021 at 03:09:44PM +1000, Stephen Rothwell wrote:
> > > After merging the driver-core tree, today's linux-next build (arm
> > > multi_v7_defconfig) failed like this:
> > >
> > > drivers/gpu/drm/drm_dp_aux_bus.c:106:13: error: initialization of 'void 
> > > (*)(struct device *)' from incompatible pointer type 'int (*)(struct 
> > > device *)' [-Werror=incompatible-pointer-types]
> > >   106 |  .remove  = dp_aux_ep_remove,
> > >   | ^~~~
> > > drivers/gpu/drm/drm_dp_aux_bus.c:106:13: note: (near initialization for 
> > > 'dp_aux_bus_type.remove')
> > >
> > > Caused by commit
> > >
> > >   aeb33699fc2c ("drm: Introduce the DP AUX bus")
> > >
> > > from the drm tree interacting with commit
> > >
> > >   fc7a6209d571 ("bus: Make remove callback return void")
> > >
> > > from the driver-core tree.
> > >
> > > I applied the following merge fix patch.
> > >
> > > From: Stephen Rothwell 
> > > Date: Fri, 23 Jul 2021 14:58:25 +1000
> > > Subject: [PATCH] fix for "drm: Introduce the DP AUX bus"
> > >
> > > interaction with "bus: Make remove callback return void"
> > >
> > > Signed-off-by: Stephen Rothwell 
> > > ---
> > >  drivers/gpu/drm/drm_dp_aux_bus.c | 5 +
> > >  1 file changed, 1 insertion(+), 4 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/drm_dp_aux_bus.c 
> > > b/drivers/gpu/drm/drm_dp_aux_bus.c
> > > index e49a70f3691b..298ea7a49591 100644
> > > --- a/drivers/gpu/drm/drm_dp_aux_bus.c
> > > +++ b/drivers/gpu/drm/drm_dp_aux_bus.c
> > > @@ -67,9 +67,8 @@ static int dp_aux_ep_probe(struct device *dev)
> > >   *
> > >   * Calls through to the endpoint driver remove.
> > >   *
> > > - * Return: 0 if no error or negative error code.
> > >   */
> > > -static int dp_aux_ep_remove(struct device *dev)
> > > +static void dp_aux_ep_remove(struct device *dev)
> > >  {
> > >   struct dp_aux_ep_driver *aux_ep_drv = to_dp_aux_ep_drv(dev->driver);
> > >   struct dp_aux_ep_device *aux_ep = to_dp_aux_ep_dev(dev);
> > > @@ -77,8 +76,6 @@ static int dp_aux_ep_remove(struct device *dev)
> > >   if (aux_ep_drv->remove)
> > >   aux_ep_drv->remove(aux_ep);
> > >   dev_pm_domain_detach(dev, true);
> > > -
> > > - return 0;
> > >  }
> >
> > This looks right.
> >
> > Greg provided a tag containing fc7a6209d571 ("bus: Make remove callback
> > return void") at
> >
> > 
> > git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core.git 
> > tags/bus_remove_return_void-5.15
> >
> > (see https://lore.kernel.org/lkml/ypkwqwf0dukng...@kroah.com).
> >
> > It would be great if this could be merged into the drm tree with the
> > above diff squashed into the merge commit.
>
> +1.

I looked at trying to do this but I think it's beyond the scope of
privileges that I'm granted as a drm_misc committer (not a drm_misc
maintainer). Adding the official maintainers [1].
Maarten/Maxime/Thomas would this be something you could do?

[1] https://drm.pages.freedesktop.org/maintainer-tools/repositories.html

-Doug


[PATCH 2/2] drm/ttm: Include pagemap.h from ttm_tt.h

2021-08-12 Thread Jason Ekstrand
It's needed for pgprot_t which is used in the header.

Signed-off-by: Jason Ekstrand 
Cc: Christian König 
---
 drivers/gpu/drm/ttm/ttm_tt.c | 1 -
 include/drm/ttm/ttm_tt.h | 1 +
 2 files changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/ttm/ttm_tt.c b/drivers/gpu/drm/ttm/ttm_tt.c
index 24031a8acd2d..d5cd8b5dc0bf 100644
--- a/drivers/gpu/drm/ttm/ttm_tt.c
+++ b/drivers/gpu/drm/ttm/ttm_tt.c
@@ -32,7 +32,6 @@
 #define pr_fmt(fmt) "[TTM] " fmt
 
 #include 
-#include <linux/pagemap.h>
 #include 
 #include 
 #include 
diff --git a/include/drm/ttm/ttm_tt.h b/include/drm/ttm/ttm_tt.h
index 0d97967bf955..b20e89d321b0 100644
--- a/include/drm/ttm/ttm_tt.h
+++ b/include/drm/ttm/ttm_tt.h
@@ -27,6 +27,7 @@
 #ifndef _TTM_TT_H_
 #define _TTM_TT_H_
 
+#include <linux/pagemap.h>
 #include 
 #include 
 #include 
-- 
2.31.1



[PATCH 1/2] drm/ttm: ttm_bo_device is now ttm_device

2021-08-12 Thread Jason Ekstrand
These names were changed in

commit 8af8a109b34fa88b8b91f25d11485b37d37549c3
Author: Christian König 
Date:   Thu Oct 1 14:51:40 2020 +0200

drm/ttm: device naming cleanup

But he missed a couple of them.

Signed-off-by: Jason Ekstrand 
Cc: Christian König 
Fixes: 8af8a109b34f ("drm/ttm: device naming cleanup")
---
 Documentation/gpu/drm-mm.rst | 2 +-
 include/drm/ttm/ttm_tt.h | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/Documentation/gpu/drm-mm.rst b/Documentation/gpu/drm-mm.rst
index d5a73fa2c9ef..8126beadc7df 100644
--- a/Documentation/gpu/drm-mm.rst
+++ b/Documentation/gpu/drm-mm.rst
@@ -37,7 +37,7 @@ TTM initialization
 This section is outdated.
 
 Drivers wishing to support TTM must pass a filled :c:type:`ttm_bo_driver
-` structure to ttm_bo_device_init, together with an
+` structure to ttm_device_init, together with an
 initialized global reference to the memory manager.  The ttm_bo_driver
 structure contains several fields with function pointers for
 initializing the TTM, allocating and freeing memory, waiting for command
diff --git a/include/drm/ttm/ttm_tt.h b/include/drm/ttm/ttm_tt.h
index 818680c6a8ed..0d97967bf955 100644
--- a/include/drm/ttm/ttm_tt.h
+++ b/include/drm/ttm/ttm_tt.h
@@ -31,7 +31,7 @@
 #include 
 #include 
 
-struct ttm_bo_device;
+struct ttm_device;
 struct ttm_tt;
 struct ttm_resource;
 struct ttm_buffer_object;
-- 
2.31.1



Re: [PATCH 4/4] drm/bridge: ti-sn65dsi86: Add NO_CONNECTOR support

2021-08-12 Thread Doug Anderson
Laurent,

On Thu, Aug 12, 2021 at 12:26 PM Laurent Pinchart
 wrote:
>
> Hi Rob,
>
> Thank you for the patch.
>
> On Wed, Aug 11, 2021 at 04:52:50PM -0700, Rob Clark wrote:
> > From: Rob Clark 
> >
> > Slightly awkward to fish out the display_info when we aren't creating
> > own connector.  But I don't see an obvious better way.
>
> We need a bit more than this, to support the NO_CONNECTOR case, the
> bridge has to implement a few extra operations, and set the bridge .ops
> field. I've submitted two patches to do so a while ago:
>
> - [RFC PATCH 08/11] drm/bridge: ti-sn65dsi86: Implement bridge connector 
> operations ([1])

Rob asked me about this over IRC, so if he left it out and it's needed
then it's my fault. However, I don't believe it's needed until your
series making this bridge chip support full DP. For the eDP case
the bridge chip driver in ToT no longer queries the EDID itself. It
simply provides an AUX bus to the panel driver and the panel driver
queries the EDID. I think that means we don't need to add
DRM_BRIDGE_OP_EDID, right?

I was also wondering if in the full DP case we should actually model
the physical DP jack as a drm_bridge and have it work the same way. It
would get probed via the DP AUX bus just like a panel. I seem to
remember Stephen Boyd was talking about modeling the DP connector as a
drm_bridge because it would allow us to handle the fact that some TCPC
chips could only support HBR2 whereas others could support HBR3. Maybe
it would end up being a fairly elegant solution?

> - [RFC PATCH 09/11] drm/bridge: ti-sn65dsi86: Make connector creation 
> optional ([2])
>
> The second patch is similar to the first half of this patch, but misses
> the cleanup code. I'll try to rebase this and resubmit, but it may take
> a bit of time.

Whoops! You're right that Rob's patch won't work at all because we'll
just hit the "Fix bridge driver to make connector optional!" case. I
should have noticed that. :(

-Doug


Re: [PATCH] drm: Copy drm_wait_vblank request and copy_to_user before return.

2021-08-12 Thread Mark Yacoub
On Thu, Aug 12, 2021 at 5:26 AM Michel Dänzer  wrote:
>
> On 2021-08-11 7:55 p.m., Mark Yacoub wrote:
> > From: Mark Yacoub 
> >
> > [Why]
> > Userspace should get back a copy of the request that's been modified
> > even when drm_wait_vblank_ioctl returns a failure.
> >
> > Rationale:
> > drm_wait_vblank_ioctl modifies the request and expects the user to read
> > it back. When the type is RELATIVE, it modifies it to ABSOLUTE and updates
> > the sequence to become current_vblank_count + sequence (which was
> > relative), so now it is absolute.
> > drmWaitVBlank (in libdrm) expects this to be the case, as it modifies
> > the request to be Absolute and expects the sequence to have been
> > updated.
> >
> > The change is in compat_drm_wait_vblank, which is called by
> > drm_compat_ioctl. This change of copying the data back regardless of the
> > return code makes it on par with drm_ioctl, which always copies the
> > data before returning.
> >
> > [How]
> > Copy the drm_wait_vblank request.
> > Return from the function after everything has been copied to user.
> >
> > Fixes: IGT:kms_flip::modeset-vs-vblank-race-interruptible
> > Tested on ChromeOS Trogdor(msm)
> >
> > Signed-off-by: Mark Yacoub 
> > ---
> >  drivers/gpu/drm/drm_ioc32.c | 8 +---
> >  1 file changed, 5 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/drm_ioc32.c b/drivers/gpu/drm/drm_ioc32.c
> > index d29907955ff79..275b860df8fbe 100644
> > --- a/drivers/gpu/drm/drm_ioc32.c
> > +++ b/drivers/gpu/drm/drm_ioc32.c
> > @@ -855,17 +855,19 @@ static int compat_drm_wait_vblank(struct file *file, 
> > unsigned int cmd,
> >   req.request.sequence = req32.request.sequence;
> >   req.request.signal = req32.request.signal;
> >   err = drm_ioctl_kernel(file, drm_wait_vblank_ioctl, &req, 
> > DRM_UNLOCKED);
> > - if (err)
> > - return err;
> >
> >   req32.reply.type = req.reply.type;
> >   req32.reply.sequence = req.reply.sequence;
> >   req32.reply.tval_sec = req.reply.tval_sec;
> >   req32.reply.tval_usec = req.reply.tval_usec;
> > + /* drm_wait_vblank_ioctl modifies Request, update their values here 
> > as well. */
> > + req32.request.type = req.request.type;
> > + req32.request.sequence = req.request.sequence;
> > + req32.request.signal = req.request.signal;
>
> The added assignments are superfluous, since req32.reply and req32.request 
> are members of the same union.
Noted.
>
>
> >   if (copy_to_user(argp, &req32, sizeof(req32)))
> >   return -EFAULT;
> >
> > - return 0;
> > + return err;
> >  }
>
> The other changes look correct.
Thanks! updated v2:
https://patchwork.freedesktop.org/patch/449768/?series=93605&rev=2
>
>
> --
> Earthling Michel Dänzer   |   https://redhat.com
> Libre software enthusiast | Mesa and X developer


[PATCH v2] drm: Copy drm_wait_vblank to user before returning

2021-08-12 Thread Mark Yacoub
From: Mark Yacoub 

[Why]
Userspace should get back a copy of drm_wait_vblank that's been modified
even when drm_wait_vblank_ioctl returns a failure.

Rationale:
drm_wait_vblank_ioctl modifies the request and expects the user to read
it back. When the type is RELATIVE, it modifies it to ABSOLUTE and updates
the sequence to become current_vblank_count + sequence (which was
RELATIVE), so it is now ABSOLUTE.
drmWaitVBlank (in libdrm) expects this to be the case, as it modifies
the request to be Absolute and expects the sequence to have been
updated.

The change is in compat_drm_wait_vblank, which is called by
drm_compat_ioctl. This change of copying the data back regardless of the
return code makes it on par with drm_ioctl, which always copies the
data before returning.

[How]
Return from the function after everything has been copied to user.

Fixes: IGT:kms_flip::modeset-vs-vblank-race-interruptible
Tested on ChromeOS Trogdor(msm)

Signed-off-by: Mark Yacoub 
Change-Id: I98da279a5f1329c66a9d1e06b88d40b247b51313
---
 drivers/gpu/drm/drm_ioc32.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/drm_ioc32.c b/drivers/gpu/drm/drm_ioc32.c
index d29907955ff79..5d82891c32223 100644
--- a/drivers/gpu/drm/drm_ioc32.c
+++ b/drivers/gpu/drm/drm_ioc32.c
@@ -855,8 +855,6 @@ static int compat_drm_wait_vblank(struct file *file, 
unsigned int cmd,
req.request.sequence = req32.request.sequence;
req.request.signal = req32.request.signal;
err = drm_ioctl_kernel(file, drm_wait_vblank_ioctl, &req, DRM_UNLOCKED);
-   if (err)
-   return err;
 
req32.reply.type = req.reply.type;
req32.reply.sequence = req.reply.sequence;
@@ -865,7 +863,7 @@ static int compat_drm_wait_vblank(struct file *file, 
unsigned int cmd,
if (copy_to_user(argp, &req32, sizeof(req32)))
return -EFAULT;
 
-   return 0;
+   return err;
 }
 
 #if defined(CONFIG_X86)
-- 
2.33.0.rc1.237.g0d66db33f3-goog



Re: [PATCH 5/9] drm/i915/guc: Flush the work queue for GuC generated G2H

2021-08-12 Thread Daniel Vetter
On Thu, Aug 12, 2021 at 03:23:30PM +, Matthew Brost wrote:
> On Thu, Aug 12, 2021 at 04:11:28PM +0200, Daniel Vetter wrote:
> > On Wed, Aug 11, 2021 at 01:16:18AM +, Matthew Brost wrote:
> > > Flush the work queue for GuC generated G2H messages during a GT reset.
> > > This is accomplished by spinning on the list of outstanding G2H to
> > > go empty.
> > > 
> > > Fixes: eb5e7da736f3 ("drm/i915/guc: Reset implementation for new GuC 
> > > interface")
> > > Signed-off-by: Matthew Brost 
> > > Cc: 
> > > ---
> > >  drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c | 5 +
> > >  1 file changed, 5 insertions(+)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c 
> > > b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> > > index 3cd2da6f5c03..e5eb2df11b4a 100644
> > > --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> > > +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> > > @@ -727,6 +727,11 @@ void intel_guc_submission_reset_prepare(struct 
> > > intel_guc *guc)
> > >   wait_for_reset(guc, &guc->outstanding_submission_g2h);
> > >   } while (!list_empty(&guc->ct.requests.incoming));
> > >   }
> > > +
> > > + /* Flush any GuC generated G2H */
> > > + while (!list_empty(&guc->ct.requests.incoming))
> > > + msleep(20);
> > 
> > flush_work or flush_workqueue, because that comes with lockdep
> > annotations. Don't hand-roll stuff like this if at all possible.
> 
> lockdep puked when used that.

Lockdep tends to be right ... So we definitely want that, but maybe a
different flavour, or there's something wrong with the workqueue setup.

This is the major reason why inventing any wheels locally in the kernel
isn't a good idea - aside from the duplicated code, there is likely already
a solution for whatever you need. There are all the validation tools,
careful thought about semantics (especially around races) and all that
stuff that you're always missing on your hand-rolled thing. Aside from that,
anything hand-rolled is much harder to read, since the intent isn't clear.
-Daniel


> 
> Matt
> 
> > -Daniel
> > 
> > > +
> > >   scrub_guc_desc_for_outstanding_g2h(guc);
> > >  }
> > >  
> > > -- 
> > > 2.32.0
> > > 
> > 
> > -- 
> > Daniel Vetter
> > Software Engineer, Intel Corporation
> > http://blog.ffwll.ch

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH 4/4] drm/vgem: use shmem helpers

2021-08-12 Thread Daniel Vetter
On Thu, Aug 12, 2021 at 07:01:44PM +0200, Sam Ravnborg wrote:
> Hi Daniel,
> 
> On Thu, Aug 12, 2021 at 03:14:12PM +0200, Daniel Vetter wrote:
> > Aside from deleting lots of code the real motivation here is to switch
> > the mmap over to VM_PFNMAP, to be more consistent with what real gpu
> > drivers do. They're all VM_PFNMP, which means get_user_pages doesn't
> > work, and even if you try and there's a struct page behind that,
> > touching it and mucking around with its refcount can upset drivers
> > real bad.
> The only thing I understood of all this complicated stuff was "deleting
> lots of code" which is a good thing.
> You may want to s/VM_PFNMP/VM_PFNMAP/ before you push this.

Fixed and patches 2-4 from this series pushed.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


[PULL] drm-misc-next

2021-08-12 Thread Maarten Lankhorst
Last drm-misc-next for next kernel release!

drm-misc-next-2021-08-12:
drm-misc-next for v5.15:

UAPI Changes:

Cross-subsystem Changes:
- Add lockdep_assert(once) helpers.

Core Changes:
- Add lockdep assert to drm_is_current_master_locked.
- Fix typos in dma-buf documentation.
- Mark drm irq midlayer as legacy only.
- Fix GPF in udmabuf_create.
- Rename member to correct value in drm_edid.h

Driver Changes:
- Build fix to make nouveau build with NOUVEAU_BACKLIGHT.
- Add MI101AIT-ICP1, LTTD800480070-L6WWH-RT panels.
- Assorted fixes to bridge/it66121, anx7625.
- Add custom crtc_state to simple helpers, and use it to
  convert pll handling in mgag200 to atomic.
- Convert drivers to use offset-adjusted framebuffer bo mappings.
- Assorted small fixes and fix for a use-after-free in vmwgfx.
- Convert remaining callers of non-legacy drivers to use linux irqs directly.
- Small cleanup in ingenic.
- Small fixes to virtio and ti-sn65dsi86.
The following changes since commit 5a04227326b04c15b015181772f5c853172fdb68:

  drm/panel: Add ilitek ili9341 panel driver (2021-08-05 11:09:23 +0200)

are available in the Git repository at:

  git://anongit.freedesktop.org/drm/drm-misc tags/drm-misc-next-2021-08-12

for you to fetch changes up to c7782443a88926a4f938f0193041616328cf2db2:

  drm/bridge: ti-sn65dsi86: Avoid creating multiple connectors (2021-08-12 
09:56:09 -0700)


drm-misc-next for v5.15:

UAPI Changes:

Cross-subsystem Changes:
- Add lockdep_assert(once) helpers.

Core Changes:
- Add lockdep assert to drm_is_current_master_locked.
- Fix typos in dma-buf documentation.
- Mark drm irq midlayer as legacy only.
- Fix GPF in udmabuf_create.
- Rename member to correct value in drm_edid.h

Driver Changes:
- Build fix to make nouveau build with NOUVEAU_BACKLIGHT.
- Add MI101AIT-ICP1, LTTD800480070-L6WWH-RT panels.
- Assorted fixes to bridge/it66121, anx7625.
- Add custom crtc_state to simple helpers, and use it to
  convert pll handling in mgag200 to atomic.
- Convert drivers to use offset-adjusted framebuffer bo mappings.
- Assorted small fixes and fix for a use-after-free in vmwgfx.
- Convert remaining callers of non-legacy drivers to use linux irqs directly.
- Small cleanup in ingenic.
- Small fixes to virtio and ti-sn65dsi86.


Baokun Li (2):
  drm/vmwgfx: Use list_move_tail instead of list_del/list_add_tail in 
vmwgfx_cmdbuf.c
  drm/vmwgfx: Use list_move_tail instead of list_del/list_add_tail in 
vmwgfx_cmdbuf_res.c

Cai Huoqing (2):
  drm/vmwgfx: Make use of PFN_ALIGN/PFN_UP helper macro
  drm/vmwgfx: Replace "vmw_num_pages" with "PFN_UP"

David Stevens (1):
  drm/virtio: set non-cross device blob uuid_state

Desmond Cheong Zhi Xi (2):
  drm: add lockdep assert to drm_is_current_master_locked
  drm/vmwgfx: fix potential UAF in vmwgfx_surface.c

Gal Pressman (1):
  dma-buf: Fix a few typos in dma-buf documentation

Lucas De Marchi (1):
  drm/edid: fix edid field name

Paul Cercueil (2):
  drm/ingenic: Remove dead code
  drm/ingenic: Use standard drm_atomic_helper_commit_tail

Pavel Skripkin (1):
  udmabuf: fix general protection fault in udmabuf_create

Peter Zijlstra (1):
  locking/lockdep: Provide lockdep_assert{,_once}() helpers

Randy Dunlap (1):
  drm: nouveau: fix disp.c build when NOUVEAU_BACKLIGHT is not enabled

Rob Clark (1):
  drm/bridge: ti-sn65dsi86: Avoid creating multiple connectors

Robert Foss (1):
  drm: bridge: it66121: Check drm_bridge_attach retval

Sam Ravnborg (1):
  drm/panel: simple: add Multi-Innotechnology MI1010AIT-1CP1

Shaokun Zhang (1):
  drm/vmwgfx: Remove the repeated declaration

Søren Andersen (1):
  drm/panel: simple: add LOGIC Technologies LTTD800480070-L6WH-RT

Thomas Zimmermann (38):
  drm/mgag200: Select clock in PLL update functions
  drm/mgag200: Return errno codes from PLL compute functions
  drm/mgag200: Remove P_ARRAY_SIZE
  drm/mgag200: Split PLL setup into compute and update functions
  drm/mgag200: Introduce separate variable for PLL S parameter
  drm/mgag200: Store values (not bits) in struct mgag200_pll_values
  drm/mgag200: Split PLL compute functions by device type
  drm/mgag200: Split PLL compute function for G200SE by rev
  drm/mgag200: Declare PLL clock constants static const
  drm/mgag200: Abstract pixel PLL via struct mgag200_pll
  drm/simple-kms: Support custom CRTC state
  drm/mgag200: Introduce custom CRTC state
  drm/mgag200: Compute PLL values during atomic check
  drm/gem: Provide offset-adjusted framebuffer BO mappings
  drm/ast: Use offset-adjusted shadow-plane mappings
  drm/gud: Get offset-adjusted mapping from drm_gem_fb_vmap()
  drm/hyperv: Use offset-adjusted shadow-plane mappings
  drm/mgag200: Use offset-adjusted shadow-plane mappings
  drm/cirrus: Use 

Re: [PATCH v5 13/20] drm/gem: Delete gem array fencing helpers

2021-08-12 Thread Daniel Vetter
On Thu, Aug 05, 2021 at 12:46:58PM +0200, Daniel Vetter wrote:
> Integrated into the scheduler now and all users converted over.
> 
> Signed-off-by: Daniel Vetter 
> Cc: Maarten Lankhorst 
> Cc: Maxime Ripard 
> Cc: Thomas Zimmermann 
> Cc: David Airlie 
> Cc: Daniel Vetter 
> Cc: Sumit Semwal 
> Cc: "Christian König" 
> Cc: linux-me...@vger.kernel.org
> Cc: linaro-mm-...@lists.linaro.org

Some acks would be great here.
-Daniel

> ---
>  drivers/gpu/drm/drm_gem.c | 96 ---
>  include/drm/drm_gem.h |  5 --
>  2 files changed, 101 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
> index 09c820045859..37e2e2820f08 100644
> --- a/drivers/gpu/drm/drm_gem.c
> +++ b/drivers/gpu/drm/drm_gem.c
> @@ -1272,99 +1272,3 @@ drm_gem_unlock_reservations(struct drm_gem_object 
> **objs, int count,
>   ww_acquire_fini(acquire_ctx);
>  }
>  EXPORT_SYMBOL(drm_gem_unlock_reservations);
> -
> -/**
> - * drm_gem_fence_array_add - Adds the fence to an array of fences to be
> - * waited on, deduplicating fences from the same context.
> - *
> - * @fence_array: array of dma_fence * for the job to block on.
> - * @fence: the dma_fence to add to the list of dependencies.
> - *
> - * This functions consumes the reference for @fence both on success and error
> - * cases.
> - *
> - * Returns:
> - * 0 on success, or an error on failing to expand the array.
> - */
> -int drm_gem_fence_array_add(struct xarray *fence_array,
> - struct dma_fence *fence)
> -{
> - struct dma_fence *entry;
> - unsigned long index;
> - u32 id = 0;
> - int ret;
> -
> - if (!fence)
> - return 0;
> -
> - /* Deduplicate if we already depend on a fence from the same context.
> -  * This lets the size of the array of deps scale with the number of
> -  * engines involved, rather than the number of BOs.
> -  */
> - xa_for_each(fence_array, index, entry) {
> - if (entry->context != fence->context)
> - continue;
> -
> - if (dma_fence_is_later(fence, entry)) {
> - dma_fence_put(entry);
> - xa_store(fence_array, index, fence, GFP_KERNEL);
> - } else {
> - dma_fence_put(fence);
> - }
> - return 0;
> - }
> -
> - ret = xa_alloc(fence_array, &id, fence, xa_limit_32b, GFP_KERNEL);
> - if (ret != 0)
> - dma_fence_put(fence);
> -
> - return ret;
> -}
> -EXPORT_SYMBOL(drm_gem_fence_array_add);
> -
> -/**
> - * drm_gem_fence_array_add_implicit - Adds the implicit dependencies tracked
> - * in the GEM object's reservation object to an array of dma_fences for use 
> in
> - * scheduling a rendering job.
> - *
> - * This should be called after drm_gem_lock_reservations() on your array of
> - * GEM objects used in the job but before updating the reservations with your
> - * own fences.
> - *
> - * @fence_array: array of dma_fence * for the job to block on.
> - * @obj: the gem object to add new dependencies from.
> - * @write: whether the job might write the object (so we need to depend on
> - * shared fences in the reservation object).
> - */
> -int drm_gem_fence_array_add_implicit(struct xarray *fence_array,
> -  struct drm_gem_object *obj,
> -  bool write)
> -{
> - int ret;
> - struct dma_fence **fences;
> - unsigned int i, fence_count;
> -
> - if (!write) {
> - struct dma_fence *fence =
> - dma_resv_get_excl_unlocked(obj->resv);
> -
> - return drm_gem_fence_array_add(fence_array, fence);
> - }
> -
> - ret = dma_resv_get_fences(obj->resv, NULL,
> - &fence_count, &fences);
> - if (ret || !fence_count)
> - return ret;
> -
> - for (i = 0; i < fence_count; i++) {
> - ret = drm_gem_fence_array_add(fence_array, fences[i]);
> - if (ret)
> - break;
> - }
> -
> - for (; i < fence_count; i++)
> - dma_fence_put(fences[i]);
> - kfree(fences);
> - return ret;
> -}
> -EXPORT_SYMBOL(drm_gem_fence_array_add_implicit);
> diff --git a/include/drm/drm_gem.h b/include/drm/drm_gem.h
> index 35e7f44c2a75..e55a767188af 100644
> --- a/include/drm/drm_gem.h
> +++ b/include/drm/drm_gem.h
> @@ -407,11 +407,6 @@ int drm_gem_lock_reservations(struct drm_gem_object 
> **objs, int count,
> struct ww_acquire_ctx *acquire_ctx);
>  void drm_gem_unlock_reservations(struct drm_gem_object **objs, int count,
>struct ww_acquire_ctx *acquire_ctx);
> -int drm_gem_fence_array_add(struct xarray *fence_array,
> - struct dma_fence *fence);
> -int drm_gem_fence_array_add_implicit(struct xarray *fence_array,
> -  struct drm_gem_object *obj,
> -

Re: [PATCH v5 12/20] drm/msm: Use scheduler dependency handling

2021-08-12 Thread Daniel Vetter
On Thu, Aug 05, 2021 at 12:46:57PM +0200, Daniel Vetter wrote:
> drm_sched_job_init is already at the right place, so this boils down
> to deleting code.
> 
> Signed-off-by: Daniel Vetter 
> Cc: Rob Clark 
> Cc: Sean Paul 
> Cc: Sumit Semwal 
> Cc: "Christian König" 
> Cc: linux-arm-...@vger.kernel.org
> Cc: freedr...@lists.freedesktop.org
> Cc: linux-me...@vger.kernel.org
> Cc: linaro-mm-...@lists.linaro.org

Ping for ack please.
-Daniel

> ---
>  drivers/gpu/drm/msm/msm_gem.h|  5 -
>  drivers/gpu/drm/msm/msm_gem_submit.c | 19 +--
>  drivers/gpu/drm/msm/msm_ringbuffer.c | 12 
>  3 files changed, 5 insertions(+), 31 deletions(-)
> 
> diff --git a/drivers/gpu/drm/msm/msm_gem.h b/drivers/gpu/drm/msm/msm_gem.h
> index f9e3ffb2309a..8bf0ac707fd7 100644
> --- a/drivers/gpu/drm/msm/msm_gem.h
> +++ b/drivers/gpu/drm/msm/msm_gem.h
> @@ -312,11 +312,6 @@ struct msm_gem_submit {
>   struct ww_acquire_ctx ticket;
>   uint32_t seqno; /* Sequence number of the submit on the ring */
>  
> - /* Array of struct dma_fence * to block on before submitting this job.
> -  */
> - struct xarray deps;
> - unsigned long last_dep;
> -
>   /* Hw fence, which is created when the scheduler executes the job, and
>* is signaled when the hw finishes (via seqno write from cmdstream)
>*/
> diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c 
> b/drivers/gpu/drm/msm/msm_gem_submit.c
> index 96cea0ba4cfd..fb5a2eab27a2 100644
> --- a/drivers/gpu/drm/msm/msm_gem_submit.c
> +++ b/drivers/gpu/drm/msm/msm_gem_submit.c
> @@ -52,8 +52,6 @@ static struct msm_gem_submit *submit_create(struct 
> drm_device *dev,
>   return ERR_PTR(ret);
>   }
>  
> - xa_init_flags(&submit->deps, XA_FLAGS_ALLOC);
> -
>   kref_init(>ref);
>   submit->dev = dev;
>   submit->aspace = queue->ctx->aspace;
> @@ -72,8 +70,6 @@ void __msm_gem_submit_destroy(struct kref *kref)
>  {
>   struct msm_gem_submit *submit =
>   container_of(kref, struct msm_gem_submit, ref);
> - unsigned long index;
> - struct dma_fence *fence;
>   unsigned i;
>  
>   if (submit->fence_id) {
> @@ -82,12 +78,6 @@ void __msm_gem_submit_destroy(struct kref *kref)
>   mutex_unlock(&submit->queue->lock);
>   }
>  
> - xa_for_each (&submit->deps, index, fence) {
> - dma_fence_put(fence);
> - }
> -
> - xa_destroy(&submit->deps);
> -
>   dma_fence_put(submit->user_fence);
>   dma_fence_put(submit->hw_fence);
>  
> @@ -343,8 +333,9 @@ static int submit_fence_sync(struct msm_gem_submit 
> *submit, bool no_implicit)
>   if (no_implicit)
>   continue;
>  
> - ret = drm_gem_fence_array_add_implicit(&submit->deps, obj,
> - write);
> + ret = drm_sched_job_add_implicit_dependencies(&submit->base,
> +   obj,
> +   write);
>   if (ret)
>   break;
>   }
> @@ -588,7 +579,7 @@ static struct drm_syncobj **msm_parse_deps(struct 
> msm_gem_submit *submit,
>   if (ret)
>   break;
>  
> - ret = drm_gem_fence_array_add(&submit->deps, fence);
> + ret = drm_sched_job_add_dependency(&submit->base, fence);
>   if (ret)
>   break;
>  
> @@ -798,7 +789,7 @@ int msm_ioctl_gem_submit(struct drm_device *dev, void 
> *data,
>   goto out_unlock;
>   }
>  
> - ret = drm_gem_fence_array_add(&submit->deps, in_fence);
> + ret = drm_sched_job_add_dependency(&submit->base, in_fence);
>   if (ret)
>   goto out_unlock;
>   }
> diff --git a/drivers/gpu/drm/msm/msm_ringbuffer.c 
> b/drivers/gpu/drm/msm/msm_ringbuffer.c
> index bd54c1412649..652b1dedd7c1 100644
> --- a/drivers/gpu/drm/msm/msm_ringbuffer.c
> +++ b/drivers/gpu/drm/msm/msm_ringbuffer.c
> @@ -11,17 +11,6 @@ static uint num_hw_submissions = 8;
>  MODULE_PARM_DESC(num_hw_submissions, "The max # of jobs to write into 
> ringbuffer (default 8)");
>  module_param(num_hw_submissions, uint, 0600);
>  
> -static struct dma_fence *msm_job_dependency(struct drm_sched_job *job,
> - struct drm_sched_entity *s_entity)
> -{
> - struct msm_gem_submit *submit = to_msm_submit(job);
> -
> - if (!xa_empty(&submit->deps))
> - return xa_erase(&submit->deps, submit->last_dep++);
> -
> - return NULL;
> -}
> -
>  static struct dma_fence *msm_job_run(struct drm_sched_job *job)
>  {
>   struct msm_gem_submit *submit = to_msm_submit(job);
> @@ -52,7 +41,6 @@ static void msm_job_free(struct drm_sched_job *job)
>  }
>  
>  const struct drm_sched_backend_ops msm_sched_ops = {
> - .dependency = msm_job_dependency,
>   .run_job = msm_job_run,
>   .free_job = msm_job_free
>  };
> -- 
> 2.32.0
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation

Re: [PATCH v5 11/20] drm/etnaviv: Use scheduler dependency handling

2021-08-12 Thread Daniel Vetter
On Thu, Aug 05, 2021 at 12:46:56PM +0200, Daniel Vetter wrote:
> We need to pull the drm_sched_job_init much earlier, but that's very
> minor surgery.
> 
> v2: Actually fix up cleanup paths by calling drm_sched_job_init, which
> I wanted to do in the previous round (and did, for all other drivers).
> Spotted by Lucas.
> 
> v3: Rebase over renamed functions to add dependencies.
> 
> Signed-off-by: Daniel Vetter 
> Cc: Lucas Stach 
> Cc: Russell King 
> Cc: Christian Gmeiner 
> Cc: Sumit Semwal 
> Cc: "Christian König" 
> Cc: etna...@lists.freedesktop.org
> Cc: linux-me...@vger.kernel.org
> Cc: linaro-mm-...@lists.linaro.org

Ping for an ack please.
-Daniel

> ---
>  drivers/gpu/drm/etnaviv/etnaviv_gem.h|  5 +-
>  drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c | 60 ++-
>  drivers/gpu/drm/etnaviv/etnaviv_sched.c  | 63 +---
>  drivers/gpu/drm/etnaviv/etnaviv_sched.h  |  3 +-
>  4 files changed, 37 insertions(+), 94 deletions(-)
> 
> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem.h 
> b/drivers/gpu/drm/etnaviv/etnaviv_gem.h
> index 98e60df882b6..63688e6e4580 100644
> --- a/drivers/gpu/drm/etnaviv/etnaviv_gem.h
> +++ b/drivers/gpu/drm/etnaviv/etnaviv_gem.h
> @@ -80,9 +80,6 @@ struct etnaviv_gem_submit_bo {
>   u64 va;
>   struct etnaviv_gem_object *obj;
>   struct etnaviv_vram_mapping *mapping;
> - struct dma_fence *excl;
> - unsigned int nr_shared;
> - struct dma_fence **shared;
>  };
>  
>  /* Created per submit-ioctl, to track bo's and cmdstream bufs, etc,
> @@ -95,7 +92,7 @@ struct etnaviv_gem_submit {
>   struct etnaviv_file_private *ctx;
>   struct etnaviv_gpu *gpu;
>   struct etnaviv_iommu_context *mmu_context, *prev_mmu_context;
> - struct dma_fence *out_fence, *in_fence;
> + struct dma_fence *out_fence;
>   int out_fence_id;
>   struct list_head node; /* GPU active submit list */
>   struct etnaviv_cmdbuf cmdbuf;
> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c 
> b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
> index 4dd7d9d541c0..e3d43678eb09 100644
> --- a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
> +++ b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
> @@ -188,16 +188,11 @@ static int submit_fence_sync(struct etnaviv_gem_submit 
> *submit)
>   if (submit->flags & ETNA_SUBMIT_NO_IMPLICIT)
>   continue;
>  
> - if (bo->flags & ETNA_SUBMIT_BO_WRITE) {
> - ret = dma_resv_get_fences(robj, &bo->excl,
> -   &bo->nr_shared,
> -   &bo->shared);
> - if (ret)
> - return ret;
> - } else {
> - bo->excl = dma_resv_get_excl_unlocked(robj);
> - }
> -
> + ret = 
> drm_sched_job_add_implicit_dependencies(&submit->sched_job,
> +   &bo->obj->base,
> +   bo->flags & 
> ETNA_SUBMIT_BO_WRITE);
> + if (ret)
> + return ret;
>   }
>  
>   return ret;
> @@ -403,8 +398,6 @@ static void submit_cleanup(struct kref *kref)
>  
>   wake_up_all(&submit->gpu->fence_event);
>  
> - if (submit->in_fence)
> - dma_fence_put(submit->in_fence);
>   if (submit->out_fence) {
>   /* first remove from IDR, so fence can not be found anymore */
>   mutex_lock(&submit->gpu->fence_lock);
> @@ -529,7 +522,7 @@ int etnaviv_ioctl_gem_submit(struct drm_device *dev, void 
> *data,
>   ret = etnaviv_cmdbuf_init(priv->cmdbuf_suballoc, &submit->cmdbuf,
> ALIGN(args->stream_size, 8) + 8);
>   if (ret)
> - goto err_submit_objects;
> + goto err_submit_put;
>  
>   submit->ctx = file->driver_priv;
>   etnaviv_iommu_context_get(submit->ctx->mmu);
> @@ -537,51 +530,62 @@ int etnaviv_ioctl_gem_submit(struct drm_device *dev, 
> void *data,
>   submit->exec_state = args->exec_state;
>   submit->flags = args->flags;
>  
> + ret = drm_sched_job_init(&submit->sched_job,
> +  &ctx->sched_entity[args->pipe],
> +  submit->ctx);
> + if (ret)
> + goto err_submit_put;
> +
>   ret = submit_lookup_objects(submit, file, bos, args->nr_bos);
>   if (ret)
> - goto err_submit_objects;
> + goto err_submit_job;
>  
>   if ((priv->mmu_global->version != ETNAVIV_IOMMU_V2) &&
>   !etnaviv_cmd_validate_one(gpu, stream, args->stream_size / 4,
> relocs, args->nr_relocs)) {
>   ret = -EINVAL;
> - goto err_submit_objects;
> + goto err_submit_job;
>   }
>  
>   if (args->flags & ETNA_SUBMIT_FENCE_FD_IN) {
> - submit->in_fence = sync_file_get_fence(args->fence_fd);
> - if (!submit->in_fence) 

Re: [PATCH v5 08/20] drm/lima: use scheduler dependency tracking

2021-08-12 Thread Daniel Vetter
On Thu, Aug 05, 2021 at 12:46:53PM +0200, Daniel Vetter wrote:
> Nothing special going on here.
> 
> Aside from reviewing the code, it seems like drm_sched_job_arm() should be
> moved into lima_sched_context_queue_task and put under some mutex
> together with drm_sched_push_job(). See the kerneldoc for
> drm_sched_push_job().
> 
> v2: Rebase over renamed functions to add dependencies.
> 
> Signed-off-by: Daniel Vetter 
> Cc: Qiang Yu 
> Cc: Sumit Semwal 
> Cc: "Christian König" 
> Cc: l...@lists.freedesktop.org
> Cc: linux-me...@vger.kernel.org
> Cc: linaro-mm-...@lists.linaro.org

Ping for an ack here please. Testing would be even better ofc.
-Daniel

> ---
>  drivers/gpu/drm/lima/lima_gem.c   |  6 --
>  drivers/gpu/drm/lima/lima_sched.c | 21 -
>  drivers/gpu/drm/lima/lima_sched.h |  3 ---
>  3 files changed, 4 insertions(+), 26 deletions(-)
> 
> diff --git a/drivers/gpu/drm/lima/lima_gem.c b/drivers/gpu/drm/lima/lima_gem.c
> index c528f40981bb..640acc060467 100644
> --- a/drivers/gpu/drm/lima/lima_gem.c
> +++ b/drivers/gpu/drm/lima/lima_gem.c
> @@ -267,7 +267,9 @@ static int lima_gem_sync_bo(struct lima_sched_task *task, 
> struct lima_bo *bo,
>   if (explicit)
>   return 0;
>  
> - return drm_gem_fence_array_add_implicit(&task->deps, &bo->base.base, 
> write);
> + return drm_sched_job_add_implicit_dependencies(&task->base,
> +&bo->base.base,
> +write);
>  }
>  
>  static int lima_gem_add_deps(struct drm_file *file, struct lima_submit 
> *submit)
> @@ -285,7 +287,7 @@ static int lima_gem_add_deps(struct drm_file *file, 
> struct lima_submit *submit)
>   if (err)
>   return err;
>  
> - err = drm_sched_job_add_dependency(&submit->task->deps, fence);
> + err = drm_sched_job_add_dependency(&submit->task->base, fence);
>   if (err) {
>   dma_fence_put(fence);
>   return err;
> diff --git a/drivers/gpu/drm/lima/lima_sched.c 
> b/drivers/gpu/drm/lima/lima_sched.c
> index e968b5a8f0b0..99d5f6f1a882 100644
> --- a/drivers/gpu/drm/lima/lima_sched.c
> +++ b/drivers/gpu/drm/lima/lima_sched.c
> @@ -134,24 +134,15 @@ int lima_sched_task_init(struct lima_sched_task *task,
>   task->num_bos = num_bos;
>   task->vm = lima_vm_get(vm);
>  
> - xa_init_flags(&task->deps, XA_FLAGS_ALLOC);
> -
>   return 0;
>  }
>  
>  void lima_sched_task_fini(struct lima_sched_task *task)
>  {
> - struct dma_fence *fence;
> - unsigned long index;
>   int i;
>  
>   drm_sched_job_cleanup(&task->base);
>  
> - xa_for_each(&task->deps, index, fence) {
> - dma_fence_put(fence);
> - }
> - xa_destroy(&task->deps);
> -
>   if (task->bos) {
>   for (i = 0; i < task->num_bos; i++)
>   drm_gem_object_put(&task->bos[i]->base.base);
> @@ -186,17 +177,6 @@ struct dma_fence *lima_sched_context_queue_task(struct 
> lima_sched_task *task)
>   return fence;
>  }
>  
> -static struct dma_fence *lima_sched_dependency(struct drm_sched_job *job,
> -struct drm_sched_entity *entity)
> -{
> - struct lima_sched_task *task = to_lima_task(job);
> -
> - if (!xa_empty(&task->deps))
> - return xa_erase(&task->deps, task->last_dep++);
> -
> - return NULL;
> -}
> -
>  static int lima_pm_busy(struct lima_device *ldev)
>  {
>   int ret;
> @@ -472,7 +452,6 @@ static void lima_sched_free_job(struct drm_sched_job *job)
>  }
>  
>  static const struct drm_sched_backend_ops lima_sched_ops = {
> - .dependency = lima_sched_dependency,
>   .run_job = lima_sched_run_job,
>   .timedout_job = lima_sched_timedout_job,
>   .free_job = lima_sched_free_job,
> diff --git a/drivers/gpu/drm/lima/lima_sched.h 
> b/drivers/gpu/drm/lima/lima_sched.h
> index ac70006b0e26..6a11764d87b3 100644
> --- a/drivers/gpu/drm/lima/lima_sched.h
> +++ b/drivers/gpu/drm/lima/lima_sched.h
> @@ -23,9 +23,6 @@ struct lima_sched_task {
>   struct lima_vm *vm;
>   void *frame;
>  
> - struct xarray deps;
> - unsigned long last_dep;
> -
>   struct lima_bo **bos;
>   int num_bos;
>  
> -- 
> 2.32.0
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [Intel-gfx] [PATCH 46/46] drm/i915/guc: Add delay before disabling scheduling on contexts

2021-08-12 Thread Daniel Vetter
On Tue, Aug 03, 2021 at 03:29:43PM -0700, Matthew Brost wrote:
> Some workloads use lots of contexts that continually pin / unpin
> contexts. With GuC submission an unpin translates to a schedule disable
> H2G which puts pressure on both the i915 and GuC. A schedule disable can
> also block future requests from being submitted until the operation
> completes. None of this is ideal.
> 
> Add a configurable, via debugfs, delay period before the schedule
> disable is issued. Default delay period is 1 second. The delay period is
> skipped if more than 3/4 of the guc_ids are in use.
> 
> This patch also updates the selftests to turn off this delay period as
> this extra time would likely cause many selftests to fail. Follow up
> patches will fix all the selftests and enable the delay period.
> 
> Signed-off-by: Matthew Brost 

Quick summary of what we just discussed:

- mod_delayed_work (one per context) is your friend.

- if we go with that for the sched_disable work then that can just take
  the ce->pin_mutex, and recheck there. Which takes care of all the races
  (or well, should at least), because mod_delayed_work is making sure you
  never miss an update.

I feel like this would be a semi-reasonable intermediate option instead of
just hard-pinning contexts completely for their entire lifetime.

I think that would be a smaller step than perma-pinned contexts with their
guc_id, and it would allow us to clean up a lot of this code here still.
-Daniel
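
A minimal sketch of the pattern described above, with hypothetical names
(the sched_disable_work field and the two helpers are assumptions, not code
from this series): one delayed work per context, re-armed with
mod_delayed_work() whenever an unpin happens, and re-checked under
ce->pin_mutex in the worker before the H2G is actually sent.

static void guc_sched_disable_worker(struct work_struct *wrk)
{
	struct delayed_work *dwork = to_delayed_work(wrk);
	struct intel_context *ce =
		container_of(dwork, struct intel_context, sched_disable_work);

	mutex_lock(&ce->pin_mutex);
	/* Re-check under the lock: a new pin may have raced with the timer. */
	if (context_should_disable_scheduling(ce))	/* hypothetical check */
		send_sched_disable_h2g(ce);		/* hypothetical helper */
	mutex_unlock(&ce->pin_mutex);
}

static void guc_context_schedule_disable_deferred(struct intel_context *ce)
{
	/*
	 * Re-arm rather than schedule: mod_delayed_work() updates the timer
	 * even if the work is already pending, so no unpin is ever missed.
	 */
	mod_delayed_work(system_wq, &ce->sched_disable_work, HZ);
}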

> ---
>  drivers/gpu/drm/i915/gem/i915_gem_context.c   |   2 +-
>  .../i915/gem/selftests/i915_gem_coherency.c   |   2 +-
>  .../drm/i915/gem/selftests/i915_gem_dmabuf.c  |   2 +-
>  .../drm/i915/gem/selftests/i915_gem_mman.c|   2 +-
>  .../drm/i915/gem/selftests/i915_gem_object.c  |   2 +-
>  drivers/gpu/drm/i915/gt/intel_context.c   |   2 +
>  drivers/gpu/drm/i915/gt/intel_context.h   |   9 +
>  drivers/gpu/drm/i915/gt/intel_context_types.h |   8 +
>  drivers/gpu/drm/i915/gt/uc/intel_guc.h|   7 +
>  .../gpu/drm/i915/gt/uc/intel_guc_debugfs.c|  28 ++
>  .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 322 +-
>  .../i915/gt/uc/selftest_guc_flow_control.c|  19 +-
>  drivers/gpu/drm/i915/i915_selftest.h  |   2 +
>  drivers/gpu/drm/i915/i915_trace.h |  10 +
>  drivers/gpu/drm/i915/selftests/i915_gem_gtt.c |   2 +-
>  drivers/gpu/drm/i915/selftests/i915_perf.c|   2 +-
>  drivers/gpu/drm/i915/selftests/i915_request.c |   2 +-
>  drivers/gpu/drm/i915/selftests/i915_vma.c |   2 +-
>  18 files changed, 405 insertions(+), 20 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
> b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> index b199d59bd2c4..1553287e5491 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> @@ -1298,7 +1298,7 @@ static void engines_idle_release(struct 
> i915_gem_context *ctx,
>   int err;
>  
>   /* serialises with execbuf */
> - set_bit(CONTEXT_CLOSED_BIT, &ce->flags);
> + intel_context_close(ce);
>   if (!intel_context_pin_if_active(ce))
>   continue;
>  
> diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c 
> b/drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c
> index 13b088cc787e..a666d7e610f5 100644
> --- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c
> +++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c
> @@ -434,5 +434,5 @@ int i915_gem_coherency_live_selftests(struct 
> drm_i915_private *i915)
>   SUBTEST(igt_gem_coherency),
>   };
>  
> - return i915_subtests(tests, i915);
> + return i915_live_subtests(tests, i915);
>  }
> diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_dmabuf.c 
> b/drivers/gpu/drm/i915/gem/selftests/i915_gem_dmabuf.c
> index ffae7df5e4d7..2c92afa9d608 100644
> --- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_dmabuf.c
> +++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_dmabuf.c
> @@ -474,5 +474,5 @@ int i915_gem_dmabuf_live_selftests(struct 
> drm_i915_private *i915)
>   SUBTEST(igt_dmabuf_import_same_driver_lmem_smem),
>   };
>  
> - return i915_subtests(tests, i915);
> + return i915_live_subtests(tests, i915);
>  }
> diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c 
> b/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c
> index b20f5621f62b..4745c78a48de 100644
> --- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c
> +++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c
> @@ -1414,5 +1414,5 @@ int i915_gem_mman_live_selftests(struct 
> drm_i915_private *i915)
>   SUBTEST(igt_mmap_gpu),
>   };
>  
> - return i915_subtests(tests, i915);
> + return i915_live_subtests(tests, i915);
>  }
> diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_object.c 
> b/drivers/gpu/drm/i915/gem/selftests/i915_gem_object.c
> index 740ee8086a27..ae1361c7c4cf 

Re: [PATCH 4/4] drm/bridge: ti-sn65dsi86: Add NO_CONNECTOR support

2021-08-12 Thread Laurent Pinchart
Hi Rob,

Thank you for the patch.

On Wed, Aug 11, 2021 at 04:52:50PM -0700, Rob Clark wrote:
> From: Rob Clark 
> 
> Slightly awkward to fish out the display_info when we aren't creating
> our own connector.  But I don't see an obvious better way.

We need a bit more than this: to support the NO_CONNECTOR case, the
bridge has to implement a few extra operations and set the bridge .ops
field. I submitted two patches to do so a while ago:

- [RFC PATCH 08/11] drm/bridge: ti-sn65dsi86: Implement bridge connector 
operations ([1])
- [RFC PATCH 09/11] drm/bridge: ti-sn65dsi86: Make connector creation optional 
([2])

The second patch is similar to the first half of this patch, but misses
the cleanup code. I'll try to rebase this and resubmit, but it may take
a bit of time.

[1] 
https://lore.kernel.org/dri-devel/20210322030128.2283-9-laurent.pinchart+rene...@ideasonboard.com/
[2] 
https://lore.kernel.org/dri-devel/20210322030128.2283-10-laurent.pinchart+rene...@ideasonboard.com/
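
As a rough illustration of what those extra operations typically look like
for a bridge supporting DRM_BRIDGE_ATTACH_NO_CONNECTOR (a sketch, not taken
from either patch above, and the function bodies are assumptions rather
than the real implementation): the bridge advertises DRM_BRIDGE_OP_*
capabilities in its .ops field and implements the matching callbacks so the
helper-created connector can do detection and EDID retrieval through it.

static enum drm_connector_status ti_sn_bridge_detect(struct drm_bridge *bridge)
{
	/* The eDP panel is hard-wired, so report it as always connected. */
	return connector_status_connected;
}

static struct edid *ti_sn_bridge_get_edid(struct drm_bridge *bridge,
					  struct drm_connector *connector)
{
	struct ti_sn65dsi86 *pdata = bridge_to_ti_sn65dsi86(bridge);

	/* Read the EDID over the DP AUX channel's I2C adapter. */
	return drm_get_edid(connector, &pdata->aux.ddc);
}

static void ti_sn_bridge_setup_ops(struct ti_sn65dsi86 *pdata)
{
	/* Advertise the capabilities backing the helper-created connector. */
	pdata->bridge.ops = DRM_BRIDGE_OP_DETECT | DRM_BRIDGE_OP_EDID;
	pdata->bridge.type = DRM_MODE_CONNECTOR_eDP;
}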

> Signed-off-by: Rob Clark 
> ---
>  drivers/gpu/drm/bridge/ti-sn65dsi86.c | 34 +++
>  1 file changed, 29 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/gpu/drm/bridge/ti-sn65dsi86.c 
> b/drivers/gpu/drm/bridge/ti-sn65dsi86.c
> index 38dcc49eccaf..dc8112bab3d3 100644
> --- a/drivers/gpu/drm/bridge/ti-sn65dsi86.c
> +++ b/drivers/gpu/drm/bridge/ti-sn65dsi86.c
> @@ -693,9 +693,11 @@ static int ti_sn_bridge_attach(struct drm_bridge *bridge,
>   return ret;
>   }
>  
> - ret = ti_sn_bridge_connector_init(pdata);
> - if (ret < 0)
> - goto err_conn_init;
> + if (!(flags & DRM_BRIDGE_ATTACH_NO_CONNECTOR)) {
> + ret = ti_sn_bridge_connector_init(pdata);
> + if (ret < 0)
> + goto err_conn_init;
> + }
>  
>   /*
>* TODO: ideally finding host resource and dsi dev registration needs
> @@ -757,7 +759,8 @@ static int ti_sn_bridge_attach(struct drm_bridge *bridge,
>  err_dsi_attach:
>   mipi_dsi_device_unregister(dsi);
>  err_dsi_host:
> - drm_connector_cleanup(&pdata->connector);
> + if (!(flags & DRM_BRIDGE_ATTACH_NO_CONNECTOR))
> + drm_connector_cleanup(>connector);
>  err_conn_init:
>   drm_dp_aux_unregister(&pdata->aux);
>   return ret;
> @@ -806,9 +809,30 @@ static void ti_sn_bridge_set_dsi_rate(struct 
> ti_sn65dsi86 *pdata)
>   regmap_write(pdata->regmap, SN_DSIA_CLK_FREQ_REG, val);
>  }
>  
> +/*
> + * Find the connector and fish out the bpc from display_info.  It would
> + * be nice if we could get this instead from drm_bridge_state, but that
> + * doesn't yet appear to be the case.
> + */

This should be done with

struct drm_atomic_state *state = old_bridge_state->base.state;
struct drm_connector *connector;

connector = drm_atomic_get_new_connector_for_encoder(state,
 bridge->encoder);

>  static unsigned int ti_sn_bridge_get_bpp(struct ti_sn65dsi86 *pdata)
>  {
> - if (pdata->connector.display_info.bpc <= 6)
> + struct drm_bridge *bridge = &pdata->bridge;
> + struct drm_connector_list_iter conn_iter;
> + struct drm_connector *connector;
> + unsigned bpc = 0;
> +
> + drm_connector_list_iter_begin(bridge->dev, &conn_iter);
> + drm_for_each_connector_iter(connector, &conn_iter) {
> + if (drm_connector_has_possible_encoder(connector, 
> bridge->encoder)) {
> + bpc = connector->display_info.bpc;
> + break;
> + }
> + }
> + drm_connector_list_iter_end(&conn_iter);
> +
> + WARN_ON(bpc == 0);
> +
> + if (bpc <= 6)
>   return 18;
>   else
>   return 24;

-- 
Regards,

Laurent Pinchart
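
A sketch of how Laurent's suggestion could replace the connector-list walk
in ti_sn_bridge_get_bpp(); this assumes the enable path is reworked to pass
the old bridge state down, which the posted patch does not do, so the
signature below is hypothetical.

static unsigned int
ti_sn_bridge_get_bpp(struct ti_sn65dsi86 *pdata,
		     struct drm_bridge_state *old_bridge_state)
{
	struct drm_atomic_state *state = old_bridge_state->base.state;
	struct drm_connector *connector;

	/* Look up the connector driving this encoder in the atomic state. */
	connector = drm_atomic_get_new_connector_for_encoder(state,
							     pdata->bridge.encoder);
	if (WARN_ON(!connector))
		return 24;	/* fall back to 24 bpp */

	return connector->display_info.bpc <= 6 ? 18 : 24;
}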


Re: [PATCH 3/4] drm/bridge: ti-sn65dsi86: Implement bridge->mode_valid()

2021-08-12 Thread Laurent Pinchart
Hi Rob,

On Thu, Aug 12, 2021 at 12:09:12PM -0700, Rob Clark wrote:
> On Thu, Aug 12, 2021 at 11:44 AM Laurent Pinchart wrote:
> > On Wed, Aug 11, 2021 at 04:52:49PM -0700, Rob Clark wrote:
> > > From: Rob Clark 
> > >
> > > For the brave new world of bridges not creating their own connectors, we
> > > need to implement the max clock limitation via bridge->mode_valid()
> > > instead of connector->mode_valid().
> > >
> > > Signed-off-by: Rob Clark 
> > > ---
> > >  drivers/gpu/drm/bridge/ti-sn65dsi86.c | 24 +++-
> > >  1 file changed, 19 insertions(+), 5 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/bridge/ti-sn65dsi86.c 
> > > b/drivers/gpu/drm/bridge/ti-sn65dsi86.c
> > > index 5d3b30b2f547..38dcc49eccaf 100644
> > > --- a/drivers/gpu/drm/bridge/ti-sn65dsi86.c
> > > +++ b/drivers/gpu/drm/bridge/ti-sn65dsi86.c
> > > @@ -595,6 +595,15 @@ static struct auxiliary_driver ti_sn_aux_driver = {
> > >   .id_table = ti_sn_aux_id_table,
> > >  };
> > >
> > > +static enum drm_mode_status check_mode(const struct drm_display_mode 
> > > *mode)
> > > +{
> > > + /* maximum supported resolution is 4K at 60 fps */
> > > + if (mode->clock > 594000)
> > > + return MODE_CLOCK_HIGH;
> > > +
> > > + return MODE_OK;
> > > +}
> > > +
> > >  /* 
> > > -
> > >   * DRM Connector Operations
> > >   */
> > > @@ -616,11 +625,7 @@ static enum drm_mode_status
> > >  ti_sn_bridge_connector_mode_valid(struct drm_connector *connector,
> > > struct drm_display_mode *mode)
> > >  {
> > > - /* maximum supported resolution is 4K at 60 fps */
> > > - if (mode->clock > 594000)
> > > - return MODE_CLOCK_HIGH;
> > > -
> > > - return MODE_OK;
> > > + return check_mode(mode);
> >
> > Do we need to implement the connector .mode_valid() operation, given
> > that the bridge is linked in the chain ?
> 
> My understanding is that we need to keep it for display drivers that
> are not converted to NO_CONNECTOR..
> 
> But AFAIK snapdragon is the only upstream user of this bridge, so
> after the drm/msm/dsi patch lands we could probably garbage collect
> the connector support.

Even in the !NO_CONNECTOR case, the bridge will still be linked in the
chain, so the atomic helpers should call the bridge .mode_valid() in
addition to the connector .mode_valid(). I think the connector operation
is redundant.

> > >  }
> > >
> > >  static struct drm_connector_helper_funcs 
> > > ti_sn_bridge_connector_helper_funcs = {
> > > @@ -763,6 +768,14 @@ static void ti_sn_bridge_detach(struct drm_bridge 
> > > *bridge)
> > >   drm_dp_aux_unregister(&bridge_to_ti_sn65dsi86(bridge)->aux);
> > >  }
> > >
> > > +static enum drm_mode_status
> > > +ti_sn_bridge_mode_valid(struct drm_bridge *bridge,
> > > + const struct drm_display_info *info,
> > > + const struct drm_display_mode *mode)
> > > +{
> > > + return check_mode(mode);
> > > +}
> > > +
> > >  static void ti_sn_bridge_disable(struct drm_bridge *bridge)
> > >  {
> > >   struct ti_sn65dsi86 *pdata = bridge_to_ti_sn65dsi86(bridge);
> > > @@ -1118,6 +1131,7 @@ static void ti_sn_bridge_post_disable(struct 
> > > drm_bridge *bridge)
> > >  static const struct drm_bridge_funcs ti_sn_bridge_funcs = {
> > >   .attach = ti_sn_bridge_attach,
> > >   .detach = ti_sn_bridge_detach,
> > > + .mode_valid = ti_sn_bridge_mode_valid,
> > >   .pre_enable = ti_sn_bridge_pre_enable,
> > >   .enable = ti_sn_bridge_enable,
> > >   .disable = ti_sn_bridge_disable,

-- 
Regards,

Laurent Pinchart


Re: [PATCH 3/4] drm/bridge: ti-sn65dsi86: Implement bridge->mode_valid()

2021-08-12 Thread Rob Clark
On Thu, Aug 12, 2021 at 11:44 AM Laurent Pinchart
 wrote:
>
> Hi Rob,
>
> Thank you for the patch.
>
> On Wed, Aug 11, 2021 at 04:52:49PM -0700, Rob Clark wrote:
> > From: Rob Clark 
> >
> > For the brave new world of bridges not creating their own connectors, we
> > need to implement the max clock limitation via bridge->mode_valid()
> > instead of connector->mode_valid().
> >
> > Signed-off-by: Rob Clark 
> > ---
> >  drivers/gpu/drm/bridge/ti-sn65dsi86.c | 24 +++-
> >  1 file changed, 19 insertions(+), 5 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/bridge/ti-sn65dsi86.c 
> > b/drivers/gpu/drm/bridge/ti-sn65dsi86.c
> > index 5d3b30b2f547..38dcc49eccaf 100644
> > --- a/drivers/gpu/drm/bridge/ti-sn65dsi86.c
> > +++ b/drivers/gpu/drm/bridge/ti-sn65dsi86.c
> > @@ -595,6 +595,15 @@ static struct auxiliary_driver ti_sn_aux_driver = {
> >   .id_table = ti_sn_aux_id_table,
> >  };
> >
> > +static enum drm_mode_status check_mode(const struct drm_display_mode *mode)
> > +{
> > + /* maximum supported resolution is 4K at 60 fps */
> > + if (mode->clock > 594000)
> > + return MODE_CLOCK_HIGH;
> > +
> > + return MODE_OK;
> > +}
> > +
> >  /* 
> > -
> >   * DRM Connector Operations
> >   */
> > @@ -616,11 +625,7 @@ static enum drm_mode_status
> >  ti_sn_bridge_connector_mode_valid(struct drm_connector *connector,
> > struct drm_display_mode *mode)
> >  {
> > - /* maximum supported resolution is 4K at 60 fps */
> > - if (mode->clock > 594000)
> > - return MODE_CLOCK_HIGH;
> > -
> > - return MODE_OK;
> > + return check_mode(mode);
>
> Do we need to implement the connector .mode_valid() operation, given
> that the bridge is linked in the chain ?

My understanding is that we need to keep it for display drivers that
are not converted to NO_CONNECTOR..

But AFAIK snapdragon is the only upstream user of this bridge, so
after the drm/msm/dsi patch lands we could probably garbage collect
the connector support.

BR,
-R

> >  }
> >
> >  static struct drm_connector_helper_funcs 
> > ti_sn_bridge_connector_helper_funcs = {
> > @@ -763,6 +768,14 @@ static void ti_sn_bridge_detach(struct drm_bridge 
> > *bridge)
> >   drm_dp_aux_unregister(&bridge_to_ti_sn65dsi86(bridge)->aux);
> >  }
> >
> > +static enum drm_mode_status
> > +ti_sn_bridge_mode_valid(struct drm_bridge *bridge,
> > + const struct drm_display_info *info,
> > + const struct drm_display_mode *mode)
> > +{
> > + return check_mode(mode);
> > +}
> > +
> >  static void ti_sn_bridge_disable(struct drm_bridge *bridge)
> >  {
> >   struct ti_sn65dsi86 *pdata = bridge_to_ti_sn65dsi86(bridge);
> > @@ -1118,6 +1131,7 @@ static void ti_sn_bridge_post_disable(struct 
> > drm_bridge *bridge)
> >  static const struct drm_bridge_funcs ti_sn_bridge_funcs = {
> >   .attach = ti_sn_bridge_attach,
> >   .detach = ti_sn_bridge_detach,
> > + .mode_valid = ti_sn_bridge_mode_valid,
> >   .pre_enable = ti_sn_bridge_pre_enable,
> >   .enable = ti_sn_bridge_enable,
> >   .disable = ti_sn_bridge_disable,
>
> --
> Regards,
>
> Laurent Pinchart


Re: [PATCH 3/4] drm/bridge: ti-sn65dsi86: Implement bridge->mode_valid()

2021-08-12 Thread Laurent Pinchart
Hi Rob,

Thank you for the patch.

On Wed, Aug 11, 2021 at 04:52:49PM -0700, Rob Clark wrote:
> From: Rob Clark 
> 
> For the brave new world of bridges not creating their own connectors, we
> need to implement the max clock limitation via bridge->mode_valid()
> instead of connector->mode_valid().
> 
> Signed-off-by: Rob Clark 
> ---
>  drivers/gpu/drm/bridge/ti-sn65dsi86.c | 24 +++-
>  1 file changed, 19 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/gpu/drm/bridge/ti-sn65dsi86.c 
> b/drivers/gpu/drm/bridge/ti-sn65dsi86.c
> index 5d3b30b2f547..38dcc49eccaf 100644
> --- a/drivers/gpu/drm/bridge/ti-sn65dsi86.c
> +++ b/drivers/gpu/drm/bridge/ti-sn65dsi86.c
> @@ -595,6 +595,15 @@ static struct auxiliary_driver ti_sn_aux_driver = {
>   .id_table = ti_sn_aux_id_table,
>  };
>  
> +static enum drm_mode_status check_mode(const struct drm_display_mode *mode)
> +{
> + /* maximum supported resolution is 4K at 60 fps */
> + if (mode->clock > 594000)
> + return MODE_CLOCK_HIGH;
> +
> + return MODE_OK;
> +}
> +
>  /* 
> -
>   * DRM Connector Operations
>   */
> @@ -616,11 +625,7 @@ static enum drm_mode_status
>  ti_sn_bridge_connector_mode_valid(struct drm_connector *connector,
> struct drm_display_mode *mode)
>  {
> - /* maximum supported resolution is 4K at 60 fps */
> - if (mode->clock > 594000)
> - return MODE_CLOCK_HIGH;
> -
> - return MODE_OK;
> + return check_mode(mode);

Do we need to implement the connector .mode_valid() operation, given
that the bridge is linked in the chain ?

>  }
>  
>  static struct drm_connector_helper_funcs ti_sn_bridge_connector_helper_funcs 
> = {
> @@ -763,6 +768,14 @@ static void ti_sn_bridge_detach(struct drm_bridge 
> *bridge)
>   drm_dp_aux_unregister(&bridge_to_ti_sn65dsi86(bridge)->aux);
>  }
>  
> +static enum drm_mode_status
> +ti_sn_bridge_mode_valid(struct drm_bridge *bridge,
> + const struct drm_display_info *info,
> + const struct drm_display_mode *mode)
> +{
> + return check_mode(mode);
> +}
> +
>  static void ti_sn_bridge_disable(struct drm_bridge *bridge)
>  {
>   struct ti_sn65dsi86 *pdata = bridge_to_ti_sn65dsi86(bridge);
> @@ -1118,6 +1131,7 @@ static void ti_sn_bridge_post_disable(struct drm_bridge 
> *bridge)
>  static const struct drm_bridge_funcs ti_sn_bridge_funcs = {
>   .attach = ti_sn_bridge_attach,
>   .detach = ti_sn_bridge_detach,
> + .mode_valid = ti_sn_bridge_mode_valid,
>   .pre_enable = ti_sn_bridge_pre_enable,
>   .enable = ti_sn_bridge_enable,
>   .disable = ti_sn_bridge_disable,

-- 
Regards,

Laurent Pinchart


Re: [PATCH 2/4] drm/msm/dsi: Support NO_CONNECTOR bridges

2021-08-12 Thread Rob Clark
On Thu, Aug 12, 2021 at 10:31 AM Sam Ravnborg  wrote:
>
> Hi Rob,
>
> On Wed, Aug 11, 2021 at 04:52:48PM -0700, Rob Clark wrote:
> > From: Rob Clark 
> >
> > For now, since we have a mix of bridges which support this flag, which
> > > do *not* support this flag, or work both ways, try it once with
> > NO_CONNECTOR and then fall back to the old way if that doesn't work.
> > Eventually we can drop the fallback path.
>
> Which bridges are missing support for the NO_CONNECTOR flags that you
> are using?

Currently (as far as I am aware) it is just
ti-sn65dsi86, which the last two patches in this series convert.

But who knows what someone somewhere decides to build ;-)

BR,
-R

> Background for the question is that if you are using one of the bridges
> missing a conversion then we could do the conversion and maybe you could
> help with some tests.
>
> We in the above sentence is likely me, so it may take some weeks,
> but knowing there is a chance for testing is a good motivator.
>
> Sam


Re: [PATCH 2/4] drm/msm/dsi: Support NO_CONNECTOR bridges

2021-08-12 Thread Sam Ravnborg
Hi Rob,

On Wed, Aug 11, 2021 at 04:52:48PM -0700, Rob Clark wrote:
> From: Rob Clark 
> 
> For now, since we have a mix of bridges which support this flag, which
> > do *not* support this flag, or work both ways, try it once with
> NO_CONNECTOR and then fall back to the old way if that doesn't work.
> Eventually we can drop the fallback path.

Which bridges are missing support for the NO_CONNECTOR flags that you
are using?

Background for the question is that if you are using one of the bridges
missing a conversion then we could do the conversion and maybe you could
help with some tests.

We in the above sentence is likely me, so it may take some weeks,
but knowing there is a chance for testing is a good motivator.

Sam


Re: [PATCH 3/4] drm/bridge: ti-sn65dsi86: Implement bridge->mode_valid()

2021-08-12 Thread Doug Anderson
Hi,

On Wed, Aug 11, 2021 at 4:51 PM Rob Clark  wrote:
>
> From: Rob Clark 
>
> For the brave new world of bridges not creating their own connectors, we
> need to implement the max clock limitation via bridge->mode_valid()
> instead of connector->mode_valid().
>
> Signed-off-by: Rob Clark 
> ---
>  drivers/gpu/drm/bridge/ti-sn65dsi86.c | 24 +++-
>  1 file changed, 19 insertions(+), 5 deletions(-)

This looks good to me. I'll plan to land this together with the next
patch into drm-misc-next sometime next week unless someone beats me to
it.

Reviewed-by: Douglas Anderson 


Re: [PATCH 4/4] drm/bridge: ti-sn65dsi86: Add NO_CONNECTOR support

2021-08-12 Thread Doug Anderson
Hi,

On Wed, Aug 11, 2021 at 4:51 PM Rob Clark  wrote:
>
> From: Rob Clark 
>
> Slightly awkward to fish out the display_info when we aren't creating
> our own connector.  But I don't see an obvious better way.
>
> Signed-off-by: Rob Clark 
> ---
>  drivers/gpu/drm/bridge/ti-sn65dsi86.c | 34 +++
>  1 file changed, 29 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/gpu/drm/bridge/ti-sn65dsi86.c 
> b/drivers/gpu/drm/bridge/ti-sn65dsi86.c
> index 38dcc49eccaf..dc8112bab3d3 100644
> --- a/drivers/gpu/drm/bridge/ti-sn65dsi86.c
> +++ b/drivers/gpu/drm/bridge/ti-sn65dsi86.c
> @@ -693,9 +693,11 @@ static int ti_sn_bridge_attach(struct drm_bridge *bridge,
> return ret;
> }
>
> -   ret = ti_sn_bridge_connector_init(pdata);
> -   if (ret < 0)
> -   goto err_conn_init;
> +   if (!(flags & DRM_BRIDGE_ATTACH_NO_CONNECTOR)) {
> +   ret = ti_sn_bridge_connector_init(pdata);
> +   if (ret < 0)
> +   goto err_conn_init;
> +   }
>
> /*
>  * TODO: ideally finding host resource and dsi dev registration needs
> @@ -757,7 +759,8 @@ static int ti_sn_bridge_attach(struct drm_bridge *bridge,
>  err_dsi_attach:
> mipi_dsi_device_unregister(dsi);
>  err_dsi_host:
> -   drm_connector_cleanup(&pdata->connector);
> +   if (!(flags & DRM_BRIDGE_ATTACH_NO_CONNECTOR))
> +   drm_connector_cleanup(>connector);
>  err_conn_init:
> drm_dp_aux_unregister(&pdata->aux);
> return ret;
> @@ -806,9 +809,30 @@ static void ti_sn_bridge_set_dsi_rate(struct 
> ti_sn65dsi86 *pdata)
> regmap_write(pdata->regmap, SN_DSIA_CLK_FREQ_REG, val);
>  }
>
> +/*
> + * Find the connector and fish out the bpc from display_info.  It would
> + * be nice if we could get this instead from drm_bridge_state, but that
> + * doesn't yet appear to be the case.
> + */
>  static unsigned int ti_sn_bridge_get_bpp(struct ti_sn65dsi86 *pdata)
>  {
> -   if (pdata->connector.display_info.bpc <= 6)
> +   struct drm_bridge *bridge = &pdata->bridge;
> +   struct drm_connector_list_iter conn_iter;
> +   struct drm_connector *connector;
> +   unsigned bpc = 0;
> +
> +   drm_connector_list_iter_begin(bridge->dev, &conn_iter);
> +   drm_for_each_connector_iter(connector, &conn_iter) {
> +   if (drm_connector_has_possible_encoder(connector, 
> bridge->encoder)) {
> +   bpc = connector->display_info.bpc;
> +   break;
> +   }
> +   }
> +   drm_connector_list_iter_end(&conn_iter);

This looks reasonable to me. I'll plan to apply it to drm-misc-next
sometime next week to give Laurent a chance to comment on whether this
causes any problems with his planned support for full DP using this
bridge chip. IIUC that means it'll hit mainline 1 rev later, but as
per IRC comments this should be fine.

Reviewed-by: Douglas Anderson 


-Doug


Re: [PATCH 1/4] drm/bridge: ti-sn65dsi86: Avoid creating multiple connectors

2021-08-12 Thread Rob Clark
On Thu, Aug 12, 2021 at 9:55 AM Doug Anderson  wrote:
>
> Hi,
>
> On Wed, Aug 11, 2021 at 4:51 PM Rob Clark  wrote:
> >
> > From: Rob Clark 
> >
> > If we created our own connector because the driver does not support the
> > NO_CONNECTOR flag, we don't want the downstream bridge to *also* create
> > a connector.  And if this driver did pass the NO_CONNECTOR flag (and we
> > supported that mode) this would change nothing.
> >
> > Fixes: 4e5763f03e10 ("drm/bridge: ti-sn65dsi86: Wrap panel with 
> > panel-bridge")
> > Signed-off-by: Rob Clark 
> > ---
> >  drivers/gpu/drm/bridge/ti-sn65dsi86.c | 3 +++
> >  1 file changed, 3 insertions(+)
>
> Reviewed-by: Douglas Anderson 
> Tested-by: Douglas Anderson 
>
> I'm going to apply this one to drm-misc/drm-misc-next and push since
> it's a fix and we're before -rc6, then I'll take a look at the later
> patches in the series.
>

Thanks.. this is the only one with some urgency, the rest can wait
until next cycle.  (And the bridge vs msm patches can land
independently, I've tested the different possible combinations)

BR,
-R


Re: [PATCH 4/4] drm/vgem: use shmem helpers

2021-08-12 Thread Sam Ravnborg
Hi Daniel,

On Thu, Aug 12, 2021 at 03:14:12PM +0200, Daniel Vetter wrote:
> Aside from deleting lots of code the real motivation here is to switch
> the mmap over to VM_PFNMAP, to be more consistent with what real gpu
> drivers do. They're all VM_PFNMP, which means get_user_pages doesn't
> work, and even if you try and there's a struct page behind that,
> touching it and mucking around with its refcount can upset drivers
> real bad.
The only thing I understood of all this complicated stuff was "deleting
lots of code" which is a good thing.
You may want to s/VM_PFNMP/VM_PFNMAP/ before you push this.

Sam
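
To illustrate the VM_PFNMAP point quoted above, here is a small,
self-contained sketch with hypothetical names (not the vgem or shmem-helper
code): a PFN-mapped mmap inserts raw page frame numbers on fault, so there
is no struct page for get_user_pages() to find and refcount.

struct my_buffer {
	unsigned long base_pfn;		/* first PFN of the backing memory */
	unsigned long num_pages;
};

static vm_fault_t my_buffer_fault(struct vm_fault *vmf)
{
	struct my_buffer *buf = vmf->vma->vm_private_data;

	if (vmf->pgoff >= buf->num_pages)
		return VM_FAULT_SIGBUS;

	/* Map by PFN: no struct page reference is taken or needed. */
	return vmf_insert_pfn(vmf->vma, vmf->address,
			      buf->base_pfn + vmf->pgoff);
}

static const struct vm_operations_struct my_buffer_vm_ops = {
	.fault = my_buffer_fault,
};

static int my_buffer_mmap(struct my_buffer *buf, struct vm_area_struct *vma)
{
	vma->vm_flags |= VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP;
	vma->vm_ops = &my_buffer_vm_ops;
	vma->vm_private_data = buf;
	return 0;
}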


Re: [PATCH 1/4] drm/bridge: ti-sn65dsi86: Avoid creating multiple connectors

2021-08-12 Thread Doug Anderson
Hi,

On Wed, Aug 11, 2021 at 4:51 PM Rob Clark  wrote:
>
> From: Rob Clark 
>
> If we created our own connector because the driver does not support the
> NO_CONNECTOR flag, we don't want the downstream bridge to *also* create
> a connector.  And if this driver did pass the NO_CONNECTOR flag (and we
> supported that mode) this would change nothing.
>
> Fixes: 4e5763f03e10 ("drm/bridge: ti-sn65dsi86: Wrap panel with panel-bridge")
> Signed-off-by: Rob Clark 
> ---
>  drivers/gpu/drm/bridge/ti-sn65dsi86.c | 3 +++
>  1 file changed, 3 insertions(+)

Reviewed-by: Douglas Anderson 
Tested-by: Douglas Anderson 

I'm going to apply this one to drm-misc/drm-misc-next and push since
it's a fix and we're before -rc6, then I'll take a look at the later
patches in the series.

-Doug


Re: [PATCH 2/2] drm/amdgpu: Use mod_delayed_work in JPEG/UVD/VCE/VCN ring_end_use hooks

2021-08-12 Thread Michel Dänzer
On 2021-08-12 1:33 p.m., Lazar, Lijo wrote:
> On 8/12/2021 1:41 PM, Michel Dänzer wrote:
>> On 2021-08-12 7:55 a.m., Koenig, Christian wrote:
>>> Hi James,
>>>
>>> Evan seems to have understood how this all works together.
>>>
>>> See while any begin/end use critical section is active the work should not 
>>> be active.
>>>
>>> When you handle only one ring you can just call cancel in begin use and 
>>> schedule in end use. But when you have more than one ring you need a lock 
>>> or counter to prevent concurrent work items to be started.
>>>
>>> Michel's idea to use mod_delayed_work is a bad one because it assumes 
>>> that the delayed work is still running.
>>
>> It merely assumes that the work may already have been scheduled before.
>>
>> Admittedly, I missed the cancel_delayed_work_sync calls for patch 2. While I 
>> think it can still have some effect when there's a single work item for 
>> multiple rings, as described by James, it's probably negligible, since 
>> presumably the time intervals between ring_begin_use and ring_end_use are 
>> normally much shorter than a second.
>>
>> So, while patch 2 is at worst a no-op (since mod_delayed_work is the same as 
>> schedule_delayed_work if the work hasn't been scheduled yet), I'm fine with 
>> dropping it.
>>
>>
>>> Something similar applies to the first patch I think,
>>
>> There are no cancel work calls in that case, so the commit log is accurate 
>> TTBOMK.
> 
> Curious -
> 
> For patch 1, does it make a difference if any delayed work scheduled is 
> cancelled in the else part before proceeding?
> 
> } else if (!enable && adev->gfx.gfx_off_state) {
> cancel_delayed_work();

I tried the patch below.

While this does seem to fix the problem as well, I see a potential issue:

1. amdgpu_gfx_off_ctrl locks adev->gfx.gfx_off_mutex
2. amdgpu_device_delay_enable_gfx_off runs, blocks in mutex_lock
3. amdgpu_gfx_off_ctrl calls cancel_delayed_work_sync

I'm afraid this would deadlock? (CONFIG_PROVE_LOCKING doesn't complain though)


Maybe it's possible to fix it with cancel_delayed_work_sync somehow, but I'm 
not sure how offhand. (With cancel_delayed_work instead, I'm worried 
amdgpu_device_delay_enable_gfx_off might still enable GFXOFF in the HW 
immediately after amdgpu_gfx_off_ctrl unlocks the mutex. Then again, that might 
happen with mod_delayed_work as well...)
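
A simplified, hypothetical sketch of the ordering concern (not the actual
amdgpu code): the control path holds the mutex while waiting for the worker
to finish, and the worker cannot finish without taking that same mutex.

static DEFINE_MUTEX(gfx_off_mutex);
static struct delayed_work gfx_off_work;

static void gfx_off_worker(struct work_struct *work)
{
	mutex_lock(&gfx_off_mutex);	/* (2) blocks while the ctrl path holds it */
	/* ... enable GFXOFF in the HW ... */
	mutex_unlock(&gfx_off_mutex);
}

static void gfx_off_disable(void)
{
	mutex_lock(&gfx_off_mutex);			/* (1) */
	/*
	 * (3) waits for a running gfx_off_worker() to complete, but the
	 * worker is stuck in (2) waiting for this mutex -> deadlock.
	 */
	cancel_delayed_work_sync(&gfx_off_work);
	/* ... disable GFXOFF ... */
	mutex_unlock(&gfx_off_mutex);
}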


diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c

index a0be0772c8b3..3e4585ffb9af 100644

--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c

+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c

@@ -570,8 +570,11 @@ void amdgpu_gfx_off_ctrl(struct amdgpu_device *adev, bool 
enable)



if (enable && !adev->gfx.gfx_off_state && !adev->gfx.gfx_off_req_count) 
{

schedule_delayed_work(&adev->gfx.gfx_off_delay_work, 
GFX_OFF_DELAY_ENABLE);

-   } else if (!enable && adev->gfx.gfx_off_state) {

-   if (!amdgpu_dpm_set_powergating_by_smu(adev, 
AMD_IP_BLOCK_TYPE_GFX, false)) {

+   } else if (!enable) {

+   cancel_delayed_work_sync(&adev->gfx.gfx_off_delay_work);

+

+   if (adev->gfx.gfx_off_state &&

+   !amdgpu_dpm_set_powergating_by_smu(adev, 
AMD_IP_BLOCK_TYPE_GFX, false)) {

adev->gfx.gfx_off_state = false;



if (adev->gfx.funcs->init_spm_golden) {



-- 
Earthling Michel Dänzer   |   https://redhat.com
Libre software enthusiast | Mesa and X developer


[pull] drm/msm: drm-msm-next-2021-08-12 for v5.15

2021-08-12 Thread Rob Clark
Hi Dave & Daniel,

This is the main pull for v5.15, after the early pull request with
drm/scheduler conversion:

* New a6xx GPU support: a680 and 7c3
* dsi: 7nm phi, sc7280 support, test pattern generator support
* mdp4 fixes for older hw like the nexus7
* displayport fixes

There will be a minor conflict, not with merging into drm-next (in its
current state) but when that gets merged with fixes from the v5.14
cycle.  The resolution that Stephen Rothwell made in linux-next is
correct:

-
diff --cc drivers/gpu/drm/msm/dp/dp_display.c
index 867388a399ad,419fd4a14cbf..
--- a/drivers/gpu/drm/msm/dp/dp_display.c
+++ b/drivers/gpu/drm/msm/dp/dp_display.c
@@@ -1312,10 -1314,10 +1315,14 @@@ static int dp_pm_resume(struct device *
else
dp->dp_display.is_connected = false;

 +  dp_display_handle_plugged_change(g_dp_display,
 +  dp->dp_display.is_connected);
 +
 +
+   DRM_DEBUG_DP("After, sink_count=%d is_connected=%d
core_inited=%d power_on=%d\n",
+   dp->link->sink_count, dp->dp_display.is_connected,
+   dp->core_initialized, dp_display->power_on);
+
mutex_unlock(&dp->event_mutex);

return 0;
-

The following changes since commit 4541e4f2225c30b0e9442be9eb2fb8b7086cdd1f:

  drm/msm/gem: Mark active before pinning (2021-07-28 09:19:00 -0700)

are available in the Git repository at:

  https://gitlab.freedesktop.org/drm/msm.git drm-msm-next-2021-08-12

for you to fetch changes up to cb0927ab80d224c9074f53d1a55b087d12ec5a85:

  drm/msi/mdp4: populate priv->kms in mdp4_kms_init (2021-08-11 10:57:28 -0700)


Abhinav Kumar (2):
  drm/msm/dsi: update dsi register header file for tpg
  drm/msm/dsi: add support for dsi test pattern generator

Akhil P Oommen (3):
  drm/msm/a6xx: Fix llcc configuration for a660 gpu
  drm/msm/a6xx: Use rev to identify SKU
  drm/msm/a6xx: Add support for Adreno 7c Gen 3 gpu

Baokun Li (1):
  drm/msm: Use list_move_tail instead of list_del/list_add_tail in msm_gem.c

Bjorn Andersson (1):
  drm: msm: Add 680 gpu to the adreno gpu list

Christophe JAILLET (1):
  drm/msm/dsi: Fix some reference counted resource leaks

David Heidelberg (4):
  drm/msm/mdp4: refactor HW revision detection into read_mdp_hw_revision
  drm/msm/mdp4: move HW revision detection to earlier phase
  drm/msm: mdp4: drop vblank get/put from prepare/complete_commit
  drm/msi/mdp4: populate priv->kms in mdp4_kms_init

Dmitry Baryshkov (14):
  drm/msm/dsi: drop gdsc regulator handling
  drm/msm/dsi: phy: use of_device_get_match_data
  drm/msm/dsi: drop msm_dsi_phy_get_shared_timings
  drm/msm/dsi: rename dual DSI to bonded DSI
  drm/msm/dsi: add three helper functions
  drm/msm/dpu: support setting up two independent DSI connectors
  drm/msm/mdp5: move mdp5_encoder_set_intf_mode after msm_dsi_modeset_init
  drm/msm/dp: stop calling set_encoder_mode callback
  drm/msm/dsi: stop calling set_encoder_mode callback
  drm/msm/kms: drop set_encoder_mode callback
  drm/msm/dpu: add support for alpha blending properties
  drm/msm/dpu: make dpu_hw_ctl_clear_all_blendstages clear necessary LMs
  dt-bindings: display: msm: dsi-controller-main: restore assigned-clocks
  drm/msm/dsi: add continuous clock support for 7nm PHY

Douglas Anderson (1):
  drm/msm: Use nvmem_cell_read_variable_le_u32() to read speed bin

Guo Zhengkui (1):
  drm/msm: remove a repeated including of 

Jonathan Marek (3):
  dt-bindings: msm: dsi: add missing 7nm bindings
  dt-bindings: msm: dsi: document phy-type property for 7nm dsi phy
  drm/msm/dsi: support CPHY mode for 7nm pll/phy

Kalyan Thota (1):
  drm/msm/disp/dpu1: add safe lut config in dpu driver

Konrad Dybcio (1):
  drm/msm/dsi: Fix DSI and DSI PHY regulator config from SDM660

Kuogee Hsieh (8):
  drm/msm/dp: update is_connected status base on sink count at
dp_pm_resume()
  drm/msm/dp: use dp_ctrl_off_link_stream during PHY compliance test run
  drm/msm/dp: reduce link rate if failed at link training 1
  drm/msm/dp: reset aux controller after dp_aux_cmd_fifo_tx() failed.
  drm/msm/dp: replug event is converted into an unplug followed by
an plug events
  drm/msm/dp: return correct edid checksum after corrupted edid
checksum read
  drm/msm/dp: do not end dp link training until video is ready
  drm/msm/dp: add drm debug logs to dp_pm_resume/suspend

Maitreyee Rao (1):
  drm/msm/dp: add logs across DP driver for ease of debugging

Rajeev Nandan (3):
  dt-bindings: msm/dsi: Add sc7280 7nm dsi phy
  drm/msm/dsi: Add PHY configuration for SC7280
  drm/msm/dsi: Add DSI support for SC7280

Rob Clark (3):
  drm/msm: Periodically update RPTR shadow
  drm/msm: Add adreno_is_a640_family()
  drm/msm: Rework SQE version check

Souptick 

Re: [PATCH 2/4] drm/msm/dsi: Support NO_CONNECTOR bridges

2021-08-12 Thread Laurent Pinchart
Hi Rob

Thank you for the patch.

On Wed, Aug 11, 2021 at 04:52:48PM -0700, Rob Clark wrote:
> From: Rob Clark 
> 
> For now, since we have a mix of bridges which support this flag, which
> do *not* support this flag, or work both ways, try it once with
> NO_CONNECTOR and then fall back to the old way if that doesn't work.
> Eventually we can drop the fallback path.
> 
> Signed-off-by: Rob Clark 
> ---
>  drivers/gpu/drm/msm/Kconfig   |  2 ++
>  drivers/gpu/drm/msm/dsi/dsi_manager.c | 41 ++-
>  2 files changed, 30 insertions(+), 13 deletions(-)
> 
> diff --git a/drivers/gpu/drm/msm/Kconfig b/drivers/gpu/drm/msm/Kconfig
> index e9c6af78b1d7..36e5ba3ccc28 100644
> --- a/drivers/gpu/drm/msm/Kconfig
> +++ b/drivers/gpu/drm/msm/Kconfig
> @@ -14,6 +14,8 @@ config DRM_MSM
>   select REGULATOR
>   select DRM_KMS_HELPER
>   select DRM_PANEL
> + select DRM_BRIDGE
> + select DRM_PANEL_BRIDGE
>   select DRM_SCHED
>   select SHMEM
>   select TMPFS
> diff --git a/drivers/gpu/drm/msm/dsi/dsi_manager.c 
> b/drivers/gpu/drm/msm/dsi/dsi_manager.c
> index c41d39f5b7cf..1fd1cf93abbf 100644
> --- a/drivers/gpu/drm/msm/dsi/dsi_manager.c
> +++ b/drivers/gpu/drm/msm/dsi/dsi_manager.c
> @@ -3,6 +3,8 @@
>   * Copyright (c) 2015, The Linux Foundation. All rights reserved.
>   */
>  
> +#include "drm/drm_bridge_connector.h"
> +
>  #include "msm_kms.h"
>  #include "dsi.h"
>  
> @@ -690,8 +692,7 @@ struct drm_connector *msm_dsi_manager_ext_bridge_init(u8 
> id)
>   struct drm_device *dev = msm_dsi->dev;
>   struct drm_encoder *encoder;
>   struct drm_bridge *int_bridge, *ext_bridge;
> - struct drm_connector *connector;
> - struct list_head *connector_list;
> + int ret;
>  
>   int_bridge = msm_dsi->bridge;
>   ext_bridge = msm_dsi->external_bridge =
> @@ -699,22 +700,36 @@ struct drm_connector 
> *msm_dsi_manager_ext_bridge_init(u8 id)
>  
>   encoder = msm_dsi->encoder;
>  
> - /* link the internal dsi bridge to the external bridge */
> - drm_bridge_attach(encoder, ext_bridge, int_bridge, 0);
> -
>   /*
> -  * we need the drm_connector created by the external bridge
> -  * driver (or someone else) to feed it to our driver's
> -  * priv->connector[] list, mainly for msm_fbdev_init()
> +  * Try first to create the bridge without it creating it's own

s/it's/its/

> +  * connector.. currently some bridges support this, and others
> +  * do not (and some support both modes)
>*/
> - connector_list = &dev->mode_config.connector_list;
> + ret = drm_bridge_attach(encoder, ext_bridge, int_bridge,
> + DRM_BRIDGE_ATTACH_NO_CONNECTOR);

Should all this be moved one layer up, to the code that attaches to the
msm_dsi->bridge ? I suppose we can start here, but as part of a global
move to bridges and DRM_BRIDGE_ATTACH_NO_CONNECTOR, I think the
top-level would make more sense in the long term.

If you want to start here,

Reviewed-by: Laurent Pinchart 

> + if (ret == -EINVAL) {
> + struct drm_connector *connector;
> + struct list_head *connector_list;
> +
> + /* link the internal dsi bridge to the external bridge */
> + drm_bridge_attach(encoder, ext_bridge, int_bridge, 0);
> +
> + /*
> +  * we need the drm_connector created by the external bridge
> +  * driver (or someone else) to feed it to our driver's
> +  * priv->connector[] list, mainly for msm_fbdev_init()
> +  */
> +  connector_list = &dev->mode_config.connector_list;
> +
> + list_for_each_entry(connector, connector_list, head) {
> + if (drm_connector_has_possible_encoder(connector, 
> encoder))
> + return connector;
> + }
>  
> - list_for_each_entry(connector, connector_list, head) {
> - if (drm_connector_has_possible_encoder(connector, encoder))
> - return connector;
> + return ERR_PTR(-ENODEV);
>   }
>  
> - return ERR_PTR(-ENODEV);
> + return drm_bridge_connector_init(dev, encoder);
>  }
>  
>  void msm_dsi_manager_bridge_destroy(struct drm_bridge *bridge)

-- 
Regards,

Laurent Pinchart


Re: [PATCH 1/4] drm/bridge: ti-sn65dsi86: Avoid creating multiple connectors

2021-08-12 Thread Laurent Pinchart
Hi Rob,

Thank you for the patch.

On Wed, Aug 11, 2021 at 04:52:47PM -0700, Rob Clark wrote:
> From: Rob Clark 
> 
> If we created our own connector because the driver does not support the
> NO_CONNECTOR flag, we don't want the downstream bridge to *also* create
> a connector.  And if this driver did pass the NO_CONNECTOR flag (and we
> supported that mode) this would change nothing.
> 
> Fixes: 4e5763f03e10 ("drm/bridge: ti-sn65dsi86: Wrap panel with panel-bridge")
> Signed-off-by: Rob Clark 

Makes complete sense.

Reviewed-by: Laurent Pinchart 

> ---
>  drivers/gpu/drm/bridge/ti-sn65dsi86.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/gpu/drm/bridge/ti-sn65dsi86.c 
> b/drivers/gpu/drm/bridge/ti-sn65dsi86.c
> index 9bf889302bcc..5d3b30b2f547 100644
> --- a/drivers/gpu/drm/bridge/ti-sn65dsi86.c
> +++ b/drivers/gpu/drm/bridge/ti-sn65dsi86.c
> @@ -736,6 +736,9 @@ static int ti_sn_bridge_attach(struct drm_bridge *bridge,
>   }
>   pdata->dsi = dsi;
>  
> + /* We never want the next bridge to *also* create a connector: */
> + flags |= DRM_BRIDGE_ATTACH_NO_CONNECTOR;
> +
>   /* Attach the next bridge */
>   ret = drm_bridge_attach(bridge->encoder, pdata->next_bridge,
>   &pdata->bridge, flags);

-- 
Regards,

Laurent Pinchart


Re: [PATCH v2 0/6] eDP: Support probing eDP panels dynamically instead of hardcoding

2021-08-12 Thread Doug Anderson
Hi,

On Thu, Aug 12, 2021 at 2:38 AM Sam Ravnborg  wrote:
>
> Hi Doug,
> On Mon, Aug 09, 2021 at 03:18:03PM -0700, Doug Anderson wrote:
> > Hi,
> >
> > On Tue, Aug 3, 2021 at 1:41 PM Sam Ravnborg  wrote:
> > >
> > > Hi Douglas,
> > >
> > > On Fri, Jul 30, 2021 at 02:26:19PM -0700, Douglas Anderson wrote:
> > > > The goal of this patch series is to move away from hardcoding exact
> > > > eDP panels in device tree files. As discussed in the various patches
> > > > in this series (I'm not repeating everything here), most eDP panels
> > > > are 99% probable and we can get that last 1% by allowing two "power
> > > > up" delays to be specified in the device tree file and then using the
> > > > panel ID (found in the EDID) to look up additional power sequencing
> > > > delays for the panel.
> > >
> > > Have you considered a new driver for edp panels?
> > > panel-edp.c?
> > >
> > > There will be some duplicate code from panel-simple - but the same can
> > > be said of the other panel drivers too.
> > > In the end I think it is better to separate them so we end up with two
> > > less complex panel drivers rather than one do-it-all panel driver.
> > >
> > > I have not looked in detail how this would look like, but my first
> > > impression is that we should split it out.
> >
> > I certainly could, but my argument against it is that really it's the
> > exact same set of eDP panels that would be supported by both drivers.
>
> The idea was to move all eDP panels to the new driver.
>
> My hope is that we can make panel-simple handle a much more narrow set
> of panels. eDP capable displays are IMO not simple panels.

Ah! OK, this makes sense. I can work on this, though it might be a
short while before I can. I think moving all eDP panels out of
panel-simple.c to something like panel-simple-edp.c makes sense. It
will be a patch that will be very hard to cherry-pick anywhere since
it will conflict with everything but it should be doable.


> Likewise DSI capable panels could also be pulled out of panel-simple.

At the moment I haven't done much with DSI panels so I might leave
them in panel-simple for now. I'll evaluate to see how nasty it would
be for me to try this.


> This would continue to duplicate some code - but we have a lot of
> duplicated code across the various panels and the best way forward
> would be to implement more helpers that can be used by the drivers.
>
> Sam - who is trying to recover from the deadly man flu...

Feel better!

-Doug


[PULL] drm-intel-fixes

2021-08-12 Thread Rodrigo Vivi
Hi Dave and Daniel,

Here goes drm-intel-fixes-2021-08-12:

- GVT fix for Windows VM hang.
- Display fix of 12 BPC bits for display 12 and newer.
- Don't try to access some media register for fused off domains.
- Fix kerneldoc build warnings.

Thanks,
Rodrigo.

The following changes since commit 36a21d51725af2ce0700c6ebcb6b9594aac658a6:

  Linux 5.14-rc5 (2021-08-08 13:49:31 -0700)

are available in the Git repository at:

  git://anongit.freedesktop.org/drm/drm-intel tags/drm-intel-fixes-2021-08-12

for you to fetch changes up to ffd5caa26f6afde0c1e3ed126806607748a83c6e:

  drm/doc/rfc: drop lmem uapi section (2021-08-12 06:05:45 -0400)


- GVT fix for Windows VM hang.
- Display fix of 12 BPC bits for display 12 and newer.
- Don't try to access some media register for fused off domains.
- Fix kerneldoc build warnings.


Ankit Nautiyal (1):
  drm/i915/display: Fix the 12 BPC bits for PIPE_MISC reg

Daniel Vetter (1):
  drm/doc/rfc: drop lmem uapi section

Matt Roper (1):
  drm/i915: Only access SFC_DONE when media domain is not fused off

Rodrigo Vivi (1):
  Merge tag 'gvt-fixes-2021-08-10' of https://github.com/intel/gvt-linux 
into drm-intel-fixes

Zhenyu Wang (1):
  drm/i915/gvt: Fix cached atomics setting for Windows VM

 Documentation/gpu/rfc/i915_gem_lmem.rst  | 109 ---
 drivers/gpu/drm/i915/display/intel_display.c |  34 ++---
 drivers/gpu/drm/i915/gvt/handlers.c  |   1 +
 drivers/gpu/drm/i915/gvt/mmio_context.c  |   2 +
 drivers/gpu/drm/i915/i915_gpu_error.c|  19 -
 drivers/gpu/drm/i915/i915_reg.h  |  16 ++--
 6 files changed, 56 insertions(+), 125 deletions(-)


Re: [PATCH] drm/i915: Use locked access to ctx->engines in set_priority

2021-08-12 Thread Daniel Vetter
On Thu, Aug 12, 2021 at 5:10 PM Jason Ekstrand  wrote:
> On Tue, Aug 10, 2021 at 8:05 AM Daniel Vetter  wrote:
> >
> > This essentially reverts
> >
> > commit 89ff76bf9b3b0b86e6bbe344bd6378d8661303fc
> > Author: Chris Wilson 
> > Date:   Thu Apr 2 13:42:18 2020 +0100
> >
> > drm/i915/gem: Utilize rcu iteration of context engines
> >
> > Note that the other uses of __context_engines_await have disappeared in
> > the following commits:
> >
> > ccbc1b97948a ("drm/i915/gem: Don't allow changing the VM on running 
> > contexts (v4)")
> > c7a71fc8ee04 ("drm/i915: Drop getparam support for 
> > I915_CONTEXT_PARAM_ENGINES")
> > 4a766ae40ec8 ("drm/i915: Drop the CONTEXT_CLONE API (v2)")
> >
> > None of these have any business optimizing their engine lookup with
> > rcu, unless there is extremely convincing benchmark data and a solid
> > analysis of why we can't make that workload (whatever it is) faster
> > with a proper design fix.
> >
> > Also since there's only one caller of context_apply_all left and it's
> > really just a loop, inline it and then inline the loop body too. This
> > is how all other callers that take the engine lock loop over engines;
> > it's much simpler.
> >
> > Signed-off-by: Daniel Vetter 
> > Cc: Chris Wilson 
> > Cc: Mika Kuoppala 
> > Cc: Daniel Vetter 
> > Cc: Jason Ekstrand 
> > Cc: Tvrtko Ursulin 
> > Cc: Joonas Lahtinen 
> > Cc: Matthew Brost 
> > ---
> >  drivers/gpu/drm/i915/gem/i915_gem_context.c | 72 -
> >  1 file changed, 14 insertions(+), 58 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
> > b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > index dbaeb924a437..fd169cf2f75a 100644
> > --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > @@ -1284,49 +1284,6 @@ static int __context_set_persistence(struct 
> > i915_gem_context *ctx, bool state)
> > return 0;
> >  }
> >
> > -static inline struct i915_gem_engines *
> > -__context_engines_await(const struct i915_gem_context *ctx,
> > -   bool *user_engines)
> > -{
> > -   struct i915_gem_engines *engines;
> > -
> > -   rcu_read_lock();
> > -   do {
> > -   engines = rcu_dereference(ctx->engines);
> > -   GEM_BUG_ON(!engines);
> > -
> > -   if (user_engines)
> > -   *user_engines = i915_gem_context_user_engines(ctx);
> > -
> > -   /* successful await => strong mb */
> > -   if (unlikely(!i915_sw_fence_await(>fence)))
>
> Ugh... The first time I looked at this I thought the SW fence meant it
> was actually waiting on something.  But, no, it's just making sure the
> engines object still exists.  *sigh*  Burn it!

... why did you force me to page this in again, I already forgot.

> Reviewed-by: Jason Ekstrand 

Merged to drm-intel-gt-next, thanks for the review.
-Daniel

>
> > -   continue;
> > -
> > -   if (likely(engines == rcu_access_pointer(ctx->engines)))
> > -   break;
> > -
> > -   i915_sw_fence_complete(>fence);
> > -   } while (1);
> > -   rcu_read_unlock();
> > -
> > -   return engines;
> > -}
> > -
> > -static void
> > -context_apply_all(struct i915_gem_context *ctx,
> > - void (*fn)(struct intel_context *ce, void *data),
> > - void *data)
> > -{
> > -   struct i915_gem_engines_iter it;
> > -   struct i915_gem_engines *e;
> > -   struct intel_context *ce;
> > -
> > -   e = __context_engines_await(ctx, NULL);
> > -   for_each_gem_engine(ce, e, it)
> > -   fn(ce, data);
> > -   i915_sw_fence_complete(>fence);
> > -}
> > -
> >  static struct i915_gem_context *
> >  i915_gem_create_context(struct drm_i915_private *i915,
> > const struct i915_gem_proto_context *pc)
> > @@ -1776,23 +1733,11 @@ set_persistence(struct i915_gem_context *ctx,
> > return __context_set_persistence(ctx, args->value);
> >  }
> >
> > -static void __apply_priority(struct intel_context *ce, void *arg)
> > -{
> > -   struct i915_gem_context *ctx = arg;
> > -
> > -   if (!intel_engine_has_timeslices(ce->engine))
> > -   return;
> > -
> > -   if (ctx->sched.priority >= I915_PRIORITY_NORMAL &&
> > -   intel_engine_has_semaphores(ce->engine))
> > -   intel_context_set_use_semaphores(ce);
> > -   else
> > -   intel_context_clear_use_semaphores(ce);
> > -}
> > -
> >  static int set_priority(struct i915_gem_context *ctx,
> > const struct drm_i915_gem_context_param *args)
> >  {
> > +   struct i915_gem_engines_iter it;
> > +   struct intel_context *ce;
> > int err;
> >
> > err = validate_priority(ctx->i915, args);
> > @@ -1800,7 +1745,18 @@ static int set_priority(struct i915_gem_context *ctx,
> > return err;
> >
> > ctx->sched.priority = 
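
For readers, the quoted diff is cut off by the archive. The locked-iteration
shape that set_priority() ends up with, reconstructed from the visible hunks
plus the standard i915_gem_context_lock_engines()/unlock_engines() pairing,
looks roughly like this (a sketch, not the exact final hunk):

    static int set_priority(struct i915_gem_context *ctx,
                            const struct drm_i915_gem_context_param *args)
    {
            struct i915_gem_engines_iter it;
            struct intel_context *ce;
            int err;

            err = validate_priority(ctx->i915, args);
            if (err)
                    return err;

            ctx->sched.priority = args->value;

            /* Walk the engines under the context lock instead of going
             * through the old RCU-based context_apply_all() helper.
             */
            for_each_gem_engine(ce, i915_gem_context_lock_engines(ctx), it) {
                    if (!intel_engine_has_timeslices(ce->engine))
                            continue;

                    if (ctx->sched.priority >= I915_PRIORITY_NORMAL &&
                        intel_engine_has_semaphores(ce->engine))
                            intel_context_set_use_semaphores(ce);
                    else
                            intel_context_clear_use_semaphores(ce);
            }
            i915_gem_context_unlock_engines(ctx);

            return 0;
    }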

Re: [PATCH 5/9] drm/i915/guc: Flush the work queue for GuC generated G2H

2021-08-12 Thread Matthew Brost
On Thu, Aug 12, 2021 at 04:11:28PM +0200, Daniel Vetter wrote:
> On Wed, Aug 11, 2021 at 01:16:18AM +, Matthew Brost wrote:
> > Flush the work queue for GuC generated G2H messages during a GT reset.
> > This is accomplished by spinning on the list of outstanding G2H until
> > it goes empty.
> > 
> > Fixes: eb5e7da736f3 ("drm/i915/guc: Reset implementation for new GuC 
> > interface")
> > Signed-off-by: Matthew Brost 
> > Cc: 
> > ---
> >  drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c | 5 +
> >  1 file changed, 5 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c 
> > b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> > index 3cd2da6f5c03..e5eb2df11b4a 100644
> > --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> > +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> > @@ -727,6 +727,11 @@ void intel_guc_submission_reset_prepare(struct 
> > intel_guc *guc)
> > wait_for_reset(guc, >outstanding_submission_g2h);
> > } while (!list_empty(>ct.requests.incoming));
> > }
> > +
> > +   /* Flush any GuC generated G2H */
> > +   while (!list_empty(>ct.requests.incoming))
> > +   msleep(20);
> 
> flush_work or flush_workqueue, because that comes with lockdep
> annotations. Don't hand-roll stuff like this if at all possible.

lockdep puked when I used that.

Matt

> -Daniel
> 
> > +
> > scrub_guc_desc_for_outstanding_g2h(guc);
> >  }
> >  
> > -- 
> > 2.32.0
> > 
> 
> -- 
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch
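
For reference, the flush_work()/flush_workqueue() alternative suggested above
would replace the polling loop with a flush of whatever work item drains
ct.requests.incoming; the worker field name below is an assumption for
illustration, not a statement about the actual struct layout:

    /* Hand-rolled polling, as in the patch: */
    while (!list_empty(&guc->ct.requests.incoming))
            msleep(20);

    /* Suggested shape: flush the worker that consumes the list, which also
     * lets lockdep reason about reset-path vs. worker-path dependencies.
     */
    flush_work(&guc->ct.requests.worker);

(As the follow-up notes, lockdep complained when this was tried, which is the
sort of issue those annotations exist to flag.)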


Re: [PATCH] drm/i915: Use locked access to ctx->engines in set_priority

2021-08-12 Thread Jason Ekstrand
On Tue, Aug 10, 2021 at 8:05 AM Daniel Vetter  wrote:
>
> This essentially reverts
>
> commit 89ff76bf9b3b0b86e6bbe344bd6378d8661303fc
> Author: Chris Wilson 
> Date:   Thu Apr 2 13:42:18 2020 +0100
>
> drm/i915/gem: Utilize rcu iteration of context engines
>
> Note that the other uses of __context_engines_await have disappeared in
> the following commits:
>
> ccbc1b97948a ("drm/i915/gem: Don't allow changing the VM on running contexts 
> (v4)")
> c7a71fc8ee04 ("drm/i915: Drop getparam support for 
> I915_CONTEXT_PARAM_ENGINES")
> 4a766ae40ec8 ("drm/i915: Drop the CONTEXT_CLONE API (v2)")
>
> None of these have any business optimizing their engine lookup with
> rcu, unless there is extremely convincing benchmark data and a solid
> analysis of why we can't make that workload (whatever it is) faster
> with a proper design fix.
>
> Also since there's only one caller of context_apply_all left and it's
> really just a loop, inline it and then inline the loop body too. This
> is how all other callers that take the engine lock loop over engines;
> it's much simpler.
>
> Signed-off-by: Daniel Vetter 
> Cc: Chris Wilson 
> Cc: Mika Kuoppala 
> Cc: Daniel Vetter 
> Cc: Jason Ekstrand 
> Cc: Tvrtko Ursulin 
> Cc: Joonas Lahtinen 
> Cc: Matthew Brost 
> ---
>  drivers/gpu/drm/i915/gem/i915_gem_context.c | 72 -
>  1 file changed, 14 insertions(+), 58 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
> b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> index dbaeb924a437..fd169cf2f75a 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> @@ -1284,49 +1284,6 @@ static int __context_set_persistence(struct 
> i915_gem_context *ctx, bool state)
> return 0;
>  }
>
> -static inline struct i915_gem_engines *
> -__context_engines_await(const struct i915_gem_context *ctx,
> -   bool *user_engines)
> -{
> -   struct i915_gem_engines *engines;
> -
> -   rcu_read_lock();
> -   do {
> -   engines = rcu_dereference(ctx->engines);
> -   GEM_BUG_ON(!engines);
> -
> -   if (user_engines)
> -   *user_engines = i915_gem_context_user_engines(ctx);
> -
> -   /* successful await => strong mb */
> -   if (unlikely(!i915_sw_fence_await(>fence)))

Ugh... The first time I looked at this I thought the SW fence meant it
was actually waiting on something.  But, no, it's just making sure the
engines object still exists.  *sigh*  Burn it!

Reviewed-by: Jason Ekstrand 

> -   continue;
> -
> -   if (likely(engines == rcu_access_pointer(ctx->engines)))
> -   break;
> -
> -   i915_sw_fence_complete(>fence);
> -   } while (1);
> -   rcu_read_unlock();
> -
> -   return engines;
> -}
> -
> -static void
> -context_apply_all(struct i915_gem_context *ctx,
> - void (*fn)(struct intel_context *ce, void *data),
> - void *data)
> -{
> -   struct i915_gem_engines_iter it;
> -   struct i915_gem_engines *e;
> -   struct intel_context *ce;
> -
> -   e = __context_engines_await(ctx, NULL);
> -   for_each_gem_engine(ce, e, it)
> -   fn(ce, data);
> -   i915_sw_fence_complete(>fence);
> -}
> -
>  static struct i915_gem_context *
>  i915_gem_create_context(struct drm_i915_private *i915,
> const struct i915_gem_proto_context *pc)
> @@ -1776,23 +1733,11 @@ set_persistence(struct i915_gem_context *ctx,
> return __context_set_persistence(ctx, args->value);
>  }
>
> -static void __apply_priority(struct intel_context *ce, void *arg)
> -{
> -   struct i915_gem_context *ctx = arg;
> -
> -   if (!intel_engine_has_timeslices(ce->engine))
> -   return;
> -
> -   if (ctx->sched.priority >= I915_PRIORITY_NORMAL &&
> -   intel_engine_has_semaphores(ce->engine))
> -   intel_context_set_use_semaphores(ce);
> -   else
> -   intel_context_clear_use_semaphores(ce);
> -}
> -
>  static int set_priority(struct i915_gem_context *ctx,
> const struct drm_i915_gem_context_param *args)
>  {
> +   struct i915_gem_engines_iter it;
> +   struct intel_context *ce;
> int err;
>
> err = validate_priority(ctx->i915, args);
> @@ -1800,7 +1745,18 @@ static int set_priority(struct i915_gem_context *ctx,
> return err;
>
> ctx->sched.priority = args->value;
> -   context_apply_all(ctx, __apply_priority, ctx);
> +
> +   for_each_gem_engine(ce, i915_gem_context_lock_engines(ctx), it) {
> +   if (!intel_engine_has_timeslices(ce->engine))
> +   continue;
> +
> +   if (ctx->sched.priority >= I915_PRIORITY_NORMAL &&
> +   intel_engine_has_semaphores(ce->engine))
> +   intel_context_set_use_semaphores(ce);
> 

Re: [Intel-gfx] [PATCH 16/46] drm/i915/guc: Implement GuC parent-child context pin / unpin functions

2021-08-12 Thread Daniel Vetter
On Thu, Aug 12, 2021 at 4:45 PM Daniel Vetter  wrote:
>
> On Wed, Aug 11, 2021 at 06:06:36PM +, Matthew Brost wrote:
> > On Tue, Aug 10, 2021 at 11:07:55AM +0200, Daniel Vetter wrote:
> > > On Tue, Aug 10, 2021 at 10:53:37AM +0200, Daniel Vetter wrote:
> > > > On Mon, Aug 09, 2021 at 06:58:23PM +, Matthew Brost wrote:
> > > > > On Mon, Aug 09, 2021 at 05:17:34PM +0200, Daniel Vetter wrote:
> > > > > > On Tue, Aug 03, 2021 at 03:29:13PM -0700, Matthew Brost wrote:
> > > > > > > Implement GuC parent-child context pin / unpin functions in which,
> > > > > > > if any context in the relationship is pinned, all the contexts are
> > > > > > > pinned. The parent owns most of the pinning / unpinning process and
> > > > > > > the children direct any pins / unpins to the parent.
> > > > > > >
> > > > > > > Patch implements a number of unused functions that will be 
> > > > > > > connected
> > > > > > > later in the series.
> > > > > > >
> > > > > > > Signed-off-by: Matthew Brost 
> > > > > > > ---
> > > > > > >  drivers/gpu/drm/i915/gt/intel_context.c   | 187 
> > > > > > > --
> > > > > > >  drivers/gpu/drm/i915/gt/intel_context.h   |  43 +---
> > > > > > >  drivers/gpu/drm/i915/gt/intel_context_types.h |   4 +-
> > > > > > >  .../drm/i915/gt/intel_execlists_submission.c  |  25 ++-
> > > > > > >  drivers/gpu/drm/i915/gt/intel_lrc.c   |  26 +--
> > > > > > >  drivers/gpu/drm/i915/gt/intel_lrc.h   |   6 +-
> > > > > > >  .../gpu/drm/i915/gt/intel_ring_submission.c   |   5 +-
> > > > > > >  drivers/gpu/drm/i915/gt/mock_engine.c |   4 +-
> > > > > > >  .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 183 
> > > > > > > +++--
> > > > > > >  9 files changed, 371 insertions(+), 112 deletions(-)
> > > > > > >
> > > > > > > diff --git a/drivers/gpu/drm/i915/gt/intel_context.c 
> > > > > > > b/drivers/gpu/drm/i915/gt/intel_context.c
> > > > > > > index 8cb92b10b547..bb4c14656067 100644
> > > > > > > --- a/drivers/gpu/drm/i915/gt/intel_context.c
> > > > > > > +++ b/drivers/gpu/drm/i915/gt/intel_context.c
> > > > > > > @@ -158,8 +158,8 @@ static void __ring_retire(struct intel_ring 
> > > > > > > *ring)
> > > > > > > intel_ring_unpin(ring);
> > > > > > >  }
> > > > > > >
> > > > > > > -static int intel_context_pre_pin(struct intel_context *ce,
> > > > > > > -struct i915_gem_ww_ctx *ww)
> > > > > > > +static int __intel_context_pre_pin(struct intel_context *ce,
> > > > > > > +  struct i915_gem_ww_ctx *ww)
> > > > > > >  {
> > > > > > > int err;
> > > > > > >
> > > > > > > @@ -190,7 +190,7 @@ static int intel_context_pre_pin(struct 
> > > > > > > intel_context *ce,
> > > > > > > return err;
> > > > > > >  }
> > > > > > >
> > > > > > > -static void intel_context_post_unpin(struct intel_context *ce)
> > > > > > > +static void __intel_context_post_unpin(struct intel_context *ce)
> > > > > > >  {
> > > > > > > if (ce->state)
> > > > > > > __context_unpin_state(ce->state);
> > > > > > > @@ -199,13 +199,85 @@ static void intel_context_post_unpin(struct 
> > > > > > > intel_context *ce)
> > > > > > > __ring_retire(ce->ring);
> > > > > > >  }
> > > > > > >
> > > > > > > -int __intel_context_do_pin_ww(struct intel_context *ce,
> > > > > > > - struct i915_gem_ww_ctx *ww)
> > > > > > > +static int intel_context_pre_pin(struct intel_context *ce,
> > > > > > > +struct i915_gem_ww_ctx *ww)
> > > > > > >  {
> > > > > > > -   bool handoff = false;
> > > > > > > -   void *vaddr;
> > > > > > > +   struct intel_context *child;
> > > > > > > +   int err, i = 0;
> > > > > > > +
> > > > > > > +   GEM_BUG_ON(intel_context_is_child(ce));
> > > > > > > +
> > > > > > > +   for_each_child(ce, child) {
> > > > > > > +   err = __intel_context_pre_pin(child, ww);
> > > > > > > +   if (unlikely(err))
> > > > > > > +   goto unwind;
> > > > > > > +   ++i;
> > > > > > > +   }
> > > > > > > +
> > > > > > > +   err = __intel_context_pre_pin(ce, ww);
> > > > > > > +   if (unlikely(err))
> > > > > > > +   goto unwind;
> > > > > > > +
> > > > > > > +   return 0;
> > > > > > > +
> > > > > > > +unwind:
> > > > > > > +   for_each_child(ce, child) {
> > > > > > > +   if (!i--)
> > > > > > > +   break;
> > > > > > > +   __intel_context_post_unpin(ce);
> > > > > > > +   }
> > > > > > > +
> > > > > > > +   return err;
> > > > > > > +}
> > > > > > > +
> > > > > > > +static void intel_context_post_unpin(struct intel_context *ce)
> > > > > > > +{
> > > > > > > +   struct intel_context *child;
> > > > > > > +
> > > > > > > +   GEM_BUG_ON(intel_context_is_child(ce));
> > > > > > > +
> > > > > > > +   for_each_child(ce, child)
> > > > > > > +   __intel_context_post_unpin(child);
> > > > > > > +
> > > > > > > +   __intel_context_post_unpin(ce);
> > 

Re: [PATCH] drm/edid: fix edid field name

2021-08-12 Thread Simon Ser
Pushed to drm-misc-next, thanks!


Re: [PATCH 2/2] drm/i915: Add pci ids and uapi for DG1

2021-08-12 Thread Daniel Vetter
On Thu, Aug 12, 2021 at 2:44 PM Maarten Lankhorst
 wrote:
>
> DG1 has support for local memory, which requires the usage of the
> lmem placement extension for creating bo's, and memregion queries
> to obtain the size. Because of this, those parts of the uapi are
> no longer guarded behind FAKE_LMEM.
>
> According to the pull request referenced below, mesa should be mostly
> ready for DG1. VK_EXT_memory_budget is not hooked up yet, but we
> should definitely just enable the uapi parts by default.
>
> Signed-off-by: Maarten Lankhorst 
> References: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11584
> Cc: Jordan Justen jordan.l.jus...@intel.com
> Cc: Jason Ekstrand ja...@jlekstrand.net

Acked-by: Daniel Vetter 

> ---
>  drivers/gpu/drm/i915/gem/i915_gem_create.c | 3 ---
>  drivers/gpu/drm/i915/i915_pci.c| 1 +
>  drivers/gpu/drm/i915/i915_query.c  | 3 ---
>  3 files changed, 1 insertion(+), 6 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_create.c 
> b/drivers/gpu/drm/i915/gem/i915_gem_create.c
> index 23fee13a3384..1d341b8c47c0 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_create.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_create.c
> @@ -347,9 +347,6 @@ static int ext_set_placements(struct i915_user_extension 
> __user *base,
>  {
> struct drm_i915_gem_create_ext_memory_regions ext;
>
> -   if (!IS_ENABLED(CONFIG_DRM_I915_UNSTABLE_FAKE_LMEM))
> -   return -ENODEV;
> -
> if (copy_from_user(, base, sizeof(ext)))
> return -EFAULT;
>
> diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
> index 1bbd09ad5287..93ccdc6bbd03 100644
> --- a/drivers/gpu/drm/i915/i915_pci.c
> +++ b/drivers/gpu/drm/i915/i915_pci.c
> @@ -1115,6 +1115,7 @@ static const struct pci_device_id pciidlist[] = {
> INTEL_RKL_IDS(_info),
> INTEL_ADLS_IDS(_s_info),
> INTEL_ADLP_IDS(_p_info),
> +   INTEL_DG1_IDS(_info),
> {0, 0, 0}
>  };
>  MODULE_DEVICE_TABLE(pci, pciidlist);
> diff --git a/drivers/gpu/drm/i915/i915_query.c 
> b/drivers/gpu/drm/i915/i915_query.c
> index e49da36c62fb..5e2b909827f4 100644
> --- a/drivers/gpu/drm/i915/i915_query.c
> +++ b/drivers/gpu/drm/i915/i915_query.c
> @@ -432,9 +432,6 @@ static int query_memregion_info(struct drm_i915_private 
> *i915,
> u32 total_length;
> int ret, id, i;
>
> -   if (!IS_ENABLED(CONFIG_DRM_I915_UNSTABLE_FAKE_LMEM))
> -   return -ENODEV;
> -
> if (query_item->flags != 0)
> return -EINVAL;
>
> --
> 2.32.0
>


-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [Intel-gfx] [PATCH 16/46] drm/i915/guc: Implement GuC parent-child context pin / unpin functions

2021-08-12 Thread Daniel Vetter
On Wed, Aug 11, 2021 at 06:06:36PM +, Matthew Brost wrote:
> On Tue, Aug 10, 2021 at 11:07:55AM +0200, Daniel Vetter wrote:
> > On Tue, Aug 10, 2021 at 10:53:37AM +0200, Daniel Vetter wrote:
> > > On Mon, Aug 09, 2021 at 06:58:23PM +, Matthew Brost wrote:
> > > > On Mon, Aug 09, 2021 at 05:17:34PM +0200, Daniel Vetter wrote:
> > > > > On Tue, Aug 03, 2021 at 03:29:13PM -0700, Matthew Brost wrote:
> > > > > > Implement GuC parent-child context pin / unpin functions in which,
> > > > > > if any context in the relationship is pinned, all the contexts are
> > > > > > pinned. The parent owns most of the pinning / unpinning process and
> > > > > > the children direct any pins / unpins to the parent.
> > > > > > 
> > > > > > Patch implements a number of unused functions that will be connected
> > > > > > later in the series.
> > > > > > 
> > > > > > Signed-off-by: Matthew Brost 
> > > > > > ---
> > > > > >  drivers/gpu/drm/i915/gt/intel_context.c   | 187 
> > > > > > --
> > > > > >  drivers/gpu/drm/i915/gt/intel_context.h   |  43 +---
> > > > > >  drivers/gpu/drm/i915/gt/intel_context_types.h |   4 +-
> > > > > >  .../drm/i915/gt/intel_execlists_submission.c  |  25 ++-
> > > > > >  drivers/gpu/drm/i915/gt/intel_lrc.c   |  26 +--
> > > > > >  drivers/gpu/drm/i915/gt/intel_lrc.h   |   6 +-
> > > > > >  .../gpu/drm/i915/gt/intel_ring_submission.c   |   5 +-
> > > > > >  drivers/gpu/drm/i915/gt/mock_engine.c |   4 +-
> > > > > >  .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 183 
> > > > > > +++--
> > > > > >  9 files changed, 371 insertions(+), 112 deletions(-)
> > > > > > 
> > > > > > diff --git a/drivers/gpu/drm/i915/gt/intel_context.c 
> > > > > > b/drivers/gpu/drm/i915/gt/intel_context.c
> > > > > > index 8cb92b10b547..bb4c14656067 100644
> > > > > > --- a/drivers/gpu/drm/i915/gt/intel_context.c
> > > > > > +++ b/drivers/gpu/drm/i915/gt/intel_context.c
> > > > > > @@ -158,8 +158,8 @@ static void __ring_retire(struct intel_ring 
> > > > > > *ring)
> > > > > > intel_ring_unpin(ring);
> > > > > >  }
> > > > > >  
> > > > > > -static int intel_context_pre_pin(struct intel_context *ce,
> > > > > > -struct i915_gem_ww_ctx *ww)
> > > > > > +static int __intel_context_pre_pin(struct intel_context *ce,
> > > > > > +  struct i915_gem_ww_ctx *ww)
> > > > > >  {
> > > > > > int err;
> > > > > >  
> > > > > > @@ -190,7 +190,7 @@ static int intel_context_pre_pin(struct 
> > > > > > intel_context *ce,
> > > > > > return err;
> > > > > >  }
> > > > > >  
> > > > > > -static void intel_context_post_unpin(struct intel_context *ce)
> > > > > > +static void __intel_context_post_unpin(struct intel_context *ce)
> > > > > >  {
> > > > > > if (ce->state)
> > > > > > __context_unpin_state(ce->state);
> > > > > > @@ -199,13 +199,85 @@ static void intel_context_post_unpin(struct 
> > > > > > intel_context *ce)
> > > > > > __ring_retire(ce->ring);
> > > > > >  }
> > > > > >  
> > > > > > -int __intel_context_do_pin_ww(struct intel_context *ce,
> > > > > > - struct i915_gem_ww_ctx *ww)
> > > > > > +static int intel_context_pre_pin(struct intel_context *ce,
> > > > > > +struct i915_gem_ww_ctx *ww)
> > > > > >  {
> > > > > > -   bool handoff = false;
> > > > > > -   void *vaddr;
> > > > > > +   struct intel_context *child;
> > > > > > +   int err, i = 0;
> > > > > > +
> > > > > > +   GEM_BUG_ON(intel_context_is_child(ce));
> > > > > > +
> > > > > > +   for_each_child(ce, child) {
> > > > > > +   err = __intel_context_pre_pin(child, ww);
> > > > > > +   if (unlikely(err))
> > > > > > +   goto unwind;
> > > > > > +   ++i;
> > > > > > +   }
> > > > > > +
> > > > > > +   err = __intel_context_pre_pin(ce, ww);
> > > > > > +   if (unlikely(err))
> > > > > > +   goto unwind;
> > > > > > +
> > > > > > +   return 0;
> > > > > > +
> > > > > > +unwind:
> > > > > > +   for_each_child(ce, child) {
> > > > > > +   if (!i--)
> > > > > > +   break;
> > > > > > +   __intel_context_post_unpin(ce);
> > > > > > +   }
> > > > > > +
> > > > > > +   return err;
> > > > > > +}
> > > > > > +
> > > > > > +static void intel_context_post_unpin(struct intel_context *ce)
> > > > > > +{
> > > > > > +   struct intel_context *child;
> > > > > > +
> > > > > > +   GEM_BUG_ON(intel_context_is_child(ce));
> > > > > > +
> > > > > > +   for_each_child(ce, child)
> > > > > > +   __intel_context_post_unpin(child);
> > > > > > +
> > > > > > +   __intel_context_post_unpin(ce);
> > > > > > +}
> > > > > > +
> > > > > > +static int __do_ww_lock(struct intel_context *ce,
> > > > > > +   struct i915_gem_ww_ctx *ww)
> > > > > > +{
> > > > > > +   int err = i915_gem_object_lock(ce->timeline->hwsp_ggtt->obj, 
> > > > > > ww);
> > > > > > +
> > > > > > +   if 

Re: [PATCH 5/9] drm/i915/guc: Flush the work queue for GuC generated G2H

2021-08-12 Thread Daniel Vetter
On Wed, Aug 11, 2021 at 01:16:18AM +, Matthew Brost wrote:
> Flush the work queue for GuC generated G2H messages during a GT reset.
> This is accomplished by spinning on the list of outstanding G2H until
> it goes empty.
> 
> Fixes: eb5e7da736f3 ("drm/i915/guc: Reset implementation for new GuC 
> interface")
> Signed-off-by: Matthew Brost 
> Cc: 
> ---
>  drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c | 5 +
>  1 file changed, 5 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c 
> b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> index 3cd2da6f5c03..e5eb2df11b4a 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> @@ -727,6 +727,11 @@ void intel_guc_submission_reset_prepare(struct intel_guc 
> *guc)
>   wait_for_reset(guc, >outstanding_submission_g2h);
>   } while (!list_empty(>ct.requests.incoming));
>   }
> +
> + /* Flush any GuC generated G2H */
> + while (!list_empty(>ct.requests.incoming))
> + msleep(20);

flush_work or flush_workqueue, because that comes with lockdep
annotations. Don't hand-roll stuff like this if at all possible.
-Daniel

> +
>   scrub_guc_desc_for_outstanding_g2h(guc);
>  }
>  
> -- 
> 2.32.0
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [Intel-gfx] [PATCH 46/46] drm/i915/guc: Add delay before disabling scheduling on contexts

2021-08-12 Thread Daniel Vetter
On Wed, Aug 11, 2021 at 05:43:23PM +, Matthew Brost wrote:
> On Wed, Aug 11, 2021 at 11:55:48AM +0200, Daniel Vetter wrote:
> > On Mon, Aug 09, 2021 at 07:32:26PM +, Matthew Brost wrote:
> > > On Mon, Aug 09, 2021 at 07:17:27PM +0200, Daniel Vetter wrote:
> > > > On Tue, Aug 03, 2021 at 03:29:43PM -0700, Matthew Brost wrote:
> > > > > Some workloads use lots of contexts that continually pin / unpin
> > > > > contexts. With GuC submission an unpin translates to a schedule 
> > > > > disable
> > > > > H2G which puts pressure on both the i915 and GuC. A schedule disable 
> > > > > can
> > > > > also block future requests from being submitted until the operation
> > > > > completes. None of this is ideal.
> > > > > 
> > > > > Add a configurable, via debugfs, delay period before the schedule
> > > > > disable is issued. Default delay period is 1 second. The delay period 
> > > > > is
> > > > > skipped if more than 3/4 of the guc_ids are in use.
> > > > > 
> > > > > This patch also updates the selftests to turn off this delay period as
> > > > > this extra time would likely cause many selftests to fail. Follow up
> > > > > patches will fix all the selftests and enable the delay period.
> > > > > 
> > > > > Signed-off-by: Matthew Brost 
> > > > 
> > > > I think this is more evidence that we should just pin/unpin context at
> > > > create/destruction time. The current scheme doesn't really work that 
> > > > well
> > > > and causes way more pain than benefits it seems.
> > > > 
> > > 
> > > Well that choice is above my pay grade, but for what it is worth it
> > > would simplify the GuC backend quite a bit if we perma-pin contexts. By
> > > quite a bit, I actually mean a lot of complexity goes away.
> > > 
> > > In the meantime I think we probably need this code though to avoid
> > > trashes on the scheduling enable / disable.
> > 
> > The trouble is that you muck around with the context close state bit,
> 
> This really doesn't mess with this bit any more than what is already
> there, it just adds a callback to the backend.
> 
> > which is one of these lockless trickeries where my cursory analysis (just
> > a few days in total of randomly stumbling over it when reading other code)
> > strongly suggests it's busted.
> > 
> > I really don't want to build more on top, especially not without careful
> > review and all that.
> > 
> > Also since this is a perf claim, the commit message needs some numbers.
> >
> 
> This was basically just visual inspection of ftrace of a media workload
> that uses lots of contexts. The contexts were repeatedly pinned /
> unpinned. Disabling / enabling scheduling is a rather expensive
> operation so we really shouldn't be doing it all the time. We visually
> observed an ftrace after this change and all this unnecessary traffic
> went away.

That's the kind of stuff that should be included in the commit message,
ideally with some numbers (like how many you manage to remove or whatever
metric it is you picked; something quick done with grep and line
counting is good enough).

> > Finally even if we decide to make contexts properly evictable, we need a
> > different scheme anyway. As you realized the current active tracking is
> > kinda backwards because it unpins immediately when no longer in use.
> 
> Right, this basically just works around the fact that contexts are
> immediately unpinned when not in use. As stated before if we perma-pin
> contexts all this goes away.

Yeah, sounds all good then. Well, more or less.
-Daniel

> 
> Matt
> 
> > -Daniel
> > 
> > > 
> > > Matt
> > > 
> > > > If anyone screams, and that's a big if aside of some igts, we can come 
> > > > up
> > > > with a proper scheme to evict contexts without pin/unpin and layer hacks
> > > > over that misdesign.
> > > > -Daniel
> > > > 
> > > > > ---
> > > > >  drivers/gpu/drm/i915/gem/i915_gem_context.c   |   2 +-
> > > > >  .../i915/gem/selftests/i915_gem_coherency.c   |   2 +-
> > > > >  .../drm/i915/gem/selftests/i915_gem_dmabuf.c  |   2 +-
> > > > >  .../drm/i915/gem/selftests/i915_gem_mman.c|   2 +-
> > > > >  .../drm/i915/gem/selftests/i915_gem_object.c  |   2 +-
> > > > >  drivers/gpu/drm/i915/gt/intel_context.c   |   2 +
> > > > >  drivers/gpu/drm/i915/gt/intel_context.h   |   9 +
> > > > >  drivers/gpu/drm/i915/gt/intel_context_types.h |   8 +
> > > > >  drivers/gpu/drm/i915/gt/uc/intel_guc.h|   7 +
> > > > >  .../gpu/drm/i915/gt/uc/intel_guc_debugfs.c|  28 ++
> > > > >  .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 322 
> > > > > +-
> > > > >  .../i915/gt/uc/selftest_guc_flow_control.c|  19 +-
> > > > >  drivers/gpu/drm/i915/i915_selftest.h  |   2 +
> > > > >  drivers/gpu/drm/i915/i915_trace.h |  10 +
> > > > >  drivers/gpu/drm/i915/selftests/i915_gem_gtt.c |   2 +-
> > > > >  drivers/gpu/drm/i915/selftests/i915_perf.c|   2 +-
> > > > >  drivers/gpu/drm/i915/selftests/i915_request.c |   2 +-
> > > > >  drivers/gpu/drm/i915/selftests/i915_vma.c
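
For readers, the diffstat above is truncated by the archive. The mechanism
being debated amounts to deferring the schedule-disable H2G with a delayed
work item instead of issuing it on every unpin; a hypothetical sketch of the
idea (all names below are invented for illustration and do not reflect the
actual patch):

    /* unpin path: defer the schedule disable */
    static void sched_disable_deferred(struct intel_context *ce)
    {
            if (too_many_guc_ids_in_use(ce_to_guc(ce)))
                    issue_schedule_disable(ce);     /* skip the delay */
            else
                    schedule_delayed_work(&ce->sched_disable_work,
                                          msecs_to_jiffies(1000));
    }

    /* pin path: reuse within the window cancels the pending disable */
    static void sched_disable_cancel(struct intel_context *ce)
    {
            cancel_delayed_work(&ce->sched_disable_work);
    }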

[PATCH] video: fbdev: au1200fb: Make use of dma_mmap_coherent()

2021-08-12 Thread Cai Huoqing
Replace dma_mmap_attrs() with dma_mmap_coherent().
Also fix the indentation.

Signed-off-by: Cai Huoqing 
---
 drivers/video/fbdev/au1200fb.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/video/fbdev/au1200fb.c b/drivers/video/fbdev/au1200fb.c
index c00e01a17368..81c315454428 100644
--- a/drivers/video/fbdev/au1200fb.c
+++ b/drivers/video/fbdev/au1200fb.c
@@ -1233,8 +1233,8 @@ static int au1200fb_fb_mmap(struct fb_info *info, struct 
vm_area_struct *vma)
 {
struct au1200fb_device *fbdev = info->par;
 
-   return dma_mmap_attrs(fbdev->dev, vma, fbdev->fb_mem, fbdev->fb_phys,
-   fbdev->fb_len, 0);
+   return dma_mmap_coherent(fbdev->dev, vma,
+fbdev->fb_mem, fbdev->fb_phys, fbdev->fb_len);
 }
 
 static void set_global(u_int cmd, struct au1200_lcd_global_regs_t *pdata)
-- 
2.25.1
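
For reference, dma_mmap_coherent() is just dma_mmap_attrs() with attrs == 0;
in include/linux/dma-mapping.h it boils down to roughly:

    #define dma_mmap_coherent(d, v, c, h, s)  dma_mmap_attrs(d, v, c, h, s, 0)

so the change above is behaviour-preserving: the old call already passed 0
for the attrs argument.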



[PATCH 4/4] drm/vgem: use shmem helpers

2021-08-12 Thread Daniel Vetter
Aside from deleting lots of code the real motivation here is to switch
the mmap over to VM_PFNMAP, to be more consistent with what real gpu
drivers do. They're all VM_PFNMAP, which means get_user_pages doesn't
work, and even if you try and there's a struct page behind that,
touching it and mucking around with its refcount can upset drivers
real bad.

v2: Review from Thomas:
- sort #include
- drop more dead code that I didn't spot somehow

v3: select DRM_GEM_SHMEM_HELPER to make it build (intel-gfx-ci)

v4: I got tricked by 0cf2ef46c6c0 ("drm/shmem-helper: Use cached
mappings by default"), and we need WC in vgem because vgem doesn't
have explicit begin/end cpu access ioctls.

Also add a comment why exactly vgem has to use wc.

v5: Don't set obj->base.funcs, it will default to drm_gem_shmem_funcs
(Thomas)

v6: vgem also needs an MMU for remapping

v7: I absolutely butchered the rebases over the vgem mmap change and
revert and broke the patch. Actually go back to v6 from before the
vgem mmap changes.

Cc: Thomas Zimmermann 
Acked-by: Thomas Zimmermann 
Cc: John Stultz 
Cc: Sumit Semwal 
Cc: "Christian König" 
Signed-off-by: Daniel Vetter 
Cc: Melissa Wen 
Cc: Chris Wilson 
---
 drivers/gpu/drm/Kconfig |   5 +-
 drivers/gpu/drm/vgem/vgem_drv.c | 342 ++--
 2 files changed, 16 insertions(+), 331 deletions(-)

diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
index 314eefa39892..28f7d2006e8b 100644
--- a/drivers/gpu/drm/Kconfig
+++ b/drivers/gpu/drm/Kconfig
@@ -272,7 +272,8 @@ source "drivers/gpu/drm/kmb/Kconfig"
 
 config DRM_VGEM
tristate "Virtual GEM provider"
-   depends on DRM
+   depends on DRM && MMU
+   select DRM_GEM_SHMEM_HELPER
help
  Choose this option to get a virtual graphics memory manager,
  as used by Mesa's software renderer for enhanced performance.
@@ -280,7 +281,7 @@ config DRM_VGEM
 
 config DRM_VKMS
tristate "Virtual KMS (EXPERIMENTAL)"
-   depends on DRM
+   depends on DRM && MMU
select DRM_KMS_HELPER
select DRM_GEM_SHMEM_HELPER
select CRC32
diff --git a/drivers/gpu/drm/vgem/vgem_drv.c b/drivers/gpu/drm/vgem/vgem_drv.c
index bf38a7e319d1..a87eafa89e9f 100644
--- a/drivers/gpu/drm/vgem/vgem_drv.c
+++ b/drivers/gpu/drm/vgem/vgem_drv.c
@@ -38,6 +38,7 @@
 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -50,87 +51,11 @@
 #define DRIVER_MAJOR   1
 #define DRIVER_MINOR   0
 
-static const struct drm_gem_object_funcs vgem_gem_object_funcs;
-
 static struct vgem_device {
struct drm_device drm;
struct platform_device *platform;
 } *vgem_device;
 
-static void vgem_gem_free_object(struct drm_gem_object *obj)
-{
-   struct drm_vgem_gem_object *vgem_obj = to_vgem_bo(obj);
-
-   kvfree(vgem_obj->pages);
-   mutex_destroy(_obj->pages_lock);
-
-   if (obj->import_attach)
-   drm_prime_gem_destroy(obj, vgem_obj->table);
-
-   drm_gem_object_release(obj);
-   kfree(vgem_obj);
-}
-
-static vm_fault_t vgem_gem_fault(struct vm_fault *vmf)
-{
-   struct vm_area_struct *vma = vmf->vma;
-   struct drm_vgem_gem_object *obj = vma->vm_private_data;
-   /* We don't use vmf->pgoff since that has the fake offset */
-   unsigned long vaddr = vmf->address;
-   vm_fault_t ret = VM_FAULT_SIGBUS;
-   loff_t num_pages;
-   pgoff_t page_offset;
-   page_offset = (vaddr - vma->vm_start) >> PAGE_SHIFT;
-
-   num_pages = DIV_ROUND_UP(obj->base.size, PAGE_SIZE);
-
-   if (page_offset >= num_pages)
-   return VM_FAULT_SIGBUS;
-
-   mutex_lock(>pages_lock);
-   if (obj->pages) {
-   get_page(obj->pages[page_offset]);
-   vmf->page = obj->pages[page_offset];
-   ret = 0;
-   }
-   mutex_unlock(>pages_lock);
-   if (ret) {
-   struct page *page;
-
-   page = shmem_read_mapping_page(
-   file_inode(obj->base.filp)->i_mapping,
-   page_offset);
-   if (!IS_ERR(page)) {
-   vmf->page = page;
-   ret = 0;
-   } else switch (PTR_ERR(page)) {
-   case -ENOSPC:
-   case -ENOMEM:
-   ret = VM_FAULT_OOM;
-   break;
-   case -EBUSY:
-   ret = VM_FAULT_RETRY;
-   break;
-   case -EFAULT:
-   case -EINVAL:
-   ret = VM_FAULT_SIGBUS;
-   break;
-   default:
-   WARN_ON(PTR_ERR(page));
-   ret = VM_FAULT_SIGBUS;
-   break;
-   }
-
-   }
-   return ret;
-}
-
-static const struct 
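
For illustration, after a conversion like this the GEM boilerplate in a
driver usually collapses to the shmem helper defaults. A rough sketch of what
that looks like in a drm_driver (generic example, not the exact vgem hunk,
which is truncated above):

    DEFINE_DRM_GEM_FOPS(example_fops);

    static const struct drm_driver example_driver = {
            .driver_features = DRIVER_GEM | DRIVER_RENDER,
            .fops = &example_fops,
            /* Wires dumb_create and the PRIME import/export hooks up to the
             * shmem-backed implementations.
             */
            DRM_GEM_SHMEM_DRIVER_OPS,
            .name = "example",
            .desc = "illustrative shmem-helper driver",
    };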

[PATCH 3/4] drm/shmem-helpers: Allocate wc pages on x86

2021-08-12 Thread Daniel Vetter
intel-gfx-ci realized that something is not quite coherent anymore on
some platforms for our i915+vgem tests, when I tried to switch vgem
over to shmem helpers.

After lots of head-scratching I realized that I've removed calls to
drm_clflush. And we need those. To make this a bit cleaner use the
same page allocation tooling as ttm, which does internally clflush
(and more, as needed on any platform instead of just the intel x86
cpus i915 can be combined with).

Unfortunately this doesn't exist on arm, or as a generic feature. For
that I think only the dma-api can get at wc memory reliably, so maybe
we'd need some kind of GFP_WC flag to do this properly.

v2: Add a TODO comment about what should be done to support this in
other places (Thomas)

Acked-by: Thomas Zimmermann 
Signed-off-by: Daniel Vetter 
Cc: Christian König 
Cc: "Thomas Hellström" 
Cc: Maarten Lankhorst 
Cc: Maxime Ripard 
Cc: Thomas Zimmermann 
Cc: David Airlie 
Cc: Daniel Vetter 
---
 drivers/gpu/drm/drm_gem_shmem_helper.c | 19 +++
 1 file changed, 19 insertions(+)

diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c 
b/drivers/gpu/drm/drm_gem_shmem_helper.c
index cc96d1c3570e..0e0986dfbe0c 100644
--- a/drivers/gpu/drm/drm_gem_shmem_helper.c
+++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
@@ -10,6 +10,10 @@
 #include 
 #include 
 
+#ifdef CONFIG_X86
+#include 
+#endif
+
 #include 
 #include 
 #include 
@@ -162,6 +166,16 @@ static int drm_gem_shmem_get_pages_locked(struct 
drm_gem_shmem_object *shmem)
return PTR_ERR(pages);
}
 
+   /*
+* TODO: Allocating WC pages which are correctly flushed is only
+* supported on x86. Ideal solution would be a GFP_WC flag, which also
+* ttm_pool.c could use.
+*/
+#ifdef CONFIG_X86
+   if (shmem->map_wc)
+   set_pages_array_wc(pages, obj->size >> PAGE_SHIFT);
+#endif
+
shmem->pages = pages;
 
return 0;
@@ -203,6 +217,11 @@ static void drm_gem_shmem_put_pages_locked(struct 
drm_gem_shmem_object *shmem)
if (--shmem->pages_use_count > 0)
return;
 
+#ifdef CONFIG_X86
+   if (shmem->map_wc)
+   set_pages_array_wb(shmem->pages, obj->size >> PAGE_SHIFT);
+#endif
+
drm_gem_put_pages(obj, shmem->pages,
  shmem->pages_mark_dirty_on_put,
  shmem->pages_mark_accessed_on_put);
-- 
2.32.0



[PATCH 2/4] drm/shmem-helper: Switch to vmf_insert_pfn

2021-08-12 Thread Daniel Vetter
We want to stop gup, which isn't the case if we use vmf_insert_page
and VM_MIXEDMAP, because that does not set pte_special.

The motivation here is to stop get_user_pages from working on buffer
object mmaps in general. Quoting some discussion with Thomas:

On Thu, Jul 22, 2021 at 08:22:43PM +0200, Thomas Zimmermann wrote:
> Am 13.07.21 um 22:51 schrieb Daniel Vetter:
> > We want to stop gup, which isn't the case if we use vmf_insert_page
>
> What is gup?

get_user_pages. It pins memory wherever it is, which badly wrecks at least
ttm and could also cause trouble with cma allocations. In both cases
because we can't move/reuse these pages anymore.

Now get_user_pages fails when the memory isn't considered "normal", like
with VM_PFNMAP and using vm_insert_pfn. For consistency across all dma-buf
I'm trying (together with Christian König) to roll this out everywhere,
for fewer surprises.

E.g. for 5.14 iirc we merged a patch to do the same for ttm, where it
closes an actual bug (ttm gets really badly confused when there's suddenly
pinned pages where it thought it can move them).

cma allocations already use VM_PFNMAP (because that's what dma_mmap is
using underneath), as is anything that's using remap_pfn_range. Worst case
we have to revert this patch for shmem helpers if it breaks something, but
I hope that's not the case. On the ttm side we've also had some fallout
that we needed to paper over with clever tricks.
v2: With this shmem gem helpers now definitely need CONFIG_MMU (0day)

v3: add more depends on MMU. For usb drivers this is a bit awkward,
but really it's correct: To be able to provide a contig mapping of
buffers to userspace on !MMU platforms we'd need to use the cma
helpers for these drivers on those platforms. As-is this won't work.

Also not exactly sure why vm_insert_page doesn't go boom, because that
definitely won't fly in practice since the pages are non-contig to
begin with.

v4: Explain the entire motivation a lot more (Thomas)

Acked-by: Thomas Zimmermann 
Signed-off-by: Daniel Vetter 
Cc: Maarten Lankhorst 
Cc: Maxime Ripard 
Cc: Thomas Zimmermann 
Cc: David Airlie 
Cc: Daniel Vetter 
---
 drivers/gpu/drm/Kconfig| 2 +-
 drivers/gpu/drm/drm_gem_shmem_helper.c | 4 ++--
 drivers/gpu/drm/gud/Kconfig| 2 +-
 drivers/gpu/drm/tiny/Kconfig   | 4 ++--
 drivers/gpu/drm/udl/Kconfig| 1 +
 5 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
index 0d372354c2d0..314eefa39892 100644
--- a/drivers/gpu/drm/Kconfig
+++ b/drivers/gpu/drm/Kconfig
@@ -211,7 +211,7 @@ config DRM_KMS_CMA_HELPER
 
 config DRM_GEM_SHMEM_HELPER
bool
-   depends on DRM
+   depends on DRM && MMU
help
  Choose this if you need the GEM shmem helper functions
 
diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c 
b/drivers/gpu/drm/drm_gem_shmem_helper.c
index a61946374c82..cc96d1c3570e 100644
--- a/drivers/gpu/drm/drm_gem_shmem_helper.c
+++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
@@ -542,7 +542,7 @@ static vm_fault_t drm_gem_shmem_fault(struct vm_fault *vmf)
} else {
page = shmem->pages[page_offset];
 
-   ret = vmf_insert_page(vma, vmf->address, page);
+   ret = vmf_insert_pfn(vma, vmf->address, page_to_pfn(page));
}
 
mutex_unlock(>pages_lock);
@@ -612,7 +612,7 @@ int drm_gem_shmem_mmap(struct drm_gem_object *obj, struct 
vm_area_struct *vma)
return ret;
}
 
-   vma->vm_flags |= VM_MIXEDMAP | VM_DONTEXPAND;
+   vma->vm_flags |= VM_PFNMAP | VM_DONTEXPAND;
vma->vm_page_prot = vm_get_page_prot(vma->vm_flags);
if (shmem->map_wc)
vma->vm_page_prot = pgprot_writecombine(vma->vm_page_prot);
diff --git a/drivers/gpu/drm/gud/Kconfig b/drivers/gpu/drm/gud/Kconfig
index 1c8601bf4d91..9c1e61f9eec3 100644
--- a/drivers/gpu/drm/gud/Kconfig
+++ b/drivers/gpu/drm/gud/Kconfig
@@ -2,7 +2,7 @@
 
 config DRM_GUD
tristate "GUD USB Display"
-   depends on DRM && USB
+   depends on DRM && USB && MMU
select LZ4_COMPRESS
select DRM_KMS_HELPER
select DRM_GEM_SHMEM_HELPER
diff --git a/drivers/gpu/drm/tiny/Kconfig b/drivers/gpu/drm/tiny/Kconfig
index d31be274a2bd..1ceb93fbdc50 100644
--- a/drivers/gpu/drm/tiny/Kconfig
+++ b/drivers/gpu/drm/tiny/Kconfig
@@ -44,7 +44,7 @@ config DRM_CIRRUS_QEMU
 
 config DRM_GM12U320
tristate "GM12U320 driver for USB projectors"
-   depends on DRM && USB
+   depends on DRM && USB && MMU
select DRM_KMS_HELPER
select DRM_GEM_SHMEM_HELPER
help
@@ -53,7 +53,7 @@ config DRM_GM12U320
 
 config DRM_SIMPLEDRM
tristate "Simple framebuffer driver"
-   depends on DRM
+   depends on DRM && MMU
select DRM_GEM_SHMEM_HELPER
select DRM_KMS_HELPER
help
diff --git a/drivers/gpu/drm/udl/Kconfig b/drivers/gpu/drm/udl/Kconfig
index 1f497d8f1ae5..c744175c6992 100644
--- 

[PATCH 1/4] dma-buf: Require VM_PFNMAP vma for mmap

2021-08-12 Thread Daniel Vetter
tldr; DMA buffers aren't normal memory, expecting that you can use
them like that (like calling get_user_pages works, or that they're
accounted like any other normal memory) cannot be guaranteed.

Since some userspace only runs on integrated devices, where all
buffers are actually all resident system memory, there's a huge
temptation to assume that a struct page is always present and usable
like for any more pagecache backed mmap. This has the potential to
result in a uapi nightmare.

To stop this gap require that DMA buffer mmaps are VM_PFNMAP, which
blocks get_user_pages and all the other struct page based
infrastructure for everyone. In spirit this is the uapi counterpart to
the kernel-internal CONFIG_DMABUF_DEBUG.

Motivated by a recent patch which wanted to switch the system dma-buf
heap to vm_insert_page instead of vm_insert_pfn.

v2:

Jason brought up that we also want to guarantee that all ptes have the
pte_special flag set, to catch fast get_user_pages (on architectures
that support this). Allowing VM_MIXEDMAP (like VM_SPECIAL does) would
still allow vm_insert_page, but limiting to VM_PFNMAP will catch that.

From auditing the various functions to insert pfn pte entries
(vm_insert_pfn_prot, remap_pfn_range and all its callers like
dma_mmap_wc) it looks like VM_PFNMAP is already required anyway, so
this should be the correct flag to check for.

v3: Change to WARN_ON_ONCE (Thomas Zimmermann)

References: 
https://lore.kernel.org/lkml/cakmk7uhi+mg0z0humnt13qccvuturvjpcr0njrl12k-wbwz...@mail.gmail.com/
Acked-by: Christian König 
Acked-by: Thomas Zimmermann 
Cc: Thomas Zimmermann 
Cc: Jason Gunthorpe 
Cc: Suren Baghdasaryan 
Cc: Matthew Wilcox 
Cc: John Stultz 
Signed-off-by: Daniel Vetter 
Cc: Sumit Semwal 
Cc: "Christian König" 
Cc: linux-me...@vger.kernel.org
Cc: linaro-mm-...@lists.linaro.org
--
Resending this so I can test the next patches for vgem/shmem in
intel-gfx-ci.

No immediate plans to merge this patch here since ttm isn't addressed
yet (and there we have the hugepte issue, for which I don't think we
have a clear consensus yet).
-Daniel
---
 drivers/dma-buf/dma-buf.c | 15 +--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index 63d32261b63f..d19b1cf6c34f 100644
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -130,6 +130,7 @@ static struct file_system_type dma_buf_fs_type = {
 static int dma_buf_mmap_internal(struct file *file, struct vm_area_struct *vma)
 {
struct dma_buf *dmabuf;
+   int ret;
 
if (!is_dma_buf_file(file))
return -EINVAL;
@@ -145,7 +146,11 @@ static int dma_buf_mmap_internal(struct file *file, struct 
vm_area_struct *vma)
dmabuf->size >> PAGE_SHIFT)
return -EINVAL;
 
-   return dmabuf->ops->mmap(dmabuf, vma);
+   ret = dmabuf->ops->mmap(dmabuf, vma);
+
+   WARN_ON_ONCE(!(vma->vm_flags & VM_PFNMAP));
+
+   return ret;
 }
 
 static loff_t dma_buf_llseek(struct file *file, loff_t offset, int whence)
@@ -1260,6 +1265,8 @@ EXPORT_SYMBOL_GPL(dma_buf_end_cpu_access);
 int dma_buf_mmap(struct dma_buf *dmabuf, struct vm_area_struct *vma,
 unsigned long pgoff)
 {
+   int ret;
+
if (WARN_ON(!dmabuf || !vma))
return -EINVAL;
 
@@ -1280,7 +1287,11 @@ int dma_buf_mmap(struct dma_buf *dmabuf, struct 
vm_area_struct *vma,
vma_set_file(vma, dmabuf->file);
vma->vm_pgoff = pgoff;
 
-   return dmabuf->ops->mmap(dmabuf, vma);
+   ret = dmabuf->ops->mmap(dmabuf, vma);
+
+   WARN_ON_ONCE(!(vma->vm_flags & VM_PFNMAP));
+
+   return ret;
 }
 EXPORT_SYMBOL_GPL(dma_buf_mmap);
 
-- 
2.32.0
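
For illustration, an exporter whose mmap callback goes through
dma_mmap_coherent()/remap_pfn_range() already satisfies the new check, since
(as the commit message notes) those set VM_PFNMAP on the vma. A minimal
sketch of such a hook, with invented example_* names:

    static int example_dmabuf_mmap(struct dma_buf *dmabuf,
                                   struct vm_area_struct *vma)
    {
            struct example_buffer *buf = dmabuf->priv;

            /* remap_pfn_range() underneath sets VM_PFNMAP, which is what
             * the WARN_ON_ONCE() added above checks for.
             */
            return dma_mmap_coherent(buf->dev, vma, buf->vaddr,
                                     buf->dma_addr, buf->size);
    }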



Re: [PATCH] drm/virtio: set non-cross device blob uuid_state

2021-08-12 Thread Gerd Hoffmann
On Wed, Aug 11, 2021 at 01:04:01PM +0900, David Stevens wrote:
> Blob resources without the cross device flag don't have a uuid to share
> with other virtio devices. When exporting such blobs, set uuid_state to
> STATE_ERR so that virtgpu_virtio_get_uuid doesn't hang.
> 
> Signed-off-by: David Stevens 

Pushed to drm-misc-next.

thanks,
  Gerd



Re: [PATCH v4 2/4] drm/shmem-helper: Switch to vmf_insert_pfn

2021-08-12 Thread Daniel Vetter
On Thu, Jul 22, 2021 at 08:22:43PM +0200, Thomas Zimmermann wrote:
> Hi,
> 
> I'm not knowledgeable enough to give this a full review. If you can just
> answer my questions, feel free to add an
> 
> Acked-by: Thomas Zimmermann 
> 
> to the patch. :)
> 
> Am 13.07.21 um 22:51 schrieb Daniel Vetter:
> > We want to stop gup, which isn't the case if we use vmf_insert_page
> 
> What is gup?
> 
> > and VM_MIXEDMAP, because that does not set pte_special.
> > 
> > v2: With this shmem gem helpers now definitely need CONFIG_MMU (0day)
> > 
> > v3: add more depends on MMU. For usb drivers this is a bit awkward,
> > but really it's correct: To be able to provide a contig mapping of
> > buffers to userspace on !MMU platforms we'd need to use the cma
> > helpers for these drivers on those platforms. As-is this wont work.
> > 
> > Also not exactly sure why vm_insert_page doesn't go boom, because that
> > definitely wont fly in practice since the pages are non-contig to
> > begin with.
> > 
> > Signed-off-by: Daniel Vetter 
> > Cc: Maarten Lankhorst 
> > Cc: Maxime Ripard 
> > Cc: Thomas Zimmermann 
> > Cc: David Airlie 
> > Cc: Daniel Vetter 
> > ---
> >   drivers/gpu/drm/Kconfig| 2 +-
> >   drivers/gpu/drm/drm_gem_shmem_helper.c | 4 ++--
> >   drivers/gpu/drm/gud/Kconfig| 2 +-
> >   drivers/gpu/drm/tiny/Kconfig   | 4 ++--
> >   drivers/gpu/drm/udl/Kconfig| 1 +
> >   5 files changed, 7 insertions(+), 6 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
> > index 0d372354c2d0..314eefa39892 100644
> > --- a/drivers/gpu/drm/Kconfig
> > +++ b/drivers/gpu/drm/Kconfig
> > @@ -211,7 +211,7 @@ config DRM_KMS_CMA_HELPER
> >   config DRM_GEM_SHMEM_HELPER
> > bool
> > -   depends on DRM
> > +   depends on DRM && MMU
> > help
> >   Choose this if you need the GEM shmem helper functions
> > diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c 
> > b/drivers/gpu/drm/drm_gem_shmem_helper.c
> > index d5e6d4568f99..296ab1b7c07f 100644
> > --- a/drivers/gpu/drm/drm_gem_shmem_helper.c
> > +++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
> > @@ -542,7 +542,7 @@ static vm_fault_t drm_gem_shmem_fault(struct vm_fault 
> > *vmf)
> > } else {
> > page = shmem->pages[page_offset];
> > -   ret = vmf_insert_page(vma, vmf->address, page);
> > +   ret = vmf_insert_pfn(vma, vmf->address, page_to_pfn(page));
> > }
> > mutex_unlock(>pages_lock);
> > @@ -612,7 +612,7 @@ int drm_gem_shmem_mmap(struct drm_gem_object *obj, 
> > struct vm_area_struct *vma)
> > return ret;
> > }
> > -   vma->vm_flags |= VM_MIXEDMAP | VM_DONTEXPAND;
> > +   vma->vm_flags |= VM_PFNMAP | VM_DONTEXPAND;
> > vma->vm_page_prot = vm_get_page_prot(vma->vm_flags);
> > if (shmem->map_wc)
> > vma->vm_page_prot = pgprot_writecombine(vma->vm_page_prot);
> > diff --git a/drivers/gpu/drm/gud/Kconfig b/drivers/gpu/drm/gud/Kconfig
> > index 1c8601bf4d91..9c1e61f9eec3 100644
> > --- a/drivers/gpu/drm/gud/Kconfig
> > +++ b/drivers/gpu/drm/gud/Kconfig
> > @@ -2,7 +2,7 @@
> >   config DRM_GUD
> > tristate "GUD USB Display"
> > -   depends on DRM && USB
> > +   depends on DRM && USB && MMU
> > select LZ4_COMPRESS
> > select DRM_KMS_HELPER
> > select DRM_GEM_SHMEM_HELPER
> 
> I'm a kconfig noob, so this is rather a question than a review comment:
> 
> 
> 
> If DRM_GEM_SHMEM_HELPER already depends on MMU, this select will fail on
> non-MMU platforms? Why does the driver also depend on MMU? Simply to make
> the item disappear in menuconfig?

I totally missed this somehow. The vmf_insert_pfn functions only exist on
MMU-based systems, so we can't compile vgem without MMU. And yes, it just
makes it disappear.

tbh I'm not sure it even worked with the old code, because on !MMU
platforms it's the mmap implementation's job to make sure the pages are
physically contiguous. There's another mmap related callback which should
return the physical address where the memory starts.

The cma helpers otoh should work on !MMU platforms, because they will give
us a physically contig memory region.
-Daniel

> 
> Best regards
> Thomas
> 
> > diff --git a/drivers/gpu/drm/tiny/Kconfig b/drivers/gpu/drm/tiny/Kconfig
> > index 5593128eeff9..c11fb5be7d09 100644
> > --- a/drivers/gpu/drm/tiny/Kconfig
> > +++ b/drivers/gpu/drm/tiny/Kconfig
> > @@ -44,7 +44,7 @@ config DRM_CIRRUS_QEMU
> >   config DRM_GM12U320
> > tristate "GM12U320 driver for USB projectors"
> > -   depends on DRM && USB
> > +   depends on DRM && USB && MMU
> > select DRM_KMS_HELPER
> > select DRM_GEM_SHMEM_HELPER
> > help
> > @@ -53,7 +53,7 @@ config DRM_GM12U320
> >   config DRM_SIMPLEDRM
> > tristate "Simple framebuffer driver"
> > -   depends on DRM
> > +   depends on DRM && MMU
> > select DRM_GEM_SHMEM_HELPER
> > select DRM_KMS_HELPER
> > help
> > diff --git a/drivers/gpu/drm/udl/Kconfig b/drivers/gpu/drm/udl/Kconfig
> > index 

Re: [PATCH] drm/virtio: support mapping exported vram

2021-08-12 Thread Gerd Hoffmann
  Hi,

> +static struct sg_table *virtgpu_gem_map_dma_buf(
> + struct dma_buf_attachment *attach,
> + enum dma_data_direction dir)

checkpatch doesn't like that:

-:47: CHECK:OPEN_ENDED_LINE: Lines should not end with a '('
#47: FILE: drivers/gpu/drm/virtio/virtgpu_prime.c:46:
+static struct sg_table *virtgpu_gem_map_dma_buf(

> +{
> + struct drm_gem_object *obj = attach->dmabuf->priv;
> + struct virtio_gpu_object *bo = gem_to_virtio_gpu_obj(obj);
> +
> + if (virtio_gpu_is_vram(bo))
> + return virtio_gpu_vram_map_dma_buf(bo, attach->dev, dir);
> +
> + return drm_gem_map_dma_buf(attach, dir);
> +}
> +
> +static void virtgpu_gem_unmap_dma_buf(struct dma_buf_attachment *attach,
> +   struct sg_table *sgt,
> +   enum dma_data_direction dir)
> +{
> + struct drm_gem_object *obj = attach->dmabuf->priv;
> + struct virtio_gpu_object *bo = gem_to_virtio_gpu_obj(obj);
> +
> + if (virtio_gpu_is_vram(bo))
> + virtio_gpu_vram_unmap_dma_buf(attach->dev, sgt, dir);
> + else
> + drm_gem_unmap_dma_buf(attach, sgt, dir);
> +}

Minor nit:  Can we use the same logic in both functions?  I like the
virtgpu_gem_map_dma_buf variant (without else) more.

Otherwise looks sane to me.

thanks,
  Gerd
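
For illustration, the nit about using the same logic in both functions would
make the unmap helper mirror the map helper's early-return style, roughly:

    static void virtgpu_gem_unmap_dma_buf(struct dma_buf_attachment *attach,
                                          struct sg_table *sgt,
                                          enum dma_data_direction dir)
    {
            struct drm_gem_object *obj = attach->dmabuf->priv;
            struct virtio_gpu_object *bo = gem_to_virtio_gpu_obj(obj);

            if (virtio_gpu_is_vram(bo)) {
                    virtio_gpu_vram_unmap_dma_buf(attach->dev, sgt, dir);
                    return;
            }

            drm_gem_unmap_dma_buf(attach, sgt, dir);
    }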



[PATCH 2/2] drm/i915: Add pci ids and uapi for DG1

2021-08-12 Thread Maarten Lankhorst
DG1 has support for local memory, which requires the usage of the
lmem placement extension for creating bo's, and memregion queries
to obtain the size. Because of this, those parts of the uapi are
no longer guarded behind FAKE_LMEM.

According to the pull request referenced below, mesa should be mostly
ready for DG1. VK_EXT_memory_budget is not hooked up yet, but we
should definitely just enable the uapi parts by default.

Signed-off-by: Maarten Lankhorst 
References: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11584
Cc: Jordan Justen jordan.l.jus...@intel.com
Cc: Jason Ekstrand ja...@jlekstrand.net
---
 drivers/gpu/drm/i915/gem/i915_gem_create.c | 3 ---
 drivers/gpu/drm/i915/i915_pci.c| 1 +
 drivers/gpu/drm/i915/i915_query.c  | 3 ---
 3 files changed, 1 insertion(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_create.c 
b/drivers/gpu/drm/i915/gem/i915_gem_create.c
index 23fee13a3384..1d341b8c47c0 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_create.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_create.c
@@ -347,9 +347,6 @@ static int ext_set_placements(struct i915_user_extension 
__user *base,
 {
struct drm_i915_gem_create_ext_memory_regions ext;
 
-   if (!IS_ENABLED(CONFIG_DRM_I915_UNSTABLE_FAKE_LMEM))
-   return -ENODEV;
-
if (copy_from_user(, base, sizeof(ext)))
return -EFAULT;
 
diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
index 1bbd09ad5287..93ccdc6bbd03 100644
--- a/drivers/gpu/drm/i915/i915_pci.c
+++ b/drivers/gpu/drm/i915/i915_pci.c
@@ -1115,6 +1115,7 @@ static const struct pci_device_id pciidlist[] = {
INTEL_RKL_IDS(_info),
INTEL_ADLS_IDS(_s_info),
INTEL_ADLP_IDS(_p_info),
+   INTEL_DG1_IDS(_info),
{0, 0, 0}
 };
 MODULE_DEVICE_TABLE(pci, pciidlist);
diff --git a/drivers/gpu/drm/i915/i915_query.c 
b/drivers/gpu/drm/i915/i915_query.c
index e49da36c62fb..5e2b909827f4 100644
--- a/drivers/gpu/drm/i915/i915_query.c
+++ b/drivers/gpu/drm/i915/i915_query.c
@@ -432,9 +432,6 @@ static int query_memregion_info(struct drm_i915_private 
*i915,
u32 total_length;
int ret, id, i;
 
-   if (!IS_ENABLED(CONFIG_DRM_I915_UNSTABLE_FAKE_LMEM))
-   return -ENODEV;
-
if (query_item->flags != 0)
return -EINVAL;
 
-- 
2.32.0



[PATCH 1/2] Revert "drm/i915: allow DG1 autoprobe for CONFIG_BROKEN"

2021-08-12 Thread Maarten Lankhorst
This reverts commit fae352efb12196e7110f98bc1297ce533472764d.

Inside core-for-CI, reverting to make next patch apply cleanly.
---
 drivers/gpu/drm/i915/i915_pci.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
index 2c1cb9b6b556..1bbd09ad5287 100644
--- a/drivers/gpu/drm/i915/i915_pci.c
+++ b/drivers/gpu/drm/i915/i915_pci.c
@@ -1115,9 +1115,6 @@ static const struct pci_device_id pciidlist[] = {
INTEL_RKL_IDS(&rkl_info),
INTEL_ADLS_IDS(&adl_s_info),
INTEL_ADLP_IDS(&adl_p_info),
-#if IS_ENABLED(CONFIG_DRM_I915_UNSTABLE_FAKE_LMEM)
-   INTEL_DG1_IDS(&dg1_info),
-#endif
{0, 0, 0}
 };
 MODULE_DEVICE_TABLE(pci, pciidlist);
-- 
2.32.0



Re: [PATCH 2/2] drm/amdgpu: Use mod_delayed_work in JPEG/UVD/VCE/VCN ring_end_use hooks

2021-08-12 Thread Lazar, Lijo




On 8/12/2021 1:41 PM, Michel Dänzer wrote:

On 2021-08-12 7:55 a.m., Koenig, Christian wrote:

Hi James,

Evan seems to have understood how this all works together.

See while any begin/end use critical section is active the work should not be 
active.

When you handle only one ring you can just call cancel in begin use and 
schedule in end use. But when you have more than one ring you need a lock or 
counter to prevent concurrent work items to be started.

Michel's idea to use mod_delayed_work is a bad one because it assumes that 
the delayed work is still running.


It merely assumes that the work may already have been scheduled before.

Admittedly, I missed the cancel_delayed_work_sync calls for patch 2. While I 
think it can still have some effect when there's a single work item for 
multiple rings, as described by James, it's probably negligible, since 
presumably the time intervals between ring_begin_use and ring_end_use are 
normally much shorter than a second.

So, while patch 2 is at worst a no-op (since mod_delayed_work is the same as 
schedule_delayed_work if the work hasn't been scheduled yet), I'm fine with 
dropping it.



Something similar applies to the first patch I think,


There are no cancel work calls in that case, so the commit log is accurate 
TTBOMK.


Curious -

For patch 1, does it make a difference if any delayed work scheduled is 
cancelled in the else part before proceeding?


} else if (!enable && adev->gfx.gfx_off_state) {
cancel_delayed_work();


Thanks,
Lijo



I noticed this because current mutter Git main wasn't able to sustain 60 fps on 
Navi 14 with a simple glxgears -fullscreen. mutter was dropping frames because 
its CPU work for a frame update occasionally took up to 3 ms, instead of the 
normal 2-300 microseconds. sysprof showed a lot of cycles spent in the 
functions which enable/disable GFXOFF in the HW.



so when this makes a difference it is actually a bug.


There was certainly a bug though, which patch 1 fixes. :)




[PATCH 1/3] Revert "drm/i915: allow DG1 autoprobe for CONFIG_BROKEN"

2021-08-12 Thread Maarten Lankhorst
This reverts commit fae352efb12196e7110f98bc1297ce533472764d.

Inside core-for-CI, reverting to make next patch apply cleanly.
---
 drivers/gpu/drm/i915/i915_pci.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
index 2c1cb9b6b556..1bbd09ad5287 100644
--- a/drivers/gpu/drm/i915/i915_pci.c
+++ b/drivers/gpu/drm/i915/i915_pci.c
@@ -1115,9 +1115,6 @@ static const struct pci_device_id pciidlist[] = {
INTEL_RKL_IDS(&rkl_info),
INTEL_ADLS_IDS(&adl_s_info),
INTEL_ADLP_IDS(&adl_p_info),
-#if IS_ENABLED(CONFIG_DRM_I915_UNSTABLE_FAKE_LMEM)
-   INTEL_DG1_IDS(&dg1_info),
-#endif
{0, 0, 0}
 };
 MODULE_DEVICE_TABLE(pci, pciidlist);
-- 
2.32.0



[PATCH 3/3] drm/i915/topic/for-CI: Disable fake LMEM implementation

2021-08-12 Thread Maarten Lankhorst
see if anything breaks.

Signed-off-by: Maarten Lankhorst 
---
 drivers/gpu/drm/i915/Kconfig.debug | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/Kconfig.debug 
b/drivers/gpu/drm/i915/Kconfig.debug
index e7fd3e76f8a2..f27c0b5873f7 100644
--- a/drivers/gpu/drm/i915/Kconfig.debug
+++ b/drivers/gpu/drm/i915/Kconfig.debug
@@ -48,7 +48,6 @@ config DRM_I915_DEBUG
select DRM_I915_DEBUG_RUNTIME_PM
select DRM_I915_SW_FENCE_DEBUG_OBJECTS
select DRM_I915_SELFTEST
-   select BROKEN # for prototype uAPI
default n
help
  Choose this option to turn on extra driver debugging that may affect
-- 
2.32.0



[PATCH 2/3] drm/i915: Add pci ids for DG1

2021-08-12 Thread Maarten Lankhorst
Signed-off-by: Maarten Lankhorst 
---
 drivers/gpu/drm/i915/i915_pci.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
index 1bbd09ad5287..93ccdc6bbd03 100644
--- a/drivers/gpu/drm/i915/i915_pci.c
+++ b/drivers/gpu/drm/i915/i915_pci.c
@@ -1115,6 +1115,7 @@ static const struct pci_device_id pciidlist[] = {
INTEL_RKL_IDS(&rkl_info),
INTEL_ADLS_IDS(&adl_s_info),
INTEL_ADLP_IDS(&adl_p_info),
+   INTEL_DG1_IDS(&dg1_info),
{0, 0, 0}
 };
 MODULE_DEVICE_TABLE(pci, pciidlist);
-- 
2.32.0



Re: [PATCH 07/11] treewide: Replace the use of mem_encrypt_active() with prot_guest_has()

2021-08-12 Thread Kirill A. Shutemov
On Wed, Aug 11, 2021 at 10:52:55AM -0500, Tom Lendacky wrote:
> On 8/11/21 7:19 AM, Kirill A. Shutemov wrote:
> > On Tue, Aug 10, 2021 at 02:48:54PM -0500, Tom Lendacky wrote:
> >> On 8/10/21 1:45 PM, Kuppuswamy, Sathyanarayanan wrote:
> >>>
> >>>
> >>> On 7/27/21 3:26 PM, Tom Lendacky wrote:
>  diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
>  index de01903c3735..cafed6456d45 100644
>  --- a/arch/x86/kernel/head64.c
>  +++ b/arch/x86/kernel/head64.c
>  @@ -19,7 +19,7 @@
>    #include 
>    #include 
>    #include 
>  -#include 
>  +#include 
>    #include 
>      #include 
>  @@ -285,7 +285,7 @@ unsigned long __head __startup_64(unsigned long
>  physaddr,
>     * there is no need to zero it after changing the memory encryption
>     * attribute.
>     */
>  -    if (mem_encrypt_active()) {
>  +    if (prot_guest_has(PATTR_MEM_ENCRYPT)) {
>    vaddr = (unsigned long)__start_bss_decrypted;
>    vaddr_end = (unsigned long)__end_bss_decrypted;
> >>>
> >>>
> >>> Since this change is specific to AMD, can you replace PATTR_MEM_ENCRYPT 
> >>> with
> >>> prot_guest_has(PATTR_SME) || prot_guest_has(PATTR_SEV). It is not used in
> >>> TDX.
> >>
> >> This is a direct replacement for now.
> > 
> > With current implementation of prot_guest_has() for TDX it breaks boot for
> > me.
> > 
> > Looking at the code again, now I *think* the reason is accessing a global
> > variable from __startup_64() inside TDX version of prot_guest_has().
> > 
> > __startup_64() is special. If you access any global variable you need to
> > use fixup_pointer(). See comment before __startup_64().
> > 
> > I'm not sure how you get away with accessing sme_me_mask directly from
> > there. Any clues? Maybe it's just luck and the compiler generates code just right
> > for your case, I don't know.
> 
> Hmm... yeah, could be that the compiler is using rip-relative addressing
> for it because it lives in the .data section?

I guess. It has to be fixed. It may break with a compiler upgrade or any
random change around the code.

BTW, does it work with clang for you?

> For the static variables in mem_encrypt_identity.c I did an assembler rip
> relative LEA, but probably could have passed physaddr to sme_enable() and
> used a fixup_pointer() style function, instead.

Sounds like a plan.
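
For reference, the pattern being discussed looks roughly like this;
sme_me_mask_early() is a made-up name for illustration, fixup_pointer() is the
existing static helper in arch/x86/kernel/head64.c:

static u64 __head sme_me_mask_early(unsigned long physaddr)
{
	/*
	 * Before relocation, __startup_64() code must reach globals through
	 * fixup_pointer(), which rebases the link-time address onto the
	 * physical address the kernel is currently running at.
	 */
	u64 *mask = fixup_pointer(&sme_me_mask, physaddr);

	return *mask;
}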

> > A separate point is that TDX version of prot_guest_has() relies on
> > cpu_feature_enabled() which is not ready at this point.
> 
> Does TDX have to do anything special to make memory able to be shared with
> the hypervisor?

Yes. But there's nothing that required any changes in early boot. It's
handled in ioremap/set_memory.

> You might have to use something that is available earlier
> than cpu_feature_enabled() in that case (should you eventually support
> kvmclock).

Maybe.

> > I think __bss_decrypted fixup has to be done if sme_me_mask is non-zero.
> > Or just do it unconditionally because it's a NOP for sme_me_mask == 0.
> 
> For SNP, we'll have to additionally call the HV to update the RMP to make
> the memory shared. But that could also be done unconditionally since the
> early_snp_set_memory_shared() routine will check for SNP before doing
> anything.

-- 
 Kirill A. Shutemov


Re: How to obtain a drm lease from X for overlay planes as well as a primary plane?

2021-08-12 Thread John Cox
>> Raspberry Pi displaying video with subtitles or other controls.  I was
>> thinking of the fullscreen case but if zero copy video can be made to
>> work to the main desktop then that would even better.
>>
>> If displaying 4k video the Pi does not have enough bandwidth left for a
>> single frame copy, convert or merge so I need hardware scaling,
>> composition & display taking the raw video frame (its in a dmabuf).  The
>> raw video is in a somewhat unique format, I'd expect the other layers to
>> be ARGB.  The Pi h/w can do this and I believe I can make it work via
>> DRM if I own the screen so that was where I started.
>>
>> >Why not use an xdg_toplevel and wl_subsurface?
>>
>> Probably because I am woefully underinformed about how I should be doing
>> stuff properly.  Please feel free to point me in the correct direction -
>> any example that takes NV12 video (it isn't NV12 but if NV12 works then
>> SAND can probably be made to too) would be a great start.  Also Wayland
>> hasn't yet come to the Pi though it will shortly be using mutter.
>
>By SAND do you mean one of these vc4-specific buffer tilings [1]? e.g.
>BROADCOM_SAND64, SAND128 or SAND256?
>
>[1]: https://drmdb.emersion.fr/formats?driver=vc4

Yes - for SAND8 (or SAND128 in your terms) drm output we have the
required types as NV12 + a broadcom modifier.  Then there is SAND30 for
10-bit output which fits in the same column tiling but packs 3 10-bit
quantities into 32 bits with 2 junk (zero) bits.  Again we have a DRM
definition for that which I think may have made it upstream.
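
As a rough illustration of that packing only, assuming the three samples sit
lowest-sample-first with the top two bits left as zero padding (the exact bit
order here is an assumption, not taken from the SAND30 definition):

#include <stdint.h>

/* Pack three 10-bit samples into one 32-bit word; the top 2 bits stay zero. */
static inline uint32_t pack_3x10(uint16_t s0, uint16_t s1, uint16_t s2)
{
	return (uint32_t)(s0 & 0x3ff) |
	       ((uint32_t)(s1 & 0x3ff) << 10) |
	       ((uint32_t)(s2 & 0x3ff) << 20);
}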

>The fullscreen case may work already on all major Wayland compositors,
>assuming the video size matches exactly the current mode. You'll need to use
>the linux-dmabuf Wayland extension to pass NV12 buffers to the compositor.
>
>If you want to add scaling into the mix, you'll need to use the viewporter
>extension as well. Most compositors aren't yet rigged up for direct scan-out,
>they'll fall back to composition. Weston is your best bet if you want to try
>this, it supports direct scan-out to multiple KMS planes with scaling and
>cropping. There is some active work in wlroots to support this.  I'm not aware
>of any effort in this direction for mutter or kwin at the time of writing.
>
>If you want to also use KMS planes with other layers (RGBA or something else),
>then you'll need to setup wl_subsurfaces with the rest of the content. As said
>above, Weston will do its best to offload the composition work to KMS planes.
>You'll need to make sure each buffer you submit can be scanned out by the
>display engine -- there's not yet a generic way of doing it, but the upcoming
>linux-dmabuf hints protocol will fix that.
>
>If you want to get started, maybe have a look at clients/simple-dmabuf-gbm in
>Weston.
>
>Hope this helps!

Very many thanks for the pointers - to a large extent my problem is that
I don't know what should work in order to build something around it and
then work out why it doesn't.  I've got video decode down pat, but
modern display still eludes me - I grew up on STBs and the like where
you could just use the h/w directly, now it's a lot more controlled.

Ta again

John Cox


Re: [PATCH v2 0/6] eDP: Support probing eDP panels dynamically instead of hardcoding

2021-08-12 Thread Sam Ravnborg
Hi Doug,
On Mon, Aug 09, 2021 at 03:18:03PM -0700, Doug Anderson wrote:
> Hi,
> 
> On Tue, Aug 3, 2021 at 1:41 PM Sam Ravnborg  wrote:
> >
> > Hi Douglas,
> >
> > On Fri, Jul 30, 2021 at 02:26:19PM -0700, Douglas Anderson wrote:
> > > The goal of this patch series is to move away from hardcoding exact
> > > eDP panels in device tree files. As discussed in the various patches
> > > in this series (I'm not repeating everything here), most eDP panels
> > > are 99% probable and we can get that last 1% by allowing two "power
> > > up" delays to be specified in the device tree file and then using the
> > > panel ID (found in the EDID) to look up additional power sequencing
> > > delays for the panel.
> >
> > Have you considered a new driver for edp panels?
> > panel-edp.c?
> >
> > There will be some duplicate code from panel-simple - but the same can
> > be said of the other panel drivers too.
> > In the end I think it is better to separate them so we end up with two
> > less complex panel drivers rather than one do-it-all panel driver.
> >
> > I have not looked in detail how this would look like, but my first
> > impression is that we should split it out.
> 
> I certainly could, but my argument against it is that really it's the
> exact same set of eDP panels that would be supported by both drivers.

The idea was to move all eDP panels to the new driver.

My hope is that we can make panel-simple handle a much narrower set
of panels. eDP capable displays are IMO not simple panels.

Likewise DSI capable panels could also be pulled out of panel-simple.

This would continue to duplicate some code - but we have a lot of
duplicated code across the various panels and the best way forward
would be to implement more helpers that can be used by the drivers.

Sam - who is trying to recover from the deadly man flu...


Re: [PATCH] drm: Copy drm_wait_vblank request and copy_to_user before return.

2021-08-12 Thread Michel Dänzer
On 2021-08-11 7:55 p.m., Mark Yacoub wrote:
> From: Mark Yacoub 
> 
> [Why]
> Userspace should get back a copy of the request that's been modified
> even when drm_wait_vblank_ioctl returns a failure.
> 
> Rationale:
> drm_wait_vblank_ioctl modifies the request and expects the user to read
> back. When the type is RELATIVE, it modifies it to ABSOLUTE and updates
> the sequence to become current_vblank_count + sequence (which was
> relative), now it becomes absolute.
> drmWaitVBlank (in libdrm) expects this to be the case, as it modifies
> the request to be Absolute and expects the sequence to have been
> updated.
> 
> The change is in compat_drm_wait_vblank, which is called by
> drm_compat_ioctl. This change of copying the data back regardless of the
> return value makes it on par with drm_ioctl, which always copies the
> data before returning.
> 
> [How]
> Copy the drm_wait_vblank request.
> Return from the function after everything has been copied to user.
> 
> Fixes: IGT:kms_flip::modeset-vs-vblank-race-interruptible
> Tested on ChromeOS Trogdor(msm)
> 
> Signed-off-by: Mark Yacoub 
> ---
>  drivers/gpu/drm/drm_ioc32.c | 8 +---
>  1 file changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_ioc32.c b/drivers/gpu/drm/drm_ioc32.c
> index d29907955ff79..275b860df8fbe 100644
> --- a/drivers/gpu/drm/drm_ioc32.c
> +++ b/drivers/gpu/drm/drm_ioc32.c
> @@ -855,17 +855,19 @@ static int compat_drm_wait_vblank(struct file *file, 
> unsigned int cmd,
>   req.request.sequence = req32.request.sequence;
>   req.request.signal = req32.request.signal;
>   err = drm_ioctl_kernel(file, drm_wait_vblank_ioctl, &req, DRM_UNLOCKED);
> - if (err)
> - return err;
>  
>   req32.reply.type = req.reply.type;
>   req32.reply.sequence = req.reply.sequence;
>   req32.reply.tval_sec = req.reply.tval_sec;
>   req32.reply.tval_usec = req.reply.tval_usec;
> + /* drm_wait_vblank_ioctl modifies Request, update their values here as 
> well. */
> + req32.request.type = req.request.type;
> + req32.request.sequence = req.request.sequence;
> + req32.request.signal = req.request.signal;

The added assignments are superfluous, since req32.reply and req32.request are 
members of the same union.
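
For reference, the compat structure is a union along these lines (from
drm_ioc32.c), so the reply fields and the request fields already share the
same storage:

typedef union drm_wait_vblank32 {
	struct drm_wait_vblank_request32 request;
	struct drm_wait_vblank_reply32 reply;
} drm_wait_vblank32_t;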


>   if (copy_to_user(argp, &req32, sizeof(req32)))
>   return -EFAULT;
>  
> - return 0;
> + return err;
>  }

The other changes look correct.


-- 
Earthling Michel Dänzer   |   https://redhat.com
Libre software enthusiast | Mesa and X developer


[PULL] drm-misc-fixes

2021-08-12 Thread Thomas Zimmermann
Hi Dave and Daniel,

only one bug fix in this week's drm-misc-fixes.

Best regards
Thomas

drm-misc-fixes-2021-08-12:
Short summary of fixes pull:

 * meson: Fix colors when booting with HDR
The following changes since commit e89afb51f97ae03ee246c1fd0b47e3e491266aef:

  drm/vmwgfx: Fix a 64bit regression on svga3 (2021-08-02 21:00:37 +0200)

are available in the Git repository at:

  git://anongit.freedesktop.org/drm/drm-misc tags/drm-misc-fixes-2021-08-12

for you to fetch changes up to bf33677a3c394bb8fddd48d3bbc97adf0262e045:

  drm/meson: fix colour distortion from HDR set during vendor u-boot 
(2021-08-10 10:00:02 +0200)


Short summary of fixes pull:

 * meson: Fix colors when booting with HDR


Christian Hewitt (1):
  drm/meson: fix colour distortion from HDR set during vendor u-boot

 drivers/gpu/drm/meson/meson_registers.h | 5 +
 drivers/gpu/drm/meson/meson_viu.c   | 7 ++-
 2 files changed, 11 insertions(+), 1 deletion(-)

--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Felix Imendörffer


Re: [PATCH 2/2] drm/amdgpu: Use mod_delayed_work in JPEG/UVD/VCE/VCN ring_end_use hooks

2021-08-12 Thread Michel Dänzer
On 2021-08-12 7:55 a.m., Koenig, Christian wrote:
> Hi James,
> 
> Evan seems to have understood how this all works together.
> 
> See while any begin/end use critical section is active the work should not be 
> active.
> 
> When you handle only one ring you can just call cancel in begin use and 
> schedule in end use. But when you have more than one ring you need a lock or 
> counter to prevent concurrent work items to be started.
> 
> Michel's idea to use mod_delayed_work is a bad one because it assumes that 
> the delayed work is still running.

It merely assumes that the work may already have been scheduled before.

Admittedly, I missed the cancel_delayed_work_sync calls for patch 2. While I 
think it can still have some effect when there's a single work item for 
multiple rings, as described by James, it's probably negligible, since 
presumably the time intervals between ring_begin_use and ring_end_use are 
normally much shorter than a second.

So, while patch 2 is at worst a no-op (since mod_delayed_work is the same as 
schedule_delayed_work if the work hasn't been scheduled yet), I'm fine with 
dropping it.
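
To spell the difference out with an illustrative sketch (made-up names, not
the actual amdgpu code):

#include <linux/workqueue.h>

/* ring_end_use-style re-arming of an idle handler shared by several rings. */
static void example_ring_end_use(struct delayed_work *idle_work)
{
	/*
	 * If idle_work is already queued (e.g. by another ring),
	 * mod_delayed_work() pushes its expiry out to one second from now,
	 * while schedule_delayed_work() would keep the earlier expiry.
	 * If it is not queued yet, both calls behave the same.
	 */
	mod_delayed_work(system_wq, idle_work, HZ);
}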


> Something similar applies to the first patch I think,

There are no cancel work calls in that case, so the commit log is accurate 
TTBOMK.

I noticed this because current mutter Git main wasn't able to sustain 60 fps on 
Navi 14 with a simple glxgears -fullscreen. mutter was dropping frames because 
its CPU work for a frame update occasionally took up to 3 ms, instead of the 
normal 2-300 microseconds. sysprof showed a lot of cycles spent in the 
functions which enable/disable GFXOFF in the HW.


> so when this makes a difference it is actually a bug.

There was certainly a bug though, which patch 1 fixes. :)


-- 
Earthling Michel Dänzer   |   https://redhat.com
Libre software enthusiast | Mesa and X developer


[PATCH] drm/vmwgfx: Fix some scheduling in atomic bugs

2021-08-12 Thread Dan Carpenter
The vmw_gmrid_man_get_node() function calls vmw_host_printf() while
holding a spinlock so these functions need to be atomic.  Generally,
no one expects printf() functions to sleep.

Fixes: cfdc3458db8a ("drm/vmwgfx: Be a lot more flexible with MOB limits")
Signed-off-by: Dan Carpenter 
---
 drivers/gpu/drm/vmwgfx/vmwgfx_msg.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_msg.c 
b/drivers/gpu/drm/vmwgfx/vmwgfx_msg.c
index ed9c7b3a1e08..16be71c4c679 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_msg.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_msg.c
@@ -506,14 +506,14 @@ int vmw_host_printf(const char *fmt, ...)
return ret;
 
va_start(ap, fmt);
-   log = kvasprintf(GFP_KERNEL, fmt, ap);
+   log = kvasprintf(GFP_ATOMIC, fmt, ap);
va_end(ap);
if (!log) {
DRM_ERROR("Cannot allocate memory for the log message.\n");
return -ENOMEM;
}
 
-   msg = kasprintf(GFP_KERNEL, "log %s", log);
+   msg = kasprintf(GFP_ATOMIC, "log %s", log);
if (!msg) {
DRM_ERROR("Cannot allocate memory for host log message.\n");
kfree(log);
-- 
2.20.1



[PATCH v5 13/13] tools: update test_hmm script to support SP config

2021-08-12 Thread Alex Sierra
Add two more parameters to set spm_addr_dev0 & spm_addr_dev1
addresses. These two parameters configure the start SP
addresses for each device in the test_hmm driver.
Consequently, this configures the zone device type as generic.

Signed-off-by: Alex Sierra 
---
 tools/testing/selftests/vm/test_hmm.sh | 20 +---
 1 file changed, 17 insertions(+), 3 deletions(-)

diff --git a/tools/testing/selftests/vm/test_hmm.sh 
b/tools/testing/selftests/vm/test_hmm.sh
index 0647b525a625..3eeabe94399f 100755
--- a/tools/testing/selftests/vm/test_hmm.sh
+++ b/tools/testing/selftests/vm/test_hmm.sh
@@ -40,7 +40,18 @@ check_test_requirements()
 
 load_driver()
 {
-   modprobe $DRIVER > /dev/null 2>&1
+   if [ $# -eq 0 ]; then
+   modprobe $DRIVER > /dev/null 2>&1
+   else
+   if [ $# -eq 2 ]; then
+   modprobe $DRIVER spm_addr_dev0=$1 spm_addr_dev1=$2
+   > /dev/null 2>&1
+   else
+   echo "Missing module parameters. Make sure pass"\
+   "spm_addr_dev0 and spm_addr_dev1"
+   usage
+   fi
+   fi
if [ $? == 0 ]; then
major=$(awk "\$2==\"HMM_DMIRROR\" {print \$1}" /proc/devices)
mknod /dev/hmm_dmirror0 c $major 0
@@ -58,7 +69,7 @@ run_smoke()
 {
echo "Running smoke test. Note, this test provides basic coverage."
 
-   load_driver
+   load_driver $1 $2
$(dirname "${BASH_SOURCE[0]}")/hmm-tests
unload_driver
 }
@@ -75,6 +86,9 @@ usage()
echo "# Smoke testing"
echo "./${TEST_NAME}.sh smoke"
echo
+   echo "# Smoke testing with SPM enabled"
+   echo "./${TEST_NAME}.sh smoke <spm_addr_dev0> <spm_addr_dev1>"
+   echo
exit 0
 }
 
@@ -84,7 +98,7 @@ function run_test()
usage
else
if [ "$1" = "smoke" ]; then
-   run_smoke
+   run_smoke $2 $3
else
usage
fi
-- 
2.32.0



[PATCH v5 11/13] lib: add support for device generic type in test_hmm

2021-08-12 Thread Alex Sierra
Device Generic type uses device memory that is coherently
accessible by the CPU. Usually, this is shown as an SP
(special purpose) memory range in the BIOS-e820 memory
enumeration. If no SP memory is supported in the system,
this could be faked by setting CONFIG_EFI_FAKE_MEMMAP.

Currently, test_hmm only supports two different SP ranges
of at least 256MB size. This could be specified in the
kernel parameter variable efi_fake_mem. Ex. Two SP ranges
of 1GB starting at 0x1 & 0x14000 physical address.
efi_fake_mem=1G@0x1:0x4,1G@0x14000:0x4

Signed-off-by: Alex Sierra 
---
 lib/test_hmm.c  | 170 
 lib/test_hmm_uapi.h |  10 ++-
 2 files changed, 116 insertions(+), 64 deletions(-)

diff --git a/lib/test_hmm.c b/lib/test_hmm.c
index 9283ad1f4280..52b6190fab66 100644
--- a/lib/test_hmm.c
+++ b/lib/test_hmm.c
@@ -469,6 +469,7 @@ static int dmirror_allocate_chunk(struct dmirror_device 
*mdevice,
unsigned long pfn_first;
unsigned long pfn_last;
void *ptr;
+   int ret = -ENOMEM;
 
devmem = kzalloc(sizeof(*devmem), GFP_KERNEL);
if (!devmem)
@@ -516,8 +517,10 @@ static int dmirror_allocate_chunk(struct dmirror_device 
*mdevice,
}
 
ptr = memremap_pages(&devmem->pagemap, numa_node_id());
-   if (IS_ERR(ptr))
+   if (IS_ERR(ptr)) {
+   ret = PTR_ERR(ptr);
goto err_release;
+   }
 
devmem->mdevice = mdevice;
pfn_first = devmem->pagemap.range.start >> PAGE_SHIFT;
@@ -546,7 +549,7 @@ static int dmirror_allocate_chunk(struct dmirror_device 
*mdevice,
}
spin_unlock(&mdevice->lock);
 
-   return true;
+   return 0;
 
 err_release:
mutex_unlock(&mdevice->devmem_lock);
@@ -554,7 +557,7 @@ static int dmirror_allocate_chunk(struct dmirror_device 
*mdevice,
 err_devmem:
kfree(devmem);
 
-   return false;
+   return ret;
 }
 
 static struct page *dmirror_devmem_alloc_page(struct dmirror_device *mdevice)
@@ -563,8 +566,10 @@ static struct page *dmirror_devmem_alloc_page(struct 
dmirror_device *mdevice)
struct page *rpage;
 
/*
-* This is a fake device so we alloc real system memory to store
-* our device memory.
+* For ZONE_DEVICE private type, this is a fake device so we alloc real
+* system memory to store our device memory.
+* For ZONE_DEVICE generic type we use the actual dpage to store the 
data
+* and ignore rpage.
 */
rpage = alloc_page(GFP_HIGHUSER);
if (!rpage)
@@ -597,7 +602,7 @@ static void dmirror_migrate_alloc_and_copy(struct 
migrate_vma *args,
   struct dmirror *dmirror)
 {
struct dmirror_device *mdevice = dmirror->mdevice;
-   const unsigned long *src = args->src;
+   unsigned long *src = args->src;
unsigned long *dst = args->dst;
unsigned long addr;
 
@@ -615,12 +620,18 @@ static void dmirror_migrate_alloc_and_copy(struct 
migrate_vma *args,
 * unallocated pte_none() or read-only zero page.
 */
spage = migrate_pfn_to_page(*src);
-
+   if (spage && is_zone_device_page(spage)) {
+   pr_debug("page already in device spage pfn: 0x%lx\n",
+ page_to_pfn(spage));
+   *src &= ~MIGRATE_PFN_MIGRATE;
+   continue;
+   }
dpage = dmirror_devmem_alloc_page(mdevice);
if (!dpage)
continue;
 
-   rpage = dpage->zone_device_data;
+   rpage = is_device_private_page(dpage) ? dpage->zone_device_data 
:
+   dpage;
if (spage)
copy_highpage(rpage, spage);
else
@@ -632,8 +643,10 @@ static void dmirror_migrate_alloc_and_copy(struct 
migrate_vma *args,
 * the simulated device memory and that page holds the pointer
 * to the mirror.
 */
+   rpage = dpage->zone_device_data;
rpage->zone_device_data = dmirror;
-
+   pr_debug("migrating from sys to dev pfn src: 0x%lx pfn dst: 
0x%lx\n",
+page_to_pfn(spage), page_to_pfn(dpage));
*dst = migrate_pfn(page_to_pfn(dpage)) |
MIGRATE_PFN_LOCKED;
if ((*src & MIGRATE_PFN_WRITE) ||
@@ -667,10 +680,13 @@ static int dmirror_migrate_finalize_and_map(struct 
migrate_vma *args,
continue;
 
/*
-* Store the page that holds the data so the page table
-* doesn't have to deal with ZONE_DEVICE private pages.
+* For ZONE_DEVICE private pages we store the page that
+* holds the data so the page table doesn't have to deal it.
+* 

[PATCH v5 10/13] lib: test_hmm add module param for zone device type

2021-08-12 Thread Alex Sierra
In order to configure device generic in test_hmm, two
module parameters should be passed, which correspond to the
SP start address of each device (2): spm_addr_dev0 &
spm_addr_dev1. If no parameters are passed, private device
type is configured.

v5:
Remove devmem->pagemap.type = MEMORY_DEVICE_PRIVATE at
dmirror_allocate_chunk that was forcing to configure pagemap.type
to MEMORY_DEVICE_PRIVATE

Signed-off-by: Alex Sierra 
---
 lib/test_hmm.c  | 46 -
 lib/test_hmm_uapi.h |  1 +
 2 files changed, 34 insertions(+), 13 deletions(-)

diff --git a/lib/test_hmm.c b/lib/test_hmm.c
index 3cd91ca31dd7..9283ad1f4280 100644
--- a/lib/test_hmm.c
+++ b/lib/test_hmm.c
@@ -33,6 +33,16 @@
 #define DEVMEM_CHUNK_SIZE  (256 * 1024 * 1024U)
 #define DEVMEM_CHUNKS_RESERVE  16
 
+static unsigned long spm_addr_dev0;
+module_param(spm_addr_dev0, long, 0644);
+MODULE_PARM_DESC(spm_addr_dev0,
+   "Specify start address for SPM (special purpose memory) used 
for device 0. By setting this Generic device type will be used. Make sure 
spm_addr_dev1 is set too");
+
+static unsigned long spm_addr_dev1;
+module_param(spm_addr_dev1, long, 0644);
+MODULE_PARM_DESC(spm_addr_dev1,
+   "Specify start address for SPM (special purpose memory) used 
for device 1. By setting this Generic device type will be used. Make sure 
spm_addr_dev0 is set too");
+
 static const struct dev_pagemap_ops dmirror_devmem_ops;
 static const struct mmu_interval_notifier_ops dmirror_min_ops;
 static dev_t dmirror_dev;
@@ -450,11 +460,11 @@ static int dmirror_write(struct dmirror *dmirror, struct 
hmm_dmirror_cmd *cmd)
return ret;
 }
 
-static bool dmirror_allocate_chunk(struct dmirror_device *mdevice,
+static int dmirror_allocate_chunk(struct dmirror_device *mdevice,
   struct page **ppage)
 {
struct dmirror_chunk *devmem;
-   struct resource *res;
+   struct resource *res = NULL;
unsigned long pfn;
unsigned long pfn_first;
unsigned long pfn_last;
@@ -462,15 +472,26 @@ static bool dmirror_allocate_chunk(struct dmirror_device 
*mdevice,
 
devmem = kzalloc(sizeof(*devmem), GFP_KERNEL);
if (!devmem)
-   return false;
+   return -ENOMEM;
+
+   if (!spm_addr_dev0 && !spm_addr_dev1) {
+   res = request_free_mem_region(&iomem_resource, DEVMEM_CHUNK_SIZE,
+ "hmm_dmirror");
+   devmem->pagemap.type = MEMORY_DEVICE_PRIVATE;
+   mdevice->zone_device_type = HMM_DMIRROR_MEMORY_DEVICE_PRIVATE;
+   } else if (spm_addr_dev0 && spm_addr_dev1) {
+   res = lookup_resource(&iomem_resource, MINOR(mdevice->cdevice.dev) ?
+   spm_addr_dev0 :
+   spm_addr_dev1);
+   devmem->pagemap.type = MEMORY_DEVICE_GENERIC;
+   mdevice->zone_device_type = HMM_DMIRROR_MEMORY_DEVICE_GENERIC;
+   } else {
+   pr_err("Both spm_addr_dev parameters should be set\n");
+   }
 
-   res = request_free_mem_region(&iomem_resource, DEVMEM_CHUNK_SIZE,
- "hmm_dmirror");
if (IS_ERR(res))
goto err_devmem;
 
-   mdevice->zone_device_type = HMM_DMIRROR_MEMORY_DEVICE_PRIVATE;
-   devmem->pagemap.type = MEMORY_DEVICE_PRIVATE;
devmem->pagemap.range.start = res->start;
devmem->pagemap.range.end = res->end;
devmem->pagemap.nr_range = 1;
@@ -1097,10 +1118,8 @@ static int dmirror_device_init(struct dmirror_device 
*mdevice, int id)
if (ret)
return ret;
 
-   /* Build a list of free ZONE_DEVICE private struct pages */
-   dmirror_allocate_chunk(mdevice, NULL);
-
-   return 0;
+   /* Build a list of free ZONE_DEVICE struct pages */
+   return dmirror_allocate_chunk(mdevice, NULL);
 }
 
 static void dmirror_device_remove(struct dmirror_device *mdevice)
@@ -1113,8 +1132,9 @@ static void dmirror_device_remove(struct dmirror_device 
*mdevice)
mdevice->devmem_chunks[i];
 
memunmap_pages(&devmem->pagemap);
-   release_mem_region(devmem->pagemap.range.start,
-  range_len(&devmem->pagemap.range));
+   if (devmem->pagemap.type == MEMORY_DEVICE_PRIVATE)
+   release_mem_region(devmem->pagemap.range.start,
+  range_len(&devmem->pagemap.range));
kfree(devmem);
}
kfree(mdevice->devmem_chunks);
diff --git a/lib/test_hmm_uapi.h b/lib/test_hmm_uapi.h
index ee88701793d5..17a6b5059871 100644
--- a/lib/test_hmm_uapi.h
+++ b/lib/test_hmm_uapi.h
@@ -65,6 +65,7 @@ enum {
 enum {
/* 0 is reserved to catch uninitialized type fields */
  

[PATCH v5 09/13] lib: test_hmm add ioctl to get zone device type

2021-08-12 Thread Alex Sierra
New ioctl command added to query the zone device type. This will be
used once test_hmm adds the zone device generic type.

Signed-off-by: Alex Sierra 
---
 lib/test_hmm.c  | 15 ++-
 lib/test_hmm_uapi.h |  7 +++
 2 files changed, 21 insertions(+), 1 deletion(-)

diff --git a/lib/test_hmm.c b/lib/test_hmm.c
index 6998f10350ea..3cd91ca31dd7 100644
--- a/lib/test_hmm.c
+++ b/lib/test_hmm.c
@@ -82,6 +82,7 @@ struct dmirror_chunk {
 struct dmirror_device {
struct cdev cdevice;
struct hmm_devmem   *devmem;
+   unsigned intzone_device_type;
 
unsigned intdevmem_capacity;
unsigned intdevmem_count;
@@ -468,6 +469,7 @@ static bool dmirror_allocate_chunk(struct dmirror_device 
*mdevice,
if (IS_ERR(res))
goto err_devmem;
 
+   mdevice->zone_device_type = HMM_DMIRROR_MEMORY_DEVICE_PRIVATE;
devmem->pagemap.type = MEMORY_DEVICE_PRIVATE;
devmem->pagemap.range.start = res->start;
devmem->pagemap.range.end = res->end;
@@ -912,6 +914,15 @@ static int dmirror_snapshot(struct dmirror *dmirror,
return ret;
 }
 
+static int dmirror_get_device_type(struct dmirror *dmirror,
+   struct hmm_dmirror_cmd *cmd)
+{
+   mutex_lock(&dmirror->mutex);
+   cmd->zone_device_type = dmirror->mdevice->zone_device_type;
+   mutex_unlock(&dmirror->mutex);
+
+   return 0;
+}
 static long dmirror_fops_unlocked_ioctl(struct file *filp,
unsigned int command,
unsigned long arg)
@@ -952,7 +963,9 @@ static long dmirror_fops_unlocked_ioctl(struct file *filp,
case HMM_DMIRROR_SNAPSHOT:
ret = dmirror_snapshot(dmirror, &cmd);
break;
-
+   case HMM_DMIRROR_GET_MEM_DEV_TYPE:
+   ret = dmirror_get_device_type(dmirror, &cmd);
+   break;
default:
return -EINVAL;
}
diff --git a/lib/test_hmm_uapi.h b/lib/test_hmm_uapi.h
index 670b4ef2a5b6..ee88701793d5 100644
--- a/lib/test_hmm_uapi.h
+++ b/lib/test_hmm_uapi.h
@@ -26,6 +26,7 @@ struct hmm_dmirror_cmd {
__u64   npages;
__u64   cpages;
__u64   faults;
+   __u64   zone_device_type;
 };
 
 /* Expose the address space of the calling process through hmm device file */
@@ -33,6 +34,7 @@ struct hmm_dmirror_cmd {
 #define HMM_DMIRROR_WRITE  _IOWR('H', 0x01, struct hmm_dmirror_cmd)
 #define HMM_DMIRROR_MIGRATE_IOWR('H', 0x02, struct hmm_dmirror_cmd)
 #define HMM_DMIRROR_SNAPSHOT   _IOWR('H', 0x03, struct hmm_dmirror_cmd)
+#define HMM_DMIRROR_GET_MEM_DEV_TYPE   _IOWR('H', 0x04, struct hmm_dmirror_cmd)
 
 /*
  * Values returned in hmm_dmirror_cmd.ptr for HMM_DMIRROR_SNAPSHOT.
@@ -60,4 +62,9 @@ enum {
HMM_DMIRROR_PROT_DEV_PRIVATE_REMOTE = 0x30,
 };
 
+enum {
+   /* 0 is reserved to catch uninitialized type fields */
+   HMM_DMIRROR_MEMORY_DEVICE_PRIVATE = 1,
+};
+
 #endif /* _LIB_TEST_HMM_UAPI_H */
-- 
2.32.0



[PATCH v5 07/13] mm: add generic type support to migrate_vma helpers

2021-08-12 Thread Alex Sierra
Device generic type case added for migrate_vma_pages and
migrate_vma_check_page helpers.
Both generic and private device types have the same
conditions to decide to migrate pages from/to device
memory.

Signed-off-by: Alex Sierra 
---
 mm/migrate.c | 20 +---
 1 file changed, 9 insertions(+), 11 deletions(-)

diff --git a/mm/migrate.c b/mm/migrate.c
index 8c2430d3e77b..7bac06ae831e 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -2602,7 +2602,7 @@ static bool migrate_vma_check_page(struct page *page)
 * FIXME proper solution is to rework migration_entry_wait() so
 * it does not need to take a reference on page.
 */
-   return is_device_private_page(page);
+   return is_device_page(page);
}
 
/* For file back page */
@@ -2891,7 +2891,7 @@ EXPORT_SYMBOL(migrate_vma_setup);
  * handle_pte_fault()
  *   do_anonymous_page()
  * to map in an anonymous zero page but the struct page will be a ZONE_DEVICE
- * private page.
+ * private or generic page.
  */
 static void migrate_vma_insert_page(struct migrate_vma *migrate,
unsigned long addr,
@@ -2956,13 +2956,11 @@ static void migrate_vma_insert_page(struct migrate_vma 
*migrate,
 */
__SetPageUptodate(page);
 
-   if (is_zone_device_page(page)) {
-   if (is_device_private_page(page)) {
-   swp_entry_t swp_entry;
+   if (is_device_private_page(page)) {
+   swp_entry_t swp_entry;
 
-   swp_entry = make_device_private_entry(page, 
vma->vm_flags & VM_WRITE);
-   entry = swp_entry_to_pte(swp_entry);
-   }
+   swp_entry = make_device_private_entry(page, vma->vm_flags & 
VM_WRITE);
+   entry = swp_entry_to_pte(swp_entry);
} else {
entry = mk_pte(page, vma->vm_page_prot);
if (vma->vm_flags & VM_WRITE)
@@ -3064,10 +3062,10 @@ void migrate_vma_pages(struct migrate_vma *migrate)
mapping = page_mapping(page);
 
if (is_zone_device_page(newpage)) {
-   if (is_device_private_page(newpage)) {
+   if (is_device_page(newpage)) {
/*
-* For now only support private anonymous when
-* migrating to un-addressable device memory.
+* For now only support private and generic
+* anonymous when migrating to device memory.
 */
if (mapping) {
migrate->src[i] &= ~MIGRATE_PFN_MIGRATE;
-- 
2.32.0



[PATCH v5 04/13] drm/amdkfd: add SPM support for SVM

2021-08-12 Thread Alex Sierra
When the CPU is connected through XGMI, it has coherent
access to the VRAM resource. In this case that resource
is taken from a table in the device gmc aperture base.
This resource is used along with the device type, which could
be DEVICE_PRIVATE or DEVICE_GENERIC, to create the device
page map region.

Signed-off-by: Alex Sierra 
Reviewed-by: Felix Kuehling 
---
 drivers/gpu/drm/amd/amdkfd/kfd_migrate.c | 17 -
 1 file changed, 12 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
index 21b86c35a4f2..3e9315354ce4 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
@@ -916,6 +916,7 @@ int svm_migrate_init(struct amdgpu_device *adev)
struct resource *res;
unsigned long size;
void *r;
+   bool xgmi_connected_to_cpu = adev->gmc.xgmi.connected_to_cpu;
 
/* Page migration works on Vega10 or newer */
if (kfddev->device_info->asic_family < CHIP_VEGA10)
@@ -928,17 +929,22 @@ int svm_migrate_init(struct amdgpu_device *adev)
 * should remove reserved size
 */
size = ALIGN(adev->gmc.real_vram_size, 2ULL << 20);
-   res = devm_request_free_mem_region(adev->dev, &iomem_resource, size);
+   if (xgmi_connected_to_cpu)
+   res = lookup_resource(&iomem_resource, adev->gmc.aper_base);
+   else
+   res = devm_request_free_mem_region(adev->dev, &iomem_resource, size);
+
if (IS_ERR(res))
return -ENOMEM;
 
-   pgmap->type = MEMORY_DEVICE_PRIVATE;
pgmap->nr_range = 1;
pgmap->range.start = res->start;
pgmap->range.end = res->end;
+   pgmap->type = xgmi_connected_to_cpu ?
+   MEMORY_DEVICE_GENERIC : MEMORY_DEVICE_PRIVATE;
pgmap->ops = &svm_migrate_pgmap_ops;
pgmap->owner = SVM_ADEV_PGMAP_OWNER(adev);
-   pgmap->flags = MIGRATE_VMA_SELECT_DEVICE_PRIVATE;
+   pgmap->flags = 0;
r = devm_memremap_pages(adev->dev, pgmap);
if (IS_ERR(r)) {
pr_err("failed to register HMM device memory\n");
@@ -962,6 +968,7 @@ void svm_migrate_fini(struct amdgpu_device *adev)
struct dev_pagemap *pgmap = &adev->kfd.dev->pgmap;
 
devm_memunmap_pages(adev->dev, pgmap);
-   devm_release_mem_region(adev->dev, pgmap->range.start,
-   pgmap->range.end - pgmap->range.start + 1);
+   if (pgmap->type == MEMORY_DEVICE_PRIVATE)
+   devm_release_mem_region(adev->dev, pgmap->range.start,
+   pgmap->range.end - pgmap->range.start + 
1);
 }
-- 
2.32.0



[PATCH v5 02/13] mm: remove extra ZONE_DEVICE struct page refcount

2021-08-12 Thread Alex Sierra
From: Ralph Campbell 

ZONE_DEVICE struct pages have an extra reference count that complicates the
code for put_page() and several places in the kernel that need to check the
reference count to see that a page is not being used (gup, compaction,
migration, etc.). Clean up the code so the reference count doesn't need to
be treated specially for ZONE_DEVICE.

v2:
AS: merged this patch in linux 5.11 version

v5:
AS: add a condition in try_grab_page to check for the zone device type when the
page ref counter is checked as less than or equal to zero. For device zone pages,
the ref counter is initialized to zero.

Signed-off-by: Ralph Campbell 
Signed-off-by: Alex Sierra 
---
 arch/powerpc/kvm/book3s_hv_uvmem.c |  2 +-
 drivers/gpu/drm/nouveau/nouveau_dmem.c |  2 +-
 fs/dax.c   |  4 +-
 include/linux/dax.h|  2 +-
 include/linux/memremap.h   |  7 +--
 include/linux/mm.h | 46 +
 lib/test_hmm.c |  2 +-
 mm/internal.h  |  8 +++
 mm/memremap.c  | 68 +++---
 mm/migrate.c   |  5 --
 mm/page_alloc.c|  3 ++
 mm/swap.c  | 45 ++---
 12 files changed, 46 insertions(+), 148 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv_uvmem.c 
b/arch/powerpc/kvm/book3s_hv_uvmem.c
index 84e5a2dc8be5..acee67710620 100644
--- a/arch/powerpc/kvm/book3s_hv_uvmem.c
+++ b/arch/powerpc/kvm/book3s_hv_uvmem.c
@@ -711,7 +711,7 @@ static struct page *kvmppc_uvmem_get_page(unsigned long 
gpa, struct kvm *kvm)
 
dpage = pfn_to_page(uvmem_pfn);
dpage->zone_device_data = pvt;
-   get_page(dpage);
+   init_page_count(dpage);
lock_page(dpage);
return dpage;
 out_clear:
diff --git a/drivers/gpu/drm/nouveau/nouveau_dmem.c 
b/drivers/gpu/drm/nouveau/nouveau_dmem.c
index 92987daa5e17..8bc7120e1216 100644
--- a/drivers/gpu/drm/nouveau/nouveau_dmem.c
+++ b/drivers/gpu/drm/nouveau/nouveau_dmem.c
@@ -324,7 +324,7 @@ nouveau_dmem_page_alloc_locked(struct nouveau_drm *drm)
return NULL;
}
 
-   get_page(page);
+   init_page_count(page);
lock_page(page);
return page;
 }
diff --git a/fs/dax.c b/fs/dax.c
index 4820bb511d68..8d699c8e888b 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -560,14 +560,14 @@ static void *grab_mapping_entry(struct xa_state *xas,
 
 /**
  * dax_layout_busy_page_range - find first pinned page in @mapping
- * @mapping: address space to scan for a page with ref count > 1
+ * @mapping: address space to scan for a page with ref count > 0
  * @start: Starting offset. Page containing 'start' is included.
  * @end: End offset. Page containing 'end' is included. If 'end' is LLONG_MAX,
  *   pages from 'start' till the end of file are included.
  *
  * DAX requires ZONE_DEVICE mapped pages. These pages are never
  * 'onlined' to the page allocator so they are considered idle when
- * page->count == 1. A filesystem uses this interface to determine if
+ * page->count == 0. A filesystem uses this interface to determine if
  * any page in the mapping is busy, i.e. for DMA, or other
  * get_user_pages() usages.
  *
diff --git a/include/linux/dax.h b/include/linux/dax.h
index 8b5da1d60dbc..05fc982ce153 100644
--- a/include/linux/dax.h
+++ b/include/linux/dax.h
@@ -245,7 +245,7 @@ static inline bool dax_mapping(struct address_space 
*mapping)
 
 static inline bool dax_page_unused(struct page *page)
 {
-   return page_ref_count(page) == 1;
+   return page_ref_count(page) == 0;
 }
 
 #define dax_wait_page(_inode, _page, _wait_cb) \
diff --git a/include/linux/memremap.h b/include/linux/memremap.h
index 79c49e7f5c30..327f32427d21 100644
--- a/include/linux/memremap.h
+++ b/include/linux/memremap.h
@@ -66,9 +66,10 @@ enum memory_type {
 
 struct dev_pagemap_ops {
/*
-* Called once the page refcount reaches 1.  (ZONE_DEVICE pages never
-* reach 0 refcount unless there is a refcount bug. This allows the
-* device driver to implement its own memory management.)
+* Called once the page refcount reaches 0. The reference count
+* should be reset to one with init_page_count(page) before reusing
+* the page. This allows the device driver to implement its own
+* memory management.
 */
void (*page_free)(struct page *page);
 
diff --git a/include/linux/mm.h b/include/linux/mm.h
index c9900aedc195..c0fcb47d7641 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1117,39 +1117,6 @@ static inline bool is_zone_device_page(const struct page 
*page)
 }
 #endif
 
-#ifdef CONFIG_DEV_PAGEMAP_OPS
-void free_devmap_managed_page(struct page *page);
-DECLARE_STATIC_KEY_FALSE(devmap_managed_key);
-
-static inline bool page_is_devmap_managed(struct page *page)
-{
-   if 

[PATCH v5 12/13] tools: update hmm-test to support device generic type

2021-08-12 Thread Alex Sierra
Test cases such as migrate_fault and migrate_multiple
were modified to explicitly migrate from device to sys memory
without the need for page faults, when using the device generic
type.

The snapshot test case was updated to read the memory device type
first and, based on that, check the proper returned results. A
migrate_ping_pong test case was added to test explicit migration
from device to sys memory for both private and generic
zone types.

Helpers to migrate from device to sys memory and vice versa
were also added.

Signed-off-by: Alex Sierra 
---
 tools/testing/selftests/vm/hmm-tests.c | 142 +
 1 file changed, 124 insertions(+), 18 deletions(-)

diff --git a/tools/testing/selftests/vm/hmm-tests.c 
b/tools/testing/selftests/vm/hmm-tests.c
index 5d1ac691b9f4..70632b195497 100644
--- a/tools/testing/selftests/vm/hmm-tests.c
+++ b/tools/testing/selftests/vm/hmm-tests.c
@@ -44,6 +44,8 @@ struct hmm_buffer {
int fd;
uint64_tcpages;
uint64_tfaults;
+   int zone_device_type;
+   boolalloc_to_devmem;
 };
 
 #define TWOMEG (1 << 21)
@@ -133,6 +135,7 @@ static int hmm_dmirror_cmd(int fd,
cmd.addr = (__u64)buffer->ptr;
cmd.ptr = (__u64)buffer->mirror;
cmd.npages = npages;
+   cmd.alloc_to_devmem = buffer->alloc_to_devmem;
 
for (;;) {
ret = ioctl(fd, request, &cmd);
@@ -144,6 +147,7 @@ static int hmm_dmirror_cmd(int fd,
}
buffer->cpages = cmd.cpages;
buffer->faults = cmd.faults;
+   buffer->zone_device_type = cmd.zone_device_type;
 
return 0;
 }
@@ -211,6 +215,34 @@ static void hmm_nanosleep(unsigned int n)
nanosleep(&t, NULL);
 }
 
+static int hmm_migrate_sys_to_dev(int fd,
+  struct hmm_buffer *buffer,
+  unsigned long npages)
+{
+   buffer->alloc_to_devmem = true;
+   return hmm_dmirror_cmd(fd, HMM_DMIRROR_MIGRATE, buffer, npages);
+}
+
+static int hmm_migrate_dev_to_sys(int fd,
+  struct hmm_buffer *buffer,
+  unsigned long npages)
+{
+   buffer->alloc_to_devmem = false;
+   return hmm_dmirror_cmd(fd, HMM_DMIRROR_MIGRATE, buffer, npages);
+}
+
+static int hmm_is_private_device(int fd, bool *res)
+{
+   struct hmm_buffer buffer;
+   int ret;
+
+   buffer.ptr = 0;
ret = hmm_dmirror_cmd(fd, HMM_DMIRROR_GET_MEM_DEV_TYPE, &buffer, 1);
+   *res = (buffer.zone_device_type == HMM_DMIRROR_MEMORY_DEVICE_PRIVATE);
+
+   return ret;
+}
+
 /*
  * Simple NULL test of device open/close.
  */
@@ -875,7 +907,7 @@ TEST_F(hmm, migrate)
ptr[i] = i;
 
/* Migrate memory to device. */
-   ret = hmm_dmirror_cmd(self->fd, HMM_DMIRROR_MIGRATE, buffer, npages);
+   ret = hmm_migrate_sys_to_dev(self->fd, buffer, npages);
ASSERT_EQ(ret, 0);
ASSERT_EQ(buffer->cpages, npages);
 
@@ -923,7 +955,7 @@ TEST_F(hmm, migrate_fault)
ptr[i] = i;
 
/* Migrate memory to device. */
-   ret = hmm_dmirror_cmd(self->fd, HMM_DMIRROR_MIGRATE, buffer, npages);
+   ret = hmm_migrate_sys_to_dev(self->fd, buffer, npages);
ASSERT_EQ(ret, 0);
ASSERT_EQ(buffer->cpages, npages);
 
@@ -936,7 +968,7 @@ TEST_F(hmm, migrate_fault)
ASSERT_EQ(ptr[i], i);
 
/* Migrate memory to the device again. */
-   ret = hmm_dmirror_cmd(self->fd, HMM_DMIRROR_MIGRATE, buffer, npages);
+   ret = hmm_migrate_sys_to_dev(self->fd, buffer, npages);
ASSERT_EQ(ret, 0);
ASSERT_EQ(buffer->cpages, npages);
 
@@ -976,7 +1008,7 @@ TEST_F(hmm, migrate_shared)
ASSERT_NE(buffer->ptr, MAP_FAILED);
 
/* Migrate memory to device. */
-   ret = hmm_dmirror_cmd(self->fd, HMM_DMIRROR_MIGRATE, buffer, npages);
+   ret = hmm_migrate_sys_to_dev(self->fd, buffer, npages);
ASSERT_EQ(ret, -ENOENT);
 
hmm_buffer_free(buffer);
@@ -1015,7 +1047,7 @@ TEST_F(hmm2, migrate_mixed)
p = buffer->ptr;
 
/* Migrating a protected area should be an error. */
-   ret = hmm_dmirror_cmd(self->fd1, HMM_DMIRROR_MIGRATE, buffer, npages);
+   ret = hmm_migrate_sys_to_dev(self->fd1, buffer, npages);
ASSERT_EQ(ret, -EINVAL);
 
/* Punch a hole after the first page address. */
@@ -1023,7 +1055,7 @@ TEST_F(hmm2, migrate_mixed)
ASSERT_EQ(ret, 0);
 
/* We expect an error if the vma doesn't cover the range. */
-   ret = hmm_dmirror_cmd(self->fd1, HMM_DMIRROR_MIGRATE, buffer, 3);
+   ret = hmm_migrate_sys_to_dev(self->fd1, buffer, 3);
ASSERT_EQ(ret, -EINVAL);
 
/* Page 2 will be a read-only zero page. */
@@ -1055,13 +1087,13 @@ TEST_F(hmm2, migrate_mixed)
 
/* Now try to migrate pages 2-5 to device 1. */
buffer->ptr = p + 2 * self->page_size;
-   ret = hmm_dmirror_cmd(self->fd1, HMM_DMIRROR_MIGRATE, buffer, 4);
+   ret = 

[PATCH v5 05/13] drm/amdkfd: generic type as sys mem on migration to ram

2021-08-12 Thread Alex Sierra
Generic device type memory, on VRAM to RAM migration,
has similar access as system RAM from the CPU. This flag sets
the migration source, which in the generic type case
should be set as SYSTEM.

Signed-off-by: Alex Sierra 
Reviewed-by: Felix Kuehling 
---
 drivers/gpu/drm/amd/amdkfd/kfd_migrate.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
index 3e9315354ce4..c511932e56d7 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
@@ -664,9 +664,12 @@ svm_migrate_vma_to_ram(struct amdgpu_device *adev, struct 
svm_range *prange,
migrate.vma = vma;
migrate.start = start;
migrate.end = end;
-   migrate.flags = MIGRATE_VMA_SELECT_DEVICE_PRIVATE;
migrate.pgmap_owner = SVM_ADEV_PGMAP_OWNER(adev);
 
+   if (adev->gmc.xgmi.connected_to_cpu)
+   migrate.flags = MIGRATE_VMA_SELECT_SYSTEM;
+   else
+   migrate.flags = MIGRATE_VMA_SELECT_DEVICE_PRIVATE;
size = 2 * sizeof(*migrate.src) + sizeof(uint64_t) + sizeof(dma_addr_t);
size *= npages;
buf = kvmalloc(size, GFP_KERNEL | __GFP_ZERO);
-- 
2.32.0



[PATCH v5 08/13] mm: call pgmap->ops->page_free for DEVICE_GENERIC pages

2021-08-12 Thread Alex Sierra
Add MEMORY_DEVICE_GENERIC case to free_zone_device_page callback.
Device generic type memory case is now able to free its pages properly.

Signed-off-by: Alex Sierra 
---
 mm/memremap.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/mm/memremap.c b/mm/memremap.c
index 614b3d600e95..6c884e2542a9 100644
--- a/mm/memremap.c
+++ b/mm/memremap.c
@@ -438,7 +438,7 @@ struct dev_pagemap *get_dev_pagemap(unsigned long pfn,
 EXPORT_SYMBOL_GPL(get_dev_pagemap);
 
 #ifdef CONFIG_DEV_PAGEMAP_OPS
-static void free_device_private_page(struct page *page)
+static void free_device_page(struct page *page)
 {
 
__ClearPageWaiters(page);
@@ -477,7 +477,8 @@ void free_zone_device_page(struct page *page)
wake_up_var(&page->_refcount);
return;
case MEMORY_DEVICE_PRIVATE:
-   free_device_private_page(page);
+   case MEMORY_DEVICE_GENERIC:
+   free_device_page(page);
return;
default:
return;
-- 
2.32.0



[PATCH v5 03/13] kernel: resource: lookup_resource as exported symbol

2021-08-12 Thread Alex Sierra
The AMD architecture for the Frontier supercomputer will
have device memory which can be coherently accessed by
the CPU. The system BIOS advertises this memory as SPM
(special purpose memory) in the UEFI system address map.

The AMDGPU driver needs to be able to lookup this resource
in order to claim it as MEMORY_DEVICE_GENERIC using
devm_memremap_pages.

Signed-off-by: Alex Sierra 
---
 kernel/resource.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/kernel/resource.c b/kernel/resource.c
index 627e61b0c124..2d4abd395c73 100644
--- a/kernel/resource.c
+++ b/kernel/resource.c
@@ -783,6 +783,7 @@ struct resource *lookup_resource(struct resource *root, 
resource_size_t start)
 
return res;
 }
+EXPORT_SYMBOL_GPL(lookup_resource);
 
 /*
  * Insert a resource into the resource tree. If successful, return NULL,
-- 
2.32.0



[PATCH v5 06/13] include/linux/mm.h: helpers to check zone device generic type

2021-08-12 Thread Alex Sierra
Two helpers added. One checks if zone device page is generic
type. The other if page is either private or generic type.

Signed-off-by: Alex Sierra 
---
 include/linux/mm.h | 8 
 1 file changed, 8 insertions(+)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index c0fcb47d7641..8853acb89b1a 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1125,6 +1125,14 @@ static inline bool is_device_private_page(const struct 
page *page)
page->pgmap->type == MEMORY_DEVICE_PRIVATE;
 }
 
+static inline bool is_device_page(const struct page *page)
+{
+   return IS_ENABLED(CONFIG_DEV_PAGEMAP_OPS) &&
+   is_zone_device_page(page) &&
+   (page->pgmap->type == MEMORY_DEVICE_PRIVATE ||
+page->pgmap->type == MEMORY_DEVICE_GENERIC);
+}
+
 static inline bool is_pci_p2pdma_page(const struct page *page)
 {
return IS_ENABLED(CONFIG_DEV_PAGEMAP_OPS) &&
-- 
2.32.0



[PATCH v5 01/13] ext4/xfs: add page refcount helper

2021-08-12 Thread Alex Sierra
From: Ralph Campbell 

There are several places where ZONE_DEVICE struct pages assume a reference
count == 1 means the page is idle and free. Instead of open coding this,
add a helper function to hide this detail.

v3:
[AS]: rename dax_layout_is_idle_page func to dax_page_unused

v4:
[AS]: This ref count functionality was missing on fuse/dax.c.

Signed-off-by: Ralph Campbell 
Signed-off-by: Alex Sierra 
---
 fs/dax.c|  4 ++--
 fs/ext4/inode.c |  5 +
 fs/fuse/dax.c   |  4 +---
 fs/xfs/xfs_file.c   |  4 +---
 include/linux/dax.h | 10 ++
 5 files changed, 15 insertions(+), 12 deletions(-)

diff --git a/fs/dax.c b/fs/dax.c
index 26d5dcd2d69e..4820bb511d68 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -358,7 +358,7 @@ static void dax_disassociate_entry(void *entry, struct 
address_space *mapping,
for_each_mapped_pfn(entry, pfn) {
struct page *page = pfn_to_page(pfn);
 
-   WARN_ON_ONCE(trunc && page_ref_count(page) > 1);
+   WARN_ON_ONCE(trunc && !dax_page_unused(page));
WARN_ON_ONCE(page->mapping && page->mapping != mapping);
page->mapping = NULL;
page->index = 0;
@@ -372,7 +372,7 @@ static struct page *dax_busy_page(void *entry)
for_each_mapped_pfn(entry, pfn) {
struct page *page = pfn_to_page(pfn);
 
-   if (page_ref_count(page) > 1)
+   if (!dax_page_unused(page))
return page;
}
return NULL;
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index c173c8405856..9ee00186412f 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -3972,10 +3972,7 @@ int ext4_break_layouts(struct inode *inode)
if (!page)
return 0;
 
-   error = ___wait_var_event(&page->_refcount,
-   atomic_read(&page->_refcount) == 1,
-   TASK_INTERRUPTIBLE, 0, 0,
-   ext4_wait_dax_page(ei));
+   error = dax_wait_page(ei, page, ext4_wait_dax_page);
} while (error == 0);
 
return error;
diff --git a/fs/fuse/dax.c b/fs/fuse/dax.c
index ff99ab2a3c43..2b1f190ba78a 100644
--- a/fs/fuse/dax.c
+++ b/fs/fuse/dax.c
@@ -677,9 +677,7 @@ static int __fuse_dax_break_layouts(struct inode *inode, 
bool *retry,
return 0;
 
*retry = true;
-   return ___wait_var_event(&page->_refcount,
-   atomic_read(&page->_refcount) == 1, TASK_INTERRUPTIBLE,
-   0, 0, fuse_wait_dax_page(inode));
+   return dax_wait_page(inode, page, fuse_wait_dax_page);
 }
 
 /* dmap_end == 0 leads to unmapping of whole file */
diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index 5b0f93f73837..39565fe5f817 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -782,9 +782,7 @@ xfs_break_dax_layouts(
return 0;
 
*retry = true;
-   return ___wait_var_event(&page->_refcount,
-   atomic_read(&page->_refcount) == 1, TASK_INTERRUPTIBLE,
-   0, 0, xfs_wait_dax_page(inode));
+   return dax_wait_page(inode, page, xfs_wait_dax_page);
 }
 
 int
diff --git a/include/linux/dax.h b/include/linux/dax.h
index b52f084aa643..8b5da1d60dbc 100644
--- a/include/linux/dax.h
+++ b/include/linux/dax.h
@@ -243,6 +243,16 @@ static inline bool dax_mapping(struct address_space 
*mapping)
return mapping->host && IS_DAX(mapping->host);
 }
 
+static inline bool dax_page_unused(struct page *page)
+{
+   return page_ref_count(page) == 1;
+}
+
+#define dax_wait_page(_inode, _page, _wait_cb) \
+   ___wait_var_event(&(_page)->_refcount,  \
+   dax_page_unused(_page), \
+   TASK_INTERRUPTIBLE, 0, 0, _wait_cb(_inode))
+
 #ifdef CONFIG_DEV_DAX_HMEM_DEVICES
 void hmem_register_device(int target_nid, struct resource *r);
 #else
-- 
2.32.0



[PATCH v5 00/13] Support DEVICE_GENERIC memory in migrate_vma_*

2021-08-12 Thread Alex Sierra
v1:
AMD is building a system architecture for the Frontier supercomputer with a
coherent interconnect between CPUs and GPUs. This hardware architecture allows
the CPUs to coherently access GPU device memory. We have hardware in our labs
and we are working with our partner HPE on the BIOS, firmware and software
for delivery to the DOE.

The system BIOS advertises the GPU device memory (aka VRAM) as SPM
(special purpose memory) in the UEFI system address map. The amdgpu driver looks
it up with lookup_resource and registers it with devmap as MEMORY_DEVICE_GENERIC
using devm_memremap_pages.

Now we're trying to migrate data to and from that memory using the migrate_vma_*
helpers so we can support page-based migration in our unified memory 
allocations,
while also supporting CPU access to those pages.

This patch series makes a few changes to make MEMORY_DEVICE_GENERIC pages behave
correctly in the migrate_vma_* helpers. We are looking for feedback about this
approach. If we're close, what's needed to make our patches acceptable upstream?
If we're not close, any suggestions how else to achieve what we are trying to do
(i.e. page migration and coherent CPU access to VRAM)?

This work is based on HMM and our SVM memory manager that was recently 
upstreamed
to Dave Airlie's drm-next branch
https://cgit.freedesktop.org/drm/drm/log/?h=drm-next
On top of that we did some rework of our VRAM management for migrations to 
remove
some incorrect assumptions, allow partially successful migrations and GPU memory
mappings that mix pages in VRAM and system memory.
https://lore.kernel.org/dri-devel/20210527205606.2660-6-felix.kuehl...@amd.com/T/#r996356015e295780eb50453e7dbd5d0d68b47cbc

v2:
This patch series version has merged "[RFC PATCH v3 0/2]
mm: remove extra ZONE_DEVICE struct page refcount" patch series made by
Ralph Campbell. It also applies at the top of these series, our changes
to support device generic type in migration_vma helpers.
This has been tested in systems with device memory that has coherent
access by CPU.

Also addresses the following feedback made in v1:
- Isolate in one patch kernel/resource.c modification, based
on Christoph's feedback.
- Add helpers check for generic and private type to avoid
duplicated long lines.

v3:
- Include cover letter from v1.
- Rename dax_layout_is_idle_page func to dax_page_unused in patch
ext4/xfs: add page refcount helper.

v4:
- Add support for zone device generic type in lib/test_hmm and
tool/testing/selftest/vm/hmm-tests.
- Add missing page refcount helper to fuse/dax.c. This was included in
one of Ralph Campbell's patches.

v5:
- Cosmetic changes on patches 3, 5 and 13
- Bug found in test_hmm: remove devmem->pagemap.type = MEMORY_DEVICE_PRIVATE
at dmirror_allocate_chunk that was forcing pagemap.type to be configured as
MEMORY_DEVICE_PRIVATE.
- A bug was found while running one of the xfstests (generic/413) used to
validate the fs_dax device type. This was first introduced by patch "mm: remove
extra ZONE_DEVICE struct page refcount", which is part of this patch series.
The bug showed up as a WARNING message at the try_grab_page function call, due to
a page refcount equal to zero. Part of the "mm: remove extra ZONE_DEVICE struct
page refcount" changes was to initialize the page refcount to zero. Therefore,
a special condition was added to try_grab_page in this v5, where it checks for
device zone pages too. It is included in the same patch.
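As a rough illustration of that condition (not the literal hunk, which is in
patch 02 of this series):

	/*
	 * ZONE_DEVICE pages now start with a zero refcount, so a refcount
	 * less than or equal to zero is only a bug for non-device pages.
	 */
	if (WARN_ON_ONCE(page_ref_count(page) <= 0 &&
			 !is_zone_device_page(page)))
		return false;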

This is how mm changes from these patch series have been validated:
- hmm-tests were run using the device private and device generic types. The latter
was just added in this patch series. efi_fake_mem was used to mimic SPM memory
for device generic.
- The xfstests tool was used to validate the fs-dax device type and page refcount
changes. A DAX configuration was used along with emulated Persistent Memory set as
memmap=4G!4G memmap=4G!9G. xfstests were run from the ext4 and generic lists. Some
of them did not run due to limitations in configuration, e.g. tests not
supporting a specific file system or DAX mode.
Only three tests failed, generic/356/357 and ext4/049. However, these failures
were consistent before and after applying these patch series.
xfstest configuration:
TEST_DEV=/dev/pmem0
TEST_DIR=/mnt/ram0
SCRATCH_DEV=/dev/pmem1
SCRATCH_MNT=/mnt/ram1
TEST_FS_MOUNT_OPTS="-o dax"
EXT_MOUNT_OPTIONS="-o dax"
MKFS_OPTIONS="-b4096"
xfstest passed list:
Ext4:
001,003,005,021,022,023,025,026,030,031,032,036,037,038,042,043,044,271,306
Generic:
1,2,3,4,5,6,7,8,9,11,12,13,14,15,16,20,21,22,23,24,25,28,29,30,31,32,33,35,37,
50,52,53,58,60,61,62,63,64,67,69,70,71,75,76,78,79,80,82,84,86,87,88,91,92,94,
96,97,98,99,103,105,112,113,114,117,120,124,126,129,130,131,135,141,169,184,
198,207,210,211,212,213,214,215,221,223,225,228,236,237,240,244,245,246,247,
248,249,255,257,258,263,277,286,294,306,307,308,309,313,315,316,318,319,337,
346,360,361,371,375,377,379,380,383,384,385,386,389,391,392,393,394,400,401,
403,404,406,409,410,411,412,413,417,420,422,423,424,425,426,427,428

Patches