Re: [PATCH 4/4] drm/amdkfd: implement counters for vm fault and migration

2021-06-22 Thread Felix Kuehling
Am 2021-06-22 um 9:32 a.m. schrieb Philip Yang:
> Add helper function to get process device data structure from adev to
> update counters.
>
> Updates to the vm faults, page_in and page_out counters will not be
> executed in parallel; use WRITE_ONCE to avoid any form of compiler
> optimization.
>
> Signed-off-by: Philip Yang 
> ---
>  drivers/gpu/drm/amd/amdkfd/kfd_migrate.c | 14 ++
>  drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 24 
>  drivers/gpu/drm/amd/amdkfd/kfd_svm.h |  2 ++
>  3 files changed, 40 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c 
> b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
> index fd8f544f0de2..45b5349283af 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
> @@ -413,6 +413,7 @@ svm_migrate_vma_to_vram(struct amdgpu_device *adev, 
> struct svm_range *prange,
>   uint64_t end)
>  {
>   uint64_t npages = (end - start) >> PAGE_SHIFT;
> + struct kfd_process_device *pdd;
>   struct dma_fence *mfence = NULL;
>   struct migrate_vma migrate;
>   dma_addr_t *scratch;
> @@ -473,6 +474,12 @@ svm_migrate_vma_to_vram(struct amdgpu_device *adev, 
> struct svm_range *prange,
>  out_free:
>   kvfree(buf);
>  out:
> + if (!r) {
> + pdd = svm_range_get_pdd_by_adev(prange, adev);
> + if (pdd)
> + WRITE_ONCE(pdd->page_in, pdd->page_in + migrate.cpages);
> + }
> +
>   return r;
>  }
>  
> @@ -629,6 +636,7 @@ svm_migrate_vma_to_ram(struct amdgpu_device *adev, struct 
> svm_range *prange,
>  struct vm_area_struct *vma, uint64_t start, uint64_t end)
>  {
>   uint64_t npages = (end - start) >> PAGE_SHIFT;
> + struct kfd_process_device *pdd;
>   struct dma_fence *mfence = NULL;
>   struct migrate_vma migrate;
>   dma_addr_t *scratch;
> @@ -678,6 +686,12 @@ svm_migrate_vma_to_ram(struct amdgpu_device *adev, 
> struct svm_range *prange,
>  out_free:
>   kvfree(buf);
>  out:
> + if (!r) {
> + pdd = svm_range_get_pdd_by_adev(prange, adev);
> + if (pdd)
> + WRITE_ONCE(pdd->page_out,
> +pdd->page_out + migrate.cpages);
> + }
>   return r;
>  }
>  
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c 
> b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
> index 5468ea4264c6..f3323328f01f 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
> @@ -564,6 +564,24 @@ svm_range_get_adev_by_id(struct svm_range *prange, 
> uint32_t gpu_id)
>   return (struct amdgpu_device *)pdd->dev->kgd;
>  }
>  
> +struct kfd_process_device *
> +svm_range_get_pdd_by_adev(struct svm_range *prange, struct amdgpu_device 
> *adev)
> +{
> + struct kfd_process *p;
> + int32_t gpu_idx, gpuid;
> + int r;
> +
> + p = container_of(prange->svms, struct kfd_process, svms);
> +
> + r = kfd_process_gpuid_from_kgd(p, adev, &gpuid, &gpu_idx);
> + if (r) {
> + pr_debug("failed to get device id by adev %p\n", adev);
> + return NULL;
> + }
> +
> + return kfd_process_device_from_gpuidx(p, gpu_idx);
> +}
> +
>  static int svm_range_bo_validate(void *param, struct amdgpu_bo *bo)
>  {
>   struct ttm_operation_ctx ctx = { false, false };
> @@ -2315,6 +2333,7 @@ int
>  svm_range_restore_pages(struct amdgpu_device *adev, unsigned int pasid,
>   uint64_t addr)
>  {
> + struct kfd_process_device *pdd;
>   struct mm_struct *mm = NULL;
>   struct svm_range_list *svms;
>   struct svm_range *prange;
> @@ -2440,6 +2459,11 @@ svm_range_restore_pages(struct amdgpu_device *adev, 
> unsigned int pasid,
>  out_unlock_svms:
> mutex_unlock(&svms->lock);
>   mmap_read_unlock(mm);
> +
> + pdd = svm_range_get_pdd_by_adev(prange, adev);

svm_range_get_pdd_by_adev needs to do a linear search. You don't need
this here because you already know the gpuidx. I think you can just call
kfd_process_device_from_gpuidx(p, gpu_idx) here.
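
In other words, something like this sketch (assuming p and the gpu_idx
resolved by the earlier kfd_process_gpuid_from_kgd() call are still in
scope at this point in svm_range_restore_pages):

	/* gpu_idx was already resolved above, so look the pdd up
	 * directly instead of doing another linear search by adev */
	pdd = kfd_process_device_from_gpuidx(p, gpu_idx);
	if (pdd)
		WRITE_ONCE(pdd->faults, pdd->faults + 1);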

With that fixed, the series is

Reviewed-by: Felix Kuehling 


P.S.: Thanks for catching and fixing those memory leaks in patch 2.


> + if (pdd)
> + WRITE_ONCE(pdd->faults, pdd->faults + 1);
> +
>   mmput(mm);
>  out:
>   kfd_unref_process(p);
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.h 
> b/drivers/gpu/drm/amd/amdkfd/kfd_svm.h
> index 0c0fc399395e..a9af03994d1a 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.h
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.h
> @@ -174,6 +174,8 @@ void svm_range_dma_unmap(struct device *dev, dma_addr_t 
> *dma_addr,
>unsigned long offset, unsigned long npages);
>  void svm_range_free_dma_mappings(struct svm_range *prange);
>  void svm_range_prefault(struct svm_range *prange, struct mm_struct *mm);
> +struct kfd_process_device *
> +svm_range_get_pdd_by_adev(struct svm_range *prange, struct amdgpu_device 
> *adev);
>  

[pull] amdgpu, amdkfd, radeon drm-next-5.14

2021-06-22 Thread Alex Deucher
Hi Dave, Daniel,

Fixes for 5.14.

The following changes since commit d472b36efbf8a27dc8a80519db8b5a8caffe42b6:

  Merge tag 'amd-drm-next-5.14-2021-06-16' of 
https://gitlab.freedesktop.org/agd5f/linux into drm-next (2021-06-18 12:55:09 
+1000)

are available in the Git repository at:

  https://gitlab.freedesktop.org/agd5f/linux.git 
tags/amd-drm-next-5.14-2021-06-22-1

for you to fetch changes up to 8fe44c080a53ac0ccbe88053a2e40f9acca33091:

  drm/amdgpu/display: fold DRM_AMD_DC_DCN3_1 into DRM_AMD_DC_DCN (2021-06-22 
16:51:45 -0400)


amd-drm-next-5.14-2021-06-22-1:

amdgpu:
- Userptr BO fixes
- RAS fixes
- Beige Goby fixes
- Add some missing freesync documentation
- Aldebaran fixes
- SR-IOV fixes
- Potential memory corruption fix in framebuffer handling
- Revert GFX9, 10 doorbell fixes, we just
  end up trading one bug for another
- Multi-plane cursor fixes with rotation
- LTTPR fixes
- Backlight fixes
- eDP fix
- Fold DRM_AMD_DC_DCN3_1 into DRM_AMD_DC_DCN
- Misc code cleanups

amdkfd:
- Topology fix
- Locking fix

radeon:
- Misc code cleanup


Alex Deucher (2):
  drm/amdgpu/vcn3: drop extraneous Beige Goby hunk
  drm/amdgpu/display: fold DRM_AMD_DC_DCN3_1 into DRM_AMD_DC_DCN

Anthony Koo (1):
  drm/amd/display: [FW Promotion] Release 0.0.71

Aric Cyr (2):
  drm/amd/display: Multiplane cursor position incorrect when plane rotated
  drm/amd/display: 3.2.141

Ashish Pawar (1):
  drm/amdgpu: PWRBRK sequence changes for Aldebaran

Aurabindo Pillai (2):
  drm/amd/display: Increase stutter watermark for dcn302 and dcn303
  drm/amd/display: get socBB from VBIOS for dcn302 and dcn303

Bernard Zhao (1):
  drm/radeon: delete useless function return values & remove meaningless 
if(r) check code

Bokun Zhang (1):
  drm/amd/amdgpu: Use IP discovery data to determine VCN enablement instead 
of MMSCH

Charlene Liu (1):
  drm/amd/display: get refclk from MICROSECOND_TIME_BASE_DIV HW register

Darren Powell (1):
  amdgpu/pm: replaced snprintf usage in amdgpu_pm.c with sysfs_emit

Eric Huang (1):
  drm/amdkfd: Set iolink non-coherent in topology

Gustavo A. R. Silva (1):
  drm/amd/display: Fix fall-through warning for Clang

Josip Pavic (1):
  drm/amd/display: do not compare integers of different widths

Logush Oliver (1):
  drm/amd/display: Fix edp_bootup_bl_level initialization issue

Martin Tsai (1):
  drm/amd/display: Clear lane settings after LTTPRs have been trained

Michel Dänzer (1):
  drm/amdgpu: Call drm_framebuffer_init last for framebuffer init

Nikola Cornij (1):
  drm/amd/display: Clamp VStartup value at DML calculations time

Pu Lehui (2):
  drm/amd/display: Fix gcc unused variable warning
  drm/amd/display: remove unused variable 'dc'

Rodrigo Siqueira (1):
  drm/amd/display: Add Freesync video documentation

Roman Li (1):
  drm/amd/display: Delay PSR entry

Shaokun Zhang (1):
  drm/amd/display: Remove the repeated dpp1_full_bypass declaration

Stanley.Yang (3):
  drm/amdgpu: add vega20 to ras quirk list
  drm/amdgpu: fix bad address translation for sienna_cichlid
  drm/amdgpu: message smu to update hbm bad page number

Stylon Wang (1):
  drm/amd/display: Revert "Guard ASSR with internal display flag"

Wan Jiabing (1):
  drm/display: Fix duplicated argument

Wesley Chalmers (1):
  drm/amd/display: Fix incorrect variable name

Yifan Zha (1):
  drm/amd/pm: Disable SMU messages in navi10 sriov

Yifan Zhang (3):
  drm/amdgpu: remove unused parameter in amdgpu_gart_bind
  Revert "drm/amdgpu/gfx10: enlarge CP_MEC_DOORBELL_RANGE_UPPER to cover 
full doorbell."
  Revert "drm/amdgpu/gfx9: fix the doorbell missing when in CGPG issue."

xinhui pan (2):
  drm/amdgpu: Set TTM_PAGE_FLAG_SG earlier for userptr BOs
  drm/amdkfd: Walk through list with dqm lock hold

 Documentation/gpu/amdgpu-dc.rst|   6 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c  |   8 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.h  |   3 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_display.c|  12 ++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c|  19 +++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c   |   3 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h   |   3 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c|   7 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c|  13 ++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_umc.c|   4 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_umc.h|   5 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c|  23 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.h|  13 +++
 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c |   6 +-
 drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c  |   6 +-
 drivers/gpu/drm/amd/amdgpu/gfx_v9_4_2.c|   5 -
 

[PATCH v2] drm/radeon: Fix NULL dereference when updating memory stats

2021-06-22 Thread Mikel Rychliski
radeon_ttm_bo_destroy() is attempting to access the resource object to
update memory counters. However, the resource object is already freed when
ttm calls this function via the destroy callback. This causes an oops when
a bo is freed:

BUG: kernel NULL pointer dereference, address: 0010
RIP: 0010:radeon_ttm_bo_destroy+0x2c/0x100 [radeon]
Call Trace:
 radeon_bo_unref+0x1a/0x30 [radeon]
 radeon_gem_object_free+0x33/0x50 [radeon]
 drm_gem_object_release_handle+0x69/0x70 [drm]
 drm_gem_handle_delete+0x62/0xa0 [drm]
 ? drm_mode_destroy_dumb+0x40/0x40 [drm]
 drm_ioctl_kernel+0xb2/0xf0 [drm]
 drm_ioctl+0x30a/0x3c0 [drm]
 ? drm_mode_destroy_dumb+0x40/0x40 [drm]
 radeon_drm_ioctl+0x49/0x80 [radeon]
 __x64_sys_ioctl+0x8e/0xd0

Avoid the issue by updating the counters in the delete_mem_notify callback
instead. Also, fix memory statistic updating in radeon_bo_move() to
identify the source type correctly. The source type needs to be saved
before the move, because the moved from object may be altered by the move.
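
For the radeon_bo_move() part (the radeon_ttm.c hunk is cut off in this
archive), the shape of the fix would be roughly this sketch —
old_mem_type is an illustrative local name, not necessarily the one used
in the patch:

	/* in radeon_bo_move(), capture the source type up front; the
	 * moved-from resource may be freed/altered by the move itself */
	unsigned int old_mem_type = bo->resource->mem_type;
	...
	/* only after the move succeeded */
	radeon_update_memory_usage(bo, old_mem_type, -1);
	radeon_update_memory_usage(bo, new_mem->mem_type, 1);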

Fixes: bfa3357ef9ab ("drm/ttm: allocate resource object instead of embedding it 
v2")
Signed-off-by: Mikel Rychliski 
---

v2: Update statistics on ghost object destroy

 drivers/gpu/drm/radeon/radeon_object.c | 33 -
 drivers/gpu/drm/radeon/radeon_object.h |  7 ---
 drivers/gpu/drm/radeon/radeon_ttm.c| 20 +---
 3 files changed, 29 insertions(+), 31 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon_object.c 
b/drivers/gpu/drm/radeon/radeon_object.c
index bfaaa3c969a3..e0f98b394acd 100644
--- a/drivers/gpu/drm/radeon/radeon_object.c
+++ b/drivers/gpu/drm/radeon/radeon_object.c
@@ -49,23 +49,23 @@ static void radeon_bo_clear_surface_reg(struct radeon_bo *bo);
  * function are calling it.
  */
 
-static void radeon_update_memory_usage(struct radeon_bo *bo,
-  unsigned mem_type, int sign)
+void radeon_update_memory_usage(struct ttm_buffer_object *bo,
+   unsigned int mem_type, int sign)
 {
-   struct radeon_device *rdev = bo->rdev;
+   struct radeon_device *rdev = radeon_get_rdev(bo->bdev);
 
switch (mem_type) {
case TTM_PL_TT:
if (sign > 0)
-   atomic64_add(bo->tbo.base.size, &rdev->gtt_usage);
+   atomic64_add(bo->base.size, &rdev->gtt_usage);
else
-   atomic64_sub(bo->tbo.base.size, &rdev->gtt_usage);
+   atomic64_sub(bo->base.size, &rdev->gtt_usage);
break;
case TTM_PL_VRAM:
if (sign > 0)
-   atomic64_add(bo->tbo.base.size, &rdev->vram_usage);
+   atomic64_add(bo->base.size, &rdev->vram_usage);
else
-   atomic64_sub(bo->tbo.base.size, &rdev->vram_usage);
+   atomic64_sub(bo->base.size, &rdev->vram_usage);
break;
}
 }
@@ -76,8 +76,6 @@ static void radeon_ttm_bo_destroy(struct ttm_buffer_object 
*tbo)
 
bo = container_of(tbo, struct radeon_bo, tbo);
 
-   radeon_update_memory_usage(bo, bo->tbo.resource->mem_type, -1);
-
mutex_lock(&bo->rdev->gem.mutex);
list_del_init(&bo->list);
mutex_unlock(&bo->rdev->gem.mutex);
@@ -726,25 +724,10 @@ int radeon_bo_check_tiling(struct radeon_bo *bo, bool 
has_moved,
return radeon_bo_get_surface_reg(bo);
 }
 
-void radeon_bo_move_notify(struct ttm_buffer_object *bo,
-  bool evict,
-  struct ttm_resource *new_mem)
+void radeon_bo_move_notify(struct radeon_bo *rbo)
 {
-   struct radeon_bo *rbo;
-
-   if (!radeon_ttm_bo_is_radeon_bo(bo))
-   return;
-
-   rbo = container_of(bo, struct radeon_bo, tbo);
radeon_bo_check_tiling(rbo, 0, 1);
radeon_vm_bo_invalidate(rbo->rdev, rbo);
-
-   /* update statistics */
-   if (!new_mem)
-   return;
-
-   radeon_update_memory_usage(rbo, bo->resource->mem_type, -1);
-   radeon_update_memory_usage(rbo, new_mem->mem_type, 1);
 }
 
 vm_fault_t radeon_bo_fault_reserve_notify(struct ttm_buffer_object *bo)
diff --git a/drivers/gpu/drm/radeon/radeon_object.h 
b/drivers/gpu/drm/radeon/radeon_object.h
index 1739c6a142cd..0be50d28bafa 100644
--- a/drivers/gpu/drm/radeon/radeon_object.h
+++ b/drivers/gpu/drm/radeon/radeon_object.h
@@ -133,6 +133,9 @@ static inline u64 radeon_bo_mmap_offset(struct radeon_bo 
*bo)
return drm_vma_node_offset_addr(&bo->tbo.base.vma_node);
 }
 
+extern void radeon_update_memory_usage(struct ttm_buffer_object *bo,
+  unsigned int mem_type, int sign);
+
 extern int radeon_bo_create(struct radeon_device *rdev,
unsigned long size, int byte_align,
bool kernel, u32 domain, u32 flags,
@@ -160,9 +163,7 @@ extern 

Re: [Linaro-mm-sig] [PATCH v3 1/2] habanalabs: define uAPI to export FD for DMA-BUF

2021-06-22 Thread Felix Kuehling
Am 2021-06-22 um 11:29 a.m. schrieb Christian König:
> Am 22.06.21 um 17:23 schrieb Jason Gunthorpe:
>> On Tue, Jun 22, 2021 at 02:23:03PM +0200, Christian König wrote:
>>> Am 22.06.21 um 14:01 schrieb Jason Gunthorpe:
 On Tue, Jun 22, 2021 at 11:42:27AM +0300, Oded Gabbay wrote:
> On Tue, Jun 22, 2021 at 9:37 AM Christian König
>  wrote:
>> Am 22.06.21 um 01:29 schrieb Jason Gunthorpe:
>>> On Mon, Jun 21, 2021 at 10:24:16PM +0300, Oded Gabbay wrote:
>>>
 Another thing I want to emphasize is that we are doing p2p only
 through the export/import of the FD. We do *not* allow the user to
 mmap the dma-buf as we do not support direct IO. So there is no
 access
 to these pages through the userspace.
>>> Arguably mmaping the memory is a better choice, and is the
>>> direction
>>> that Logan's series goes in. Here the use of DMABUF was
>>> specifically
>>> designed to allow hitless revokation of the memory, which this
>>> isn't
>>> even using.
>> The major problem with this approach is that DMA-buf is also used
>> for
>> memory which isn't CPU accessible.
 That isn't an issue here because the memory is only intended to be
 used with P2P transfers so it must be CPU accessible.
>>> No, especially P2P is often done on memory resources which are not even
>>> remotely CPU accessible.
>> That is a special AMD thing, P2P here is PCI P2P and all PCI memory is
>> CPU accessible.
>
> No absolutely not. NVidia GPUs work exactly the same way.
>
> And you have tons of similar cases in embedded and SoC systems where
> intermediate memory between devices isn't directly addressable with
> the CPU.
>
>>> So you are taking the hit of very limited hardware support and
>>> reduced
>>> performance just to squeeze into DMABUF..
 You still have the issue that this patch is doing all of this P2P
 stuff wrong - following the already NAK'd AMD approach.
>>> Well that stuff was NAKed because we still use sg_tables, not
>>> because we
>>> don't want to allocate struct pages.
>> sg lists in general.
>>  
>>> The plan is to push this forward since DEVICE_PRIVATE clearly can't
>>> handle
>>> all of our use cases and is not really a good fit to be honest.
>>>
>>> IOMMU is now working as well, so as far as I can see we are all good
>>> here.
>> How? Is that more AMD special stuff?
>
> No, just using the dma_map_resource() interface.
>
> We have that working on tons of IOMMU enabled systems.
>
>> This patch series never calls to the iommu driver, AFAICT.
>>
> I'll go and read Logan's patch-set to see if that will work for us in
> the future. Please remember, as Daniel said, we don't have struct
> page
> backing our device memory, so if that is a requirement to connect to
> Logan's work, then I don't think we will want to do it at this point.
 It is trivial to get the struct page for a PCI BAR.
>>> Yeah, but it doesn't make much sense. Why should we create a struct
>>> page for
>>> something that isn't even memory in a lot of cases?
>> Because the iommu and other places need this handle to setup their
>> stuff. Nobody has yet been brave enough to try to change those flows
>> to be able to use a physical CPU address.
>
> Well that is certainly not true. I'm just not sure if that works with
> all IOMMU drivers though.
>
> Would need to ping Felix when the support for this was merged.

We have been working on IOMMU support for all our multi-GPU memory
mappings in KFD. The PCIe P2P side of this is currently only merged on
our internal branch. Before we can actually use this, we need
CONFIG_DMABUF_MOVE_NOTIFY enabled (which is still documented as
experimental and disabled by default). Otherwise we'll end up pinning
all our VRAM.

I think we'll try to put together an upstream patch series of all our
PCIe P2P support in a few weeks or so. This will include IOMMU mappings,
checking that PCIe P2P is actually possible between two devices, and KFD
topology updates to correctly report those capabilities to user mode.

It will not use struct pages for exported VRAM buffers.

Regards,
  Felix


>
> Regards,
> Christian.
>
>>
>> This is why we have a special struct page type just for PCI BAR
>> memory.
>>
>> Jason
>


[PATCH 5/6] drm/amdgpu: Fix BUG_ON assert

2021-06-22 Thread Andrey Grodzovsky
With the CPU domain added to the placement you can now have 3
placements at once, so c may legitimately equal
AMDGPU_BO_MAX_PLACEMENTS; only exceeding it indicates a bug.

Signed-off-by: Andrey Grodzovsky 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index b7a2070d90af..81268eded073 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -180,7 +180,7 @@ void amdgpu_bo_placement_from_domain(struct amdgpu_bo *abo, 
u32 domain)
c++;
}
 
-   BUG_ON(c >= AMDGPU_BO_MAX_PLACEMENTS);
+   BUG_ON(c > AMDGPU_BO_MAX_PLACEMENTS);
 
placement->num_placement = c;
placement->placement = places;
-- 
2.25.1



[PATCH 4/6] drm/amdgpu: switch gtt_mgr to counting used pages

2021-06-22 Thread Andrey Grodzovsky
From: Lang Yu 

Change mgr->available into mgr->used (invert the value).

It makes more sense to do it this way since we no longer need the
spinlock to double-check the accounting.

v3 (chk): separated from the TEMPORARY flag change.
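
In isolation, the lock-free accounting pattern this switches to (a
sketch of the hunk below, same names as in the patch):

	/* speculatively reserve, back off on overcommit - no spinlock */
	if (atomic64_add_return(num_pages, &mgr->used) > man->size) {
		atomic64_sub(num_pages, &mgr->used);
		return -ENOSPC;
	}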

Signed-off-by: Lang Yu 
Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c | 26 -
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h |  2 +-
 2 files changed, 11 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c
index b694dc96b336..495dd3bd4f1c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c
@@ -132,14 +132,10 @@ static int amdgpu_gtt_mgr_new(struct ttm_resource_manager 
*man,
struct amdgpu_gtt_node *node;
int r;
 
-   if (!(place->flags & TTM_PL_FLAG_TEMPORARY)) {
-   spin_lock(&mgr->lock);
-   if (atomic64_read(&mgr->available) < num_pages) {
-   spin_unlock(&mgr->lock);
-   return -ENOSPC;
-   }
-   atomic64_sub(num_pages, &mgr->available);
-   spin_unlock(&mgr->lock);
+   if (!(place->flags & TTM_PL_FLAG_TEMPORARY) &&
+   atomic64_add_return(num_pages, &mgr->used) > man->size) {
+   atomic64_sub(num_pages, &mgr->used);
+   return -ENOSPC;
}
 
node = kzalloc(struct_size(node, base.mm_nodes, 1), GFP_KERNEL);
@@ -177,7 +173,7 @@ static int amdgpu_gtt_mgr_new(struct ttm_resource_manager 
*man,
 
 err_out:
if (!(place->flags & TTM_PL_FLAG_TEMPORARY))
-   atomic64_add(num_pages, &mgr->available);
+   atomic64_sub(num_pages, &mgr->used);
 
return r;
 }
@@ -202,7 +198,7 @@ static void amdgpu_gtt_mgr_del(struct ttm_resource_manager 
*man,
spin_unlock(&mgr->lock);

if (!(res->placement & TTM_PL_FLAG_TEMPORARY))
-   atomic64_add(res->num_pages, &mgr->available);
+   atomic64_sub(res->num_pages, &mgr->used);
 
kfree(node);
 }
@@ -217,9 +213,8 @@ static void amdgpu_gtt_mgr_del(struct ttm_resource_manager 
*man,
 uint64_t amdgpu_gtt_mgr_usage(struct ttm_resource_manager *man)
 {
struct amdgpu_gtt_mgr *mgr = to_gtt_mgr(man);
-   s64 result = man->size - atomic64_read(&mgr->available);
 
-   return (result > 0 ? result : 0) * PAGE_SIZE;
+   return atomic64_read(&mgr->used) * PAGE_SIZE;
 }
 
 /**
@@ -269,9 +264,8 @@ static void amdgpu_gtt_mgr_debug(struct 
ttm_resource_manager *man,
drm_mm_print(&mgr->mm, printer);
spin_unlock(&mgr->lock);

-   drm_printf(printer, "man size:%llu pages, gtt available:%lld pages, usage:%lluMB\n",
-  man->size, (u64)atomic64_read(&mgr->available),
-  amdgpu_gtt_mgr_usage(man) >> 20);
+   drm_printf(printer, "man size:%llu pages, gtt used:%llu pages\n",
+  man->size, atomic64_read(&mgr->used));
 }
 
 static const struct ttm_resource_manager_func amdgpu_gtt_mgr_func = {
@@ -303,7 +297,7 @@ int amdgpu_gtt_mgr_init(struct amdgpu_device *adev, 
uint64_t gtt_size)
size = (adev->gmc.gart_size >> PAGE_SHIFT) - start;
drm_mm_init(&mgr->mm, start, size);
spin_lock_init(&mgr->lock);
-   atomic64_set(&mgr->available, gtt_size >> PAGE_SHIFT);
+   atomic64_set(&mgr->used, 0);

ttm_set_driver_manager(&adev->mman.bdev, TTM_PL_TT, &mgr->manager);
ttm_resource_manager_set_used(man, true);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h
index e69f3e8e06e5..3205fd520060 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h
@@ -52,7 +52,7 @@ struct amdgpu_gtt_mgr {
struct ttm_resource_manager manager;
struct drm_mm mm;
spinlock_t lock;
-   atomic64_t available;
+   atomic64_t used;
 };
 
 struct amdgpu_preempt_mgr {
-- 
2.25.1



[PATCH 2/6] drm/amdgpu: use temporary GTT as bounce buffer

2021-06-22 Thread Andrey Grodzovsky
From: Lang Yu 

Currently, we have a limited GTT memory size and need a bounce buffer
when doing buffer migration between the VRAM and SYSTEM domains.

The problem is that under GTT memory pressure we can't do buffer
migration between the VRAM and SYSTEM domains. But in some cases we
really need that, especially when validating a VRAM backing store BO
which resides in the SYSTEM domain.

v2: still account temporary GTT allocations
v3 (chk): revert to the simpler change for now

Signed-off-by: Lang Yu 
Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c | 20 
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c |  2 +-
 2 files changed, 13 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c
index ec96e0b26b11..b694dc96b336 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c
@@ -132,14 +132,15 @@ static int amdgpu_gtt_mgr_new(struct ttm_resource_manager 
*man,
struct amdgpu_gtt_node *node;
int r;
 
-   spin_lock(&mgr->lock);
-   if (tbo->resource && tbo->resource->mem_type != TTM_PL_TT &&
-   atomic64_read(&mgr->available) < num_pages) {
+   if (!(place->flags & TTM_PL_FLAG_TEMPORARY)) {
+   spin_lock(&mgr->lock);
+   if (atomic64_read(&mgr->available) < num_pages) {
+   spin_unlock(&mgr->lock);
+   return -ENOSPC;
+   }
+   atomic64_sub(num_pages, &mgr->available);
spin_unlock(&mgr->lock);
-   return -ENOSPC;
}
-   atomic64_sub(num_pages, &mgr->available);
-   spin_unlock(&mgr->lock);
 
node = kzalloc(struct_size(node, base.mm_nodes, 1), GFP_KERNEL);
if (!node) {
@@ -175,7 +176,8 @@ static int amdgpu_gtt_mgr_new(struct ttm_resource_manager 
*man,
kfree(node);
 
 err_out:
-   atomic64_add(num_pages, &mgr->available);
+   if (!(place->flags & TTM_PL_FLAG_TEMPORARY))
+   atomic64_add(num_pages, &mgr->available);
 
return r;
 }
@@ -198,7 +200,9 @@ static void amdgpu_gtt_mgr_del(struct ttm_resource_manager 
*man,
if (drm_mm_node_allocated(&node->base.mm_nodes[0]))
drm_mm_remove_node(&node->base.mm_nodes[0]);
spin_unlock(&mgr->lock);
-   atomic64_add(res->num_pages, &mgr->available);
+
+   if (!(res->placement & TTM_PL_FLAG_TEMPORARY))
+   atomic64_add(res->num_pages, &mgr->available);
 
kfree(node);
 }
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index 80dff29f2bc7..79f875792b30 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -521,7 +521,7 @@ static int amdgpu_bo_move(struct ttm_buffer_object *bo, 
bool evict,
hop->fpfn = 0;
hop->lpfn = 0;
hop->mem_type = TTM_PL_TT;
-   hop->flags = 0;
+   hop->flags = TTM_PL_FLAG_TEMPORARY;
return -EMULTIHOP;
}
 
-- 
2.25.1



[PATCH 6/6] drm/ttm: Fix multihop assert on eviction.

2021-06-22 Thread Andrey Grodzovsky
Problem:
Under memory pressure, when the GTT domain is almost full, a multihop
assert comes up when trying to evict an LRU BO from VRAM to SYSTEM.

Fix:
Don't assert on the multihop error in the evict code but rather retry,
as we do in ttm_bo_move_buffer().

Signed-off-by: Andrey Grodzovsky 
---
 drivers/gpu/drm/ttm/ttm_bo.c | 63 +++-
 1 file changed, 34 insertions(+), 29 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index 45145d02aed2..5a2dc712c632 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -485,6 +485,31 @@ void ttm_bo_unlock_delayed_workqueue(struct ttm_device 
*bdev, int resched)
 }
 EXPORT_SYMBOL(ttm_bo_unlock_delayed_workqueue);
 
+static int ttm_bo_bounce_temp_buffer(struct ttm_buffer_object *bo,
+struct ttm_resource **mem,
+struct ttm_operation_ctx *ctx,
+struct ttm_place *hop)
+{
+   struct ttm_placement hop_placement;
+   struct ttm_resource *hop_mem;
+   int ret;
+
+   hop_placement.num_placement = hop_placement.num_busy_placement = 1;
+   hop_placement.placement = hop_placement.busy_placement = hop;
+
+   /* find space in the bounce domain */
ret = ttm_bo_mem_space(bo, &hop_placement, &hop_mem, ctx);
+   if (ret)
+   return ret;
+   /* move to the bounce domain */
+   ret = ttm_bo_handle_move_mem(bo, hop_mem, false, ctx, NULL);
+   if (ret) {
ttm_resource_free(bo, &hop_mem);
+   return ret;
+   }
+   return 0;
+}
+
 static int ttm_bo_evict(struct ttm_buffer_object *bo,
struct ttm_operation_ctx *ctx)
 {
@@ -524,12 +549,17 @@ static int ttm_bo_evict(struct ttm_buffer_object *bo,
goto out;
}
 
+bounce:
ret = ttm_bo_handle_move_mem(bo, evict_mem, true, ctx, &hop);
-   if (unlikely(ret)) {
-   WARN(ret == -EMULTIHOP, "Unexpected multihop in eviction - likely driver bug\n");
-   if (ret != -ERESTARTSYS)
+   if (ret == -EMULTIHOP) {
+   ret = ttm_bo_bounce_temp_buffer(bo, &evict_mem, ctx, &hop);
+   if (ret) {
pr_err("Buffer eviction failed\n");
-   ttm_resource_free(bo, &evict_mem);
+   ttm_resource_free(bo, &evict_mem);
+   goto out;
+   }
+   /* try and move to final place now. */
+   goto bounce;
}
 out:
return ret;
@@ -844,31 +874,6 @@ int ttm_bo_mem_space(struct ttm_buffer_object *bo,
 }
 EXPORT_SYMBOL(ttm_bo_mem_space);
 
-static int ttm_bo_bounce_temp_buffer(struct ttm_buffer_object *bo,
-struct ttm_resource **mem,
-struct ttm_operation_ctx *ctx,
-struct ttm_place *hop)
-{
-   struct ttm_placement hop_placement;
-   struct ttm_resource *hop_mem;
-   int ret;
-
-   hop_placement.num_placement = hop_placement.num_busy_placement = 1;
-   hop_placement.placement = hop_placement.busy_placement = hop;
-
-   /* find space in the bounce domain */
-   ret = ttm_bo_mem_space(bo, &hop_placement, &hop_mem, ctx);
-   if (ret)
-   return ret;
-   /* move to the bounce domain */
-   ret = ttm_bo_handle_move_mem(bo, hop_mem, false, ctx, NULL);
-   if (ret) {
-   ttm_resource_free(bo, &hop_mem);
-   return ret;
-   }
-   return 0;
-}
-
 static int ttm_bo_move_buffer(struct ttm_buffer_object *bo,
  struct ttm_placement *placement,
  struct ttm_operation_ctx *ctx)
-- 
2.25.1



[PATCH 3/6] drm/amdgpu: always allow evicting to SYSTEM domain

2021-06-22 Thread Andrey Grodzovsky
From: Christian König 

When we run out of GTT we should still be able to evict VRAM->SYSTEM
with a bounce buffer.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index 79f875792b30..b46726e47bce 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -149,14 +149,16 @@ static void amdgpu_evict_flags(struct ttm_buffer_object 
*bo,
 * BOs to be evicted from VRAM
 */
amdgpu_bo_placement_from_domain(abo, AMDGPU_GEM_DOMAIN_VRAM |
-AMDGPU_GEM_DOMAIN_GTT);
+   AMDGPU_GEM_DOMAIN_GTT |
+   AMDGPU_GEM_DOMAIN_CPU);
abo->placements[0].fpfn = adev->gmc.visible_vram_size >> PAGE_SHIFT;
abo->placements[0].lpfn = 0;
abo->placement.busy_placement = &abo->placements[1];
abo->placement.num_busy_placement = 1;
} else {
/* Move to GTT memory */
-   amdgpu_bo_placement_from_domain(abo, AMDGPU_GEM_DOMAIN_GTT);
+   amdgpu_bo_placement_from_domain(abo, AMDGPU_GEM_DOMAIN_GTT |
+   AMDGPU_GEM_DOMAIN_CPU);
}
break;
case TTM_PL_TT:
-- 
2.25.1



[PATCH 1/6] drm/ttm: add TTM_PL_FLAG_TEMPORARY flag v3

2021-06-22 Thread Andrey Grodzovsky
From: Lang Yu 

Sometimes drivers need to use bounce buffers to evict BOs. While those reside
in some domain they are not necessarily suitable for CS.

Add a flag so that drivers can note that a bounce buffer needs to be
reallocated during validation.

v2: add detailed comments
v3 (chk): merge commits and rework commit message
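
As wired up in patches 2/6 and 6/6 of this series, a driver tags the
multihop bounce placement with the new flag, and ttm_bo_places_compat()
then treats the temporary resource as incompatible so a later
validation moves the BO back out of it:

	/* driver side (see patch 2/6): mark the GTT hop as temporary */
	hop->mem_type = TTM_PL_TT;
	hop->flags = TTM_PL_FLAG_TEMPORARY;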

Suggested-by: Christian König 
Signed-off-by: Lang Yu 
Signed-off-by: Christian König 
---
 drivers/gpu/drm/ttm/ttm_bo.c| 3 +++
 include/drm/ttm/ttm_placement.h | 7 +--
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index db53fecca696..45145d02aed2 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -913,6 +913,9 @@ static bool ttm_bo_places_compat(const struct ttm_place 
*places,
 {
unsigned i;
 
+   if (mem->placement & TTM_PL_FLAG_TEMPORARY)
+   return false;
+
for (i = 0; i < num_placement; i++) {
const struct ttm_place *heap = &places[i];
 
diff --git a/include/drm/ttm/ttm_placement.h b/include/drm/ttm/ttm_placement.h
index aa6ba4d0cf78..8995c9e4ec1b 100644
--- a/include/drm/ttm/ttm_placement.h
+++ b/include/drm/ttm/ttm_placement.h
@@ -47,8 +47,11 @@
  * top of the memory area, instead of the bottom.
  */
 
-#define TTM_PL_FLAG_CONTIGUOUS  (1 << 19)
-#define TTM_PL_FLAG_TOPDOWN (1 << 22)
+#define TTM_PL_FLAG_CONTIGUOUS  (1 << 0)
+#define TTM_PL_FLAG_TOPDOWN (1 << 1)
+
+/* For multihop handling */
+#define TTM_PL_FLAG_TEMPORARY   (1 << 2)
 
 /**
  * struct ttm_place
-- 
2.25.1



Re: [Linaro-mm-sig] [PATCH v3 1/2] habanalabs: define uAPI to export FD for DMA-BUF

2021-06-22 Thread Christian König

Am 22.06.21 um 17:40 schrieb Oded Gabbay:

On Tue, Jun 22, 2021 at 6:31 PM Christian König
 wrote:



Am 22.06.21 um 17:28 schrieb Jason Gunthorpe:

On Tue, Jun 22, 2021 at 05:24:08PM +0200, Christian König wrote:


I will take two GAUDI devices and use one as an exporter and one as an
importer. I want to see that the solution works end-to-end, with real
device DMA from importer to exporter.

I can tell you it doesn't. Stuffing physical addresses directly into
the sg list doesn't involve any of the IOMMU code so any configuration
that requires IOMMU page table setup will not work.

Sure it does. See amdgpu_vram_mgr_alloc_sgt:

  amdgpu_res_first(res, offset, length, &cursor);
                                         ^^

I'm not talking about the AMD driver, I'm talking about this patch.

+ bar_address = hdev->dram_pci_bar_start +
+ (pages[cur_page] - prop->dram_base_address);
+ sg_dma_address(sg) = bar_address;

Yeah, that is indeed not working.

Oded you need to use dma_map_resource() for this.

Christian.

Yes, of course.
But will it be enough?
Jason said that supporting IOMMU isn't nice when we don't have struct pages.
I fail to understand the connection, I need to dig into this.


Question is what you want to do with this?

A struct page is always needed if you want to do stuff like HMM with
it; if you only want P2P between devices I actually recommend avoiding
it.


Christian.



Oded





Jason




Re: [Linaro-mm-sig] [PATCH v3 1/2] habanalabs: define uAPI to export FD for DMA-BUF

2021-06-22 Thread Christian König

Am 22.06.21 um 17:40 schrieb Jason Gunthorpe:

On Tue, Jun 22, 2021 at 05:29:01PM +0200, Christian König wrote:

[SNIP]
No absolutely not. NVidia GPUs work exactly the same way.

And you have tons of similar cases in embedded and SoC systems where
intermediate memory between devices isn't directly addressable with the CPU.

None of that is PCI P2P.

It is all some specialty direct transfer.

You can't reasonably call dma_map_resource() on non CPU mapped memory
for instance, what address would you pass?

Do not confuse "I am doing transfers between two HW blocks" with PCI
Peer to Peer DMA transfers - the latter is a very narrow subcase.


No, just using the dma_map_resource() interface.

Ick, but yes, that does "work". Logan's series is better.


No it isn't. It makes devices depend on allocating struct pages for
their BARs, which is neither necessary nor desired.


How do you prevent direct I/O on those pages for example?

Allocating struct pages has its use cases, for example exposing VRAM
as memory for HMM. But that is something very specific and should not
limit PCIe P2P DMA in general.



[SNIP]
Well that is certainly not true. I'm just not sure if that works with all
IOMMU drivers thought.

Huh? All the iommu interfaces except for the dma_map_resource() are
struct page based. dma_map_resource() is slow and limited in what it
can do.


Yeah, but that is exactly the functionality we need. And as far as I can 
see that is also what Oded wants here.


Mapping stuff into userspace and then doing direct DMA to it is only a 
very limited use case and we need to be more flexible here.


Christian.



Jason




Re: [Linaro-mm-sig] [PATCH v3 1/2] habanalabs: define uAPI to export FD for DMA-BUF

2021-06-22 Thread Christian König



Am 22.06.21 um 17:28 schrieb Jason Gunthorpe:

On Tue, Jun 22, 2021 at 05:24:08PM +0200, Christian König wrote:


I will take two GAUDI devices and use one as an exporter and one as an
importer. I want to see that the solution works end-to-end, with real
device DMA from importer to exporter.

I can tell you it doesn't. Stuffing physical addresses directly into
the sg list doesn't involve any of the IOMMU code so any configuration
that requires IOMMU page table setup will not work.

Sure it does. See amdgpu_vram_mgr_alloc_sgt:

 amdgpu_res_first(res, offset, length, &cursor);
                                        ^^

I'm not talking about the AMD driver, I'm talking about this patch.

+   bar_address = hdev->dram_pci_bar_start +
+   (pages[cur_page] - prop->dram_base_address);
+   sg_dma_address(sg) = bar_address;


Yeah, that is indeed not working.

Oded you need to use dma_map_resource() for this.

Christian.





Jason




Re: [Linaro-mm-sig] [PATCH v3 1/2] habanalabs: define uAPI to export FD for DMA-BUF

2021-06-22 Thread Christian König

Am 22.06.21 um 17:23 schrieb Jason Gunthorpe:

On Tue, Jun 22, 2021 at 02:23:03PM +0200, Christian König wrote:

Am 22.06.21 um 14:01 schrieb Jason Gunthorpe:

On Tue, Jun 22, 2021 at 11:42:27AM +0300, Oded Gabbay wrote:

On Tue, Jun 22, 2021 at 9:37 AM Christian König
 wrote:

Am 22.06.21 um 01:29 schrieb Jason Gunthorpe:

On Mon, Jun 21, 2021 at 10:24:16PM +0300, Oded Gabbay wrote:


Another thing I want to emphasize is that we are doing p2p only
through the export/import of the FD. We do *not* allow the user to
mmap the dma-buf as we do not support direct IO. So there is no access
to these pages through the userspace.

Arguably mmaping the memory is a better choice, and is the direction
that Logan's series goes in. Here the use of DMABUF was specifically
designed to allow hitless revokation of the memory, which this isn't
even using.

The major problem with this approach is that DMA-buf is also used for
memory which isn't CPU accessible.

That isn't an issue here because the memory is only intended to be
used with P2P transfers so it must be CPU accessible.

No, especially P2P is often done on memory resources which are not even
remotely CPU accessible.

That is a special AMD thing, P2P here is PCI P2P and all PCI memory is
CPU accessible.


No absolutely not. NVidia GPUs work exactly the same way.

And you have tons of similar cases in embedded and SoC systems where 
intermediate memory between devices isn't directly addressable with the CPU.



So you are taking the hit of very limited hardware support and reduced
performance just to squeeze into DMABUF..

You still have the issue that this patch is doing all of this P2P
stuff wrong - following the already NAK'd AMD approach.

Well that stuff was NAKed because we still use sg_tables, not because we
don't want to allocate struct pages.

sg lists in general.
  

The plan is to push this forward since DEVICE_PRIVATE clearly can't handle
all of our use cases and is not really a good fit to be honest.

IOMMU is now working as well, so as far as I can see we are all good here.

How? Is that more AMD special stuff?


No, just using the dma_map_resource() interface.

We have that working on tons of IOMMU enabled systems.


This patch series never calls to the iommu driver, AFAICT.


I'll go and read Logan's patch-set to see if that will work for us in
the future. Please remember, as Daniel said, we don't have struct page
backing our device memory, so if that is a requirement to connect to
Logan's work, then I don't think we will want to do it at this point.

It is trivial to get the struct page for a PCI BAR.

Yeah, but it doesn't make much sense. Why should we create a struct page for
something that isn't even memory in a lot of cases?

Because the iommu and other places need this handle to setup their
stuff. Nobody has yet been brave enough to try to change those flows
to be able to use a physical CPU address.


Well that is certainly not true. I'm just not sure if that works with
all IOMMU drivers though.


Would need to ping Felix when the support for this was merged.

Regards,
Christian.



This is why we have a special struct page type just for PCI BAR
memory.

Jason




Re: [Linaro-mm-sig] [PATCH v3 1/2] habanalabs: define uAPI to export FD for DMA-BUF

2021-06-22 Thread Christian König



Am 22.06.21 um 17:11 schrieb Jason Gunthorpe:

On Tue, Jun 22, 2021 at 04:12:26PM +0300, Oded Gabbay wrote:


1) Setting sg_page to NULL
2) 'mapping' pages for P2P DMA without going through the iommu
3) Allowing P2P DMA without using the p2p dma API to validate that it
can work at all in the first place.

All of these result in functional bugs in certain system
configurations.

Jason

Hi Jason,
Thanks for the feedback.
Regarding point 1, why is that a problem if we disable the option to
mmap the dma-buf from user-space ?

Userspace has nothing to do with needing struct pages or not

Points 1 and 2 mostly go together; you supporting the iommu is not nice
if you don't have struct pages.

You should study Logan's patches I pointed you at as they are solving
exactly this problem.


In addition, I didn't see any problem with sg_page being NULL in the
RDMA p2p dma-buf code. Did I miss something here ?

No, the design of the dmabuf requires the exporter to do the dma maps
and so it is only the exporter that is wrong to omit all the iommu and
p2p logic.

RDMA is OK today only because nobody has implemented dma buf support
in rxe/si - mainly because the only implementations of exporters don't
set the struct page and are thus buggy.


I will take two GAUDI devices and use one as an exporter and one as an
importer. I want to see that the solution works end-to-end, with real
device DMA from importer to exporter.

I can tell you it doesn't. Stuffing physical addresses directly into
the sg list doesn't involve any of the IOMMU code so any configuration
that requires IOMMU page table setup will not work.


Sure it does. See amdgpu_vram_mgr_alloc_sgt:

    amdgpu_res_first(res, offset, length, &cursor);
    for_each_sgtable_sg((*sgt), sg, i) {
    phys_addr_t phys = cursor.start + adev->gmc.aper_base;
    size_t size = cursor.size;
    dma_addr_t addr;

    addr = dma_map_resource(dev, phys, size, dir,
    DMA_ATTR_SKIP_CPU_SYNC);
    r = dma_mapping_error(dev, addr);
    if (r)
    goto error_unmap;

    sg_set_page(sg, NULL, size, 0);
    sg_dma_address(sg) = addr;
    sg_dma_len(sg) = size;

    amdgpu_res_next(&cursor, cursor.size);
    }

dma_map_resource() does the IOMMU mapping for us.
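
For completeness, the teardown mirrors this — roughly the following
sketch of the corresponding free path (amdgpu_vram_mgr_free_sgt in
amdgpu; exact body may differ):

    /* undo the per-segment dma_map_resource() calls, then free */
    for_each_sgtable_sg(sgt, sg, i)
        dma_unmap_resource(dev, sg_dma_address(sg),
                           sg_dma_len(sg), dir,
                           DMA_ATTR_SKIP_CPU_SYNC);
    sg_free_table(sgt);
    kfree(sgt);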

Regards,
Christian.




Jason




Re: [PATCH] This patch replaces all the instances of dev_info with drm_info

2021-06-22 Thread kernel test robot
Hi Aman,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on linus/master]
[also build test ERROR on v5.13-rc7 next-20210622]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:
https://github.com/0day-ci/linux/commits/Aman-Jain/This-patch-replaces-all-the-instances-of-dev_info-with-drm_info/20210622-151557
base:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 
a96bfed64c8986d6404e553f18203cae1f5ac7e6
config: powerpc64-randconfig-r001-20210622 (attached as .config)
compiler: clang version 13.0.0 (https://github.com/llvm/llvm-project 
b3634d3e88b7f26534a5057bff182b7dced584fc)
reproduce (this is a W=1 build):
wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
# install powerpc64 cross compiling tool for clang build
# apt-get install binutils-powerpc64-linux-gnu
# 
https://github.com/0day-ci/linux/commit/ca9c5b613cf15d038d10e80c402b78e5925fc31e
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review 
Aman-Jain/This-patch-replaces-all-the-instances-of-dev_info-with-drm_info/20210622-151557
git checkout ca9c5b613cf15d038d10e80c402b78e5925fc31e
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross 
ARCH=powerpc64 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 

All errors (new ones prefixed by >>):

   In file included from drivers/gpu/drm/radeon/radeon_drv.c:33:
   In file included from include/linux/compat.h:14:
   In file included from include/linux/sem.h:5:
   In file included from include/uapi/linux/sem.h:5:
   In file included from include/linux/ipc.h:5:
   In file included from include/linux/spinlock.h:51:
   In file included from include/linux/preempt.h:11:
   In file included from include/linux/list.h:9:
   In file included from include/linux/kernel.h:12:
   In file included from include/linux/bitops.h:32:
   In file included from arch/powerpc/include/asm/bitops.h:62:
   arch/powerpc/include/asm/barrier.h:49:9: warning: '__lwsync' macro redefined 
[-Wmacro-redefined]
   #define __lwsync()  __asm__ __volatile__ (stringify_in_c(LWSYNC) : : 
:"memory")
   ^
   :309:9: note: previous definition is here
   #define __lwsync __builtin_ppc_lwsync
   ^
>> drivers/gpu/drm/radeon/radeon_drv.c:312:4: error: no member named 'dev' in 
>> 'struct device'
   drm_info(&pdev->dev,
   ^~~~
   include/drm/drm_print.h:416:2: note: expanded from macro 'drm_info'
   __drm_printk((drm), info,, fmt, ##__VA_ARGS__)
   ^~
   include/drm/drm_print.h:412:27: note: expanded from macro '__drm_printk'
   dev_##level##type((drm)->dev, "[drm] " fmt, ##__VA_ARGS__)
   ~^
   include/linux/dev_printk.h:118:12: note: expanded from macro 'dev_info'
   _dev_info(dev, dev_fmt(fmt), ##__VA_ARGS__)
 ^~~
   drivers/gpu/drm/radeon/radeon_drv.c:324:4: error: no member named 'dev' in 
'struct device'
   drm_info(&pdev->dev,
   ^~~~
   include/drm/drm_print.h:416:2: note: expanded from macro 'drm_info'
   __drm_printk((drm), info,, fmt, ##__VA_ARGS__)
   ^~
   include/drm/drm_print.h:412:27: note: expanded from macro '__drm_printk'
   dev_##level##type((drm)->dev, "[drm] " fmt, ##__VA_ARGS__)
   ~^
   include/linux/dev_printk.h:118:12: note: expanded from macro 'dev_info'
   _dev_info(dev, dev_fmt(fmt), ##__VA_ARGS__)
 ^~~
   1 warning and 2 errors generated.
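
The root cause: drm_info() expects a struct drm_device *, not a struct
device *, so a mechanical dev_info() -> drm_info() substitution breaks
call sites that only have a pci_dev. A sketch of the two ways out
(illustrative only; ddev and the message text are assumptions, not the
actual fix):

	/* radeon_pci_probe() runs before the drm_device exists,
	 * so either keep the device-level printk... */
	dev_info(&pdev->dev, "SI support disabled by module param\n");

	/* ...or use drm_info() only once a struct drm_device *ddev
	 * is available, after drm device creation */
	drm_info(ddev, "SI support disabled by module param\n");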


vim +312 drivers/gpu/drm/radeon/radeon_drv.c

   292  
   293  static int radeon_pci_probe(struct pci_dev *pdev,
   294  const struct pci_device_id *ent)
   295  {
   296  unsigned long flags = 0;
   297  struct drm_device *dev;
   298  int ret;
   299  
   300  if (!ent)
   301  return -ENODEV; /* Avoid NULL-ptr deref in 
drm_get_pci_dev */
   302  
   303  flags = ent->driver_data;
   304  
   305  if (!radeon_si_support) {
   306  switch (flags & RADEON_FAMILY_MASK) {
   307  case CHIP_TAHITI:
   308  case CHIP_PITCAIRN:
   309  case CHIP_VERDE:
   310  case CHIP_OLAND:
   311  case

[PATCH 2/4] drm/amdkfd: fix sysfs kobj leak

2021-06-22 Thread Philip Yang
3 cases of kobj leaks, which cause memory leaks:

The kobj_type must have a release() method to free memory from the
release callback. A NULL default_attrs list is not needed to init the
kobj.

sysfs files created under kobj_status should be removed with kobj_status
as parent kobject.

Remove queue sysfs files when releasing queue from process MMU notifier
release callback.
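
The release() method itself can be trivial; since the kobjects here are
allocated with kfd_alloc_struct(), a sketch consistent with this patch
is just:

	static void kfd_procfs_kobj_release(struct kobject *kobj)
	{
		/* the kobject itself is the allocation, so free it */
		kfree(kobj);
	}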

Signed-off-by: Philip Yang 
---
 drivers/gpu/drm/amd/amdkfd/kfd_process.c   | 14 ++
 .../gpu/drm/amd/amdkfd/kfd_process_queue_manager.c |  1 +
 2 files changed, 7 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
index 3147dc8bb051..cfc36fceac8a 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
@@ -451,13 +451,9 @@ static const struct sysfs_ops procfs_stats_ops = {
.show = kfd_procfs_stats_show,
 };
 
-static struct attribute *procfs_stats_attrs[] = {
-   NULL
-};
-
 static struct kobj_type procfs_stats_type = {
.sysfs_ops = &procfs_stats_ops,
-   .default_attrs = procfs_stats_attrs,
+   .release = kfd_procfs_kobj_release,
 };
 
 int kfd_procfs_add_queue(struct queue *q)
@@ -946,9 +942,11 @@ static void kfd_process_wq_release(struct work_struct 
*work)
 
sysfs_remove_file(p->kobj, &pdd->attr_vram);
sysfs_remove_file(p->kobj, &pdd->attr_sdma);
-   sysfs_remove_file(p->kobj, &pdd->attr_evict);
-   if (pdd->dev->kfd2kgd->get_cu_occupancy != NULL)
-   sysfs_remove_file(p->kobj, &pdd->attr_cu_occupancy);
+
+   sysfs_remove_file(pdd->kobj_stats, &pdd->attr_evict);
+   if (pdd->dev->kfd2kgd->get_cu_occupancy)
+   sysfs_remove_file(pdd->kobj_stats,
+ &pdd->attr_cu_occupancy);
kobject_del(pdd->kobj_stats);
kobject_put(pdd->kobj_stats);
pdd->kobj_stats = NULL;
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
index 95a6c36cea4c..243dd1efcdbf 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
@@ -153,6 +153,7 @@ void pqm_uninit(struct process_queue_manager *pqm)
if (pqn->q && pqn->q->gws)

amdgpu_amdkfd_remove_gws_from_process(pqm->process->kgd_process_info,
pqn->q->gws);
+   kfd_procfs_del_queue(pqn->q);
uninit_queue(pqn->q);
list_del(&pqn->process_queue_list);
kfree(pqn);
-- 
2.17.1



[PATCH 1/4] drm/amdkfd: add helper function for kfd sysfs create

2021-06-22 Thread Philip Yang
No functionality change. Modify kfd_sysfs_create_file to take a kobject
as parameter, so it becomes a common helper function that removes
duplicate code and will simplify creating new kfd sysfs files in the
future.

Move the pr_warn for a failed sysfs file creation into the helper. Make
the helper return void because callers don't use its return value.

Signed-off-by: Philip Yang 
---
 drivers/gpu/drm/amd/amdkfd/kfd_process.c | 119 ---
 1 file changed, 39 insertions(+), 80 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
index 09b98a83f670..3147dc8bb051 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
@@ -484,34 +484,31 @@ int kfd_procfs_add_queue(struct queue *q)
return 0;
 }
 
-static int kfd_sysfs_create_file(struct kfd_process *p, struct attribute *attr,
+static void kfd_sysfs_create_file(struct kobject *kobj, struct attribute *attr,
 char *name)
 {
-   int ret = 0;
+   int ret;
 
-   if (!p || !attr || !name)
-   return -EINVAL;
+   if (!kobj || !attr || !name)
+   return;
 
attr->name = name;
attr->mode = KFD_SYSFS_FILE_MODE;
sysfs_attr_init(attr);
 
-   ret = sysfs_create_file(p->kobj, attr);
-
-   return ret;
+   ret = sysfs_create_file(kobj, attr);
+   if (ret)
+   pr_warn("Create sysfs %s/%s failed %d", kobj->name, name, ret);
 }
 
-static int kfd_procfs_add_sysfs_stats(struct kfd_process *p)
+static void kfd_procfs_add_sysfs_stats(struct kfd_process *p)
 {
-   int ret = 0;
+   int ret;
int i;
char stats_dir_filename[MAX_SYSFS_FILENAME_LEN];
 
-   if (!p)
-   return -EINVAL;
-
-   if (!p->kobj)
-   return -EFAULT;
+   if (!p || !p->kobj)
+   return;
 
/*
 * Create sysfs files for each GPU:
@@ -521,63 +518,43 @@ static int kfd_procfs_add_sysfs_stats(struct kfd_process 
*p)
 */
for (i = 0; i < p->n_pdds; i++) {
struct kfd_process_device *pdd = p->pdds[i];
-   struct kobject *kobj_stats;
 
snprintf(stats_dir_filename, MAX_SYSFS_FILENAME_LEN,
"stats_%u", pdd->dev->id);
-   kobj_stats = kfd_alloc_struct(kobj_stats);
-   if (!kobj_stats)
-   return -ENOMEM;
+   pdd->kobj_stats = kfd_alloc_struct(pdd->kobj_stats);
+   if (!pdd->kobj_stats)
+   return;
 
-   ret = kobject_init_and_add(kobj_stats,
-   &procfs_stats_type,
-   p->kobj,
-   stats_dir_filename);
+   ret = kobject_init_and_add(pdd->kobj_stats,
+  &procfs_stats_type,
+  p->kobj,
+  stats_dir_filename);
 
if (ret) {
pr_warn("Creating KFD proc/stats_%s folder failed",
-   stats_dir_filename);
-   kobject_put(kobj_stats);
-   goto err;
+   stats_dir_filename);
+   kobject_put(pdd->kobj_stats);
+   pdd->kobj_stats = NULL;
+   return;
}
 
-   pdd->kobj_stats = kobj_stats;
-   pdd->attr_evict.name = "evicted_ms";
-   pdd->attr_evict.mode = KFD_SYSFS_FILE_MODE;
-   sysfs_attr_init(&pdd->attr_evict);
-   ret = sysfs_create_file(kobj_stats, &pdd->attr_evict);
-   if (ret)
-   pr_warn("Creating eviction stats for gpuid %d failed",
-   (int)pdd->dev->id);
-
+   kfd_sysfs_create_file(pdd->kobj_stats, &pdd->attr_evict,
+ "evicted_ms");
/* Add sysfs file to report compute unit occupancy */
-   if (pdd->dev->kfd2kgd->get_cu_occupancy != NULL) {
-   pdd->attr_cu_occupancy.name = "cu_occupancy";
-   pdd->attr_cu_occupancy.mode = KFD_SYSFS_FILE_MODE;
-   sysfs_attr_init(&pdd->attr_cu_occupancy);
-   ret = sysfs_create_file(kobj_stats,
-   &pdd->attr_cu_occupancy);
-   if (ret)
-   pr_warn("Creating %s failed for gpuid: %d",
-   pdd->attr_cu_occupancy.name,
-   (int)pdd->dev->id);
-   }
+   if (pdd->dev->kfd2kgd->get_cu_occupancy)
+   kfd_sysfs_create_file(pdd->kobj_stats,
+   

[PATCH 3/4] drm/amdkfd: add sysfs counters for vm fault and migration

2021-06-22 Thread Philip Yang
This is part of the SVM profiling API: export sysfs counters for
per-process, per-GPU vm retry faults and pages migrated in and out of
GPU vram.

The counters will not be updated in parallel in the GPU retry fault
handler and the migrate-to-vram/ram paths; use READ_ONCE to avoid
compiler optimization.
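
These READ_ONCE loads pair with the WRITE_ONCE stores added in patch
4/4; with a single writer per counter, this protects against torn or
elided accesses without the cost of full atomics:

	/* writer, single context (patch 4/4) */
	WRITE_ONCE(pdd->faults, pdd->faults + 1);

	/* reader, sysfs show (this patch) */
	return sysfs_emit(buf, "%llu\n", READ_ONCE(pdd->faults));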

Signed-off-by: Philip Yang 
---
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h|   9 ++
 drivers/gpu/drm/amd/amdkfd/kfd_process.c | 151 ++-
 2 files changed, 131 insertions(+), 29 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h 
b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
index 6dc22fa1e555..3426743ed228 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
@@ -730,6 +730,15 @@ struct kfd_process_device {
 *  number of CU's a device has along with number of other competing 
processes
 */
struct attribute attr_cu_occupancy;
+
+   /* sysfs counters for GPU retry fault and page migration tracking */
+   struct kobject *kobj_counters;
+   struct attribute attr_faults;
+   struct attribute attr_page_in;
+   struct attribute attr_page_out;
+   uint64_t faults;
+   uint64_t page_in;
+   uint64_t page_out;
 };
 
 #define qpd_to_pdd(x) container_of(x, struct kfd_process_device, qpd)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
index cfc36fceac8a..21ec8a18cad2 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
@@ -416,6 +416,29 @@ static ssize_t kfd_procfs_stats_show(struct kobject *kobj,
return 0;
 }
 
+static ssize_t kfd_sysfs_counters_show(struct kobject *kobj,
+  struct attribute *attr, char *buf)
+{
+   struct kfd_process_device *pdd;
+
+   if (!strcmp(attr->name, "faults")) {
+   pdd = container_of(attr, struct kfd_process_device,
+  attr_faults);
+   return sysfs_emit(buf, "%llu\n", READ_ONCE(pdd->faults));
+   }
+   if (!strcmp(attr->name, "page_in")) {
+   pdd = container_of(attr, struct kfd_process_device,
+  attr_page_in);
+   return sysfs_emit(buf, "%llu\n", READ_ONCE(pdd->page_in));
+   }
+   if (!strcmp(attr->name, "page_out")) {
+   pdd = container_of(attr, struct kfd_process_device,
+  attr_page_out);
+   return sysfs_emit(buf, "%llu\n", READ_ONCE(pdd->page_out));
+   }
+   return 0;
+}
+
 static struct attribute attr_queue_size = {
.name = "size",
.mode = KFD_SYSFS_FILE_MODE
@@ -456,6 +479,15 @@ static struct kobj_type procfs_stats_type = {
.release = kfd_procfs_kobj_release,
 };
 
+static const struct sysfs_ops sysfs_counters_ops = {
+   .show = kfd_sysfs_counters_show,
+};
+
+static struct kobj_type sysfs_counters_type = {
+   .sysfs_ops = &sysfs_counters_ops,
+   .release = kfd_procfs_kobj_release,
+};
+
 int kfd_procfs_add_queue(struct queue *q)
 {
struct kfd_process *proc;
@@ -544,6 +576,50 @@ static void kfd_procfs_add_sysfs_stats(struct kfd_process 
*p)
}
 }
 
+static void kfd_procfs_add_sysfs_counters(struct kfd_process *p)
+{
+   int ret = 0;
+   int i;
+   char counters_dir_filename[MAX_SYSFS_FILENAME_LEN];
+
+   if (!p || !p->kobj)
+   return;
+
+   /*
+* Create sysfs files for each GPU which supports SVM
+* - proc/<pid>/counters_<gpuid>/
+* - proc/<pid>/counters_<gpuid>/faults
+* - proc/<pid>/counters_<gpuid>/page_in
+* - proc/<pid>/counters_<gpuid>/page_out
+*/
+   for_each_set_bit(i, p->svms.bitmap_supported, p->n_pdds) {
+   struct kfd_process_device *pdd = p->pdds[i];
+   struct kobject *kobj_counters;
+
+   snprintf(counters_dir_filename, MAX_SYSFS_FILENAME_LEN,
+   "counters_%u", pdd->dev->id);
+   kobj_counters = kfd_alloc_struct(kobj_counters);
+   if (!kobj_counters)
+   return;
+
+   ret = kobject_init_and_add(kobj_counters, &sysfs_counters_type,
+  p->kobj, counters_dir_filename);
+   if (ret) {
+   pr_warn("Creating KFD proc/%s folder failed",
+   counters_dir_filename);
+   kobject_put(kobj_counters);
+   return;
+   }
+
+   pdd->kobj_counters = kobj_counters;
+   kfd_sysfs_create_file(kobj_counters, &pdd->attr_faults,
+ "faults");
+   kfd_sysfs_create_file(kobj_counters, &pdd->attr_page_in,
+ "page_in");
+   kfd_sysfs_create_file(kobj_counters, &pdd->attr_page_out,
+ "page_out");
+   }
+}
 
 static void kfd_procfs_add_sysfs_files(struct kfd_process *p)
 {

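For context, a minimal userspace sketch of reading the counters these files expose; the base path assumes KFD's procfs layout under /sys/class/kfd/kfd/proc, and the pid (1234) and gpu id (4711) are placeholders, not values from the patch:

	#include <stdio.h>

	int main(void)
	{
		/* Hypothetical pid and gpu id; substitute real values. */
		FILE *f = fopen("/sys/class/kfd/kfd/proc/1234/counters_4711/faults", "r");
		unsigned long long faults;

		if (!f)
			return 1;
		if (fscanf(f, "%llu", &faults) == 1)
			printf("GPU retry faults: %llu\n", faults);
		fclose(f);
		return 0;
	}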
[PATCH 4/4] drm/amdkfd: implement counters for vm fault and migration

2021-06-22 Thread Philip Yang
Add a helper function to get the process device data structure from
adev, to update the counters.

Updates of the vm fault, page_in and page_out counters are not executed
in parallel; use WRITE_ONCE to avoid any form of compiler optimization.

Signed-off-by: Philip Yang 
---
 drivers/gpu/drm/amd/amdkfd/kfd_migrate.c | 14 ++
 drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 24 
 drivers/gpu/drm/amd/amdkfd/kfd_svm.h |  2 ++
 3 files changed, 40 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
index fd8f544f0de2..45b5349283af 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
@@ -413,6 +413,7 @@ svm_migrate_vma_to_vram(struct amdgpu_device *adev, struct svm_range *prange,
uint64_t end)
 {
uint64_t npages = (end - start) >> PAGE_SHIFT;
+   struct kfd_process_device *pdd;
struct dma_fence *mfence = NULL;
struct migrate_vma migrate;
dma_addr_t *scratch;
@@ -473,6 +474,12 @@ svm_migrate_vma_to_vram(struct amdgpu_device *adev, struct svm_range *prange,
 out_free:
kvfree(buf);
 out:
+   if (!r) {
+   pdd = svm_range_get_pdd_by_adev(prange, adev);
+   if (pdd)
+   WRITE_ONCE(pdd->page_in, pdd->page_in + migrate.cpages);
+   }
+
return r;
 }
 
@@ -629,6 +636,7 @@ svm_migrate_vma_to_ram(struct amdgpu_device *adev, struct svm_range *prange,
   struct vm_area_struct *vma, uint64_t start, uint64_t end)
 {
uint64_t npages = (end - start) >> PAGE_SHIFT;
+   struct kfd_process_device *pdd;
struct dma_fence *mfence = NULL;
struct migrate_vma migrate;
dma_addr_t *scratch;
@@ -678,6 +686,12 @@ svm_migrate_vma_to_ram(struct amdgpu_device *adev, struct svm_range *prange,
 out_free:
kvfree(buf);
 out:
+   if (!r) {
+   pdd = svm_range_get_pdd_by_adev(prange, adev);
+   if (pdd)
+   WRITE_ONCE(pdd->page_out,
+  pdd->page_out + migrate.cpages);
+   }
return r;
 }
 
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
index 5468ea4264c6..f3323328f01f 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
@@ -564,6 +564,24 @@ svm_range_get_adev_by_id(struct svm_range *prange, uint32_t gpu_id)
return (struct amdgpu_device *)pdd->dev->kgd;
 }
 
+struct kfd_process_device *
+svm_range_get_pdd_by_adev(struct svm_range *prange, struct amdgpu_device *adev)
+{
+   struct kfd_process *p;
+   int32_t gpu_idx, gpuid;
+   int r;
+
+   p = container_of(prange->svms, struct kfd_process, svms);
+
+   r = kfd_process_gpuid_from_kgd(p, adev, &gpuid, &gpu_idx);
+   if (r) {
+   pr_debug("failed to get device id by adev %p\n", adev);
+   return NULL;
+   }
+
+   return kfd_process_device_from_gpuidx(p, gpu_idx);
+}
+
 static int svm_range_bo_validate(void *param, struct amdgpu_bo *bo)
 {
struct ttm_operation_ctx ctx = { false, false };
@@ -2315,6 +2333,7 @@ int
 svm_range_restore_pages(struct amdgpu_device *adev, unsigned int pasid,
uint64_t addr)
 {
+   struct kfd_process_device *pdd;
struct mm_struct *mm = NULL;
struct svm_range_list *svms;
struct svm_range *prange;
@@ -2440,6 +2459,11 @@ svm_range_restore_pages(struct amdgpu_device *adev, unsigned int pasid,
 out_unlock_svms:
	mutex_unlock(&svms->lock);
mmap_read_unlock(mm);
+
+   pdd = svm_range_get_pdd_by_adev(prange, adev);
+   if (pdd)
+   WRITE_ONCE(pdd->faults, pdd->faults + 1);
+
mmput(mm);
 out:
kfd_unref_process(p);
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.h b/drivers/gpu/drm/amd/amdkfd/kfd_svm.h
index 0c0fc399395e..a9af03994d1a 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.h
@@ -174,6 +174,8 @@ void svm_range_dma_unmap(struct device *dev, dma_addr_t *dma_addr,
 unsigned long offset, unsigned long npages);
 void svm_range_free_dma_mappings(struct svm_range *prange);
 void svm_range_prefault(struct svm_range *prange, struct mm_struct *mm);
+struct kfd_process_device *
+svm_range_get_pdd_by_adev(struct svm_range *prange, struct amdgpu_device *adev);
 
 /* SVM API and HMM page migration work together, device memory type
  * is initialized to not 0 when page migration register device memory.
-- 
2.17.1
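For reference, the concurrency pattern the commit message relies on, shown in isolation (a generic sketch, not additional code from the patch): there is only one writer at a time, so an unlocked read-modify-write is safe as long as WRITE_ONCE on the update side pairs with READ_ONCE on the sysfs side to keep the compiler from tearing or caching the 64-bit value.

	/* Update side: serialized by construction, no lock needed. */
	WRITE_ONCE(pdd->faults, pdd->faults + 1);

	/* Read side (sysfs show): may run concurrently with an update. */
	return sysfs_emit(buf, "%llu\n", READ_ONCE(pdd->faults));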



Re: [PATCH v2 umr 3/3] Enhance printing of page tables in AI+

2021-06-22 Thread StDenis, Tom
[AMD Official Use Only]

Hi,

Just a quick update.  Your first vector passes with your v2 patch in place.  
I'll add the other 3 and then start reviewing the code.

Thanks,
Tom


From: Greathouse, Joseph 
Sent: Monday, June 21, 2021 12:37
To: amd-gfx@lists.freedesktop.org
Cc: StDenis, Tom; Greathouse, Joseph
Subject: [PATCH v2 umr 3/3] Enhance printing of page tables in AI+

Pulls print functions for GPUVM page tables on AI+ chips into their
own set of generalized functions, so that we don't have subtly
different printouts for different layers.

Explicitly prints PDEs with the P bit set (which makes the entry a PTE)
and makes PTEs with the F bit set (translate further, which makes the
entry act as a PDE) properly indent the next layer of the print.

Prints remaining fields from the PTE and PDE printouts, such as
read/write/execute bits and MTYPE from PTE.

v2: Correctly handle printing translate-further PTEs

Signed-off-by: Joseph Greathouse 
---
 src/lib/read_vram.c | 184 ++--
 1 file changed, 127 insertions(+), 57 deletions(-)
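A minimal sketch of the P/F bit logic the commit message describes. The bit positions follow amdgpu's GPUVM definitions for AI-class chips (AMDGPU_PDE_PTE at bit 54, AMDGPU_PTE_TF at bit 56); they are assumptions for illustration, not values taken from this patch:

	#include <stdint.h>

	#define PDE_P_BIT (1ULL << 54) /* PDE with P set is used as a PTE */
	#define PTE_F_BIT (1ULL << 56) /* PTE with F set translates further */

	/* Returns non-zero when the page-table walk stops at this entry. */
	static int entry_terminates_walk(uint64_t entry, int is_pde)
	{
		if (is_pde)
			return (entry & PDE_P_BIT) != 0; /* PDE-as-PTE ends early */
		return (entry & PTE_F_BIT) == 0; /* further means one more level */
	}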

diff --git a/src/lib/read_vram.c b/src/lib/read_vram.c
index 2998873..bea1232 100644
--- a/src/lib/read_vram.c
+++ b/src/lib/read_vram.c
@@ -415,6 +415,112 @@ static pte_fields_ai_t decode_pte_entry_ai(uint64_t pte_entry)
return pte_fields;
 }

+static void print_pde_fields_ai(struct umr_asic *asic,
+   pde_fields_ai_t pde_fields)
+{
+   asic->mem_funcs.vm_message(
+   ", PBA==0x%012" PRIx64 ", V=%" PRIu64
+   ", S=%" PRIu64 ", C=%" PRIu64
+   ", P=%" PRIu64 ", FS=%" PRIu64 "\n",
+   pde_fields.pte_base_addr,
+   pde_fields.valid,
+   pde_fields.system,
+   pde_fields.coherent,
+   pde_fields.pte,
+   pde_fields.frag_size);
+}
+
+static void print_base_ai(struct umr_asic *asic,
+ uint64_t pde_entry, uint64_t address,
+ uint64_t va_mask, pde_fields_ai_t pde_fields,
+ int is_base_not_pde)
+{
+   if (is_base_not_pde)
+   asic->mem_funcs.vm_message("BASE");
+   else
+   asic->mem_funcs.vm_message("PDE");
+   asic->mem_funcs.vm_message("=0x%016" PRIx64 ", VA=0x%012" PRIx64,
+   pde_entry,
+   address & va_mask);
+   print_pde_fields_ai(asic, pde_fields);
+}
+
+static void print_pde_ai(struct umr_asic *asic,
+   const char * indentation, int pde_cnt,
+   int page_table_depth, uint64_t prev_addr,
+   uint64_t pde_idx, uint64_t pde_entry, uint64_t address,
+   uint64_t va_mask, pde_fields_ai_t pde_fields)
+{
+   asic->mem_funcs.vm_message("%s ", [18-pde_cnt*3]);
+   if (pde_fields.further)
+   asic->mem_funcs.vm_message("PTE-FURTHER");
+   else
+   asic->mem_funcs.vm_message("PDE%d", page_table_depth - pde_cnt);
+
+   asic->mem_funcs.vm_message("@{0x%" PRIx64 "/%" PRIx64
+   "}=0x%016" PRIx64 ", VA=0x%012" PRIx64,
+   prev_addr,
+   pde_idx,
+   pde_entry,
+   address & va_mask);
+   print_pde_fields_ai(asic, pde_fields);
+}
+
+static void print_pte_ai(struct umr_asic *asic,
+   const char * indentation, int pde_cnt, uint64_t prev_addr,
+   uint64_t pte_idx, uint64_t pte_entry, uint64_t address,
+   uint64_t va_mask, pte_fields_ai_t pte_fields)
+{
+   if (indentation == NULL) {
+   asic->mem_funcs.vm_message("\\-> PTE");
+   } else {
+   asic->mem_funcs.vm_message("%s ",
+   &indentation[18-pde_cnt*3]);
+   if (pte_fields.pde)
+   asic->mem_funcs.vm_message("PDE0-as-PTE");
+   else
+   asic->mem_funcs.vm_message("PTE");
+   asic->mem_funcs.vm_message("@{0x%" PRIx64 "/%" PRIx64"}",
+   prev_addr,
+   pte_idx);
+   }
+   asic->mem_funcs.vm_message("=0x%016" PRIx64 ", VA=0x%012" PRIx64
+   ", PBA==0x%012" PRIx64 ", V=%" PRIu64
+   ", S=%" PRIu64 ", C=%" PRIu64 ", Z=%" PRIu64
+   ", X=%" PRIu64 ", R=%" PRIu64 ", W=%" PRIu64
+   ", FS=%" PRIu64 ", T=%" PRIu64 ", MTYPE=",
+   pte_entry,
+   address & va_mask,
+   pte_fields.page_base_addr,
+   pte_fields.valid,
+   pte_fields.system,
+   pte_fields.coherent,
+   pte_fields.tmz,
+   pte_fields.execute,
+   pte_fields.read,
+   pte_fields.write,
+ 

Re: [Linaro-mm-sig] [PATCH v3 1/2] habanalabs: define uAPI to export FD for DMA-BUF

2021-06-22 Thread Jason Gunthorpe
On Tue, Jun 22, 2021 at 02:23:03PM +0200, Christian König wrote:
> Am 22.06.21 um 14:01 schrieb Jason Gunthorpe:
> > On Tue, Jun 22, 2021 at 11:42:27AM +0300, Oded Gabbay wrote:
> > > On Tue, Jun 22, 2021 at 9:37 AM Christian König
> > >  wrote:
> > > > Am 22.06.21 um 01:29 schrieb Jason Gunthorpe:
> > > > > On Mon, Jun 21, 2021 at 10:24:16PM +0300, Oded Gabbay wrote:
> > > > > 
> > > > > > Another thing I want to emphasize is that we are doing p2p only
> > > > > > through the export/import of the FD. We do *not* allow the user to
> > > > > > mmap the dma-buf as we do not support direct IO. So there is no 
> > > > > > access
> > > > > > to these pages through the userspace.
> > > > > Arguably mmaping the memory is a better choice, and is the direction
> > > > > that Logan's series goes in. Here the use of DMABUF was specifically
> > > > > designed to allow hitless revokation of the memory, which this isn't
> > > > > even using.
> > > > The major problem with this approach is that DMA-buf is also used for
> > > > memory which isn't CPU accessible.
> > That isn't an issue here because the memory is only intended to be
> > used with P2P transfers so it must be CPU accessible.
> 
> No, especially P2P is often done on memory resources which are not even
> remotely CPU accessible.

That is a special AMD thing, P2P here is PCI P2P and all PCI memory is
CPU accessible.

> > > > > So you are taking the hit of very limited hardware support and reduced
> > > > > performance just to squeeze into DMABUF..
> > You still have the issue that this patch is doing all of this P2P
> > stuff wrong - following the already NAK'd AMD approach.
> 
> Well that stuff was NAKed because we still use sg_tables, not because we
> don't want to allocate struct pages.

sg lists in general.
 
> The plan is to push this forward since DEVICE_PRIVATE clearly can't handle
> all of our use cases and is not really a good fit to be honest.
> 
> IOMMU is now working as well, so as far as I can see we are all good here.

How? Is that more AMD special stuff?

This patch series never calls to the iommu driver, AFAICT.

> > > I'll go and read Logan's patch-set to see if that will work for us in
> > > the future. Please remember, as Daniel said, we don't have struct page
> > > backing our device memory, so if that is a requirement to connect to
> > > Logan's work, then I don't think we will want to do it at this point.
> > It is trivial to get the struct page for a PCI BAR.
> 
> Yeah, but it doesn't make much sense. Why should we create a struct page for
> something that isn't even memory in a lot of cases?

Because the iommu and other places need this handle to setup their
stuff. Nobody has yet been brave enough to try to change those flows
to be able to use a physical CPU address.

This is why we have a special struct page type just for PCI BAR
memory.

Jason


Re: [Linaro-mm-sig] [PATCH v3 1/2] habanalabs: define uAPI to export FD for DMA-BUF

2021-06-22 Thread Jason Gunthorpe
On Tue, Jun 22, 2021 at 05:48:10PM +0200, Christian König wrote:
> Am 22.06.21 um 17:40 schrieb Jason Gunthorpe:
> > On Tue, Jun 22, 2021 at 05:29:01PM +0200, Christian König wrote:
> > > [SNIP]
> > > No absolutely not. NVidia GPUs work exactly the same way.
> > > 
> > > And you have tons of similar cases in embedded and SoC systems where
> > > intermediate memory between devices isn't directly addressable with the 
> > > CPU.
> > None of that is PCI P2P.
> > 
> > It is all some specialty direct transfer.
> > 
> > You can't reasonably call dma_map_resource() on non CPU mapped memory
> > for instance, what address would you pass?
> > 
> > Do not confuse "I am doing transfers between two HW blocks" with PCI
> > Peer to Peer DMA transfers - the latter is a very narrow subcase.
> > 
> > > No, just using the dma_map_resource() interface.
> > Ick, but yes that does "work". Logan's series is better.
>
> No it isn't. It makes devices depend on allocating struct pages for their
> BARs which is not necessary nor desired.

Which dramatically reduces the cost of establishing DMA mappings, a
loop of dma_map_resource() is very expensive.
 
> How do you prevent direct I/O on those pages for example?

GUP fails.

> Allocating a struct pages has their use case, for example for exposing VRAM
> as memory for HMM. But that is something very specific and should not limit
> PCIe P2P DMA in general.

Sure, but that is an ideal we are far from obtaining, and nobody wants
to work on it, preferring to do hacky hacky like this.

If you believe in this then remove the scatter list from dmabuf, add a
new set of dma_map* APIs to work on physical addresses and all the
other stuff needed.

Otherwise, we have what we have and drivers don't get to opt out. This
is why the stuff in AMDGPU was NAK'd.

Jason


Re: [Linaro-mm-sig] [PATCH v3 1/2] habanalabs: define uAPI to export FD for DMA-BUF

2021-06-22 Thread Jason Gunthorpe
On Tue, Jun 22, 2021 at 05:29:01PM +0200, Christian König wrote:
> Am 22.06.21 um 17:23 schrieb Jason Gunthorpe:
> > On Tue, Jun 22, 2021 at 02:23:03PM +0200, Christian König wrote:
> > > Am 22.06.21 um 14:01 schrieb Jason Gunthorpe:
> > > > On Tue, Jun 22, 2021 at 11:42:27AM +0300, Oded Gabbay wrote:
> > > > > On Tue, Jun 22, 2021 at 9:37 AM Christian König
> > > > >  wrote:
> > > > > > Am 22.06.21 um 01:29 schrieb Jason Gunthorpe:
> > > > > > > On Mon, Jun 21, 2021 at 10:24:16PM +0300, Oded Gabbay wrote:
> > > > > > > 
> > > > > > > > Another thing I want to emphasize is that we are doing p2p only
> > > > > > > > through the export/import of the FD. We do *not* allow the user 
> > > > > > > > to
> > > > > > > > mmap the dma-buf as we do not support direct IO. So there is no 
> > > > > > > > access
> > > > > > > > to these pages through the userspace.
> > > > > > > Arguably mmaping the memory is a better choice, and is the 
> > > > > > > direction
> > > > > > > that Logan's series goes in. Here the use of DMABUF was 
> > > > > > > specifically
> > > > > > > designed to allow hitless revokation of the memory, which this 
> > > > > > > isn't
> > > > > > > even using.
> > > > > > The major problem with this approach is that DMA-buf is also used 
> > > > > > for
> > > > > > memory which isn't CPU accessible.
> > > > That isn't an issue here because the memory is only intended to be
> > > > used with P2P transfers so it must be CPU accessible.
> > > No, especially P2P is often done on memory resources which are not even
> > > remotely CPU accessible.
> > That is a special AMD thing, P2P here is PCI P2P and all PCI memory is
> > CPU accessible.
> 
> No absolutely not. NVidia GPUs work exactly the same way.
>
> And you have tons of similar cases in embedded and SoC systems where
> intermediate memory between devices isn't directly addressable with the CPU.

None of that is PCI P2P.

It is all some specialty direct transfer.

You can't reasonably call dma_map_resource() on non CPU mapped memory
for instance, what address would you pass?

Do not confuse "I am doing transfers between two HW blocks" with PCI
Peer to Peer DMA transfers - the latter is a very narrow subcase.

> No, just using the dma_map_resource() interface.

Ick, but yes that does "work". Logan's series is better.

> > > > > I'll go and read Logan's patch-set to see if that will work for us in
> > > > > the future. Please remember, as Daniel said, we don't have struct page
> > > > > backing our device memory, so if that is a requirement to connect to
> > > > > Logan's work, then I don't think we will want to do it at this point.
> > > > It is trivial to get the struct page for a PCI BAR.
> > > Yeah, but it doesn't make much sense. Why should we create a struct page 
> > > for
> > > something that isn't even memory in a lot of cases?
> > Because the iommu and other places need this handle to setup their
> > stuff. Nobody has yet been brave enough to try to change those flows
> > to be able to use a physical CPU address.
> 
> Well that is certainly not true. I'm just not sure if that works with all
> IOMMU drivers thought.

Huh? All the iommu interfaces except for the dma_map_resource() are
struct page based. dma_map_resource() is slow and limited in what it
can do.

Jason


Re: [Linaro-mm-sig] [PATCH v3 1/2] habanalabs: define uAPI to export FD for DMA-BUF

2021-06-22 Thread Jason Gunthorpe
On Tue, Jun 22, 2021 at 06:24:28PM +0300, Oded Gabbay wrote:
> On Tue, Jun 22, 2021 at 6:11 PM Jason Gunthorpe  wrote:
> >
> > On Tue, Jun 22, 2021 at 04:12:26PM +0300, Oded Gabbay wrote:
> >
> > > > 1) Setting sg_page to NULL
> > > > 2) 'mapping' pages for P2P DMA without going through the iommu
> > > > 3) Allowing P2P DMA without using the p2p dma API to validate that it
> > > >can work at all in the first place.
> > > >
> > > > All of these result in functional bugs in certain system
> > > > configurations.
> > > >
> > > > Jason
> > >
> > > Hi Jason,
> > > Thanks for the feedback.
> > > Regarding point 1, why is that a problem if we disable the option to
> > > mmap the dma-buf from user-space ?
> >
> > Userspace has nothing to do with needing struct pages or not
> >
> > Point 1 and 2 mostly go together, you supporting the iommu is not nice
> > if you dont have struct pages.
> >
> > You should study Logan's patches I pointed you at as they are solving
> > exactly this problem.

> Yes, I do need to study them. I agree with you here. It appears I
> have a hole in my understanding.  I'm missing the connection between
> iommu support (which I must have of course) and struct pages.

Chistian explained what the AMD driver is doing by calling
dma_map_resource().

Which is a hacky and slow way of achieving what Logan's series is
doing.

> > No, the design of the dmabuf requires the exporter to do the dma maps
> > and so it is only the exporter that is wrong to omit all the iommu and
> > p2p logic.
> >
> > RDMA is OK today only because nobody has implemented dma buf support
> > in rxe/si - mainly because the only implementations of exporters don't
>
> Can you please educate me, what is rxe/si ?

Sorry, rxe/siw - these are the all-software implementations of RDMA
and they require the struct page to do a SW memory copy. They can't
implement dmabuf without it.

> ok...
> so how come that patch-set was merged into 5.12 if it's buggy ?

We only implemented true dma devices for RDMA DMABUF support, so it
isn't buggy right now.

> Yes, that's what I expect to see. But I want to see it with my own
> eyes and then figure out how to solve this.

It might be tricky to test because you have to ensure the iommu is
turned on and has a non-identity page table. Basically if it doesn't
trigger an IOMMU failure then the IOMMU isn't set up properly.

Jason


Re: [Linaro-mm-sig] [PATCH v3 1/2] habanalabs: define uAPI to export FD for DMA-BUF

2021-06-22 Thread Jason Gunthorpe
On Tue, Jun 22, 2021 at 04:12:26PM +0300, Oded Gabbay wrote:

> > 1) Setting sg_page to NULL
> > 2) 'mapping' pages for P2P DMA without going through the iommu
> > 3) Allowing P2P DMA without using the p2p dma API to validate that it
> >can work at all in the first place.
> >
> > All of these result in functional bugs in certain system
> > configurations.
> >
> > Jason
> 
> Hi Jason,
> Thanks for the feedback.
> Regarding point 1, why is that a problem if we disable the option to
> mmap the dma-buf from user-space ? 

Userspace has nothing to do with needing struct pages or not

Point 1 and 2 mostly go together, you supporting the iommu is not nice
if you dont have struct pages.

You should study Logan's patches I pointed you at as they are solving
exactly this problem.

> In addition, I didn't see any problem with sg_page being NULL in the
> RDMA p2p dma-buf code. Did I miss something here ?

No, the design of the dmabuf requires the exporter to do the dma maps
and so it is only the exporter that is wrong to omit all the iommu and
p2p logic.

RDMA is OK today only because nobody has implemented dma buf support
in rxe/si - mainly because the only implementations of exporters don't
set the struct page and are thus buggy.

> I will take two GAUDI devices and use one as an exporter and one as an
> importer. I want to see that the solution works end-to-end, with real
> device DMA from importer to exporter.

I can tell you it doesn't. Stuffing physical addresses directly into
the sg list doesn't involve any of the IOMMU code so any configuration
that requires IOMMU page table setup will not work.

Jason
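On point 3, a minimal sketch of the kind of validation the p2p dma API performs, assuming both endpoints are PCI devices; pci_p2pdma_distance_many() is the upstream helper, the wrapper around it is illustrative:

	#include <linux/pci-p2pdma.h>

	/* True if the platform can actually route P2P DMA from the
	 * provider's BAR to the client device. */
	static bool p2p_transfer_possible(struct pci_dev *provider,
					  struct device *client)
	{
		return pci_p2pdma_distance_many(provider, &client, 1, true) >= 0;
	}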


Re: [Linaro-mm-sig] [PATCH v3 1/2] habanalabs: define uAPI to export FD for DMA-BUF

2021-06-22 Thread Jason Gunthorpe
On Tue, Jun 22, 2021 at 05:24:08PM +0200, Christian König wrote:

> > > I will take two GAUDI devices and use one as an exporter and one as an
> > > importer. I want to see that the solution works end-to-end, with real
> > > device DMA from importer to exporter.
> > I can tell you it doesn't. Stuffing physical addresses directly into
> > the sg list doesn't involve any of the IOMMU code so any configuration
> > that requires IOMMU page table setup will not work.
> 
> Sure it does. See amdgpu_vram_mgr_alloc_sgt:
> 
> amdgpu_res_first(res, offset, length, &cursor);
 ^^

I'm not talking about the AMD driver, I'm talking about this patch.

+   bar_address = hdev->dram_pci_bar_start +
+   (pages[cur_page] - prop->dram_base_address);
+   sg_dma_address(sg) = bar_address;

Jason


Re: New uAPI for color management proposal and feedback request

2021-06-22 Thread Werner Sembach

Am 19.05.21 um 11:34 schrieb Pekka Paalanen:
> On Wed, 12 May 2021 16:04:16 +0300
> Ville Syrjälä  wrote:
>
>> On Wed, May 12, 2021 at 02:06:56PM +0200, Werner Sembach wrote:
>>> Hello,
>>>
>>> In addition to the existing "max bpc", and "Broadcast RGB/output_csc" drm 
>>> properties I propose 4 new properties:
>>> "preferred pixel encoding", "active color depth", "active color range", and 
>>> "active pixel encoding"
>>>
>>>
>>> Motivation:
>>>
>>> Current monitors have a variety of pixel encodings available: RGB, YCbCr 
>>> 4:4:4, YCbCr 4:2:2, YCbCr 4:2:0.
>>>
>>> In addition they might be full or limited RGB range and the monitors accept 
>>> different bit depths.
>>>
>>> Currently the kernel drivers for AMD and Intel GPUs configure 
>>> the color settings automatically with little
>>> to no influence from the user. However there are several real world scenarios 
>>> where the user might disagree with the
>>> default chosen by the drivers and wants to set his or her own preference.
>>>
>>> Some examples:
>>>
>>> 1. While RGB and YCbCr 4:4:4 in theory carry the same amount of color 
>>> information, some screens might look better on one
>>> than the other because of bad internal conversion. The driver currently 
>>> however has a fixed default that is chosen if
>>> available (RGB for Intel and YCbCr 4:4:4 for AMD). The only way to change 
>>> this currently is by editing and overloading
>>> the edid reported by the monitor to the kernel.
>>>
>>> 2. RGB and YCbCr 4:4:4 need a higher port clock than YCbCr 4:2:0. Some 
>>> hardware might report that it supports the higher
>>> port clock, but because of bad shielding on the PC, the cable, or the 
>>> monitor the screen cuts out every few seconds when
>>> RGB or YCbCr 4:4:4 encoding is used, while YCbCr 4:2:0 might just work fine 
>>> without changing hardware. The drivers
>>> currently however always default to the "best available" option even if it 
>>> might be broken.
>>>
>>> 3. Some screens that natively support only 8-bit color simulate 10-bit 
>>> color by rapidly switching between 2 adjacent
>>> colors. They advertise themselves to the kernel as 10-bit monitors but the 
>>> user might not like the "fake" 10-bit effect
>>> and prefer running at the native 8 bits per color.
>>>
>>> 4. Some screens are falsely classified as full RGB range while they actually 
>>> use limited RGB range. This results in
>>> washed out colors in dark and bright scenes. A user override can be helpful 
>>> to manually fix this issue when it occurs.
>>>
>>> There already exist several requests, discussions, and patches regarding 
>>> this topic:
>>>
>>> - https://gitlab.freedesktop.org/drm/amd/-/issues/476
>>>
>>> - https://gitlab.freedesktop.org/drm/amd/-/issues/1548
>>>
>>> - https://lkml.org/lkml/2021/5/7/695
>>>
>>> - https://lkml.org/lkml/2021/5/11/416
>>>
> ...
>
>>> Adoption:
>>>
>>> A KDE dev wants to implement the settings in the KDE settings GUI:
>>> https://gitlab.freedesktop.org/drm/amd/-/issues/476#note_912370
>>>
>>> Tuxedo Computers (my employer) wants to implement the settings desktop 
>>> environment agnostic in Tuxedo Control Center. I
>>> will start work on this in parallel to implementing the new kernel code.  
>> I suspect everyone would be happier to accept new uapi if we had
>> multiple compositors signed up to implement it.
> I think having Weston support for these would be good, but for now it
> won't be much of an UI: just weston.ini to set, and the log to see what
> happened.

Since a first version of the patch set is now feature complete, please let me 
know if an MR regarding this is started.

Thanks

>
> However, knowing what happened is going to be important for color
> calibration auditing:
> https://gitlab.freedesktop.org/wayland/weston/-/issues/467
>
> Yes, please, very much for read-only properties for the feedback part.
> Properties that both userspace and kernel will write are hard to deal
> with in general.
>
> Btw. "max bpc" I can kind of guess that conversion from framebuffer
> format to the wire bpc happens automatically and only as the final
> step, but "Broadcast RGB" is more complicated: is the output from the
> abstract pixel pipeline sent as-is and "Broadcast RGB" is just another
> inforframe bit to the monitor, or does "Broadcast RGB" setting actually
> change what happens in the pixel pipeline *and* set infoframe bits?
>
> My vague recollection is that framebuffer was always assumed to be in
> full range, and then if "Broadcast RGB" was set to limited range, the
> driver would mangle the pixel pipeline to convert from full to limited
> range. This means that it would be impossible to have limited range
> data in a framebuffer, or there might be a double-conversion by
> userspace programming a LUT for limited->full and then the driver
> adding full->limited. I'm also confused how full/limited works when
> framebuffer is in RGB/YCbCr and the monitor wire format is in RGB/YCbCr
> and there may be RGB->YCbCR or YCbCR->RGB 

Re: [Linaro-mm-sig] [PATCH v3 1/2] habanalabs: define uAPI to export FD for DMA-BUF

2021-06-22 Thread Christian König

Am 22.06.21 um 14:01 schrieb Jason Gunthorpe:

On Tue, Jun 22, 2021 at 11:42:27AM +0300, Oded Gabbay wrote:

On Tue, Jun 22, 2021 at 9:37 AM Christian König
 wrote:

Am 22.06.21 um 01:29 schrieb Jason Gunthorpe:

On Mon, Jun 21, 2021 at 10:24:16PM +0300, Oded Gabbay wrote:


Another thing I want to emphasize is that we are doing p2p only
through the export/import of the FD. We do *not* allow the user to
mmap the dma-buf as we do not support direct IO. So there is no access
to these pages through the userspace.

Arguably mmaping the memory is a better choice, and is the direction
that Logan's series goes in. Here the use of DMABUF was specifically
designed to allow hitless revokation of the memory, which this isn't
even using.

The major problem with this approach is that DMA-buf is also used for
memory which isn't CPU accessible.

That isn't an issue here because the memory is only intended to be
used with P2P transfers so it must be CPU accessible.


No, especially P2P is often done on memory resources which are not even 
remotely CPU accessible.


That's one of the major reasons why we use P2P in the first place. See 
the whole XGMI implementation for example.



Thanks Jason for the clarification, but I honestly prefer to use
DMA-BUF at the moment.
It gives us just what we need (even more than what we need as you
pointed out), it is *already* integrated and tested in the RDMA
subsystem, and I'm feeling comfortable using it as I'm somewhat
familiar with it from my AMD days.

That was one of the reasons we didn't even consider using the mapping
memory approach for GPUs.

Well, now we have DEVICE_PRIVATE memory that can meet this need
too.. Just nobody has wired it up to hmm_range_fault()


So you are taking the hit of very limited hardware support and reduced
performance just to squeeze into DMABUF..

You still have the issue that this patch is doing all of this P2P
stuff wrong - following the already NAK'd AMD approach.


Well that stuff was NAKed because we still use sg_tables, not because we 
don't want to allocate struct pages.


The plan is to push this forward since DEVICE_PRIVATE clearly can't 
handle all of our use cases and is not really a good fit to be honest.


IOMMU is now working as well, so as far as I can see we are all good here.


I'll go and read Logan's patch-set to see if that will work for us in
the future. Please remember, as Daniel said, we don't have struct page
backing our device memory, so if that is a requirement to connect to
Logan's work, then I don't think we will want to do it at this point.

It is trivial to get the struct page for a PCI BAR.


Yeah, but it doesn't make much sense. Why should we create a struct page 
for something that isn't even memory in a lot of cases?


Regards,
Christian.




Re: [PATCH v2 00/22] Deprecate struct drm_device.irq_enabled

2021-06-22 Thread Laurent Pinchart
Hi Thomas,

Thank you for the patches.

On Tue, Jun 22, 2021 at 04:09:40PM +0200, Thomas Zimmermann wrote:
> Remove references to struct drm_device.irq_enabled from modern
> DRM drivers and core.
> 
> KMS drivers enable IRQs for their devices internally. They don't
> have to keep track of the IRQ state via irq_enabled. For vblanking,
> it's cleaner to test for vblanking support directly than to test
> for enabled IRQs.
> 
> This used to be a single patch, [1] but it's now a full series.
> 
> The first 3 patches replace instances of irq_enabled that are not
> required.
> 
> Patch 4 fixes vblank ioctls to actually test for vblank support
> instead of IRQs.
> 
> The rest of the patchset removes irq_enabled from all non-legacy
> drivers. The only exception is omapdrm, which has an internal
> dependency on the field's value. For this driver, the state gets
> duplicated internally.
> 
> With the patchset applied, drivers can later switch over to plain
> Linux IRQ interfaces and DRM's IRQ midlayer can be declared legacy.
> 
> v2:
>   * keep the original test for legacy drivers in
> drm_wait_vblank_ioctl() (Daniel)
> 
> [1] 
> https://lore.kernel.org/dri-devel/20210608090301.4752-1-tzimmerm...@suse.de/
> 
> Thomas Zimmermann (22):
>   drm/amdgpu: Track IRQ state in local device state
>   drm/hibmc: Call drm_irq_uninstall() unconditionally
>   drm/radeon: Track IRQ state in local device state
>   drm: Don't test for IRQ support in VBLANK ioctls
>   drm/komeda: Don't set struct drm_device.irq_enabled
>   drm/malidp: Don't set struct drm_device.irq_enabled
>   drm/exynos: Don't set struct drm_device.irq_enabled
>   drm/kirin: Don't set struct drm_device.irq_enabled
>   drm/imx: Don't set struct drm_device.irq_enabled
>   drm/mediatek: Don't set struct drm_device.irq_enabled
>   drm/nouveau: Don't set struct drm_device.irq_enabled
>   drm/omapdrm: Track IRQ state in local device state
>   drm/rockchip: Don't set struct drm_device.irq_enabled
>   drm/sti: Don't set struct drm_device.irq_enabled
>   drm/stm: Don't set struct drm_device.irq_enabled
>   drm/sun4i: Don't set struct drm_device.irq_enabled
>   drm/tegra: Don't set struct drm_device.irq_enabled
>   drm/tidss: Don't use struct drm_device.irq_enabled
>   drm/vc4: Don't set struct drm_device.irq_enabled
>   drm/vmwgfx: Don't set struct drm_device.irq_enabled
>   drm/xlnx: Don't set struct drm_device.irq_enabled
>   drm/zte: Don't set struct drm_device.irq_enabled

The list seems to be missing armada, rcar-du and vkms. It would also be
nice to address i915 if possible.

>  drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c |  6 +++---
>  drivers/gpu/drm/arm/display/komeda/komeda_kms.c |  4 
>  drivers/gpu/drm/arm/malidp_drv.c|  4 
>  drivers/gpu/drm/drm_irq.c   | 10 +++---
>  drivers/gpu/drm/drm_vblank.c| 13 +
>  drivers/gpu/drm/exynos/exynos_drm_drv.c | 10 --
>  drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.c |  3 +--
>  drivers/gpu/drm/hisilicon/kirin/kirin_drm_drv.c |  2 --
>  drivers/gpu/drm/imx/dcss/dcss-kms.c |  3 ---
>  drivers/gpu/drm/imx/imx-drm-core.c  | 11 ---
>  drivers/gpu/drm/mediatek/mtk_drm_drv.c  |  6 --
>  drivers/gpu/drm/nouveau/nouveau_drm.c   |  3 ---
>  drivers/gpu/drm/omapdrm/omap_drv.h  |  2 ++
>  drivers/gpu/drm/omapdrm/omap_irq.c  |  6 +++---
>  drivers/gpu/drm/radeon/radeon_fence.c   |  2 +-
>  drivers/gpu/drm/radeon/radeon_irq_kms.c | 16 
>  drivers/gpu/drm/rockchip/rockchip_drm_drv.c |  6 --
>  drivers/gpu/drm/sti/sti_compositor.c|  2 --
>  drivers/gpu/drm/stm/ltdc.c  |  3 ---
>  drivers/gpu/drm/sun4i/sun4i_drv.c   |  2 --
>  drivers/gpu/drm/tegra/drm.c |  7 ---
>  drivers/gpu/drm/tidss/tidss_irq.c   |  3 ---
>  drivers/gpu/drm/vc4/vc4_kms.c   |  1 -
>  drivers/gpu/drm/vmwgfx/vmwgfx_irq.c |  8 
>  drivers/gpu/drm/xlnx/zynqmp_dpsub.c |  2 --
>  drivers/gpu/drm/zte/zx_drm_drv.c|  6 --
>  26 files changed, 30 insertions(+), 111 deletions(-)
> 
> 
> base-commit: 8c1323b422f8473421682ba783b5949ddd89a3f4
> prerequisite-patch-id: c2b2f08f0eccc9f5df0c0da49fa1d36267deb11d
> prerequisite-patch-id: c67e5d886a47b7d0266d81100837557fda34cb24

-- 
Regards,

Laurent Pinchart


Re: [PATCH v2 00/22] Deprecate struct drm_device.irq_enabled

2021-06-22 Thread Laurent Pinchart
On Tue, Jun 22, 2021 at 07:11:33PM +0300, Laurent Pinchart wrote:
> Hi Thomas,
> 
> Thank you for the patches.
> 
> On Tue, Jun 22, 2021 at 04:09:40PM +0200, Thomas Zimmermann wrote:
> > Remove references to struct drm_device.irq_enabled from modern
> > DRM drivers and core.
> > 
> > KMS drivers enable IRQs for their devices internally. They don't
> > have to keep track of the IRQ state via irq_enabled. For vblanking,
> > it's cleaner to test for vblanking support directly than to test
> > for enabled IRQs.
> > 
> > This used to be a single patch, [1] but it's now a full series.
> > 
> > The first 3 patches replace instances of irq_enabled that are not
> > required.
> > 
> > Patch 4 fixes vblank ioctls to actually test for vblank support
> > instead of IRQs.
> > 
> > The rest of the patchset removes irq_enabled from all non-legacy
> > drivers. The only exception is omapdrm, which has an internal
> > dependency on the field's value. For this driver, the state gets
> > duplicated internally.
> > 
> > With the patchset applied, drivers can later switch over to plain
> > Linux IRQ interfaces and DRM's IRQ midlayer can be declared legacy.
> > 
> > v2:
> > * keep the original test for legacy drivers in
> >   drm_wait_vblank_ioctl() (Daniel)
> > 
> > [1] 
> > https://lore.kernel.org/dri-devel/20210608090301.4752-1-tzimmerm...@suse.de/
> > 
> > Thomas Zimmermann (22):
> >   drm/amdgpu: Track IRQ state in local device state
> >   drm/hibmc: Call drm_irq_uninstall() unconditionally
> >   drm/radeon: Track IRQ state in local device state
> >   drm: Don't test for IRQ support in VBLANK ioctls
> >   drm/komeda: Don't set struct drm_device.irq_enabled
> >   drm/malidp: Don't set struct drm_device.irq_enabled
> >   drm/exynos: Don't set struct drm_device.irq_enabled
> >   drm/kirin: Don't set struct drm_device.irq_enabled
> >   drm/imx: Don't set struct drm_device.irq_enabled
> >   drm/mediatek: Don't set struct drm_device.irq_enabled
> >   drm/nouveau: Don't set struct drm_device.irq_enabled
> >   drm/omapdrm: Track IRQ state in local device state
> >   drm/rockchip: Don't set struct drm_device.irq_enabled
> >   drm/sti: Don't set struct drm_device.irq_enabled
> >   drm/stm: Don't set struct drm_device.irq_enabled
> >   drm/sun4i: Don't set struct drm_device.irq_enabled
> >   drm/tegra: Don't set struct drm_device.irq_enabled
> >   drm/tidss: Don't use struct drm_device.irq_enabled
> >   drm/vc4: Don't set struct drm_device.irq_enabled
> >   drm/vmwgfx: Don't set struct drm_device.irq_enabled
> >   drm/xlnx: Don't set struct drm_device.irq_enabled
> >   drm/zte: Don't set struct drm_device.irq_enabled
> 
> The list seems to be missing armada, rcar-du and vkms. It would also be
> nice to address i915 if possible.

In addition to this, for all the existing patches,

Reviewed-by: Laurent Pinchart 

> >  drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c |  6 +++---
> >  drivers/gpu/drm/arm/display/komeda/komeda_kms.c |  4 
> >  drivers/gpu/drm/arm/malidp_drv.c|  4 
> >  drivers/gpu/drm/drm_irq.c   | 10 +++---
> >  drivers/gpu/drm/drm_vblank.c| 13 +
> >  drivers/gpu/drm/exynos/exynos_drm_drv.c | 10 --
> >  drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.c |  3 +--
> >  drivers/gpu/drm/hisilicon/kirin/kirin_drm_drv.c |  2 --
> >  drivers/gpu/drm/imx/dcss/dcss-kms.c |  3 ---
> >  drivers/gpu/drm/imx/imx-drm-core.c  | 11 ---
> >  drivers/gpu/drm/mediatek/mtk_drm_drv.c  |  6 --
> >  drivers/gpu/drm/nouveau/nouveau_drm.c   |  3 ---
> >  drivers/gpu/drm/omapdrm/omap_drv.h  |  2 ++
> >  drivers/gpu/drm/omapdrm/omap_irq.c  |  6 +++---
> >  drivers/gpu/drm/radeon/radeon_fence.c   |  2 +-
> >  drivers/gpu/drm/radeon/radeon_irq_kms.c | 16 
> >  drivers/gpu/drm/rockchip/rockchip_drm_drv.c |  6 --
> >  drivers/gpu/drm/sti/sti_compositor.c|  2 --
> >  drivers/gpu/drm/stm/ltdc.c  |  3 ---
> >  drivers/gpu/drm/sun4i/sun4i_drv.c   |  2 --
> >  drivers/gpu/drm/tegra/drm.c |  7 ---
> >  drivers/gpu/drm/tidss/tidss_irq.c   |  3 ---
> >  drivers/gpu/drm/vc4/vc4_kms.c   |  1 -
> >  drivers/gpu/drm/vmwgfx/vmwgfx_irq.c |  8 
> >  drivers/gpu/drm/xlnx/zynqmp_dpsub.c |  2 --
> >  drivers/gpu/drm/zte/zx_drm_drv.c|  6 --
> >  26 files changed, 30 insertions(+), 111 deletions(-)
> > 
> > 
> > base-commit: 8c1323b422f8473421682ba783b5949ddd89a3f4
> > prerequisite-patch-id: c2b2f08f0eccc9f5df0c0da49fa1d36267deb11d
> > prerequisite-patch-id: c67e5d886a47b7d0266d81100837557fda34cb24

-- 
Regards,

Laurent Pinchart


Re: [Linaro-mm-sig] [PATCH v3 1/2] habanalabs: define uAPI to export FD for DMA-BUF

2021-06-22 Thread Oded Gabbay
On Tue, Jun 22, 2021 at 6:31 PM Christian König
 wrote:
>
>
>
> Am 22.06.21 um 17:28 schrieb Jason Gunthorpe:
> > On Tue, Jun 22, 2021 at 05:24:08PM +0200, Christian König wrote:
> >
>  I will take two GAUDI devices and use one as an exporter and one as an
>  importer. I want to see that the solution works end-to-end, with real
>  device DMA from importer to exporter.
> >>> I can tell you it doesn't. Stuffing physical addresses directly into
> >>> the sg list doesn't involve any of the IOMMU code so any configuration
> >>> that requires IOMMU page table setup will not work.
> >> Sure it does. See amdgpu_vram_mgr_alloc_sgt:
> >>
> >>  amdgpu_res_first(res, offset, length, &cursor);
> >   ^^
> >
> > I'm not talking about the AMD driver, I'm talking about this patch.
> >
> > + bar_address = hdev->dram_pci_bar_start +
> > + (pages[cur_page] - prop->dram_base_address);
> > + sg_dma_address(sg) = bar_address;
>
> Yeah, that is indeed not working.
>
> Oded you need to use dma_map_resource() for this.
>
> Christian.
Yes, of course.
But will it be enough ?
Jason said that supporting IOMMU isn't nice when we don't have struct pages.
I fail to understand the connection, I need to dig into this.

Oded

>
>
>
> >
> > Jason
>


Re: [Linaro-mm-sig] [PATCH v3 1/2] habanalabs: define uAPI to export FD for DMA-BUF

2021-06-22 Thread Oded Gabbay
On Tue, Jun 22, 2021 at 6:28 PM Jason Gunthorpe  wrote:
>
> On Tue, Jun 22, 2021 at 05:24:08PM +0200, Christian König wrote:
>
> > > > I will take two GAUDI devices and use one as an exporter and one as an
> > > > importer. I want to see that the solution works end-to-end, with real
> > > > device DMA from importer to exporter.
> > > I can tell you it doesn't. Stuffing physical addresses directly into
> > > the sg list doesn't involve any of the IOMMU code so any configuration
> > > that requires IOMMU page table setup will not work.
> >
> > Sure it does. See amdgpu_vram_mgr_alloc_sgt:
> >
> > amdgpu_res_first(res, offset, length, &cursor);
>  ^^
>
> I'm not talking about the AMD driver, I'm talking about this patch.
>
> +   bar_address = hdev->dram_pci_bar_start +
> +   (pages[cur_page] - prop->dram_base_address);
> +   sg_dma_address(sg) = bar_address;
>
> Jason
Yes, you are correct of course, but what will happen, Jason, if I
add a call to dma_map_resource() like Christian said ?
Won't that solve that specific issue ?
That's why I want to try it...

Oded
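A minimal sketch of the dma_map_resource() call under discussion, assuming bar_address is the exporter's PCI BAR physical address and dev is the importing device; this is illustrative, not the habanalabs patch itself:

	dma_addr_t dma_addr;

	dma_addr = dma_map_resource(dev, bar_address, size,
				    DMA_BIDIRECTIONAL, 0);
	if (dma_mapping_error(dev, dma_addr))
		return -ENOMEM;
	sg_dma_address(sg) = dma_addr;
	sg_dma_len(sg) = size;

Note this still maps each range individually in the exporter, which is the cost Jason points out elsewhere in the thread.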


Re: [PATCH v2 06/22] drm/malidp: Don't set struct drm_device.irq_enabled

2021-06-22 Thread Liviu Dudau
On Tue, Jun 22, 2021 at 04:09:46PM +0200, Thomas Zimmermann wrote:
> The field drm_device.irq_enabled is only used by legacy drivers
> with userspace modesetting. Don't set it in malidp.
> 
> Signed-off-by: Thomas Zimmermann 

Acked-by: Liviu Dudau 

Best regards,
Liviu

> ---
>  drivers/gpu/drm/arm/malidp_drv.c | 4 
>  1 file changed, 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/arm/malidp_drv.c 
> b/drivers/gpu/drm/arm/malidp_drv.c
> index de59f3302516..78d15b04b105 100644
> --- a/drivers/gpu/drm/arm/malidp_drv.c
> +++ b/drivers/gpu/drm/arm/malidp_drv.c
> @@ -847,8 +847,6 @@ static int malidp_bind(struct device *dev)
>   if (ret < 0)
>   goto irq_init_fail;
>  
> - drm->irq_enabled = true;
> -
>   ret = drm_vblank_init(drm, drm->mode_config.num_crtc);
>   if (ret < 0) {
>   DRM_ERROR("failed to initialise vblank\n");
> @@ -874,7 +872,6 @@ static int malidp_bind(struct device *dev)
>  vblank_fail:
>   malidp_se_irq_fini(hwdev);
>   malidp_de_irq_fini(hwdev);
> - drm->irq_enabled = false;
>  irq_init_fail:
>   drm_atomic_helper_shutdown(drm);
>   component_unbind_all(dev, drm);
> @@ -909,7 +906,6 @@ static void malidp_unbind(struct device *dev)
>   drm_atomic_helper_shutdown(drm);
>   malidp_se_irq_fini(hwdev);
>   malidp_de_irq_fini(hwdev);
> - drm->irq_enabled = false;
>   component_unbind_all(dev, drm);
>   of_node_put(malidp->crtc.port);
>   malidp->crtc.port = NULL;
> -- 
> 2.32.0
> 

-- 

| I would like to |
| fix the world,  |
| but they're not |
| giving me the   |
 \ source code!  /
  ---
¯\_(ツ)_/¯


Re: [PATCH v2 05/22] drm/komeda: Don't set struct drm_device.irq_enabled

2021-06-22 Thread Liviu Dudau
On Tue, Jun 22, 2021 at 04:09:45PM +0200, Thomas Zimmermann wrote:
> The field drm_device.irq_enabled is only used by legacy drivers
> with userspace modesetting. Don't set it in komeda.
> 
> Signed-off-by: Thomas Zimmermann 

Acked-by: Liviu Dudau 

Best regards,
Liviu

> ---
>  drivers/gpu/drm/arm/display/komeda/komeda_kms.c | 4 
>  1 file changed, 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/arm/display/komeda/komeda_kms.c 
> b/drivers/gpu/drm/arm/display/komeda/komeda_kms.c
> index ff45f23f3d56..52a6db5707a3 100644
> --- a/drivers/gpu/drm/arm/display/komeda/komeda_kms.c
> +++ b/drivers/gpu/drm/arm/display/komeda/komeda_kms.c
> @@ -301,8 +301,6 @@ struct komeda_kms_dev *komeda_kms_attach(struct 
> komeda_dev *mdev)
>   if (err)
>   goto free_component_binding;
>  
> - drm->irq_enabled = true;
> -
>   drm_kms_helper_poll_init(drm);
>  
>   err = drm_dev_register(drm, 0);
> @@ -313,7 +311,6 @@ struct komeda_kms_dev *komeda_kms_attach(struct 
> komeda_dev *mdev)
>  
>  free_interrupts:
>   drm_kms_helper_poll_fini(drm);
> - drm->irq_enabled = false;
>  free_component_binding:
>   component_unbind_all(mdev->dev, drm);
>  cleanup_mode_config:
> @@ -331,7 +328,6 @@ void komeda_kms_detach(struct komeda_kms_dev *kms)
>   drm_dev_unregister(drm);
>   drm_kms_helper_poll_fini(drm);
>   drm_atomic_helper_shutdown(drm);
> - drm->irq_enabled = false;
>   component_unbind_all(mdev->dev, drm);
>   drm_mode_config_cleanup(drm);
>   komeda_kms_cleanup_private_objs(kms);
> -- 
> 2.32.0
> 

-- 

| I would like to |
| fix the world,  |
| but they're not |
| giving me the   |
 \ source code!  /
  ---
¯\_(ツ)_/¯


Re: [PATCH v2 04/22] drm: Don't test for IRQ support in VBLANK ioctls

2021-06-22 Thread Liviu Dudau
Hello,

On Tue, Jun 22, 2021 at 04:09:44PM +0200, Thomas Zimmermann wrote:
> For KMS drivers, replace the IRQ check in VBLANK ioctls with a check for
> vblank support. IRQs might be enabled without vblanking being supported.
> 
> This change also removes the DRM framework's only dependency on IRQ state
> for non-legacy drivers. For legacy drivers with userspace modesetting,
> the original test remains in drm_wait_vblank_ioctl().
> 
> v2:
>   * keep the old test for legacy drivers in
> drm_wait_vblank_ioctl() (Daniel)
> 
> Signed-off-by: Thomas Zimmermann 
> ---
>  drivers/gpu/drm/drm_irq.c| 10 +++---
>  drivers/gpu/drm/drm_vblank.c | 13 +
>  2 files changed, 12 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c
> index c3bd664ea733..1d7785721323 100644
> --- a/drivers/gpu/drm/drm_irq.c
> +++ b/drivers/gpu/drm/drm_irq.c
> @@ -74,10 +74,8 @@
>   * only supports devices with a single interrupt on the main device stored in
>   * &drm_device.dev and set as the device parameter in drm_dev_alloc().
>   *
> - * These IRQ helpers are strictly optional. Drivers which roll their own only
> - * need to set &drm_device.irq_enabled to signal the DRM core that vblank
> - * interrupts are working. Since these helpers don't automatically clean up 
> the
> - * requested interrupt like e.g. devm_request_irq() they're not really
> + * These IRQ helpers are strictly optional. Since these helpers don't 
> automatically
> + * clean up the requested interrupt like e.g. devm_request_irq() they're not 
> really
>   * recommended.
>   */
>  
> @@ -91,9 +89,7 @@
>   * and after the installation.
>   *
>   * This is the simplified helper interface provided for drivers with no 
> special
> - * needs. Drivers which need to install interrupt handlers for multiple
> - * interrupts must instead set &drm_device.irq_enabled to signal the DRM core
> - * that vblank interrupts are available.
> + * needs.
>   *
>   * @irq must match the interrupt number that would be passed to 
> request_irq(),
>   * if called directly instead of using this helper function.
> diff --git a/drivers/gpu/drm/drm_vblank.c b/drivers/gpu/drm/drm_vblank.c
> index 3417e1ac7918..a98a4aad5037 100644
> --- a/drivers/gpu/drm/drm_vblank.c
> +++ b/drivers/gpu/drm/drm_vblank.c
> @@ -1748,8 +1748,13 @@ int drm_wait_vblank_ioctl(struct drm_device *dev, void 
> *data,
>   unsigned int pipe_index;
>   unsigned int flags, pipe, high_pipe;
>  
> - if (!dev->irq_enabled)
> - return -EOPNOTSUPP;
> + if  (drm_core_check_feature(dev, DRIVER_MODESET)) {
> + if (!drm_dev_has_vblank(dev))
> + return -EOPNOTSUPP;
> + } else {
> + if (!dev->irq_enabled)
> + return -EOPNOTSUPP;
> + }

For a system call that is used quite a lot by userspace we have increased the 
code size
in a noticeable way. Can we not cache it privately?

Best regards,
Liviu

>  
>   if (vblwait->request.type & _DRM_VBLANK_SIGNAL)
>   return -EINVAL;
> @@ -2023,7 +2028,7 @@ int drm_crtc_get_sequence_ioctl(struct drm_device *dev, 
> void *data,
>   if (!drm_core_check_feature(dev, DRIVER_MODESET))
>   return -EOPNOTSUPP;
>  
> - if (!dev->irq_enabled)
> + if (!drm_dev_has_vblank(dev))
>   return -EOPNOTSUPP;
>  
>   crtc = drm_crtc_find(dev, file_priv, get_seq->crtc_id);
> @@ -2082,7 +2087,7 @@ int drm_crtc_queue_sequence_ioctl(struct drm_device 
> *dev, void *data,
>   if (!drm_core_check_feature(dev, DRIVER_MODESET))
>   return -EOPNOTSUPP;
>  
> - if (!dev->irq_enabled)
> + if (!drm_dev_has_vblank(dev))
>   return -EOPNOTSUPP;
>  
>   crtc = drm_crtc_find(dev, file_priv, queue_seq->crtc_id);
> -- 
> 2.32.0
> 

-- 

| I would like to |
| fix the world,  |
| but they're not |
| giving me the   |
 \ source code!  /
  ---
¯\_(ツ)_/¯
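On the code-size point: drm_dev_has_vblank() is already a trivial accessor, roughly the following at the time of this series (shown for reference, not part of the patch), so there is little state left to cache privately:

	bool drm_dev_has_vblank(const struct drm_device *dev)
	{
		return dev->num_crtcs != 0;
	}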


Re: [Linaro-mm-sig] [PATCH v3 1/2] habanalabs: define uAPI to export FD for DMA-BUF

2021-06-22 Thread Oded Gabbay
On Tue, Jun 22, 2021 at 6:11 PM Jason Gunthorpe  wrote:
>
> On Tue, Jun 22, 2021 at 04:12:26PM +0300, Oded Gabbay wrote:
>
> > > 1) Setting sg_page to NULL
> > > 2) 'mapping' pages for P2P DMA without going through the iommu
> > > 3) Allowing P2P DMA without using the p2p dma API to validate that it
> > >can work at all in the first place.
> > >
> > > All of these result in functional bugs in certain system
> > > configurations.
> > >
> > > Jason
> >
> > Hi Jason,
> > Thanks for the feedback.
> > Regarding point 1, why is that a problem if we disable the option to
> > mmap the dma-buf from user-space ?
>
> Userspace has nothing to do with needing struct pages or not
>
> Point 1 and 2 mostly go together, you supporting the iommu is not nice
> if you dont have struct pages.
>
> You should study Logan's patches I pointed you at as they are solving
> exactly this problem.
Yes, I do need to study them. I agree with you here. It appears I have
a hole in my understanding.
I'm missing the connection between iommu support (which I must have of
course) and struct pages.

>
> > In addition, I didn't see any problem with sg_page being NULL in the
> > RDMA p2p dma-buf code. Did I miss something here ?
>
> No, the design of the dmabuf requires the exporter to do the dma maps
> and so it is only the exporter that is wrong to omit all the iommu and
> p2p logic.
>
> RDMA is OK today only because nobody has implemented dma buf support
> in rxe/si - mainly because the only implementations of exporters don't
Can you please educate me, what is rxe/si ?

> set the struct page and are thus buggy.

ok...
so how come that patch-set was merged into 5.12 if it's buggy ?
Because the current exporters are buggy ?  I probably need a history
lesson here.
But I understand why you think it's a bad idea to add a new buggy exporter.

>
> > I will take two GAUDI devices and use one as an exporter and one as an
> > importer. I want to see that the solution works end-to-end, with real
> > device DMA from importer to exporter.
>
> I can tell you it doesn't. Stuffing physical addresses directly into
> the sg list doesn't involve any of the IOMMU code so any configuration
> that requires IOMMU page table setup will not work.
>
> Jason

Yes, that's what I expect to see. But I want to see it with my own
eyes and then figure out how to solve this.
Maybe the result will be going to Logan's path, maybe something else,
but I need to start by seeing the failure in a real system.

Thanks for the information, it is really helpful.

Oded


Re: [PATCH v2 00/22] Deprecate struct drm_device.irq_enabled

2021-06-22 Thread Daniel Vetter
On Tue, Jun 22, 2021 at 04:09:40PM +0200, Thomas Zimmermann wrote:
> Remove references to struct drm_device.irq_enabled from modern
> DRM drivers and core.
> 
> KMS drivers enable IRQs for their devices internally. They don't
> have to keep track of the IRQ state via irq_enabled. For vblanking,
> it's cleaner to test for vblanking support directly than to test
> for enabled IRQs.
> 
> This used to be a single patch, [1] but it's now a full series.
> 
> The first 3 patches replace instances of irq_enabled that are not
> required.
> 
> Patch 4 fixes vblank ioctls to actually test for vblank support
> instead of IRQs.
> 
> The rest of the patchset removes irq_enabled from all non-legacy
> drivers. The only exception is omapdrm, which has an internal
> dependency on the field's value. For this driver, the state gets
> duplicated internally.
> 
> With the patchset applied, drivers can later switch over to plain
> Linux IRQ interfaces and DRM's IRQ midlayer can be declared legacy.
> 
> v2:
>   * keep the original test for legacy drivers in
> drm_wait_vblank_ioctl() (Daniel)
> 
> [1] 
> https://lore.kernel.org/dri-devel/20210608090301.4752-1-tzimmerm...@suse.de/

On the series:

Acked-by: Daniel Vetter 

But I've only done a very light reading of this, so please wait for driver
folks to have some time to check their own before merging.

I think a devm_ version of drm_irq_install might be helpful in further
untangling here, but that's definitely for another series.
-Daniel

> 
> Thomas Zimmermann (22):
>   drm/amdgpu: Track IRQ state in local device state
>   drm/hibmc: Call drm_irq_uninstall() unconditionally
>   drm/radeon: Track IRQ state in local device state
>   drm: Don't test for IRQ support in VBLANK ioctls
>   drm/komeda: Don't set struct drm_device.irq_enabled
>   drm/malidp: Don't set struct drm_device.irq_enabled
>   drm/exynos: Don't set struct drm_device.irq_enabled
>   drm/kirin: Don't set struct drm_device.irq_enabled
>   drm/imx: Don't set struct drm_device.irq_enabled
>   drm/mediatek: Don't set struct drm_device.irq_enabled
>   drm/nouveau: Don't set struct drm_device.irq_enabled
>   drm/omapdrm: Track IRQ state in local device state
>   drm/rockchip: Don't set struct drm_device.irq_enabled
>   drm/sti: Don't set struct drm_device.irq_enabled
>   drm/stm: Don't set struct drm_device.irq_enabled
>   drm/sun4i: Don't set struct drm_device.irq_enabled
>   drm/tegra: Don't set struct drm_device.irq_enabled
>   drm/tidss: Don't use struct drm_device.irq_enabled
>   drm/vc4: Don't set struct drm_device.irq_enabled
>   drm/vmwgfx: Don't set struct drm_device.irq_enabled
>   drm/xlnx: Don't set struct drm_device.irq_enabled
>   drm/zte: Don't set struct drm_device.irq_enabled
> 
>  drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c |  6 +++---
>  drivers/gpu/drm/arm/display/komeda/komeda_kms.c |  4 
>  drivers/gpu/drm/arm/malidp_drv.c|  4 
>  drivers/gpu/drm/drm_irq.c   | 10 +++---
>  drivers/gpu/drm/drm_vblank.c| 13 +
>  drivers/gpu/drm/exynos/exynos_drm_drv.c | 10 --
>  drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.c |  3 +--
>  drivers/gpu/drm/hisilicon/kirin/kirin_drm_drv.c |  2 --
>  drivers/gpu/drm/imx/dcss/dcss-kms.c |  3 ---
>  drivers/gpu/drm/imx/imx-drm-core.c  | 11 ---
>  drivers/gpu/drm/mediatek/mtk_drm_drv.c  |  6 --
>  drivers/gpu/drm/nouveau/nouveau_drm.c   |  3 ---
>  drivers/gpu/drm/omapdrm/omap_drv.h  |  2 ++
>  drivers/gpu/drm/omapdrm/omap_irq.c  |  6 +++---
>  drivers/gpu/drm/radeon/radeon_fence.c   |  2 +-
>  drivers/gpu/drm/radeon/radeon_irq_kms.c | 16 
>  drivers/gpu/drm/rockchip/rockchip_drm_drv.c |  6 --
>  drivers/gpu/drm/sti/sti_compositor.c|  2 --
>  drivers/gpu/drm/stm/ltdc.c  |  3 ---
>  drivers/gpu/drm/sun4i/sun4i_drv.c   |  2 --
>  drivers/gpu/drm/tegra/drm.c |  7 ---
>  drivers/gpu/drm/tidss/tidss_irq.c   |  3 ---
>  drivers/gpu/drm/vc4/vc4_kms.c   |  1 -
>  drivers/gpu/drm/vmwgfx/vmwgfx_irq.c |  8 
>  drivers/gpu/drm/xlnx/zynqmp_dpsub.c |  2 --
>  drivers/gpu/drm/zte/zx_drm_drv.c|  6 --
>  26 files changed, 30 insertions(+), 111 deletions(-)
> 
> 
> base-commit: 8c1323b422f8473421682ba783b5949ddd89a3f4
> prerequisite-patch-id: c2b2f08f0eccc9f5df0c0da49fa1d36267deb11d
> prerequisite-patch-id: c67e5d886a47b7d0266d81100837557fda34cb24
> --
> 2.32.0
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH 1/1] drm/amdgpu: add helper function for vm pasid

2021-06-22 Thread Das, Nirmoy


On 6/22/2021 12:36 PM, Christian König wrote:

On 6/22/2021 12:30 PM, Das, Nirmoy wrote:


On 6/22/2021 10:36 AM, Christian König wrote:


On 6/22/2021 9:39 AM, Das, Nirmoy wrote:


On 6/22/2021 9:03 AM, Christian König wrote:



On 6/22/2021 8:57 AM, Nirmoy Das wrote:

Cleanup code related to vm pasid by adding helper functions.

Signed-off-by: Nirmoy Das 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 105 
-

  1 file changed, 50 insertions(+), 55 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c

index 63975bda8e76..6e476b173cbb 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -87,6 +87,46 @@ struct amdgpu_prt_cb {
  struct dma_fence_cb cb;
  };

+static int amdgpu_vm_pasid_alloc(struct amdgpu_device *adev,
+ struct amdgpu_vm *vm,
+ unsigned int pasid,
+ unsigned int *vm_pasid)
+{
+    unsigned long flags;
+    int r;
+
+    if (!pasid)
+    return 0;
+
+    spin_lock_irqsave(&adev->vm_manager.pasid_lock, flags);
+    r = idr_alloc(&adev->vm_manager.pasid_idr, vm, pasid, pasid + 1,
+  GFP_ATOMIC);
+    spin_unlock_irqrestore(&adev->vm_manager.pasid_lock, flags);
+    if (r < 0)
+    return r;
+    if (vm_pasid)
+    *vm_pasid = pasid;
+


Ok the more I read from this patch the less it makes sense.

We don't allocate the pasid here, we just set it up in the idr.

What we could do is to replace the idr with an xarray, that would 
certainly make more sense than this here.



xarray looks great, with that we don't need pasid_lock either.


You still need the lock to protect against VM destruction while 
looking things up, but you could switch to RCU for this instead.



xarray has xa_{lock|unlock}_irqsave() and adev->vm_manager.pasid_xa
will exist for the device's lifetime.


That's just a wrapper around the lock.


So I am thinking something like:

amdgpu_vm_pasid_insert()

{

...

xa_lock_irqsave(&adev->vm_manager.pasids, flags)
r = xa_store(&adev->vm_manager.pasids, pasid, vm, GFP_ATOMIC);
xa_unlock_irqrestore(&adev->vm_manager.pasids, flags)


It would be really nice if we could avoid the GFP_ATOMIC here, but not 
much of a problem since we had that before.



I think it is possible, as only amdgpu_vm_handle_fault() runs in interrupt
context, and it only looks up a vm pointer rather than storing one.






}

amdgpu_vm_pasid_remove()

{



xa_lock_irqsave(&adev->vm_manager.pasids, flags)
xa_erase(&adev->vm_manager.pasids, pasid);
xa_unlock_irqrestore(&adev->vm_manager.pasids, flags)

}


xa_{lock|unlock}_irqsave() can be used while looking up the vm pointer
for a pasid.
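
Put together, a minimal self-contained sketch of the proposal (assuming
amdgpu_vm_manager grows a "struct xarray pasids" field; the names here
are illustrative, not the final patch):

static int amdgpu_vm_pasid_insert(struct amdgpu_device *adev,
				  struct amdgpu_vm *vm, u32 pasid)
{
	unsigned long flags;
	void *old;

	if (!pasid)
		return 0;

	xa_lock_irqsave(&adev->vm_manager.pasids, flags);
	/* Inside xa_lock the __xa_store() variant must be used, as
	 * xa_store() would try to take the lock itself. */
	old = __xa_store(&adev->vm_manager.pasids, pasid, vm, GFP_ATOMIC);
	xa_unlock_irqrestore(&adev->vm_manager.pasids, flags);

	return xa_err(old);
}

static void amdgpu_vm_pasid_remove(struct amdgpu_device *adev, u32 pasid)
{
	unsigned long flags;

	if (!pasid)
		return;

	xa_lock_irqsave(&adev->vm_manager.pasids, flags);
	__xa_erase(&adev->vm_manager.pasids, pasid);
	xa_unlock_irqrestore(&adev->vm_manager.pasids, flags);
}

The lookup side (e.g. the fault handler) can take the same lock around
xa_load() so the vm cannot be removed while it is being used.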



Shouldn't this be enough ?



Yeah I think so.



Great.


Nirmoy



Christian.



Regards,

Nirmoy



Christian.




Thanks

Nirmoy




Christian.


+    return 0;
+}
+
+static void amdgpu_vm_pasid_remove(struct amdgpu_device *adev,
+   unsigned int pasid,
+   unsigned int *vm_pasid)
+{
+    unsigned long flags;
+
+    if (!pasid)
+    return;
+
+    spin_lock_irqsave(&adev->vm_manager.pasid_lock, flags);
+    idr_remove(&adev->vm_manager.pasid_idr, pasid);
+    spin_unlock_irqrestore(&adev->vm_manager.pasid_lock, flags);
+
+    if (vm_pasid)
+    *vm_pasid = 0;
+}
+
  /*
   * vm eviction_lock can be taken in MMU notifiers. Make sure no 
reclaim-FS
   * happens while holding this lock anywhere to prevent 
deadlocks when
@@ -2940,18 +2980,8 @@ int amdgpu_vm_init(struct amdgpu_device 
*adev, struct amdgpu_vm *vm, u32 pasid)


  amdgpu_bo_unreserve(vm->root.bo);

-    if (pasid) {
-    unsigned long flags;
-
-    spin_lock_irqsave(&adev->vm_manager.pasid_lock, flags);
-    r = idr_alloc(&adev->vm_manager.pasid_idr, vm, pasid, pasid + 1,
-  GFP_ATOMIC);
-    spin_unlock_irqrestore(&adev->vm_manager.pasid_lock, flags);
-    if (r < 0)
-    goto error_free_root;
-
-    vm->pasid = pasid;
-    }
+    if (amdgpu_vm_pasid_alloc(adev, vm, pasid, &vm->pasid))
+    goto error_free_root;

  INIT_KFIFO(vm->faults);

@@ -3038,19 +3068,11 @@ int amdgpu_vm_make_compute(struct 
amdgpu_device *adev, struct amdgpu_vm *vm,

  r = amdgpu_vm_check_clean_reserved(adev, vm);
  if (r)
  goto unreserve_bo;
+    r = amdgpu_vm_pasid_alloc(adev, vm, pasid, NULL);
+    if (r ==  -ENOSPC)
+    goto unreserve_bo;

-    if (pasid) {
-    unsigned long flags;
-
-    spin_lock_irqsave(&adev->vm_manager.pasid_lock, flags);
-    r = idr_alloc(&adev->vm_manager.pasid_idr, vm, pasid, pasid + 1,
-  GFP_ATOMIC);
-    spin_unlock_irqrestore(&adev->vm_manager.pasid_lock, flags);
-
-    if (r == -ENOSPC)
-    goto unreserve_bo;
-    r = 0;
-    }
+    r = 0;

  /* Check if PD needs to be reinitialized and do it before
   * changing any other state, in case it fails.
@@ -3089,35 +3111,23 @@ int amdgpu_vm_make_compute(struct 
amdgpu_device *adev, struct amdgpu_vm *vm,

  vm->is_compute_context = true;

  if (vm->pasid) {
-   

Re: [PATCH V3 1/7] drm/amdgpu: correct tcp harvest setting

2021-06-22 Thread Lazar, Lijo



On 6/22/2021 2:49 PM, Michel Dänzer wrote:

On 2021-06-22 8:08 a.m., Lazar, Lijo wrote:

[Public]

AFAIK, that expression is legal (some code analyzers may warn about the
value of 4*max_wgp_per_sh); a similar pattern is used in rotate/shift
operations.


The default type for constants in C is int, so 0xffffffff is a 32-bit
signed integer.


Probably not as per section 6.4.4.

"The type of an integer constant is the first of the corresponding list 
in which its value can be represented."


It is a hexadecimal constant, and the first type in that list which fits
this value is unsigned int. Regardless, adding a u suffix will avoid any
ambiguity.


Thanks,
Lijo



The C99 specification lists this under J.2 Undefined behavior:

— An expression having signed promoted type is left-shifted and either
  the value of the expression is negative or the result of shifting
  would not be representable in the promoted type (6.5.7).

So it would be safer to make it unsigned: 0xffffffffu (or just ~0u).
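
As a tiny standalone illustration of the point (hypothetical values, not
the driver code):

#include <stdint.h>
#include <stdio.h>

int main(void)
{
	unsigned int max_wgp_per_sh = 5;	/* 4 * 5 = 20, safely < 32 */

	/* 0xffffffff does not fit in int, so per 6.4.4.1 it is already
	 * unsigned int; the explicit suffix removes any ambiguity. */
	uint32_t mask = 0xffffffffu << (4 * max_wgp_per_sh);

	/* Note: a shift count >= 32 would be undefined regardless of
	 * signedness, hence the analyzer warning on 4*max_wgp_per_sh. */
	printf("mask = 0x%08x\n", mask);
	return 0;
}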



-Original Message-
From: Quan, Evan 
Sent: Tuesday, June 22, 2021 7:56 AM
To: Lazar, Lijo ; amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander 
Subject: RE: [PATCH V3 1/7] drm/amdgpu: correct tcp harvest setting

[AMD Official Use Only]

Thanks Lijo.
However, I'm not quite sure whether "0xffffffff << (4 * max_wgp_per_sh);" is
a valid expression, since it kind of triggers an overflow.
Can that work on non-x86 platforms, or even work reliably on x86?





___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH v2 13/22] drm/rockchip: Don't set struct drm_device.irq_enabled

2021-06-22 Thread Thomas Zimmermann
The field drm_device.irq_enabled is only used by legacy drivers
with userspace modesetting. Don't set it in rockchip.

Signed-off-by: Thomas Zimmermann 
---
 drivers/gpu/drm/rockchip/rockchip_drm_drv.c | 6 --
 1 file changed, 6 deletions(-)

diff --git a/drivers/gpu/drm/rockchip/rockchip_drm_drv.c 
b/drivers/gpu/drm/rockchip/rockchip_drm_drv.c
index b730b8d5d949..c8e60fd9ff24 100644
--- a/drivers/gpu/drm/rockchip/rockchip_drm_drv.c
+++ b/drivers/gpu/drm/rockchip/rockchip_drm_drv.c
@@ -162,12 +162,6 @@ static int rockchip_drm_bind(struct device *dev)
 
drm_mode_config_reset(drm_dev);
 
-   /*
-* enable drm irq mode.
-* - with irq_enabled = true, we can use the vblank feature.
-*/
-   drm_dev->irq_enabled = true;
-
ret = rockchip_drm_fbdev_init(drm_dev);
if (ret)
goto err_unbind_all;
-- 
2.32.0

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH v2 22/22] drm/zte: Don't set struct drm_device.irq_enabled

2021-06-22 Thread Thomas Zimmermann
The field drm_device.irq_enabled is only used by legacy drivers
with userspace modesetting. Don't set it in zte.

Signed-off-by: Thomas Zimmermann 
---
 drivers/gpu/drm/zte/zx_drm_drv.c | 6 --
 1 file changed, 6 deletions(-)

diff --git a/drivers/gpu/drm/zte/zx_drm_drv.c b/drivers/gpu/drm/zte/zx_drm_drv.c
index 5506336594e2..064056503ebb 100644
--- a/drivers/gpu/drm/zte/zx_drm_drv.c
+++ b/drivers/gpu/drm/zte/zx_drm_drv.c
@@ -75,12 +75,6 @@ static int zx_drm_bind(struct device *dev)
goto out_unbind;
}
 
-   /*
-* We will manage irq handler on our own.  In this case, irq_enabled
-* need to be true for using vblank core support.
-*/
-   drm->irq_enabled = true;
-
drm_mode_config_reset(drm);
drm_kms_helper_poll_init(drm);
 
-- 
2.32.0

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH v2 20/22] drm/vmwgfx: Don't set struct drm_device.irq_enabled

2021-06-22 Thread Thomas Zimmermann
The field drm_device.irq_enabled is only used by legacy drivers
with userspace modesetting. Don't set it in vmwgfx. All usage of
the field within vmwgfx can safely be removed.

Signed-off-by: Thomas Zimmermann 
---
 drivers/gpu/drm/vmwgfx/vmwgfx_irq.c | 8 
 1 file changed, 8 deletions(-)

diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_irq.c 
b/drivers/gpu/drm/vmwgfx/vmwgfx_irq.c
index b9a9b7ddadbd..4b82f5995452 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_irq.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_irq.c
@@ -292,15 +292,11 @@ void vmw_irq_uninstall(struct drm_device *dev)
if (!(dev_priv->capabilities & SVGA_CAP_IRQMASK))
return;
 
-   if (!dev->irq_enabled)
-   return;
-
vmw_write(dev_priv, SVGA_REG_IRQMASK, 0);
 
status = vmw_irq_status_read(dev_priv);
vmw_irq_status_write(dev_priv, status);
 
-   dev->irq_enabled = false;
free_irq(dev->irq, dev);
 }
 
@@ -315,9 +311,6 @@ int vmw_irq_install(struct drm_device *dev, int irq)
 {
int ret;
 
-   if (dev->irq_enabled)
-   return -EBUSY;
-
vmw_irq_preinstall(dev);
 
ret = request_threaded_irq(irq, vmw_irq_handler, vmw_thread_fn,
@@ -325,7 +318,6 @@ int vmw_irq_install(struct drm_device *dev, int irq)
if (ret < 0)
return ret;
 
-   dev->irq_enabled = true;
dev->irq = irq;
 
return ret;
-- 
2.32.0

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH v2 19/22] drm/vc4: Don't set struct drm_device.irq_enabled

2021-06-22 Thread Thomas Zimmermann
The field drm_device.irq_enabled is only used by legacy drivers
with userspace modesetting. Don't set it in vc4.

Signed-off-by: Thomas Zimmermann 
---
 drivers/gpu/drm/vc4/vc4_kms.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/gpu/drm/vc4/vc4_kms.c b/drivers/gpu/drm/vc4/vc4_kms.c
index 6a1a9e1d72ce..f0b3e4cf5bce 100644
--- a/drivers/gpu/drm/vc4/vc4_kms.c
+++ b/drivers/gpu/drm/vc4/vc4_kms.c
@@ -880,7 +880,6 @@ int vc4_kms_load(struct drm_device *dev)
/* Set support for vblank irq fast disable, before drm_vblank_init() */
dev->vblank_disable_immediate = true;
 
-   dev->irq_enabled = true;
ret = drm_vblank_init(dev, dev->mode_config.num_crtc);
if (ret < 0) {
dev_err(dev->dev, "failed to initialize vblank\n");
-- 
2.32.0

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH v2 21/22] drm/xlnx: Don't set struct drm_device.irq_enabled

2021-06-22 Thread Thomas Zimmermann
The field drm_device.irq_enabled is only used by legacy drivers
with userspace modesetting. Don't set it in xlnx.

Signed-off-by: Thomas Zimmermann 
---
 drivers/gpu/drm/xlnx/zynqmp_dpsub.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/gpu/drm/xlnx/zynqmp_dpsub.c 
b/drivers/gpu/drm/xlnx/zynqmp_dpsub.c
index 0c1c50271a88..ac37053412a1 100644
--- a/drivers/gpu/drm/xlnx/zynqmp_dpsub.c
+++ b/drivers/gpu/drm/xlnx/zynqmp_dpsub.c
@@ -111,8 +111,6 @@ static int zynqmp_dpsub_drm_init(struct zynqmp_dpsub *dpsub)
if (ret)
return ret;
 
-   drm->irq_enabled = 1;
-
drm_kms_helper_poll_init(drm);
 
/*
-- 
2.32.0

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH v2 17/22] drm/tegra: Don't set struct drm_device.irq_enabled

2021-06-22 Thread Thomas Zimmermann
The field drm_device.irq_enabled is only used by legacy drivers
with userspace modesetting. Don't set it in tegra.

Signed-off-by: Thomas Zimmermann 
---
 drivers/gpu/drm/tegra/drm.c | 7 ---
 1 file changed, 7 deletions(-)

diff --git a/drivers/gpu/drm/tegra/drm.c b/drivers/gpu/drm/tegra/drm.c
index f96c237b2242..8d27c21ddf48 100644
--- a/drivers/gpu/drm/tegra/drm.c
+++ b/drivers/gpu/drm/tegra/drm.c
@@ -1188,13 +1188,6 @@ static int host1x_drm_probe(struct host1x_device *dev)
goto device;
}
 
-   /*
-* We don't use the drm_irq_install() helpers provided by the DRM
-* core, so we need to set this manually in order to allow the
-* DRM_IOCTL_WAIT_VBLANK to operate correctly.
-*/
-   drm->irq_enabled = true;
-
/* syncpoints are used for full 32-bit hardware VBLANK counters */
drm->max_vblank_count = 0x;
 
-- 
2.32.0

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH v2 18/22] drm/tidss: Don't use struct drm_device.irq_enabled

2021-06-22 Thread Thomas Zimmermann
The field drm_device.irq_enabled is only used by legacy drivers
with userspace modesetting. Don't use it in tidss.

Signed-off-by: Thomas Zimmermann 
---
 drivers/gpu/drm/tidss/tidss_irq.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/gpu/drm/tidss/tidss_irq.c 
b/drivers/gpu/drm/tidss/tidss_irq.c
index a5ec7931ef6b..2ed3e3296776 100644
--- a/drivers/gpu/drm/tidss/tidss_irq.c
+++ b/drivers/gpu/drm/tidss/tidss_irq.c
@@ -57,9 +57,6 @@ irqreturn_t tidss_irq_handler(int irq, void *arg)
unsigned int id;
dispc_irq_t irqstatus;
 
-   if (WARN_ON(!ddev->irq_enabled))
-   return IRQ_NONE;
-
irqstatus = dispc_read_and_clear_irqstatus(tidss->dispc);
 
for (id = 0; id < tidss->num_crtcs; id++) {
-- 
2.32.0

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH v2 15/22] drm/stm: Don't set struct drm_device.irq_enabled

2021-06-22 Thread Thomas Zimmermann
The field drm_device.irq_enabled is only used by legacy drivers
with userspace modesetting. Don't set it in stm.

Signed-off-by: Thomas Zimmermann 
---
 drivers/gpu/drm/stm/ltdc.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/gpu/drm/stm/ltdc.c b/drivers/gpu/drm/stm/ltdc.c
index 08b71248044d..e9c5a52f041a 100644
--- a/drivers/gpu/drm/stm/ltdc.c
+++ b/drivers/gpu/drm/stm/ltdc.c
@@ -1339,9 +1339,6 @@ int ltdc_load(struct drm_device *ddev)
goto err;
}
 
-   /* Allow usage of vblank without having to call drm_irq_install */
-   ddev->irq_enabled = 1;
-
clk_disable_unprepare(ldev->pixel_clk);
 
pinctrl_pm_select_sleep_state(ddev->dev);
-- 
2.32.0

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH v2 16/22] drm/sun4i: Don't set struct drm_device.irq_enabled

2021-06-22 Thread Thomas Zimmermann
The field drm_device.irq_enabled is only used by legacy drivers
with userspace modesetting. Don't set it in sun4i.

Signed-off-by: Thomas Zimmermann 
---
 drivers/gpu/drm/sun4i/sun4i_drv.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/gpu/drm/sun4i/sun4i_drv.c 
b/drivers/gpu/drm/sun4i/sun4i_drv.c
index af335f58bdfc..570f3af25e86 100644
--- a/drivers/gpu/drm/sun4i/sun4i_drv.c
+++ b/drivers/gpu/drm/sun4i/sun4i_drv.c
@@ -97,8 +97,6 @@ static int sun4i_drv_bind(struct device *dev)
if (ret)
goto cleanup_mode_config;
 
-   drm->irq_enabled = true;
-
/* Remove early framebuffers (ie. simplefb) */
ret = drm_aperture_remove_framebuffers(false, "sun4i-drm-fb");
if (ret)
-- 
2.32.0

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH v2 14/22] drm/sti: Don't set struct drm_device.irq_enabled

2021-06-22 Thread Thomas Zimmermann
The field drm_device.irq_enabled is only used by legacy drivers
with userspace modesetting. Don't set it in sti.

Signed-off-by: Thomas Zimmermann 
---
 drivers/gpu/drm/sti/sti_compositor.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/gpu/drm/sti/sti_compositor.c 
b/drivers/gpu/drm/sti/sti_compositor.c
index 319962a2c17b..9caaf3ccfabe 100644
--- a/drivers/gpu/drm/sti/sti_compositor.c
+++ b/drivers/gpu/drm/sti/sti_compositor.c
@@ -145,8 +145,6 @@ static int sti_compositor_bind(struct device *dev,
}
 
drm_vblank_init(drm_dev, crtc_id);
-   /* Allow usage of vblank without having to call drm_irq_install */
-   drm_dev->irq_enabled = 1;
 
return 0;
 }
-- 
2.32.0

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH v2 11/22] drm/nouveau: Don't set struct drm_device.irq_enabled

2021-06-22 Thread Thomas Zimmermann
The field drm_device.irq_enabled is only used by legacy drivers
with userspace modesetting. Don't set it in nouveau.

Signed-off-by: Thomas Zimmermann 
---
 drivers/gpu/drm/nouveau/nouveau_drm.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_drm.c 
b/drivers/gpu/drm/nouveau/nouveau_drm.c
index a616cf4573b8..1cb14e99a60c 100644
--- a/drivers/gpu/drm/nouveau/nouveau_drm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_drm.c
@@ -553,8 +553,6 @@ nouveau_drm_device_init(struct drm_device *dev)
if (ret)
goto fail_master;
 
-   dev->irq_enabled = true;
-
nvxx_client(&drm->client.base)->debug =
nvkm_dbgopt(nouveau_debug, "DRM");
 
@@ -795,7 +793,6 @@ nouveau_drm_device_remove(struct drm_device *dev)
 
drm_dev_unregister(dev);
 
-   dev->irq_enabled = false;
client = nvxx_client(&drm->client.base);
device = nvkm_device_find(client->device);
 
-- 
2.32.0

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH v2 12/22] drm/omapdrm: Track IRQ state in local device state

2021-06-22 Thread Thomas Zimmermann
Replace usage of struct drm_device.irq_enabled with the driver's
own state field struct omap_drm_device.irq_enabled. The field in
the DRM device structure is considered legacy and should not be
used by KMS drivers.

Signed-off-by: Thomas Zimmermann 
---
 drivers/gpu/drm/omapdrm/omap_drv.h | 2 ++
 drivers/gpu/drm/omapdrm/omap_irq.c | 6 +++---
 2 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/omapdrm/omap_drv.h 
b/drivers/gpu/drm/omapdrm/omap_drv.h
index d6f136984da9..591d4c273f02 100644
--- a/drivers/gpu/drm/omapdrm/omap_drv.h
+++ b/drivers/gpu/drm/omapdrm/omap_drv.h
@@ -48,6 +48,8 @@ struct omap_drm_private {
struct dss_device *dss;
struct dispc_device *dispc;
 
+   bool irq_enabled;
+
unsigned int num_pipes;
struct omap_drm_pipeline pipes[8];
struct omap_drm_pipeline *channels[8];
diff --git a/drivers/gpu/drm/omapdrm/omap_irq.c 
b/drivers/gpu/drm/omapdrm/omap_irq.c
index 15148d4b35b5..bb6e3fc18204 100644
--- a/drivers/gpu/drm/omapdrm/omap_irq.c
+++ b/drivers/gpu/drm/omapdrm/omap_irq.c
@@ -291,7 +291,7 @@ int omap_drm_irq_install(struct drm_device *dev)
if (ret < 0)
return ret;
 
-   dev->irq_enabled = true;
+   priv->irq_enabled = true;
 
return 0;
 }
@@ -300,10 +300,10 @@ void omap_drm_irq_uninstall(struct drm_device *dev)
 {
struct omap_drm_private *priv = dev->dev_private;
 
-   if (!dev->irq_enabled)
+   if (!priv->irq_enabled)
return;
 
-   dev->irq_enabled = false;
+   priv->irq_enabled = false;
 
dispc_free_irq(priv->dispc, dev);
 }
-- 
2.32.0

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH v2 09/22] drm/imx: Don't set struct drm_device.irq_enabled

2021-06-22 Thread Thomas Zimmermann
The field drm_device.irq_enabled is only used by legacy drivers
with userspace modesetting. Don't set it in imx.

Signed-off-by: Thomas Zimmermann 
---
 drivers/gpu/drm/imx/dcss/dcss-kms.c |  3 ---
 drivers/gpu/drm/imx/imx-drm-core.c  | 11 ---
 2 files changed, 14 deletions(-)

diff --git a/drivers/gpu/drm/imx/dcss/dcss-kms.c 
b/drivers/gpu/drm/imx/dcss/dcss-kms.c
index 37ae68a7fba5..917834b1c80e 100644
--- a/drivers/gpu/drm/imx/dcss/dcss-kms.c
+++ b/drivers/gpu/drm/imx/dcss/dcss-kms.c
@@ -133,8 +133,6 @@ struct dcss_kms_dev *dcss_kms_attach(struct dcss_dev *dcss)
if (ret)
goto cleanup_mode_config;
 
-   drm->irq_enabled = true;
-
ret = dcss_kms_bridge_connector_init(kms);
if (ret)
goto cleanup_mode_config;
@@ -178,7 +176,6 @@ void dcss_kms_detach(struct dcss_kms_dev *kms)
drm_kms_helper_poll_fini(drm);
drm_atomic_helper_shutdown(drm);
drm_crtc_vblank_off(>crtc.base);
-   drm->irq_enabled = false;
drm_mode_config_cleanup(drm);
dcss_crtc_deinit(>crtc, drm);
drm->dev_private = NULL;
diff --git a/drivers/gpu/drm/imx/imx-drm-core.c 
b/drivers/gpu/drm/imx/imx-drm-core.c
index 76819a8ac37f..9558e9e1b431 100644
--- a/drivers/gpu/drm/imx/imx-drm-core.c
+++ b/drivers/gpu/drm/imx/imx-drm-core.c
@@ -207,17 +207,6 @@ static int imx_drm_bind(struct device *dev)
if (IS_ERR(drm))
return PTR_ERR(drm);
 
-   /*
-* enable drm irq mode.
-* - with irq_enabled = true, we can use the vblank feature.
-*
-* P.S. note that we wouldn't use drm irq handler but
-*  just specific driver own one instead because
-*  drm framework supports only one irq handler and
-*  drivers can well take care of their interrupts
-*/
-   drm->irq_enabled = true;
-
/*
 * set max width and height as default value(4096x4096).
 * this value would be used to check framebuffer size limitation
-- 
2.32.0

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH v2 10/22] drm/mediatek: Don't set struct drm_device.irq_enabled

2021-06-22 Thread Thomas Zimmermann
The field drm_device.irq_enabled is only used by legacy drivers
with userspace modesetting. Don't set it in mediatek.

Signed-off-by: Thomas Zimmermann 
---
 drivers/gpu/drm/mediatek/mtk_drm_drv.c | 6 --
 1 file changed, 6 deletions(-)

diff --git a/drivers/gpu/drm/mediatek/mtk_drm_drv.c 
b/drivers/gpu/drm/mediatek/mtk_drm_drv.c
index b46bdb8985da..9b60bec33d3b 100644
--- a/drivers/gpu/drm/mediatek/mtk_drm_drv.c
+++ b/drivers/gpu/drm/mediatek/mtk_drm_drv.c
@@ -270,12 +270,6 @@ static int mtk_drm_kms_init(struct drm_device *drm)
goto err_component_unbind;
}
 
-   /*
-* We don't use the drm_irq_install() helpers provided by the DRM
-* core, so we need to set this manually in order to allow the
-* DRM_IOCTL_WAIT_VBLANK to operate correctly.
-*/
-   drm->irq_enabled = true;
ret = drm_vblank_init(drm, MAX_CRTC);
if (ret < 0)
goto err_component_unbind;
-- 
2.32.0

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH v2 08/22] drm/kirin: Don't set struct drm_device.irq_enabled

2021-06-22 Thread Thomas Zimmermann
The field drm_device.irq_enabled is only used by legacy drivers
with userspace modesetting. Don't set it in kirin.

Signed-off-by: Thomas Zimmermann 
---
 drivers/gpu/drm/hisilicon/kirin/kirin_drm_drv.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/gpu/drm/hisilicon/kirin/kirin_drm_drv.c 
b/drivers/gpu/drm/hisilicon/kirin/kirin_drm_drv.c
index e590e19db657..98ae9a48f3fe 100644
--- a/drivers/gpu/drm/hisilicon/kirin/kirin_drm_drv.c
+++ b/drivers/gpu/drm/hisilicon/kirin/kirin_drm_drv.c
@@ -185,8 +185,6 @@ static int kirin_drm_kms_init(struct drm_device *dev,
DRM_ERROR("failed to initialize vblank.\n");
goto err_unbind_all;
}
-   /* with irq_enabled = true, we can use the vblank feature. */
-   dev->irq_enabled = true;
 
/* reset all the states of crtc/plane/encoder/connector */
drm_mode_config_reset(dev);
-- 
2.32.0

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH v2 06/22] drm/malidp: Don't set struct drm_device.irq_enabled

2021-06-22 Thread Thomas Zimmermann
The field drm_device.irq_enabled is only used by legacy drivers
with userspace modesetting. Don't set it in malidp.

Signed-off-by: Thomas Zimmermann 
---
 drivers/gpu/drm/arm/malidp_drv.c | 4 
 1 file changed, 4 deletions(-)

diff --git a/drivers/gpu/drm/arm/malidp_drv.c b/drivers/gpu/drm/arm/malidp_drv.c
index de59f3302516..78d15b04b105 100644
--- a/drivers/gpu/drm/arm/malidp_drv.c
+++ b/drivers/gpu/drm/arm/malidp_drv.c
@@ -847,8 +847,6 @@ static int malidp_bind(struct device *dev)
if (ret < 0)
goto irq_init_fail;
 
-   drm->irq_enabled = true;
-
ret = drm_vblank_init(drm, drm->mode_config.num_crtc);
if (ret < 0) {
DRM_ERROR("failed to initialise vblank\n");
@@ -874,7 +872,6 @@ static int malidp_bind(struct device *dev)
 vblank_fail:
malidp_se_irq_fini(hwdev);
malidp_de_irq_fini(hwdev);
-   drm->irq_enabled = false;
 irq_init_fail:
drm_atomic_helper_shutdown(drm);
component_unbind_all(dev, drm);
@@ -909,7 +906,6 @@ static void malidp_unbind(struct device *dev)
drm_atomic_helper_shutdown(drm);
malidp_se_irq_fini(hwdev);
malidp_de_irq_fini(hwdev);
-   drm->irq_enabled = false;
component_unbind_all(dev, drm);
of_node_put(malidp->crtc.port);
malidp->crtc.port = NULL;
-- 
2.32.0

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH v2 07/22] drm/exynos: Don't set struct drm_device.irq_enabled

2021-06-22 Thread Thomas Zimmermann
The field drm_device.irq_enabled is only used by legacy drivers
with userspace modesetting. Don't set it in exynos.

Signed-off-by: Thomas Zimmermann 
---
 drivers/gpu/drm/exynos/exynos_drm_drv.c | 10 --
 1 file changed, 10 deletions(-)

diff --git a/drivers/gpu/drm/exynos/exynos_drm_drv.c 
b/drivers/gpu/drm/exynos/exynos_drm_drv.c
index e60257f1f24b..d8f1cf4d6b69 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_drv.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_drv.c
@@ -300,16 +300,6 @@ static int exynos_drm_bind(struct device *dev)
 
drm_mode_config_reset(drm);
 
-   /*
-* enable drm irq mode.
-* - with irq_enabled = true, we can use the vblank feature.
-*
-* P.S. note that we wouldn't use drm irq handler but
-*  just specific driver own one instead because
-*  drm framework supports only one irq handler.
-*/
-   drm->irq_enabled = true;
-
/* init kms poll for handling hpd */
drm_kms_helper_poll_init(drm);
 
-- 
2.32.0

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH v2 05/22] drm/komeda: Don't set struct drm_device.irq_enabled

2021-06-22 Thread Thomas Zimmermann
The field drm_device.irq_enabled is only used by legacy drivers
with userspace modesetting. Don't set it in komeda.

Signed-off-by: Thomas Zimmermann 
---
 drivers/gpu/drm/arm/display/komeda/komeda_kms.c | 4 
 1 file changed, 4 deletions(-)

diff --git a/drivers/gpu/drm/arm/display/komeda/komeda_kms.c 
b/drivers/gpu/drm/arm/display/komeda/komeda_kms.c
index ff45f23f3d56..52a6db5707a3 100644
--- a/drivers/gpu/drm/arm/display/komeda/komeda_kms.c
+++ b/drivers/gpu/drm/arm/display/komeda/komeda_kms.c
@@ -301,8 +301,6 @@ struct komeda_kms_dev *komeda_kms_attach(struct komeda_dev 
*mdev)
if (err)
goto free_component_binding;
 
-   drm->irq_enabled = true;
-
drm_kms_helper_poll_init(drm);
 
err = drm_dev_register(drm, 0);
@@ -313,7 +311,6 @@ struct komeda_kms_dev *komeda_kms_attach(struct komeda_dev 
*mdev)
 
 free_interrupts:
drm_kms_helper_poll_fini(drm);
-   drm->irq_enabled = false;
 free_component_binding:
component_unbind_all(mdev->dev, drm);
 cleanup_mode_config:
@@ -331,7 +328,6 @@ void komeda_kms_detach(struct komeda_kms_dev *kms)
drm_dev_unregister(drm);
drm_kms_helper_poll_fini(drm);
drm_atomic_helper_shutdown(drm);
-   drm->irq_enabled = false;
component_unbind_all(mdev->dev, drm);
drm_mode_config_cleanup(drm);
komeda_kms_cleanup_private_objs(kms);
-- 
2.32.0

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH v2 04/22] drm: Don't test for IRQ support in VBLANK ioctls

2021-06-22 Thread Thomas Zimmermann
For KMS drivers, replace the IRQ check in VBLANK ioctls with a check for
vblank support. IRQs might be enabled without vblank being supported.

This change also removes the DRM framework's only dependency on IRQ state
for non-legacy drivers. For legacy drivers with userspace modesetting,
the original test remains in drm_wait_vblank_ioctl().

v2:
* keep the old test for legacy drivers in
  drm_wait_vblank_ioctl() (Daniel)

Signed-off-by: Thomas Zimmermann 
---
 drivers/gpu/drm/drm_irq.c| 10 +++---
 drivers/gpu/drm/drm_vblank.c | 13 +
 2 files changed, 12 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c
index c3bd664ea733..1d7785721323 100644
--- a/drivers/gpu/drm/drm_irq.c
+++ b/drivers/gpu/drm/drm_irq.c
@@ -74,10 +74,8 @@
  * only supports devices with a single interrupt on the main device stored in
  * _device.dev and set as the device paramter in drm_dev_alloc().
  *
- * These IRQ helpers are strictly optional. Drivers which roll their own only
- * need to set _device.irq_enabled to signal the DRM core that vblank
- * interrupts are working. Since these helpers don't automatically clean up the
- * requested interrupt like e.g. devm_request_irq() they're not really
+ * These IRQ helpers are strictly optional. Since these helpers don't 
automatically
+ * clean up the requested interrupt like e.g. devm_request_irq() they're not 
really
  * recommended.
  */
 
@@ -91,9 +89,7 @@
  * and after the installation.
  *
  * This is the simplified helper interface provided for drivers with no special
- * needs. Drivers which need to install interrupt handlers for multiple
- * interrupts must instead set _device.irq_enabled to signal the DRM core
- * that vblank interrupts are available.
+ * needs.
  *
  * @irq must match the interrupt number that would be passed to request_irq(),
  * if called directly instead of using this helper function.
diff --git a/drivers/gpu/drm/drm_vblank.c b/drivers/gpu/drm/drm_vblank.c
index 3417e1ac7918..a98a4aad5037 100644
--- a/drivers/gpu/drm/drm_vblank.c
+++ b/drivers/gpu/drm/drm_vblank.c
@@ -1748,8 +1748,13 @@ int drm_wait_vblank_ioctl(struct drm_device *dev, void 
*data,
unsigned int pipe_index;
unsigned int flags, pipe, high_pipe;
 
-   if (!dev->irq_enabled)
-   return -EOPNOTSUPP;
+   if  (drm_core_check_feature(dev, DRIVER_MODESET)) {
+   if (!drm_dev_has_vblank(dev))
+   return -EOPNOTSUPP;
+   } else {
+   if (!dev->irq_enabled)
+   return -EOPNOTSUPP;
+   }
 
if (vblwait->request.type & _DRM_VBLANK_SIGNAL)
return -EINVAL;
@@ -2023,7 +2028,7 @@ int drm_crtc_get_sequence_ioctl(struct drm_device *dev, 
void *data,
if (!drm_core_check_feature(dev, DRIVER_MODESET))
return -EOPNOTSUPP;
 
-   if (!dev->irq_enabled)
+   if (!drm_dev_has_vblank(dev))
return -EOPNOTSUPP;
 
crtc = drm_crtc_find(dev, file_priv, get_seq->crtc_id);
@@ -2082,7 +2087,7 @@ int drm_crtc_queue_sequence_ioctl(struct drm_device *dev, 
void *data,
if (!drm_core_check_feature(dev, DRIVER_MODESET))
return -EOPNOTSUPP;
 
-   if (!dev->irq_enabled)
+   if (!drm_dev_has_vblank(dev))
return -EOPNOTSUPP;
 
crtc = drm_crtc_find(dev, file_priv, queue_seq->crtc_id);
-- 
2.32.0

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH v2 01/22] drm/amdgpu: Track IRQ state in local device state

2021-06-22 Thread Thomas Zimmermann
Replace usage of struct drm_device.irq_enabled with the driver's
own state field struct amdgpu_device.irq.installed. The field in
the DRM device structure is considered legacy and should not be
used by KMS drivers.

Signed-off-by: Thomas Zimmermann 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
index 32ce0e679dc7..7dad44e73cf6 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
@@ -599,7 +599,7 @@ void amdgpu_irq_gpu_reset_resume_helper(struct 
amdgpu_device *adev)
 int amdgpu_irq_get(struct amdgpu_device *adev, struct amdgpu_irq_src *src,
   unsigned type)
 {
-   if (!adev_to_drm(adev)->irq_enabled)
+   if (!adev->irq.installed)
return -ENOENT;
 
if (type >= src->num_types)
@@ -629,7 +629,7 @@ int amdgpu_irq_get(struct amdgpu_device *adev, struct 
amdgpu_irq_src *src,
 int amdgpu_irq_put(struct amdgpu_device *adev, struct amdgpu_irq_src *src,
   unsigned type)
 {
-   if (!adev_to_drm(adev)->irq_enabled)
+   if (!adev->irq.installed)
return -ENOENT;
 
if (type >= src->num_types)
@@ -660,7 +660,7 @@ int amdgpu_irq_put(struct amdgpu_device *adev, struct 
amdgpu_irq_src *src,
 bool amdgpu_irq_enabled(struct amdgpu_device *adev, struct amdgpu_irq_src *src,
unsigned type)
 {
-   if (!adev_to_drm(adev)->irq_enabled)
+   if (!adev->irq.installed)
return false;
 
if (type >= src->num_types)
-- 
2.32.0

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH v2 03/22] drm/radeon: Track IRQ state in local device state

2021-06-22 Thread Thomas Zimmermann
Replace usage of struct drm_device.irq_enabled with the driver's
own state field struct radeon_device.irq.installed. The field in
the DRM device structure is considered legacy and should not be
used by KMS drivers.

Signed-off-by: Thomas Zimmermann 
---
 drivers/gpu/drm/radeon/radeon_fence.c   |  2 +-
 drivers/gpu/drm/radeon/radeon_irq_kms.c | 16 
 2 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon_fence.c 
b/drivers/gpu/drm/radeon/radeon_fence.c
index 0d8ef2368adf..7ec581363e23 100644
--- a/drivers/gpu/drm/radeon/radeon_fence.c
+++ b/drivers/gpu/drm/radeon/radeon_fence.c
@@ -288,7 +288,7 @@ static void radeon_fence_check_lockup(struct work_struct 
*work)
return;
}
 
-   if (fence_drv->delayed_irq && rdev->ddev->irq_enabled) {
+   if (fence_drv->delayed_irq && rdev->irq.installed) {
unsigned long irqflags;
 
fence_drv->delayed_irq = false;
diff --git a/drivers/gpu/drm/radeon/radeon_irq_kms.c 
b/drivers/gpu/drm/radeon/radeon_irq_kms.c
index 84d0b1a3355f..a36ce826d0c0 100644
--- a/drivers/gpu/drm/radeon/radeon_irq_kms.c
+++ b/drivers/gpu/drm/radeon/radeon_irq_kms.c
@@ -357,7 +357,7 @@ void radeon_irq_kms_sw_irq_get(struct radeon_device *rdev, 
int ring)
 {
unsigned long irqflags;
 
-   if (!rdev->ddev->irq_enabled)
+   if (!rdev->irq.installed)
return;
 
if (atomic_inc_return(&rdev->irq.ring_int[ring]) == 1) {
@@ -396,7 +396,7 @@ void radeon_irq_kms_sw_irq_put(struct radeon_device *rdev, 
int ring)
 {
unsigned long irqflags;
 
-   if (!rdev->ddev->irq_enabled)
+   if (!rdev->irq.installed)
return;
 
if (atomic_dec_and_test(&rdev->irq.ring_int[ring])) {
@@ -422,7 +422,7 @@ void radeon_irq_kms_pflip_irq_get(struct radeon_device 
*rdev, int crtc)
if (crtc < 0 || crtc >= rdev->num_crtc)
return;
 
-   if (!rdev->ddev->irq_enabled)
+   if (!rdev->irq.installed)
return;
 
if (atomic_inc_return(&rdev->irq.pflip[crtc]) == 1) {
@@ -448,7 +448,7 @@ void radeon_irq_kms_pflip_irq_put(struct radeon_device 
*rdev, int crtc)
if (crtc < 0 || crtc >= rdev->num_crtc)
return;
 
-   if (!rdev->ddev->irq_enabled)
+   if (!rdev->irq.installed)
return;
 
if (atomic_dec_and_test(&rdev->irq.pflip[crtc])) {
@@ -470,7 +470,7 @@ void radeon_irq_kms_enable_afmt(struct radeon_device *rdev, 
int block)
 {
unsigned long irqflags;
 
-   if (!rdev->ddev->irq_enabled)
+   if (!rdev->irq.installed)
return;
 
spin_lock_irqsave(&rdev->irq.lock, irqflags);
@@ -492,7 +492,7 @@ void radeon_irq_kms_disable_afmt(struct radeon_device 
*rdev, int block)
 {
unsigned long irqflags;
 
-   if (!rdev->ddev->irq_enabled)
+   if (!rdev->irq.installed)
return;
 
spin_lock_irqsave(&rdev->irq.lock, irqflags);
@@ -514,7 +514,7 @@ void radeon_irq_kms_enable_hpd(struct radeon_device *rdev, 
unsigned hpd_mask)
unsigned long irqflags;
int i;
 
-   if (!rdev->ddev->irq_enabled)
+   if (!rdev->irq.installed)
return;
 
spin_lock_irqsave(&rdev->irq.lock, irqflags);
@@ -537,7 +537,7 @@ void radeon_irq_kms_disable_hpd(struct radeon_device *rdev, 
unsigned hpd_mask)
unsigned long irqflags;
int i;
 
-   if (!rdev->ddev->irq_enabled)
+   if (!rdev->irq.installed)
return;
 
spin_lock_irqsave(&rdev->irq.lock, irqflags);
-- 
2.32.0

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH v2 02/22] drm/hibmc: Call drm_irq_uninstall() unconditionally

2021-06-22 Thread Thomas Zimmermann
Remove the check around drm_irq_uninstall(). The same test is
done by the function internally. The tested state in irq_enabled
is considered obsolete and should not be used by KMS drivers.

Signed-off-by: Thomas Zimmermann 
---
 drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.c 
b/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.c
index f4bc5386574a..f8ef711bbe5d 100644
--- a/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.c
+++ b/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.c
@@ -253,8 +253,7 @@ static int hibmc_unload(struct drm_device *dev)
 {
drm_atomic_helper_shutdown(dev);
 
-   if (dev->irq_enabled)
-   drm_irq_uninstall(dev);
+   drm_irq_uninstall(dev);
 
pci_disable_msi(to_pci_dev(dev->dev));
 
-- 
2.32.0

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH v2 00/22] Deprecate struct drm_device.irq_enabled

2021-06-22 Thread Thomas Zimmermann
Remove references to struct drm_device.irq_enabled from modern
DRM drivers and core.

KMS drivers enable IRQs for their devices internally. They don't
have to keep track of the IRQ state via irq_enabled. For vblanking,
it's cleaner to test for vblanking support directly than to test
for enabled IRQs.

This used to be a single patch, [1] but it's now a full series.

The first 3 patches replace instances of irq_enabled that are not
required.

Patch 4 fixes vblank ioctls to actually test for vblank support
instead of IRQs.

The rest of the patchset removes irq_enabled from all non-legacy
drivers. The only exception is omapdrm, which has an internal
dependency on the field's value. For this driver, the state gets
duplicated internally.

With the patchset applied, drivers can later switch over to plain
Linux IRQ interfaces and DRM's IRQ midlayer can be declared legacy.

v2:
* keep the original test for legacy drivers in
  drm_wait_vblank_ioctl() (Daniel)

[1] https://lore.kernel.org/dri-devel/20210608090301.4752-1-tzimmerm...@suse.de/

Thomas Zimmermann (22):
  drm/amdgpu: Track IRQ state in local device state
  drm/hibmc: Call drm_irq_uninstall() unconditionally
  drm/radeon: Track IRQ state in local device state
  drm: Don't test for IRQ support in VBLANK ioctls
  drm/komeda: Don't set struct drm_device.irq_enabled
  drm/malidp: Don't set struct drm_device.irq_enabled
  drm/exynos: Don't set struct drm_device.irq_enabled
  drm/kirin: Don't set struct drm_device.irq_enabled
  drm/imx: Don't set struct drm_device.irq_enabled
  drm/mediatek: Don't set struct drm_device.irq_enabled
  drm/nouveau: Don't set struct drm_device.irq_enabled
  drm/omapdrm: Track IRQ state in local device state
  drm/rockchip: Don't set struct drm_device.irq_enabled
  drm/sti: Don't set struct drm_device.irq_enabled
  drm/stm: Don't set struct drm_device.irq_enabled
  drm/sun4i: Don't set struct drm_device.irq_enabled
  drm/tegra: Don't set struct drm_device.irq_enabled
  drm/tidss: Don't use struct drm_device.irq_enabled
  drm/vc4: Don't set struct drm_device.irq_enabled
  drm/vmwgfx: Don't set struct drm_device.irq_enabled
  drm/xlnx: Don't set struct drm_device.irq_enabled
  drm/zte: Don't set struct drm_device.irq_enabled

 drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c |  6 +++---
 drivers/gpu/drm/arm/display/komeda/komeda_kms.c |  4 
 drivers/gpu/drm/arm/malidp_drv.c|  4 
 drivers/gpu/drm/drm_irq.c   | 10 +++---
 drivers/gpu/drm/drm_vblank.c| 13 +
 drivers/gpu/drm/exynos/exynos_drm_drv.c | 10 --
 drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.c |  3 +--
 drivers/gpu/drm/hisilicon/kirin/kirin_drm_drv.c |  2 --
 drivers/gpu/drm/imx/dcss/dcss-kms.c |  3 ---
 drivers/gpu/drm/imx/imx-drm-core.c  | 11 ---
 drivers/gpu/drm/mediatek/mtk_drm_drv.c  |  6 --
 drivers/gpu/drm/nouveau/nouveau_drm.c   |  3 ---
 drivers/gpu/drm/omapdrm/omap_drv.h  |  2 ++
 drivers/gpu/drm/omapdrm/omap_irq.c  |  6 +++---
 drivers/gpu/drm/radeon/radeon_fence.c   |  2 +-
 drivers/gpu/drm/radeon/radeon_irq_kms.c | 16 
 drivers/gpu/drm/rockchip/rockchip_drm_drv.c |  6 --
 drivers/gpu/drm/sti/sti_compositor.c|  2 --
 drivers/gpu/drm/stm/ltdc.c  |  3 ---
 drivers/gpu/drm/sun4i/sun4i_drv.c   |  2 --
 drivers/gpu/drm/tegra/drm.c |  7 ---
 drivers/gpu/drm/tidss/tidss_irq.c   |  3 ---
 drivers/gpu/drm/vc4/vc4_kms.c   |  1 -
 drivers/gpu/drm/vmwgfx/vmwgfx_irq.c |  8 
 drivers/gpu/drm/xlnx/zynqmp_dpsub.c |  2 --
 drivers/gpu/drm/zte/zx_drm_drv.c|  6 --
 26 files changed, 30 insertions(+), 111 deletions(-)


base-commit: 8c1323b422f8473421682ba783b5949ddd89a3f4
prerequisite-patch-id: c2b2f08f0eccc9f5df0c0da49fa1d36267deb11d
prerequisite-patch-id: c67e5d886a47b7d0266d81100837557fda34cb24
--
2.32.0

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH] drm/amdgpu: move apu flags initialization to the start of device init

2021-06-22 Thread Huang Rui
In some asics, we need to adjust the behavior according to the apu flags
at a very early stage.

Signed-off-by: Huang Rui 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 36 ++
 drivers/gpu/drm/amd/amdgpu/nv.c|  1 -
 drivers/gpu/drm/amd/amdgpu/soc15.c | 10 +-
 3 files changed, 37 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 3f51b142fc83..e6702d136a6d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -1359,6 +1359,38 @@ static void 
amdgpu_device_check_smu_prv_buffer_size(struct amdgpu_device *adev)
adev->pm.smu_prv_buffer_size = 0;
 }
 
+static int amdgpu_device_init_apu_flags(struct amdgpu_device *adev)
+{
+   if (!(adev->flags & AMD_IS_APU) ||
+   adev->asic_type < CHIP_RAVEN)
+   return 0;
+
+   switch (adev->asic_type) {
+   case CHIP_RAVEN:
+   if (adev->pdev->device == 0x15dd)
+   adev->apu_flags |= AMD_APU_IS_RAVEN;
+   if (adev->pdev->device == 0x15d8)
+   adev->apu_flags |= AMD_APU_IS_PICASSO;
+   break;
+   case CHIP_RENOIR:
+   if ((adev->pdev->device == 0x1636) ||
+   (adev->pdev->device == 0x164c))
+   adev->apu_flags |= AMD_APU_IS_RENOIR;
+   else
+   adev->apu_flags |= AMD_APU_IS_GREEN_SARDINE;
+   break;
+   case CHIP_VANGOGH:
+   adev->apu_flags |= AMD_APU_IS_VANGOGH;
+   break;
+   case CHIP_YELLOW_CARP:
+   break;
+   default:
+   return -EINVAL;
+   }
+
+   return 0;
+}
+
 /**
  * amdgpu_device_check_arguments - validate module params
  *
@@ -3358,6 +3390,10 @@ int amdgpu_device_init(struct amdgpu_device *adev,
mutex_init(&adev->psp.mutex);
mutex_init(&adev->notifier_lock);
 
+   r = amdgpu_device_init_apu_flags(adev);
+   if (r)
+   return r;
+
r = amdgpu_device_check_arguments(adev);
if (r)
return r;
diff --git a/drivers/gpu/drm/amd/amdgpu/nv.c b/drivers/gpu/drm/amd/amdgpu/nv.c
index 455d0425787c..1470488a18e3 100644
--- a/drivers/gpu/drm/amd/amdgpu/nv.c
+++ b/drivers/gpu/drm/amd/amdgpu/nv.c
@@ -1275,7 +1275,6 @@ static int nv_common_early_init(void *handle)
break;
 
case CHIP_VANGOGH:
-   adev->apu_flags |= AMD_APU_IS_VANGOGH;
adev->cg_flags = AMD_CG_SUPPORT_GFX_MGCG |
AMD_CG_SUPPORT_GFX_MGLS |
AMD_CG_SUPPORT_GFX_CP_LS |
diff --git a/drivers/gpu/drm/amd/amdgpu/soc15.c 
b/drivers/gpu/drm/amd/amdgpu/soc15.c
index de85577c9cfd..b02436401d46 100644
--- a/drivers/gpu/drm/amd/amdgpu/soc15.c
+++ b/drivers/gpu/drm/amd/amdgpu/soc15.c
@@ -1360,10 +1360,7 @@ static int soc15_common_early_init(void *handle)
break;
case CHIP_RAVEN:
adev->asic_funcs = &soc15_asic_funcs;
-   if (adev->pdev->device == 0x15dd)
-   adev->apu_flags |= AMD_APU_IS_RAVEN;
-   if (adev->pdev->device == 0x15d8)
-   adev->apu_flags |= AMD_APU_IS_PICASSO;
+
if (adev->rev_id >= 0x8)
adev->apu_flags |= AMD_APU_IS_RAVEN2;
 
@@ -1455,11 +1452,6 @@ static int soc15_common_early_init(void *handle)
break;
case CHIP_RENOIR:
adev->asic_funcs = &soc15_asic_funcs;
-   if ((adev->pdev->device == 0x1636) ||
-   (adev->pdev->device == 0x164c))
-   adev->apu_flags |= AMD_APU_IS_RENOIR;
-   else
-   adev->apu_flags |= AMD_APU_IS_GREEN_SARDINE;
 
if (adev->apu_flags & AMD_APU_IS_RENOIR)
adev->external_rev_id = adev->rev_id + 0x91;
-- 
2.25.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [Linaro-mm-sig] [PATCH v3 1/2] habanalabs: define uAPI to export FD for DMA-BUF

2021-06-22 Thread Oded Gabbay
On Tue, Jun 22, 2021 at 3:15 PM Jason Gunthorpe  wrote:
>
> On Tue, Jun 22, 2021 at 03:04:30PM +0300, Oded Gabbay wrote:
> > On Tue, Jun 22, 2021 at 3:01 PM Jason Gunthorpe  wrote:
> > >
> > > On Tue, Jun 22, 2021 at 11:42:27AM +0300, Oded Gabbay wrote:
> > > > On Tue, Jun 22, 2021 at 9:37 AM Christian König
> > > >  wrote:
> > > > >
> > > > > On 6/22/2021 1:29 AM, Jason Gunthorpe wrote:
> > > > > > On Mon, Jun 21, 2021 at 10:24:16PM +0300, Oded Gabbay wrote:
> > > > > >
> > > > > >> Another thing I want to emphasize is that we are doing p2p only
> > > > > >> through the export/import of the FD. We do *not* allow the user to
> > > > > >> mmap the dma-buf as we do not support direct IO. So there is no 
> > > > > >> access
> > > > > >> to these pages through the userspace.
> > > > > > Arguably mmaping the memory is a better choice, and is the direction
> > > > > > that Logan's series goes in. Here the use of DMABUF was specifically
> > > > > > designed to allow hitless revokation of the memory, which this isn't
> > > > > > even using.
> > > > >
> > > > > The major problem with this approach is that DMA-buf is also used for
> > > > > memory which isn't CPU accessible.
> > >
> > > That isn't an issue here because the memory is only intended to be
> > > used with P2P transfers so it must be CPU accessible.
> > >
> > > > > That was one of the reasons we didn't even considered using the 
> > > > > mapping
> > > > > memory approach for GPUs.
> > >
> > > Well, now we have DEVICE_PRIVATE memory that can meet this need
> > > too.. Just nobody has wired it up to hmm_range_fault()
> > >
> > > > > > So you are taking the hit of very limited hardware support and 
> > > > > > reduced
> > > > > > performance just to squeeze into DMABUF..
> > > >
> > > > Thanks Jason for the clarification, but I honestly prefer to use
> > > > DMA-BUF at the moment.
> > > > It gives us just what we need (even more than what we need as you
> > > > pointed out), it is *already* integrated and tested in the RDMA
> > > > subsystem, and I'm feeling comfortable using it as I'm somewhat
> > > > familiar with it from my AMD days.
> > >
> > > You still have the issue that this patch is doing all of this P2P
> > > stuff wrong - following the already NAK'd AMD approach.
> >
> > Could you please point me exactly to the lines of code that are wrong
> > in your opinion ?
>
> 1) Setting sg_page to NULL
> 2) 'mapping' pages for P2P DMA without going through the iommu
> 3) Allowing P2P DMA without using the p2p dma API to validate that it
>can work at all in the first place.
>
> All of these result in functional bugs in certain system
> configurations.
>
> Jason

Hi Jason,
Thanks for the feedback.
Regarding point 1, why is that a problem if we disable the option to
mmap the dma-buf from user-space? We don't want to support CPU
fallback/Direct IO.
In addition, I didn't see any problem with sg_page being NULL in the
RDMA p2p dma-buf code. Did I miss something here ?

Regarding points 2 & 3, I want to examine them more closely in a KVM
virtual machine environment with IOMMU enabled.
I will take two GAUDI devices and use one as an exporter and one as an
importer. I want to see that the solution works end-to-end, with real
device DMA from importer to exporter.
I fear that the dummy importer I wrote is bypassing these two issues
you brought up.

So thanks again and I'll get back and update once I've finished testing it.

Oded
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [Linaro-mm-sig] [PATCH v3 1/2] habanalabs: define uAPI to export FD for DMA-BUF

2021-06-22 Thread Jason Gunthorpe
On Tue, Jun 22, 2021 at 03:04:30PM +0300, Oded Gabbay wrote:
> On Tue, Jun 22, 2021 at 3:01 PM Jason Gunthorpe  wrote:
> >
> > On Tue, Jun 22, 2021 at 11:42:27AM +0300, Oded Gabbay wrote:
> > > On Tue, Jun 22, 2021 at 9:37 AM Christian König
> > >  wrote:
> > > >
> > > > > On 6/22/2021 1:29 AM, Jason Gunthorpe wrote:
> > > > > On Mon, Jun 21, 2021 at 10:24:16PM +0300, Oded Gabbay wrote:
> > > > >
> > > > >> Another thing I want to emphasize is that we are doing p2p only
> > > > >> through the export/import of the FD. We do *not* allow the user to
> > > > >> mmap the dma-buf as we do not support direct IO. So there is no 
> > > > >> access
> > > > >> to these pages through the userspace.
> > > > > Arguably mmaping the memory is a better choice, and is the direction
> > > > > that Logan's series goes in. Here the use of DMABUF was specifically
> > > > > designed to allow hitless revokation of the memory, which this isn't
> > > > > even using.
> > > >
> > > > The major problem with this approach is that DMA-buf is also used for
> > > > memory which isn't CPU accessible.
> >
> > That isn't an issue here because the memory is only intended to be
> > used with P2P transfers so it must be CPU accessible.
> >
> > > > That was one of the reasons we didn't even considered using the mapping
> > > > memory approach for GPUs.
> >
> > Well, now we have DEVICE_PRIVATE memory that can meet this need
> > too.. Just nobody has wired it up to hmm_range_fault()
> >
> > > > > So you are taking the hit of very limited hardware support and reduced
> > > > > performance just to squeeze into DMABUF..
> > >
> > > Thanks Jason for the clarification, but I honestly prefer to use
> > > DMA-BUF at the moment.
> > > It gives us just what we need (even more than what we need as you
> > > pointed out), it is *already* integrated and tested in the RDMA
> > > subsystem, and I'm feeling comfortable using it as I'm somewhat
> > > familiar with it from my AMD days.
> >
> > You still have the issue that this patch is doing all of this P2P
> > stuff wrong - following the already NAK'd AMD approach.
> 
> Could you please point me exactly to the lines of code that are wrong
> in your opinion ?

1) Setting sg_page to NULL
2) 'mapping' pages for P2P DMA without going through the iommu
3) Allowing P2P DMA without using the p2p dma API to validate that it
   can work at all in the first place.

All of these result in functional bugs in certain system
configurations.
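
For reference, the third point is what the pci_p2pdma API exists for; a
hedged sketch of that validation step, assuming both ends are PCI
functions and using illustrative names:

#include <linux/pci.h>
#include <linux/pci-p2pdma.h>

static int validate_p2p_path(struct pci_dev *exporter, struct device *importer)
{
	struct device *clients[] = { importer };

	/* A negative distance means the platform cannot route P2P TLPs
	 * between the two devices, e.g. across certain root complexes. */
	if (pci_p2pdma_distance_many(exporter, clients, 1, true) < 0)
		return -EINVAL;

	return 0;
}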

Jason
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [Linaro-mm-sig] [PATCH v3 1/2] habanalabs: define uAPI to export FD for DMA-BUF

2021-06-22 Thread Jason Gunthorpe
On Tue, Jun 22, 2021 at 11:42:27AM +0300, Oded Gabbay wrote:
> On Tue, Jun 22, 2021 at 9:37 AM Christian König
>  wrote:
> >
> > On 6/22/2021 1:29 AM, Jason Gunthorpe wrote:
> > > On Mon, Jun 21, 2021 at 10:24:16PM +0300, Oded Gabbay wrote:
> > >
> > >> Another thing I want to emphasize is that we are doing p2p only
> > >> through the export/import of the FD. We do *not* allow the user to
> > >> mmap the dma-buf as we do not support direct IO. So there is no access
> > >> to these pages through the userspace.
> > > Arguably mmaping the memory is a better choice, and is the direction
> > > that Logan's series goes in. Here the use of DMABUF was specifically
> > > designed to allow hitless revokation of the memory, which this isn't
> > > even using.
> >
> > The major problem with this approach is that DMA-buf is also used for
> > memory which isn't CPU accessible.

That isn't an issue here because the memory is only intended to be
used with P2P transfers so it must be CPU accessible.

> > That was one of the reasons we didn't even considered using the mapping
> > memory approach for GPUs.

Well, now we have DEVICE_PRIVATE memory that can meet this need
too.. Just nobody has wired it up to hmm_range_fault()

> > > So you are taking the hit of very limited hardware support and reduced
> > > performance just to squeeze into DMABUF..
> 
> Thanks Jason for the clarification, but I honestly prefer to use
> DMA-BUF at the moment.
> It gives us just what we need (even more than what we need as you
> pointed out), it is *already* integrated and tested in the RDMA
> subsystem, and I'm feeling comfortable using it as I'm somewhat
> familiar with it from my AMD days.

You still have the issue that this patch is doing all of this P2P
stuff wrong - following the already NAK'd AMD approach.

> I'll go and read Logan's patch-set to see if that will work for us in
> the future. Please remember, as Daniel said, we don't have struct page
> backing our device memory, so if that is a requirement to connect to
> Logan's work, then I don't think we will want to do it at this point.

It is trivial to get the struct page for a PCI BAR.

Jason
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: radeon on drm-tip: null-ptr deref in radeon_ttm_bo_destroy()

2021-06-22 Thread Christian König

Hi Thomas,

yeah that's a known issue. A patch to fix that is already under review.

Christian.

On 6/22/2021 2:03 PM, Thomas Zimmermann wrote:

Hi,

on drm-tip, I see a null-ptr deref in radeon_ttm_bo_destroy(). Happens 
when I try to start weston or X. Full error is below. Let me know if 
you need more info.


Best regards
Thomas

[ 1849.999218] ==
[ 1850.006544] BUG: KASAN: null-ptr-deref in radeon_ttm_bo_destroy+0x39/0x1d0 [radeon]
[ 1850.014312] Read of size 4 at addr 0010 by task weston/1434
[ 1850.020938]
[ 1850.022434] CPU: 7 PID: 1434 Comm: weston Tainted: G    E 5.13.0-rc7-1-default+ #972
[ 1850.031233] Hardware name: Dell Inc. OptiPlex 9020/0N4YC8, BIOS A24 10/24/2018
[ 1850.038466] Call Trace:
[ 1850.040920]  dump_stack+0xa5/0xdc
[ 1850.044249]  ? radeon_ttm_bo_destroy+0x39/0x1d0 [radeon]
[ 1850.049639]  kasan_report.cold+0x5f/0xd8
[ 1850.053575]  ? radeon_ttm_bo_destroy+0x39/0x1d0 [radeon]
[ 1850.058967]  radeon_ttm_bo_destroy+0x39/0x1d0 [radeon]
[ 1850.064189]  radeon_bo_unref+0x1f/0x30 [radeon]
[ 1850.068798]  radeon_gem_object_free+0x5f/0x80 [radeon]
[ 1850.074016]  ? radeon_gem_object_mmap+0x70/0x70 [radeon]
[ 1850.079404]  ? drm_gem_object_handle_put_unlocked+0xd0/0x160 [drm]
[ 1850.085673]  ? drm_gem_object_free+0x25/0x40 [drm]
[ 1850.090524]  drm_gem_object_release_handle+0x8e/0xa0 [drm]
[ 1850.096070]  drm_gem_handle_delete+0x5b/0xa0 [drm]
[ 1850.100922]  ? drm_gem_handle_create+0x50/0x50 [drm]
[ 1850.105947]  drm_ioctl_kernel+0x131/0x180 [drm]
[ 1850.110538]  ? drm_setversion+0x340/0x340 [drm]
[ 1850.115135]  ? drm_gem_handle_create+0x50/0x50 [drm]
[ 1850.120157]  drm_ioctl+0x309/0x540 [drm]
[ 1850.124143]  ? drm_version+0x150/0x150 [drm]
[ 1850.128470]  ? __lock_release+0x12f/0x4e0
[ 1850.132496]  ? lock_downgrade+0xa0/0xa0
[ 1850.136342]  ? rpm_callback+0xe0/0xe0
[ 1850.140015]  ? mark_held_locks+0x23/0x90
[ 1850.143951]  ? lockdep_hardirqs_on_prepare.part.0+0x128/0x1d0
[ 1850.149708]  ? _raw_spin_unlock_irqrestore+0x37/0x40
[ 1850.154684]  ? lockdep_hardirqs_on+0x77/0xf0
[ 1850.158967]  ? _raw_spin_unlock_irqrestore+0x37/0x40
[ 1850.163947]  radeon_drm_ioctl+0x75/0xd0 [radeon]
[ 1850.168644]  __x64_sys_ioctl+0xb9/0xf0
[ 1850.172406]  do_syscall_64+0x40/0xb0
[ 1850.175992]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 1850.181053] RIP: 0033:0x7f7d5fd0c0bb
[ 1850.184636] Code: ff ff ff 85 c0 79 8b 49 c7 c4 ff ff ff ff 5b 5d 4c 89 e0 41 5c c3 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 85 bd 0c 00 f7 d8 64 89 01 48
[ 1850.203436] RSP: 002b:7ffc3fb35778 EFLAGS: 0246 ORIG_RAX: 0010
[ 1850.211020] RAX: ffda RBX: 7ffc3fb357c8 RCX: 7f7d5fd0c0bb
[ 1850.218171] RDX: 7ffc3fb357c8 RSI: 40086409 RDI: 0010
[ 1850.225330] RBP: 40086409 R08:  R09: 
[ 1850.232489] R10: 7ffc3fbf4080 R11: 0246 R12: 5561d758e130
[ 1850.239647] R13: 0010 R14: 5561d7bda6f0 R15: 5561d7bcb250
[ 1850.246863] ==
[ 1850.254107] Disabling lock debugging due to kernel taint
[ 1850.259487] BUG: kernel NULL pointer dereference, address: 0010
[ 1850.266458] #PF: supervisor read access in kernel mode
[ 1850.271602] #PF: error_code(0x) - not-present page
[ 1850.276746] PGD 0 P4D 0
[ 1850.279283] Oops:  [#1] SMP KASAN PTI
[ 1850.283296] CPU: 7 PID: 1434 Comm: weston Tainted: G    B   E 5.13.0-rc7-1-default+ #972
[ 1850.292092] Hardware name: Dell Inc. OptiPlex 9020/0N4YC8, BIOS A24 10/24/2018
[ 1850.299324] RIP: 0010:radeon_ttm_bo_destroy+0x40/0x1d0 [radeon]
[ 1850.305323] Code: 81 c7 68 02 00 00 53 4c 8d ad 08 03 00 00 e8 47 0f d6 ce 48 8b 9d 68 02 00 00 48 8d 7b 10 e8 37 0e d6 ce 48 8d bd 18 01 00 00 <44> 8b 7b 10 e8 27 0f d6 ce 4c 8b b5 18 01 00 00 4c 89 ef e8 18 0f
[ 1850.324124] RSP: 0018:c9000367fbf8 EFLAGS: 00010282
[ 1850.329356] RAX: 0001 RBX:  RCX: dc00
[ 1850.336499] RDX: 0007 RSI: 0004 RDI: 88818b2fd190
[ 1850.343643] RBP: 88818b2fd078 R08:  R09: 9154f743
[ 1850.350787] R10: fbfff22a9ee8 R11: 0001 R12: 88818b2fd000
[ 1850.357933] R13: 88818b2fd380 R14: 8881ecf87098 R15: 8881ecf87038
[ 1850.365076] FS:  7f7d5f6618c0() GS:8887b7e0() knlGS:
[ 1850.373176] CS:  0010 DS:  ES:  CR0: 80050033
[ 1850.378927] CR2: 0010 CR3: 00024b49a002 CR4: 001706e0
[ 1850.386070] Call Trace:
[ 1850.388519]  radeon_bo_unref+0x1f/0x30 [radeon]
[ 1850.393125]

Re: [Linaro-mm-sig] [PATCH v3 1/2] habanalabs: define uAPI to export FD for DMA-BUF

2021-06-22 Thread Oded Gabbay
On Tue, Jun 22, 2021 at 3:01 PM Jason Gunthorpe  wrote:
>
> On Tue, Jun 22, 2021 at 11:42:27AM +0300, Oded Gabbay wrote:
> > On Tue, Jun 22, 2021 at 9:37 AM Christian König
> >  wrote:
> > >
> > > Am 22.06.21 um 01:29 schrieb Jason Gunthorpe:
> > > > On Mon, Jun 21, 2021 at 10:24:16PM +0300, Oded Gabbay wrote:
> > > >
> > > >> Another thing I want to emphasize is that we are doing p2p only
> > > >> through the export/import of the FD. We do *not* allow the user to
> > > >> mmap the dma-buf as we do not support direct IO. So there is no access
> > > >> to these pages through the userspace.
> > > > Arguably mmapping the memory is a better choice, and is the direction
> > > > that Logan's series goes in. Here the use of DMABUF was specifically
> > > > designed to allow hitless revocation of the memory, which this isn't
> > > > even using.
> > >
> > > The major problem with this approach is that DMA-buf is also used for
> > > memory which isn't CPU accessible.
>
> That isn't an issue here because the memory is only intended to be
> used with P2P transfers so it must be CPU accessible.
>
> > > That was one of the reasons we didn't even consider using the mapping
> > > memory approach for GPUs.
>
> Well, now we have DEVICE_PRIVATE memory that can meet this need
> too.. Just nobody has wired it up to hmm_range_fault()
>
> > > > So you are taking the hit of very limited hardware support and reduced
> > > > performance just to squeeze into DMABUF..
> >
> > Thanks Jason for the clarification, but I honestly prefer to use
> > DMA-BUF at the moment.
> > It gives us just what we need (even more than what we need as you
> > pointed out), it is *already* integrated and tested in the RDMA
> > subsystem, and I'm feeling comfortable using it as I'm somewhat
> > familiar with it from my AMD days.
>
> You still have the issue that this patch is doing all of this P2P
> stuff wrong - following the already NAK'd AMD approach.

Could you please point me exactly to the lines of code that are wrong
in your opinion?
I find it hard to tell from your statement what exactly you think we
are doing wrong.
The implementation is found in the second patch in this patch-set.

Thanks,
Oded
>
> > I'll go and read Logan's patch-set to see if that will work for us in
> > the future. Please remember, as Daniel said, we don't have struct page
> > backing our device memory, so if that is a requirement to connect to
> > Logan's work, then I don't think we will want to do it at this point.
>
> It is trivial to get the struct page for a PCI BAR.
>
> Jason
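For context, the "trivial to get the struct page for a PCI BAR" remark refers to the kernel's P2PDMA infrastructure. Below is a minimal, hypothetical sketch of how a driver can publish a BAR so that it is backed by struct pages; the BAR index, size and function name are illustrative assumptions, not habanalabs code:

#include <linux/pci.h>
#include <linux/pci-p2pdma.h>
#include <linux/sizes.h>

/* Hypothetical sketch: publish BAR 4 as P2P memory so it gets
 * ZONE_DEVICE struct pages, then allocate a chunk of it.
 */
static int example_setup_p2p(struct pci_dev *pdev)
{
        void *addr;
        int rc;

        /* Creates a dev_pagemap covering the whole BAR. */
        rc = pci_p2pdma_add_resource(pdev, 4, pci_resource_len(pdev, 4), 0);
        if (rc)
                return rc;

        /* Memory handed out here is backed by struct pages... */
        addr = pci_alloc_p2pmem(pdev, SZ_1M);
        if (!addr)
                return -ENOMEM;

        /* ...so virt_to_page() works and the pages can be fed into
         * scatterlists and the usual DMA machinery.
         */
        pr_info("p2p page: %p\n", virt_to_page(addr));

        pci_free_p2pmem(pdev, addr, SZ_1M);
        return 0;
}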


radeon on drm-tip: null-ptr deref in radeon_ttm_bo_destroy()

2021-06-22 Thread Thomas Zimmermann

Hi,

on drm-tip, I see a null-ptr deref in radeon_ttm_bo_destroy(). Happens 
when I try to start weston or X. Full error is below. Let me know if you 
need more info.


Best regards
Thomas


[ 1849.999218] ==================================================================
[ 1850.006544] BUG: KASAN: null-ptr-deref in radeon_ttm_bo_destroy+0x39/0x1d0 [radeon]
[ 1850.014312] Read of size 4 at addr 0010 by task weston/1434
[ 1850.020938]
[ 1850.022434] CPU: 7 PID: 1434 Comm: weston Tainted: G    E 5.13.0-rc7-1-default+ #972
[ 1850.031233] Hardware name: Dell Inc. OptiPlex 9020/0N4YC8, BIOS A24 10/24/2018
[ 1850.038466] Call Trace:
[ 1850.040920]  dump_stack+0xa5/0xdc
[ 1850.044249]  ? radeon_ttm_bo_destroy+0x39/0x1d0 [radeon]
[ 1850.049639]  kasan_report.cold+0x5f/0xd8
[ 1850.053575]  ? radeon_ttm_bo_destroy+0x39/0x1d0 [radeon]
[ 1850.058967]  radeon_ttm_bo_destroy+0x39/0x1d0 [radeon]
[ 1850.064189]  radeon_bo_unref+0x1f/0x30 [radeon]
[ 1850.068798]  radeon_gem_object_free+0x5f/0x80 [radeon]
[ 1850.074016]  ? radeon_gem_object_mmap+0x70/0x70 [radeon]
[ 1850.079404]  ? drm_gem_object_handle_put_unlocked+0xd0/0x160 [drm]
[ 1850.085673]  ? drm_gem_object_free+0x25/0x40 [drm]
[ 1850.090524]  drm_gem_object_release_handle+0x8e/0xa0 [drm]
[ 1850.096070]  drm_gem_handle_delete+0x5b/0xa0 [drm]
[ 1850.100922]  ? drm_gem_handle_create+0x50/0x50 [drm]
[ 1850.105947]  drm_ioctl_kernel+0x131/0x180 [drm]
[ 1850.110538]  ? drm_setversion+0x340/0x340 [drm]
[ 1850.115135]  ? drm_gem_handle_create+0x50/0x50 [drm]
[ 1850.120157]  drm_ioctl+0x309/0x540 [drm]
[ 1850.124143]  ? drm_version+0x150/0x150 [drm]
[ 1850.128470]  ? __lock_release+0x12f/0x4e0
[ 1850.132496]  ? lock_downgrade+0xa0/0xa0
[ 1850.136342]  ? rpm_callback+0xe0/0xe0
[ 1850.140015]  ? mark_held_locks+0x23/0x90
[ 1850.143951]  ? lockdep_hardirqs_on_prepare.part.0+0x128/0x1d0
[ 1850.149708]  ? _raw_spin_unlock_irqrestore+0x37/0x40
[ 1850.154684]  ? lockdep_hardirqs_on+0x77/0xf0
[ 1850.158967]  ? _raw_spin_unlock_irqrestore+0x37/0x40
[ 1850.163947]  radeon_drm_ioctl+0x75/0xd0 [radeon]
[ 1850.168644]  __x64_sys_ioctl+0xb9/0xf0
[ 1850.172406]  do_syscall_64+0x40/0xb0
[ 1850.175992]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 1850.181053] RIP: 0033:0x7f7d5fd0c0bb
[ 1850.184636] Code: ff ff ff 85 c0 79 8b 49 c7 c4 ff ff ff ff 5b 5d 4c 89 e0 41 5c c3 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 85 bd 0c 00 f7 d8 64 89 01 48
[ 1850.203436] RSP: 002b:7ffc3fb35778 EFLAGS: 0246 ORIG_RAX: 0010
[ 1850.211020] RAX: ffda RBX: 7ffc3fb357c8 RCX: 7f7d5fd0c0bb
[ 1850.218171] RDX: 7ffc3fb357c8 RSI: 40086409 RDI: 0010
[ 1850.225330] RBP: 40086409 R08:  R09: 
[ 1850.232489] R10: 7ffc3fbf4080 R11: 0246 R12: 5561d758e130
[ 1850.239647] R13: 0010 R14: 5561d7bda6f0 R15: 5561d7bcb250
[ 1850.246863] ==================================================================
[ 1850.254107] Disabling lock debugging due to kernel taint
[ 1850.259487] BUG: kernel NULL pointer dereference, address: 0010
[ 1850.266458] #PF: supervisor read access in kernel mode
[ 1850.271602] #PF: error_code(0x) - not-present page
[ 1850.276746] PGD 0 P4D 0
[ 1850.279283] Oops:  [#1] SMP KASAN PTI
[ 1850.283296] CPU: 7 PID: 1434 Comm: weston Tainted: G    B   E 5.13.0-rc7-1-default+ #972
[ 1850.292092] Hardware name: Dell Inc. OptiPlex 9020/0N4YC8, BIOS A24 10/24/2018
[ 1850.299324] RIP: 0010:radeon_ttm_bo_destroy+0x40/0x1d0 [radeon]
[ 1850.305323] Code: 81 c7 68 02 00 00 53 4c 8d ad 08 03 00 00 e8 47 0f d6 ce 48 8b 9d 68 02 00 00 48 8d 7b 10 e8 37 0e d6 ce 48 8d bd 18 01 00 00 <44> 8b 7b 10 e8 27 0f d6 ce 4c 8b b5 18 01 00 00 4c 89 ef e8 18 0f
[ 1850.324124] RSP: 0018:c9000367fbf8 EFLAGS: 00010282
[ 1850.329356] RAX: 0001 RBX:  RCX: dc00
[ 1850.336499] RDX: 0007 RSI: 0004 RDI: 88818b2fd190
[ 1850.343643] RBP: 88818b2fd078 R08:  R09: 9154f743
[ 1850.350787] R10: fbfff22a9ee8 R11: 0001 R12: 88818b2fd000
[ 1850.357933] R13: 88818b2fd380 R14: 8881ecf87098 R15: 8881ecf87038
[ 1850.365076] FS:  7f7d5f6618c0() GS:8887b7e0() knlGS:
[ 1850.373176] CS:  0010 DS:  ES:  CR0: 80050033
[ 1850.378927] CR2: 0010 CR3: 00024b49a002 CR4: 001706e0
[ 1850.386070] Call Trace:
[ 1850.388519]  radeon_bo_unref+0x1f/0x30 [radeon]
[ 1850.393125]  radeon_gem_object_free+0x5f/0x80 [radeon]
[ 1850.398338]  ? radeon_gem_object_mmap+0x70/0x70 [radeon]
[ 1850.403724]  ?

Re: [PATCH v4 09/17] drm/uAPI: Add "active color range" drm property as feedback for userspace

2021-06-22 Thread Simon Ser
On Tuesday, June 22nd, 2021 at 11:50, Werner Sembach  
wrote:

> Unknown is when no monitor is connected or when the
> connector/monitor is disabled.

I think the other connector props (link-status, non-desktop, etc.) don't
have a special "unset" value; instead the value is set to a random
enum entry. User-space should ignore the prop on these disconnected
connectors anyway.


Re: [PATCH] This patch replaces all the instances of dev_info with drm_info macro

2021-06-22 Thread kernel test robot
Hi Aman,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on linus/master]
[also build test ERROR on v5.13-rc7 next-20210621]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patches, we suggest using '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:
https://github.com/0day-ci/linux/commits/Aman-Jain/This-patch-replaces-all-the-instances-of-dev_info-with-drm_info-macro/20210622-140850
base:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 
a96bfed64c8986d6404e553f18203cae1f5ac7e6
config: x86_64-randconfig-a002-20210622 (attached as .config)
compiler: clang version 13.0.0 (https://github.com/llvm/llvm-project 
b3634d3e88b7f26534a5057bff182b7dced584fc)
reproduce (this is a W=1 build):
wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
# install x86_64 cross compiling tool for clang build
# apt-get install binutils-x86-64-linux-gnu
# 
https://github.com/0day-ci/linux/commit/aa0d692308d703f641f19def814f7c8d59468671
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review 
Aman-Jain/This-patch-replaces-all-the-instances-of-dev_info-with-drm_info-macro/20210622-140850
git checkout aa0d692308d703f641f19def814f7c8d59468671
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=x86_64 

If you fix the issue, kindly add the following tag as appropriate
Reported-by: kernel test robot 

All errors (new ones prefixed by >>):

>> drivers/gpu/drm/radeon/radeon_drv.c:311:4: error: no member named 'dev' in 
>> 'struct device'
   drm_info(&pdev->dev,
   ^~~~
   include/drm/drm_print.h:416:2: note: expanded from macro 'drm_info'
   __drm_printk((drm), info,, fmt, ##__VA_ARGS__)
   ^~
   include/drm/drm_print.h:412:27: note: expanded from macro '__drm_printk'
   dev_##level##type((drm)->dev, "[drm] " fmt, ##__VA_ARGS__)
   ~^
   include/linux/dev_printk.h:118:12: note: expanded from macro 'dev_info'
   _dev_info(dev, dev_fmt(fmt), ##__VA_ARGS__)
 ^~~
   drivers/gpu/drm/radeon/radeon_drv.c:323:4: error: no member named 'dev' in 
'struct device'
   drm_info(&pdev->dev,
   ^~~~
   include/drm/drm_print.h:416:2: note: expanded from macro 'drm_info'
   __drm_printk((drm), info,, fmt, ##__VA_ARGS__)
   ^~
   include/drm/drm_print.h:412:27: note: expanded from macro '__drm_printk'
   dev_##level##type((drm)->dev, "[drm] " fmt, ##__VA_ARGS__)
   ~^
   include/linux/dev_printk.h:118:12: note: expanded from macro 'dev_info'
   _dev_info(dev, dev_fmt(fmt), ##__VA_ARGS__)
 ^~~
   2 errors generated.


vim +311 drivers/gpu/drm/radeon/radeon_drv.c

   291  
   292  static int radeon_pci_probe(struct pci_dev *pdev,
   293  const struct pci_device_id *ent)
   294  {
   295  unsigned long flags = 0;
   296  struct drm_device *dev;
   297  int ret;
   298  
   299  if (!ent)
   300  return -ENODEV; /* Avoid NULL-ptr deref in 
drm_get_pci_dev */
   301  
   302  flags = ent->driver_data;
   303  
   304  if (!radeon_si_support) {
   305  switch (flags & RADEON_FAMILY_MASK) {
   306  case CHIP_TAHITI:
   307  case CHIP_PITCAIRN:
   308  case CHIP_VERDE:
   309  case CHIP_OLAND:
   310  case CHIP_HAINAN:
 > 311  drm_info(&pdev->dev,
   312   "SI support disabled by module 
param\n");
   313  return -ENODEV;
   314  }
   315  }
   316  if (!radeon_cik_support) {
   317  switch (flags & RADEON_FAMILY_MASK) {
   318  case CHIP_KAVERI:
   319  case CHIP_BONAIRE:
   320  case CHIP_HAWAII:
   321  case CHIP_KABINI:
   322  case CHIP_MULLINS:
   323  drm_info(&pdev->dev,
   324   "CIK support disabled by module 
param\n");
   325  return -ENODEV;
   326  }
   327  }
   328  
   329  if (vga_switcheroo_client_probe_defer(pdev))
   330  return -EP

Re: [PATCH] This patch replaces all the instances of dev_info with drm_info macro

2021-06-22 Thread kernel test robot
Hi Aman,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on linus/master]
[also build test ERROR on v5.13-rc7 next-20210621]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patches, we suggest using '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:
https://github.com/0day-ci/linux/commits/Aman-Jain/This-patch-replaces-all-the-instances-of-dev_info-with-drm_info-macro/20210622-140850
base:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 
a96bfed64c8986d6404e553f18203cae1f5ac7e6
config: ia64-randconfig-r005-20210622 (attached as .config)
compiler: ia64-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
# 
https://github.com/0day-ci/linux/commit/aa0d692308d703f641f19def814f7c8d59468671
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review 
Aman-Jain/This-patch-replaces-all-the-instances-of-dev_info-with-drm_info-macro/20210622-140850
git checkout aa0d692308d703f641f19def814f7c8d59468671
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross 
ARCH=ia64 

If you fix the issue, kindly add the following tag as appropriate
Reported-by: kernel test robot 

All errors (new ones prefixed by >>):

   In file included from include/linux/device.h:15,
from include/linux/pm_runtime.h:11,
from drivers/gpu/drm/radeon/radeon_drv.c:36:
   drivers/gpu/drm/radeon/radeon_drv.c: In function 'radeon_pci_probe':
>> include/drm/drm_print.h:412:27: error: 'struct device' has no member named 
>> 'dev'; did you mean 'devt'?
 412 |  dev_##level##type((drm)->dev, "[drm] " fmt, ##__VA_ARGS__)
 |   ^~~
   include/linux/dev_printk.h:118:12: note: in definition of macro 'dev_info'
 118 |  _dev_info(dev, dev_fmt(fmt), ##__VA_ARGS__)
 |^~~
   include/drm/drm_print.h:416:2: note: in expansion of macro '__drm_printk'
 416 |  __drm_printk((drm), info,, fmt, ##__VA_ARGS__)
 |  ^~~~
   drivers/gpu/drm/radeon/radeon_drv.c:311:4: note: in expansion of macro 
'drm_info'
  311 |    drm_info(&pdev->dev,
 |^~~~
>> include/drm/drm_print.h:412:27: error: 'struct device' has no member named 
>> 'dev'; did you mean 'devt'?
 412 |  dev_##level##type((drm)->dev, "[drm] " fmt, ##__VA_ARGS__)
 |   ^~~
   include/linux/dev_printk.h:118:12: note: in definition of macro 'dev_info'
 118 |  _dev_info(dev, dev_fmt(fmt), ##__VA_ARGS__)
 |^~~
   include/drm/drm_print.h:416:2: note: in expansion of macro '__drm_printk'
 416 |  __drm_printk((drm), info,, fmt, ##__VA_ARGS__)
 |  ^~~~
   drivers/gpu/drm/radeon/radeon_drv.c:323:4: note: in expansion of macro 
'drm_info'
  323 |    drm_info(&pdev->dev,
 |^~~~


vim +412 include/drm/drm_print.h

02c9656b2f0d69 Haneen Mohammed   2017-10-17  378  
02c9656b2f0d69 Haneen Mohammed   2017-10-17  379  /**
b52817e9de06a3 Mauro Carvalho Chehab 2020-10-27  380   * DRM_DEV_DEBUG() - 
Debug output for generic drm code
02c9656b2f0d69 Haneen Mohammed   2017-10-17  381   *
091756bbb1a961 Haneen Mohammed   2017-10-17  382   * @dev: device pointer
091756bbb1a961 Haneen Mohammed   2017-10-17  383   * @fmt: printf() like 
format string.
02c9656b2f0d69 Haneen Mohammed   2017-10-17  384   */
db87086492581c Joe Perches   2018-03-16  385  #define 
DRM_DEV_DEBUG(dev, fmt, ...)  \
db87086492581c Joe Perches   2018-03-16  386drm_dev_dbg(dev, 
DRM_UT_CORE, fmt, ##__VA_ARGS__)
b52817e9de06a3 Mauro Carvalho Chehab 2020-10-27  387  /**
b52817e9de06a3 Mauro Carvalho Chehab 2020-10-27  388   * DRM_DEV_DEBUG_DRIVER() 
- Debug output for vendor specific part of the driver
b52817e9de06a3 Mauro Carvalho Chehab 2020-10-27  389   *
b52817e9de06a3 Mauro Carvalho Chehab 2020-10-27  390   * @dev: device pointer
b52817e9de06a3 Mauro Carvalho Chehab 2020-10-27  391   * @fmt: printf() like 
format string.
b52817e9de06a3 Mauro Carvalho Chehab 2020-10-27  392   */
db87086492581c Joe Perches   2018-03-16  393  #define 
DRM_DEV_DEBUG_DRIVER(dev, fmt, ...)   \
db87086492581c Joe Perches   2018-03-16  394drm_dev_dbg(dev, 
DRM_UT_DRIVER, fmt, ##__VA_ARGS__)
b52817e9de06a3 Mauro Carvalho Chehab 2020-10-27  395  /**
b52817e9de06a3 Mauro Carvalho Chehab 2020-10-27  396   * DRM_DEV_DEBUG_KMS() - 
Debug output for modesetting code
b52817e9de06a3 Mauro Carvalho Chehab 2020-10-27  397   *
b52817e9de06a3 Mauro Carvalho Chehab 2020-10-27  398   * @dev: device pointer
b52817e
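
Both robot reports boil down to the same root cause: drm_info() expects a struct drm_device *, while radeon_pci_probe() only has a struct pci_dev at this point, so the __drm_printk() macro expands (drm)->dev on a struct device, which has no such member. A hedged sketch of the two possible directions (untested, for illustration only; kms_driver is radeon's drm_driver instance):

/* Option 1: keep dev_info(); at this point in probe no drm_device
 * exists yet, so device-based logging is the right tool.
 */
dev_info(&pdev->dev, "SI support disabled by module param\n");

/* Option 2: drm_info() only becomes usable once a drm_device exists,
 * i.e. after drm_dev_alloc() later in the probe path:
 */
struct drm_device *ddev = drm_dev_alloc(&kms_driver, &pdev->dev);

if (!IS_ERR(ddev))
        drm_info(ddev, "SI support disabled by module param\n");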

Re: [PATCH 1/1] drm/amdgpu: add helper function for vm pasid

2021-06-22 Thread Christian König




Am 22.06.21 um 08:57 schrieb Nirmoy Das:

Cleanup code related to vm pasid by adding helper functions.

Signed-off-by: Nirmoy Das 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 105 -
  1 file changed, 50 insertions(+), 55 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 63975bda8e76..6e476b173cbb 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -87,6 +87,46 @@ struct amdgpu_prt_cb {
struct dma_fence_cb cb;
  };

+static int amdgpu_vm_pasid_alloc(struct amdgpu_device *adev,
+struct amdgpu_vm *vm,
+unsigned int pasid,
+unsigned int *vm_pasid)
+{
+   unsigned long flags;
+   int r;
+
+   if (!pasid)
+   return 0;
+
+   spin_lock_irqsave(&adev->vm_manager.pasid_lock, flags);
+   r = idr_alloc(&adev->vm_manager.pasid_idr, vm, pasid, pasid + 1,
+ GFP_ATOMIC);
+   spin_unlock_irqrestore(&adev->vm_manager.pasid_lock, flags);
+   if (r < 0)
+   return r;
+   if (vm_pasid)
+   *vm_pasid = pasid;
+


OK, the more I read of this patch, the less sense it makes.

We don't allocate the pasid here, we just set it up in the idr.

What we could do is replace the idr with an xarray; that would
certainly make more sense than this.


Christian.


+   return 0;
+}
+
+static void amdgpu_vm_pasid_remove(struct amdgpu_device *adev,
+  unsigned int pasid,
+  unsigned int *vm_pasid)
+{
+   unsigned long flags;
+
+   if (!pasid)
+   return;
+
+   spin_lock_irqsave(&adev->vm_manager.pasid_lock, flags);
+   idr_remove(&adev->vm_manager.pasid_idr, pasid);
+   spin_unlock_irqrestore(&adev->vm_manager.pasid_lock, flags);
+
+   if (vm_pasid)
+   *vm_pasid = 0;
+}
+
  /*
   * vm eviction_lock can be taken in MMU notifiers. Make sure no reclaim-FS
   * happens while holding this lock anywhere to prevent deadlocks when
@@ -2940,18 +2980,8 @@ int amdgpu_vm_init(struct amdgpu_device *adev, struct 
amdgpu_vm *vm, u32 pasid)

amdgpu_bo_unreserve(vm->root.bo);

-   if (pasid) {
-   unsigned long flags;
-
-   spin_lock_irqsave(&adev->vm_manager.pasid_lock, flags);
-   r = idr_alloc(&adev->vm_manager.pasid_idr, vm, pasid, pasid + 1,
- GFP_ATOMIC);
-   spin_unlock_irqrestore(&adev->vm_manager.pasid_lock, flags);
-   if (r < 0)
-   goto error_free_root;
-
-   vm->pasid = pasid;
-   }
+   if (amdgpu_vm_pasid_alloc(adev, vm, pasid, &vm->pasid))
+   goto error_free_root;

INIT_KFIFO(vm->faults);

@@ -3038,19 +3068,11 @@ int amdgpu_vm_make_compute(struct amdgpu_device *adev, 
struct amdgpu_vm *vm,
r = amdgpu_vm_check_clean_reserved(adev, vm);
if (r)
goto unreserve_bo;
+   r = amdgpu_vm_pasid_alloc(adev, vm, pasid, NULL);
+   if (r ==  -ENOSPC)
+   goto unreserve_bo;

-   if (pasid) {
-   unsigned long flags;
-
-   spin_lock_irqsave(&adev->vm_manager.pasid_lock, flags);
-   r = idr_alloc(&adev->vm_manager.pasid_idr, vm, pasid, pasid + 1,
- GFP_ATOMIC);
-   spin_unlock_irqrestore(&adev->vm_manager.pasid_lock, flags);
-
-   if (r == -ENOSPC)
-   goto unreserve_bo;
-   r = 0;
-   }
+   r = 0;

/* Check if PD needs to be reinitialized and do it before
 * changing any other state, in case it fails.
@@ -3089,35 +3111,23 @@ int amdgpu_vm_make_compute(struct amdgpu_device *adev, 
struct amdgpu_vm *vm,
vm->is_compute_context = true;

if (vm->pasid) {
-   unsigned long flags;
-
-   spin_lock_irqsave(&adev->vm_manager.pasid_lock, flags);
-   idr_remove(&adev->vm_manager.pasid_idr, vm->pasid);
-   spin_unlock_irqrestore(&adev->vm_manager.pasid_lock, flags);
-
/* Free the original amdgpu allocated pasid
 * Will be replaced with kfd allocated pasid
 */
amdgpu_pasid_free(vm->pasid);
-   vm->pasid = 0;
+   amdgpu_vm_pasid_remove(adev, vm->pasid, &vm->pasid);
}

/* Free the shadow bo for compute VM */
 amdgpu_bo_unref(&to_amdgpu_bo_vm(vm->root.bo)->shadow);
-
if (pasid)
vm->pasid = pasid;

goto unreserve_bo;

  free_idr:
-   if (pasid) {
-   unsigned long flags;
+   amdgpu_vm_pasid_remove(adev, pasid, NULL);

-   spin_lock_irqsave(&adev->vm_manager.pasid_lock, flags);
-   idr_remove(&adev->vm_manager.pasid_idr, pasid);
-   spin_unlock_irqrestore(&adev->vm_manager.pasid_lock, flags);
-   }
  unreserve_bo:

Re: [PATCH] This patch replaces all the instances of dev_info with drm_info macro

2021-06-22 Thread Christian König

Am 22.06.21 um 08:07 schrieb Aman Jain:

When a driver has multiple instances it is necessary to differentiate
between them in the logs. This was done with dev_info/warn/err since
DRM_INFO/WARN/ERROR don't do this. We now have drm_info/warn/err for
printing the relevant debug messages. Hence, this patch uses
drm_* macros to achieve drm-formatted logging


Well, first of all, patches for radeon should have a drm/radeon prefix in
their subject line.


Then I don't think this patch makes sense, since this is about the
hardware support of the module and not even remotely DRM-related.


So we most likely don't want the drm formatting here in the first place.

Regards,
Christian.



Signed-off-by: Aman Jain 
---
  drivers/gpu/drm/radeon/radeon_drv.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon_drv.c 
b/drivers/gpu/drm/radeon/radeon_drv.c
index efeb115ae70e..75e84914c29b 100644
--- a/drivers/gpu/drm/radeon/radeon_drv.c
+++ b/drivers/gpu/drm/radeon/radeon_drv.c
@@ -308,7 +308,7 @@ static int radeon_pci_probe(struct pci_dev *pdev,
case CHIP_VERDE:
case CHIP_OLAND:
case CHIP_HAINAN:
-   dev_info(&pdev->dev,
+   drm_info(&pdev->dev,
 "SI support disabled by module param\n");
return -ENODEV;
}
@@ -320,7 +320,7 @@ static int radeon_pci_probe(struct pci_dev *pdev,
case CHIP_HAWAII:
case CHIP_KABINI:
case CHIP_MULLINS:
-   dev_info(&pdev->dev,
+   drm_info(&pdev->dev,
 "CIK support disabled by module param\n");
return -ENODEV;
}




Re: [PATCH 1/1] drm/amdgpu: add helper function for vm pasid

2021-06-22 Thread Christian König

Am 22.06.21 um 12:30 schrieb Das, Nirmoy:


On 6/22/2021 10:36 AM, Christian König wrote:


Am 22.06.21 um 09:39 schrieb Das, Nirmoy:


On 6/22/2021 9:03 AM, Christian König wrote:



Am 22.06.21 um 08:57 schrieb Nirmoy Das:

Cleanup code related to vm pasid by adding helper functions.

Signed-off-by: Nirmoy Das 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 105 
-

  1 file changed, 50 insertions(+), 55 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c

index 63975bda8e76..6e476b173cbb 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -87,6 +87,46 @@ struct amdgpu_prt_cb {
  struct dma_fence_cb cb;
  };

+static int amdgpu_vm_pasid_alloc(struct amdgpu_device *adev,
+ struct amdgpu_vm *vm,
+ unsigned int pasid,
+ unsigned int *vm_pasid)
+{
+    unsigned long flags;
+    int r;
+
+    if (!pasid)
+    return 0;
+
+    spin_lock_irqsave(&adev->vm_manager.pasid_lock, flags);
+    r = idr_alloc(&adev->vm_manager.pasid_idr, vm, pasid, pasid + 1,
+  GFP_ATOMIC);
+    spin_unlock_irqrestore(&adev->vm_manager.pasid_lock, flags);
+    if (r < 0)
+    return r;
+    if (vm_pasid)
+    *vm_pasid = pasid;
+


OK, the more I read of this patch, the less sense it makes.

We don't allocate the pasid here, we just set it up in the idr.

What we could do is replace the idr with an xarray; that would
certainly make more sense than this.



xarray looks great; with that we don't need the pasid_lock either.


You still need the lock to protect against VM destruction while 
looking things up, but you could switch to RCU for this instead.



xarray has xa_{lock|unlock}_irqsave() and adev->vm_manager.pasid_xa
will exist for the device's lifetime.


That's just a wrapper around the lock.


So I am thinking something like:

amdgpu_vm_pasid_insert()

{

...

xa_lock_irqsave(&adev->vm_manager.pasids, flags)
r = xa_store(&adev->vm_manager.pasids, pasid, vm, GFP_ATOMIC);
xa_unlock_irqrestore(&adev->vm_manager.pasids, flags)


It would be really nice if we could avoid the GFP_ATOMIC here, but not 
much of a problem since we had that before.



}

amdgpu_vm_pasid_remove()

{



xa_lock_irqsave(&adev->vm_manager.pasids, flags)
xa_erase(&adev->vm_manager.pasids, pasid);
xa_unlock_irqrestore(&adev->vm_manager.pasids, flags)

}


xa_{lock|unlock}_irqsave() can be used while looking up the vm pointer for a
pasid.



Shouldn't this be enough?



Yeah I think so.

Christian.



Regards,

Nirmoy



Christian.




Thanks

Nirmoy




Christian.


+    return 0;
+}
+
+static void amdgpu_vm_pasid_remove(struct amdgpu_device *adev,
+   unsigned int pasid,
+   unsigned int *vm_pasid)
+{
+    unsigned long flags;
+
+    if (!pasid)
+    return;
+
+    spin_lock_irqsave(&adev->vm_manager.pasid_lock, flags);
+    idr_remove(&adev->vm_manager.pasid_idr, pasid);
+    spin_unlock_irqrestore(&adev->vm_manager.pasid_lock, flags);
+
+    if (vm_pasid)
+    *vm_pasid = 0;
+}
+
  /*
   * vm eviction_lock can be taken in MMU notifiers. Make sure no 
reclaim-FS
   * happens while holding this lock anywhere to prevent deadlocks 
when
@@ -2940,18 +2980,8 @@ int amdgpu_vm_init(struct amdgpu_device 
*adev, struct amdgpu_vm *vm, u32 pasid)


  amdgpu_bo_unreserve(vm->root.bo);

-    if (pasid) {
-    unsigned long flags;
-
-    spin_lock_irqsave(&adev->vm_manager.pasid_lock, flags);
-    r = idr_alloc(&adev->vm_manager.pasid_idr, vm, pasid, pasid + 1,
-  GFP_ATOMIC);
-    spin_unlock_irqrestore(&adev->vm_manager.pasid_lock, flags);
-    if (r < 0)
-    goto error_free_root;
-
-    vm->pasid = pasid;
-    }
+    if (amdgpu_vm_pasid_alloc(adev, vm, pasid, &vm->pasid))
+    goto error_free_root;

  INIT_KFIFO(vm->faults);

@@ -3038,19 +3068,11 @@ int amdgpu_vm_make_compute(struct 
amdgpu_device *adev, struct amdgpu_vm *vm,

  r = amdgpu_vm_check_clean_reserved(adev, vm);
  if (r)
  goto unreserve_bo;
+    r = amdgpu_vm_pasid_alloc(adev, vm, pasid, NULL);
+    if (r ==  -ENOSPC)
+    goto unreserve_bo;

-    if (pasid) {
-    unsigned long flags;
-
-    spin_lock_irqsave(&adev->vm_manager.pasid_lock, flags);
-    r = idr_alloc(&adev->vm_manager.pasid_idr, vm, pasid, pasid + 1,
-  GFP_ATOMIC);
-    spin_unlock_irqrestore(&adev->vm_manager.pasid_lock, flags);
-
-    if (r == -ENOSPC)
-    goto unreserve_bo;
-    r = 0;
-    }
+    r = 0;

  /* Check if PD needs to be reinitialized and do it before
   * changing any other state, in case it fails.
@@ -3089,35 +3111,23 @@ int amdgpu_vm_make_compute(struct 
amdgpu_device *adev, struct amdgpu_vm *vm,

  vm->is_compute_context = true;

  if (vm->pasid) {
-    unsigned long flags;
-
-    spin_lock_irqsave(&adev->vm_manager.pasid_lock, flags);
-    idr_remove(&adev->vm_manager.pasid_idr, vm->pasid);
-    spin_unlock_irqrestore(&adev->vm_manager.pasid_lock, flags);
-
  /* Free the 

Re: [PATCH 1/1] drm/amdgpu: add helper function for vm pasid

2021-06-22 Thread Das, Nirmoy


On 6/22/2021 10:36 AM, Christian König wrote:

Am 22.06.21 um 09:39 schrieb Das, Nirmoy:


On 6/22/2021 9:03 AM, Christian König wrote:



Am 22.06.21 um 08:57 schrieb Nirmoy Das:

Cleanup code related to vm pasid by adding helper functions.

Signed-off-by: Nirmoy Das 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 105 
-

  1 file changed, 50 insertions(+), 55 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c

index 63975bda8e76..6e476b173cbb 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -87,6 +87,46 @@ struct amdgpu_prt_cb {
  struct dma_fence_cb cb;
  };

+static int amdgpu_vm_pasid_alloc(struct amdgpu_device *adev,
+ struct amdgpu_vm *vm,
+ unsigned int pasid,
+ unsigned int *vm_pasid)
+{
+    unsigned long flags;
+    int r;
+
+    if (!pasid)
+    return 0;
+
+    spin_lock_irqsave(&adev->vm_manager.pasid_lock, flags);
+    r = idr_alloc(&adev->vm_manager.pasid_idr, vm, pasid, pasid + 1,
+  GFP_ATOMIC);
+    spin_unlock_irqrestore(&adev->vm_manager.pasid_lock, flags);
+    if (r < 0)
+    return r;
+    if (vm_pasid)
+    *vm_pasid = pasid;
+


OK, the more I read of this patch, the less sense it makes.

We don't allocate the pasid here, we just set it up in the idr.

What we could do is replace the idr with an xarray; that would
certainly make more sense than this.



xarray looks great; with that we don't need the pasid_lock either.


You still need the lock to protect against VM destruction while 
looking things up, but you could switch to RCU for this instead.



xarray has xa_{lock|unlock}_irqsave() and adev->vm_manager.pasid_xa
will exist for the device's lifetime. So I am thinking something like:


amdgpu_vm_pasid_insert()

{

...

xa_lock_irqsave(&adev->vm_manager.pasids, flags)
r = xa_store(&adev->vm_manager.pasids, pasid, vm, GFP_ATOMIC);
xa_unlock_irqrestore(&adev->vm_manager.pasids, flags)

}

amdgpu_vm_pasid_remove()

{



xa_lock_irqsave(&adev->vm_manager.pasids, flags)
xa_erase(&adev->vm_manager.pasids, pasid);
xa_unlock_irqrestore(&adev->vm_manager.pasids, flags)

}


xa_{lock|unlock}_irqsave() can be used while looking up the vm pointer for a pasid.


Shouldn't this be enough?


Regards,

Nirmoy



Christian.




Thanks

Nirmoy




Christian.


+    return 0;
+}
+
+static void amdgpu_vm_pasid_remove(struct amdgpu_device *adev,
+   unsigned int pasid,
+   unsigned int *vm_pasid)
+{
+    unsigned long flags;
+
+    if (!pasid)
+    return;
+
+    spin_lock_irqsave(&adev->vm_manager.pasid_lock, flags);
+    idr_remove(&adev->vm_manager.pasid_idr, pasid);
+    spin_unlock_irqrestore(&adev->vm_manager.pasid_lock, flags);
+
+    if (vm_pasid)
+    *vm_pasid = 0;
+}
+
  /*
   * vm eviction_lock can be taken in MMU notifiers. Make sure no 
reclaim-FS
   * happens while holding this lock anywhere to prevent deadlocks 
when
@@ -2940,18 +2980,8 @@ int amdgpu_vm_init(struct amdgpu_device 
*adev, struct amdgpu_vm *vm, u32 pasid)


  amdgpu_bo_unreserve(vm->root.bo);

-    if (pasid) {
-    unsigned long flags;
-
-    spin_lock_irqsave(&adev->vm_manager.pasid_lock, flags);
-    r = idr_alloc(&adev->vm_manager.pasid_idr, vm, pasid, pasid + 1,
-  GFP_ATOMIC);
-    spin_unlock_irqrestore(&adev->vm_manager.pasid_lock, flags);
-    if (r < 0)
-    goto error_free_root;
-
-    vm->pasid = pasid;
-    }
+    if (amdgpu_vm_pasid_alloc(adev, vm, pasid, &vm->pasid))
+    goto error_free_root;

  INIT_KFIFO(vm->faults);

@@ -3038,19 +3068,11 @@ int amdgpu_vm_make_compute(struct 
amdgpu_device *adev, struct amdgpu_vm *vm,

  r = amdgpu_vm_check_clean_reserved(adev, vm);
  if (r)
  goto unreserve_bo;
+    r = amdgpu_vm_pasid_alloc(adev, vm, pasid, NULL);
+    if (r ==  -ENOSPC)
+    goto unreserve_bo;

-    if (pasid) {
-    unsigned long flags;
-
-    spin_lock_irqsave(&adev->vm_manager.pasid_lock, flags);
-    r = idr_alloc(&adev->vm_manager.pasid_idr, vm, pasid, pasid + 1,
-  GFP_ATOMIC);
-    spin_unlock_irqrestore(&adev->vm_manager.pasid_lock, flags);
-
-    if (r == -ENOSPC)
-    goto unreserve_bo;
-    r = 0;
-    }
+    r = 0;

  /* Check if PD needs to be reinitialized and do it before
   * changing any other state, in case it fails.
@@ -3089,35 +3111,23 @@ int amdgpu_vm_make_compute(struct 
amdgpu_device *adev, struct amdgpu_vm *vm,

  vm->is_compute_context = true;

  if (vm->pasid) {
-    unsigned long flags;
-
-    spin_lock_irqsave(&adev->vm_manager.pasid_lock, flags);
-    idr_remove(&adev->vm_manager.pasid_idr, vm->pasid);
-    spin_unlock_irqrestore(&adev->vm_manager.pasid_lock, flags);
-
  /* Free the original amdgpu allocated pasid
   * Will be replaced with kfd allocated pasid
   */
  amdgpu_pasid_free(vm->pasid);
-    vm->pasid = 0;
+    amdgpu_vm_pasid_remove(adev, vm->pasid, &vm->pasid);
  }
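
For reference, a minimal sketch of what the xarray-based helpers discussed above could look like (untested; the pasids field name and the error handling are assumptions, not code from this series). Note that xa_store() takes the xarray lock itself, so under xa_lock_irqsave() the __xa_store()/__xa_erase() variants have to be used:

static int amdgpu_vm_pasid_insert(struct amdgpu_device *adev,
                                  struct amdgpu_vm *vm, u32 pasid)
{
        unsigned long flags;
        void *entry;

        if (!pasid)
                return 0;

        xa_lock_irqsave(&adev->vm_manager.pasids, flags);
        entry = __xa_store(&adev->vm_manager.pasids, pasid, vm, GFP_ATOMIC);
        xa_unlock_irqrestore(&adev->vm_manager.pasids, flags);

        /* __xa_store() returns the old entry or an xa_err() pointer. */
        return xa_err(entry);
}

static void amdgpu_vm_pasid_erase(struct amdgpu_device *adev, u32 pasid)
{
        unsigned long flags;

        if (!pasid)
                return;

        xa_lock_irqsave(&adev->vm_manager.pasids, flags);
        __xa_erase(&adev->vm_manager.pasids, pasid);
        xa_unlock_irqrestore(&adev->vm_manager.pasids, flags);
}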

   

RE: [PATCH] drm/amdgpu: move apu flags initialization to the start of device init

2021-06-22 Thread Liu, Aaron
Reviewed-by: Aaron Liu 

--
Best Regards
Aaron Liu

> -Original Message-
> From: Huang, Ray 
> Sent: Tuesday, June 22, 2021 5:41 PM
> To: amd-gfx@lists.freedesktop.org
> Cc: Deucher, Alexander ; Zhang, Hawking
> ; Zhou1, Tao ; Yu, Lang
> ; Gong, Curry ; Liu, Aaron
> ; Huang, Ray 
> Subject: [PATCH] drm/amdgpu: move apu flags initialization to the start of
> device init
> 
> On some ASICs, we need to adjust the behavior according to the APU flags at
> a very early stage.
> 
> Signed-off-by: Huang Rui 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 36
> ++
>  drivers/gpu/drm/amd/amdgpu/nv.c|  1 -
>  drivers/gpu/drm/amd/amdgpu/soc15.c | 10 +-
>  3 files changed, 37 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 3f51b142fc83..e6702d136a6d 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -1359,6 +1359,38 @@ static void
> amdgpu_device_check_smu_prv_buffer_size(struct amdgpu_device *adev)
>   adev->pm.smu_prv_buffer_size = 0;
>  }
> 
> +static int amdgpu_device_init_apu_flags(struct amdgpu_device *adev)
> +{
> + if (!(adev->flags & AMD_IS_APU) ||
> + adev->asic_type < CHIP_RAVEN)
> + return 0;
> +
> + switch (adev->asic_type) {
> + case CHIP_RAVEN:
> + if (adev->pdev->device == 0x15dd)
> + adev->apu_flags |= AMD_APU_IS_RAVEN;
> + if (adev->pdev->device == 0x15d8)
> + adev->apu_flags |= AMD_APU_IS_PICASSO;
> + break;
> + case CHIP_RENOIR:
> + if ((adev->pdev->device == 0x1636) ||
> + (adev->pdev->device == 0x164c))
> + adev->apu_flags |= AMD_APU_IS_RENOIR;
> + else
> + adev->apu_flags |= AMD_APU_IS_GREEN_SARDINE;
> + break;
> + case CHIP_VANGOGH:
> + adev->apu_flags |= AMD_APU_IS_VANGOGH;
> + break;
> + case CHIP_YELLOW_CARP:
> + break;
> + default:
> + return -EINVAL;
> + }
> +
> + return 0;
> +}
> +
>  /**
>   * amdgpu_device_check_arguments - validate module params
>   *
> @@ -3358,6 +3390,10 @@ int amdgpu_device_init(struct amdgpu_device
> *adev,
>   mutex_init(>psp.mutex);
>   mutex_init(>notifier_lock);
> 
> + r = amdgpu_device_init_apu_flags(adev);
> + if (r)
> + return r;
> +
>   r = amdgpu_device_check_arguments(adev);
>   if (r)
>   return r;
> diff --git a/drivers/gpu/drm/amd/amdgpu/nv.c
> b/drivers/gpu/drm/amd/amdgpu/nv.c index 455d0425787c..1470488a18e3
> 100644
> --- a/drivers/gpu/drm/amd/amdgpu/nv.c
> +++ b/drivers/gpu/drm/amd/amdgpu/nv.c
> @@ -1275,7 +1275,6 @@ static int nv_common_early_init(void *handle)
>   break;
> 
>   case CHIP_VANGOGH:
> - adev->apu_flags |= AMD_APU_IS_VANGOGH;
>   adev->cg_flags = AMD_CG_SUPPORT_GFX_MGCG |
>   AMD_CG_SUPPORT_GFX_MGLS |
>   AMD_CG_SUPPORT_GFX_CP_LS |
> diff --git a/drivers/gpu/drm/amd/amdgpu/soc15.c
> b/drivers/gpu/drm/amd/amdgpu/soc15.c
> index de85577c9cfd..b02436401d46 100644
> --- a/drivers/gpu/drm/amd/amdgpu/soc15.c
> +++ b/drivers/gpu/drm/amd/amdgpu/soc15.c
> @@ -1360,10 +1360,7 @@ static int soc15_common_early_init(void *handle)
>   break;
>   case CHIP_RAVEN:
>   adev->asic_funcs = &soc15_asic_funcs;
> - if (adev->pdev->device == 0x15dd)
> - adev->apu_flags |= AMD_APU_IS_RAVEN;
> - if (adev->pdev->device == 0x15d8)
> - adev->apu_flags |= AMD_APU_IS_PICASSO;
> +
>   if (adev->rev_id >= 0x8)
>   adev->apu_flags |= AMD_APU_IS_RAVEN2;
> 
> @@ -1455,11 +1452,6 @@ static int soc15_common_early_init(void *handle)
>   break;
>   case CHIP_RENOIR:
>   adev->asic_funcs = &soc15_asic_funcs;
> - if ((adev->pdev->device == 0x1636) ||
> - (adev->pdev->device == 0x164c))
> - adev->apu_flags |= AMD_APU_IS_RENOIR;
> - else
> - adev->apu_flags |= AMD_APU_IS_GREEN_SARDINE;
> 
>   if (adev->apu_flags & AMD_APU_IS_RENOIR)
>   adev->external_rev_id = adev->rev_id + 0x91;
> --
> 2.25.1



Re: [PATCH v4 15/17] drm/uAPI: Move "Broadcast RGB" property from driver specific to general context

2021-06-22 Thread Werner Sembach
Am 22.06.21 um 09:25 schrieb Pekka Paalanen:
> On Fri, 18 Jun 2021 11:11:14 +0200
> Werner Sembach  wrote:
>
>> Add "Broadcast RGB" to general drm context so that more drivers besides
>> i915 and gma500 can implement it without duplicating code.
>>
>> Userspace can use this property to tell the graphics driver to use full or
>> limited color range for a given connector, overriding the default
>> behaviour/automatic detection.
>>
>> Possible options are:
>> - Automatic (default/current behaviour)
>> - Full
>> - Limited 16:235
>>
>> In theory the driver should be able to automatically detect the monitor's
>> capabilities, but because of flawed standard implementations in monitors,
>> this might fail. In this case a manual override is required to not have
>> washed-out colors or lose details in very dark or bright scenes.
>>
>> Signed-off-by: Werner Sembach 
>> ---
>>  drivers/gpu/drm/drm_atomic_helper.c |  4 +++
>>  drivers/gpu/drm/drm_atomic_uapi.c   |  4 +++
>>  drivers/gpu/drm/drm_connector.c | 43 +
>>  include/drm/drm_connector.h | 16 +++
>>  4 files changed, 67 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/drm_atomic_helper.c 
>> b/drivers/gpu/drm/drm_atomic_helper.c
>> index 90d62f305257..0c89d32efbd0 100644
>> --- a/drivers/gpu/drm/drm_atomic_helper.c
>> +++ b/drivers/gpu/drm/drm_atomic_helper.c
>> @@ -691,6 +691,10 @@ drm_atomic_helper_check_modeset(struct drm_device *dev,
>>  if (old_connector_state->preferred_color_format !=
>>  new_connector_state->preferred_color_format)
>>  new_crtc_state->connectors_changed = true;
>> +
>> +if (old_connector_state->preferred_color_range !=
>> +new_connector_state->preferred_color_range)
>> +new_crtc_state->connectors_changed = true;
>>  }
>>  
>>  if (funcs->atomic_check)
>> diff --git a/drivers/gpu/drm/drm_atomic_uapi.c 
>> b/drivers/gpu/drm/drm_atomic_uapi.c
>> index c536f5e22016..c589bb1a8163 100644
>> --- a/drivers/gpu/drm/drm_atomic_uapi.c
>> +++ b/drivers/gpu/drm/drm_atomic_uapi.c
>> @@ -798,6 +798,8 @@ static int drm_atomic_connector_set_property(struct 
>> drm_connector *connector,
>>  state->max_requested_bpc = val;
>>  } else if (property == connector->preferred_color_format_property) {
>>  state->preferred_color_format = val;
>> +} else if (property == connector->preferred_color_range_property) {
>> +state->preferred_color_range = val;
>>  } else if (connector->funcs->atomic_set_property) {
>>  return connector->funcs->atomic_set_property(connector,
>>  state, property, val);
>> @@ -877,6 +879,8 @@ drm_atomic_connector_get_property(struct drm_connector 
>> *connector,
>>  *val = state->max_requested_bpc;
>>  } else if (property == connector->preferred_color_format_property) {
>>  *val = state->preferred_color_format;
>> +} else if (property == connector->preferred_color_range_property) {
>> +*val = state->preferred_color_range;
>>  } else if (connector->funcs->atomic_get_property) {
>>  return connector->funcs->atomic_get_property(connector,
>>  state, property, val);
>> diff --git a/drivers/gpu/drm/drm_connector.c 
>> b/drivers/gpu/drm/drm_connector.c
>> index aea03dd02e33..9bc596638613 100644
>> --- a/drivers/gpu/drm/drm_connector.c
>> +++ b/drivers/gpu/drm/drm_connector.c
>> @@ -905,6 +905,12 @@ static const struct drm_prop_enum_list 
>> drm_active_color_format_enum_list[] = {
>>  { DRM_COLOR_FORMAT_YCRCB420, "ycbcr420" },
>>  };
>>  
>> +static const struct drm_prop_enum_list 
>> drm_preferred_color_range_enum_list[] = {
>> +{ DRM_MODE_COLOR_RANGE_UNSET, "Automatic" },
>> +{ DRM_MODE_COLOR_RANGE_FULL, "Full" },
>> +{ DRM_MODE_COLOR_RANGE_LIMITED_16_235, "Limited 16:235" },
> Hi,
>
> the same question here about these numbers as I asked on the "active
> color range" property.
>
>> +};
>> +
>>  static const struct drm_prop_enum_list drm_active_color_range_enum_list[] = 
>> {
>>  { DRM_MODE_COLOR_RANGE_UNSET, "Unknown" },
>>  { DRM_MODE_COLOR_RANGE_FULL, "Full" },
>> @@ -1243,6 +1249,13 @@ static const struct drm_prop_enum_list 
>> dp_colorspaces[] = {
>>   *  drm_connector_attach_active_color_format_property() to install this
>>   *  property.
>>   *
>> + * Broadcast RGB:
>> + *  This property is used by userspace to change the used color range. When
>> + *  used, the driver will use the selected range if valid for the current
>> + *  color format. Drivers shall use
>> + *  drm_connector_attach_preferred_color_range_property() to create and
>> + *  attach the property to the connector during initialization.
> An important detail to document here is: does userspace need to
> take care that pixel data at the 
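
For context, a rough userspace sketch of how a connector property like "Broadcast RGB" is driven through the generic libdrm property interface (connector id, property id and the enum value are assumed to be discovered elsewhere, e.g. with drmModeObjectGetProperties()):

#include <stdint.h>
#include <xf86drm.h>
#include <xf86drmMode.h>

/* Sketch: set an enum value on a connector property via the legacy
 * (non-atomic) path; conn_id/prop_id/enum_value are placeholders.
 */
static int set_connector_prop(int fd, uint32_t conn_id, uint32_t prop_id,
                              uint64_t enum_value)
{
        return drmModeObjectSetProperty(fd, conn_id,
                                        DRM_MODE_OBJECT_CONNECTOR,
                                        prop_id, enum_value);
}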

Re: [PATCH v4 09/17] drm/uAPI: Add "active color range" drm property as feedback for userspace

2021-06-22 Thread Werner Sembach


Am 22.06.21 um 09:00 schrieb Pekka Paalanen:
> On Fri, 18 Jun 2021 11:11:08 +0200
> Werner Sembach  wrote:
>
>> Add a new general drm property "active color range" which can be used by
>> graphic drivers to report the used color range back to userspace.
>>
>> There was no way to check which color range was actually used on a given
>> monitor. To predict this reliably, one must know the exact capabilities of
>> the monitor and what the default behaviour of the used driver is. This
>> property helps eliminate the guessing at this point.
>>
>> In the future, automatic color calibration for screens might also depend on
>> this information being available.
>>
>> Signed-off-by: Werner Sembach 
>> ---
>>  drivers/gpu/drm/drm_connector.c | 59 +
>>  include/drm/drm_connector.h | 27 +++
>>  2 files changed, 86 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/drm_connector.c 
>> b/drivers/gpu/drm/drm_connector.c
>> index 684d7abdf0eb..818de58d972f 100644
>> --- a/drivers/gpu/drm/drm_connector.c
>> +++ b/drivers/gpu/drm/drm_connector.c
>> @@ -897,6 +897,12 @@ static const struct drm_prop_enum_list 
>> drm_active_color_format_enum_list[] = {
>>  { DRM_COLOR_FORMAT_YCRCB420, "ycbcr420" },
>>  };
>>  
>> +static const struct drm_prop_enum_list drm_active_color_range_enum_list[] = 
>> {
>> +{ DRM_MODE_COLOR_RANGE_UNSET, "Unknown" },
>> +{ DRM_MODE_COLOR_RANGE_FULL, "Full" },
>> +{ DRM_MODE_COLOR_RANGE_LIMITED_16_235, "Limited 16:235" },
> Doesn't "limited" mean different numbers on RGB vs. Y vs. CbCr? I have
> a vague recollection that at least one of them was different from the
> others.

Yes, seems like it does:
https://www.kernel.org/doc/html/v5.12/userspace-api/media/v4l/colorspaces-defs.html#c.V4L.v4l2_quantization

I carried the option names over from "Broadcast RGB", see my other e-mail for 
more details.

>
> Documenting DRM_MODE_COLOR_RANGE_UNSET as "unspecified/default" while
> the string for it is "Unknown" seems inconsistent to me. I would
> recommend to avoid the word "default" because "reset to defaults" might
> become a thing one day, and that probably is not the same default as
> here.
>
> Is there actually a case for "unknown"? How can it be not known? Or
> does it mean "not applicable"?

Unknown is when no monitor is connected or when the connector/monitor is
disabled.

It is also the initial value when the driver fails to correctly set the
property. This shouldn't happen, but I'm wondering if I should still
introduce an _ERROR state instead for this case?

I will rename it, maybe to "unset" to match the enum? "not applicable" also
fits, whether the error state ends up being defined or deemed unnecessary.

>
> Otherwise looks good to me.
>
>
> Thanks,
> pq
>
>
>> +};
>> +
>>  DRM_ENUM_NAME_FN(drm_get_dp_subconnector_name,
>>   drm_dp_subconnector_enum_list)
>>  
>> @@ -1221,6 +1227,14 @@ static const struct drm_prop_enum_list 
>> dp_colorspaces[] = {
>>   *  drm_connector_attach_active_color_format_property() to install this
>>   *  property.
>>   *
>> + * active color range:
>> + *  This read-only property tells userspace the color range actually used by
>> + *  the hardware display engine on "the cable" on a connector. The chosen
>> + *  value depends on hardware capabilities of the monitor and the used color
>> + *  format. Drivers shall use
>> + *  drm_connector_attach_active_color_range_property() to install this
>> + *  property.
>> + *
>>   * Connectors also have one standardized atomic property:
>>   *
>>   * CRTC_ID:
>> @@ -2264,6 +2278,51 @@ void 
>> drm_connector_set_active_color_format_property(struct drm_connector *connec
>>  }
>>  EXPORT_SYMBOL(drm_connector_set_active_color_format_property);
>>  
>> +/**
>> + * drm_connector_attach_active_color_range_property - attach "active color 
>> range" property
>> + * @connector: connector to attach active color range property on.
>> + *
>> + * This is used to check the applied color range on a connector.
>> + *
>> + * Returns:
>> + * Zero on success, negative errno on failure.
>> + */
>> +int drm_connector_attach_active_color_range_property(struct drm_connector 
>> *connector)
>> +{
>> +struct drm_device *dev = connector->dev;
>> +struct drm_property *prop;
>> +
>> +if (!connector->active_color_range_property) {
>> +prop = drm_property_create_enum(dev, DRM_MODE_PROP_IMMUTABLE, 
>> "active color range",
>> +
>> drm_active_color_range_enum_list,
>> +
>> ARRAY_SIZE(drm_active_color_range_enum_list));
>> +if (!prop)
>> +return -ENOMEM;
>> +
>> +connector->active_color_range_property = prop;
>> +drm_object_attach_property(>base, prop, 
>> DRM_MODE_COLOR_RANGE_UNSET);
>> +}
>> +
>> +return 0;
>> +}
>> +EXPORT_SYMBOL(drm_connector_attach_active_color_range_property);
>> +
>> +/**
>> + * 
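
For reference, a sketch of how userspace could read back the immutable "active color range" property proposed here (error handling trimmed; assumes the property exists on the connector):

#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <xf86drm.h>
#include <xf86drmMode.h>

static void print_active_color_range(int fd, uint32_t conn_id)
{
        drmModeObjectPropertiesPtr props =
                drmModeObjectGetProperties(fd, conn_id,
                                           DRM_MODE_OBJECT_CONNECTOR);

        for (uint32_t i = 0; props && i < props->count_props; i++) {
                drmModePropertyPtr prop = drmModeGetProperty(fd, props->props[i]);

                if (prop && !strcmp(prop->name, "active color range")) {
                        /* prop_values[i] holds the current enum value. */
                        for (int j = 0; j < prop->count_enums; j++)
                                if (prop->enums[j].value == props->prop_values[i])
                                        printf("active color range: %s\n",
                                               prop->enums[j].name);
                }
                drmModeFreeProperty(prop);
        }
        drmModeFreeObjectProperties(props);
}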

Re: [PATCH 1/3] drm/nouveau: wait for moving fence after pinning

2021-06-22 Thread Christian König

Am 22.06.21 um 11:20 schrieb Daniel Vetter:

On Mon, Jun 21, 2021 at 5:53 PM Daniel Vetter  wrote:

On Mon, Jun 21, 2021 at 5:49 PM Christian König
 wrote:

Am 21.06.21 um 16:54 schrieb Daniel Vetter:

On Mon, Jun 21, 2021 at 03:03:26PM +0200, Christian König wrote:

We actually need to wait for the moving fence after pinning
the BO to make sure that the pin is completed.

Signed-off-by: Christian König 
CC: sta...@kernel.org
---
   drivers/gpu/drm/nouveau/nouveau_prime.c | 8 +++-
   1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_prime.c 
b/drivers/gpu/drm/nouveau/nouveau_prime.c
index 347488685f74..591738545eba 100644
--- a/drivers/gpu/drm/nouveau/nouveau_prime.c
+++ b/drivers/gpu/drm/nouveau/nouveau_prime.c
@@ -93,7 +93,13 @@ int nouveau_gem_prime_pin(struct drm_gem_object *obj)
  if (ret)
  return -EINVAL;

-        return 0;
+        if (nvbo->bo.moving) {

Don't we need to hold the dma_resv to read this? We can grab a reference
and then unlock, but I think just unlocked wait can go boom pretty easily
(since we don't hold a reference or lock so someone else can jump in and
free the moving fence).

The moving fence is only modified while the BO is moved and since we
have just successfully pinned it

Yeah  ... so probably correct, but really tricky. Just wrapping a
ttm_bo_reserve/unreserve around the code you add should be enough and
get the job done?

I think you distracted me a bit with the "it can't move", so yes
there's a guarantee that no other fence can show up in ttm_bo->moving
and confuse us. But it could get set to NULL because someone realized
it signalled. We're not doing that systematically, but relying on
fences never getting garbage-collected for correctness isn't great.


Yeah, that's essentially what I meant: it would be better in general
to take the lock.




Sot the ttm_bo_reserve/unreserve is definitely needed here around this
bit of code. You don't need to merge it with the reserve/unreserve in
the pin function though, it's just to protect against the
use-after-free.


Ah, yes good point. That means I don't need to change the pin/unpin 
functions in nouveau at all.



BTW: What do you think of making dma_fence_is_signaled() and
dma_fence_wait_timeout() safe for passing in NULL as the fence?


I think we have a lot of cases where we check "!fence || 
dma_fence_is_signaled(fence)" or similar.


Christian.


-Daniel


But in general I agree that it would be better to avoid this. I just
didn't want to open a bigger can of worms by changing nouveau so much.

Yeah, but I'm kinda thinking of some helpers to wait for the move
fence (so that later on we can switch from having the exclusive fence
to the move fence do that, maybe). And then locking checks in there
would be nice.

Also avoids the case of explaining why lockless here is fine, but
lockless wait for the exclusive fence in e.g. a dynamic dma-buf
importer is very much not fine at all. Just all around less trouble.
-Daniel


Christian.


-Daniel


+                ret = dma_fence_wait(nvbo->bo.moving, true);
+                if (ret)
+                        nouveau_bo_unpin(nvbo);
+        }
+
+        return ret;
   }

   void nouveau_gem_prime_unpin(struct drm_gem_object *obj)
--
2.25.1



--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch





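For reference, a minimal sketch of the locking Daniel asks for, i.e. taking the reservation around the moving-fence wait (untested, derived only from the snippet above; nouveau_gem_prime_pin_sketch is an illustrative name):

static int nouveau_gem_prime_pin_sketch(struct drm_gem_object *obj)
{
        struct nouveau_bo *nvbo = nouveau_gem_object(obj);
        int ret;

        ret = nouveau_bo_pin(nvbo, NOUVEAU_GEM_DOMAIN_GART, false);
        if (ret)
                return -EINVAL;

        /* Hold the dma_resv via ttm_bo_reserve() while dereferencing
         * and waiting on bo.moving, so the fence cannot be freed
         * under us.
         */
        ret = ttm_bo_reserve(&nvbo->bo, false, false, NULL);
        if (ret)
                goto error;

        if (nvbo->bo.moving)
                ret = dma_fence_wait(nvbo->bo.moving, true);

        ttm_bo_unreserve(&nvbo->bo);
        if (ret)
                goto error;

        return 0;

error:
        nouveau_bo_unpin(nvbo);
        return ret;
}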


Re: [PATCH 1/1] amdgpu/pm: remove code duplication in show_power_cap calls

2021-06-22 Thread Wang, Kevin(Yang)
[AMD Official Use Only]

Please fold the following check into the new generic function as well:

if (pp_funcs && pp_funcs->get_power_limit) {}

Please do this check before calling the pm_runtime_xxx interfaces, to avoid an
empty operation inside the pm_runtime get/put cycle.

Reviewed-by: Kevin Wang 

Best Regards,
Kevin

From: amd-gfx  on behalf of Darren 
Powell 
Sent: Tuesday, June 22, 2021 12:17 PM
To: amd-gfx@lists.freedesktop.org 
Cc: Powell, Darren 
Subject: [PATCH 1/1] amdgpu/pm: remove code duplication in show_power_cap calls

Created a generic function and call it with an enum from
 * amdgpu_hwmon_show_power_cap_max
 * amdgpu_hwmon_show_power_cap
 * amdgpu_hwmon_show_power_cap_default

=== Test ===
AMDGPU_PCI_ADDR=`lspci -nn | grep "VGA\|Display" | cut -d " " -f 1`
AMDGPU_HWMON=`ls -la /sys/class/hwmon | grep $AMDGPU_PCI_ADDR | cut -d " " -f 
10`
HWMON_DIR=/sys/class/hwmon/${AMDGPU_HWMON}

cp pp_show_power_cap.txt{,.old}
lspci -nn | grep "VGA\|Display" > pp_show_power_cap.test.log
FILES="
power1_cap
power1_cap_max
power1_cap_default "

for f in $FILES
do
  echo  $f = `cat $HWMON_DIR/$f` >> pp_show_power_cap.test.log
done

Signed-off-by: Darren Powell 
---
 drivers/gpu/drm/amd/pm/amdgpu_pm.c | 86 +-
 1 file changed, 14 insertions(+), 72 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/amdgpu_pm.c 
b/drivers/gpu/drm/amd/pm/amdgpu_pm.c
index b2335a1d3f98..99c21d1a2c4e 100644
--- a/drivers/gpu/drm/amd/pm/amdgpu_pm.c
+++ b/drivers/gpu/drm/amd/pm/amdgpu_pm.c
@@ -2901,14 +2901,14 @@ static ssize_t amdgpu_hwmon_show_power_cap_min(struct 
device *dev,
 return sprintf(buf, "%i\n", 0);
 }

-static ssize_t amdgpu_hwmon_show_power_cap_max(struct device *dev,
-struct device_attribute *attr,
-char *buf)
+static ssize_t amdgpu_hwmon_show_power_cap_generic(struct device *dev,
+  struct device_attribute *attr,
+  char *buf,
+  enum pp_power_limit_level pp_limit_level)
 {
 struct amdgpu_device *adev = dev_get_drvdata(dev);
 const struct amd_pm_funcs *pp_funcs = adev->powerplay.pp_funcs;
 enum pp_power_type power_type = to_sensor_dev_attr(attr)->index;
-   enum pp_power_limit_level pp_limit_level = PP_PWR_LIMIT_MAX;
 uint32_t limit;
 ssize_t size;
 int r;
@@ -2941,85 +2941,27 @@ static ssize_t amdgpu_hwmon_show_power_cap_max(struct 
device *dev,
 return size;
 }

-static ssize_t amdgpu_hwmon_show_power_cap(struct device *dev,
+static ssize_t amdgpu_hwmon_show_power_cap_max(struct device *dev,
  struct device_attribute *attr,
  char *buf)
 {
-   struct amdgpu_device *adev = dev_get_drvdata(dev);
-   const struct amd_pm_funcs *pp_funcs = adev->powerplay.pp_funcs;
-   enum pp_power_type power_type = to_sensor_dev_attr(attr)->index;
-   enum pp_power_limit_level pp_limit_level = PP_PWR_LIMIT_CURRENT;
-   uint32_t limit;
-   ssize_t size;
-   int r;
-
-   if (amdgpu_in_reset(adev))
-   return -EPERM;
-   if (adev->in_suspend && !adev->in_runpm)
-   return -EPERM;
-
-   r = pm_runtime_get_sync(adev_to_drm(adev)->dev);
-   if (r < 0) {
-   pm_runtime_put_autosuspend(adev_to_drm(adev)->dev);
-   return r;
-   }
-
-   if (pp_funcs && pp_funcs->get_power_limit)
-   r = pp_funcs->get_power_limit(adev->powerplay.pp_handle, &limit,
- pp_limit_level, power_type);
-   else
-   r = -ENODATA;
-
-   if (!r)
-   size = snprintf(buf, PAGE_SIZE, "%u\n", limit * 1000000);
-   else
-   size = snprintf(buf, PAGE_SIZE, "\n");
-
-   pm_runtime_mark_last_busy(adev_to_drm(adev)->dev);
-   pm_runtime_put_autosuspend(adev_to_drm(adev)->dev);
+   return amdgpu_hwmon_show_power_cap_generic(dev, attr, buf, 
PP_PWR_LIMIT_MAX);
+}

-   return size;
+static ssize_t amdgpu_hwmon_show_power_cap(struct device *dev,
+struct device_attribute *attr,
+char *buf)
+{
+   return amdgpu_hwmon_show_power_cap_generic(dev, attr, buf, 
PP_PWR_LIMIT_CURRENT);
 }

 static ssize_t amdgpu_hwmon_show_power_cap_default(struct device *dev,
  struct device_attribute *attr,
  char *buf)
 {
-   struct amdgpu_device *adev = dev_get_drvdata(dev);
-   const struct amd_pm_funcs *pp_funcs = adev->powerplay.pp_funcs;
-   enum pp_power_type power_type = to_sensor_dev_attr(attr)->index;
-   enum pp_power_limit_level pp_limit_level = PP_PWR_LIMIT_DEFAULT;
-   uint32_t limit;
-   ssize_t size;
-   int r;
-
-   if 
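
Sketching Kevin's point (an assumption about what the next revision could look like, not code from this series): hoist the callback check above the runtime-PM calls, so a missing get_power_limit does not trigger a pointless resume/suspend cycle:

static ssize_t amdgpu_hwmon_show_power_cap_generic(struct device *dev,
                                  struct device_attribute *attr,
                                  char *buf,
                                  enum pp_power_limit_level pp_limit_level)
{
        struct amdgpu_device *adev = dev_get_drvdata(dev);
        const struct amd_pm_funcs *pp_funcs = adev->powerplay.pp_funcs;
        enum pp_power_type power_type = to_sensor_dev_attr(attr)->index;
        uint32_t limit;
        ssize_t size;
        int r;

        if (amdgpu_in_reset(adev))
                return -EPERM;
        if (adev->in_suspend && !adev->in_runpm)
                return -EPERM;

        /* Check the callback before the pm_runtime get/put cycle. */
        if (!pp_funcs || !pp_funcs->get_power_limit)
                return -ENODATA;

        r = pm_runtime_get_sync(adev_to_drm(adev)->dev);
        if (r < 0) {
                pm_runtime_put_autosuspend(adev_to_drm(adev)->dev);
                return r;
        }

        r = pp_funcs->get_power_limit(adev->powerplay.pp_handle, &limit,
                                      pp_limit_level, power_type);

        if (!r)
                size = snprintf(buf, PAGE_SIZE, "%u\n", limit * 1000000);
        else
                size = snprintf(buf, PAGE_SIZE, "\n");

        pm_runtime_mark_last_busy(adev_to_drm(adev)->dev);
        pm_runtime_put_autosuspend(adev_to_drm(adev)->dev);

        return size;
}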

Re: [PATCH v4 17/17] drm/amd/display: Add handling for new "Broadcast RGB" property

2021-06-22 Thread Werner Sembach
Am 22.06.21 um 09:29 schrieb Pekka Paalanen:
> On Fri, 18 Jun 2021 11:11:16 +0200
> Werner Sembach  wrote:
>
>> This commit implements the "Broadcast RGB" drm property for the AMD GPU
>> driver.
>>
>> Signed-off-by: Werner Sembach 
>> ---
>>  .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 22 ++-
>>  .../display/amdgpu_dm/amdgpu_dm_mst_types.c   |  4 
>>  2 files changed, 21 insertions(+), 5 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
>> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
>> index 9ffd2f9d3d75..c5dbf948a47a 100644
>> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
>> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
>> @@ -5252,7 +5252,8 @@ get_aspect_ratio(const struct drm_display_mode 
>> *mode_in)
>>  }
>>  
>>  static enum dc_color_space
>> -get_output_color_space(const struct dc_crtc_timing *dc_crtc_timing)
>> +get_output_color_space(const struct dc_crtc_timing *dc_crtc_timing,
>> +   enum drm_mode_color_range preferred_color_range)
>>  {
>>  enum dc_color_space color_space = COLOR_SPACE_SRGB;
>>  
>> @@ -5267,13 +5268,17 @@ get_output_color_space(const struct dc_crtc_timing 
>> *dc_crtc_timing)
>>   * respectively
>>   */
>>  if (dc_crtc_timing->pix_clk_100hz > 270300) {
>> -if (dc_crtc_timing->flags.Y_ONLY)
>> +if (dc_crtc_timing->flags.Y_ONLY
>> +|| preferred_color_range ==
>> +
>> DRM_MODE_COLOR_RANGE_LIMITED_16_235)
>>  color_space =
>>  COLOR_SPACE_YCBCR709_LIMITED;
>>  else
>>  color_space = COLOR_SPACE_YCBCR709;
> Hi,
>
> does this mean that amdgpu would be using a property named "Broadcast
> RGB" to control the range of YCbCr too?

Yes, because I avoided creating a new property, but I'm not really happy with 
it either.

Possibility 1: Use "Broadcast RGB" for Y'CbCr too and clarify in documentation
    - still a confusing name
    - limited does mean something a little bit different for Y'CbCr and is not
strictly 16-235:
https://www.kernel.org/doc/html/v5.12/userspace-api/media/v4l/colorspaces-defs.html#c.V4L.v4l2_quantization
, but the name of the option is given by the preexisting property

Possibility 2: Deprecate "Broadcast RGB" and add a more neutral-sounding
"preferred color range", with a more neutral-sounding "limited" option instead
of "Limited 16:235"
    - What's the relation between the two? pq mentioned on the amdgpu gitlab
that there is a possibility for userspace to have only the new or the old one
shown
    - Alternatively, ignore "Broadcast RGB" when "preferred color range" is set
and have them coexist?

>
> That is surprising. If this is truly wanted, then the documentation of
> "Broadcast RGB" must say that it applies to YCbCr too.
>
> Does amdgpu do the same as intel wrt. to the question about whose
> responsibility it is to make the pixels at the connector to match the
> set range?

I guess the kernel driver does the conversion, but I have to check for both.

For Intel I did not change the behavior of Broadcast RGB, but I think it's not 
clearly specified in the docs where the conversion happens.

>
>
> Thanks,
> pq
>
>>  } else {
>> -if (dc_crtc_timing->flags.Y_ONLY)
>> +if (dc_crtc_timing->flags.Y_ONLY
>> +|| preferred_color_range ==
>> +
>> DRM_MODE_COLOR_RANGE_LIMITED_16_235)
>>  color_space =
>>  COLOR_SPACE_YCBCR601_LIMITED;
>>  else
>> @@ -5283,7 +5288,10 @@ get_output_color_space(const struct dc_crtc_timing 
>> *dc_crtc_timing)
>>  }
>>  break;
>>  case PIXEL_ENCODING_RGB:
>> -color_space = COLOR_SPACE_SRGB;
>> +if (preferred_color_range == 
>> DRM_MODE_COLOR_RANGE_LIMITED_16_235)
>> +color_space = COLOR_SPACE_SRGB_LIMITED;
>> +else
>> +color_space = COLOR_SPACE_SRGB;
>>  break;
>>  
>>  default:
>> @@ -5429,7 +5437,10 @@ static void 
>> fill_stream_properties_from_drm_display_mode(
>>  
>>  timing_out->aspect_ratio = get_aspect_ratio(mode_in);
>>  
>> -stream->output_color_space = get_output_color_space(timing_out);
>> +stream->output_color_space = get_output_color_space(timing_out,
>> +connector_state ?
>> +
>> connector_state->preferred_color_range :
>> +
>> DRM_MODE_COLOR_RANGE_UNSET);
>>  
>>  stream->out_transfer_func->type = TF_TYPE_PREDEFINED;
>>  stream->out_transfer_func->tf = 

Re: [PATCH 2/2] drm/amdgpu: rework dma_resv handling v3

2021-06-22 Thread Christian König

On 17.06.21 at 23:09, Alex Deucher wrote:

On Mon, Jun 14, 2021 at 1:45 PM Christian König
 wrote:

Drop the workaround and instead implement a better solution.

Basically we are now chaining all submissions using a dma_fence_chain
container and adding them as exclusive fence to the dma_resv object.

This way other drivers can still sync to the single exclusive fence
while amdgpu only syncs to fences from different processes.

v3: add the shared fence first before the exclusive one

Signed-off-by: Christian König 

Series is:
Reviewed-by: Alex Deucher 


FYI I've pushed this to drm-misc-next to avoid re-base problems.

Will probably not go upstream before 5.15, so we have plenty of time to 
test this.


Christian.
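
For readers skimming the quoted diff: the heart of the scheme in
amdgpu_cs_submit() is splicing a dma_fence_chain link in as the new exclusive
fence. Roughly, paraphrased from the patch (RCU and error details simplified):

	amdgpu_bo_list_for_each_entry(e, p->bo_list) {
		struct dma_resv *resv = e->tv.bo->base.resv;
		struct dma_fence_chain *chain = e->chain;

		if (!chain)
			continue;

		/* v3: add the submission as shared fence first so shared
		 * fences never signal before the exclusive one */
		dma_resv_add_shared_fence(resv, p->fence);

		/* chain the old exclusive fence with the new submission
		 * and publish the chain as the new exclusive fence */
		dma_fence_chain_init(chain, dma_resv_excl_fence(resv),
				     dma_fence_get(p->fence), 1);
		rcu_assign_pointer(resv->fence_excl, &chain->base);
		e->chain = NULL;	/* the resv now owns the chain node */
	}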




---
  drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.h |  1 +
  drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c  | 62 
  drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c | 65 -
  drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c |  3 +-
  drivers/gpu/drm/amd/amdgpu/amdgpu_object.c  |  2 +-
  drivers/gpu/drm/amd/amdgpu/amdgpu_object.h  |  1 -
  6 files changed, 55 insertions(+), 79 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.h
index a130e766cbdb..c905a4cfc173 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.h
@@ -34,6 +34,7 @@ struct amdgpu_fpriv;
  struct amdgpu_bo_list_entry {
 struct ttm_validate_buffer  tv;
 struct amdgpu_bo_va *bo_va;
+   struct dma_fence_chain  *chain;
 uint32_tpriority;
 struct page **user_pages;
 booluser_invalidated;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index 9ce649a1a8d3..25655414e9c0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -572,6 +572,20 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
 goto out;
 }

+   amdgpu_bo_list_for_each_entry(e, p->bo_list) {
+   struct amdgpu_bo *bo = ttm_to_amdgpu_bo(e->tv.bo);
+
+   e->bo_va = amdgpu_vm_bo_find(vm, bo);
+
+   if (bo->tbo.base.dma_buf && !amdgpu_bo_explicit_sync(bo)) {
+   e->chain = dma_fence_chain_alloc();
+   if (!e->chain) {
+   r = -ENOMEM;
+   goto error_validate;
+   }
+   }
+   }
+
 amdgpu_cs_get_threshold_for_moves(p->adev, >bytes_moved_threshold,
   >bytes_moved_vis_threshold);
 p->bytes_moved = 0;
@@ -599,15 +613,6 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
 gws = p->bo_list->gws_obj;
 oa = p->bo_list->oa_obj;

-   amdgpu_bo_list_for_each_entry(e, p->bo_list) {
-   struct amdgpu_bo *bo = ttm_to_amdgpu_bo(e->tv.bo);
-
-   /* Make sure we use the exclusive slot for shared BOs */
-   if (bo->prime_shared_count)
-   e->tv.num_shared = 0;
-   e->bo_va = amdgpu_vm_bo_find(vm, bo);
-   }
-
 if (gds) {
 p->job->gds_base = amdgpu_bo_gpu_offset(gds) >> PAGE_SHIFT;
 p->job->gds_size = amdgpu_bo_size(gds) >> PAGE_SHIFT;
@@ -629,8 +634,13 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
 }

  error_validate:
-   if (r)
+   if (r) {
+   amdgpu_bo_list_for_each_entry(e, p->bo_list) {
+   dma_fence_chain_free(e->chain);
+   e->chain = NULL;
+   }
 ttm_eu_backoff_reservation(>ticket, >validated);
+   }
  out:
 return r;
  }
@@ -670,9 +680,17 @@ static void amdgpu_cs_parser_fini(struct amdgpu_cs_parser 
*parser, int error,
  {
 unsigned i;

-   if (error && backoff)
+   if (error && backoff) {
+   struct amdgpu_bo_list_entry *e;
+
+   amdgpu_bo_list_for_each_entry(e, parser->bo_list) {
+   dma_fence_chain_free(e->chain);
+   e->chain = NULL;
+   }
+
 ttm_eu_backoff_reservation(>ticket,
>validated);
+   }

 for (i = 0; i < parser->num_post_deps; i++) {
 drm_syncobj_put(parser->post_deps[i].syncobj);
@@ -1245,6 +1263,28 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,

 amdgpu_vm_move_to_lru_tail(p->adev, >vm);

+   amdgpu_bo_list_for_each_entry(e, p->bo_list) {
+   struct dma_resv *resv = e->tv.bo->base.resv;
+   struct dma_fence_chain *chain = e->chain;
+
+   if (!chain)
+   continue;
+
+   /*
+* Work around dma_resv 

Re: [PATCH 1/3] drm/nouveau: wait for moving fence after pinning

2021-06-22 Thread Daniel Vetter
On Mon, Jun 21, 2021 at 5:53 PM Daniel Vetter  wrote:
>
> On Mon, Jun 21, 2021 at 5:49 PM Christian König
>  wrote:
> >
> > > On 21.06.21 at 16:54, Daniel Vetter wrote:
> > > On Mon, Jun 21, 2021 at 03:03:26PM +0200, Christian König wrote:
> > >> We actually need to wait for the moving fence after pinning
> > >> the BO to make sure that the pin is completed.
> > >>
> > >> Signed-off-by: Christian König 
> > >> CC: sta...@kernel.org
> > >> ---
> > >>   drivers/gpu/drm/nouveau/nouveau_prime.c | 8 +++-
> > >>   1 file changed, 7 insertions(+), 1 deletion(-)
> > >>
> > >> diff --git a/drivers/gpu/drm/nouveau/nouveau_prime.c 
> > >> b/drivers/gpu/drm/nouveau/nouveau_prime.c
> > >> index 347488685f74..591738545eba 100644
> > >> --- a/drivers/gpu/drm/nouveau/nouveau_prime.c
> > >> +++ b/drivers/gpu/drm/nouveau/nouveau_prime.c
> > >> @@ -93,7 +93,13 @@ int nouveau_gem_prime_pin(struct drm_gem_object *obj)
> > >>  if (ret)
> > >>  return -EINVAL;
> > >>
> > >> -return 0;
> > >> +if (nvbo->bo.moving) {
> > > Don't we need to hold the dma_resv to read this? We can grab a reference
> > > and then unlock, but I think just unlocked wait can go boom pretty easily
> > > (since we don't hold a reference or lock so someone else can jump in and
> > > free the moving fence).
> >
> > The moving fence is only modified while the BO is moved and since we
> > have just successfully pinned it
>
> Yeah  ... so probably correct, but really tricky. Just wrapping a
> ttm_bo_reserve/unreserve around the code you add should be enough and
> get the job done?

I think you distracted me a bit with the "it can't move", so yes
there's a guarantee that no other fence can show up in ttm_bo->moving
and confuse us. But it could get set to NULL because someone realized
it signalled. We're not doing that systematically, but relying on
fences never getting garbage-collected for correctness isn't great.

So the ttm_bo_reserve/unreserve is definitely needed here around this
bit of code. You don't need to merge it with the reserve/unreserve in
the pin function though, it's just to protect against the
use-after-free.
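
A rough sketch of that shape in nouveau_gem_prime_pin() (untested, error
handling abbreviated):

	/* take the reservation so bo.moving can't be freed under us */
	ret = ttm_bo_reserve(&nvbo->bo, false, false, NULL);
	if (ret)
		goto error_unpin;

	if (nvbo->bo.moving)
		ret = dma_fence_wait(nvbo->bo.moving, true);

	ttm_bo_unreserve(&nvbo->bo);

error_unpin:
	if (ret)
		nouveau_bo_unpin(nvbo);

	return ret;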
-Daniel

>
> > But in general I agree that it would be better to avoid this. I just
> > didn't want to open a bigger can of worms by changing nouveau so much.
>
> Yeah, but I'm kinda thinking of some helpers to wait for the move
> fence (so that later on we can switch from having the exclusive fence
> to having the move fence do that, maybe). And then locking checks in there
> would be nice.
>
> Also avoids the case of explaining why lockless here is fine, but
> lockless wait for the exclusive fence in e.g. a dynamic dma-buf
> importer is very much not fine at all. Just all around less trouble.
> -Daniel
>
> >
> > Christian.
> >
> > > -Daniel
> > >
> > >> +ret = dma_fence_wait(nvbo->bo.moving, true);
> > >> +if (ret)
> > >> +nouveau_bo_unpin(nvbo);
> > >> +}
> > >> +
> > >> +return ret;
> > >>   }
> > >>
> > >>   void nouveau_gem_prime_unpin(struct drm_gem_object *obj)
> > >> --
> > >> 2.25.1
> > >>
> >
>
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch



-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH V3 1/7] drm/amdgpu: correct tcp harvest setting

2021-06-22 Thread Michel Dänzer
On 2021-06-22 8:08 a.m., Lazar, Lijo wrote:
> [Public]
> 
> AFAIK, that expression is legal (some code analyzer may warn on the value of 
> 4*max_wgp_per_sh); a similar construct is used in rotate/shift operations.

The default type for constants in C is int, so 0xffffffff is a 32-bit signed 
integer.

The C99 specification lists this under J.2 Undefined behavior:

— An expression having signed promoted type is left-shifted and either the 
value of the expression is negative or the result of shifting would not be 
representable in the promoted type (6.5.7).

So it would be safer to make it unsigned: 0xffffffffu (or just ~0u).
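
To illustrate, a small standalone sketch (the helper name and the value 5 are
made up; assumes a 32-bit unsigned int and a shift count below 32, since
shifting by the type width or more is undefined for signed and unsigned types
alike):

	#include <stdio.h>

	static unsigned int wgp_mask(unsigned int max_wgp_per_sh)
	{
		/* ~0u is an unsigned all-ones constant, so the left
		 * shift is well-defined for counts 0..31 */
		return ~0u << (4 * max_wgp_per_sh);
	}

	int main(void)
	{
		printf("0x%08x\n", wgp_mask(5)); /* prints 0xfff00000 */
		return 0;
	}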


> -Original Message-
> From: Quan, Evan  
> Sent: Tuesday, June 22, 2021 7:56 AM
> To: Lazar, Lijo ; amd-gfx@lists.freedesktop.org
> Cc: Deucher, Alexander 
> Subject: RE: [PATCH V3 1/7] drm/amdgpu: correct tcp harvest setting
> 
> [AMD Official Use Only]
> 
> Thanks Lijo.
> However, I'm not quite sure whether "0xffffffff << (4 * max_wgp_per_sh);" is 
> a valid expression since it kind of triggers some overflow.
> Can that work for non-x86 platform or even work reliably for x86 platform?



-- 
Earthling Michel Dänzer   |   https://redhat.com
Libre software enthusiast | Mesa and X developer


Re: [Linaro-mm-sig] [PATCH v3 1/2] habanalabs: define uAPI to export FD for DMA-BUF

2021-06-22 Thread Oded Gabbay
On Tue, Jun 22, 2021 at 9:37 AM Christian König
 wrote:
>
> On 22.06.21 at 01:29, Jason Gunthorpe wrote:
> > On Mon, Jun 21, 2021 at 10:24:16PM +0300, Oded Gabbay wrote:
> >
> >> Another thing I want to emphasize is that we are doing p2p only
> >> through the export/import of the FD. We do *not* allow the user to
> >> mmap the dma-buf as we do not support direct IO. So there is no access
> >> to these pages through the userspace.
> > Arguably mmaping the memory is a better choice, and is the direction
> > that Logan's series goes in. Here the use of DMABUF was specifically
> > designed to allow hitless revocation of the memory, which this isn't
> > even using.
>
> The major problem with this approach is that DMA-buf is also used for
> memory which isn't CPU accessible.
>
> That was one of the reasons we didn't even considered using the mapping
> memory approach for GPUs.
>
> Regards,
> Christian.
>
> >
> > So you are taking the hit of very limited hardware support and reduced
> > performance just to squeeze into DMABUF..

Thanks Jason for the clarification, but I honestly prefer to use
DMA-BUF at the moment.
It gives us just what we need (even more than what we need as you
pointed out), it is *already* integrated and tested in the RDMA
subsystem, and I'm feeling comfortable using it as I'm somewhat
familiar with it from my AMD days.

I'll go and read Logan's patch-set to see if that will work for us in
the future. Please remember, as Daniel said, we don't have struct page
backing our device memory, so if that is a requirement to connect to
Logan's work, then I don't think we will want to do it at this point.

Thanks,
Oded

> >
> > Jason


Re: [PATCH 1/1] drm/amdgpu: add helper function for vm pasid

2021-06-22 Thread Christian König

On 22.06.21 at 09:39, Das, Nirmoy wrote:


On 6/22/2021 9:03 AM, Christian König wrote:



On 22.06.21 at 08:57, Nirmoy Das wrote:

Clean up code related to vm pasid by adding helper functions.

Signed-off-by: Nirmoy Das 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 105 
-

  1 file changed, 50 insertions(+), 55 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c

index 63975bda8e76..6e476b173cbb 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -87,6 +87,46 @@ struct amdgpu_prt_cb {
  struct dma_fence_cb cb;
  };

+static int amdgpu_vm_pasid_alloc(struct amdgpu_device *adev,
+ struct amdgpu_vm *vm,
+ unsigned int pasid,
+ unsigned int *vm_pasid)
+{
+    unsigned long flags;
+    int r;
+
+    if (!pasid)
+    return 0;
+
+    spin_lock_irqsave(>vm_manager.pasid_lock, flags);
+    r = idr_alloc(>vm_manager.pasid_idr, vm, pasid, pasid + 1,
+  GFP_ATOMIC);
+ spin_unlock_irqrestore(>vm_manager.pasid_lock, flags);
+    if (r < 0)
+    return r;
+    if (vm_pasid)
+    *vm_pasid = pasid;
+


Ok the more I read from this patch the less it makes sense.

We don't allocate the pasid here, we just set it up in the idr.

What we could do is to replace the idr with an xarray, that would 
certainly make more sense than this here.



xarray looks great, with that we don't need pasid_lock either.


You still need the lock to protect against VM destruction while looking 
things up, but you could switch to RCU for this instead.
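
A rough sketch of that direction (untested; the "pasids" xarray member in
amdgpu_vm_manager is made up for the example):

	static int amdgpu_vm_pasid_insert(struct amdgpu_device *adev,
					  struct amdgpu_vm *vm, u32 pasid)
	{
		if (!pasid)
			return 0;

		/* xa_insert_irq() takes the xarray's internal lock and
		 * fails with -EBUSY if the pasid is already in use */
		return xa_insert_irq(&adev->vm_manager.pasids, pasid, vm,
				     GFP_ATOMIC);
	}

	static void amdgpu_vm_pasid_erase(struct amdgpu_device *adev,
					  u32 pasid)
	{
		if (pasid)
			xa_erase_irq(&adev->vm_manager.pasids, pasid);
	}

	/* lookup side, guarded against VM destruction by RCU:
	 *
	 *	rcu_read_lock();
	 *	vm = xa_load(&adev->vm_manager.pasids, pasid);
	 *	...
	 *	rcu_read_unlock();
	 */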


Christian.




Thanks

Nirmoy




Christian.


+    return 0;
+}
+
+static void amdgpu_vm_pasid_remove(struct amdgpu_device *adev,
+   unsigned int pasid,
+   unsigned int *vm_pasid)
+{
+    unsigned long flags;
+
+    if (!pasid)
+    return;
+
+    spin_lock_irqsave(>vm_manager.pasid_lock, flags);
+    idr_remove(>vm_manager.pasid_idr, pasid);
+ spin_unlock_irqrestore(>vm_manager.pasid_lock, flags);
+
+    if (vm_pasid)
+    *vm_pasid = 0;
+}
+
  /*
   * vm eviction_lock can be taken in MMU notifiers. Make sure no 
reclaim-FS

   * happens while holding this lock anywhere to prevent deadlocks when
@@ -2940,18 +2980,8 @@ int amdgpu_vm_init(struct amdgpu_device 
*adev, struct amdgpu_vm *vm, u32 pasid)


  amdgpu_bo_unreserve(vm->root.bo);

-    if (pasid) {
-    unsigned long flags;
-
- spin_lock_irqsave(>vm_manager.pasid_lock, flags);
-    r = idr_alloc(>vm_manager.pasid_idr, vm, pasid, pasid 
+ 1,

-  GFP_ATOMIC);
- spin_unlock_irqrestore(>vm_manager.pasid_lock, flags);
-    if (r < 0)
-    goto error_free_root;
-
-    vm->pasid = pasid;
-    }
+    if (amdgpu_vm_pasid_alloc(adev, vm, pasid, >pasid))
+    goto error_free_root;

  INIT_KFIFO(vm->faults);

@@ -3038,19 +3068,11 @@ int amdgpu_vm_make_compute(struct 
amdgpu_device *adev, struct amdgpu_vm *vm,

  r = amdgpu_vm_check_clean_reserved(adev, vm);
  if (r)
  goto unreserve_bo;
+    r = amdgpu_vm_pasid_alloc(adev, vm, pasid, NULL);
+    if (r ==  -ENOSPC)
+    goto unreserve_bo;

-    if (pasid) {
-    unsigned long flags;
-
- spin_lock_irqsave(>vm_manager.pasid_lock, flags);
-    r = idr_alloc(>vm_manager.pasid_idr, vm, pasid, pasid 
+ 1,

-  GFP_ATOMIC);
- spin_unlock_irqrestore(>vm_manager.pasid_lock, flags);
-
-    if (r == -ENOSPC)
-    goto unreserve_bo;
-    r = 0;
-    }
+    r = 0;

  /* Check if PD needs to be reinitialized and do it before
   * changing any other state, in case it fails.
@@ -3089,35 +3111,23 @@ int amdgpu_vm_make_compute(struct 
amdgpu_device *adev, struct amdgpu_vm *vm,

  vm->is_compute_context = true;

  if (vm->pasid) {
-    unsigned long flags;
-
- spin_lock_irqsave(>vm_manager.pasid_lock, flags);
-    idr_remove(>vm_manager.pasid_idr, vm->pasid);
- spin_unlock_irqrestore(>vm_manager.pasid_lock, flags);
-
  /* Free the original amdgpu allocated pasid
   * Will be replaced with kfd allocated pasid
   */
  amdgpu_pasid_free(vm->pasid);
-    vm->pasid = 0;
+    amdgpu_vm_pasid_remove(adev, vm->pasid, >pasid);
  }

  /* Free the shadow bo for compute VM */
amdgpu_bo_unref(_amdgpu_bo_vm(vm->root.bo)->shadow);
-
  if (pasid)
  vm->pasid = pasid;

  goto unreserve_bo;

  free_idr:
-    if (pasid) {
-    unsigned long flags;
+    amdgpu_vm_pasid_remove(adev, pasid, NULL);

- spin_lock_irqsave(>vm_manager.pasid_lock, flags);
-    idr_remove(>vm_manager.pasid_idr, pasid);
- spin_unlock_irqrestore(>vm_manager.pasid_lock, flags);
-    }
  unreserve_bo:
  amdgpu_bo_unreserve(vm->root.bo);
  return r;
@@ -3133,14 +3143,7 @@ int amdgpu_vm_make_compute(struct 
amdgpu_device *adev, struct amdgpu_vm *vm,

   */
  void amdgpu_vm_release_compute(struct amdgpu_device 

[PATCH 1/1] amdgpu/pm: remove code duplication in show_power_cap calls

2021-06-22 Thread Darren Powell
 Created a generic function and call it with an enum from:
 * amdgpu_hwmon_show_power_cap_max
 * amdgpu_hwmon_show_power_cap
 * amdgpu_hwmon_show_power_cap_default

=== Test ===
AMDGPU_PCI_ADDR=`lspci -nn | grep "VGA\|Display" | cut -d " " -f 1`
AMDGPU_HWMON=`ls -la /sys/class/hwmon | grep $AMDGPU_PCI_ADDR | cut -d " " -f 
10`
HWMON_DIR=/sys/class/hwmon/${AMDGPU_HWMON}

cp pp_show_power_cap.txt{,.old}
lspci -nn | grep "VGA\|Display" > pp_show_power_cap.test.log
FILES="
power1_cap
power1_cap_max
power1_cap_default "

for f in $FILES
do
  echo  $f = `cat $HWMON_DIR/$f` >> pp_show_power_cap.test.log
done

Signed-off-by: Darren Powell 
---
 drivers/gpu/drm/amd/pm/amdgpu_pm.c | 86 +-
 1 file changed, 14 insertions(+), 72 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/amdgpu_pm.c 
b/drivers/gpu/drm/amd/pm/amdgpu_pm.c
index b2335a1d3f98..99c21d1a2c4e 100644
--- a/drivers/gpu/drm/amd/pm/amdgpu_pm.c
+++ b/drivers/gpu/drm/amd/pm/amdgpu_pm.c
@@ -2901,14 +2901,14 @@ static ssize_t amdgpu_hwmon_show_power_cap_min(struct 
device *dev,
return sprintf(buf, "%i\n", 0);
 }
 
-static ssize_t amdgpu_hwmon_show_power_cap_max(struct device *dev,
-struct device_attribute *attr,
-char *buf)
+static ssize_t amdgpu_hwmon_show_power_cap_generic(struct device *dev,
+  struct device_attribute *attr,
+  char *buf,
+  enum pp_power_limit_level pp_limit_level)
 {
struct amdgpu_device *adev = dev_get_drvdata(dev);
const struct amd_pm_funcs *pp_funcs = adev->powerplay.pp_funcs;
enum pp_power_type power_type = to_sensor_dev_attr(attr)->index;
-   enum pp_power_limit_level pp_limit_level = PP_PWR_LIMIT_MAX;
uint32_t limit;
ssize_t size;
int r;
@@ -2941,85 +2941,27 @@ static ssize_t amdgpu_hwmon_show_power_cap_max(struct 
device *dev,
return size;
 }
 
-static ssize_t amdgpu_hwmon_show_power_cap(struct device *dev,
+static ssize_t amdgpu_hwmon_show_power_cap_max(struct device *dev,
 struct device_attribute *attr,
 char *buf)
 {
-   struct amdgpu_device *adev = dev_get_drvdata(dev);
-   const struct amd_pm_funcs *pp_funcs = adev->powerplay.pp_funcs;
-   enum pp_power_type power_type = to_sensor_dev_attr(attr)->index;
-   enum pp_power_limit_level pp_limit_level = PP_PWR_LIMIT_CURRENT;
-   uint32_t limit;
-   ssize_t size;
-   int r;
-
-   if (amdgpu_in_reset(adev))
-   return -EPERM;
-   if (adev->in_suspend && !adev->in_runpm)
-   return -EPERM;
-
-   r = pm_runtime_get_sync(adev_to_drm(adev)->dev);
-   if (r < 0) {
-   pm_runtime_put_autosuspend(adev_to_drm(adev)->dev);
-   return r;
-   }
-
-   if (pp_funcs && pp_funcs->get_power_limit)
-   r = pp_funcs->get_power_limit(adev->powerplay.pp_handle, ,
- pp_limit_level, power_type);
-   else
-   r = -ENODATA;
-
-   if (!r)
-   size = snprintf(buf, PAGE_SIZE, "%u\n", limit * 100);
-   else
-   size = snprintf(buf, PAGE_SIZE, "\n");
-
-   pm_runtime_mark_last_busy(adev_to_drm(adev)->dev);
-   pm_runtime_put_autosuspend(adev_to_drm(adev)->dev);
+   return amdgpu_hwmon_show_power_cap_generic(dev, attr, buf, 
PP_PWR_LIMIT_MAX);
+}
 
-   return size;
+static ssize_t amdgpu_hwmon_show_power_cap(struct device *dev,
+struct device_attribute *attr,
+char *buf)
+{
+   return amdgpu_hwmon_show_power_cap_generic(dev, attr, buf, 
PP_PWR_LIMIT_CURRENT);
 }
 
 static ssize_t amdgpu_hwmon_show_power_cap_default(struct device *dev,
 struct device_attribute *attr,
 char *buf)
 {
-   struct amdgpu_device *adev = dev_get_drvdata(dev);
-   const struct amd_pm_funcs *pp_funcs = adev->powerplay.pp_funcs;
-   enum pp_power_type power_type = to_sensor_dev_attr(attr)->index;
-   enum pp_power_limit_level pp_limit_level = PP_PWR_LIMIT_DEFAULT;
-   uint32_t limit;
-   ssize_t size;
-   int r;
-
-   if (amdgpu_in_reset(adev))
-   return -EPERM;
-   if (adev->in_suspend && !adev->in_runpm)
-   return -EPERM;
-
-   r = pm_runtime_get_sync(adev_to_drm(adev)->dev);
-   if (r < 0) {
-   pm_runtime_put_autosuspend(adev_to_drm(adev)->dev);
-   return r;
-   }
-
-   if (pp_funcs && pp_funcs->get_power_limit)
-   r = pp_funcs->get_power_limit(adev->powerplay.pp_handle, ,
- pp_limit_level, power_type);
-   else
-   r = -ENODATA;
-
-   if 

[PATCH] This patch replaces all the instances of dev_info with drm_info macro

2021-06-22 Thread Aman Jain
When a driver has multiple instances it is necessary to differentiate
between them in the logs. This was done with dev_info/warn/err since
DRM_INFO/WARN/ERROR don't do this. We now have drm_info/warn/err for
printing the relevant debug messages. Hence, this patch uses
drm_* macros to achieve drm-formatted logging.

Signed-off-by: Aman Jain 
---
 drivers/gpu/drm/radeon/radeon_drv.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon_drv.c 
b/drivers/gpu/drm/radeon/radeon_drv.c
index efeb115ae70e..75e84914c29b 100644
--- a/drivers/gpu/drm/radeon/radeon_drv.c
+++ b/drivers/gpu/drm/radeon/radeon_drv.c
@@ -308,7 +308,7 @@ static int radeon_pci_probe(struct pci_dev *pdev,
case CHIP_VERDE:
case CHIP_OLAND:
case CHIP_HAINAN:
-   dev_info(>dev,
+   drm_info(>dev,
 "SI support disabled by module param\n");
return -ENODEV;
}
@@ -320,7 +320,7 @@ static int radeon_pci_probe(struct pci_dev *pdev,
case CHIP_HAWAII:
case CHIP_KABINI:
case CHIP_MULLINS:
-   dev_info(>dev,
+   drm_info(>dev,
 "CIK support disabled by module param\n");
return -ENODEV;
}
-- 
2.30.2



[PATCH] This patch replaces all the instances of dev_info with drm_info

2021-06-22 Thread Aman Jain
When a driver has multiple instances it is necessary to differentiate
between them in the logs. This was done with dev_info/warn/err since
DRM_INFO/WARN/ERROR don't do this. We now have drm_info/warn/err for
printing the relevant debug messages. Hence, this patch uses
drm_* macros to achieve drm-formatted logging.

Signed-off-by: Aman Jain 
---
 drivers/gpu/drm/radeon/radeon_drv.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon_drv.c 
b/drivers/gpu/drm/radeon/radeon_drv.c
index efeb115ae70e..639c447d9a1f 100644
--- a/drivers/gpu/drm/radeon/radeon_drv.c
+++ b/drivers/gpu/drm/radeon/radeon_drv.c
@@ -49,6 +49,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "radeon_drv.h"
 #include "radeon.h"
@@ -308,7 +309,7 @@ static int radeon_pci_probe(struct pci_dev *pdev,
case CHIP_VERDE:
case CHIP_OLAND:
case CHIP_HAINAN:
-   dev_info(>dev,
+   drm_info(>dev,
 "SI support disabled by module param\n");
return -ENODEV;
}
@@ -320,7 +321,7 @@ static int radeon_pci_probe(struct pci_dev *pdev,
case CHIP_HAWAII:
case CHIP_KABINI:
case CHIP_MULLINS:
-   dev_info(>dev,
+   drm_info(>dev,
 "CIK support disabled by module param\n");
return -ENODEV;
}
-- 
2.30.2



Re: [PATCH 1/1] drm/amdgpu: add helper function for vm pasid

2021-06-22 Thread Das, Nirmoy


On 6/22/2021 9:03 AM, Christian König wrote:



On 22.06.21 at 08:57, Nirmoy Das wrote:

Clean up code related to vm pasid by adding helper functions.

Signed-off-by: Nirmoy Das 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 105 -
  1 file changed, 50 insertions(+), 55 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c

index 63975bda8e76..6e476b173cbb 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -87,6 +87,46 @@ struct amdgpu_prt_cb {
  struct dma_fence_cb cb;
  };

+static int amdgpu_vm_pasid_alloc(struct amdgpu_device *adev,
+ struct amdgpu_vm *vm,
+ unsigned int pasid,
+ unsigned int *vm_pasid)
+{
+    unsigned long flags;
+    int r;
+
+    if (!pasid)
+    return 0;
+
+    spin_lock_irqsave(>vm_manager.pasid_lock, flags);
+    r = idr_alloc(>vm_manager.pasid_idr, vm, pasid, pasid + 1,
+  GFP_ATOMIC);
+    spin_unlock_irqrestore(>vm_manager.pasid_lock, flags);
+    if (r < 0)
+    return r;
+    if (vm_pasid)
+    *vm_pasid = pasid;
+


Ok the more I read from this patch the less it makes sense.

We don't allocate the pasid here, we just set it up in the idr.

What we could do is to replace the idr with an xarray, that would 
certainly make more sense than this here.



xarray looks great, with that we don't need pasid_lock either.


Thanks

Nirmoy




Christian.


+    return 0;
+}
+
+static void amdgpu_vm_pasid_remove(struct amdgpu_device *adev,
+   unsigned int pasid,
+   unsigned int *vm_pasid)
+{
+    unsigned long flags;
+
+    if (!pasid)
+    return;
+
+    spin_lock_irqsave(>vm_manager.pasid_lock, flags);
+    idr_remove(>vm_manager.pasid_idr, pasid);
+    spin_unlock_irqrestore(>vm_manager.pasid_lock, flags);
+
+    if (vm_pasid)
+    *vm_pasid = 0;
+}
+
  /*
   * vm eviction_lock can be taken in MMU notifiers. Make sure no 
reclaim-FS

   * happens while holding this lock anywhere to prevent deadlocks when
@@ -2940,18 +2980,8 @@ int amdgpu_vm_init(struct amdgpu_device *adev, 
struct amdgpu_vm *vm, u32 pasid)


  amdgpu_bo_unreserve(vm->root.bo);

-    if (pasid) {
-    unsigned long flags;
-
-    spin_lock_irqsave(>vm_manager.pasid_lock, flags);
-    r = idr_alloc(>vm_manager.pasid_idr, vm, pasid, pasid 
+ 1,

-  GFP_ATOMIC);
- spin_unlock_irqrestore(>vm_manager.pasid_lock, flags);
-    if (r < 0)
-    goto error_free_root;
-
-    vm->pasid = pasid;
-    }
+    if (amdgpu_vm_pasid_alloc(adev, vm, pasid, >pasid))
+    goto error_free_root;

  INIT_KFIFO(vm->faults);

@@ -3038,19 +3068,11 @@ int amdgpu_vm_make_compute(struct 
amdgpu_device *adev, struct amdgpu_vm *vm,

  r = amdgpu_vm_check_clean_reserved(adev, vm);
  if (r)
  goto unreserve_bo;
+    r = amdgpu_vm_pasid_alloc(adev, vm, pasid, NULL);
+    if (r ==  -ENOSPC)
+    goto unreserve_bo;

-    if (pasid) {
-    unsigned long flags;
-
-    spin_lock_irqsave(>vm_manager.pasid_lock, flags);
-    r = idr_alloc(>vm_manager.pasid_idr, vm, pasid, pasid 
+ 1,

-  GFP_ATOMIC);
- spin_unlock_irqrestore(>vm_manager.pasid_lock, flags);
-
-    if (r == -ENOSPC)
-    goto unreserve_bo;
-    r = 0;
-    }
+    r = 0;

  /* Check if PD needs to be reinitialized and do it before
   * changing any other state, in case it fails.
@@ -3089,35 +3111,23 @@ int amdgpu_vm_make_compute(struct 
amdgpu_device *adev, struct amdgpu_vm *vm,

  vm->is_compute_context = true;

  if (vm->pasid) {
-    unsigned long flags;
-
-    spin_lock_irqsave(>vm_manager.pasid_lock, flags);
-    idr_remove(>vm_manager.pasid_idr, vm->pasid);
- spin_unlock_irqrestore(>vm_manager.pasid_lock, flags);
-
  /* Free the original amdgpu allocated pasid
   * Will be replaced with kfd allocated pasid
   */
  amdgpu_pasid_free(vm->pasid);
-    vm->pasid = 0;
+    amdgpu_vm_pasid_remove(adev, vm->pasid, >pasid);
  }

  /* Free the shadow bo for compute VM */
amdgpu_bo_unref(_amdgpu_bo_vm(vm->root.bo)->shadow);
-
  if (pasid)
  vm->pasid = pasid;

  goto unreserve_bo;

  free_idr:
-    if (pasid) {
-    unsigned long flags;
+    amdgpu_vm_pasid_remove(adev, pasid, NULL);

-    spin_lock_irqsave(>vm_manager.pasid_lock, flags);
-    idr_remove(>vm_manager.pasid_idr, pasid);
- spin_unlock_irqrestore(>vm_manager.pasid_lock, flags);
-    }
  unreserve_bo:
  amdgpu_bo_unreserve(vm->root.bo);
  return r;
@@ -3133,14 +3143,7 @@ int amdgpu_vm_make_compute(struct 
amdgpu_device *adev, struct amdgpu_vm *vm,

   */
  void amdgpu_vm_release_compute(struct amdgpu_device *adev, struct 
amdgpu_vm *vm)

  {
-    if (vm->pasid) {
-    unsigned long flags;
-
-    spin_lock_irqsave(>vm_manager.pasid_lock, flags);
-    

Re: [PATCH v4 17/17] drm/amd/display: Add handling for new "Broadcast RGB" property

2021-06-22 Thread Pekka Paalanen
On Fri, 18 Jun 2021 11:11:16 +0200
Werner Sembach  wrote:

> This commit implements the "Broadcast RGB" drm property for the AMD GPU
> driver.
> 
> Signed-off-by: Werner Sembach 
> ---
>  .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 22 ++-
>  .../display/amdgpu_dm/amdgpu_dm_mst_types.c   |  4 
>  2 files changed, 21 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> index 9ffd2f9d3d75..c5dbf948a47a 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> @@ -5252,7 +5252,8 @@ get_aspect_ratio(const struct drm_display_mode *mode_in)
>  }
>  
>  static enum dc_color_space
> -get_output_color_space(const struct dc_crtc_timing *dc_crtc_timing)
> +get_output_color_space(const struct dc_crtc_timing *dc_crtc_timing,
> +enum drm_mode_color_range preferred_color_range)
>  {
>   enum dc_color_space color_space = COLOR_SPACE_SRGB;
>  
> @@ -5267,13 +5268,17 @@ get_output_color_space(const struct dc_crtc_timing 
> *dc_crtc_timing)
>* respectively
>*/
>   if (dc_crtc_timing->pix_clk_100hz > 270300) {
> - if (dc_crtc_timing->flags.Y_ONLY)
> + if (dc_crtc_timing->flags.Y_ONLY
> + || preferred_color_range ==
> + 
> DRM_MODE_COLOR_RANGE_LIMITED_16_235)
>   color_space =
>   COLOR_SPACE_YCBCR709_LIMITED;
>   else
>   color_space = COLOR_SPACE_YCBCR709;

Hi,

does this mean that amdgpu would be using a property named "Broadcast
RGB" to control the range of YCbCr too?

That is surprising. If this is truly wanted, then the documentation of
"Broadcast RGB" must say that it applies to YCbCr too.

Does amdgpu do the same as intel wrt. the question about whose
responsibility it is to make the pixels at the connector match the
set range?


Thanks,
pq

>   } else {
> - if (dc_crtc_timing->flags.Y_ONLY)
> + if (dc_crtc_timing->flags.Y_ONLY
> + || preferred_color_range ==
> + 
> DRM_MODE_COLOR_RANGE_LIMITED_16_235)
>   color_space =
>   COLOR_SPACE_YCBCR601_LIMITED;
>   else
> @@ -5283,7 +5288,10 @@ get_output_color_space(const struct dc_crtc_timing 
> *dc_crtc_timing)
>   }
>   break;
>   case PIXEL_ENCODING_RGB:
> - color_space = COLOR_SPACE_SRGB;
> + if (preferred_color_range == 
> DRM_MODE_COLOR_RANGE_LIMITED_16_235)
> + color_space = COLOR_SPACE_SRGB_LIMITED;
> + else
> + color_space = COLOR_SPACE_SRGB;
>   break;
>  
>   default:
> @@ -5429,7 +5437,10 @@ static void 
> fill_stream_properties_from_drm_display_mode(
>  
>   timing_out->aspect_ratio = get_aspect_ratio(mode_in);
>  
> - stream->output_color_space = get_output_color_space(timing_out);
> + stream->output_color_space = get_output_color_space(timing_out,
> + connector_state ?
> + 
> connector_state->preferred_color_range :
> + 
> DRM_MODE_COLOR_RANGE_UNSET);
>  
>   stream->out_transfer_func->type = TF_TYPE_PREDEFINED;
>   stream->out_transfer_func->tf = TRANSFER_FUNCTION_SRGB;
> @@ -7780,6 +7791,7 @@ void amdgpu_dm_connector_init_helper(struct 
> amdgpu_display_manager *dm,
>   drm_connector_attach_active_bpc_property(>base, 8, 
> 16);
>   
> drm_connector_attach_preferred_color_format_property(>base);
>   
> drm_connector_attach_active_color_format_property(>base);
> + 
> drm_connector_attach_preferred_color_range_property(>base);
>   
> drm_connector_attach_active_color_range_property(>base);
>   }
>  
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c 
> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c
> index 2563788ba95a..80e1389fd0ec 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c
> @@ -421,6 +421,10 @@ dm_dp_add_mst_connector(struct drm_dp_mst_topology_mgr 
> *mgr,
>   if (connector->active_color_format_property)
>   
> drm_connector_attach_active_color_format_property(>base);
>  
> + connector->preferred_color_range_property = 
> master->base.preferred_color_range_property;
> + if (connector->preferred_color_range_property)
> + 
> 

Re: [PATCH v4 15/17] drm/uAPI: Move "Broadcast RGB" property from driver specific to general context

2021-06-22 Thread Pekka Paalanen
On Fri, 18 Jun 2021 11:11:14 +0200
Werner Sembach  wrote:

> Add "Broadcast RGB" to general drm context so that more drivers besides
> i915 and gma500 can implement it without duplicating code.
> 
> Userspace can use this property to tell the graphic driver to use full or
> limited color range for a given connector, overriding the default
> behaviour/automatic detection.
> 
> Possible options are:
> - Automatic (default/current behaviour)
> - Full
> - Limited 16:235
> 
> In theory the driver should be able to automatically detect the monitor's
> capabilities, but because of flawed standard implementations in monitors,
> this might fail. In this case a manual override is required to not have
> washed out colors or lose details in very dark or bright scenes.
> 
> Signed-off-by: Werner Sembach 
> ---
>  drivers/gpu/drm/drm_atomic_helper.c |  4 +++
>  drivers/gpu/drm/drm_atomic_uapi.c   |  4 +++
>  drivers/gpu/drm/drm_connector.c | 43 +
>  include/drm/drm_connector.h | 16 +++
>  4 files changed, 67 insertions(+)
> 
> diff --git a/drivers/gpu/drm/drm_atomic_helper.c 
> b/drivers/gpu/drm/drm_atomic_helper.c
> index 90d62f305257..0c89d32efbd0 100644
> --- a/drivers/gpu/drm/drm_atomic_helper.c
> +++ b/drivers/gpu/drm/drm_atomic_helper.c
> @@ -691,6 +691,10 @@ drm_atomic_helper_check_modeset(struct drm_device *dev,
>   if (old_connector_state->preferred_color_format !=
>   new_connector_state->preferred_color_format)
>   new_crtc_state->connectors_changed = true;
> +
> + if (old_connector_state->preferred_color_range !=
> + new_connector_state->preferred_color_range)
> + new_crtc_state->connectors_changed = true;
>   }
>  
>   if (funcs->atomic_check)
> diff --git a/drivers/gpu/drm/drm_atomic_uapi.c 
> b/drivers/gpu/drm/drm_atomic_uapi.c
> index c536f5e22016..c589bb1a8163 100644
> --- a/drivers/gpu/drm/drm_atomic_uapi.c
> +++ b/drivers/gpu/drm/drm_atomic_uapi.c
> @@ -798,6 +798,8 @@ static int drm_atomic_connector_set_property(struct 
> drm_connector *connector,
>   state->max_requested_bpc = val;
>   } else if (property == connector->preferred_color_format_property) {
>   state->preferred_color_format = val;
> + } else if (property == connector->preferred_color_range_property) {
> + state->preferred_color_range = val;
>   } else if (connector->funcs->atomic_set_property) {
>   return connector->funcs->atomic_set_property(connector,
>   state, property, val);
> @@ -877,6 +879,8 @@ drm_atomic_connector_get_property(struct drm_connector 
> *connector,
>   *val = state->max_requested_bpc;
>   } else if (property == connector->preferred_color_format_property) {
>   *val = state->preferred_color_format;
> + } else if (property == connector->preferred_color_range_property) {
> + *val = state->preferred_color_range;
>   } else if (connector->funcs->atomic_get_property) {
>   return connector->funcs->atomic_get_property(connector,
>   state, property, val);
> diff --git a/drivers/gpu/drm/drm_connector.c b/drivers/gpu/drm/drm_connector.c
> index aea03dd02e33..9bc596638613 100644
> --- a/drivers/gpu/drm/drm_connector.c
> +++ b/drivers/gpu/drm/drm_connector.c
> @@ -905,6 +905,12 @@ static const struct drm_prop_enum_list 
> drm_active_color_format_enum_list[] = {
>   { DRM_COLOR_FORMAT_YCRCB420, "ycbcr420" },
>  };
>  
> +static const struct drm_prop_enum_list drm_preferred_color_range_enum_list[] 
> = {
> + { DRM_MODE_COLOR_RANGE_UNSET, "Automatic" },
> + { DRM_MODE_COLOR_RANGE_FULL, "Full" },
> + { DRM_MODE_COLOR_RANGE_LIMITED_16_235, "Limited 16:235" },

Hi,

the same question here about these numbers as I asked on the "active
color range" property.

> +};
> +
>  static const struct drm_prop_enum_list drm_active_color_range_enum_list[] = {
>   { DRM_MODE_COLOR_RANGE_UNSET, "Unknown" },
>   { DRM_MODE_COLOR_RANGE_FULL, "Full" },
> @@ -1243,6 +1249,13 @@ static const struct drm_prop_enum_list 
> dp_colorspaces[] = {
>   *   drm_connector_attach_active_color_format_property() to install this
>   *   property.
>   *
> + * Broadcast RGB:
> + *   This property is used by userspace to change the used color range. When
> + *   used the driver will use the selected range if valid for the current
> + *   color format. Drivers to use the function
> + *   drm_connector_attach_preferred_color_format_property() to create and
> + *   attach the property to the connector during initialization.

An important detail to document here is: does userspace need to
take care that pixel data at the connector will already match the set
range, or will the driver program the hardware to produce the set range?

If the former, then 

Re: [PATCH v4 12/17] drm/uAPI: Add "preferred color format" drm property as setting for userspace

2021-06-22 Thread Pekka Paalanen
On Fri, 18 Jun 2021 11:11:11 +0200
Werner Sembach  wrote:

> Add a new general drm property "preferred color format" which can be used
> by userspace to tell the graphic drivers to which color format to use.
> 
> Possible options are:
> - auto (default/current behaviour)
> - rgb
> - ycbcr444
> - ycbcr422 (not supported by both amdgpu and i915)
> - ycbcr420
> 
> In theory the auto option should choose the best available option for the
> current setup, but because of bad internal conversion some monitors look
> better with rgb and some with ycbcr444.
> 
> Also, because of badly shielded connectors and/or cables, it might be
> preferable to use the less bandwidth-heavy ycbcr422 and ycbcr420 formats
> for a signal that is less susceptible to interference.
> 
> In the future, automatic color calibration for screens might also depend on
> this option being available.
> 
> Signed-off-by: Werner Sembach 
> ---
>  drivers/gpu/drm/drm_atomic_helper.c |  4 +++
>  drivers/gpu/drm/drm_atomic_uapi.c   |  4 +++
>  drivers/gpu/drm/drm_connector.c | 48 -
>  include/drm/drm_connector.h | 17 ++
>  4 files changed, 72 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/drm_atomic_helper.c 
> b/drivers/gpu/drm/drm_atomic_helper.c
> index bc3487964fb5..90d62f305257 100644
> --- a/drivers/gpu/drm/drm_atomic_helper.c
> +++ b/drivers/gpu/drm/drm_atomic_helper.c
> @@ -687,6 +687,10 @@ drm_atomic_helper_check_modeset(struct drm_device *dev,
>   if (old_connector_state->max_requested_bpc !=
>   new_connector_state->max_requested_bpc)
>   new_crtc_state->connectors_changed = true;
> +
> + if (old_connector_state->preferred_color_format !=
> + new_connector_state->preferred_color_format)
> + new_crtc_state->connectors_changed = true;
>   }
>  
>   if (funcs->atomic_check)
> diff --git a/drivers/gpu/drm/drm_atomic_uapi.c 
> b/drivers/gpu/drm/drm_atomic_uapi.c
> index 438e9585b225..c536f5e22016 100644
> --- a/drivers/gpu/drm/drm_atomic_uapi.c
> +++ b/drivers/gpu/drm/drm_atomic_uapi.c
> @@ -796,6 +796,8 @@ static int drm_atomic_connector_set_property(struct 
> drm_connector *connector,
>  fence_ptr);
>   } else if (property == connector->max_bpc_property) {
>   state->max_requested_bpc = val;
> + } else if (property == connector->preferred_color_format_property) {
> + state->preferred_color_format = val;
>   } else if (connector->funcs->atomic_set_property) {
>   return connector->funcs->atomic_set_property(connector,
>   state, property, val);
> @@ -873,6 +875,8 @@ drm_atomic_connector_get_property(struct drm_connector 
> *connector,
>   *val = 0;
>   } else if (property == connector->max_bpc_property) {
>   *val = state->max_requested_bpc;
> + } else if (property == connector->preferred_color_format_property) {
> + *val = state->preferred_color_format;
>   } else if (connector->funcs->atomic_get_property) {
>   return connector->funcs->atomic_get_property(connector,
>   state, property, val);
> diff --git a/drivers/gpu/drm/drm_connector.c b/drivers/gpu/drm/drm_connector.c
> index 818de58d972f..aea03dd02e33 100644
> --- a/drivers/gpu/drm/drm_connector.c
> +++ b/drivers/gpu/drm/drm_connector.c
> @@ -889,6 +889,14 @@ static const struct drm_prop_enum_list 
> drm_dp_subconnector_enum_list[] = {
>   { DRM_MODE_SUBCONNECTOR_Native,  "Native"}, /* DP */
>  };
>  
> +static const struct drm_prop_enum_list 
> drm_preferred_color_format_enum_list[] = {
> + { 0, "auto" },
> + { DRM_COLOR_FORMAT_RGB444, "rgb" },
> + { DRM_COLOR_FORMAT_YCRCB444, "ycbcr444" },
> + { DRM_COLOR_FORMAT_YCRCB422, "ycbcr422" },
> + { DRM_COLOR_FORMAT_YCRCB420, "ycbcr420" },
> +};
> +
>  static const struct drm_prop_enum_list drm_active_color_format_enum_list[] = 
> {
>   { 0, "unknown" },
>   { DRM_COLOR_FORMAT_RGB444, "rgb" },
> @@ -1219,11 +1227,19 @@ static const struct drm_prop_enum_list 
> dp_colorspaces[] = {
>   *   Drivers shall use drm_connector_attach_active_bpc_property() to install
>   *   this property.
>   *
> + * preferred color format:
> + *   This property is used by userspace to change the used color format. When
> + *   used the driver will use the selected format if valid for the hardware,
> + *   sink, and current resolution and refresh rate combination. Drivers to
> + *   use the function drm_connector_attach_preferred_color_format_property()
> + *   to create and attach the property to the connector during
> + *   initialization.
> + *
>   * active color format:
>   *   This read-only property tells userspace the color format actually used
>   *   by the 

Re: [PATCH v4 09/17] drm/uAPI: Add "active color range" drm property as feedback for userspace

2021-06-22 Thread Pekka Paalanen
On Fri, 18 Jun 2021 11:11:08 +0200
Werner Sembach  wrote:

> Add a new general drm property "active color range" which can be used by
> graphic drivers to report the used color range back to userspace.
> 
> There was no way to check which color range got actually used on a given
> monitor. To surely predict this, one must know the exact capabilities of
> the monitor and what the default behaviour of the used driver is. This
> property helps eliminate the guessing at this point.
> 
> In the future, automatic color calibration for screens might also depend on
> this information being available.
> 
> Signed-off-by: Werner Sembach 
> ---
>  drivers/gpu/drm/drm_connector.c | 59 +
>  include/drm/drm_connector.h | 27 +++
>  2 files changed, 86 insertions(+)
> 
> diff --git a/drivers/gpu/drm/drm_connector.c b/drivers/gpu/drm/drm_connector.c
> index 684d7abdf0eb..818de58d972f 100644
> --- a/drivers/gpu/drm/drm_connector.c
> +++ b/drivers/gpu/drm/drm_connector.c
> @@ -897,6 +897,12 @@ static const struct drm_prop_enum_list 
> drm_active_color_format_enum_list[] = {
>   { DRM_COLOR_FORMAT_YCRCB420, "ycbcr420" },
>  };
>  
> +static const struct drm_prop_enum_list drm_active_color_range_enum_list[] = {
> + { DRM_MODE_COLOR_RANGE_UNSET, "Unknown" },
> + { DRM_MODE_COLOR_RANGE_FULL, "Full" },
> + { DRM_MODE_COLOR_RANGE_LIMITED_16_235, "Limited 16:235" },

Doesn't "limited" mean different numbers on RGB vs. Y vs. CbCr? I have
a vague recollection that at least one of them was different from the
others.

Documenting DRM_MODE_COLOR_RANGE_UNSET as "unspecified/default" while
the string for it is "Unknown" seems inconsistent to me. I would
recommend to avoid the word "default" because "reset to defaults" might
become a thing one day, and that probably is not the same default as
here.

Is there actually a case for "unknown"? How can it be not known? Or
does it mean "not applicable"?

Otherwise looks good to me.


Thanks,
pq


> +};
> +
>  DRM_ENUM_NAME_FN(drm_get_dp_subconnector_name,
>drm_dp_subconnector_enum_list)
>  
> @@ -1221,6 +1227,14 @@ static const struct drm_prop_enum_list 
> dp_colorspaces[] = {
>   *   drm_connector_attach_active_color_format_property() to install this
>   *   property.
>   *
> + * active color range:
> + *   This read-only property tells userspace the color range actually used by
> + *   the hardware display engine on "the cable" on a connector. The chosen
> + *   value depends on hardware capabilities of the monitor and the used color
> + *   format. Drivers shall use
> + *   drm_connector_attach_active_color_range_property() to install this
> + *   property.
> + *
>   * Connectors also have one standardized atomic property:
>   *
>   * CRTC_ID:
> @@ -2264,6 +2278,51 @@ void 
> drm_connector_set_active_color_format_property(struct drm_connector *connec
>  }
>  EXPORT_SYMBOL(drm_connector_set_active_color_format_property);
>  
> +/**
> + * drm_connector_attach_active_color_range_property - attach "active color 
> range" property
> + * @connector: connector to attach active color range property on.
> + *
> + * This is used to check the applied color range on a connector.
> + *
> + * Returns:
> + * Zero on success, negative errno on failure.
> + */
> +int drm_connector_attach_active_color_range_property(struct drm_connector 
> *connector)
> +{
> + struct drm_device *dev = connector->dev;
> + struct drm_property *prop;
> +
> + if (!connector->active_color_range_property) {
> + prop = drm_property_create_enum(dev, DRM_MODE_PROP_IMMUTABLE, 
> "active color range",
> + 
> drm_active_color_range_enum_list,
> + 
> ARRAY_SIZE(drm_active_color_range_enum_list));
> + if (!prop)
> + return -ENOMEM;
> +
> + connector->active_color_range_property = prop;
> + drm_object_attach_property(>base, prop, 
> DRM_MODE_COLOR_RANGE_UNSET);
> + }
> +
> + return 0;
> +}
> +EXPORT_SYMBOL(drm_connector_attach_active_color_range_property);
> +
> +/**
> + * drm_connector_set_active_color_range_property - sets the active color 
> range property for a
> + * connector
> + * @connector: drm connector
> + * @active_color_range: color range for the connector currently active on 
> "the cable"
> + *
> + * Should be used by atomic drivers to update the active color range over a 
> connector.
> + */
> +void drm_connector_set_active_color_range_property(struct drm_connector 
> *connector,
> +enum drm_mode_color_range 
> active_color_range)
> +{
> + drm_object_property_set_value(>base, 
> connector->active_color_range_property,
> +   active_color_range);
> +}
> +EXPORT_SYMBOL(drm_connector_set_active_color_range_property);
> +
>  /**
>   * drm_connector_attach_hdr_output_metadata_property - attach 
> 

[PATCH 1/1] drm/amdgpu: add helper function for vm pasid

2021-06-22 Thread Nirmoy Das
Clean up code related to vm pasid by adding helper functions.

Signed-off-by: Nirmoy Das 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 105 -
 1 file changed, 50 insertions(+), 55 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 63975bda8e76..6e476b173cbb 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -87,6 +87,46 @@ struct amdgpu_prt_cb {
struct dma_fence_cb cb;
 };

+static int amdgpu_vm_pasid_alloc(struct amdgpu_device *adev,
+struct amdgpu_vm *vm,
+unsigned int pasid,
+unsigned int *vm_pasid)
+{
+   unsigned long flags;
+   int r;
+
+   if (!pasid)
+   return 0;
+
+   spin_lock_irqsave(>vm_manager.pasid_lock, flags);
+   r = idr_alloc(>vm_manager.pasid_idr, vm, pasid, pasid + 1,
+ GFP_ATOMIC);
+   spin_unlock_irqrestore(>vm_manager.pasid_lock, flags);
+   if (r < 0)
+   return r;
+   if (vm_pasid)
+   *vm_pasid = pasid;
+
+   return 0;
+}
+
+static void amdgpu_vm_pasid_remove(struct amdgpu_device *adev,
+  unsigned int pasid,
+  unsigned int *vm_pasid)
+{
+   unsigned long flags;
+
+   if (!pasid)
+   return;
+
+   spin_lock_irqsave(>vm_manager.pasid_lock, flags);
+   idr_remove(>vm_manager.pasid_idr, pasid);
+   spin_unlock_irqrestore(>vm_manager.pasid_lock, flags);
+
+   if (vm_pasid)
+   *vm_pasid = 0;
+}
+
 /*
  * vm eviction_lock can be taken in MMU notifiers. Make sure no reclaim-FS
  * happens while holding this lock anywhere to prevent deadlocks when
@@ -2940,18 +2980,8 @@ int amdgpu_vm_init(struct amdgpu_device *adev, struct 
amdgpu_vm *vm, u32 pasid)

amdgpu_bo_unreserve(vm->root.bo);

-   if (pasid) {
-   unsigned long flags;
-
-   spin_lock_irqsave(>vm_manager.pasid_lock, flags);
-   r = idr_alloc(>vm_manager.pasid_idr, vm, pasid, pasid + 1,
- GFP_ATOMIC);
-   spin_unlock_irqrestore(>vm_manager.pasid_lock, flags);
-   if (r < 0)
-   goto error_free_root;
-
-   vm->pasid = pasid;
-   }
+   if (amdgpu_vm_pasid_alloc(adev, vm, pasid, >pasid))
+   goto error_free_root;

INIT_KFIFO(vm->faults);

@@ -3038,19 +3068,11 @@ int amdgpu_vm_make_compute(struct amdgpu_device *adev, 
struct amdgpu_vm *vm,
r = amdgpu_vm_check_clean_reserved(adev, vm);
if (r)
goto unreserve_bo;
+   r = amdgpu_vm_pasid_alloc(adev, vm, pasid, NULL);
+   if (r ==  -ENOSPC)
+   goto unreserve_bo;

-   if (pasid) {
-   unsigned long flags;
-
-   spin_lock_irqsave(>vm_manager.pasid_lock, flags);
-   r = idr_alloc(>vm_manager.pasid_idr, vm, pasid, pasid + 1,
- GFP_ATOMIC);
-   spin_unlock_irqrestore(>vm_manager.pasid_lock, flags);
-
-   if (r == -ENOSPC)
-   goto unreserve_bo;
-   r = 0;
-   }
+   r = 0;

/* Check if PD needs to be reinitialized and do it before
 * changing any other state, in case it fails.
@@ -3089,35 +3111,23 @@ int amdgpu_vm_make_compute(struct amdgpu_device *adev, 
struct amdgpu_vm *vm,
vm->is_compute_context = true;

if (vm->pasid) {
-   unsigned long flags;
-
-   spin_lock_irqsave(>vm_manager.pasid_lock, flags);
-   idr_remove(>vm_manager.pasid_idr, vm->pasid);
-   spin_unlock_irqrestore(>vm_manager.pasid_lock, flags);
-
/* Free the original amdgpu allocated pasid
 * Will be replaced with kfd allocated pasid
 */
amdgpu_pasid_free(vm->pasid);
-   vm->pasid = 0;
+   amdgpu_vm_pasid_remove(adev, vm->pasid, >pasid);
}

/* Free the shadow bo for compute VM */
amdgpu_bo_unref(_amdgpu_bo_vm(vm->root.bo)->shadow);
-
if (pasid)
vm->pasid = pasid;

goto unreserve_bo;

 free_idr:
-   if (pasid) {
-   unsigned long flags;
+   amdgpu_vm_pasid_remove(adev, pasid, NULL);

-   spin_lock_irqsave(>vm_manager.pasid_lock, flags);
-   idr_remove(>vm_manager.pasid_idr, pasid);
-   spin_unlock_irqrestore(>vm_manager.pasid_lock, flags);
-   }
 unreserve_bo:
amdgpu_bo_unreserve(vm->root.bo);
return r;
@@ -3133,14 +3143,7 @@ int amdgpu_vm_make_compute(struct amdgpu_device *adev, 
struct amdgpu_vm *vm,
  */
 void amdgpu_vm_release_compute(struct amdgpu_device *adev, struct amdgpu_vm 
*vm)
 {
-   if (vm->pasid) {
-   unsigned long flags;
-
-  

Re: [PATCH v4 06/17] drm/uAPI: Add "active color format" drm property as feedback for userspace

2021-06-22 Thread Pekka Paalanen
On Fri, 18 Jun 2021 11:11:05 +0200
Werner Sembach  wrote:

> Add a new general drm property "active color format" which can be used by
> graphic drivers to report the used color format back to userspace.
> 
> There was no way to check which color format got actually used on a given
> monitor. To surely predict this, one must know the exact capabilities of
> the monitor, the GPU, and the connection used and what the default
> behaviour of the used driver is (e.g. amdgpu prefers YCbCr 4:4:4 while i915
> prefers RGB). This property helps eliminate the guessing on this point.
> 
> In the future, automatic color calibration for screens might also depend on
> this information being available.
> 
> Signed-off-by: Werner Sembach 
> ---
>  drivers/gpu/drm/drm_connector.c | 61 +
>  include/drm/drm_connector.h |  9 +
>  2 files changed, 70 insertions(+)
> 
> diff --git a/drivers/gpu/drm/drm_connector.c b/drivers/gpu/drm/drm_connector.c
> index 943f6b61053b..684d7abdf0eb 100644
> --- a/drivers/gpu/drm/drm_connector.c
> +++ b/drivers/gpu/drm/drm_connector.c
> @@ -889,6 +889,14 @@ static const struct drm_prop_enum_list 
> drm_dp_subconnector_enum_list[] = {
>   { DRM_MODE_SUBCONNECTOR_Native,  "Native"}, /* DP */
>  };
>  
> +static const struct drm_prop_enum_list drm_active_color_format_enum_list[] = 
> {
> + { 0, "unknown" },
> + { DRM_COLOR_FORMAT_RGB444, "rgb" },
> + { DRM_COLOR_FORMAT_YCRCB444, "ycbcr444" },
> + { DRM_COLOR_FORMAT_YCRCB422, "ycbcr422" },
> + { DRM_COLOR_FORMAT_YCRCB420, "ycbcr420" },
> +};
> +
>  DRM_ENUM_NAME_FN(drm_get_dp_subconnector_name,
>drm_dp_subconnector_enum_list)
>  
> @@ -1205,6 +1213,14 @@ static const struct drm_prop_enum_list 
> dp_colorspaces[] = {
>   *   Drivers shall use drm_connector_attach_active_bpc_property() to install
>   *   this property.
>   *
> + * active color format:
> + *   This read-only property tells userspace the color format actually used
> + *   by the hardware display engine on "the cable" on a connector. The chosen
> + *   value depends on hardware capabilities, both display engine and
> + *   connected monitor. Drivers shall use
> + *   drm_connector_attach_active_color_format_property() to install this
> + *   property.
> + *
>   * Connectors also have one standardized atomic property:
>   *
>   * CRTC_ID:
> @@ -2203,6 +2219,51 @@ void drm_connector_set_active_bpc_property(struct 
> drm_connector *connector, int
>  }
>  EXPORT_SYMBOL(drm_connector_set_active_bpc_property);
>  
> +/**
> + * drm_connector_attach_active_color_format_property - attach "active color 
> format" property
> + * @connector: connector to attach active color format property on.
> + *
> + * This is used to check the applied color format on a connector.
> + *
> + * Returns:
> + * Zero on success, negative errno on failure.
> + */
> +int drm_connector_attach_active_color_format_property(struct drm_connector 
> *connector)
> +{
> + struct drm_device *dev = connector->dev;
> + struct drm_property *prop;
> +
> + if (!connector->active_color_format_property) {
> + prop = drm_property_create_enum(dev, DRM_MODE_PROP_IMMUTABLE, 
> "active color format",
> + 
> drm_active_color_format_enum_list,
> + 
> ARRAY_SIZE(drm_active_color_format_enum_list));
> + if (!prop)
> + return -ENOMEM;
> +
> + connector->active_color_format_property = prop;
> + drm_object_attach_property(>base, prop, 0);
> + }
> +
> + return 0;
> +}
> +EXPORT_SYMBOL(drm_connector_attach_active_color_format_property);
> +
> +/**
> + * drm_connector_set_active_color_format_property - sets the active color 
> format property for a
> + * connector
> + * @connector: drm connector
> + * @active_color_format: color format for the connector currently active on 
> "the cable"
> + *
> + * Should be used by atomic drivers to update the active color format over a 
> connector.
> + */
> +void drm_connector_set_active_color_format_property(struct drm_connector 
> *connector,
> + u32 active_color_format)
> +{
> + drm_object_property_set_value(>base, 
> connector->active_color_format_property,
> +   active_color_format);
> +}
> +EXPORT_SYMBOL(drm_connector_set_active_color_format_property);
> +
>  /**
>   * drm_connector_attach_hdr_output_metadata_property - attach 
> "HDR_OUTPUT_METADA" property
>   * @connector: connector to attach the property on.
> diff --git a/include/drm/drm_connector.h b/include/drm/drm_connector.h
> index eee86de62a5f..8a5197f14e87 100644
> --- a/include/drm/drm_connector.h
> +++ b/include/drm/drm_connector.h
> @@ -1386,6 +1386,12 @@ struct drm_connector {
>*/
>   struct drm_property *active_bpc_property;
>  
> + /**
> +  * @active_color_format_property: 

Re: [PATCH v4 03/17] drm/uAPI: Add "active bpc" as feedback channel for "max bpc" drm property

2021-06-22 Thread Pekka Paalanen
On Fri, 18 Jun 2021 11:11:02 +0200
Werner Sembach  wrote:

> Add a new general drm property "active bpc" which can be used by graphic
> drivers to report the applied bit depth per pixel back to userspace.
> 
> While "max bpc" can be used to change the color depth, there was no way to
> check which one actually got used. While in theory the driver chooses the
> best/highest color depth within the max bpc setting, a user might not be
> fully aware what his hardware is or isn't capable of. This is meant as a
> quick way to double check the setup.
> 
> In the future, automatic color calibration for screens might also depend on
> this information being available.
> 
> Signed-off-by: Werner Sembach 
> ---
>  drivers/gpu/drm/drm_connector.c | 51 +
>  include/drm/drm_connector.h |  8 ++
>  2 files changed, 59 insertions(+)
> 
> diff --git a/drivers/gpu/drm/drm_connector.c b/drivers/gpu/drm/drm_connector.c
> index da39e7ff6965..943f6b61053b 100644
> --- a/drivers/gpu/drm/drm_connector.c
> +++ b/drivers/gpu/drm/drm_connector.c
> @@ -1197,6 +1197,14 @@ static const struct drm_prop_enum_list dp_colorspaces[] = {
>   *   drm_connector_attach_max_bpc_property() to create and attach the
>   *   property to the connector during initialization.
>   *
> + * active bpc:
> + *   This read-only range property tells userspace the pixel color bit depth
> + *   actually used by the hardware display engine on "the cable" on a
> + *   connector. The chosen value depends on hardware capabilities, both
> + *   display engine and connected monitor, and the "max bpc" property.
> + *   Drivers shall use drm_connector_attach_active_bpc_property() to install
> + *   this property.
> + *
>   * Connectors also have one standardized atomic property:
>   *
>   * CRTC_ID:
> @@ -2152,6 +2160,49 @@ int drm_connector_attach_max_bpc_property(struct drm_connector *connector,
>  }
>  EXPORT_SYMBOL(drm_connector_attach_max_bpc_property);
>  
> +/**
> + * drm_connector_attach_active_bpc_property - attach "active bpc" property
> + * @connector: connector to attach active bpc property on.
> + * @min: The minimum bit depth supported by the connector.
> + * @max: The maximum bit depth supported by the connector.
> + *
> + * This is used to check the applied bit depth on a connector.
> + *
> + * Returns:
> + * Zero on success, negative errno on failure.
> + */
> +int drm_connector_attach_active_bpc_property(struct drm_connector *connector, int min, int max)
> +{
> + struct drm_device *dev = connector->dev;
> + struct drm_property *prop;
> +
> + if (!connector->active_bpc_property) {
> + prop = drm_property_create_range(dev, DRM_MODE_PROP_IMMUTABLE, "active bpc",
> +  min, max);
> + if (!prop)
> + return -ENOMEM;
> +
> + connector->active_bpc_property = prop;
> + drm_object_attach_property(&connector->base, prop, 0);
> + }
> +
> + return 0;
> +}
> +EXPORT_SYMBOL(drm_connector_attach_active_bpc_property);
> +
> +/**
> + * drm_connector_set_active_bpc_property - sets the active bits per color property for a connector
> + * @connector: drm connector
> + * @active_bpc: bits per color for the connector currently active on "the cable"
> + *
> + * Should be used by atomic drivers to update the active bits per color over a connector.
> + */
> +void drm_connector_set_active_bpc_property(struct drm_connector *connector, int active_bpc)
> +{
> + drm_object_property_set_value(&connector->base, connector->active_bpc_property, active_bpc);
> +}
> +EXPORT_SYMBOL(drm_connector_set_active_bpc_property);
> +
>  /**
>   * drm_connector_attach_hdr_output_metadata_property - attach "HDR_OUTPUT_METADATA" property
>   * @connector: connector to attach the property on.
> diff --git a/include/drm/drm_connector.h b/include/drm/drm_connector.h
> index 714d1a01c065..eee86de62a5f 100644
> --- a/include/drm/drm_connector.h
> +++ b/include/drm/drm_connector.h
> @@ -1380,6 +1380,12 @@ struct drm_connector {
>*/
>   struct drm_property *max_bpc_property;
>  
> + /**
> +  * @active_bpc_property: Default connector property for the active bpc
> +  * to be driven out of the connector.
> +  */
> + struct drm_property *active_bpc_property;
> +
>  #define DRM_CONNECTOR_POLL_HPD (1 << 0)
>  #define DRM_CONNECTOR_POLL_CONNECT (1 << 1)
>  #define DRM_CONNECTOR_POLL_DISCONNECT (1 << 2)
> @@ -1702,6 +1708,8 @@ int drm_connector_set_panel_orientation_with_quirk(
>   int width, int height);
>  int drm_connector_attach_max_bpc_property(struct drm_connector *connector,
> int min, int max);
> +int drm_connector_attach_active_bpc_property(struct drm_connector *connector, int min, int max);
> +void drm_connector_set_active_bpc_property(struct drm_connector *connector, int active_bpc);
>  
>  /**
>   * struct drm_tile_group - Tile group metadata
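
Since "active bpc" is read-only from userspace, a client can simply walk
the connector's property list to read the value back. A minimal sketch,
assuming libdrm and an already-opened DRM file descriptor; the helper
name print_active_bpc() is made up for illustration.

#include <inttypes.h>
#include <stdio.h>
#include <string.h>
#include <xf86drm.h>
#include <xf86drmMode.h>

/* Illustrative sketch: look up the "active bpc" property on a connector
 * and print its current value. */
static void print_active_bpc(int fd, uint32_t connector_id)
{
	drmModeObjectProperties *props;
	uint32_t i;

	props = drmModeObjectGetProperties(fd, connector_id,
					   DRM_MODE_OBJECT_CONNECTOR);
	if (!props)
		return;

	for (i = 0; i < props->count_props; i++) {
		drmModePropertyPtr prop = drmModeGetProperty(fd, props->props[i]);

		if (prop && !strcmp(prop->name, "active bpc"))
			printf("active bpc: %" PRIu64 "\n", props->prop_values[i]);
		drmModeFreeProperty(prop);
	}
	drmModeFreeObjectProperties(props);
}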


Re: [Linaro-mm-sig] [PATCH v3 1/2] habanalabs: define uAPI to export FD for DMA-BUF

2021-06-22 Thread Christian König

On 2021-06-22 at 01:29, Jason Gunthorpe wrote:

On Mon, Jun 21, 2021 at 10:24:16PM +0300, Oded Gabbay wrote:


Another thing I want to emphasize is that we are doing p2p only
through the export/import of the FD. We do *not* allow the user to
mmap the dma-buf as we do not support direct IO. So there is no access
to these pages from userspace.

Arguably mmapping the memory is a better choice, and is the direction
that Logan's series goes in. Here the use of DMABUF was specifically
designed to allow hitless revocation of the memory, which this isn't
even using.


The major problem with this approach is that DMA-buf is also used for 
memory which isn't CPU accessible.


That was one of the reasons we didn't even consider using the
memory-mapping approach for GPUs.


Regards,
Christian.



So you are taking the hit of very limited hardware support and reduced
performance just to squeeze into DMABUF..

Jason

