date:20231207

RE: [PATCH v6 3/5] mm/gup: Introduce memfd_pin_user_pages() for pinning memfd pages (v6)

2023-12-07 Thread Kasireddy, Vivek

Hi David,

> >
> >> On 05.12.23 06:35, Vivek Kasireddy wrote:
> >>> For drivers that would like to longterm-pin the pages associated
> >>> with a memfd, the pin_user_pages_fd() API provides an option to
> >>> not only pin the pages via FOLL_PIN but also to check and migrate
> >>> them if they reside in movable zone or CMA block. This API
> >>> currently works with memfds but it should work with any files
> >>> that belong to either shmemfs or hugetlbfs. Files belonging to
> >>> other filesystems are rejected for now.
> >>>
> >>> The pages need to be located first before pinning them via FOLL_PIN.
> >>> If they are found in the page cache, they can be immediately pinned.
> >>> Otherwise, they need to be allocated using the filesystem specific
> >>> APIs and then pinned.
> >>>
> >>> v2:
> >>> - Drop gup_flags and improve comments and commit message (David)
> >>> - Allocate a page if we cannot find in page cache for the hugetlbfs
> >>> case as well (David)
> >>> - Don't unpin pages if there is a migration related failure (David)
> >>> - Drop the unnecessary nr_pages <= 0 check (Jason)
> >>> - Have the caller of the API pass in file * instead of fd (Jason)
> >>>
> >>> v3: (David)
> >>> - Enclose the huge page allocation code with #ifdef
> >> CONFIG_HUGETLB_PAGE
> >>> (Build error reported by kernel test robot )
> >>> - Don't forget memalloc_pin_restore() on non-migration related errors
> >>> - Improve the readability of the cleanup code associated with
> >>> non-migration related errors
> >>> - Augment the comments by describing FOLL_LONGTERM like behavior
> >>> - Include the R-b tag from Jason
> >>>
> >>> v4:
> >>> - Remove the local variable "page" and instead use 3 return statements
> >>> in alloc_file_page() (David)
> >>> - Add the R-b tag from David
> >>>
> >>> v5: (David)
> >>> - For hugetlb case, ensure that we only obtain head pages from the
> >>> mapping by using __filemap_get_folio() instead of
> find_get_page_flags()
> >>> - Handle -EEXIST when two or more potential users try to simultaneously
> >>> add a huge page to the mapping by forcing them to retry on failure
> >>>
> >>> v6: (Christoph)
> >>> - Rename this API to memfd_pin_user_pages() to make it clear that it
> >>> is intended for memfds
> >>> - Move the memfd page allocation helper from gup.c to memfd.c
> >>> - Fix indentation errors in memfd_pin_user_pages()
> >>> - For contiguous ranges of folios, use a helper such as
> >>> filemap_get_folios_contig() to lookup the page cache in batches
> >>>
> >>> Cc: David Hildenbrand 
> >>> Cc: Christoph Hellwig 
> >>> Cc: Daniel Vetter 
> >>> Cc: Mike Kravetz 
> >>> Cc: Hugh Dickins 
> >>> Cc: Peter Xu 
> >>> Cc: Gerd Hoffmann 
> >>> Cc: Dongwon Kim 
> >>> Cc: Junxiao Chang 
> >>> Suggested-by: Jason Gunthorpe 
> >>> Reviewed-by: Jason Gunthorpe  (v2)
> >>> Reviewed-by: David Hildenbrand  (v3)
> >>> Signed-off-by: Vivek Kasireddy 
> >>> ---
> >>>include/linux/memfd.h |   5 +++
> >>>include/linux/mm.h|   2 +
> >>>mm/gup.c  | 102
> ++
> >>>mm/memfd.c|  34 ++
> >>>4 files changed, 143 insertions(+)
> >>>
> >>> diff --git a/include/linux/memfd.h b/include/linux/memfd.h
> >>> index e7abf6fa4c52..6fc0d1282151 100644
> >>> --- a/include/linux/memfd.h
> >>> +++ b/include/linux/memfd.h
> >>> @@ -6,11 +6,16 @@
> >>>
> >>>#ifdef CONFIG_MEMFD_CREATE
> >>>extern long memfd_fcntl(struct file *file, unsigned int cmd, unsigned 
> >>> int
> >> arg);
> >>> +extern struct page *memfd_alloc_page(struct file *memfd, pgoff_t idx);
> >>>#else
> >>>static inline long memfd_fcntl(struct file *f, unsigned int c, 
> >>> unsigned int
> a)
> >>>{
> >>>   return -EINVAL;
> >>>}
> >>> +static inline struct page *memfd_alloc_page(struct file *memfd, pgoff_t
> >> idx)
> >>> +{
> >>> + return ERR_PTR(-EINVAL);
> >>> +}
> >>>#endif
> >>>
> >>>#endif /* __LINUX_MEMFD_H */
> >>> diff --git a/include/linux/mm.h b/include/linux/mm.h
> >>> index 418d26608ece..ac69db45509f 100644
> >>> --- a/include/linux/mm.h
> >>> +++ b/include/linux/mm.h
> >>> @@ -2472,6 +2472,8 @@ long get_user_pages_unlocked(unsigned long
> >> start, unsigned long nr_pages,
> >>>   struct page **pages, unsigned int gup_flags);
> >>>long pin_user_pages_unlocked(unsigned long start, unsigned long
> >> nr_pages,
> >>>   struct page **pages, unsigned int gup_flags);
> >>> +long memfd_pin_user_pages(struct file *file, pgoff_t start,
> >>> +   unsigned long nr_pages, struct page **pages);
> >>>
> >>>int get_user_pages_fast(unsigned long start, int nr_pages,
> >>>   unsigned int gup_flags, struct page **pages);
> >>> diff --git a/mm/gup.c b/mm/gup.c
> >>> index 231711efa390..eb93d1ec9dc6 100644
> >>> --- a/mm/gup.c
> >>> +++ b/mm/gup.c
> >>> @@ -5,6 +5,7 @@
> >>>#include 
> >>>
> >>>#include 
> >>> +#include 
>

Re: [PATCH v1 1/2] dt-bindings: display: bridge: cdns: Add properties to support StarFive JH7110 SoC

2023-12-07 Thread Shengyang Chen

Hi, Rob

Thanks for review and comment.

On 2023/11/28 23:52, Rob Herring wrote:
> On Mon, Nov 27, 2023 at 07:34:35PM +0800, Shengyang Chen wrote:
>> From: Keith Zhao 
>> 
>> Add properties in CDNS DSI yaml file to match with
>> CDNS DSI module in StarFive JH7110 SoC.
>> 
>> Signed-off-by: Keith Zhao 
>> ---
>>  .../bindings/display/bridge/cdns,dsi.yaml | 38 ++-
>>  1 file changed, 36 insertions(+), 2 deletions(-)
>> 
>> diff --git a/Documentation/devicetree/bindings/display/bridge/cdns,dsi.yaml 
>> b/Documentation/devicetree/bindings/display/bridge/cdns,dsi.yaml
>> index 23060324d16e..3f02ee383aad 100644
>> --- a/Documentation/devicetree/bindings/display/bridge/cdns,dsi.yaml
>> +++ b/Documentation/devicetree/bindings/display/bridge/cdns,dsi.yaml
>> @@ -17,6 +17,7 @@ properties:
>>  enum:
>>- cdns,dsi
>>- ti,j721e-dsi
>> +  - starfive,cdns-dsi
>>  
>>reg:
>>  minItems: 1
>> @@ -27,14 +28,20 @@ properties:
>>Register block for wrapper settings registers in case of TI J7 
>> SoCs.
>>  
>>clocks:
>> +minItems: 2
>>  items:
>>- description: PSM clock, used by the IP
>>- description: sys clock, used by the IP
>> +  - description: apb clock, used by the IP
>> +  - description: txesc clock, used by the IP
>>  
>>clock-names:
>> +minItems: 2
>>  items:
>>- const: dsi_p_clk
>>- const: dsi_sys_clk
>> +  - const: apb
>> +  - const: txesc
>>  
>>phys:
>>  maxItems: 1
>> @@ -46,10 +53,21 @@ properties:
>>  maxItems: 1
>>  
>>resets:
>> -maxItems: 1
>> +minItems: 1
>> +items:
>> +  - description: dsi sys reset line
>> +  - description: dsi dpi reset line
>> +  - description: dsi apb reset line
>> +  - description: dsi txesc reset line
>> +  - description: dsi txbytehs reset line
>>  
>>reset-names:
>> -const: dsi_p_rst
>> +items:
>> +  - const: dsi_p_rst
>> +  - const: dsi_dpi
>> +  - const: dsi_apb
>> +  - const: dsi_txesc
>> +  - const: dsi_txbytehs
> 
> Let's not continue the redundant 'dsi_' prefix. We're stuck with it for 
> the first one, but not the new ones.
> 

OK, the "dsi_" prefix will be dropped from the new entries in the next commit.

> Rob

thanks.

Best Regards,
Shengyang

Re: [PATCH v1 1/2] dt-bindings: display: bridge: cdns: Add properties to support StarFive JH7110 SoC

2023-12-07 Thread Shengyang Chen




On 2023/11/27 20:23, Krzysztof Kozlowski wrote:
> On 27/11/2023 12:34, Shengyang Chen wrote:
>> From: Keith Zhao 
>> 
>> Add properties in CDNS DSI yaml file to match with
>> CDNS DSI module in StarFive JH7110 SoC.
>> 
>> Signed-off-by: Keith Zhao 
>> ---
>>  .../bindings/display/bridge/cdns,dsi.yaml | 38 ++-
>>  1 file changed, 36 insertions(+), 2 deletions(-)
>> 
>> diff --git a/Documentation/devicetree/bindings/display/bridge/cdns,dsi.yaml 
>> b/Documentation/devicetree/bindings/display/bridge/cdns,dsi.yaml
>> index 23060324d16e..3f02ee383aad 100644
>> --- a/Documentation/devicetree/bindings/display/bridge/cdns,dsi.yaml
>> +++ b/Documentation/devicetree/bindings/display/bridge/cdns,dsi.yaml
>> @@ -17,6 +17,7 @@ properties:
>>  enum:
>>- cdns,dsi
>>- ti,j721e-dsi
>> +  - starfive,cdns-dsi
> 
> BTW, one more thing, I really doubt that starfive created "cdns" block.
> "cdns" is vendor prefix. Use SoCs-specific compatibles.
> 

The StarFive SoC contains the cdns dsi IP; StarFive did not create the cdns block.
Sorry about that.
It will be fixed by using SoC-specific compatibles.
Thanks.

> Best regards,
> Krzysztof
>

Re: [PATCH] drm/crtc: Fix uninit-value bug in drm_mode_setcrtc

2023-12-07 Thread Harshit Mogalapalli


Hello,

On 21/07/23 9:44 pm, Ziqi Zhao wrote:

The connector_set contains uninitialized values when allocated with
kmalloc_array. However, in the "out" branch, the logic assumes that any
element in connector_set would be equal to NULL if it failed to
initialize, which causes the bug reported by Syzbot. The fix is to use
an extra variable to keep track of how many connectors were actually
initialized, and to use that variable when dropping refcounts in the
"out" branch.
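
A minimal userspace sketch of the counting pattern described above, not the kernel code itself. The names struct obj, obj_get(), obj_put() and set_objects() are hypothetical stand-ins, and malloc() plays the role of kmalloc_array() in that it also leaves the array contents indeterminate. The cleanup loop is bounded by the number of entries that were actually written, so it never reads an uninitialized slot.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a refcounted object such as a connector. */
struct obj {
	int refcount;
};

static void obj_get(struct obj *o)
{
	o->refcount++;
}

static void obj_put(struct obj *o)
{
	assert(o->refcount > 0);
	o->refcount--;
}

/*
 * Take a reference on `want` objects from `pool`. A NULL pool entry
 * simulates a failed lookup. Returns 0 on success and -1 on failure.
 * In both cases every reference taken here is dropped before returning.
 */
static int set_objects(struct obj **pool, size_t want)
{
	struct obj **set;
	size_t i, num_set = 0;
	int ret = 0;

	if (want == 0)
		return 0;

	/* Like kmalloc_array(), malloc() does not zero the array. */
	set = malloc(want * sizeof(*set));
	if (!set)
		return -1;

	for (i = 0; i < want; i++) {
		if (!pool[i]) {
			ret = -1;
			goto out;
		}
		obj_get(pool[i]);
		set[i] = pool[i];
		num_set++;
	}

out:
	/* Only entries [0, num_set) were written; never touch the rest. */
	for (i = 0; i < num_set; i++)
		obj_put(set[i]);
	free(set);
	return ret;
}
```

Bounding the loop by num_set instead of the requested count is the design choice that matters here: it stays correct no matter where in the array the failure happens.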



This bug is reproducible on 6.7-rc3 with a KASAN-enabled kernel, where it
shows up as a wild memory access.


[  424.699429] general protection fault, probably for non-canonical 
address 0xfbf7c8b63d84d2a6:  [#1] PREEMPT SMP KASAN PTI
[  424.727952] KASAN: maybe wild-memory-access in range 
[0xdfbe65b1ec269530-0xdfbe65b1ec269537]

[  424.743794] CPU: 3 PID: 9040 Comm: r Not tainted 6.7.0-rc3+ #1
[  424.758855] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), 
BIOS 1.11.0-2.el7 04/01/2014

[  424.774845] RIP: 0010:drm_mode_object_put+0x27/0x50
[  424.782854] Code: 90 90 90 f3 0f 1e fa 0f 1f 44 00 00 55 48 89 fd e8 
ae 92 0b fd 48 8d 7d 18 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 
03 <80> 3c 02 00 75 1a 48 83 7d 18 00 74 0d e8 87 92 0b fd 48 89 ef e8

[  424.816805] RSP: 0018:8881199b7ad0 EFLAGS: 00010a06
[  424.830847] RAX: dc00 RBX: ed1023336fc3 RCX: 

[  424.844180] RDX: 1bf7ccb63d84d2a6 RSI:  RDI: 
dfbe65b1ec269530
[  424.854860] RBP: dfbe65b1ec269518 R08:  R09: 

[  424.870833] R10:  R11:  R12: 
dfbe65b1ec2694d8
[  424.886846] R13: dc00 R14: 8881060731c0 R15: 
0001
[  424.901889] FS:  7fecfc1ec740() GS:8881f3f8() 
knlGS:

[  424.910833] CS:  0010 DS:  ES:  CR0: 80050033
[  424.918929] CR2:  CR3: 000117e7c000 CR4: 
06f0

[  424.936058] Call Trace:
[  424.936058]  
[  424.936058]  ? show_regs+0x9b/0xb0
[  424.950853]  ? die_addr+0x55/0xe0
[  424.950853]  ? exc_general_protection+0x1a4/0x320
[  424.965905]  ? asm_exc_general_protection+0x26/0x30
[  424.974878]  ? drm_mode_object_put+0x27/0x50
[  424.982866]  drm_mode_setcrtc+0x7ec/0x1630
[  424.990875]  ? __pfx_drm_mode_setcrtc+0x10/0x10
[  424.998877]  ? ww_mutex_lock+0x9a/0x1c0
[  425.006852]  ? __pfx_ww_mutex_lock+0x10/0x10
[  425.014875]  ? __drm_dev_dbg+0xbd/0x1a0
[  425.014875]  ? __pfx___drm_dev_dbg+0x10/0x10
[  425.030321]  ? drm_lease_owner+0x44/0x60
[  425.030981]  drm_ioctl_kernel+0x2a0/0x500
[  425.040058]  ? __pfx_drm_mode_setcrtc+0x10/0x10
[  425.048128]  ? __pfx_drm_ioctl_kernel+0x10/0x10
[  425.055809]  drm_ioctl+0x58a/0xb60
[  425.062876]  ? __pfx_drm_mode_setcrtc+0x10/0x10
[  425.070875]  ? __pfx_drm_ioctl+0x10/0x10
[  425.078875]  ? __pfx_do_sys_openat2+0x10/0x10
[  425.086875]  ? selinux_file_ioctl+0x184/0x270
[  425.099093]  ? selinux_file_ioctl+0xba/0x270
[  425.102865]  ? __pfx_drm_ioctl+0x10/0x10
[  425.111092]  __x64_sys_ioctl+0x1b1/0x220
[  425.119055]  do_syscall_64+0x45/0x100
[  425.127106]  entry_SYSCALL_64_after_hwframe+0x6e/0x76
[  425.135102] RIP: 0033:0x7fecfb6f8289


After applying this patch, the bug is not reproducible.


Thanks,
Harshit





Reported-by: syzbot+4fad2e57beb6397ab...@syzkaller.appspotmail.com
Signed-off-by: Ziqi Zhao 
---
  drivers/gpu/drm/drm_crtc.c | 9 +
  1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/drm_crtc.c b/drivers/gpu/drm/drm_crtc.c
index df9bf3c9206e..d718c17ab1e9 100644
--- a/drivers/gpu/drm/drm_crtc.c
+++ b/drivers/gpu/drm/drm_crtc.c
@@ -715,8 +715,7 @@ int drm_mode_setcrtc(struct drm_device *dev, void *data,
struct drm_mode_set set;
uint32_t __user *set_connectors_ptr;
struct drm_modeset_acquire_ctx ctx;
-   int ret;
-   int i;
+   int ret, i, num_connectors;
  
  	if (!drm_core_check_feature(dev, DRIVER_MODESET))

return -EOPNOTSUPP;
@@ -851,6 +850,7 @@ int drm_mode_setcrtc(struct drm_device *dev, void *data,
goto out;
}
  
+		num_connectors = 0;

for (i = 0; i < crtc_req->count_connectors; i++) {
connector_set[i] = NULL;
set_connectors_ptr = (uint32_t __user *)(unsigned 
long)crtc_req->set_connectors_ptr;
@@ -871,6 +871,7 @@ int drm_mode_setcrtc(struct drm_device *dev, void *data,
connector->name);
  
  			connector_set[i] = connector;

+   num_connectors++;
}
}
  
@@ -879,7 +880,7 @@ int drm_mode_setcrtc(struct drm_device *dev, void *data,

set.y = crtc_req->y;
set.mode = mode;
set.connectors = connector_set;
-   set.num_connectors = crtc_req->count_connectors;
+   set.num_connectors = num_connectors;
set.fb = fb;
  
  	if (drm_drv_uses_atomic_modeset(dev))

@@ -892,7 +893,7 @@ int

Re: [PATCH v1 1/2] dt-bindings: display: bridge: cdns: Add properties to support StarFive JH7110 SoC

2023-12-07 Thread Shengyang Chen

Hi, Krzysztof

Thanks for review and comment.

On 2023/11/27 20:22, Krzysztof Kozlowski wrote:
> On 27/11/2023 12:34, Shengyang Chen wrote:
>> From: Keith Zhao 
>> 
>> Add properties in CDNS DSI yaml file to match with
>> CDNS DSI module in StarFive JH7110 SoC.
>> 
>> Signed-off-by: Keith Zhao 
>> ---
>>  .../bindings/display/bridge/cdns,dsi.yaml | 38 ++-
>>  1 file changed, 36 insertions(+), 2 deletions(-)
>> 
>> diff --git a/Documentation/devicetree/bindings/display/bridge/cdns,dsi.yaml 
>> b/Documentation/devicetree/bindings/display/bridge/cdns,dsi.yaml
>> index 23060324d16e..3f02ee383aad 100644
>> --- a/Documentation/devicetree/bindings/display/bridge/cdns,dsi.yaml
>> +++ b/Documentation/devicetree/bindings/display/bridge/cdns,dsi.yaml
>> @@ -17,6 +17,7 @@ properties:
>>  enum:
>>- cdns,dsi
>>- ti,j721e-dsi
>> +  - starfive,cdns-dsi
> 
> Keep alphabetical order.
> 

OK, will keep it in alphabetical order.

>>  
>>reg:
>>  minItems: 1
>> @@ -27,14 +28,20 @@ properties:
>>Register block for wrapper settings registers in case of TI J7 
>> SoCs.
>>  
>>clocks:
>> +minItems: 2
>>  items:
>>- description: PSM clock, used by the IP
>>- description: sys clock, used by the IP
>> +  - description: apb clock, used by the IP
>> +  - description: txesc clock, used by the IP
>>  
>>clock-names:
>> +minItems: 2
>>  items:
>>- const: dsi_p_clk
>>- const: dsi_sys_clk
>> +  - const: apb
>> +  - const: txesc
>>  
>>phys:
>>  maxItems: 1
>> @@ -46,10 +53,21 @@ properties:
>>  maxItems: 1
>>  
>>resets:
>> -maxItems: 1
>> +minItems: 1
>> +items:
>> +  - description: dsi sys reset line
>> +  - description: dsi dpi reset line
>> +  - description: dsi apb reset line
>> +  - description: dsi txesc reset line
>> +  - description: dsi txbytehs reset line
>>  
>>reset-names:
>> -const: dsi_p_rst
>> +items:
>> +  - const: dsi_p_rst
>> +  - const: dsi_dpi
>> +  - const: dsi_apb
>> +  - const: dsi_txesc
>> +  - const: dsi_txbytehs
>>  
>>ports:
>>  $ref: /schemas/graph.yaml#/properties/ports
>> @@ -90,6 +108,22 @@ allOf:
>>  reg:
>>maxItems: 1
>>  
> 
> You need to restrict other variants, because you just relaxed several
> properties for everyone...
> 
> 

ok, will fix it

> Best regards,
> Krzysztof
> 

thanks.

Best Regards,
Shengyang

[PATCH v2 16/16] drm/msm/dpu: add cdm blocks to dpu snapshot

2023-12-07 Thread Abhinav Kumar

Now that CDM block support has been added to DPU, let's also add its
entry to the DPU snapshot to help debugging.

Signed-off-by: Abhinav Kumar 
Reviewed-by: Dmitry Baryshkov 
---
 drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c
index dc24fe4bb3b0..59647ad19906 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c
@@ -947,6 +947,10 @@ static void dpu_kms_mdp_snapshot(struct msm_disp_state 
*disp_state, struct msm_k
}
}
 
+   if (cat->cdm)
+   msm_disp_snapshot_add_block(disp_state, cat->cdm->len,
+   dpu_kms->mmio + cat->cdm->base, 
cat->cdm->name);
+
pm_runtime_put_sync(&dpu_kms->pdev->dev);
 }
 
-- 
2.40.1

[PATCH v2 15/16] drm/msm/dpu: introduce separate wb2_format arrays for rgb and yuv

2023-12-07 Thread Abhinav Kumar

Let's rename the existing wb2_formats array to wb2_formats_rgb to indicate
that it has only RGB formats and can be used on any chipset having a WB
block.

Introduce a new wb2_formats_rgb_yuv array to the catalog to
indicate support for YUV formats to writeback in addition to RGB.

Chipsets that support the CDM block will use the newly added
wb2_formats_rgb_yuv array.
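
A rough sketch of the split described above, under stated assumptions. The enum fmt values, struct wb_cfg, wb_cfg_for() and wb_supports() are hypothetical and only illustrate the idea; in the patch the arrays hold DRM format codes and each chipset catalog entry names its array statically rather than choosing at run time.

```c
#include <stdbool.h>
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical format codes; the driver uses DRM fourcc values. */
enum fmt { FMT_RGB565, FMT_XRGB8888, FMT_NV12 };

/* RGB-only list, usable on any chipset with a writeback block. */
static const enum fmt wb2_formats_rgb[] = { FMT_RGB565, FMT_XRGB8888 };

/* RGB plus YUV, for chipsets that also have a CDM block. */
static const enum fmt wb2_formats_rgb_yuv[] = {
	FMT_RGB565, FMT_XRGB8888, FMT_NV12,
};

struct wb_cfg {
	const enum fmt *format_list;
	size_t num_formats;
};

/* Hypothetical helper: pick the format list based on CDM availability. */
static struct wb_cfg wb_cfg_for(bool has_cdm)
{
	struct wb_cfg cfg;

	if (has_cdm) {
		cfg.format_list = wb2_formats_rgb_yuv;
		cfg.num_formats = ARRAY_SIZE(wb2_formats_rgb_yuv);
	} else {
		cfg.format_list = wb2_formats_rgb;
		cfg.num_formats = ARRAY_SIZE(wb2_formats_rgb);
	}
	return cfg;
}

/* Linear search is sufficient for lists this short. */
static bool wb_supports(const struct wb_cfg *cfg, enum fmt f)
{
	size_t i;

	for (i = 0; i < cfg->num_formats; i++)
		if (cfg->format_list[i] == f)
			return true;
	return false;
}
```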

Signed-off-by: Abhinav Kumar 
---
 .../msm/disp/dpu1/catalog/dpu_10_0_sm8650.h   |  4 +-
 .../msm/disp/dpu1/catalog/dpu_6_0_sm8250.h|  4 +-
 .../msm/disp/dpu1/catalog/dpu_6_2_sc7180.h|  4 +-
 .../msm/disp/dpu1/catalog/dpu_7_2_sc7280.h|  4 +-
 .../msm/disp/dpu1/catalog/dpu_9_0_sm8550.h|  4 +-
 .../gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c| 37 ++-
 6 files changed, 46 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_10_0_sm8650.h 
b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_10_0_sm8650.h
index 04d2a73dd942..eb5dfff2ec4f 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_10_0_sm8650.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_10_0_sm8650.h
@@ -341,8 +341,8 @@ static const struct dpu_wb_cfg sm8650_wb[] = {
.name = "wb_2", .id = WB_2,
.base = 0x65000, .len = 0x2c8,
.features = WB_SM8250_MASK,
-   .format_list = wb2_formats,
-   .num_formats = ARRAY_SIZE(wb2_formats),
+   .format_list = wb2_formats_rgb,
+   .num_formats = ARRAY_SIZE(wb2_formats_rgb),
.xin_id = 6,
.vbif_idx = VBIF_RT,
.maxlinewidth = 4096,
diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_0_sm8250.h 
b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_0_sm8250.h
index 58b0f50518c8..a57d50b1f028 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_0_sm8250.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_0_sm8250.h
@@ -336,8 +336,8 @@ static const struct dpu_wb_cfg sm8250_wb[] = {
.name = "wb_2", .id = WB_2,
.base = 0x65000, .len = 0x2c8,
.features = WB_SM8250_MASK,
-   .format_list = wb2_formats,
-   .num_formats = ARRAY_SIZE(wb2_formats),
+   .format_list = wb2_formats_rgb_yuv,
+   .num_formats = ARRAY_SIZE(wb2_formats_rgb_yuv),
.clk_ctrl = DPU_CLK_CTRL_WB2,
.xin_id = 6,
.vbif_idx = VBIF_RT,
diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_2_sc7180.h 
b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_2_sc7180.h
index bcfedfc8251a..7382ebb6e5b2 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_2_sc7180.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_2_sc7180.h
@@ -157,8 +157,8 @@ static const struct dpu_wb_cfg sc7180_wb[] = {
.name = "wb_2", .id = WB_2,
.base = 0x65000, .len = 0x2c8,
.features = WB_SM8250_MASK,
-   .format_list = wb2_formats,
-   .num_formats = ARRAY_SIZE(wb2_formats),
+   .format_list = wb2_formats_rgb,
+   .num_formats = ARRAY_SIZE(wb2_formats_rgb),
.clk_ctrl = DPU_CLK_CTRL_WB2,
.xin_id = 6,
.vbif_idx = VBIF_RT,
diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_7_2_sc7280.h 
b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_7_2_sc7280.h
index 19c2b7454796..2f153e0b5c6a 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_7_2_sc7280.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_7_2_sc7280.h
@@ -169,8 +169,8 @@ static const struct dpu_wb_cfg sc7280_wb[] = {
.name = "wb_2", .id = WB_2,
.base = 0x65000, .len = 0x2c8,
.features = WB_SM8250_MASK,
-   .format_list = wb2_formats,
-   .num_formats = ARRAY_SIZE(wb2_formats),
+   .format_list = wb2_formats_rgb_yuv,
+   .num_formats = ARRAY_SIZE(wb2_formats_rgb_yuv),
.clk_ctrl = DPU_CLK_CTRL_WB2,
.xin_id = 6,
.vbif_idx = VBIF_RT,
diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_9_0_sm8550.h 
b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_9_0_sm8550.h
index bf56265967c0..ad48defa154f 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_9_0_sm8550.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_9_0_sm8550.h
@@ -315,8 +315,8 @@ static const struct dpu_wb_cfg sm8550_wb[] = {
.name = "wb_2", .id = WB_2,
.base = 0x65000, .len = 0x2c8,
.features = WB_SM8250_MASK,
-   .format_list = wb2_formats,
-   .num_formats = ARRAY_SIZE(wb2_formats),
+   .format_list = wb2_formats_rgb,
+   .num_formats = ARRAY_SIZE(wb2_formats_rgb),
.xin_id = 6,
.vbif_idx = VBIF_RT,
.maxlinewidth = 4096,
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c
index

[PATCH v2 14/16] drm/msm/dpu: reserve cdm blocks for writeback in case of YUV output

2023-12-07 Thread Abhinav Kumar

Reserve CDM blocks for writeback if the format of the output fb
is YUV. At the moment, the reservation is done only for writeback
but can easily be extended by relaxing the checks once other
interfaces are ready to output YUV.

changes in v2:
- use needs_cdm from topology struct
- drop fb related checks from atomic_mode_set()

Signed-off-by: Abhinav Kumar 
---
 drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c | 27 +
 1 file changed, 27 insertions(+)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
index 862912727925..a576e3e62429 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
@@ -16,6 +16,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "msm_drv.h"
 #include "dpu_kms.h"
@@ -583,6 +584,7 @@ static int dpu_encoder_virt_atomic_check(
struct drm_display_mode *adj_mode;
struct msm_display_topology topology;
struct dpu_global_state *global_state;
+   struct drm_framebuffer *fb;
struct drm_dsc_config *dsc;
int i = 0;
int ret = 0;
@@ -623,6 +625,22 @@ static int dpu_encoder_virt_atomic_check(
 
topology = dpu_encoder_get_topology(dpu_enc, dpu_kms, adj_mode, 
crtc_state, dsc);
 
+   /*
+* Use CDM only for writeback at the moment as other interfaces cannot 
handle it.
+* if writeback itself cannot handle cdm for some reason it will fail 
in its atomic_check()
+* earlier.
+*/
+   if (dpu_enc->disp_info.intf_type == INTF_WB && 
conn_state->writeback_job) {
+   fb = conn_state->writeback_job->fb;
+
+   if (fb && 
DPU_FORMAT_IS_YUV(to_dpu_format(msm_framebuffer_format(fb))))
+   topology.needs_cdm = true;
+   if (topology.needs_cdm && !dpu_enc->cur_master->hw_cdm)
+   crtc_state->mode_changed = true;
+   else if (!topology.needs_cdm && dpu_enc->cur_master->hw_cdm)
+   crtc_state->mode_changed = true;
+   }
+
/*
 * Release and Allocate resources on every modeset
 * Dont allocate when active is false.
@@ -1063,6 +1081,15 @@ static void dpu_encoder_virt_atomic_mode_set(struct 
drm_encoder *drm_enc,
 
dpu_enc->dsc_mask = dsc_mask;
 
+   if (dpu_enc->disp_info.intf_type == INTF_WB && 
conn_state->writeback_job) {
+   struct dpu_hw_blk *hw_cdm = NULL;
+
+   dpu_rm_get_assigned_resources(&dpu_kms->rm, global_state,
+ drm_enc->base.id, DPU_HW_BLK_CDM,
+ &hw_cdm, 1);
+   dpu_enc->cur_master->hw_cdm = hw_cdm ? to_dpu_hw_cdm(hw_cdm) : 
NULL;
+   }
+
cstate = to_dpu_crtc_state(crtc_state);
 
for (i = 0; i < num_lm; i++) {
-- 
2.40.1
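
The check added to atomic_check() in the patch above can be sketched as a small pure function. This is a simplified illustration, assuming the four inputs have already been extracted from the encoder and connector state. struct cdm_check_in, struct cdm_check_out and cdm_check() are hypothetical names that do not exist in the driver.

```c
#include <stdbool.h>

/* Hypothetical, simplified inputs; the driver reads these from its state. */
struct cdm_check_in {
	bool is_writeback;      /* interface type is writeback */
	bool has_writeback_job; /* the connector state carries a job */
	bool fb_is_yuv;         /* output framebuffer format is YUV */
	bool cdm_assigned;      /* a CDM block is currently assigned */
};

struct cdm_check_out {
	bool needs_cdm;    /* request a CDM block in the topology */
	bool mode_changed; /* this check asks for a full modeset */
};

static struct cdm_check_out cdm_check(struct cdm_check_in in)
{
	struct cdm_check_out out = { false, false };

	/* CDM is considered only for writeback with a job attached. */
	if (in.is_writeback && in.has_writeback_job) {
		out.needs_cdm = in.fb_is_yuv;
		/*
		 * Needing a CDM without having one, or holding one that is
		 * no longer needed, both force resources to be reassigned.
		 */
		out.mode_changed = (out.needs_cdm != in.cdm_assigned);
	}
	return out;
}
```

Note that the patch only ever sets mode_changed to true; the sketch reports whether this particular check would request it, and leaves combining that with other reasons for a modeset to the caller.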

[PATCH v2 13/16] drm/msm/dpu: plug-in the cdm related bits to writeback setup

2023-12-07 Thread Abhinav Kumar

To set up and enable the CDM block for the writeback pipeline, let's
add the pieces together to set the active bits and the flush
bits for the CDM block.

changes in v2:
- passed the cdm idx to update_pending_flush_cdm()
  (have retained the R-b as it's a minor change)

Signed-off-by: Abhinav Kumar 
Reviewed-by: Dmitry Baryshkov 
---
 drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_wb.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_wb.c 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_wb.c
index 85429c62d727..0cc2c3ee491f 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_wb.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_wb.c
@@ -214,6 +214,7 @@ static void dpu_encoder_phys_wb_setup_ctl(struct 
dpu_encoder_phys *phys_enc)
 {
struct dpu_hw_wb *hw_wb;
struct dpu_hw_ctl *ctl;
+   struct dpu_hw_cdm *hw_cdm;
 
if (!phys_enc) {
DPU_ERROR("invalid encoder\n");
@@ -222,6 +223,7 @@ static void dpu_encoder_phys_wb_setup_ctl(struct 
dpu_encoder_phys *phys_enc)
 
hw_wb = phys_enc->hw_wb;
ctl = phys_enc->hw_ctl;
+   hw_cdm = phys_enc->hw_cdm;
 
if (test_bit(DPU_CTL_ACTIVE_CFG, &ctl->caps->features) &&
(phys_enc->hw_ctl &&
@@ -238,6 +240,9 @@ static void dpu_encoder_phys_wb_setup_ctl(struct 
dpu_encoder_phys *phys_enc)
if (mode_3d && hw_pp && hw_pp->merge_3d)
intf_cfg.merge_3d = hw_pp->merge_3d->idx;
 
+   if (hw_cdm)
+   intf_cfg.cdm = hw_cdm->idx;
+
if (phys_enc->hw_pp->merge_3d && 
phys_enc->hw_pp->merge_3d->ops.setup_3d_mode)

phys_enc->hw_pp->merge_3d->ops.setup_3d_mode(phys_enc->hw_pp->merge_3d,
mode_3d);
@@ -421,6 +426,7 @@ static void _dpu_encoder_phys_wb_update_flush(struct 
dpu_encoder_phys *phys_enc)
struct dpu_hw_wb *hw_wb;
struct dpu_hw_ctl *hw_ctl;
struct dpu_hw_pingpong *hw_pp;
+   struct dpu_hw_cdm *hw_cdm;
u32 pending_flush = 0;
 
if (!phys_enc)
@@ -429,6 +435,7 @@ static void _dpu_encoder_phys_wb_update_flush(struct 
dpu_encoder_phys *phys_enc)
hw_wb = phys_enc->hw_wb;
hw_pp = phys_enc->hw_pp;
hw_ctl = phys_enc->hw_ctl;
+   hw_cdm = phys_enc->hw_cdm;
 
DPU_DEBUG("[wb:%d]\n", hw_wb->idx - WB_0);
 
@@ -444,6 +451,9 @@ static void _dpu_encoder_phys_wb_update_flush(struct 
dpu_encoder_phys *phys_enc)
hw_ctl->ops.update_pending_flush_merge_3d(hw_ctl,
hw_pp->merge_3d->idx);
 
+   if (hw_cdm && hw_ctl->ops.update_pending_flush_cdm)
+   hw_ctl->ops.update_pending_flush_cdm(hw_ctl, hw_cdm->idx);
+
if (hw_ctl->ops.get_pending_flush)
pending_flush = hw_ctl->ops.get_pending_flush(hw_ctl);
 
-- 
2.40.1
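
The flush update in the patch above relies on optional callbacks: each op is invoked only when both the hardware block and the function pointer are present. A minimal standalone sketch of that pattern follows. The struct layouts, flush_cdm(), get_flush(), update_flush() and the one-bit-per-index flush layout are assumptions made for illustration, not the driver's actual definitions.

```c
#include <stddef.h>
#include <stdint.h>

struct ctl;

/* Optional per-hardware callbacks; a NULL pointer means "not supported". */
struct ctl_ops {
	void (*update_pending_flush_cdm)(struct ctl *ctl, unsigned int cdm_idx);
	uint32_t (*get_pending_flush)(struct ctl *ctl);
};

struct ctl {
	uint32_t pending_flush;
	struct ctl_ops ops;
};

struct cdm {
	unsigned int idx;
};

/* Hypothetical bit layout: one flush bit per CDM index. */
static void flush_cdm(struct ctl *ctl, unsigned int cdm_idx)
{
	ctl->pending_flush |= UINT32_C(1) << cdm_idx;
}

static uint32_t get_flush(struct ctl *ctl)
{
	return ctl->pending_flush;
}

/* Mirrors the guarded calls in the patch: skip anything that is absent. */
static uint32_t update_flush(struct ctl *ctl, const struct cdm *cdm)
{
	uint32_t pending = 0;

	if (cdm && ctl->ops.update_pending_flush_cdm)
		ctl->ops.update_pending_flush_cdm(ctl, cdm->idx);

	if (ctl->ops.get_pending_flush)
		pending = ctl->ops.get_pending_flush(ctl);

	return pending;
}
```

Checking the function pointer at the call site lets hardware revisions without a CDM flush bit leave the op unset instead of providing an empty stub.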

[PATCH v2 12/16] drm/msm/dpu: add an API to setup the CDM block for writeback

2023-12-07 Thread Abhinav Kumar

Add an API, dpu_encoder_helper_phys_setup_cdm(), which can be used by
the writeback encoder to set up the CDM block.

Currently, this is defined and used within the writeback's physical
encoder layer; however, the function can be modified to set up
the CDM block even for non-writeback interfaces.

Until those modifications are planned and made, keep it local to
writeback.

changes in v2:
- add the RGB2YUV CSC matrix to dpu util as needed by CDM
- use dpu_hw_get_csc_cfg() to get and program CSC
- drop usage of setup_csc_data() and setup_cdwn() cdm ops
  as they both have been merged into enable()
- drop redundant hw_cdm and hw_pp checks

Signed-off-by: Abhinav Kumar 
---
 .../gpu/drm/msm/disp/dpu1/dpu_encoder_phys.h  |  3 +
 .../drm/msm/disp/dpu1/dpu_encoder_phys_wb.c   | 96 ++-
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_util.c   | 17 
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_util.h   |  1 +
 4 files changed, 116 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys.h 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys.h
index 410f6225789c..1d6d1eb642b9 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys.h
@@ -16,6 +16,7 @@
 #include "dpu_hw_pingpong.h"
 #include "dpu_hw_ctl.h"
 #include "dpu_hw_top.h"
+#include "dpu_hw_cdm.h"
 #include "dpu_encoder.h"
 #include "dpu_crtc.h"
 
@@ -210,6 +211,7 @@ static inline int dpu_encoder_phys_inc_pending(struct 
dpu_encoder_phys *phys)
  * @wbirq_refcount: Reference count of writeback interrupt
  * @wb_done_timeout_cnt: number of wb done irq timeout errors
  * @wb_cfg:  writeback block config to store fb related details
+ * @cdm_cfg: cdm block config needed to store writeback block's CDM 
configuration
  * @wb_conn: backpointer to writeback connector
  * @wb_job: backpointer to current writeback job
  * @dest:   dpu buffer layout for current writeback output buffer
@@ -219,6 +221,7 @@ struct dpu_encoder_phys_wb {
atomic_t wbirq_refcount;
int wb_done_timeout_cnt;
struct dpu_hw_wb_cfg wb_cfg;
+   struct dpu_hw_cdm_cfg cdm_cfg;
struct drm_writeback_connector *wb_conn;
struct drm_writeback_job *wb_job;
struct dpu_hw_fmt_layout dest;
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_wb.c 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_wb.c
index 4665367cf14f..85429c62d727 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_wb.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_wb.c
@@ -259,6 +259,99 @@ static void dpu_encoder_phys_wb_setup_ctl(struct 
dpu_encoder_phys *phys_enc)
}
 }
 
+/**
+ * dpu_encoder_helper_phys_setup_cdm - setup chroma down sampling block
+ * @phys_enc:Pointer to physical encoder
+ */
+static void dpu_encoder_helper_phys_setup_cdm(struct dpu_encoder_phys 
*phys_enc)
+{
+   struct dpu_hw_cdm *hw_cdm;
+   struct dpu_hw_cdm_cfg *cdm_cfg;
+   struct dpu_hw_pingpong *hw_pp;
+   struct dpu_encoder_phys_wb *wb_enc;
+   const struct msm_format *format;
+   const struct dpu_format *dpu_fmt;
+   struct drm_writeback_job *wb_job;
+   int ret;
+
+   if (!phys_enc)
+   return;
+
+   wb_enc = to_dpu_encoder_phys_wb(phys_enc);
+   cdm_cfg = &wb_enc->cdm_cfg;
+   hw_pp = phys_enc->hw_pp;
+   hw_cdm = phys_enc->hw_cdm;
+   wb_job = wb_enc->wb_job;
+
+   format = msm_framebuffer_format(wb_enc->wb_job->fb);
+   dpu_fmt = dpu_get_dpu_format_ext(format->pixel_format, 
wb_job->fb->modifier);
+
+   if (!hw_cdm)
+   return;
+
+   if (!DPU_FORMAT_IS_YUV(dpu_fmt)) {
+   DPU_DEBUG("[enc:%d] cdm_disable fmt:%x\n", 
DRMID(phys_enc->parent),
+ dpu_fmt->base.pixel_format);
+   if (hw_cdm->ops.disable)
+   hw_cdm->ops.disable(hw_cdm);
+
+   return;
+   }
+
+   memset(cdm_cfg, 0, sizeof(struct dpu_hw_cdm_cfg));
+
+   cdm_cfg->output_width = wb_job->fb->width;
+   cdm_cfg->output_height = wb_job->fb->height;
+   cdm_cfg->output_fmt = dpu_fmt;
+   cdm_cfg->output_type = CDM_CDWN_OUTPUT_WB;
+   cdm_cfg->output_bit_depth = DPU_FORMAT_IS_DX(dpu_fmt) ?
+   CDM_CDWN_OUTPUT_10BIT : CDM_CDWN_OUTPUT_8BIT;
+   cdm_cfg->csc_cfg = dpu_hw_get_csc_cfg(DPU_HW_RGB2YUV_601L_10BIT);
+   if (!cdm_cfg->csc_cfg) {
+   DPU_ERROR("valid csc not found\n");
+   return;
+   }
+
+   /* enable 10 bit logic */
+   switch (cdm_cfg->output_fmt->chroma_sample) {
+   case DPU_CHROMA_RGB:
+   cdm_cfg->h_cdwn_type = CDM_CDWN_DISABLE;
+   cdm_cfg->v_cdwn_type = CDM_CDWN_DISABLE;
+   break;
+   case DPU_CHROMA_H2V1:
+   cdm_cfg->h_cdwn_type = CDM_CDWN_COSITE;
+   cdm_cfg->v_cdwn_type = CDM_CDWN_DISABLE;
+   break;
+   case DPU_CHROMA_420:
+

[PATCH v2 11/16] drm/msm/dpu: add support to disable CDM block during encoder cleanup

2023-12-07 Thread Abhinav Kumar

In preparation for setting up the CDM block, add the logic to disable it
properly during encoder cleanup.

changes in v2:
- call update_pending_flush_cdm even when bind_pingpong_blk
  is not present

Signed-off-by: Abhinav Kumar 
---
 drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c  | 10 ++
 drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys.h |  2 ++
 2 files changed, 12 insertions(+)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
index aa1a1646b322..862912727925 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
@@ -26,6 +26,7 @@
 #include "dpu_hw_dspp.h"
 #include "dpu_hw_dsc.h"
 #include "dpu_hw_merge3d.h"
+#include "dpu_hw_cdm.h"
 #include "dpu_formats.h"
 #include "dpu_encoder_phys.h"
 #include "dpu_crtc.h"
@@ -2050,6 +2051,15 @@ void dpu_encoder_helper_phys_cleanup(struct 
dpu_encoder_phys *phys_enc)
phys_enc->hw_pp->merge_3d->idx);
}
 
+   if (phys_enc->hw_cdm) {
+   if (phys_enc->hw_cdm->ops.bind_pingpong_blk && phys_enc->hw_pp)
+   
phys_enc->hw_cdm->ops.bind_pingpong_blk(phys_enc->hw_cdm,
+   false, 
phys_enc->hw_pp->idx);
+   if (phys_enc->hw_ctl->ops.update_pending_flush_cdm)
+   
phys_enc->hw_ctl->ops.update_pending_flush_cdm(phys_enc->hw_ctl,
+  
phys_enc->hw_cdm->idx);
+   }
+
if (dpu_enc->dsc) {
dpu_encoder_unprep_dsc(dpu_enc);
dpu_enc->dsc = NULL;
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys.h 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys.h
index b6b48e2c63ef..410f6225789c 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys.h
@@ -151,6 +151,7 @@ enum dpu_intr_idx {
  * @hw_pp: Hardware interface to the ping pong registers
  * @hw_intf:   Hardware interface to the intf registers
  * @hw_wb: Hardware interface to the wb registers
+ * @hw_cdm:    Hardware interface to the CDM registers
  * @dpu_kms:   Pointer to the dpu_kms top level
  * @cached_mode:   DRM mode cached at mode_set time, acted on in enable
  * @enabled:   Whether the encoder has enabled and running a mode
@@ -179,6 +180,7 @@ struct dpu_encoder_phys {
struct dpu_hw_pingpong *hw_pp;
struct dpu_hw_intf *hw_intf;
struct dpu_hw_wb *hw_wb;
+   struct dpu_hw_cdm *hw_cdm;
struct dpu_kms *dpu_kms;
struct drm_display_mode cached_mode;
enum dpu_enc_split_role split_role;
-- 
2.40.1
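
For anyone following the cleanup hunk above, here is a minimal stand-alone sketch (plain userspace C, not kernel code) of the guarded-callback pattern it uses: the unbind op runs only when both the op and a ping-pong block exist, while the flush update depends only on its own op being present. Every `demo_*` name is a hypothetical stand-in for the corresponding DPU structure, and the `has_pp` flag is an assumption that models `phys_enc->hw_pp` being non-NULL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the DPU structures; not the kernel types. */
struct demo_cdm;

struct demo_cdm_ops {
	/* Optional op: may be NULL on hardware without the ping-pong mux. */
	void (*bind_pingpong_blk)(struct demo_cdm *cdm, bool enable, int pp_idx);
};

struct demo_cdm {
	int idx;
	int bound_pp;	/* -1 when no ping-pong block is bound */
	struct demo_cdm_ops ops;
};

struct demo_ctl {
	unsigned int pending_cdm_flush;	/* one bit per CDM index */
	void (*update_pending_flush_cdm)(struct demo_ctl *ctl, int cdm_idx);
};

void demo_bind_pp(struct demo_cdm *cdm, bool enable, int pp_idx)
{
	cdm->bound_pp = enable ? pp_idx : -1;
}

void demo_flush_cdm(struct demo_ctl *ctl, int cdm_idx)
{
	ctl->pending_cdm_flush |= 1u << cdm_idx;
}

/* Mirrors the shape of the cleanup hunk: unbind if possible, always flush. */
void demo_cleanup(struct demo_cdm *cdm, struct demo_ctl *ctl,
		  bool has_pp, int pp_idx)
{
	assert(ctl != NULL);

	if (!cdm)
		return;

	if (cdm->ops.bind_pingpong_blk && has_pp)
		cdm->ops.bind_pingpong_blk(cdm, false, pp_idx);

	/* The flush is requested even when the bind op is absent. */
	if (ctl->update_pending_flush_cdm)
		ctl->update_pending_flush_cdm(ctl, cdm->idx);
}
```

The point of keeping the two checks independent is the v2 change noted in the commit text: a CDM without `bind_pingpong_blk` still gets its flush bit set.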

[PATCH v2 07/16] drm/msm/dpu: add dpu_hw_cdm abstraction for CDM block

2023-12-07 Thread Abhinav Kumar

The CDM block comes with its own set of registers and operations.
In line with other hardware sub-blocks, this change adds the
dpu_hw_cdm abstraction for the CDM block.

changes in v2:
- replace bit magic with relevant defines
- use drmm_kzalloc instead of kzalloc/free
- some formatting fixes
- inline _setup_cdm_ops()
- protect bind_pingpong_blk with core_rev check
- drop setup_csc_data() and setup_cdwn() ops as they
  are merged into enable()

Signed-off-by: Abhinav Kumar 
---
 drivers/gpu/drm/msm/Makefile|   1 +
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_cdm.c  | 276 
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_cdm.h  | 114 
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_mdss.h |   1 +
 4 files changed, 392 insertions(+)
 create mode 100644 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_cdm.c
 create mode 100644 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_cdm.h

diff --git a/drivers/gpu/drm/msm/Makefile b/drivers/gpu/drm/msm/Makefile
index 49671364fdcf..b1173128b5b9 100644
--- a/drivers/gpu/drm/msm/Makefile
+++ b/drivers/gpu/drm/msm/Makefile
@@ -63,6 +63,7 @@ msm-$(CONFIG_DRM_MSM_DPU) += \
disp/dpu1/dpu_encoder_phys_wb.o \
disp/dpu1/dpu_formats.o \
disp/dpu1/dpu_hw_catalog.o \
+   disp/dpu1/dpu_hw_cdm.o \
disp/dpu1/dpu_hw_ctl.o \
disp/dpu1/dpu_hw_dsc.o \
disp/dpu1/dpu_hw_dsc_1_2.o \
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_cdm.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_cdm.c
new file mode 100644
index 000000000000..0dbe2df56cc8
--- /dev/null
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_cdm.c
@@ -0,0 +1,276 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (c) 2023, The Linux Foundation. All rights reserved.
+ */
+
+#include 
+
+#include "dpu_hw_mdss.h"
+#include "dpu_hw_util.h"
+#include "dpu_hw_catalog.h"
+#include "dpu_hw_cdm.h"
+#include "dpu_kms.h"
+
+#define CDM_CSC_10_OPMODE                  0x000
+#define CDM_CSC_10_BASE                    0x004
+
+#define CDM_CDWN2_OP_MODE                  0x100
+#define CDM_CDWN2_CLAMP_OUT                0x104
+#define CDM_CDWN2_PARAMS_3D_0              0x108
+#define CDM_CDWN2_PARAMS_3D_1              0x10C
+#define CDM_CDWN2_COEFF_COSITE_H_0         0x110
+#define CDM_CDWN2_COEFF_COSITE_H_1         0x114
+#define CDM_CDWN2_COEFF_COSITE_H_2         0x118
+#define CDM_CDWN2_COEFF_OFFSITE_H_0        0x11C
+#define CDM_CDWN2_COEFF_OFFSITE_H_1        0x120
+#define CDM_CDWN2_COEFF_OFFSITE_H_2        0x124
+#define CDM_CDWN2_COEFF_COSITE_V           0x128
+#define CDM_CDWN2_COEFF_OFFSITE_V          0x12C
+#define CDM_CDWN2_OUT_SIZE                 0x130
+
+#define CDM_HDMI_PACK_OP_MODE              0x200
+#define CDM_CSC_10_MATRIX_COEFF_0          0x004
+
+#define CDM_MUX                            0x224
+
+/* CDM CDWN2 sub-block bit definitions */
+#define CDM_CDWN2_OP_MODE_EN               BIT(0)
+#define CDM_CDWN2_OP_MODE_ENABLE_H         BIT(1)
+#define CDM_CDWN2_OP_MODE_ENABLE_V         BIT(2)
+#define CDM_CDWN2_OP_MODE_METHOD_H_AVG     BIT(3)
+#define CDM_CDWN2_OP_MODE_METHOD_H_COSITE  BIT(4)
+#define CDM_CDWN2_OP_MODE_METHOD_V_AVG     BIT(5)
+#define CDM_CDWN2_OP_MODE_METHOD_V_COSITE  BIT(6)
+#define CDM_CDWN2_OP_MODE_BITS_OUT_8BIT    BIT(7)
+#define CDM_CDWN2_OP_MODE_METHOD_H_OFFSITE GENMASK(4, 3)
+#define CDM_CDWN2_OP_MODE_METHOD_V_OFFSITE GENMASK(6, 5)
+#define CDM_CDWN2_V_PIXEL_DROP_MASK        GENMASK(6, 5)
+#define CDM_CDWN2_H_PIXEL_DROP_MASK        GENMASK(4, 3)
+
+/* CDM CSC10 sub-block bit definitions */
+#define CDM_CSC10_OP_MODE_EN   BIT(0)
+#define CDM_CSC10_OP_MODE_SRC_FMT_YUV  BIT(1)
+#define CDM_CSC10_OP_MODE_DST_FMT_YUV  BIT(2)
+
+/* CDM HDMI pack sub-block bit definitions */
+#define CDM_HDMI_PACK_OP_MODE_EN   BIT(0)
+
+/**
+ * Horizontal coefficients for cosite chroma downscale
+ * s13 representation of coefficients
+ */
+static u32 cosite_h_coeff[] = {0x0016, 0x01cc, 0x019e};
+
+/**
+ * Horizontal coefficients for offsite chroma downscale
+ */
+static u32 offsite_h_coeff[] = {0x000b0005, 0x01db01eb, 0x00e40046};
+
+/**
+ * Vertical coefficients for cosite chroma downscale
+ */
+static u32 cosite_v_coeff[] = {0x00080004};
+/**
+ * Vertical coefficients for offsite chroma downscale
+ */
+static u32 offsite_v_coeff[] = {0x00060002};
+
+static int dpu_hw_cdm_setup_cdwn(struct dpu_hw_cdm *ctx, struct dpu_hw_cdm_cfg *cfg)
+{
+   struct dpu_hw_blk_reg_map *c = &ctx->hw;
+   u32 opmode = 0;
+   u32 out_size = 0;
+
+   if (cfg->output_bit_depth == CDM_CDWN_OUTPUT_10BIT)
+   opmode &= ~CDM_CDWN2_OP_MODE_BITS_OUT_8BIT;
+   else
+   opmode |= CDM_CDWN2_OP_MODE_BITS_OUT_8BIT;
+
+   /* ENABLE DWNS_H bit */
+   opmode |= CDM_CDWN2_OP_MODE_ENABLE_H;
+
+   switch (cfg->h_cdwn_type) {
+   case CDM_CDWN_DISABLE:
+   /* CLEAR METHOD_H field */
+   opmode &=

[PATCH v2 09/16] drm/msm/dpu: add support to allocate CDM from RM

2023-12-07 Thread Abhinav Kumar

Even though there is usually only one CDM block, it can be
used by either the HDMI, DisplayPort or writeback interface.

Hence its allocation needs to be tracked properly by the
resource manager to ensure appropriate availability of the
block.

changes in v2:
- move needs_cdm to topology struct

Signed-off-by: Abhinav Kumar 
---
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_mdss.h |  1 +
 drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h |  1 +
 drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c  | 38 +++--
 drivers/gpu/drm/msm/msm_drv.h   |  2 ++
 4 files changed, 40 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_mdss.h b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_mdss.h
index 9db4cf61bd29..5df545904057 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_mdss.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_mdss.h
@@ -98,6 +98,7 @@ enum dpu_hw_blk_type {
DPU_HW_BLK_DSPP,
DPU_HW_BLK_MERGE_3D,
DPU_HW_BLK_DSC,
+   DPU_HW_BLK_CDM,
DPU_HW_BLK_MAX,
 };
 
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h
index df6271017b80..a0cd36e45a01 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h
@@ -135,6 +135,7 @@ struct dpu_global_state {
uint32_t ctl_to_enc_id[CTL_MAX - CTL_0];
uint32_t dspp_to_enc_id[DSPP_MAX - DSPP_0];
uint32_t dsc_to_enc_id[DSC_MAX - DSC_0];
+   uint32_t cdm_to_enc_id;
 };
 
 struct dpu_global_state
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c
index 7ed476b96304..b58a9c2ae326 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c
@@ -435,6 +435,26 @@ static int _dpu_rm_reserve_dsc(struct dpu_rm *rm,
return 0;
 }
 
+static int _dpu_rm_reserve_cdm(struct dpu_rm *rm,
+  struct dpu_global_state *global_state,
+  struct drm_encoder *enc)
+{
+   /* try allocating only one CDM block */
+   if (!rm->cdm_blk) {
+   DPU_ERROR("CDM block does not exist\n");
+   return -EIO;
+   }
+
+   if (global_state->cdm_to_enc_id) {
+   DPU_ERROR("CDM_0 is already allocated\n");
+   return -EIO;
+   }
+
+   global_state->cdm_to_enc_id = enc->base.id;
+
+   return 0;
+}
+
 static int _dpu_rm_make_reservation(
struct dpu_rm *rm,
struct dpu_global_state *global_state,
@@ -460,6 +480,14 @@ static int _dpu_rm_make_reservation(
if (ret)
return ret;
 
+   if (reqs->topology.needs_cdm) {
+   ret = _dpu_rm_reserve_cdm(rm, global_state, enc);
+   if (ret) {
+   DPU_ERROR("unable to find CDM blk\n");
+   return ret;
+   }
+   }
+
return ret;
 }
 
@@ -470,9 +498,9 @@ static int _dpu_rm_populate_requirements(
 {
reqs->topology = req_topology;
 
-   DRM_DEBUG_KMS("num_lm: %d num_dsc: %d num_intf: %d\n",
+   DRM_DEBUG_KMS("num_lm: %d num_dsc: %d num_intf: %d cdm: %d\n",
  reqs->topology.num_lm, reqs->topology.num_dsc,
- reqs->topology.num_intf);
+ reqs->topology.num_intf, reqs->topology.needs_cdm);
 
return 0;
 }
@@ -501,6 +529,7 @@ void dpu_rm_release(struct dpu_global_state *global_state,
ARRAY_SIZE(global_state->dsc_to_enc_id), enc->base.id);
_dpu_rm_clear_mapping(global_state->dspp_to_enc_id,
ARRAY_SIZE(global_state->dspp_to_enc_id), enc->base.id);
+   _dpu_rm_clear_mapping(&global_state->cdm_to_enc_id, 1, enc->base.id);
 }
 
 int dpu_rm_reserve(
@@ -574,6 +603,11 @@ int dpu_rm_get_assigned_resources(struct dpu_rm *rm,
hw_to_enc_id = global_state->dsc_to_enc_id;
max_blks = ARRAY_SIZE(rm->dsc_blks);
break;
+   case DPU_HW_BLK_CDM:
+   hw_blks = &rm->cdm_blk;
+   hw_to_enc_id = &global_state->cdm_to_enc_id;
+   max_blks = 1;
+   break;
default:
DPU_ERROR("blk type %d not managed by rm\n", type);
return 0;
diff --git a/drivers/gpu/drm/msm/msm_drv.h b/drivers/gpu/drm/msm/msm_drv.h
index a205127ccc93..1ebad634781c 100644
--- a/drivers/gpu/drm/msm/msm_drv.h
+++ b/drivers/gpu/drm/msm/msm_drv.h
@@ -92,12 +92,14 @@ enum msm_event_wait {
  * @num_intf: number of interfaces the panel is mounted on
  * @num_dspp: number of dspp blocks used
  * @num_dsc:  number of Display Stream Compression (DSC) blocks used
+ * @needs_cdm: indicates whether cdm block is needed for this display topology
  */
 struct msm_display_topology {
u32 num_lm;
u32 num_intf;
u32 num_dspp;
u32 num_dsc;
+   bool needs_cdm;
 };
 
 /* Commit/Event thread specific structure */
-- 
2.40.1
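
To make the reservation logic in this patch easier to reason about, here is a small userspace sketch of the same bookkeeping: a single slot holds the ID of the encoder that owns the CDM, and zero means "free". This is only a model under stated assumptions: the `demo_*` names are hypothetical, the `cdm_present` flag stands in for `rm->cdm_blk` being non-NULL, and encoder IDs are assumed to be non-zero so that zero can serve as the free marker.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the single-CDM bookkeeping; not the kernel struct. */
struct demo_global_state {
	uint32_t cdm_to_enc_id;	/* 0 means the CDM is free (assumption) */
};

/* Reserve the only CDM for enc_id; fails if absent or already owned. */
int demo_reserve_cdm(struct demo_global_state *gs, bool cdm_present,
		     uint32_t enc_id)
{
	assert(enc_id != 0);

	if (!cdm_present)
		return -EIO;

	if (gs->cdm_to_enc_id)
		return -EIO;

	gs->cdm_to_enc_id = enc_id;
	return 0;
}

/* Release the CDM only if enc_id is the current owner. */
void demo_release_cdm(struct demo_global_state *gs, uint32_t enc_id)
{
	if (gs->cdm_to_enc_id == enc_id)
		gs->cdm_to_enc_id = 0;
}
```

Because release compares against the owner, an encoder that never held the CDM cannot free it on behalf of another one, which is the property a one-element mapping needs.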

[PATCH v2 08/16] drm/msm/dpu: add cdm blocks to RM

2023-12-07 Thread Abhinav Kumar

Add the RM APIs necessary to initialize and allocate CDM
blocks to be used by the rest of the DPU pipeline.

changes in v2:
- treat cdm_init() failure as fatal
- fixed the commit text

Signed-off-by: Abhinav Kumar 
---
 drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c | 13 +
 drivers/gpu/drm/msm/disp/dpu1/dpu_rm.h |  2 ++
 2 files changed, 15 insertions(+)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c
index 0bb28cf4a6cb..7ed476b96304 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c
@@ -8,6 +8,7 @@
 #include "dpu_kms.h"
 #include "dpu_hw_lm.h"
 #include "dpu_hw_ctl.h"
+#include "dpu_hw_cdm.h"
 #include "dpu_hw_pingpong.h"
 #include "dpu_hw_sspp.h"
 #include "dpu_hw_intf.h"
@@ -176,6 +177,18 @@ int dpu_rm_init(struct drm_device *dev,
rm->hw_sspp[sspp->id - SSPP_NONE] = hw;
}
 
+   if (cat->cdm) {
+   struct dpu_hw_cdm *hw;
+
+   hw = dpu_hw_cdm_init(dev, cat->cdm, mmio, cat->mdss_ver);
+   if (IS_ERR(hw)) {
+   rc = PTR_ERR(hw);
+   DPU_ERROR("failed cdm object creation: err %d\n", rc);
+   goto fail;
+   }
+   rm->cdm_blk = &hw->base;
+   }
+
return 0;
 
 fail:
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.h b/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.h
index 36752d837be4..e3f83ebc656b 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.h
@@ -22,6 +22,7 @@ struct dpu_global_state;
  * @hw_wb: array of wb hardware resources
  * @dspp_blks: array of dspp hardware resources
  * @hw_sspp: array of sspp hardware resources
+ * @cdm_blk: cdm hardware resource
  */
 struct dpu_rm {
struct dpu_hw_blk *pingpong_blks[PINGPONG_MAX - PINGPONG_0];
@@ -33,6 +34,7 @@ struct dpu_rm {
struct dpu_hw_blk *merge_3d_blks[MERGE_3D_MAX - MERGE_3D_0];
struct dpu_hw_blk *dsc_blks[DSC_MAX - DSC_0];
struct dpu_hw_sspp *hw_sspp[SSPP_MAX - SSPP_NONE];
+   struct dpu_hw_blk *cdm_blk;
 };
 
 /**
-- 
2.40.1

[PATCH v2 06/16] drm/msm/dpu: add cdm blocks to sm8250 dpu_hw_catalog

2023-12-07 Thread Abhinav Kumar

Add CDM blocks to the sm8250 dpu_hw_catalog to support
YUV format output from writeback block.

changes in v2:
- re-use the cdm definition from sc7280

Signed-off-by: Abhinav Kumar 
---
 drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_0_sm8250.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_0_sm8250.h b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_0_sm8250.h
index 2359c16e9206..58b0f50518c8 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_0_sm8250.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_0_sm8250.h
@@ -384,6 +384,7 @@ const struct dpu_mdss_cfg dpu_sm8250_cfg = {
.mdss_ver = &sm8250_mdss_ver,
.caps = &sm8250_dpu_caps,
.mdp = &sm8250_mdp,
+   .cdm = &sc7280_cdm,
.ctl_count = ARRAY_SIZE(sm8250_ctl),
.ctl = sm8250_ctl,
.sspp_count = ARRAY_SIZE(sm8250_sspp),
-- 
2.40.1

[PATCH v2 10/16] drm/msm/dpu: add CDM related logic to dpu_hw_ctl layer

2023-12-07 Thread Abhinav Kumar

CDM block will need its own logic to program the flush and active
bits in the dpu_hw_ctl layer.

Make necessary changes in dpu_hw_ctl to support CDM programming.

changes in v2:
- remove unused empty line
- pass in cdm_num to update_pending_flush_cdm()

Signed-off-by: Abhinav Kumar 
---
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.c | 35 ++
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.h | 12 
 2 files changed, 47 insertions(+)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.c
index e7b680a151d6..75b8a32389c3 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.c
@@ -32,11 +32,13 @@
 #define   CTL_DSC_ACTIVE                0x0E8
 #define   CTL_WB_ACTIVE                 0x0EC
 #define   CTL_INTF_ACTIVE               0x0F4
+#define   CTL_CDM_ACTIVE                0x0F8
 #define   CTL_FETCH_PIPE_ACTIVE         0x0FC
 #define   CTL_MERGE_3D_FLUSH            0x100
 #define   CTL_DSC_FLUSH                 0x104
 #define   CTL_WB_FLUSH                  0x108
 #define   CTL_INTF_FLUSH                0x110
+#define   CTL_CDM_FLUSH                 0x114
 #define   CTL_INTF_MASTER               0x134
 #define   CTL_DSPP_n_FLUSH(n)           ((0x13C) + ((n) * 4))
 
@@ -46,6 +48,7 @@
 #define DPU_REG_RESET_TIMEOUT_US        2000
 #define  MERGE_3D_IDX   23
 #define  DSC_IDX        22
+#define CDM_IDX         26
 #define  INTF_IDX       31
 #define WB_IDX          16
 #define  DSPP_IDX       29  /* From DPU hw rev 7.x.x */
@@ -107,6 +110,7 @@ static inline void dpu_hw_ctl_clear_pending_flush(struct dpu_hw_ctl *ctx)
ctx->pending_wb_flush_mask = 0;
ctx->pending_merge_3d_flush_mask = 0;
ctx->pending_dsc_flush_mask = 0;
+   ctx->pending_cdm_flush_mask = 0;
 
memset(ctx->pending_dspp_flush_mask, 0,
sizeof(ctx->pending_dspp_flush_mask));
@@ -151,6 +155,10 @@ static inline void dpu_hw_ctl_trigger_flush_v1(struct dpu_hw_ctl *ctx)
DPU_REG_WRITE(&ctx->hw, CTL_DSC_FLUSH,
  ctx->pending_dsc_flush_mask);
 
+   if (ctx->pending_flush_mask & BIT(CDM_IDX))
+   DPU_REG_WRITE(&ctx->hw, CTL_CDM_FLUSH,
+ ctx->pending_cdm_flush_mask);
+
DPU_REG_WRITE(&ctx->hw, CTL_FLUSH, ctx->pending_flush_mask);
 }
 
@@ -282,6 +290,13 @@ static void dpu_hw_ctl_update_pending_flush_wb(struct dpu_hw_ctl *ctx,
}
 }
 
+static void dpu_hw_ctl_update_pending_flush_cdm(struct dpu_hw_ctl *ctx, enum dpu_cdm cdm_num)
+{
+   /* update pending flush only if CDM_0 is flushed */
+   if (cdm_num == CDM_0)
+   ctx->pending_flush_mask |= BIT(CDM_IDX);
+}
+
 static void dpu_hw_ctl_update_pending_flush_wb_v1(struct dpu_hw_ctl *ctx,
enum dpu_wb wb)
 {
@@ -310,6 +325,12 @@ static void dpu_hw_ctl_update_pending_flush_dsc_v1(struct dpu_hw_ctl *ctx,
ctx->pending_flush_mask |= BIT(DSC_IDX);
 }
 
+static void dpu_hw_ctl_update_pending_flush_cdm_v1(struct dpu_hw_ctl *ctx, enum dpu_cdm cdm_num)
+{
+   ctx->pending_cdm_flush_mask |= BIT(cdm_num - CDM_0);
+   ctx->pending_flush_mask |= BIT(CDM_IDX);
+}
+
 static void dpu_hw_ctl_update_pending_flush_dspp(struct dpu_hw_ctl *ctx,
enum dpu_dspp dspp, u32 dspp_sub_blk)
 {
@@ -513,6 +534,7 @@ static void dpu_hw_ctl_intf_cfg_v1(struct dpu_hw_ctl *ctx,
u32 intf_active = 0;
u32 wb_active = 0;
u32 mode_sel = 0;
+   u32 cdm_active = 0;
 
/* CTL_TOP[31:28] carries group_id to collate CTL paths
 * per VM. Explicitly disable it until VM support is
@@ -526,6 +548,7 @@ static void dpu_hw_ctl_intf_cfg_v1(struct dpu_hw_ctl *ctx,
 
intf_active = DPU_REG_READ(c, CTL_INTF_ACTIVE);
wb_active = DPU_REG_READ(c, CTL_WB_ACTIVE);
+   cdm_active = DPU_REG_READ(c, CTL_CDM_ACTIVE);
 
if (cfg->intf)
intf_active |= BIT(cfg->intf - INTF_0);
@@ -543,6 +566,9 @@ static void dpu_hw_ctl_intf_cfg_v1(struct dpu_hw_ctl *ctx,
 
if (cfg->dsc)
DPU_REG_WRITE(c, CTL_DSC_ACTIVE, cfg->dsc);
+
+   if (cfg->cdm)
+   DPU_REG_WRITE(c, CTL_CDM_ACTIVE, cfg->cdm);
 }
 
 static void dpu_hw_ctl_intf_cfg(struct dpu_hw_ctl *ctx,
@@ -586,6 +612,7 @@ static void dpu_hw_ctl_reset_intf_cfg_v1(struct dpu_hw_ctl *ctx,
u32 wb_active = 0;
u32 merge3d_active = 0;
u32 dsc_active;
+   u32 cdm_active;
 
/*
 * This API resets each portion of the CTL path namely,
@@ -621,6 +648,12 @@ static void dpu_hw_ctl_reset_intf_cfg_v1(struct dpu_hw_ctl *ctx,
dsc_active &= ~cfg->dsc;
DPU_REG_WRITE(c, CTL_DSC_ACTIVE, dsc_active);
}
+
+   if (cfg->cdm) {
+   cdm_active = DPU_REG_READ(c, CTL_CDM_ACTIVE);
+   cdm_active &= ~cfg->cdm;
+   DPU_REG_WRITE(c, CTL_CDM_ACTIVE, cdm_active);
+   }
 }
 
 static void

[PATCH v2 05/16] drm/msm/dpu: add cdm blocks to sc7280 dpu_hw_catalog

2023-12-07 Thread Abhinav Kumar

Add CDM blocks to the sc7280 dpu_hw_catalog to support
YUV format output from writeback block.

changes in v2:
- remove explicit zero assignment for features
- move sc7280_cdm to dpu_hw_catalog from the sc7280
  catalog file as its definition can be re-used

Signed-off-by: Abhinav Kumar 
---
 .../gpu/drm/msm/disp/dpu1/catalog/dpu_7_2_sc7280.h  |  1 +
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c  | 10 ++
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h  | 13 +
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_mdss.h |  5 +
 4 files changed, 29 insertions(+)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_7_2_sc7280.h b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_7_2_sc7280.h
index 209675de6742..19c2b7454796 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_7_2_sc7280.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_7_2_sc7280.h
@@ -248,6 +248,7 @@ const struct dpu_mdss_cfg dpu_sc7280_cfg = {
.mdss_ver = &sc7280_mdss_ver,
.caps = &sc7280_dpu_caps,
.mdp = &sc7280_mdp,
+   .cdm = &sc7280_cdm,
.ctl_count = ARRAY_SIZE(sc7280_ctl),
.ctl = sc7280_ctl,
.sspp_count = ARRAY_SIZE(sc7280_sspp),
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c
index d52aae54bbd5..1be3156cde05 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c
@@ -426,6 +426,16 @@ static const struct dpu_dsc_sub_blks dsc_sblk_1 = {
.ctl = {.name = "ctl", .base = 0xF80, .len = 0x10},
 };
 
+/*
+ * CDM sub block config
+ */
+static const struct dpu_cdm_cfg sc7280_cdm = {
+   .name = "cdm_0",
+   .id = CDM_0,
+   .len = 0x228,
+   .base = 0x79200,
+};
+
 /*
  * VBIF sub blocks config
  */
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h
index e3c0d007481b..ba82ef4560a6 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h
@@ -682,6 +682,17 @@ struct dpu_vbif_cfg {
u32 memtype[MAX_XIN_COUNT];
 };
 
+/**
+ * struct dpu_cdm_cfg - information of chroma down blocks
+ * @name   string name for debug purposes
+ * @id enum identifying this block
+ * @base   register offset of this block
+ * @features   bit mask identifying sub-blocks/features
+ */
+struct dpu_cdm_cfg {
+   DPU_HW_BLK_INFO;
+};
+
 /**
  * Define CDP use cases
  * @DPU_PERF_CDP_UDAGE_RT: real-time use cases
@@ -805,6 +816,8 @@ struct dpu_mdss_cfg {
u32 wb_count;
const struct dpu_wb_cfg *wb;
 
+   const struct dpu_cdm_cfg *cdm;
+
u32 ad_count;
 
u32 dspp_count;
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_mdss.h b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_mdss.h
index a6702b2bfc68..f319c8232ea5 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_mdss.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_mdss.h
@@ -185,6 +185,11 @@ enum dpu_dsc {
DSC_MAX
 };
 
+enum dpu_cdm {
+   CDM_0 = 1,
+   CDM_MAX
+};
+
 enum dpu_pingpong {
PINGPONG_NONE,
PINGPONG_0,
-- 
2.40.1
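
One detail worth spelling out is why `enum dpu_cdm` starts at `CDM_0 = 1`: users of the enum subtract `CDM_0` to get a zero-based bit position, as the `BIT(cdm_num - CDM_0)` expression in the dpu_hw_ctl patch of this series does. The following stand-alone sketch models that computation. The `demo_*` and `DEMO_*` names are hypothetical; only the value 26 for the top-level CDM flush bit is taken from the series.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_BIT(n)	(UINT32_C(1) << (n))
#define DEMO_CDM_IDX	26	/* top-level flush bit used for CDM in this series */

/* Mirrors the enum layout: 0 is left free to mean "no block". */
enum demo_cdm {
	DEMO_CDM_0 = 1,
	DEMO_CDM_MAX
};

struct demo_ctl {
	uint32_t pending_flush_mask;	 /* one bit per block type */
	uint32_t pending_cdm_flush_mask; /* one bit per CDM instance */
};

/* Record a pending flush for one CDM instance. */
void demo_update_pending_flush_cdm(struct demo_ctl *ctl, enum demo_cdm cdm_num)
{
	assert(cdm_num >= DEMO_CDM_0 && cdm_num < DEMO_CDM_MAX);

	/* Enum value 1 maps to bit 0 of the per-instance mask. */
	ctl->pending_cdm_flush_mask |= DEMO_BIT(cdm_num - DEMO_CDM_0);
	ctl->pending_flush_mask |= DEMO_BIT(DEMO_CDM_IDX);
}
```

With this layout a zero-initialised config field can mean "no CDM requested", which appears to be how checks such as `if (cfg->cdm)` in the ctl code are meant to read.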

[PATCH v2 04/16] drm/msm/dpu: move csc matrices to dpu_hw_util

2023-12-07 Thread Abhinav Kumar

Since CSC matrices are used across the DPU driver, let's introduce
a helper in dpu_hw_util that returns the CSC matrix corresponding
to the requested type. This will make it easier to add more
supported CSC types, such as the RGB to YUV one which is used in
the case of CDM.

Signed-off-by: Abhinav Kumar 
---
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_util.c | 54 +
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_util.h |  7 +++
 drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c   | 39 ++-
 3 files changed, 64 insertions(+), 36 deletions(-)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_util.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_util.c
index 0b05061e3e62..59a153331194 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_util.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_util.c
@@ -87,6 +87,60 @@ static u32 dpu_hw_util_log_mask = DPU_DBG_MASK_NONE;
 #define QOS_QOS_CTRL_VBLANK_EN        BIT(16)
 #define QOS_QOS_CTRL_CREQ_VBLANK_MASK GENMASK(21, 20)
 
+static const struct dpu_csc_cfg dpu_csc_YUV2RGB_601L = {
+   {
+   /* S15.16 format */
+   0x00012A00, 0x00000000, 0x00019880,
+   0x00012A00, 0xFFFF9B80, 0xFFFF3000,
+   0x00012A00, 0x00020480, 0x00000000,
+   },
+   /* signed bias */
+   { 0xfff0, 0xff80, 0xff80,},
+   { 0x0, 0x0, 0x0,},
+   /* unsigned clamp */
+   { 0x10, 0xeb, 0x10, 0xf0, 0x10, 0xf0,},
+   { 0x00, 0xff, 0x00, 0xff, 0x00, 0xff,},
+};
+
+static const struct dpu_csc_cfg dpu_csc10_YUV2RGB_601L = {
+   {
+   /* S15.16 format */
+   0x00012A00, 0x00000000, 0x00019880,
+   0x00012A00, 0xFFFF9B80, 0xFFFF3000,
+   0x00012A00, 0x00020480, 0x00000000,
+   },
+   /* signed bias */
+   { 0xffc0, 0xfe00, 0xfe00,},
+   { 0x0, 0x0, 0x0,},
+   /* unsigned clamp */
+   { 0x40, 0x3ac, 0x40, 0x3c0, 0x40, 0x3c0,},
+   { 0x00, 0x3ff, 0x00, 0x3ff, 0x00, 0x3ff,},
+};
+
+/**
+ * dpu_hw_get_csc_cfg - get the CSC matrix based on the request type
+ * @type:  type of the requested CSC matrix from caller
+ * Return: CSC matrix corresponding to the request type in DPU format
+ */
+const struct dpu_csc_cfg *dpu_hw_get_csc_cfg(enum dpu_hw_csc_cfg_type type)
+{
+   const struct dpu_csc_cfg *csc_cfg = NULL;
+
+   switch (type) {
+   case DPU_HW_YUV2RGB_601L:
+   csc_cfg = &dpu_csc_YUV2RGB_601L;
+   break;
+   case DPU_HW_YUV2RGB_601L_10BIT:
+   csc_cfg = &dpu_csc10_YUV2RGB_601L;
+   break;
+   default:
+   DPU_ERROR("unknown csc_cfg type\n");
+   break;
+   }
+
+   return csc_cfg;
+}
+
 void dpu_reg_write(struct dpu_hw_blk_reg_map *c,
u32 reg_off,
u32 val,
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_util.h b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_util.h
index fe083b2e5696..49f2bcf6de15 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_util.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_util.h
@@ -19,6 +19,11 @@
 #define MISR_CTRL_STATUS_CLEAR  BIT(10)
 #define MISR_CTRL_FREE_RUN_MASK BIT(31)
 
+enum dpu_hw_csc_cfg_type {
+   DPU_HW_YUV2RGB_601L,
+   DPU_HW_YUV2RGB_601L_10BIT,
+};
+
 /*
  * This is the common struct maintained by each sub block
  * for mapping the register offsets in this block to the
@@ -368,4 +373,6 @@ bool dpu_hw_clk_force_ctrl(struct dpu_hw_blk_reg_map *c,
   const struct dpu_clk_ctrl_reg *clk_ctrl_reg,
   bool enable);
 
+const struct dpu_csc_cfg *dpu_hw_get_csc_cfg(enum dpu_hw_csc_cfg_type type);
+
 #endif /* _DPU_HW_UTIL_H */
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c
index 3235ab132540..31641889b9f0 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c
@@ -21,6 +21,7 @@
 #include "dpu_kms.h"
 #include "dpu_formats.h"
 #include "dpu_hw_sspp.h"
+#include "dpu_hw_util.h"
 #include "dpu_trace.h"
 #include "dpu_crtc.h"
 #include "dpu_vbif.h"
@@ -508,50 +509,16 @@ static void _dpu_plane_setup_pixel_ext(struct dpu_hw_scaler3_cfg *scale_cfg,
}
 }
 
-static const struct dpu_csc_cfg dpu_csc_YUV2RGB_601L = {
-   {
-   /* S15.16 format */
-   0x00012A00, 0x00000000, 0x00019880,
-   0x00012A00, 0xFFFF9B80, 0xFFFF3000,
-   0x00012A00, 0x00020480, 0x00000000,
-   },
-   /* signed bias */
-   { 0xfff0, 0xff80, 0xff80,},
-   { 0x0, 0x0, 0x0,},
-   /* unsigned clamp */
-   { 0x10, 0xeb, 0x10, 0xf0, 0x10, 0xf0,},
-   { 0x00, 0xff, 0x00, 0xff, 0x00, 0xff,},
-};
-
-static const struct dpu_csc_cfg dpu_csc10_YUV2RGB_601L = {
-   {
-   /* S15.16 format */
-   0x00012A00, 0x, 0x00019880,
-   0x00012A00, 0x9B80, 0x3000,
-   0x00012A00, 0x00020480, 0x,
-   },
-   /*

[PATCH v2 02/16] drm/msm/dpu: rename dpu_encoder_phys_wb_setup_cdp to match its functionality

2023-12-07 Thread Abhinav Kumar

dpu_encoder_phys_wb_setup_cdp() is not programming the chroma down
prefetch block. It is setting up the display ctl path for writeback.

Hence rename it to dpu_encoder_phys_wb_setup_ctl() to match what it is
actually doing.

Fixes: d7d0e73f7de3 ("drm/msm/dpu: introduce the dpu_encoder_phys_* for writeback")
Signed-off-by: Abhinav Kumar 
Reviewed-by: Dmitry Baryshkov 
---
 drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_wb.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_wb.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_wb.c
index 91b1967cf566..4665367cf14f 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_wb.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_wb.c
@@ -207,10 +207,10 @@ static void dpu_encoder_phys_wb_setup_fb(struct dpu_encoder_phys *phys_enc,
 }
 
 /**
- * dpu_encoder_phys_wb_setup_cdp - setup chroma down prefetch block
+ * dpu_encoder_phys_wb_setup_ctl - setup wb pipeline for ctl path
  * @phys_enc:Pointer to physical encoder
  */
-static void dpu_encoder_phys_wb_setup_cdp(struct dpu_encoder_phys *phys_enc)
+static void dpu_encoder_phys_wb_setup_ctl(struct dpu_encoder_phys *phys_enc)
 {
struct dpu_hw_wb *hw_wb;
struct dpu_hw_ctl *ctl;
@@ -382,7 +382,7 @@ static void dpu_encoder_phys_wb_setup(
 
dpu_encoder_phys_wb_setup_fb(phys_enc, fb);
 
-   dpu_encoder_phys_wb_setup_cdp(phys_enc);
+   dpu_encoder_phys_wb_setup_ctl(phys_enc);
 
 }
 
-- 
2.40.1

[PATCH v2 03/16] drm/msm/dpu: fix writeback programming for YUV cases

2023-12-07 Thread Abhinav Kumar

For YUV cases, setting the required format bits was missed
in the register programming. Let's fix it now in preparation
for adding YUV format support for writeback.

changes in v2:
- dropped the fixes tag as it's not a fix but adds
  new functionality

Signed-off-by: Abhinav Kumar 
---
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_wb.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_wb.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_wb.c
index ed0e80616129..e75995f7fcea 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_wb.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_wb.c
@@ -89,6 +89,9 @@ static void dpu_hw_wb_setup_format(struct dpu_hw_wb *ctx,
dst_format |= BIT(14); /* DST_ALPHA_X */
}
 
+   if (DPU_FORMAT_IS_YUV(fmt))
+   dst_format |= BIT(15);
+
pattern = (fmt->element[3] << 24) |
(fmt->element[2] << 16) |
(fmt->element[1] << 8)  |
-- 
2.40.1

[PATCH v2 01/16] drm/msm/dpu: add formats check for writeback encoder

2023-12-07 Thread Abhinav Kumar

In preparation for adding more formats to dpu writeback, add
format validation to it so that unsupported formats are rejected.

changes in v2:
- correct some grammar in the commit text

Fixes: d7d0e73f7de3 ("drm/msm/dpu: introduce the dpu_encoder_phys_* for writeback")
Signed-off-by: Abhinav Kumar 
Reviewed-by: Dmitry Baryshkov 
---
 drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_wb.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_wb.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_wb.c
index bb94909caa25..91b1967cf566 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_wb.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_wb.c
@@ -272,6 +272,7 @@ static int dpu_encoder_phys_wb_atomic_check(
 {
struct drm_framebuffer *fb;
const struct drm_display_mode *mode = &crtc_state->mode;
+   int ret;
 
DPU_DEBUG("[atomic_check:%d, \"%s\",%d,%d]\n",
phys_enc->hw_wb->idx, mode->name, mode->hdisplay, mode->vdisplay);
@@ -308,6 +309,12 @@ static int dpu_encoder_phys_wb_atomic_check(
return -EINVAL;
}
 
+   ret = drm_atomic_helper_check_wb_encoder_state(phys_enc->parent, conn_state);
+   if (ret < 0) {
+   DPU_ERROR("invalid pixel format %p4cc\n", &fb->format->format);
+   return ret;
+   }
+
return 0;
 }
 
-- 
2.40.1
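
As I understand it, the helper called in this patch compares the framebuffer's fourcc against the list of formats the writeback connector advertises and returns -EINVAL when there is no match. The sketch below shows that kind of check in plain userspace C. It is not the DRM implementation: `demo_check_wb_format` and `DEMO_FOURCC` are hypothetical names, and the format list in the usage note is an assumption chosen for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Four ASCII bytes packed little-endian, the same layout DRM fourccs use. */
#define DEMO_FOURCC(a, b, c, d) \
	((uint32_t)(a) | ((uint32_t)(b) << 8) | \
	 ((uint32_t)(c) << 16) | ((uint32_t)(d) << 24))

/* Return 0 if fourcc is in the supported list, -EINVAL otherwise. */
int demo_check_wb_format(const uint32_t *supported, size_t count,
			 uint32_t fourcc)
{
	assert(supported != NULL || count == 0);

	for (size_t i = 0; i < count; i++) {
		if (supported[i] == fourcc)
			return 0;
	}

	return -EINVAL;
}
```

For example, with a list holding `DEMO_FOURCC('X', 'R', '2', '4')` and `DEMO_FOURCC('N', 'V', '1', '2')`, a request for NV12 passes and a request for `DEMO_FOURCC('Y', 'U', 'Y', 'V')` is rejected.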

[PATCH v2 00/16] Add CDM support for MSM writeback

2023-12-07 Thread Abhinav Kumar

The Chroma Down Sampling (CDM) block is a hardware block in the DPU
pipeline which, among other things, has a CSC block that can convert
RGB input from the DPU to YUV data.

This block can be used with either the HDMI, DP or writeback interface.

In this series, let's first add support for the CDM block to be used
with writeback and then follow up with support for other interfaces such
as DP.

This was validated by adding support to pass custom output format to the
IGT's kms_writeback test-case, specifically only for the output dump
test-case [1].

The usage for this is:

./kms_writeback -d -f 

So for NV12, this can be verified with the below command:

./kms_writeback -d -f NV12

[1] : https://patchwork.freedesktop.org/series/122125/
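
For readers less familiar with what producing NV12 involves, the sketch below shows the general idea of 4:2:0 chroma downsampling: each 2x2 block of full-resolution chroma samples is reduced to one sample. This is a plain averaging filter chosen for illustration, not the filter the CDM hardware applies (the series programs cosite/offsite coefficient registers for that). The function name and the planar 8-bit layout are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Downsample one full-resolution 8-bit chroma plane (width x height,
 * both even) to 4:2:0 by averaging each 2x2 block with rounding.
 * dst must hold (width / 2) * (height / 2) samples.
 */
void demo_chroma_420_avg(const uint8_t *src, size_t width, size_t height,
			 uint8_t *dst)
{
	assert(width % 2 == 0 && height % 2 == 0);

	for (size_t y = 0; y < height; y += 2) {
		for (size_t x = 0; x < width; x += 2) {
			unsigned int sum = src[y * width + x] +
					   src[y * width + x + 1] +
					   src[(y + 1) * width + x] +
					   src[(y + 1) * width + x + 1];

			/* +2 rounds to nearest before the divide by 4. */
			dst[(y / 2) * (width / 2) + x / 2] =
				(uint8_t)((sum + 2) / 4);
		}
	}
}
```

Applied to both the U and V planes and then interleaved, this would give the half-resolution chroma plane that NV12 stores after the luma plane.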

changes in v2:
- rebased on top of current msm-next-lumag
- fix commit text of some of the patches
- move csc matrices to dpu_hw_util as they span across DPU
- move cdm blk define to dpu_hw_catalog as it's common across chipsets
- remove bit magic in dpu_hw_cdm with relevant defines
- use drmm_kzalloc instead of kzalloc/free
- protect bind_pingpong_blk with core_rev check
- drop setup_csc_data() and setup_cdwn() ops as they
  are merged into enable()
- move needs_cdm to topology struct
- call update_pending_flush_cdm even when bind_pingpong_blk
  is not present
- drop usage of setup_csc_data() and setup_cdwn() cdm ops
  as they both have been merged into enable()
- drop redundant hw_cdm and hw_pp checks
- drop fb related checks from dpu_encoder::atomic_mode_set()
- introduce separate wb2_format arrays for rgb and yuv

Abhinav Kumar (16):
  drm/msm/dpu: add formats check for writeback encoder
  drm/msm/dpu: rename dpu_encoder_phys_wb_setup_cdp to match its
functionality
  drm/msm/dpu: fix writeback programming for YUV cases
  drm/msm/dpu: move csc matrices to dpu_hw_util
  drm/msm/dpu: add cdm blocks to sc7280 dpu_hw_catalog
  drm/msm/dpu: add cdm blocks to sm8250 dpu_hw_catalog
  drm/msm/dpu: add dpu_hw_cdm abstraction for CDM block
  drm/msm/dpu: add cdm blocks to RM
  drm/msm/dpu: add support to allocate CDM from RM
  drm/msm/dpu: add CDM related logic to dpu_hw_ctl layer
  drm/msm/dpu: add support to disable CDM block during encoder cleanup
  drm/msm/dpu: add an API to setup the CDM block for writeback
  drm/msm/dpu: plug-in the cdm related bits to writeback setup
  drm/msm/dpu: reserve cdm blocks for writeback in case of YUV output
  drm/msm/dpu: introduce separate wb2_format arrays for rgb and yuv
  drm/msm/dpu: add cdm blocks to dpu snapshot

 drivers/gpu/drm/msm/Makefile  |   1 +
 .../msm/disp/dpu1/catalog/dpu_10_0_sm8650.h   |   4 +-
 .../msm/disp/dpu1/catalog/dpu_6_0_sm8250.h|   5 +-
 .../msm/disp/dpu1/catalog/dpu_6_2_sc7180.h|   4 +-
 .../msm/disp/dpu1/catalog/dpu_7_2_sc7280.h|   5 +-
 .../msm/disp/dpu1/catalog/dpu_9_0_sm8550.h|   4 +-
 drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c   |  37 +++
 .../gpu/drm/msm/disp/dpu1/dpu_encoder_phys.h  |   5 +
 .../drm/msm/disp/dpu1/dpu_encoder_phys_wb.c   | 117 +++-
 .../gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c|  47 ++-
 .../gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h|  13 +
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_cdm.c| 276 ++
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_cdm.h| 114 
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.c|  35 +++
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.h|  12 +
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_mdss.h   |   7 +
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_util.c   |  71 +
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_util.h   |   8 +
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_wb.c |   3 +
 drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c   |   4 +
 drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h   |   1 +
 drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c |  39 +--
 drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c|  51 +++-
 drivers/gpu/drm/msm/disp/dpu1/dpu_rm.h|   2 +
 drivers/gpu/drm/msm/msm_drv.h |   2 +
 25 files changed, 815 insertions(+), 52 deletions(-)
 create mode 100644 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_cdm.c
 create mode 100644 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_cdm.h

-- 
2.40.1

[git pull] drm fixes for 6.7-rc5

2023-12-07 Thread Dave Airlie

Hi Linus,

Regular weekly fixes, mostly amdgpu and i915 as usual. A couple of
nouveau, panfrost, one core and one bridge Kconfig.

Seems about normal for rc5,

Regards,
Dave.

drm-fixes-2023-12-08:
drm fixes for v6.7-rc5

atomic-helpers:
- invoke end_fb_access while owning plane state

i915:
- fix a missing dep for a previous fix
- Relax BXT/GLK DSI transcoder hblank limits
- Fix DP MST .mode_valid_ctx() return values
- Reject DP MST modes that require bigjoiner (as it's not yet
supported on DP MST)
- Fix _intel_dsb_commit() variable type to allow negative values

nouveau:
- document some bits of gsp rm
- flush vmm more on tu102 to avoid hangs

panfrost:
- fix imported dma-buf objects residency
- fix device freq update

bridge:
- tc358768 - fix Kconfig

amdgpu:
- Disable MCBP on gfx9
- DC vbios fix
- eDP fix
- dml2 UBSAN fix
- SMU 14 fix
- RAS fixes
- dml KASAN/KCSAN fix
- PSP 13 fix
- Clockgating fixes
- Suspend fix

exynos:
- fix pointer dereference
- fix wrong error check

The following changes since commit 33cc938e65a98f1d29d0a18403dbbee050dcad9a:

  Linux 6.7-rc4 (2023-12-03 18:52:56 +0900)

are available in the Git repository at:

  git://anongit.freedesktop.org/drm/drm tags/drm-fixes-2023-12-08

for you to fetch changes up to b7b5a56acec819bb8dcd03c687e97a091b29d28f:

  Merge tag 'exynos-drm-next-for-v6.7-rc5' of
git://git.kernel.org/pub/scm/linux/kernel/git/daeinki/drm-exynos into
drm-fixes (2023-12-08 13:55:32 +1000)


drm fixes for v6.7-rc5

atomic-helpers:
- invoke end_fb_access while owning plane state

i915:
- fix a missing dep for a previous fix
- Relax BXT/GLK DSI transcoder hblank limits
- Fix DP MST .mode_valid_ctx() return values
- Reject DP MST modes that require bigjoiner (as it's not yet
supported on DP MST)
- Fix _intel_dsb_commit() variable type to allow negative values

nouveau:
- document some bits of gsp rm
- flush vmm more on tu102 to avoid hangs

panfrost:
- fix imported dma-buf objects residency
- fix device freq update

bridge:
- tc358768 - fix Kconfig

amdgpu:
- Disable MCBP on gfx9
- DC vbios fix
- eDP fix
- dml2 UBSAN fix
- SMU 14 fix
- RAS fixes
- dml KASAN/KCSAN fix
- PSP 13 fix
- Clockgating fixes
- Suspend fix

exynos:
- fix pointer dereference
- fix wrong error check


Adrián Larumbe (2):
  drm/panfrost: Consider dma-buf imported objects as resident
  drm/panfrost: Fix incorrect updating of current device frequency

Alex Deucher (2):
  drm/amd/display: Increase frame warning limit with KASAN or KCSAN in dml
  drm/amdgpu: fix buffer funcs setting order on suspend

Alvin Lee (1):
  drm/amd/display: Use channel_width = 2 for vram table 3.0

Arnd Bergmann (1):
  drm/bridge: tc358768: select CONFIG_VIDEOMODE_HELPERS

Dave Airlie (6):
  nouveau/tu102: flush all pdbs on vmm flush
  Merge tag 'drm-intel-fixes-2023-12-01-1' of
git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
  Merge tag 'drm-intel-fixes-2023-12-07' of
git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
  Merge tag 'drm-misc-fixes-2023-12-07' of
git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
  Merge tag 'amd-drm-fixes-6.7-2023-12-06' of
https://gitlab.freedesktop.org/agd5f/linux into drm-fixes
  Merge tag 'exynos-drm-next-for-v6.7-rc5' of
git://git.kernel.org/pub/scm/linux/kernel/git/daeinki/drm-exynos into
drm-fixes

Hawking Zhang (1):
  drm/amdgpu: Update fw version for boot time error query

Inki Dae (1):
  drm/exynos: fix a wrong error checking

Ivan Lipski (1):
  drm/amd/display: Add monitor patch for specific eDP

Jiadong Zhu (1):
  drm/amdgpu: disable MCBP by default

Li Ma (1):
  drm/amd/swsmu: update smu v14_0_0 driver if version and metrics table

Lijo Lazar (4):
  drm/amdgpu: Restrict extended wait to PSP v13.0.6
  drm/amdgpu: Add NULL checks for function pointers
  drm/amdgpu: Update HDP 4.4.2 clock gating flags
  drm/amdgpu: Avoid querying DRM MGCG status

Roman Li (1):
  drm/amd/display: Fix array-index-out-of-bounds in dml2

Thomas Zimmermann (1):
  drm/atomic-helpers: Invoke end_fb_access while owning plane state

Timur Tabi (1):
  nouveau/gsp: document some aspects of GSP-RM

Ville Syrjälä (4):
  drm/i915: Check pipe active state in {planes,vrr}_{enabling,disabling}()
  drm/i915: Skip some timing checks on BXT/GLK DSI transcoders
  drm/i915/mst: Fix .mode_valid_ctx() return values
  drm/i915/mst: Reject modes that require the bigjoiner

Xiang Yang (1):
  drm/exynos: fix a potential error pointer dereference

Yang Wang (2):
  drm/amd/pm: support new mca smu error code decoding
  drm/amdgpu: optimize the printing order of error data

heminhong (1):
  drm/i915: correct the input parameter on _intel_dsb_commit()

 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  6 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_mca.h

Re: [PATCH 3/3] arm64: dts: qcom: sm8650: Add DisplayPort device nodes

2023-12-07 Thread Bjorn Andersson

On Thu, Dec 07, 2023 at 05:37:19PM +0100, Neil Armstrong wrote:
> diff --git a/arch/arm64/boot/dts/qcom/sm8650.dtsi 
> b/arch/arm64/boot/dts/qcom/sm8650.dtsi
[..]
> +
> + mdss_dp0: displayport-controller@af54000 {
> + compatible = "qcom,sm8650-dp";
> + reg = <0 0xaf54000 0 0x200>,
> +   <0 0xaf54200 0 0x200>,
> +   <0 0xaf55000 0 0xc00>,
> +   <0 0xaf56000 0 0x400>,
> +   <0 0xaf57000 0 0x400>;
> +
> + interrupts-extended = < 12>;
> +
> + clocks = < DISP_CC_MDSS_AHB_CLK>,
> +  < DISP_CC_MDSS_DPTX0_AUX_CLK>,
> +  < DISP_CC_MDSS_DPTX0_LINK_CLK>,
> +  < 
> DISP_CC_MDSS_DPTX0_LINK_INTF_CLK>,
> +  < 
> DISP_CC_MDSS_DPTX0_PIXEL0_CLK>;
> + clock-names = "core_iface",
> +   "core_aux",
> +   "ctrl_link",
> +   "ctrl_link_iface",
> +   "stream_pixel";
> +
> + assigned-clocks = < 
> DISP_CC_MDSS_DPTX0_LINK_CLK_SRC>,
> +   < 
> DISP_CC_MDSS_DPTX0_PIXEL0_CLK_SRC>;
> + assigned-clock-parents = <_dp_qmpphy 
> QMP_USB43DP_DP_LINK_CLK>,
> +  <_dp_qmpphy 
> QMP_USB43DP_DP_VCO_DIV_CLK>;
> +
> + operating-points-v2 = <_opp_table>;
> +
> + power-domains = < RPMHPD_MX>;

Are you sure the DP TX block sits in MX? I'd expect this to be
RPMHPD_MMCX, and then the PHY partially in MX...

> +
> + phys = <_dp_qmpphy QMP_USB43DP_DP_PHY>;
> + phy-names = "dp";
> +
> + #sound-dai-cells = <0>;
> +
> + status = "disabled";
> +
> + ports {
> + #address-cells = <1>;
> + #size-cells = <0>;
> +
> + port@0 {
> + reg = <0>;
> +
> + mdss_dp0_in: endpoint {
> + remote-endpoint = 
> <_intf0_out>;
> + };
> + };
> +
> + port@1 {
> + reg = <1>;
> +
> + mdss_dp0_out: endpoint {
> + };
> + };
> + };
> +
> + dp_opp_table: opp-table {

Is there any reason why we keep sorting 'o' after 'p' in these nodes?

Regards,
Bjorn

Re:Re: [v3 5/6] drm/vs: Add hdmi driver

2023-12-07 Thread Andy Yan



Hi Keith:

At 2023-12-08 11:00:31, "Keith Zhao" wrote:
>
>
>On 2023/12/8 8:37, Andy Yan wrote:
>> Hi Keith:
>> 
>> At 2023-12-07 18:48:13, "Keith Zhao" wrote:
>>>
>>>
>>>On 2023/12/7 17:02, Andy Yan wrote:
 
 
 
 
 Hi Keith:
 
 At 2023-12-06 22:11:33, "Keith Zhao" wrote:
>
>
>On 2023/12/6 20:56, Maxime Ripard wrote:
>> On Wed, Dec 06, 2023 at 08:02:55PM +0800, Keith Zhao wrote:
>>> >> +static const struct of_device_id starfive_hdmi_dt_ids[] = {
>>> >> +{ .compatible = "starfive,jh7110-inno-hdmi",},
>>> > 
>>> > So it's inno hdmi, just like Rockchip then?
>>> > 
>>> > This should be a common driver.
>>>
>>> Rockchip has an inno hdmi IP, and Starfive has an inno hdmi IP,
>>> but the hardware difference between them is big, so it is not easy to
>>> use a common driver.
>>> Maybe I need the inno hdmi version here to make a distinction.
>> 
>> I just had a look at the rockchip header file: all the registers but the
>> STARFIVE_* ones are identical.
>> 
>> There's no need to have two identical drivers then, please use the
>> rockchip driver instead.
>> 
>> Maxime
>
>OK, I ran a simple test and the EDID can be read. I will continue.
 
 Maybe you can take drivers/gpu/drm/bridge/synopsys/dw-hdmi as a reference;
 this is also an HDMI IP used by rockchip/meson/sunxi/jz/imx.
 We eventually made them share one driver.
>
>>>Hi Andy:
>>>
>>>dw_hdmi seems a good choice; it can handle the inno hdmi hardware by defining its
>>>dw_hdmi_plat_data.
>>>Does it mean I can write my own driver files (such as dw_hdmi-starfive.c) based
>>>on dw_hdmi instead of adding plat_data in inno_hdmi.c?
>>>
>> 
>> I think the process may be like this:
>> 
>> 1. Split the inno_hdmi.c under rockchip into inno_hdmi.c (the common part) and
>> inno_hdmi-rockchip.c (the SoC-specific part).
>> 2. Move the common part, inno_hdmi.c, to drivers/gpu/drm/bridge/innosilicon/.
>> 3. Add the StarFive-specific part, inno_hdmi-starfive.c.
>> 
>> The git log below, from the kernel tree, shows how we converted dw_hdmi to a
>> common driver:
>> 
>> 
>> 12b9f204e804 drm: bridge/dw_hdmi: add rockchip rk3288 support
>> 74af9e4d03b8 dt-bindings: Add documentation for rockchip dw hdmi
>> d346c14eeea9 drm: bridge/dw_hdmi: add function dw_hdmi_phy_enable_spare
>> a4d3b8b050d5 drm: bridge/dw_hdmi: clear i2cmphy_stat0 reg in hdmi_phy_wait_i2c_done
>> 632d035bace2 drm: bridge/dw_hdmi: add mode_valid support
>> 0cd9d1428322 drm: bridge/dw_hdmi: add support for multi-byte register width access
>> cd152393967e dt-bindings: add document for dw_hdmi
>> b21f4b658df8 drm: imx: imx-hdmi: move imx-hdmi to bridge/dw_hdmi
>> aaa757a092c2 drm: imx: imx-hdmi: split phy configuration to platform driver
>> 3d1b35a3d9f3 drm: imx: imx-hdmi: convert imx-hdmi to drm_bridge mode
>> c2c3848851a7 drm: imx: imx-hdmi: return defer if can't get ddc i2c adapter
>> b587833933de drm: imx: imx-hdmi: make checkpatch happy
>> 
>Hi Andy:
>I get what you mean.
>As I don't have a Rockchip board on hand, splitting inno_hdmi.c cannot be
>tested.
>
>How about this idea:
>1. Split the starfive_hdmi.c under verisilicon into inno_hdmi.c (the common
>part) and inno_hdmi-starfive.c (the SoC-specific part).
>2. Move the common part, inno_hdmi.c, to drivers/gpu/drm/bridge/innosilicon/.
>3. In the future, the inno_hdmi.c under rockchip will reuse the common driver.

I am not sure if the drm maintainers would be happy with this.

To be honest, I also didn't have an i.MX board when I started converting
dw_hdmi to a common driver. Some respectable people from the community helped
test and gave me a lot of valuable advice. This is the power of open source.

I found an rk3036-based kylin board this week, but it can't boot yet. I will
keep trying to boot it this weekend. I can do the testing on the Rockchip
side if I can make this board work.

>
>> 
>>>Thanks for pointing this out!!!
>>>
>
>___
>linux-riscv mailing list
>linux-ri...@lists.infradead.org
>http://lists.infradead.org/mailman/listinfo/linux-riscv

Re: [v3 5/6] drm/vs: Add hdmi driver

2023-12-07 Thread Keith Zhao




On 2023/12/8 8:37, Andy Yan wrote:
> Hi Keith:
> 
> At 2023-12-07 18:48:13, "Keith Zhao" wrote:
>>
>>
>>On 2023/12/7 17:02, Andy Yan wrote:
>>> Hi Keith:
>>> 
>>> At 2023-12-06 22:11:33, "Keith Zhao" wrote:


On 2023/12/6 20:56, Maxime Ripard wrote:
> On Wed, Dec 06, 2023 at 08:02:55PM +0800, Keith Zhao wrote:
>> >> +static const struct of_device_id starfive_hdmi_dt_ids[] = {
>> >> + { .compatible = "starfive,jh7110-inno-hdmi",},
>> > 
>> > So it's inno hdmi, just like Rockchip then?
>> > 
>> > This should be a common driver.
>>
>> Rockchip has an inno hdmi IP, and Starfive has an inno hdmi IP,
>> but the hardware difference between them is big, so it is not easy to
>> use a common driver.
>> Maybe I need the inno hdmi version here to make a distinction.
> 
> I just had a look at the rockchip header file: all the registers but the
> STARFIVE_* ones are identical.
> 
> There's no need to have two identical drivers then, please use the
> rockchip driver instead.
> 
> Maxime

OK, I ran a simple test and the EDID can be read. I will continue.
>>> 
>>> Maybe you can take drivers/gpu/drm/bridge/synopsys/dw-hdmi as a reference;
>>> this is also an HDMI IP used by rockchip/meson/sunxi/jz/imx.
>>> We eventually made them share one driver.

>>Hi Andy:
>>
>>dw_hdmi seems a good choice; it can handle the inno hdmi hardware by defining its
>>dw_hdmi_plat_data.
>>Does it mean I can write my own driver files (such as dw_hdmi-starfive.c) based
>>on dw_hdmi instead of adding plat_data in inno_hdmi.c?
>>
> 
> I think the process may be like this:
> 
> 1. Split the inno_hdmi.c under rockchip into inno_hdmi.c (the common part) and
> inno_hdmi-rockchip.c (the SoC-specific part).
> 2. Move the common part, inno_hdmi.c, to drivers/gpu/drm/bridge/innosilicon/.
> 3. Add the StarFive-specific part, inno_hdmi-starfive.c.
> 
> The git log below, from the kernel tree, shows how we converted dw_hdmi to a
> common driver:
> 
> 
> 
> 12b9f204e804 drm: bridge/dw_hdmi: add rockchip rk3288 support
> 74af9e4d03b8 dt-bindings: Add documentation for rockchip dw hdmi
> d346c14eeea9 drm: bridge/dw_hdmi: add function dw_hdmi_phy_enable_spare
> a4d3b8b050d5 drm: bridge/dw_hdmi: clear i2cmphy_stat0 reg in hdmi_phy_wait_i2c_done
> 632d035bace2 drm: bridge/dw_hdmi: add mode_valid support
> 0cd9d1428322 drm: bridge/dw_hdmi: add support for multi-byte register width access
> cd152393967e dt-bindings: add document for dw_hdmi
> b21f4b658df8 drm: imx: imx-hdmi: move imx-hdmi to bridge/dw_hdmi
> aaa757a092c2 drm: imx: imx-hdmi: split phy configuration to platform driver
> 3d1b35a3d9f3 drm: imx: imx-hdmi: convert imx-hdmi to drm_bridge mode
> c2c3848851a7 drm: imx: imx-hdmi: return defer if can't get ddc i2c adapter
> b587833933de drm: imx: imx-hdmi: make checkpatch happy
> 
Hi Andy:
I get what you mean.
As I don't have a Rockchip board on hand, splitting inno_hdmi.c cannot be
tested.

How about this idea:
1. Split the starfive_hdmi.c under verisilicon into inno_hdmi.c (the common
part) and inno_hdmi-starfive.c (the SoC-specific part).
2. Move the common part, inno_hdmi.c, to drivers/gpu/drm/bridge/innosilicon/.
3. In the future, the inno_hdmi.c under rockchip will reuse the common driver.
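
The split proposed above (a common core plus a small SoC-specific part) could be sketched roughly as below. This is only an illustration that builds in userspace: every name with a mock_ prefix, the shape of the plat_data structure, and the phy_config hook are assumptions made for the example and do not come from the posted driver or from the existing Rockchip inno_hdmi.c.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical hooks that a SoC glue file would hand to the shared core.
 * The real split may need different or additional callbacks.
 */
struct mock_inno_hdmi_plat_data {
	const char *soc_name;
	int (*phy_config)(unsigned long pixel_clock_hz);
};

/* Shared logic: would live in the common inno_hdmi.c. */
static int mock_inno_hdmi_set_mode(const struct mock_inno_hdmi_plat_data *pd,
				   unsigned long pixel_clock_hz)
{
	assert(pd != NULL);

	/* The core never touches SoC details; it only calls the hook. */
	if (!pd->phy_config)
		return -1;

	return pd->phy_config(pixel_clock_hz);
}

/* SoC-specific part: would live in inno_hdmi-starfive.c. */
static int mock_starfive_phy_config(unsigned long pixel_clock_hz)
{
	/* Placeholder check standing in for real PHY programming. */
	return pixel_clock_hz ? 0 : -1;
}

static const struct mock_inno_hdmi_plat_data mock_starfive_plat_data = {
	.soc_name = "starfive",
	.phy_config = mock_starfive_phy_config,
};
```

A Rockchip glue file would later provide its own plat_data instance and call the same core function, which is the pattern dw_hdmi follows with dw_hdmi_plat_data.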

> 
>>Thanks for pointing this out!!!
>>


Re: [PATCH v2 0/4] Adreno 643 + fixes

2023-12-07 Thread Bjorn Andersson



On Mon, 20 Nov 2023 13:12:51 +0100, Konrad Dybcio wrote:
> as it says on the can
> 
> drm/msm patches for Rob
> arm64 patches for linux-arm-msm
> 
> for use with https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25408
> 
> [...]

Applied, thanks!

[1/4] arm64: dts: qcom: sc7280: Add ZAP shader support
  commit: 0ab1bef0b7c359e672cc2b8d51f0179cefa369fc
[2/4] arm64: dts: qcom: sc7280: Fix up GPU SIDs
  commit: 94085049fdad7a36fe14dd55e72e712fe55d6bca
[3/4] arm64: dts: qcom: sc7280: Mark Adreno SMMU as DMA coherent
  commit: 31edad478534186a2718be9206ce7b19f2735f6e
[4/4] arm64: dts: qcom: sc7280: Add 0xac Adreno speed bin
  commit: 6a7f8c635dab30233df93b5566d4169ed956b71b

Best regards,
-- 
Bjorn Andersson

[PATCH] x86/vmware: Add TDX hypercall support

2023-12-07 Thread Alexey Makhalov

From: Alexey Makhalov 

VMware hypercalls use I/O port, VMCALL or VMMCALL instructions.
Add __tdx_hypercall path to support TDX guests.

No change in high bandwidth hypercalls, as only low bandwidth
ones are supported for TDX guests.

Co-developed-by: Tim Merrifield 
Signed-off-by: Tim Merrifield 
Signed-off-by: Alexey Makhalov 
Reviewed-by: Nadav Amit 
---
 arch/x86/include/asm/vmware.h | 83 +++
 arch/x86/kernel/cpu/vmware.c  | 22 ++
 2 files changed, 105 insertions(+)

diff --git a/arch/x86/include/asm/vmware.h b/arch/x86/include/asm/vmware.h
index 719e41260ece..04c698b905ab 100644
--- a/arch/x86/include/asm/vmware.h
+++ b/arch/x86/include/asm/vmware.h
@@ -34,12 +34,65 @@
 #define VMWARE_CMD_GETHZ   45
 #define VMWARE_CMD_GETVCPU_INFO    68
 #define VMWARE_CMD_STEALCLOCK  91
+/*
+ * Hypercall command mask:
+ *   bits[6:0] command, range [0, 127]
+ *   bits[19:16] sub-command, range [0, 15]
+ */
+#define VMWARE_CMD_MASK        0xf007fULL
 
 #define CPUID_VMWARE_FEATURES_ECX_VMMCALL  BIT(0)
 #define CPUID_VMWARE_FEATURES_ECX_VMCALL   BIT(1)
 
 extern u8 vmware_hypercall_mode;
 
+#define VMWARE_TDX_VENDOR_LEAF 0x1af7e4909ULL
+#define VMWARE_TDX_HCALL_FUNC  1
+
+extern unsigned long vmware_tdx_hypercall(struct tdx_module_args *args);
+
+/*
+ * TDCALL[TDG.VP.VMCALL] uses rax (arg0) and rcx (arg2), while the use of
+ * rbp (arg6) is discouraged by the TDX specification. Therefore, we
+ * remap those registers to r12, r13 and r14, respectively.
+ */
+static inline
+unsigned long vmware_tdx_hypercall_args(unsigned long cmd, unsigned long in1,
+   unsigned long in3, unsigned long in4,
+   unsigned long in5, unsigned long in6,
+   uint32_t *out1, uint32_t *out2,
+   uint32_t *out3, uint32_t *out4,
+   uint32_t *out5, uint32_t *out6)
+{
+   unsigned long ret;
+
+   struct tdx_module_args args = {
+   .r13 = cmd,
+   .rbx = in1,
+   .rdx = in3,
+   .rsi = in4,
+   .rdi = in5,
+   .r14 = in6,
+   };
+
+   ret = vmware_tdx_hypercall(&args);
+
+   if (out1)
+   *out1 = args.rbx;
+   if (out2)
+   *out2 = args.r13;
+   if (out3)
+   *out3 = args.rdx;
+   if (out4)
+   *out4 = args.rsi;
+   if (out5)
+   *out5 = args.rdi;
+   if (out6)
+   *out6 = args.r14;
+
+   return ret;
+}
+
 /*
  * The low bandwidth call. The low word of edx is presumed to have OUT bit
  * set. The high word of edx may contain input data from the caller.
@@ -67,6 +120,11 @@ unsigned long vmware_hypercall1(unsigned long cmd, unsigned 
long in1)
 {
unsigned long out0;
 
+   if (cpu_feature_enabled(X86_FEATURE_TDX_GUEST))
+   return vmware_tdx_hypercall_args(cmd, in1, 0, 0, 0, 0,
+NULL, NULL, NULL,
+NULL, NULL, NULL);
+
asm_inline volatile (VMWARE_HYPERCALL
: "=a" (out0)
: [port] "i" (VMWARE_HYPERVISOR_PORT),
@@ -85,6 +143,11 @@ unsigned long vmware_hypercall3(unsigned long cmd, unsigned 
long in1,
 {
unsigned long out0;
 
+   if (cpu_feature_enabled(X86_FEATURE_TDX_GUEST))
+   return vmware_tdx_hypercall_args(cmd, in1, 0, 0, 0, 0,
+out1, out2, NULL,
+NULL, NULL, NULL);
+
asm_inline volatile (VMWARE_HYPERCALL
: "=a" (out0), "=b" (*out1), "=c" (*out2)
: [port] "i" (VMWARE_HYPERVISOR_PORT),
@@ -104,6 +167,11 @@ unsigned long vmware_hypercall4(unsigned long cmd, 
unsigned long in1,
 {
unsigned long out0;
 
+   if (cpu_feature_enabled(X86_FEATURE_TDX_GUEST))
+   return vmware_tdx_hypercall_args(cmd, in1, 0, 0, 0, 0,
+out1, out2, out3,
+NULL, NULL, NULL);
+
asm_inline volatile (VMWARE_HYPERCALL
: "=a" (out0), "=b" (*out1), "=c" (*out2), "=d" (*out3)
: [port] "i" (VMWARE_HYPERVISOR_PORT),
@@ -123,6 +191,11 @@ unsigned long vmware_hypercall5(unsigned long cmd, 
unsigned long in1,
 {
unsigned long out0;
 
+   if (cpu_feature_enabled(X86_FEATURE_TDX_GUEST))
+   return vmware_tdx_hypercall_args(cmd, in1, in3, in4, in5, 0,
+NULL, out2, NULL,
+NULL, NULL, NULL);
+
asm_inline volatile (VMWARE_HYPERCALL
: "=a" (out0), "=c" (*out2)
: [port] "i" (VMWARE_HYPERVISOR_PORT),
@@ -145,6 +218,11

Re: [PATCH] x86/vmware: Add TDX hypercall support

2023-12-07 Thread Alexey Makhalov





On 12/7/23 9:12 AM, Dave Hansen wrote:

On 12/5/23 23:15, Alexey Makhalov wrote:

+#ifdef CONFIG_INTEL_TDX_GUEST
+/* Export tdx hypercall and allow it only for VMware guests. */
+void vmware_tdx_hypercall_args(struct tdx_module_args *args)
+{
+   if (hypervisor_is_type(X86_HYPER_VMWARE))
+   __tdx_hypercall(args);
+}
+EXPORT_SYMBOL_GPL(vmware_tdx_hypercall_args);
+#endif


I think this is still too generic.  This still allows anything setting
X86_HYPER_VMWARE to make any TDX hypercall.

I'd *much* rather you export something like vmware_tdx_hypercall() or
even the high-level calls like hypervisor_ppn_reset_all().  The higher
level and more specialized the interface, the less likely it is to be
abused.


Dave, I understood your point. Please take a look at the next version of
the patch.


I export vmware_tdx_hypercall(), while vmware_tdx_hypercall_args() is a
static inline wrapper on top.
Most of the VMware hypercall logic plus sanity checks are now in the
exported function, while only input and output argument handling remains
in the wrapper to allow compiler optimization for hypercalls with few
arguments. Exporting vmware_tdx_hypercall1, vmware_tdx_hypercall3, and so
on is not an option either.
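
For illustration, the split described above might look roughly like the userspace sketch below. It is not the code from the patch: the mock_ names, the reduced argument structure, the error value, and the echo behaviour that stands in for the real hypercall are all assumptions. Only the command mask value and the use of r13 for the command are taken from the posted header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced stand-in for the kernel's struct tdx_module_args. */
struct mock_tdx_args {
	uint64_t rbx, rdx, rsi, rdi, r13, r14;
};

/* Mask value as defined in the posted patch (command + sub-command bits). */
#define MOCK_VMWARE_CMD_MASK 0xf007fULL

/*
 * Stand-in for the exported function: the sanity check lives here, so a
 * caller cannot issue an arbitrary hypercall through the wrapper.
 */
static unsigned long mock_vmware_tdx_hypercall(struct mock_tdx_args *args)
{
	assert(args != NULL);

	/* Assumed check: reject anything outside the command mask. */
	if (args->r13 & ~MOCK_VMWARE_CMD_MASK)
		return (unsigned long)-1;	/* hypothetical error value */

	/* A real guest would issue the TDX hypercall here; the mock just
	 * bumps rbx so the caller can observe an "output" register. */
	args->rbx += 1;
	return 0;
}

/* Stand-in for the static inline wrapper: argument marshalling only. */
static inline unsigned long
mock_vmware_tdx_hypercall_args(unsigned long cmd, unsigned long in1,
			       uint32_t *out1)
{
	struct mock_tdx_args args = { .r13 = cmd, .rbx = in1 };
	unsigned long ret = mock_vmware_tdx_hypercall(&args);

	if (out1)
		*out1 = (uint32_t)args.rbx;
	return ret;
}
```

Because the wrapper is inline, a caller that passes NULL for an output lets the compiler drop the corresponding store, which is the optimization mentioned above.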


Regards,
--Alexey

Re: [net-next v1 00/16] Device Memory TCP

2023-12-07 Thread Mina Almasry

On Thu, Dec 7, 2023 at 4:52 PM Mina Almasry  wrote:
>
> Major changes in v1:
> --
>
> 1. Implemented MVP queue API ndos to remove the userspace-visible
>driver reset.
>
> 2. Fixed issues in the napi_pp_put_page() devmem frag unref path.
>
> 3. Removed RFC tag.
>
> Many smaller addressed comments across all the patches (patches have
> individual change log).
>
> Full tree including the rest of the GVE driver changes:
> https://github.com/mina/linux/commits/tcpdevmem-v1
>
> Cc: Yunsheng Lin 
> Cc: Shailend Chand 
> Cc: Harshitha Ramamurthy 
>

Welp, I messed up the subject line. It should say [PATCH net-next...]
across all the patches. This may trip up bots and email filters. If
this is annoying, I'll resend with the fixed subject line after the
24hr cooldown period. Sorry about that.

-- 
Thanks,
Mina

Re: (subset) [PATCH 0/3] arm64: qcom: sm8650: add support for DisplayPort Controller

2023-12-07 Thread Dmitry Baryshkov



On Thu, 07 Dec 2023 17:37:16 +0100, Neil Armstrong wrote:
> This adds support for the DisplayPort Controller found in the SM8650
> SoC, but it requires a specific compatible because the register offsets
> have changed since SM8550.
> 
> This also updates the SM8650 MDSS bindings to allow a displayport subnode,
> and adds the necessary changes in the SM8650 DTSI to declare the DisplayPort
> Controller.
> 
> [...]

Applied, thanks!

[1/3] dt-bindings: display: msm: dp-controller: document SM8650 compatible
  https://gitlab.freedesktop.org/lumag/msm/-/commit/157fd368561e
[2/3] drm/msm/dp: Add DisplayPort controller for SM8650
  https://gitlab.freedesktop.org/lumag/msm/-/commit/1b2d98bdd7b7

Best regards,
-- 
Dmitry Baryshkov

Re: [PATCH] drm/msm/dpu: drop MSM_ENC_VBLANK support

2023-12-07 Thread Dmitry Baryshkov



On Wed, 04 Oct 2023 06:19:03 +0300, Dmitry Baryshkov wrote:
> There are no in-kernel users of MSM_ENC_VBLANK wait type. Drop it
> together with the corresponding wait_for_vblank callback.
> 
> 

Applied, thanks!

[1/1] drm/msm/dpu: drop MSM_ENC_VBLANK support
  https://gitlab.freedesktop.org/lumag/msm/-/commit/d4c74a150cce

Best regards,
-- 
Dmitry Baryshkov

Re: [PATCH] drm/msm/dp: Fix platform_get_irq() check

2023-12-07 Thread Dmitry Baryshkov



On Wed, 06 Dec 2023 15:02:05 +0300, Dan Carpenter wrote:
> The platform_get_irq() function returns negative error codes.  It never
> returns zero.  Fix the check accordingly.
> 
> 

Applied, thanks!

[1/1] drm/msm/dp: Fix platform_get_irq() check
  https://gitlab.freedesktop.org/lumag/msm/-/commit/c4ac0c6c96f0

Best regards,
-- 
Dmitry Baryshkov

Re: [PATCH] drm/msm/dp: Fix platform_get_irq() check

2023-12-07 Thread Dmitry Baryshkov


On 06/12/2023 14:02, Dan Carpenter wrote:

The platform_get_irq() function returns negative error codes.  It never
returns zero.  Fix the check accordingly.

Fixes: 82c2a5751227 ("drm/msm/dp: tie dp_display_irq_handler() with dp driver")
Signed-off-by: Dan Carpenter 
---
  drivers/gpu/drm/msm/dp/dp_display.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)
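
The pattern behind the fix can be shown with a small userspace sketch. The mock_ functions below are hypothetical stand-ins and not the dp_display.c code; they only assume what the commit message states, namely that the lookup returns a negative error code on failure and never returns zero.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for platform_get_irq(): a negative errno on
 * failure, a positive interrupt number on success, never zero.
 */
static int mock_platform_get_irq(int simulate_failure)
{
	return simulate_failure ? -ENXIO : 42;
}

static int mock_request_irq_number(int simulate_failure, int *irq_out)
{
	int irq;

	assert(irq_out != NULL);

	irq = mock_platform_get_irq(simulate_failure);

	/* Test for "< 0", not "!irq": zero is never returned, so a
	 * zero check would let every failure slip through. */
	if (irq < 0)
		return irq;

	*irq_out = irq;
	return 0;
}
```

Returning the negative value unchanged also lets the caller see the original error code.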


Reviewed-by: Dmitry Baryshkov 

--
With best wishes
Dmitry

[PATCH v4 2/2] drm/vkms: move wb's atomic_check from encoder to connector

2023-12-07 Thread Dmitry Baryshkov

As the renamed drm_atomic_helper_check_wb_connector_state() now accepts
drm_writeback_connector as the first argument (instead of drm_encoder),
move the VKMS writeback atomic_check from drm_encoder_helper_funcs to
drm_connector_helper_funcs. Also drop the vkms_wb_encoder_helper_funcs,
which have become empty now.

Signed-off-by: Dmitry Baryshkov 
---
 drivers/gpu/drm/vkms/vkms_writeback.c | 26 +++---
 1 file changed, 15 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/vkms/vkms_writeback.c 
b/drivers/gpu/drm/vkms/vkms_writeback.c
index fef7f3daf2c9..bc724cbd5e3a 100644
--- a/drivers/gpu/drm/vkms/vkms_writeback.c
+++ b/drivers/gpu/drm/vkms/vkms_writeback.c
@@ -30,18 +30,25 @@ static const struct drm_connector_funcs 
vkms_wb_connector_funcs = {
.atomic_destroy_state = drm_atomic_helper_connector_destroy_state,
 };
 
-static int vkms_wb_encoder_atomic_check(struct drm_encoder *encoder,
-   struct drm_crtc_state *crtc_state,
-   struct drm_connector_state *conn_state)
+static int vkms_wb_atomic_check(struct drm_connector *connector,
+   struct drm_atomic_state *state)
 {
-   struct drm_connector *connector = conn_state->connector;
+   struct drm_connector_state *conn_state =
+   drm_atomic_get_new_connector_state(state, connector);
+   struct drm_crtc_state *crtc_state;
struct drm_framebuffer *fb;
-   const struct drm_display_mode *mode = &crtc_state->mode;
+   const struct drm_display_mode *mode;
int ret;
 
if (!conn_state->writeback_job || !conn_state->writeback_job->fb)
return 0;
 
+   if (!conn_state->crtc)
+   return 0;
+
+   crtc_state = drm_atomic_get_new_crtc_state(state, conn_state->crtc);
+   mode = &crtc_state->mode;
+
fb = conn_state->writeback_job->fb;
if (fb->width != mode->hdisplay || fb->height != mode->vdisplay) {
DRM_DEBUG_KMS("Invalid framebuffer size %ux%u\n",
@@ -49,17 +56,13 @@ static int vkms_wb_encoder_atomic_check(struct drm_encoder 
*encoder,
return -EINVAL;
}
 
-   ret = drm_atomic_helper_check_wb_connector_state(connector, 
conn_state->state);
+   ret = drm_atomic_helper_check_wb_connector_state(connector, state);
if (ret < 0)
return ret;
 
return 0;
 }
 
-static const struct drm_encoder_helper_funcs vkms_wb_encoder_helper_funcs = {
-   .atomic_check = vkms_wb_encoder_atomic_check,
-};
-
 static int vkms_wb_connector_get_modes(struct drm_connector *connector)
 {
struct drm_device *dev = connector->dev;
@@ -162,6 +165,7 @@ static const struct drm_connector_helper_funcs 
vkms_wb_conn_helper_funcs = {
.prepare_writeback_job = vkms_wb_prepare_job,
.cleanup_writeback_job = vkms_wb_cleanup_job,
.atomic_commit = vkms_wb_atomic_commit,
+   .atomic_check = vkms_wb_atomic_check,
 };
 
 int vkms_enable_writeback_connector(struct vkms_device *vkmsdev)
@@ -172,7 +176,7 @@ int vkms_enable_writeback_connector(struct vkms_device 
*vkmsdev)
 
return drm_writeback_connector_init(&vkmsdev->drm, wb,
&vkms_wb_connector_funcs,
-   &vkms_wb_encoder_helper_funcs,
+   NULL,
vkms_wb_formats,
ARRAY_SIZE(vkms_wb_formats),
1);
-- 
2.39.2

[PATCH v4 1/2] drm/atomic-helper: rename drm_atomic_helper_check_wb_encoder_state

2023-12-07 Thread Dmitry Baryshkov

The drm_atomic_helper_check_wb_encoder_state() function doesn't use
encoder for anything other than getting the drm_device instance. The
function's description talks about checking the writeback connector
state, not the encoder state. Moreover, there is no such thing as an
encoder state, encoders generally do not have a state on their own.

Rename the function to drm_atomic_helper_check_wb_connector_state()
and change arguments to drm_writeback_connector and drm_atomic_state.

Signed-off-by: Dmitry Baryshkov 
---
 drivers/gpu/drm/drm_atomic_helper.c   | 16 +---
 drivers/gpu/drm/vkms/vkms_writeback.c |  3 ++-
 include/drm/drm_atomic_helper.h   |  5 ++---
 3 files changed, 13 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/drm_atomic_helper.c 
b/drivers/gpu/drm/drm_atomic_helper.c
index a920fbae714c..39ef0a6addeb 100644
--- a/drivers/gpu/drm/drm_atomic_helper.c
+++ b/drivers/gpu/drm/drm_atomic_helper.c
@@ -795,9 +795,9 @@ drm_atomic_helper_check_modeset(struct drm_device *dev,
 EXPORT_SYMBOL(drm_atomic_helper_check_modeset);
 
 /**
- * drm_atomic_helper_check_wb_encoder_state() - Check writeback encoder state
- * @encoder: encoder state to check
- * @conn_state: connector state to check
+ * drm_atomic_helper_check_wb_connector_state() - Check writeback connector 
state
+ * @connector: corresponding connector
+ * @state: the driver state object
  *
  * Checks if the writeback connector state is valid, and returns an error if it
  * isn't.
@@ -806,9 +806,11 @@ EXPORT_SYMBOL(drm_atomic_helper_check_modeset);
  * Zero for success or -errno
  */
 int
-drm_atomic_helper_check_wb_encoder_state(struct drm_encoder *encoder,
-struct drm_connector_state *conn_state)
+drm_atomic_helper_check_wb_connector_state(struct drm_connector *connector,
+  struct drm_atomic_state *state)
 {
+   struct drm_connector_state *conn_state =
+   drm_atomic_get_new_connector_state(state, connector);
struct drm_writeback_job *wb_job = conn_state->writeback_job;
struct drm_property_blob *pixel_format_blob;
struct drm_framebuffer *fb;
@@ -827,11 +829,11 @@ drm_atomic_helper_check_wb_encoder_state(struct 
drm_encoder *encoder,
if (fb->format->format == formats[i])
return 0;
 
-   drm_dbg_kms(encoder->dev, "Invalid pixel format %p4cc\n", &fb->format->format);
+   drm_dbg_kms(connector->dev, "Invalid pixel format %p4cc\n", &fb->format->format);
 
return -EINVAL;
 }
-EXPORT_SYMBOL(drm_atomic_helper_check_wb_encoder_state);
+EXPORT_SYMBOL(drm_atomic_helper_check_wb_connector_state);
 
 /**
  * drm_atomic_helper_check_plane_state() - Check plane state for validity
diff --git a/drivers/gpu/drm/vkms/vkms_writeback.c 
b/drivers/gpu/drm/vkms/vkms_writeback.c
index d7e63aa14663..fef7f3daf2c9 100644
--- a/drivers/gpu/drm/vkms/vkms_writeback.c
+++ b/drivers/gpu/drm/vkms/vkms_writeback.c
@@ -34,6 +34,7 @@ static int vkms_wb_encoder_atomic_check(struct drm_encoder 
*encoder,
struct drm_crtc_state *crtc_state,
struct drm_connector_state *conn_state)
 {
+   struct drm_connector *connector = conn_state->connector;
struct drm_framebuffer *fb;
const struct drm_display_mode *mode = &crtc_state->mode;
int ret;
@@ -48,7 +49,7 @@ static int vkms_wb_encoder_atomic_check(struct drm_encoder 
*encoder,
return -EINVAL;
}
 
-   ret = drm_atomic_helper_check_wb_encoder_state(encoder, conn_state);
+   ret = drm_atomic_helper_check_wb_connector_state(connector, 
conn_state->state);
if (ret < 0)
return ret;
 
diff --git a/include/drm/drm_atomic_helper.h b/include/drm/drm_atomic_helper.h
index 006b5c977ad7..9aa0a05aa072 100644
--- a/include/drm/drm_atomic_helper.h
+++ b/include/drm/drm_atomic_helper.h
@@ -49,9 +49,8 @@ struct drm_private_state;
 
 int drm_atomic_helper_check_modeset(struct drm_device *dev,
struct drm_atomic_state *state);
-int
-drm_atomic_helper_check_wb_encoder_state(struct drm_encoder *encoder,
-struct drm_connector_state 
*conn_state);
+int drm_atomic_helper_check_wb_connector_state(struct drm_connector *connector,
+  struct drm_atomic_state *state);
 int drm_atomic_helper_check_plane_state(struct drm_plane_state *plane_state,
const struct drm_crtc_state *crtc_state,
int min_scale,
-- 
2.39.2

[PATCH v4 0/2] drm/atomic-helper: rename drm_atomic_helper_check_wb_encoder_state

2023-12-07 Thread Dmitry Baryshkov

The function drm_atomic_helper_check_wb_encoder_state() doesn't use
drm_encoder for anything sensible. Internally it checks
drm_writeback_connector's state. Thus it makes sense to let this
function accept drm_connector object and the drm_atomic_state
and rename it to drm_atomic_helper_check_wb_connector_state().

Changes since v3:
- Fix the function usage in vkms_wb_encoder_atomic_check() (Maxime)

Changes since v2:
- Make the function accept drm_connector instead of
  drm_writeback_connector (Maxime)

Changes since v1:
- Make the function accept drm_writeback_connector and drm_atomic_state
  (Maxime)
- Added a patch for VKMS to move atomic_check of WB path from encoder to
  connector helpers.
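
A user-space sketch of the calling convention described above, for
illustration only. All mock_* types and functions are hypothetical
stand-ins, not the kernel definitions; the assumption is that the helper
looks up the connector's new state in the atomic state and then compares
the writeback framebuffer format against the connector's format list.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the DRM types; not the kernel definitions. */
struct mock_connector {
	const uint32_t *formats;	/* pixel formats the writeback connector accepts */
	size_t n_formats;
};

struct mock_connector_state {
	const struct mock_connector *connector;
	int has_job;			/* non-zero if a writeback job is attached */
	uint32_t fb_format;		/* fourcc of the job's framebuffer */
};

struct mock_atomic_state {
	struct mock_connector_state *conn_states;
	size_t n_conn_states;
};

/* Mock of a "get new connector state" lookup: find the state for a connector. */
static struct mock_connector_state *
mock_get_new_connector_state(struct mock_atomic_state *state,
			     const struct mock_connector *connector)
{
	for (size_t i = 0; i < state->n_conn_states; i++)
		if (state->conn_states[i].connector == connector)
			return &state->conn_states[i];
	return NULL;
}

/* Takes the connector and the atomic state, like the renamed helper. */
static int mock_check_wb_connector_state(const struct mock_connector *connector,
					 struct mock_atomic_state *state)
{
	struct mock_connector_state *cs;

	assert(connector && state);
	cs = mock_get_new_connector_state(state, connector);
	if (!cs || !cs->has_job)
		return 0;	/* nothing to validate (assumed behaviour) */

	for (size_t i = 0; i < connector->n_formats; i++)
		if (connector->formats[i] == cs->fb_format)
			return 0;

	return -EINVAL;
}
```

The point of the sketch is that the caller no longer needs an encoder at
all: the connector plus the atomic state are enough to reach the
connector state being checked.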

Dmitry Baryshkov (2):
  drm/atomic-helper: rename drm_atomic_helper_check_wb_encoder_state
  drm/vkms: move wb's atomic_check from encoder to connector

 drivers/gpu/drm/drm_atomic_helper.c   | 16 +---
 drivers/gpu/drm/vkms/vkms_writeback.c | 25 +++--
 include/drm/drm_atomic_helper.h   |  5 ++---
 3 files changed, 26 insertions(+), 20 deletions(-)

-- 
2.39.2

Re: [Intel-gfx] [PATCH] [v2] drm/i915/display: Check GGTT to determine phys_base

2023-12-07 Thread Almahallawy, Khaled

Thank you for the patch. We noticed a breakage on the customer board with
the latest GOP + this patch.


Thank You
Khaled  

On Wed, 2023-12-06 at 18:46 +, Paz Zcharya wrote:
> There was an assumption that for iGPU there should be a 1:1 mapping
> of GGTT to physical address pointing to the framebuffer.
> This assumption is not strictly true effective generation 8 or newer.
> Fix that by checking GGTT to determine the phys address on gen8+.
> 
> The following algorithm for phys_base should be valid for all
> platforms:
> 1. Find pte
> 2. if(IS_DGFX(i915) && pte & GEN12_GGTT_PTE_LM) mem =
> i915->mm.regions[INTEL_REGION_LMEM_0] else mem =
> i915->mm.stolen_region
> 3. phys_base = (pte & I915_GTT_PAGE_MASK) - mem->region.start;
> 
> - On older platforms, stolen_region points to system memory, starting
> at 0
> - on DG2, it uses lmem region which starts at 0 as well
> - on MTL, stolen_region points to stolen-local which starts at
> 0x80
> 
> Changes from v1:
>   - Add an if statement for gen7-, where there is a 1:1 mapping
> 
> Signed-off-by: Paz Zcharya 
> ---
> 
>  .../drm/i915/display/intel_plane_initial.c| 64 +++
> 
>  1 file changed, 39 insertions(+), 25 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/display/intel_plane_initial.c
> b/drivers/gpu/drm/i915/display/intel_plane_initial.c
> index a55c09cbd0e4..7d9bb631b93b 100644
> --- a/drivers/gpu/drm/i915/display/intel_plane_initial.c
> +++ b/drivers/gpu/drm/i915/display/intel_plane_initial.c
> @@ -59,44 +59,58 @@ initial_plane_vma(struct drm_i915_private *i915,
>   return NULL;
>  
>   base = round_down(plane_config->base, I915_GTT_MIN_ALIGNMENT);
> - if (IS_DGFX(i915)) {
> +
> + if (GRAPHICS_VER(i915) < 8) {
> + /*
> +  * In gen7-, there is a 1:1 mapping
> +  * between GSM and physical address.
> +  */
> + phys_base = base;
> + mem = i915->mm.stolen_region;
> + } else {
> + /*
> +  * In gen8+, there is no 1:1 mapping between
> +  * GSM and physical address, so we need to
> +  * check GGTT to determine the physical address.
> +  */
>   gen8_pte_t __iomem *gte = to_gt(i915)->ggtt->gsm;
>   gen8_pte_t pte;
>  
>   gte += base / I915_GTT_PAGE_SIZE;
> -
>   pte = ioread64(gte);
> - if (!(pte & GEN12_GGTT_PTE_LM)) {
> - drm_err(&i915->drm,
> - "Initial plane programming missing PTE_LM bit\n");
> - return NULL;
> - }
> -
> - phys_base = pte & I915_GTT_PAGE_MASK;
> - mem = i915->mm.regions[INTEL_REGION_LMEM_0];
>  
> - /*
> -  * We don't currently expect this to ever be placed in the
> -  * stolen portion.
> -  */
> - if (phys_base >= resource_size(&mem->region)) {
> - drm_err(&i915->drm,
> - "Initial plane programming using invalid range, phys_base=%pa\n",
> - &phys_base);
> - return NULL;
> + if (IS_DGFX(i915)) {
> + if (!(pte & GEN12_GGTT_PTE_LM)) {
> + drm_err(&i915->drm,
> + "Initial plane programming missing PTE_LM bit\n");
> + return NULL;
> + }
> + mem = i915->mm.regions[INTEL_REGION_LMEM_0];
> + } else {
> + mem = i915->mm.stolen_region;
>   }
>  
> - drm_dbg(&i915->drm,
> - "Using phys_base=%pa, based on initial plane programming\n",
> - &phys_base);
> - } else {
> - phys_base = base;
> - mem = i915->mm.stolen_region;
> + phys_base = (pte & I915_GTT_PAGE_MASK) - mem->region.start;
>   }
>  
>   if (!mem)
>   return NULL;
>  
> + /*
> +  * We don't currently expect this to ever be placed in the
> +  * stolen portion.
> +  */
> + if (phys_base >= resource_size(&mem->region)) {
> + drm_err(&i915->drm,
> + "Initial plane programming using invalid range, phys_base=%pa\n",
> + &phys_base);
> + return NULL;
> + }
> +
> + drm_dbg(&i915->drm,
> + "Using phys_base=%pa, based on initial plane programming\n",
> + &phys_base);
> +
>   size = round_up(plane_config->base + plane_config->size,
>   mem->min_page_size);
>   size -= base;
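
For readers following the thread, the three-step phys_base algorithm from
the quoted commit message can be modelled in user space roughly as below.
This is a sketch under stated assumptions: the mock_* names, the
MOCK_PTE_LM bit position and the 4 KiB page mask are illustrative
stand-ins, not the i915 definitions. It also checks the region pointer
before dereferencing it and rejects addresses below the region start,
which are additions of this sketch rather than something the quoted patch
does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MOCK_PTE_LM        (1ull << 1)	/* assumed stand-in for GEN12_GGTT_PTE_LM */
#define MOCK_GTT_PAGE_MASK (~0xfffull)	/* assumed stand-in for I915_GTT_PAGE_MASK */

/* Hypothetical reduced memory region: physical start address and size. */
struct mock_region {
	uint64_t start;
	uint64_t size;
};

/*
 * Returns true and stores the region-relative offset in *phys_base on
 * success; returns false if the PTE or region cannot be used.
 */
static bool mock_phys_base(int graphics_ver, bool is_dgfx,
			   uint64_t base, uint64_t pte,
			   const struct mock_region *lmem,
			   const struct mock_region *stolen,
			   uint64_t *phys_base)
{
	const struct mock_region *mem;
	uint64_t addr;

	assert(phys_base != NULL);

	if (graphics_ver < 8) {
		/* gen7 and older: 1:1 mapping, the GGTT offset is the address */
		mem = stolen;
		addr = base;
	} else {
		/* gen8+: step 2, pick the region based on the PTE */
		if (is_dgfx) {
			if (!(pte & MOCK_PTE_LM))
				return false;
			mem = lmem;
		} else {
			mem = stolen;
		}
		if (!mem)
			return false;

		/* step 3: physical address from the PTE, relative to the region */
		addr = pte & MOCK_GTT_PAGE_MASK;
		if (addr < mem->start)
			return false;
		addr -= mem->start;
	}

	if (!mem || addr >= mem->size)
		return false;

	*phys_base = addr;
	return true;
}
```

With a stolen region that does not start at 0, the subtraction in step 3
is what turns the absolute address in the PTE into an offset that can be
compared against the region size.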

Re: [PATCH v3 1/2] drm/atomic-helper: rename drm_atomic_helper_check_wb_encoder_state

2023-12-07 Thread Dmitry Baryshkov

On Thu, 7 Dec 2023 at 12:10, Maxime Ripard  wrote:
>
> Hi,
>
> On Wed, Dec 06, 2023 at 01:14:54PM +0300, Dmitry Baryshkov wrote:
> > The drm_atomic_helper_check_wb_encoder_state() function doesn't use
> > encoder for anything other than getting the drm_device instance. The
> > function's description talks about checking the writeback connector
> > state, not the encoder state. Moreover, there is no such thing as an
> > encoder state, encoders generally do not have a state on their own.
> >
> > Rename the function to drm_atomic_helper_check_wb_connector_state()
> > and change arguments to drm_writeback_connector and drm_atomic_state.
> >
> > Signed-off-by: Dmitry Baryshkov 
> > ---
> >  drivers/gpu/drm/drm_atomic_helper.c   | 16 +---
> >  drivers/gpu/drm/vkms/vkms_writeback.c |  5 -
> >  include/drm/drm_atomic_helper.h   |  5 ++---
> >  3 files changed, 15 insertions(+), 11 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/drm_atomic_helper.c 
> > b/drivers/gpu/drm/drm_atomic_helper.c
> > index c3f677130def..c98a766ca3bd 100644
> > --- a/drivers/gpu/drm/drm_atomic_helper.c
> > +++ b/drivers/gpu/drm/drm_atomic_helper.c
> > @@ -795,9 +795,9 @@ drm_atomic_helper_check_modeset(struct drm_device *dev,
> >  EXPORT_SYMBOL(drm_atomic_helper_check_modeset);
> >
> >  /**
> > - * drm_atomic_helper_check_wb_encoder_state() - Check writeback encoder 
> > state
> > - * @encoder: encoder state to check
> > - * @conn_state: connector state to check
> > + * drm_atomic_helper_check_wb_connector_state() - Check writeback 
> > connector state
> > + * @connector: corresponding connector
> > + * @state: the driver state object
> >   *
> >   * Checks if the writeback connector state is valid, and returns an error 
> > if it
> >   * isn't.
> > @@ -806,9 +806,11 @@ EXPORT_SYMBOL(drm_atomic_helper_check_modeset);
> >   * Zero for success or -errno
> >   */
> >  int
> > -drm_atomic_helper_check_wb_encoder_state(struct drm_encoder *encoder,
> > -  struct drm_connector_state 
> > *conn_state)
> > +drm_atomic_helper_check_wb_connector_state(struct drm_connector *connector,
> > +struct drm_atomic_state *state)
> >  {
> > + struct drm_connector_state *conn_state =
> > + drm_atomic_get_new_connector_state(state, connector);
> >   struct drm_writeback_job *wb_job = conn_state->writeback_job;
> >   struct drm_property_blob *pixel_format_blob;
> >   struct drm_framebuffer *fb;
> > @@ -827,11 +829,11 @@ drm_atomic_helper_check_wb_encoder_state(struct 
> > drm_encoder *encoder,
> >   if (fb->format->format == formats[i])
> >   return 0;
> >
> > - drm_dbg_kms(encoder->dev, "Invalid pixel format %p4cc\n", &fb->format->format);
> > + drm_dbg_kms(connector->dev, "Invalid pixel format %p4cc\n", &fb->format->format);
> >
> >   return -EINVAL;
> >  }
> > -EXPORT_SYMBOL(drm_atomic_helper_check_wb_encoder_state);
> > +EXPORT_SYMBOL(drm_atomic_helper_check_wb_connector_state);
>
> Thanks for updating the prototype ...
>
> >  /**
> >   * drm_atomic_helper_check_plane_state() - Check plane state for validity
> > diff --git a/drivers/gpu/drm/vkms/vkms_writeback.c 
> > b/drivers/gpu/drm/vkms/vkms_writeback.c
> > index d7e63aa14663..23c4f7b61cb6 100644
> > --- a/drivers/gpu/drm/vkms/vkms_writeback.c
> > +++ b/drivers/gpu/drm/vkms/vkms_writeback.c
> > @@ -34,6 +34,9 @@ static int vkms_wb_encoder_atomic_check(struct 
> > drm_encoder *encoder,
> >   struct drm_crtc_state *crtc_state,
> >   struct drm_connector_state 
> > *conn_state)
> >  {
> > + struct drm_connector *connector = conn_state->connector;
> > + struct drm_writeback_connector *wb_conn =
> > + drm_connector_to_writeback(connector);
> >   struct drm_framebuffer *fb;
> >   const struct drm_display_mode *mode = &crtc_state->mode;
> >   int ret;
> > @@ -48,7 +51,7 @@ static int vkms_wb_encoder_atomic_check(struct 
> > drm_encoder *encoder,
> >   return -EINVAL;
> >   }
> >
> > - ret = drm_atomic_helper_check_wb_encoder_state(encoder, conn_state);
> > + ret = drm_atomic_helper_check_wb_connector_state(wb_conn, 
> > conn_state->state);
>
> ... but it looks like you forgot to update it here

Indeed, I fixed it in the second patch, but forgot to update the first one.

>
> Maxime



-- 
With best wishes
Dmitry

[net-next v1 12/16] net: add support for skbs with unreadable frags

2023-12-07 Thread Mina Almasry

For device memory TCP, we expect the skb headers to be available in host
memory for access, and we expect the skb frags to be in device memory
and inaccessible to the host. We expect there to be no mixing and
matching of device memory frags (inaccessible) with host memory frags
(accessible) in the same skb.

Add a skb->dmabuf flag which indicates whether the frags in this skb
are device memory frags or not.

__skb_fill_page_desc() now checks frags added to skbs for page_pool_iovs,
and marks the skb as skb->dmabuf accordingly.

Add checks throughout the network stack to avoid accessing the frags of
dmabuf skbs and avoid coalescing dmabuf skbs with non-dmabuf skbs.

Signed-off-by: Willem de Bruijn 
Signed-off-by: Kaiyuan Zhang 
Signed-off-by: Mina Almasry 


---

Changes in v1:
- Rename devmem -> dmabuf (David).
- Flip skb_frags_not_readable (Jakub).

---
 include/linux/skbuff.h | 14 +++-
 include/net/tcp.h  |  5 +--
 net/core/datagram.c|  6 
 net/core/gro.c |  5 ++-
 net/core/skbuff.c  | 77 --
 net/ipv4/tcp.c |  3 ++
 net/ipv4/tcp_input.c   | 13 +--
 net/ipv4/tcp_output.c  |  5 ++-
 net/packet/af_packet.c |  4 +--
 9 files changed, 112 insertions(+), 20 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 851f448d2181..61de32ab04ea 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -817,6 +817,8 @@ typedef unsigned char *sk_buff_data_t;
  * @csum_level: indicates the number of consecutive checksums found in
  * the packet minus one that have been verified as
  * CHECKSUM_UNNECESSARY (max 3)
+ * @dmabuf: indicates that all the fragments in this skb are backed by
+ * dmabuf.
  * @dst_pending_confirm: need to confirm neighbour
  * @decrypted: Decrypted SKB
  * @slow_gro: state present at GRO time, slower prepare step required
@@ -1003,7 +1005,7 @@ struct sk_buff {
 #if IS_ENABLED(CONFIG_IP_SCTP)
__u8csum_not_inet:1;
 #endif
-
+   __u8dmabuf:1;
 #if defined(CONFIG_NET_SCHED) || defined(CONFIG_NET_XGRESS)
__u16   tc_index;   /* traffic control index */
 #endif
@@ -1778,6 +1780,12 @@ static inline void skb_zcopy_downgrade_managed(struct 
sk_buff *skb)
__skb_zcopy_downgrade_managed(skb);
 }
 
+/* Return true if frags in this skb are readable by the host. */
+static inline bool skb_frags_readable(const struct sk_buff *skb)
+{
+   return !skb->dmabuf;
+}
+
 static inline void skb_mark_not_on_list(struct sk_buff *skb)
 {
skb->next = NULL;
@@ -2480,6 +2488,10 @@ static inline void __skb_fill_page_desc(struct sk_buff 
*skb, int i,
struct page *page, int off, int size)
 {
__skb_fill_page_desc_noacc(skb_shinfo(skb), i, page, off, size);
+   if (page_is_page_pool_iov(page)) {
+   skb->dmabuf = true;
+   return;
+   }
 
/* Propagate page pfmemalloc to the skb if we can. The problem is
 * that not all callers have unique ownership of the page but rely
diff --git a/include/net/tcp.h b/include/net/tcp.h
index 973555cb1d3f..0fbf198bdb55 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -1017,7 +1017,7 @@ static inline int tcp_skb_mss(const struct sk_buff *skb)
 
 static inline bool tcp_skb_can_collapse_to(const struct sk_buff *skb)
 {
-   return likely(!TCP_SKB_CB(skb)->eor);
+   return likely(!TCP_SKB_CB(skb)->eor && skb_frags_readable(skb));
 }
 
 static inline bool tcp_skb_can_collapse(const struct sk_buff *to,
@@ -1025,7 +1025,8 @@ static inline bool tcp_skb_can_collapse(const struct 
sk_buff *to,
 {
return likely(tcp_skb_can_collapse_to(to) &&
  mptcp_skb_can_collapse(to, from) &&
- skb_pure_zcopy_same(to, from));
+ skb_pure_zcopy_same(to, from) &&
+ skb_frags_readable(to) == skb_frags_readable(from));
 }
 
 /* Events passed to congestion control interface */
diff --git a/net/core/datagram.c b/net/core/datagram.c
index 103d46fa0eeb..f28472ddbaa4 100644
--- a/net/core/datagram.c
+++ b/net/core/datagram.c
@@ -426,6 +426,9 @@ static int __skb_datagram_iter(const struct sk_buff *skb, 
int offset,
return 0;
}
 
+   if (!skb_frags_readable(skb))
+   goto short_copy;
+
/* Copy paged appendix. Hmm... why does this look so complicated? */
for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
int end;
@@ -638,6 +641,9 @@ int __zerocopy_sg_from_iter(struct msghdr *msg, struct sock 
*sk,
if (msg && msg->msg_ubuf && msg->sg_from_iter)
return msg->sg_from_iter(sk, skb, from, length);
 
+   if (!skb_frags_readable(skb))
+   return -EFAULT;
+
frag = skb_shinfo(skb)->nr_frags;
 
while (length && iov_iter_count(from)) {
diff --git

[net-next v1 14/16] net: add SO_DEVMEM_DONTNEED setsockopt to release RX frags

2023-12-07 Thread Mina Almasry

Add an interface for the user to notify the kernel that it is done
reading the devmem dmabuf frags returned as cmsg. The kernel will
drop the reference on the frags to make them available for re-use.

Signed-off-by: Willem de Bruijn 
Signed-off-by: Kaiyuan Zhang 
Signed-off-by: Mina Almasry 

---

Changes in v1:
- devmemtoken -> dmabuf_token (David).
- Use napi_pp_put_page() for refcounting (Yunsheng).
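
A possible user-space caller, as a sketch only. The option value 97 is
the one proposed in this patch and is not in released headers, so it is
defined under #ifndef. struct token_range mirrors the layout of struct
dmabuf_token from this patch under a different (hypothetical) name, and
devmem_optlen_ok()/devmem_release() are illustrative helpers that do not
exist anywhere in the series.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/socket.h>

#ifndef SO_DEVMEM_DONTNEED
#define SO_DEVMEM_DONTNEED 97	/* value proposed by this patch (assumption) */
#endif

/* Same layout as struct dmabuf_token in the patch; hypothetical name. */
struct token_range {
	uint32_t token_start;	/* first frag token to release */
	uint32_t token_count;	/* number of consecutive tokens */
};

#define MAX_TOKEN_RANGES 128	/* size of the on-stack array in the patch */

/* Mirrors the kernel-side length check in sock_devmem_dontneed(). */
static int devmem_optlen_ok(size_t optlen)
{
	return optlen % sizeof(struct token_range) == 0 &&
	       optlen <= MAX_TOKEN_RANGES * sizeof(struct token_range);
}

/*
 * Hand frag tokens back to the kernel. Returns whatever setsockopt()
 * returns (per the patch, the number of frags released), or -1 with
 * errno set.
 */
static int devmem_release(int fd, const struct token_range *ranges, size_t n)
{
	size_t optlen = n * sizeof(*ranges);

	assert(ranges != NULL || n == 0);
	if (!devmem_optlen_ok(optlen)) {
		errno = EINVAL;
		return -1;
	}
	return setsockopt(fd, SOL_SOCKET, SO_DEVMEM_DONTNEED,
			  ranges, (socklen_t)optlen);
}
```

Batching several ranges into one call keeps the number of system calls
down; anything beyond 128 ranges has to be split across calls because of
the length check.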

---
 include/uapi/asm-generic/socket.h |  1 +
 include/uapi/linux/uio.h  |  4 
 net/core/sock.c   | 38 +++
 3 files changed, 43 insertions(+)

diff --git a/include/uapi/asm-generic/socket.h 
b/include/uapi/asm-generic/socket.h
index 25a2f5255f52..1acb77780f10 100644
--- a/include/uapi/asm-generic/socket.h
+++ b/include/uapi/asm-generic/socket.h
@@ -135,6 +135,7 @@
 #define SO_PASSPIDFD   76
 #define SO_PEERPIDFD   77
 
+#define SO_DEVMEM_DONTNEED 97
 #define SO_DEVMEM_LINEAR   98
 #define SCM_DEVMEM_LINEAR  SO_DEVMEM_LINEAR
 #define SO_DEVMEM_DMABUF   99
diff --git a/include/uapi/linux/uio.h b/include/uapi/linux/uio.h
index ad92e37699da..65f33178a601 100644
--- a/include/uapi/linux/uio.h
+++ b/include/uapi/linux/uio.h
@@ -30,6 +30,10 @@ struct dmabuf_cmsg {
__u32  dmabuf_id;   /* dmabuf id this frag belongs to. */
 };
 
+struct dmabuf_token {
+   __u32 token_start;
+   __u32 token_count;
+};
 /*
  * UIO_MAXIOV shall be at least 16 1003.1g (5.4.1.1)
  */
diff --git a/net/core/sock.c b/net/core/sock.c
index fef349dd72fa..521bdc4ff260 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -1051,6 +1051,41 @@ static int sock_reserve_memory(struct sock *sk, int 
bytes)
return 0;
 }
 
+static noinline_for_stack int
+sock_devmem_dontneed(struct sock *sk, sockptr_t optval, unsigned int optlen)
+{
+   struct dmabuf_token tokens[128];
+   unsigned int num_tokens, i, j;
+   int ret;
+
+   if (sk->sk_type != SOCK_STREAM || sk->sk_protocol != IPPROTO_TCP)
+   return -EBADF;
+
+   if (optlen % sizeof(struct dmabuf_token) || optlen > sizeof(tokens))
+   return -EINVAL;
+
+   num_tokens = optlen / sizeof(struct dmabuf_token);
+   if (copy_from_sockptr(tokens, optval, optlen))
+   return -EFAULT;
+
+   ret = 0;
+   for (i = 0; i < num_tokens; i++) {
+   for (j = 0; j < tokens[i].token_count; j++) {
+   struct page *page = xa_erase(&sk->sk_user_pages,
+tokens[i].token_start + j);
+
+   if (page) {
+   if (WARN_ON_ONCE(!napi_pp_put_page(page,
+  false)))
+   page_pool_page_put_many(page, 1);
+   ret++;
+   }
+   }
+   }
+
+   return ret;
+}
+
 void sockopt_lock_sock(struct sock *sk)
 {
/* When current->bpf_ctx is set, the setsockopt is called from
@@ -1538,6 +1573,9 @@ int sk_setsockopt(struct sock *sk, int level, int optname,
break;
}
 
+   case SO_DEVMEM_DONTNEED:
+   ret = sock_devmem_dontneed(sk, optval, optlen);
+   break;
default:
ret = -ENOPROTOOPT;
break;
-- 
2.43.0.472.g3155946c3a-goog

[net-next v1 16/16] selftests: add ncdevmem, netcat for devmem TCP

2023-12-07 Thread Mina Almasry

ncdevmem is a devmem TCP netcat. It works similarly to netcat, but it
sends and receives data using the devmem TCP APIs. It uses udmabuf as
the dmabuf provider. It is compatible with a regular netcat running on
a peer, or a ncdevmem running on a peer.

In addition to normal netcat support, ncdevmem has a validation mode,
where it sends a specific pattern and validates this pattern on the
receiver side to ensure data integrity.

Suggested-by: Stanislav Fomichev 
Signed-off-by: Mina Almasry 

---

Changes in v1:
- Many more general cleanups (Willem).
- Removed driver reset (Jakub).
- Removed hardcoded if index (Paolo).

RFC v2:
- General cleanups (Willem).
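
A minimal sketch of the validation mode described above, for
illustration. It assumes the pattern is a byte counter that starts at 1
and wraps to 0 after period - 1, which matches the client command in the
usage comment (bytes 1..6 followed by 0) when the period is 7; the wrap
rule is an assumption. pattern_fill() and pattern_check() are
hypothetical names, not functions from ncdevmem.c.

```c
#include <assert.h>
#include <stddef.h>

/* Fill buf with the repeating pattern, continuing from *seed. */
static void pattern_fill(unsigned char *buf, size_t size,
			 unsigned char period, unsigned char *seed)
{
	size_t i;

	assert(buf != NULL && seed != NULL && period > 0);
	for (i = 0; i < size; i++) {
		buf[i] = *seed;
		*seed = (unsigned char)((*seed + 1) % period);
	}
}

/*
 * Count bytes that differ from the expected pattern. *seed carries the
 * expected value across calls, so a stream can be checked chunk by chunk.
 */
static size_t pattern_check(const unsigned char *buf, size_t size,
			    unsigned char period, unsigned char *seed)
{
	size_t errors = 0;
	size_t i;

	assert(buf != NULL && seed != NULL && period > 0);
	for (i = 0; i < size; i++) {
		if (buf[i] != *seed)
			errors++;
		*seed = (unsigned char)((*seed + 1) % period);
	}
	return errors;
}
```

Keeping the seed outside the function plays the same role as the static
seed in validate_buffer(): the receiver can validate each received frag
as it arrives without knowing where it falls in the stream.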

---
 tools/testing/selftests/net/.gitignore |   1 +
 tools/testing/selftests/net/Makefile   |   5 +
 tools/testing/selftests/net/ncdevmem.c | 489 +
 3 files changed, 495 insertions(+)
 create mode 100644 tools/testing/selftests/net/ncdevmem.c

diff --git a/tools/testing/selftests/net/.gitignore 
b/tools/testing/selftests/net/.gitignore
index 2f9d378edec3..b644dbae58b7 100644
--- a/tools/testing/selftests/net/.gitignore
+++ b/tools/testing/selftests/net/.gitignore
@@ -17,6 +17,7 @@ ipv6_flowlabel
 ipv6_flowlabel_mgr
 log.txt
 msg_zerocopy
+ncdevmem
 nettest
 psock_fanout
 psock_snd
diff --git a/tools/testing/selftests/net/Makefile 
b/tools/testing/selftests/net/Makefile
index 14bd68da7466..d7a66563ffe7 100644
--- a/tools/testing/selftests/net/Makefile
+++ b/tools/testing/selftests/net/Makefile
@@ -5,6 +5,10 @@ CFLAGS =  -Wall -Wl,--no-as-needed -O2 -g
 CFLAGS += -I../../../../usr/include/ $(KHDR_INCLUDES)
 # Additional include paths needed by kselftest.h
 CFLAGS += -I../
+CFLAGS += -I../../../net/ynl/generated/
+CFLAGS += -I../../../net/ynl/lib/
+
+LDLIBS += ../../../net/ynl/lib/ynl.a ../../../net/ynl/generated/protos.a
 
 TEST_PROGS := run_netsocktests run_afpackettests test_bpf.sh netdevice.sh \
  rtnetlink.sh xfrm_policy.sh test_blackhole_dev.sh
@@ -92,6 +96,7 @@ TEST_PROGS += test_vxlan_nolocalbypass.sh
 TEST_PROGS += test_bridge_backup_port.sh
 TEST_PROGS += fdb_flush.sh
 TEST_PROGS += fq_band_pktlimit.sh
+TEST_GEN_FILES += ncdevmem
 
 TEST_FILES := settings
 
diff --git a/tools/testing/selftests/net/ncdevmem.c 
b/tools/testing/selftests/net/ncdevmem.c
new file mode 100644
index ..7fbeee02b9a2
--- /dev/null
+++ b/tools/testing/selftests/net/ncdevmem.c
@@ -0,0 +1,489 @@
+// SPDX-License-Identifier: GPL-2.0
+#define _GNU_SOURCE
+#define __EXPORTED_HEADERS__
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#define __iovec_defined
+#include 
+#include 
+#include 
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "netdev-user.h"
+#include 
+
+#define PAGE_SHIFT 12
+#define TEST_PREFIX "ncdevmem"
+#define NUM_PAGES 16000
+
+#ifndef MSG_SOCK_DEVMEM
+#define MSG_SOCK_DEVMEM 0x200
+#endif
+
+/*
+ * tcpdevmem netcat. Works similarly to netcat but does device memory TCP
+ * instead of regular TCP. Uses udmabuf to mock a dmabuf provider.
+ *
+ * Usage:
+ *
+ * On server:
+ * ncdevmem -s  -c  -f eth1 -d 3 -n :06:00.0 -l \
+ * -p 5201 -v 7
+ *
+ * On client:
+ * yes $(echo -e \\x01\\x02\\x03\\x04\\x05\\x06) | \
+ * tr \\n \\0 | \
+ * head -c 5G | \
+ * nc  5201 -p 5201
+ *
+ * Note this is compatible with regular netcat. i.e. the sender or receiver can
+ * be replaced with regular netcat to test the RX or TX path in isolation.
+ */
+
+static char *server_ip = "192.168.1.4";
+static char *client_ip = "192.168.1.2";
+static char *port = "5201";
+static size_t do_validation;
+static int queue_num = 15;
+static char *ifname = "eth1";
+static unsigned int ifindex = 3;
+static char *nic_pci_addr = ":06:00.0";
+static unsigned int iterations;
+static unsigned int dmabuf_id;
+
+void print_bytes(void *ptr, size_t size)
+{
+   unsigned char *p = ptr;
+   int i;
+
+   for (i = 0; i < size; i++)
+   printf("%02hhX ", p[i]);
+   printf("\n");
+}
+
+void print_nonzero_bytes(void *ptr, size_t size)
+{
+   unsigned char *p = ptr;
+   unsigned int i;
+
+   for (i = 0; i < size; i++)
+   putchar(p[i]);
+   printf("\n");
+}
+
+void validate_buffer(void *line, size_t size)
+{
+   static unsigned char seed = 1;
+   unsigned char *ptr = line;
+   int errors = 0;
+   size_t i;
+
+   for (i = 0; i < size; i++) {
+   if (ptr[i] != seed) {
+   fprintf(stderr,
+   "Failed validation: expected=%u, actual=%u, index=%lu\n",
+   seed, ptr[i], i);
+   errors++;
+   if (errors > 20)
+   error(1, 0, "validation failed.");
+   }
+   seed++;
+   if (seed

[net-next v1 13/16] tcp: RX path for devmem TCP

2023-12-07 Thread Mina Almasry

In tcp_recvmsg_locked(), detect if the skb being received by the user
is a devmem skb. In this case - if the user provided the MSG_SOCK_DEVMEM
flag - pass it to tcp_recvmsg_devmem() for custom handling.

tcp_recvmsg_devmem() copies any data in the skb header to the linear
buffer, and returns a cmsg to the user indicating the number of bytes
returned in the linear buffer.

tcp_recvmsg_devmem() then loops over the inaccessible devmem skb frags,
and returns to the user a cmsg_devmem (struct dmabuf_cmsg) indicating the
location of the data in the dmabuf device memory. It contains this information:

1. the offset into the dmabuf where the payload starts. 'frag_offset'.
2. the size of the frag. 'frag_size'.
3. an opaque token 'frag_token' to return to the kernel when the buffer
is to be released.

The pages awaiting freeing are stored in the newly added
sk->sk_user_pages, and each page passed to userspace is get_page()'d.
This reference is dropped once the userspace indicates that it is
done reading this page.  All pages are released when the socket is
destroyed.

Signed-off-by: Willem de Bruijn 
Signed-off-by: Kaiyuan Zhang 
Signed-off-by: Mina Almasry 

---

Changes in v1:
- Added dmabuf_id to dmabuf_cmsg (David/Stan).
- Devmem -> dmabuf (David).
- Change tcp_recvmsg_dmabuf() check to skb->dmabuf (Paolo).
- Use __skb_frag_ref() & napi_pp_put_page() for refcounting (Yunsheng).

RFC v3:
- Fixed issue with put_cmsg() failing silently.
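
For illustration, this is roughly how a receiver could pull the frag
descriptors out of the control messages after recvmsg(). It is a sketch
under assumptions: the cmsg type value 99 is the one proposed in this
series and is defined under #ifndef, struct dmabuf_cmsg_sketch mirrors
the fields of struct dmabuf_cmsg under a hypothetical name, and
collect_dmabuf_frags() is not a function from the series.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

#ifndef SCM_DEVMEM_DMABUF
#define SCM_DEVMEM_DMABUF 99	/* value proposed in this series (assumption) */
#endif

/* Same fields as struct dmabuf_cmsg in the series; hypothetical name. */
struct dmabuf_cmsg_sketch {
	uint64_t frag_offset;	/* offset into the dmabuf where the frag starts */
	uint32_t frag_size;	/* size of the frag */
	uint32_t frag_token;	/* token to hand back when done with the frag */
	uint32_t dmabuf_id;	/* dmabuf this frag belongs to */
};

/* Copy up to max dmabuf frag descriptors out of msg; returns the count. */
static size_t collect_dmabuf_frags(struct msghdr *msg,
				   struct dmabuf_cmsg_sketch *out, size_t max)
{
	struct cmsghdr *cm;
	size_t n = 0;

	assert(msg != NULL && (out != NULL || max == 0));
	for (cm = CMSG_FIRSTHDR(msg); cm && n < max; cm = CMSG_NXTHDR(msg, cm)) {
		if (cm->cmsg_level != SOL_SOCKET ||
		    cm->cmsg_type != SCM_DEVMEM_DMABUF)
			continue;
		if (cm->cmsg_len < CMSG_LEN(sizeof(*out)))
			continue;	/* too short to hold a descriptor */
		/* memcpy avoids alignment assumptions about CMSG_DATA() */
		memcpy(&out[n++], CMSG_DATA(cm), sizeof(*out));
	}
	return n;
}
```

Each collected frag_token is what the application later returns to the
kernel once it has finished reading the corresponding region of the
dmabuf.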

---
 include/linux/socket.h|   1 +
 include/net/page_pool/helpers.h   |   9 ++
 include/net/sock.h|   2 +
 include/uapi/asm-generic/socket.h |   5 +
 include/uapi/linux/uio.h  |  10 ++
 net/ipv4/tcp.c| 190 +-
 net/ipv4/tcp_ipv4.c   |   8 ++
 7 files changed, 220 insertions(+), 5 deletions(-)

diff --git a/include/linux/socket.h b/include/linux/socket.h
index cfcb7e2c3813..fe2b9e2081bb 100644
--- a/include/linux/socket.h
+++ b/include/linux/socket.h
@@ -326,6 +326,7 @@ struct ucred {
  * plain text and require encryption
  */
 
+#define MSG_SOCK_DEVMEM 0x200  /* Receive devmem skbs as cmsg */
 #define MSG_ZEROCOPY   0x400   /* Use user data in kernel path */
 #define MSG_SPLICE_PAGES 0x800 /* Splice the pages from the iterator 
in sendmsg() */
 #define MSG_FASTOPEN   0x2000  /* Send data in TCP SYN */
diff --git a/include/net/page_pool/helpers.h b/include/net/page_pool/helpers.h
index 2d4e0a2c5620..e7e2e89d3663 100644
--- a/include/net/page_pool/helpers.h
+++ b/include/net/page_pool/helpers.h
@@ -108,6 +108,15 @@ page_pool_iov_dma_addr(const struct page_pool_iov *ppiov)
   ((dma_addr_t)page_pool_iov_idx(ppiov) << PAGE_SHIFT);
 }
 
+static inline unsigned long
+page_pool_iov_virtual_addr(const struct page_pool_iov *ppiov)
+{
+   struct dmabuf_genpool_chunk_owner *owner = page_pool_iov_owner(ppiov);
+
+   return owner->base_virtual +
+  ((unsigned long)page_pool_iov_idx(ppiov) << PAGE_SHIFT);
+}
+
 static inline struct netdev_dmabuf_binding *
 page_pool_iov_binding(const struct page_pool_iov *ppiov)
 {
diff --git a/include/net/sock.h b/include/net/sock.h
index 1d6931caf0c3..01029c855c1b 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -353,6 +353,7 @@ struct sk_filter;
   *@sk_txtime_unused: unused txtime flags
   *@ns_tracker: tracker for netns reference
   *@sk_bind2_node: bind node in the bhash2 table
+  *@sk_user_pages: xarray of pages the user is holding a reference on.
   */
 struct sock {
/*
@@ -545,6 +546,7 @@ struct sock {
struct rcu_head sk_rcu;
netns_tracker   ns_tracker;
struct hlist_node   sk_bind2_node;
+   struct xarray   sk_user_pages;
 };
 
 enum sk_pacing {
diff --git a/include/uapi/asm-generic/socket.h 
b/include/uapi/asm-generic/socket.h
index 8ce8a39a1e5f..25a2f5255f52 100644
--- a/include/uapi/asm-generic/socket.h
+++ b/include/uapi/asm-generic/socket.h
@@ -135,6 +135,11 @@
 #define SO_PASSPIDFD   76
 #define SO_PEERPIDFD   77
 
+#define SO_DEVMEM_LINEAR   98
+#define SCM_DEVMEM_LINEAR  SO_DEVMEM_LINEAR
+#define SO_DEVMEM_DMABUF   99
+#define SCM_DEVMEM_DMABUF  SO_DEVMEM_DMABUF
+
 #if !defined(__KERNEL__)
 
 #if __BITS_PER_LONG == 64 || (defined(__x86_64__) && defined(__ILP32__))
diff --git a/include/uapi/linux/uio.h b/include/uapi/linux/uio.h
index 059b1a9147f4..ad92e37699da 100644
--- a/include/uapi/linux/uio.h
+++ b/include/uapi/linux/uio.h
@@ -20,6 +20,16 @@ struct iovec
__kernel_size_t iov_len; /* Must be size_t (1003.1g) */
 };
 
+struct dmabuf_cmsg {
+   __u64 frag_offset;  /* offset into the dmabuf where the frag starts.
+*/
+   __u32 frag_size;/* size of the frag. */
+   __u32 frag_token;   /* token representing this frag for
+

[net-next v1 15/16] net: add devmem TCP documentation

2023-12-07 Thread Mina Almasry

Signed-off-by: Mina Almasry 
---
 Documentation/networking/devmem.rst | 270 
 1 file changed, 270 insertions(+)
 create mode 100644 Documentation/networking/devmem.rst

diff --git a/Documentation/networking/devmem.rst 
b/Documentation/networking/devmem.rst
new file mode 100644
index ..ed0d9c88b708
--- /dev/null
+++ b/Documentation/networking/devmem.rst
@@ -0,0 +1,270 @@
+
+=================
+Device Memory TCP
+=================
+
+
+Intro
+=====
+
+Device memory TCP (devmem TCP) enables receiving data directly into device
+memory (dmabuf). The feature is currently implemented for TCP sockets.
+
+
+Opportunity
+-----------
+
+A large number of data transfers have device memory as the source and/or
+destination. Accelerators have drastically increased the volume of such
+transfers. Some examples include:
+
+- Distributed training, where ML accelerators, such as GPUs on different hosts,
+  exchange data among them.
+
+- Distributed raw block storage applications transfer large amounts of data with
+  remote SSDs; much of this data does not require host processing.
+
+Today, the majority of device-to-device data transfers over the network are
+implemented as the following low-level operations: Device-to-Host copy,
+Host-to-Host network transfer, and Host-to-Device copy.
+
+The implementation is suboptimal, especially for bulk data transfers, and can
+put significant strain on system resources such as host memory bandwidth and
+PCIe bandwidth.
+
+Devmem TCP optimizes this use case by implementing socket APIs that enable
+the user to receive incoming network packets directly into device memory.
+
+Packet payloads go directly from the NIC to device memory.
+
+Packet headers go to host memory and are processed by the TCP/IP stack
+normally. The NIC must support header split to achieve this.
+
+Advantages:
+
+- Alleviate host memory bandwidth pressure, compared to existing
+  network-transfer + device-copy semantics.
+
+- Alleviate PCIe bandwidth pressure, by limiting data transfer to the lowest
+  level of the PCIe tree, compared to the traditional path which sends data
+  through the root complex.
+
+
+More Info
+---------
+
+  slides, video
+https://netdevconf.org/0x17/sessions/talk/device-memory-tcp.html
+
+  patchset
+[RFC PATCH v3 00/12] Device Memory TCP
+https://lore.kernel.org/lkml/20231106024413.2801438-1-almasrym...@google.com/T/
+
+
+Interface
+=========
+
+Example
+-------
+
+tools/testing/selftests/net/ncdevmem.c:do_server shows an example of setting up
+the RX path of this API.
+
+NIC Setup
+---------
+
+Header split, flow steering, & RSS are required features for devmem TCP.
+
+Header split is used to split incoming packets into a header buffer in host
+memory, and a payload buffer in device memory.
+
+Flow steering & RSS are used to ensure that only flows targeting devmem land on
+an RX queue bound to devmem.
+
+Enable header split & flow steering:
+
+::
+
+   # enable header split (assuming priv-flag)
+   ethtool --set-priv-flags eth1 enable-header-split on
+
+   # enable flow steering
+   ethtool -K eth1 ntuple on
+
+Configure RSS to steer all traffic away from the target RX queue (queue 15 in
+this example):
+
+::
+
+   ethtool --set-rxfh-indir eth1 equal 15
+
+
+The user must bind a dmabuf to any number of RX queues on a given NIC using
+netlink API:
+
+::
+
+   /* Bind dmabuf to NIC RX queue 15 */
+   struct netdev_queue *queues;
+   queues = malloc(sizeof(*queues) * 1);
+
+   queues[0]._present.type = 1;
+   queues[0]._present.idx = 1;
+   queues[0].type = NETDEV_RX_QUEUE_TYPE_RX;
+   queues[0].idx = 15;
+
+   *ys = ynl_sock_create(&ynl_netdev_family, &yerr);
+
+   req = netdev_bind_rx_req_alloc();
+   netdev_bind_rx_req_set_ifindex(req, 1 /* ifindex */);
+   netdev_bind_rx_req_set_dmabuf_fd(req, dmabuf_fd);
+   __netdev_bind_rx_req_set_queues(req, queues, n_queue_index);
+
+   rsp = netdev_bind_rx(*ys, req);
+
+   dmabuf_id = rsp->dmabuf_id;
+
+
+The netlink API returns a dmabuf_id: a unique ID that refers to this dmabuf
+that has been bound.
+
+Socket Setup
+------------
+
+The socket must be flow steered to the dmabuf-bound RX queue:
+
+::
+
+   ethtool -N eth1 flow-type tcp4 ... queue 15
+
+
+Receiving data
+--------------
+
+The user application must signal to the kernel that it is capable of receiving
+devmem data by passing the MSG_SOCK_DEVMEM flag to recvmsg:
+
+::
+
+   ret = recvmsg(fd, &msg, MSG_SOCK_DEVMEM);
+
+Applications that do not specify the MSG_SOCK_DEVMEM flag will receive an EFAULT
+on devmem data.
+
+Devmem data is received directly into the dmabuf bound to the NIC in 'NIC
+Setup', and the kernel signals such to the user via the SCM_DEVMEM_* cmsgs:
+
+::
+
+   for (cm = CMSG_FIRSTHDR(&msg); cm; cm = CMSG_NXTHDR(&msg, cm)) {
+   if (cm->cmsg_level != SOL_SOCKET ||
+   (cm->cmsg_type != SCM_DEVMEM_DMABUF &&

[net-next v1 09/16] page_pool: device memory support

2023-12-07 Thread Mina Almasry

Overload the LSB of struct page* to indicate that it's a page_pool_iov.

Refactor mm calls on struct page* into helpers, and add page_pool_iov
handling to those helpers. Modify callers of these mm APIs to call
these helpers instead.

In areas where struct page* is dereferenced, add a check for special
handling of page_pool_iov.

Signed-off-by: Mina Almasry 

---

v1:
- Disable fragmentation support for iov properly.
- fix napi_pp_put_page() path (Yunsheng).
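
The LSB overloading described above can be illustrated with a small
user-space model. Everything here is hypothetical: the mock_* types, the
MOCK_PP_IOV tag value and the helper names are stand-ins chosen for the
sketch, which only shows the general pointer-tagging technique and how a
helper can dispatch on the tag before dereferencing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MOCK_PP_IOV ((uintptr_t)0x1)	/* assumed tag bit in the pointer's LSB */

/* Hypothetical stand-ins for struct page and struct page_pool_iov. */
struct mock_page { int refcount; };
struct mock_iov  { int refcount; };

/* Disguise an iov as a page pointer by setting the low bit. */
static struct mock_page *mock_iov_to_page(struct mock_iov *iov)
{
	/* Only works because the object is at least 2-byte aligned. */
	assert(((uintptr_t)iov & MOCK_PP_IOV) == 0);
	return (struct mock_page *)((uintptr_t)iov | MOCK_PP_IOV);
}

static bool mock_page_is_iov(const struct mock_page *page)
{
	return ((uintptr_t)page & MOCK_PP_IOV) != 0;
}

/* Strip the tag again; returns NULL for a genuine page pointer. */
static struct mock_iov *mock_page_to_iov(struct mock_page *page)
{
	if (mock_page_is_iov(page))
		return (struct mock_iov *)((uintptr_t)page & ~MOCK_PP_IOV);
	return NULL;
}

/* Helper in the style of page_pool_page_ref_count(): dispatch on the tag. */
static int mock_ref_count(struct mock_page *page)
{
	if (mock_page_is_iov(page))
		return mock_page_to_iov(page)->refcount;
	return page->refcount;
}
```

A tagged pointer must never be dereferenced as a page, which is why every
access in the sketch goes through a helper that checks the tag first.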

---
 include/net/page_pool/helpers.h | 78 -
 net/core/page_pool.c| 67 
 net/core/skbuff.c   | 28 +++-
 3 files changed, 141 insertions(+), 32 deletions(-)

diff --git a/include/net/page_pool/helpers.h b/include/net/page_pool/helpers.h
index 00197f14aa87..2d4e0a2c5620 100644
--- a/include/net/page_pool/helpers.h
+++ b/include/net/page_pool/helpers.h
@@ -154,6 +154,64 @@ static inline struct page_pool_iov 
*page_to_page_pool_iov(struct page *page)
return NULL;
 }
 
+static inline int page_pool_page_ref_count(struct page *page)
+{
+   if (page_is_page_pool_iov(page))
+   return page_pool_iov_refcount(page_to_page_pool_iov(page));
+
+   return page_ref_count(page);
+}
+
+static inline void page_pool_page_get_many(struct page *page,
+  unsigned int count)
+{
+   if (page_is_page_pool_iov(page))
+   return page_pool_iov_get_many(page_to_page_pool_iov(page),
+ count);
+
+   return page_ref_add(page, count);
+}
+
+static inline void page_pool_page_put_many(struct page *page,
+  unsigned int count)
+{
+   if (page_is_page_pool_iov(page))
+   return page_pool_iov_put_many(page_to_page_pool_iov(page),
+ count);
+
+   if (count > 1)
+   page_ref_sub(page, count - 1);
+
+   put_page(page);
+}
+
+static inline bool page_pool_page_is_pfmemalloc(struct page *page)
+{
+   if (page_is_page_pool_iov(page))
+   return false;
+
+   return page_is_pfmemalloc(page);
+}
+
+static inline bool page_pool_page_is_pref_nid(struct page *page, int pref_nid)
+{
+   /* Assume page_pool_iov are on the preferred node without actually
+* checking...
+*
+* This check is only used to check for recycling memory in the page
+* pool's fast paths. Currently the only implementation of page_pool_iov
+* is dmabuf device memory. It's a deliberate decision by the user to
+* bind a certain dmabuf to a certain netdev, and the netdev rx queue
+* would not be able to reallocate memory from another dmabuf that
+* exists on the preferred node, so, this check doesn't make much sense
+* in this case. Assume all page_pool_iovs can be recycled for now.
+*/
+   if (page_is_page_pool_iov(page))
+   return true;
+
+   return page_to_nid(page) == pref_nid;
+}
+
 /**
  * page_pool_dev_alloc_pages() - allocate a page.
  * @pool:  pool from which to allocate
@@ -304,6 +362,10 @@ static inline long page_pool_defrag_page(struct page *page, long nr)
 {
long ret;
 
+   /* fragmentation support hasn't been added to ppiov yet */
+   if (WARN_ON_ONCE(page_is_page_pool_iov(page)))
+   return 0;
+
/* If nr == pp_frag_count then we have cleared all remaining
 * references to the page:
 * 1. 'n == 1': no need to actually overwrite it.
@@ -347,7 +409,8 @@ static inline long page_pool_defrag_page(struct page *page, long nr)
 static inline bool page_pool_is_last_frag(struct page *page)
 {
/* If page_pool_defrag_page() returns 0, we were the last user */
-   return page_pool_defrag_page(page, 1) == 0;
+   return page_is_page_pool_iov(page) ||
+  page_pool_defrag_page(page, 1) == 0;
 }
 
 /**
@@ -434,7 +497,12 @@ static inline void page_pool_free_va(struct page_pool *pool, void *va,
  */
 static inline dma_addr_t page_pool_get_dma_addr(struct page *page)
 {
-   dma_addr_t ret = page->dma_addr;
+   dma_addr_t ret;
+
+   if (page_is_page_pool_iov(page))
+   return page_pool_iov_dma_addr(page_to_page_pool_iov(page));
+
+   ret = page->dma_addr;
 
if (PAGE_POOL_32BIT_ARCH_WITH_64BIT_DMA)
ret <<= PAGE_SHIFT;
@@ -444,6 +512,12 @@ static inline dma_addr_t page_pool_get_dma_addr(struct page *page)
 
 static inline bool page_pool_set_dma_addr(struct page *page, dma_addr_t addr)
 {
+   /* page_pool_iovs are mapped and their dma-addr can't be modified. */
+   if (page_is_page_pool_iov(page)) {
+   DEBUG_NET_WARN_ON_ONCE(true);
+   return false;
+   }
+
if (PAGE_POOL_32BIT_ARCH_WITH_64BIT_DMA) {
page->dma_addr = addr >> PAGE_SHIFT;
 
diff --git a/net/core/page_pool.c

[net-next v1 11/16] net: support non paged skb frags

2023-12-07 Thread Mina Almasry

Make skb_frag_page() fail in the case where the frag is not backed
by a page, and fix its relevant callers to handle this case.

Correctly handle skb_frag refcounting in the page_pool_iovs case.

Signed-off-by: Mina Almasry 


---

Changes in v1:
- Fix illegal_highdma() (Yunsheng).
- Rework napi_pp_put_page() slightly to reduce code churn (Willem).

---
 include/linux/skbuff.h | 42 +++---
 net/core/dev.c |  3 ++-
 net/core/gro.c |  2 +-
 net/core/skbuff.c  |  3 +++
 net/ipv4/tcp.c |  3 +++
 5 files changed, 44 insertions(+), 9 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index b370eb8d70f7..851f448d2181 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -37,6 +37,8 @@
 #endif
 #include 
 #include 
+#include 
+#include 
 
 /**
  * DOC: skb checksums
@@ -3414,15 +3416,38 @@ static inline void skb_frag_off_copy(skb_frag_t *fragto,
fragto->bv_offset = fragfrom->bv_offset;
 }
 
+/* Returns true if the skb_frag contains a page_pool_iov. */
+static inline bool skb_frag_is_page_pool_iov(const skb_frag_t *frag)
+{
+   return page_is_page_pool_iov(frag->bv_page);
+}
+
 /**
  * skb_frag_page - retrieve the page referred to by a paged fragment
  * @frag: the paged fragment
  *
- * Returns the &struct page associated with @frag.
+ * Returns the &struct page associated with @frag. Returns NULL if this frag
+ * has no associated page.
  */
 static inline struct page *skb_frag_page(const skb_frag_t *frag)
 {
-   return frag->bv_page;
+   if (!page_is_page_pool_iov(frag->bv_page))
+   return frag->bv_page;
+
+   return NULL;
+}
+
+/**
+ * skb_frag_page_pool_iov - retrieve the page_pool_iov referred to by fragment
+ * @frag: the fragment
+ *
+ * Returns the &struct page_pool_iov associated with @frag. Returns NULL if this
+ * frag has no associated page_pool_iov.
+ */
+static inline struct page_pool_iov *
+skb_frag_page_pool_iov(const skb_frag_t *frag)
+{
+   return page_to_page_pool_iov(frag->bv_page);
 }
 
 /**
@@ -3433,7 +3458,7 @@ static inline struct page *skb_frag_page(const skb_frag_t *frag)
  */
 static inline void __skb_frag_ref(skb_frag_t *frag)
 {
-   get_page(skb_frag_page(frag));
+   page_pool_page_get_many(frag->bv_page, 1);
 }
 
 /**
@@ -3453,13 +3478,13 @@ bool napi_pp_put_page(struct page *page, bool napi_safe);
 static inline void
 napi_frag_unref(skb_frag_t *frag, bool recycle, bool napi_safe)
 {
-   struct page *page = skb_frag_page(frag);
-
 #ifdef CONFIG_PAGE_POOL
-   if (recycle && napi_pp_put_page(page, napi_safe))
+   if (recycle && napi_pp_put_page(frag->bv_page, napi_safe))
return;
+   page_pool_page_put_many(frag->bv_page, 1);
+#else
+   put_page(skb_frag_page(frag));
 #endif
-   put_page(page);
 }
 
 /**
@@ -3499,6 +3524,9 @@ static inline void skb_frag_unref(struct sk_buff *skb, int f)
  */
 static inline void *skb_frag_address(const skb_frag_t *frag)
 {
+   if (!skb_frag_page(frag))
+   return NULL;
+
return page_address(skb_frag_page(frag)) + skb_frag_off(frag);
 }
 
diff --git a/net/core/dev.c b/net/core/dev.c
index 30667e4c3b95..1ae9257df441 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -3709,8 +3709,9 @@ static int illegal_highdma(struct net_device *dev, struct sk_buff *skb)
if (!(dev->features & NETIF_F_HIGHDMA)) {
for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
skb_frag_t *frag = &skb_shinfo(skb)->frags[i];
+   struct page *page = skb_frag_page(frag);
 
-   if (PageHighMem(skb_frag_page(frag)))
+   if (page && PageHighMem(page))
return 1;
}
}
diff --git a/net/core/gro.c b/net/core/gro.c
index 0759277dc14e..42d7f6755f32 100644
--- a/net/core/gro.c
+++ b/net/core/gro.c
@@ -376,7 +376,7 @@ static inline void skb_gro_reset_offset(struct sk_buff *skb, u32 nhoff)
NAPI_GRO_CB(skb)->frag0 = NULL;
NAPI_GRO_CB(skb)->frag0_len = 0;
 
-   if (!skb_headlen(skb) && pinfo->nr_frags &&
+   if (!skb_headlen(skb) && pinfo->nr_frags && skb_frag_page(frag0) &&
!PageHighMem(skb_frag_page(frag0)) &&
(!NET_IP_ALIGN || !((skb_frag_off(frag0) + nhoff) & 3))) {
NAPI_GRO_CB(skb)->frag0 = skb_frag_address(frag0);
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 07f802f1adf1..2ce64f57a0f6 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -2999,6 +2999,9 @@ static bool __skb_splice_bits(struct sk_buff *skb, struct pipe_inode_info *pipe,
for (seg = 0; seg < skb_shinfo(skb)->nr_frags; seg++) {
const skb_frag_t *f = &skb_shinfo(skb)->frags[seg];
 
+   if (WARN_ON_ONCE(!skb_frag_page(f)))
+   return false;
+
if (__splice_segment(skb_frag_page(f),
 skb_frag_off(f),

[net-next v1 10/16] page_pool: don't release iov on elevated refcount

2023-12-07 Thread Mina Almasry

Currently the page_pool behavior is that a page is considered for
recycling only once, the first time __page_pool_put_page() is called on
it.

This works because in practice the net stack only holds 1 reference to
the skb frags. In that case, the page_pool recycling works as expected,
as the skb frags will have 1 reference on the pages from the net stack
when __page_pool_put_page() is called (if the driver is not holding
extra references for recycling), and so the page will be recycled.

However, this is not compatible with devmem TCP. For devmem TCP, the net
stack holds 2 references for each frag, 1 reference is part of the SKB,
and the second reference is for the user holding the frag until they
call SO_DEVMEM_DONTNEED. This causes a bug in the page_pool recycling
where, when the skb is freed, the reference count goes from 2->1, the
page_pool sees a pending reference, releases the page, and so no devmem
iovs get recycled.

To fix this, don't release iovs on elevated refcount.
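
To make the 2->1 case concrete, here is a small userspace model of the put path after this change. It is a sketch under stated assumptions: struct ppiov_model, enum put_result and ppiov_model_put() are hypothetical names, the refcount is a plain int rather than a refcount_t, and "recycled" stands for the buffer being handed back to the pool's cache.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace stand-in for a page_pool_iov; not the kernel struct. */
struct ppiov_model {
	int refcount;   /* one reference per holder (skb, user) */
	bool recycled;  /* set once the buffer goes back to the pool cache */
};

enum put_result { PUT_DEFERRED, PUT_RECYCLED };

/* Models one put on an iov:
 *  - refcount == 1: we are the last holder, so the buffer is a recycling
 *    candidate and the pool keeps that single reference.
 *  - refcount  > 1: drop only our reference and do NOT release the buffer
 *    from the pool, so a later put can still recycle it.
 */
static inline enum put_result ppiov_model_put(struct ppiov_model *iov)
{
	assert(iov->refcount >= 1);

	if (iov->refcount == 1) {
		iov->recycled = true;
		return PUT_RECYCLED;
	}

	iov->refcount--;
	return PUT_DEFERRED;
}
```

With two holders, the first put (skb freed) is deferred and the second put (the user returning the frag via SO_DEVMEM_DONTNEED) recycles the buffer. Before the change, the first put would have released the buffer from the pool and the second would have had nothing to recycle.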

Signed-off-by: Mina Almasry 
---
 net/core/page_pool.c | 23 +++
 1 file changed, 23 insertions(+)

diff --git a/net/core/page_pool.c b/net/core/page_pool.c
index f0148d66371b..dc2a148f5b06 100644
--- a/net/core/page_pool.c
+++ b/net/core/page_pool.c
@@ -731,6 +731,29 @@ __page_pool_put_page(struct page_pool *pool, struct page *page,
/* Page found as candidate for recycling */
return page;
}
+
+   if (page_is_page_pool_iov(page)) {
+   /* With devmem TCP and ppiovs, we can't release pages if the
+* refcount is > 1. This is because the net stack holds
+* 2 references:
+*  - 1 for the skb, and
+*  - 1 for the user until they call SO_DEVMEM_DONTNEED.
+* Releasing pages for elevated refcounts completely disables
+* page_pool recycling. Instead, simply don't release pages and
+* the next call to napi_pp_put_page() via SO_DEVMEM_DONTNEED
+* will consider the page again for recycling. As a result,
+* devmem TCP is incompatible with drivers doing refcnt based
+* recycling unless those drivers:
+*
+* - don't call skb_mark_for_recycle()
+* - are sure to release the last reference with
+*   page_pool_put_full_page() to consider the page for
+*   page_pool recycling.
+*/
+   page_pool_page_put_many(page, 1);
+   return NULL;
+   }
+
/* Fallback/non-XDP mode: API user have elevated refcnt.
 *
 * Many drivers split up the page into fragments, and some
-- 
2.43.0.472.g3155946c3a-goog

[net-next v1 07/16] netdev: netdevice devmem allocator

2023-12-07 Thread Mina Almasry

Implement netdev devmem allocator. The allocator takes a given struct
netdev_dmabuf_binding as input and allocates page_pool_iov from that
binding.

The allocation simply delegates to the binding's genpool for the
allocation logic and wraps the returned memory region in a page_pool_iov
struct.

page_pool_iov are refcounted and are freed back to the binding when the
refcount drops to 0.
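
The address arithmetic behind the allocator can be sketched in userspace as follows. This is an illustration only: struct chunk_owner_model, struct ppiov_model, addr_to_ppiov() and ppiov_to_addr() are hypothetical names, a 4 KiB page size is assumed, and the bounds check is an addition of the sketch rather than something the patch does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_PAGE_SHIFT 12                      /* assumption: 4 KiB pages */
#define MODEL_PAGE_SIZE  (1UL << MODEL_PAGE_SHIFT)

struct chunk_owner_model;

/* Stand-in for page_pool_iov: it only remembers its owning chunk. */
struct ppiov_model {
	struct chunk_owner_model *owner;
};

/* Stand-in for the genpool chunk owner: base address plus iov array. */
struct chunk_owner_model {
	uint64_t base_dma_addr;
	struct ppiov_model *ppiovs;
	size_t num_ppiovs;
};

/* Allocation direction: the pool hands back a dma address inside the
 * chunk, and the iov is found by its page index within that chunk.
 */
static inline struct ppiov_model *
addr_to_ppiov(struct chunk_owner_model *owner, uint64_t dma_addr)
{
	uint64_t offset;
	size_t index;

	if (dma_addr < owner->base_dma_addr)
		return NULL;

	offset = dma_addr - owner->base_dma_addr;
	index = offset / MODEL_PAGE_SIZE;
	if (index >= owner->num_ppiovs)
		return NULL;

	return &owner->ppiovs[index];
}

/* Free direction: recover the dma address from the iov's array index. */
static inline uint64_t ppiov_to_addr(const struct ppiov_model *iov)
{
	size_t idx = (size_t)(iov - iov->owner->ppiovs);

	return iov->owner->base_dma_addr + ((uint64_t)idx << MODEL_PAGE_SHIFT);
}
```

The two functions are inverses of each other for page-aligned addresses inside the chunk, which is what lets the free path hand the same address back to the pool without storing it in each iov.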

Signed-off-by: Willem de Bruijn 
Signed-off-by: Kaiyuan Zhang 
Signed-off-by: Mina Almasry 

---

v1:
- Rename devmem -> dmabuf (David).

---
 include/net/devmem.h| 13 
 include/net/page_pool/helpers.h | 28 +
 net/core/dev.c  | 37 -
 3 files changed, 77 insertions(+), 1 deletion(-)

diff --git a/include/net/devmem.h b/include/net/devmem.h
index 29ff125f9815..29bc337c7743 100644
--- a/include/net/devmem.h
+++ b/include/net/devmem.h
@@ -48,6 +48,9 @@ struct netdev_dmabuf_binding {
 };
 
 #ifdef CONFIG_DMA_SHARED_BUFFER
+struct page_pool_iov *
+netdev_alloc_dmabuf(struct netdev_dmabuf_binding *binding);
+void netdev_free_dmabuf(struct page_pool_iov *ppiov);
 void __netdev_dmabuf_binding_free(struct netdev_dmabuf_binding *binding);
 int netdev_bind_dmabuf(struct net_device *dev, unsigned int dmabuf_fd,
   struct netdev_dmabuf_binding **out);
@@ -55,6 +58,16 @@ void netdev_unbind_dmabuf(struct netdev_dmabuf_binding *binding);
 int netdev_bind_dmabuf_to_queue(struct net_device *dev, u32 rxq_idx,
struct netdev_dmabuf_binding *binding);
 #else
+static inline struct page_pool_iov *
+netdev_alloc_dmabuf(struct netdev_dmabuf_binding *binding)
+{
+   return NULL;
+}
+
+static inline void netdev_free_dmabuf(struct page_pool_iov *ppiov)
+{
+}
+
 static inline void
 __netdev_dmabuf_binding_free(struct netdev_dmabuf_binding *binding)
 {
diff --git a/include/net/page_pool/helpers.h b/include/net/page_pool/helpers.h
index 7dc65774cde5..8bfc2d43efd4 100644
--- a/include/net/page_pool/helpers.h
+++ b/include/net/page_pool/helpers.h
@@ -79,6 +79,34 @@ static inline u64 *page_pool_ethtool_stats_get(u64 *data, void *stats)
 }
 #endif
 
+/* page_pool_iov support */
+
+static inline struct dmabuf_genpool_chunk_owner *
+page_pool_iov_owner(const struct page_pool_iov *ppiov)
+{
+   return ppiov->owner;
+}
+
+static inline unsigned int page_pool_iov_idx(const struct page_pool_iov *ppiov)
+{
+   return ppiov - page_pool_iov_owner(ppiov)->ppiovs;
+}
+
+static inline dma_addr_t
+page_pool_iov_dma_addr(const struct page_pool_iov *ppiov)
+{
+   struct dmabuf_genpool_chunk_owner *owner = page_pool_iov_owner(ppiov);
+
+   return owner->base_dma_addr +
+  ((dma_addr_t)page_pool_iov_idx(ppiov) << PAGE_SHIFT);
+}
+
+static inline struct netdev_dmabuf_binding *
+page_pool_iov_binding(const struct page_pool_iov *ppiov)
+{
+   return page_pool_iov_owner(ppiov)->binding;
+}
+
 /**
  * page_pool_dev_alloc_pages() - allocate a page.
  * @pool:  pool from which to allocate
diff --git a/net/core/dev.c b/net/core/dev.c
index b8c8be5a912e..30667e4c3b95 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -155,8 +155,8 @@
 #include 
 #include 
 #include 
-#include 
 #include 
+#include 
 
 #include "dev.h"
 #include "net-sysfs.h"
@@ -2120,6 +2120,41 @@ static int netdev_restart_rx_queue(struct net_device *dev, int rxq_idx)
return err;
 }
 
+struct page_pool_iov *netdev_alloc_dmabuf(struct netdev_dmabuf_binding *binding)
+{
+   struct dmabuf_genpool_chunk_owner *owner;
+   struct page_pool_iov *ppiov;
+   unsigned long dma_addr;
+   ssize_t offset;
+   ssize_t index;
+
+   dma_addr = gen_pool_alloc_owner(binding->chunk_pool, PAGE_SIZE,
+   (void **)&owner);
+   if (!dma_addr)
+   return NULL;
+
+   offset = dma_addr - owner->base_dma_addr;
+   index = offset / PAGE_SIZE;
+   ppiov = &owner->ppiovs[index];
+
+   netdev_dmabuf_binding_get(binding);
+
+   return ppiov;
+}
+
+void netdev_free_dmabuf(struct page_pool_iov *ppiov)
+{
+   struct netdev_dmabuf_binding *binding = page_pool_iov_binding(ppiov);
+   unsigned long dma_addr = page_pool_iov_dma_addr(ppiov);
+
+   refcount_set(&ppiov->refcount, 1);
+
+   if (gen_pool_has_addr(binding->chunk_pool, dma_addr, PAGE_SIZE))
+   gen_pool_free(binding->chunk_pool, dma_addr, PAGE_SIZE);
+
+   netdev_dmabuf_binding_put(binding);
+}
+
 /* Protected by rtnl_lock() */
 static DEFINE_XARRAY_FLAGS(netdev_dmabuf_bindings, XA_FLAGS_ALLOC1);
 
-- 
2.43.0.472.g3155946c3a-goog

[net-next v1 08/16] memory-provider: dmabuf devmem memory provider

2023-12-07 Thread Mina Almasry

Implement a memory provider that allocates dmabuf devmem page_pool_iovs.

The provider receives a reference to the struct netdev_dmabuf_binding
via the pool->mp_priv pointer. The driver needs to set this pointer for
the provider in the page_pool_params.

The provider obtains a reference on the netdev_dmabuf_binding, which
guarantees that the binding and the underlying mapping remain alive until
the provider is destroyed.

Usage of PP_FLAG_DMA_MAP is required for this memory provider so that
the page_pool can provide the driver with the dma-addrs of the devmem.

Support for PP_FLAG_DMA_SYNC_DEV is omitted for simplicity.

Signed-off-by: Willem de Bruijn 
Signed-off-by: Kaiyuan Zhang 
Signed-off-by: Mina Almasry 

---

v1:
- static_branch check in page_is_page_pool_iov() (Willem & Paolo).
- PP_DEVMEM -> PP_IOV (David).
- Require PP_FLAG_DMA_MAP (Jakub).

---
 include/net/page_pool/helpers.h | 47 +
 include/net/page_pool/types.h   |  9 
 net/core/page_pool.c| 89 -
 3 files changed, 144 insertions(+), 1 deletion(-)

diff --git a/include/net/page_pool/helpers.h b/include/net/page_pool/helpers.h
index 8bfc2d43efd4..00197f14aa87 100644
--- a/include/net/page_pool/helpers.h
+++ b/include/net/page_pool/helpers.h
@@ -53,6 +53,8 @@
 #define _NET_PAGE_POOL_HELPERS_H
 
 #include 
+#include 
+#include 
 
 #ifdef CONFIG_PAGE_POOL_STATS
 /* Deprecated driver-facing API, use netlink instead */
@@ -92,6 +94,11 @@ static inline unsigned int page_pool_iov_idx(const struct page_pool_iov *ppiov)
return ppiov - page_pool_iov_owner(ppiov)->ppiovs;
 }
 
+static inline u32 page_pool_iov_binding_id(const struct page_pool_iov *ppiov)
+{
+   return page_pool_iov_owner(ppiov)->binding->id;
+}
+
 static inline dma_addr_t
 page_pool_iov_dma_addr(const struct page_pool_iov *ppiov)
 {
@@ -107,6 +114,46 @@ page_pool_iov_binding(const struct page_pool_iov *ppiov)
return page_pool_iov_owner(ppiov)->binding;
 }
 
+static inline int page_pool_iov_refcount(const struct page_pool_iov *ppiov)
+{
+   return refcount_read(&ppiov->refcount);
+}
+
+static inline void page_pool_iov_get_many(struct page_pool_iov *ppiov,
+ unsigned int count)
+{
+   refcount_add(count, &ppiov->refcount);
+}
+
+void __page_pool_iov_free(struct page_pool_iov *ppiov);
+
+static inline void page_pool_iov_put_many(struct page_pool_iov *ppiov,
+ unsigned int count)
+{
+   if (!refcount_sub_and_test(count, &ppiov->refcount))
+   return;
+
+   __page_pool_iov_free(ppiov);
+}
+
+/* page pool mm helpers */
+
+DECLARE_STATIC_KEY_FALSE(page_pool_mem_providers);
+static inline bool page_is_page_pool_iov(const struct page *page)
+{
+   return static_branch_unlikely(&page_pool_mem_providers) &&
+  (unsigned long)page & PP_IOV;
+}
+
+static inline struct page_pool_iov *page_to_page_pool_iov(struct page *page)
+{
+   if (page_is_page_pool_iov(page))
+   return (struct page_pool_iov *)((unsigned long)page & ~PP_IOV);
+
+   DEBUG_NET_WARN_ON_ONCE(true);
+   return NULL;
+}
+
 /**
  * page_pool_dev_alloc_pages() - allocate a page.
  * @pool:  pool from which to allocate
diff --git a/include/net/page_pool/types.h b/include/net/page_pool/types.h
index 44faee7a7b02..136930a238de 100644
--- a/include/net/page_pool/types.h
+++ b/include/net/page_pool/types.h
@@ -134,8 +134,15 @@ struct memory_provider_ops {
bool (*release_page)(struct page_pool *pool, struct page *page);
 };
 
+extern const struct memory_provider_ops dmabuf_devmem_ops;
+
 /* page_pool_iov support */
 
+/*  We overload the LSB of the struct page pointer to indicate whether it's
+ *  a page or page_pool_iov.
+ */
+#define PP_IOV 0x01UL
+
 /* Owner of the dma-buf chunks inserted into the gen pool. Each scatterlist
  * entry from the dmabuf is inserted into the genpool as a chunk, and needs
  * this owner struct to keep track of some metadata necessary to create
@@ -159,6 +166,8 @@ struct page_pool_iov {
struct dmabuf_genpool_chunk_owner *owner;
 
refcount_t refcount;
+
+   struct page_pool *pp;
 };
 
 struct page_pool {
diff --git a/net/core/page_pool.c b/net/core/page_pool.c
index f5c84d2a4510..423c88564a00 100644
--- a/net/core/page_pool.c
+++ b/net/core/page_pool.c
@@ -12,6 +12,7 @@
 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -20,12 +21,15 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 
 #include 
 
 #include "page_pool_priv.h"
 
-static DEFINE_STATIC_KEY_FALSE(page_pool_mem_providers);
+DEFINE_STATIC_KEY_FALSE(page_pool_mem_providers);
+EXPORT_SYMBOL(page_pool_mem_providers);
 
 #define DEFER_TIME (msecs_to_jiffies(1000))
 #define DEFER_WARN_INTERVAL (60 * HZ)
@@ -175,6 +179,7 @@ static void page_pool_producer_unlock(struct page_pool *pool,
 static int page_pool_init(struct page_pool *pool,
  const struct page_pool_params *params)
 {
+

[net-next v1 06/16] netdev: support binding dma-buf to netdevice

2023-12-07 Thread Mina Almasry

Add a netdev_dmabuf_binding struct which represents the
dma-buf-to-netdevice binding. The netlink API will bind the dma-buf to
rx queues on the netdevice. At bind time, dma_buf_attach() and
dma_buf_map_attachment() are called. The entries in the sg_table returned
by the mapping are inserted into a genpool to make them ready
for allocation.

The chunks in the genpool are owned by a dmabuf_genpool_chunk_owner struct
which holds the dma-buf offset of the base of the chunk and the dma_addr of
the chunk. Both are needed to use allocations that come from this chunk.

We create a new type that represents an allocation from the genpool:
page_pool_iov. We set the page_pool_iov allocation size in the
genpool to PAGE_SIZE for simplicity: this matches the PAGE_SIZE normally
allocated by the page pool and given to the drivers.

The user can unbind the dmabuf from the netdevice by closing the netlink
socket that established the binding. We do this so that the binding is
automatically unbound even if the userspace process crashes.

Binding and unbinding leave an indicator in struct netdev_rx_queue
that the given queue is bound, but the binding doesn't take effect until
the driver actually reconfigures its queues and re-initializes its page
pool.

The netdev_dmabuf_binding struct is refcounted, and releases its
resources only when all the refs are released.
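
The lifetime rule can be sketched with a small userspace model. This is an illustration, not the kernel implementation: struct binding_model and the binding_model_*() helpers are hypothetical names, a plain int stands in for refcount_t, and the "released" flag stands in for unmapping and detaching the dma-buf. The three kinds of holders (user, page pool, allocated page_pool_iov) follow the comment on the ref field in the struct added by this patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace stand-in for netdev_dmabuf_binding; not the kernel struct. */
struct binding_model {
	int ref;        /* number of live holders */
	bool released;  /* set when the last holder drops its reference */
};

/* The creator (the netlink user in the series) starts with one reference. */
static inline void binding_model_init(struct binding_model *b)
{
	b->ref = 1;
	b->released = false;
}

/* Taken by each page pool using the binding and by each allocated iov. */
static inline void binding_model_get(struct binding_model *b)
{
	assert(b->ref > 0 && !b->released);
	b->ref++;
}

/* Resources are torn down only when the final reference goes away. */
static inline void binding_model_put(struct binding_model *b)
{
	assert(b->ref > 0);
	if (--b->ref == 0)
		b->released = true;
}
```

In this model the user can drop its reference (for example by closing the netlink socket) while buffers are still in flight, and teardown is postponed until the page pool and every outstanding iov have dropped theirs.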

Signed-off-by: Willem de Bruijn 
Signed-off-by: Kaiyuan Zhang 
Signed-off-by: Mina Almasry 

---

v1:

- Introduce devmem.h instead of bloating netdevice.h (Jakub)
- ENOTSUPP -> EOPNOTSUPP (checkpatch.pl I think)
- Remove unneeded rcu protection for binding->list (rtnl protected)
- Removed extraneous err_binding_put: label.
- Removed dma_addr += len (Paolo).
- Don't override err on netdev_bind_dmabuf_to_queue failure.
- Rename devmem -> dmabuf (David).
- Add id to dmabuf binding (David/Stan).
- Fix missing xa_destroy() of bound_rxq_list.
- Use queue api to reset bound RX queues (Jakub).
- Update netlink API for rx-queue type (tx/rx) (Jakub).

RFC v3:
- Support multi rx-queue binding

---
 include/net/devmem.h  |  96 
 include/net/netdev_rx_queue.h |   1 +
 include/net/page_pool/types.h |  27 
 net/core/dev.c| 276 ++
 net/core/netdev-genl.c| 122 ++-
 5 files changed, 520 insertions(+), 2 deletions(-)
 create mode 100644 include/net/devmem.h

diff --git a/include/net/devmem.h b/include/net/devmem.h
new file mode 100644
index 000000000000..29ff125f9815
--- /dev/null
+++ b/include/net/devmem.h
@@ -0,0 +1,96 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/*
+ * Device memory TCP support
+ *
+ * Authors:Mina Almasry 
+ * Willem de Bruijn 
+ * Kaiyuan Zhang 
+ *
+ */
+#ifndef _NET_DEVMEM_H
+#define _NET_DEVMEM_H
+
+struct netdev_dmabuf_binding {
+   struct dma_buf *dmabuf;
+   struct dma_buf_attachment *attachment;
+   struct sg_table *sgt;
+   struct net_device *dev;
+   struct gen_pool *chunk_pool;
+
+   /* The user holds a ref (via the netlink API) for as long as they want
+* the binding to remain alive. Each page pool using this binding holds
+* a ref to keep the binding alive. Each allocated page_pool_iov holds a
+* ref.
+*
+* The binding undoes itself and unmaps the underlying dmabuf once all
+* those refs are dropped and the binding is no longer desired or in
+* use.
+*/
+   refcount_t ref;
+
+   /* The portid of the user that owns this binding. Used for netlink to
+* notify us of the user dropping the bind.
+*/
+   u32 owner_nlportid;
+
+   /* The list of bindings currently active. Used for netlink to notify us
+* of the user dropping the bind.
+*/
+   struct list_head list;
+
+   /* rxq's this binding is active on. */
+   struct xarray bound_rxq_list;
+
+   /* ID of this binding. Globally unique to all bindings currently
+* active.
+*/
+   u32 id;
+};
+
+#ifdef CONFIG_DMA_SHARED_BUFFER
+void __netdev_dmabuf_binding_free(struct netdev_dmabuf_binding *binding);
+int netdev_bind_dmabuf(struct net_device *dev, unsigned int dmabuf_fd,
+  struct netdev_dmabuf_binding **out);
+void netdev_unbind_dmabuf(struct netdev_dmabuf_binding *binding);
+int netdev_bind_dmabuf_to_queue(struct net_device *dev, u32 rxq_idx,
+   struct netdev_dmabuf_binding *binding);
+#else
+static inline void
+__netdev_dmabuf_binding_free(struct netdev_dmabuf_binding *binding)
+{
+}
+
+static inline int netdev_bind_dmabuf(struct net_device *dev,
+unsigned int dmabuf_fd,
+struct netdev_dmabuf_binding **out)
+{
+   return -EOPNOTSUPP;
+}
+static inline void netdev_unbind_dmabuf(struct netdev_dmabuf_binding *binding)
+{
+}
+
+static inline int
+netdev_bind_dmabuf_to_queue(struct net_device *dev, u32 rxq_idx,
+

[net-next v1 05/16] net: netdev netlink api to bind dma-buf to a net device

2023-12-07 Thread Mina Almasry

API takes the dma-buf fd as input, and binds it to the netdevice. The
user can specify the rx queues to bind the dma-buf to.

Suggested-by: Stanislav Fomichev 
Signed-off-by: Mina Almasry 

---

Changes in v1:
- Add rx-queue-type to distinguish rx from tx (Jakub)
- Return dma-buf ID from netlink API (David, Stan)

Changes in RFC-v3:
- Support binding multiple rx-queues

---
 Documentation/netlink/specs/netdev.yaml | 52 +
 include/uapi/linux/netdev.h | 19 +
 net/core/netdev-genl-gen.c  | 19 +
 net/core/netdev-genl-gen.h  |  2 +
 net/core/netdev-genl.c  |  6 +++
 tools/include/uapi/linux/netdev.h   | 19 +
 6 files changed, 117 insertions(+)

diff --git a/Documentation/netlink/specs/netdev.yaml b/Documentation/netlink/specs/netdev.yaml
index f2c76d103bd8..df6a11d47006 100644
--- a/Documentation/netlink/specs/netdev.yaml
+++ b/Documentation/netlink/specs/netdev.yaml
@@ -260,6 +260,45 @@ attribute-sets:
 name: napi-id
 doc: ID of the NAPI instance which services this queue.
 type: u32
+  -
+name: queue-dmabuf
+attributes:
+  -
+name: type
+doc: rx or tx queue
+type: u8
+enum: queue-type
+  -
+name: idx
+doc: queue index
+type: u32
+
+  -
+name: bind-dmabuf
+attributes:
+  -
+name: ifindex
+doc: netdev ifindex to bind the dma-buf to.
+type: u32
+checks:
+  min: 1
+  -
+name: queues
+doc: receive queues to bind the dma-buf to.
+type: nest
+nested-attributes: queue-dmabuf
+multi-attr: true
+  -
+name: dmabuf-fd
+doc: dmabuf file descriptor to bind.
+type: u32
+  -
+name: dmabuf-id
+doc: id of the dmabuf binding
+type: u32
+checks:
+  min: 1
+
 
 operations:
   list:
@@ -382,6 +421,19 @@ operations:
   attributes:
 - ifindex
 reply: *queue-get-op
+-
+  name: bind-rx
+  doc: Bind dmabuf to netdev
+  attribute-set: bind-dmabuf
+  do:
+request:
+  attributes:
+- ifindex
+- dmabuf-fd
+- queues
+reply:
+  attributes:
+- dmabuf-id
 -
   name: napi-get
   doc: Get information about NAPI instances configured on the system.
diff --git a/include/uapi/linux/netdev.h b/include/uapi/linux/netdev.h
index 424c5e28f495..35d201dc4b05 100644
--- a/include/uapi/linux/netdev.h
+++ b/include/uapi/linux/netdev.h
@@ -129,6 +129,24 @@ enum {
NETDEV_A_QUEUE_MAX = (__NETDEV_A_QUEUE_MAX - 1)
 };
 
+enum {
+   NETDEV_A_QUEUE_DMABUF_TYPE = 1,
+   NETDEV_A_QUEUE_DMABUF_IDX,
+
+   __NETDEV_A_QUEUE_DMABUF_MAX,
+   NETDEV_A_QUEUE_DMABUF_MAX = (__NETDEV_A_QUEUE_DMABUF_MAX - 1)
+};
+
+enum {
+   NETDEV_A_BIND_DMABUF_IFINDEX = 1,
+   NETDEV_A_BIND_DMABUF_QUEUES,
+   NETDEV_A_BIND_DMABUF_DMABUF_FD,
+   NETDEV_A_BIND_DMABUF_DMABUF_ID,
+
+   __NETDEV_A_BIND_DMABUF_MAX,
+   NETDEV_A_BIND_DMABUF_MAX = (__NETDEV_A_BIND_DMABUF_MAX - 1)
+};
+
 enum {
NETDEV_CMD_DEV_GET = 1,
NETDEV_CMD_DEV_ADD_NTF,
@@ -140,6 +158,7 @@ enum {
NETDEV_CMD_PAGE_POOL_CHANGE_NTF,
NETDEV_CMD_PAGE_POOL_STATS_GET,
NETDEV_CMD_QUEUE_GET,
+   NETDEV_CMD_BIND_RX,
NETDEV_CMD_NAPI_GET,
 
__NETDEV_CMD_MAX,
diff --git a/net/core/netdev-genl-gen.c b/net/core/netdev-genl-gen.c
index be7f2ebd61b2..3384b1ae3f40 100644
--- a/net/core/netdev-genl-gen.c
+++ b/net/core/netdev-genl-gen.c
@@ -27,6 +27,11 @@ const struct nla_policy netdev_page_pool_info_nl_policy[NETDEV_A_PAGE_POOL_IFIND
[NETDEV_A_PAGE_POOL_IFINDEX] = NLA_POLICY_FULL_RANGE(NLA_U32, &netdev_a_page_pool_ifindex_range),
 };
 
+const struct nla_policy netdev_queue_dmabuf_nl_policy[NETDEV_A_QUEUE_DMABUF_IDX + 1] = {
+   [NETDEV_A_QUEUE_DMABUF_TYPE] = NLA_POLICY_MAX(NLA_U8, 1),
+   [NETDEV_A_QUEUE_DMABUF_IDX] = { .type = NLA_U32, },
+};
+
 /* NETDEV_CMD_DEV_GET - do */
 static const struct nla_policy netdev_dev_get_nl_policy[NETDEV_A_DEV_IFINDEX + 1] = {
[NETDEV_A_DEV_IFINDEX] = NLA_POLICY_MIN(NLA_U32, 1),
@@ -58,6 +63,13 @@ static const struct nla_policy netdev_queue_get_dump_nl_policy[NETDEV_A_QUEUE_IF
[NETDEV_A_QUEUE_IFINDEX] = NLA_POLICY_MIN(NLA_U32, 1),
 };
 
+/* NETDEV_CMD_BIND_RX - do */
+static const struct nla_policy netdev_bind_rx_nl_policy[NETDEV_A_BIND_DMABUF_DMABUF_FD + 1] = {
+   [NETDEV_A_BIND_DMABUF_IFINDEX] = NLA_POLICY_MIN(NLA_U32, 1),
+   [NETDEV_A_BIND_DMABUF_DMABUF_FD] = { .type = NLA_U32, },
+   [NETDEV_A_BIND_DMABUF_QUEUES] = NLA_POLICY_NESTED(netdev_queue_dmabuf_nl_policy),
+};
+
 /* NETDEV_CMD_NAPI_GET - do */
 static const struct nla_policy netdev_napi_get_do_nl_policy[NETDEV_A_NAPI_ID + 1] = {
[NETDEV_A_NAPI_ID] = { .type = NLA_U32, },
@@ -124,6 +136,13 @@

[net-next v1 04/16] gve: implement queue api

2023-12-07 Thread Mina Almasry

Define a struct that contains all of the memory needed for an RX
queue to function.

Implement the queue-api in GVE using this struct.

Currently the only memory allocated at queue start time is the RX
pages in gve_rx_post_buffers_dqo(). That allocation can be moved up to
queue_mem_alloc() time in the future.

For simplicity the queue API is only supported by the diorite queue
out-of-order (DQO) format without queue-page-lists (QPL). Support for
other GVE formats can be added in the future as well.

Signed-off-by: Mina Almasry 

---
 drivers/net/ethernet/google/gve/gve_adminq.c |   6 +-
 drivers/net/ethernet/google/gve/gve_adminq.h |   3 +
 drivers/net/ethernet/google/gve/gve_dqo.h|   2 +
 drivers/net/ethernet/google/gve/gve_main.c   | 286 +++
 drivers/net/ethernet/google/gve/gve_rx_dqo.c |   5 +-
 5 files changed, 296 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/google/gve/gve_adminq.c b/drivers/net/ethernet/google/gve/gve_adminq.c
index 12fbd723ecc6..e515b7278295 100644
--- a/drivers/net/ethernet/google/gve/gve_adminq.c
+++ b/drivers/net/ethernet/google/gve/gve_adminq.c
@@ -348,7 +348,7 @@ static int gve_adminq_parse_err(struct gve_priv *priv, u32 status)
 /* Flushes all AQ commands currently queued and waits for them to complete.
  * If there are failures, it will return the first error.
  */
-static int gve_adminq_kick_and_wait(struct gve_priv *priv)
+int gve_adminq_kick_and_wait(struct gve_priv *priv)
 {
int tail, head;
int i;
@@ -591,7 +591,7 @@ int gve_adminq_create_tx_queues(struct gve_priv *priv, u32 start_id, u32 num_que
return gve_adminq_kick_and_wait(priv);
 }
 
-static int gve_adminq_create_rx_queue(struct gve_priv *priv, u32 queue_index)
+int gve_adminq_create_rx_queue(struct gve_priv *priv, u32 queue_index)
 {
struct gve_rx_ring *rx = &priv->rx[queue_index];
union gve_adminq_command cmd;
@@ -691,7 +691,7 @@ int gve_adminq_destroy_tx_queues(struct gve_priv *priv, u32 start_id, u32 num_qu
return gve_adminq_kick_and_wait(priv);
 }
 
-static int gve_adminq_destroy_rx_queue(struct gve_priv *priv, u32 queue_index)
+int gve_adminq_destroy_rx_queue(struct gve_priv *priv, u32 queue_index)
 {
union gve_adminq_command cmd;
int err;
diff --git a/drivers/net/ethernet/google/gve/gve_adminq.h b/drivers/net/ethernet/google/gve/gve_adminq.h
index 5865ccdccbd0..265beed965dc 100644
--- a/drivers/net/ethernet/google/gve/gve_adminq.h
+++ b/drivers/net/ethernet/google/gve/gve_adminq.h
@@ -411,6 +411,7 @@ union gve_adminq_command {
 
 static_assert(sizeof(union gve_adminq_command) == 64);
 
+int gve_adminq_kick_and_wait(struct gve_priv *priv);
 int gve_adminq_alloc(struct device *dev, struct gve_priv *priv);
 void gve_adminq_free(struct device *dev, struct gve_priv *priv);
 void gve_adminq_release(struct gve_priv *priv);
@@ -424,7 +425,9 @@ int gve_adminq_deconfigure_device_resources(struct gve_priv *priv);
 int gve_adminq_create_tx_queues(struct gve_priv *priv, u32 start_id, u32 num_queues);
 int gve_adminq_destroy_tx_queues(struct gve_priv *priv, u32 start_id, u32 num_queues);
 int gve_adminq_create_rx_queues(struct gve_priv *priv, u32 num_queues);
+int gve_adminq_create_rx_queue(struct gve_priv *priv, u32 queue_index);
 int gve_adminq_destroy_rx_queues(struct gve_priv *priv, u32 queue_id);
+int gve_adminq_destroy_rx_queue(struct gve_priv *priv, u32 queue_id);
 int gve_adminq_register_page_list(struct gve_priv *priv,
  struct gve_queue_page_list *qpl);
 int gve_adminq_unregister_page_list(struct gve_priv *priv, u32 page_list_id);
diff --git a/drivers/net/ethernet/google/gve/gve_dqo.h b/drivers/net/ethernet/google/gve/gve_dqo.h
index c36b93f0de15..3eed26a0ed7d 100644
--- a/drivers/net/ethernet/google/gve/gve_dqo.h
+++ b/drivers/net/ethernet/google/gve/gve_dqo.h
@@ -46,6 +46,8 @@ int gve_clean_tx_done_dqo(struct gve_priv *priv, struct gve_tx_ring *tx,
  struct napi_struct *napi);
 void gve_rx_post_buffers_dqo(struct gve_rx_ring *rx);
 void gve_rx_write_doorbell_dqo(const struct gve_priv *priv, int queue_idx);
+void gve_free_page_dqo(struct gve_priv *priv, struct gve_rx_buf_state_dqo *bs,
+  bool free_page);
 
 static inline void
 gve_tx_put_doorbell_dqo(const struct gve_priv *priv,
diff --git a/drivers/net/ethernet/google/gve/gve_main.c b/drivers/net/ethernet/google/gve/gve_main.c
index 619bf63ec935..5b23d811afd3 100644
--- a/drivers/net/ethernet/google/gve/gve_main.c
+++ b/drivers/net/ethernet/google/gve/gve_main.c
@@ -22,6 +22,7 @@
 #include "gve_dqo.h"
 #include "gve_adminq.h"
 #include "gve_register.h"
+#include "gve_utils.h"
 
 #define GVE_DEFAULT_RX_COPYBREAK   (256)
 
@@ -1702,6 +1703,287 @@ static int gve_xdp(struct net_device *dev, struct netdev_bpf *xdp)
}
 }
 
+struct gve_per_rx_queue_mem_dqo {
+   struct gve_rx_buf_state_dqo *buf_states;
+   u32 num_buf_states;
+
+   struct gve_rx_compl_desc_dqo

[net-next v1 03/16] queue_api: define queue api

2023-12-07 Thread Mina Almasry

This API enables the net stack to reset the queues used for devmem.

Signed-off-by: Mina Almasry 

---
 include/linux/netdevice.h | 24 
 1 file changed, 24 insertions(+)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 1b935ee341b4..316f7dee86ce 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1432,6 +1432,20 @@ struct netdev_net_notifier {
  *struct kernel_hwtstamp_config *kernel_config,
  *struct netlink_ext_ack *extack);
  * Change the hardware timestamping parameters for NIC device.
+ *
+ * void *(*ndo_queue_mem_alloc)(struct net_device *dev, int idx);
+ * Allocate memory for an RX queue. The memory returned in the form of
+ * a void * can be passed to ndo_queue_mem_free() for freeing or to
+ * ndo_queue_start to create an RX queue with this memory.
+ *
+ * void(*ndo_queue_mem_free)(struct net_device *dev, void *);
+ * Free memory from an RX queue.
+ *
+ * int (*ndo_queue_start)(struct net_device *dev, int idx, void *);
+ * Start an RX queue at the specified index.
+ *
+ * int (*ndo_queue_stop)(struct net_device *dev, int idx, void **);
+ * Stop the RX queue at the specified index.
  */
 struct net_device_ops {
int (*ndo_init)(struct net_device *dev);
@@ -1673,6 +1687,16 @@ struct net_device_ops {
int (*ndo_hwtstamp_set)(struct net_device *dev,
struct kernel_hwtstamp_config *kernel_config,
struct netlink_ext_ack *extack);
+   void *  (*ndo_queue_mem_alloc)(struct net_device *dev,
+  int idx);
+   void(*ndo_queue_mem_free)(struct net_device *dev,
+ void *queue_mem);
+   int (*ndo_queue_start)(struct net_device *dev,
+  int idx,
+  void *queue_mem);
+   int (*ndo_queue_stop)(struct net_device *dev,
+ int idx,
+ void **out_queue_mem);
 };
 
 /**
-- 
2.43.0.472.g3155946c3a-goog
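
For readers following the series, here is a rough user-space sketch of how a caller might chain the four ndos above to swap the memory behind one RX queue without a full device reset. Everything except the four callback signatures is an assumption made for illustration: struct queue_ops, the rxq_mem[] array, restart_rx_queue() and the toy_* driver are hypothetical, and the ordering (allocate first, then stop, then start) is only one plausible choice, not something the patch mandates.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* User-space stand-ins for the kernel types; only the four ndos matter. */
struct net_device;

struct queue_ops {
	void *(*ndo_queue_mem_alloc)(struct net_device *dev, int idx);
	void (*ndo_queue_mem_free)(struct net_device *dev, void *queue_mem);
	int (*ndo_queue_start)(struct net_device *dev, int idx, void *queue_mem);
	int (*ndo_queue_stop)(struct net_device *dev, int idx, void **out_queue_mem);
};

struct net_device {
	const struct queue_ops *ops;
	void *rxq_mem[4];	/* hypothetical: memory backing each RX queue */
};

/* Hypothetical helper: replace the memory of RX queue idx. */
int restart_rx_queue(struct net_device *dev, int idx)
{
	const struct queue_ops *ops = dev->ops;
	void *new_mem, *old_mem = NULL;
	int err;

	/* Allocate before stopping so an allocation failure leaves the queue up. */
	new_mem = ops->ndo_queue_mem_alloc(dev, idx);
	if (!new_mem)
		return -ENOMEM;

	err = ops->ndo_queue_stop(dev, idx, &old_mem);
	if (err) {
		ops->ndo_queue_mem_free(dev, new_mem);
		return err;
	}

	err = ops->ndo_queue_start(dev, idx, new_mem);
	if (err) {
		ops->ndo_queue_mem_free(dev, new_mem);
		/* Best effort: bring the queue back up on its old memory. */
		if (ops->ndo_queue_start(dev, idx, old_mem))
			ops->ndo_queue_mem_free(dev, old_mem);
		return err;
	}

	ops->ndo_queue_mem_free(dev, old_mem);
	return 0;
}

/* Toy driver used only to exercise the helper. */
static void *toy_alloc(struct net_device *dev, int idx)
{
	(void)dev; (void)idx;
	return malloc(64);
}

static void toy_free(struct net_device *dev, void *queue_mem)
{
	(void)dev;
	free(queue_mem);
}

static int toy_start(struct net_device *dev, int idx, void *queue_mem)
{
	dev->rxq_mem[idx] = queue_mem;
	return 0;
}

static int toy_stop(struct net_device *dev, int idx, void **out_queue_mem)
{
	*out_queue_mem = dev->rxq_mem[idx];
	dev->rxq_mem[idx] = NULL;
	return 0;
}

const struct queue_ops toy_ops = {
	.ndo_queue_mem_alloc = toy_alloc,
	.ndo_queue_mem_free = toy_free,
	.ndo_queue_start = toy_start,
	.ndo_queue_stop = toy_stop,
};
```

Allocating the new memory before stopping the queue keeps the window in which the queue is down as short as possible; a real implementation would also have to deal with NAPI and in-flight buffers, which this sketch ignores.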

[net-next v1 02/16] net: page_pool: create hooks for custom page providers

2023-12-07 Thread Mina Almasry

From: Jakub Kicinski 

The page providers which try to reuse the same pages will
need to hold onto the ref, even if page gets released from
the pool - as in releasing the page from the pp just transfers
the "ownership" reference from pp to the provider, and provider
will wait for other references to be gone before feeding this
page back into the pool.

Signed-off-by: Jakub Kicinski 
Signed-off-by: Mina Almasry 

---

This is implemented by Jakub in his RFC:
https://lore.kernel.org/netdev/f8270765-a27b-6ccf-33ea-cda097168...@redhat.com/T/

I take no credit for the idea or implementation; I only added minor
edits to make this workable with device memory TCP, and removed some
hacky test code. This is a critical dependency of device memory TCP
and thus I'm pulling it into this series to make it reviewable and
mergeable.

RFC v3 -> v1
- Removed unused mem_provider. (Yunsheng).
- Replaced memory_provider & mp_priv with netdev_rx_queue (Jakub).

---
 include/net/page_pool/types.h | 12 ++
 net/core/page_pool.c  | 43 +++
 2 files changed, 50 insertions(+), 5 deletions(-)

diff --git a/include/net/page_pool/types.h b/include/net/page_pool/types.h
index ac286ea8ce2d..0e9fa79a5ef1 100644
--- a/include/net/page_pool/types.h
+++ b/include/net/page_pool/types.h
@@ -51,6 +51,7 @@ struct pp_alloc_cache {
  * @dev:   device, for DMA pre-mapping purposes
  * @netdev:netdev this pool will serve (leave as NULL if none or multiple)
  * @napi:  NAPI which is the sole consumer of pages, otherwise NULL
+ * @queue: struct netdev_rx_queue this page_pool is being created for.
  * @dma_dir:   DMA mapping direction
  * @max_len:   max DMA sync memory size for PP_FLAG_DMA_SYNC_DEV
  * @offset:DMA sync address offset for PP_FLAG_DMA_SYNC_DEV
@@ -63,6 +64,7 @@ struct page_pool_params {
int nid;
struct device   *dev;
struct napi_struct *napi;
+   struct netdev_rx_queue *queue;
enum dma_data_direction dma_dir;
unsigned intmax_len;
unsigned intoffset;
@@ -125,6 +127,13 @@ struct page_pool_stats {
 };
 #endif
 
+struct memory_provider_ops {
+   int (*init)(struct page_pool *pool);
+   void (*destroy)(struct page_pool *pool);
+   struct page *(*alloc_pages)(struct page_pool *pool, gfp_t gfp);
+   bool (*release_page)(struct page_pool *pool, struct page *page);
+};
+
 struct page_pool {
struct page_pool_params_fast p;
 
@@ -174,6 +183,9 @@ struct page_pool {
 */
struct ptr_ring ring;
 
+   void *mp_priv;
+   const struct memory_provider_ops *mp_ops;
+
 #ifdef CONFIG_PAGE_POOL_STATS
/* recycle stats are per-cpu to avoid locking */
struct page_pool_recycle_stats __percpu *recycle_stats;
diff --git a/net/core/page_pool.c b/net/core/page_pool.c
index ca1b3b65c9b5..f5c84d2a4510 100644
--- a/net/core/page_pool.c
+++ b/net/core/page_pool.c
@@ -25,6 +25,8 @@
 
 #include "page_pool_priv.h"
 
+static DEFINE_STATIC_KEY_FALSE(page_pool_mem_providers);
+
 #define DEFER_TIME (msecs_to_jiffies(1000))
 #define DEFER_WARN_INTERVAL (60 * HZ)
 
@@ -174,6 +176,7 @@ static int page_pool_init(struct page_pool *pool,
  const struct page_pool_params *params)
 {
unsigned int ring_qsize = 1024; /* Default */
+   int err;
 
memcpy(&pool->p, &params->fast, sizeof(pool->p));
memcpy(&pool->slow, &params->slow, sizeof(pool->slow));
@@ -234,10 +237,25 @@ static int page_pool_init(struct page_pool *pool,
/* Driver calling page_pool_create() also call page_pool_destroy() */
refcount_set(&pool->user_cnt, 1);
 
+   if (pool->mp_ops) {
+   err = pool->mp_ops->init(pool);
+   if (err) {
+   pr_warn("%s() mem-provider init failed %d\n",
+   __func__, err);
+   goto free_ptr_ring;
+   }
+
+   static_branch_inc(&page_pool_mem_providers);
+   }
+
if (pool->p.flags & PP_FLAG_DMA_MAP)
get_device(pool->p.dev);
 
return 0;
+
+free_ptr_ring:
+   ptr_ring_cleanup(&pool->ring, NULL);
+   return err;
 }
 
 static void page_pool_uninit(struct page_pool *pool)
@@ -519,7 +537,10 @@ struct page *page_pool_alloc_pages(struct page_pool *pool, gfp_t gfp)
return page;
 
/* Slow-path: cache empty, do real allocation */
-   page = __page_pool_alloc_pages_slow(pool, gfp);
+   if (static_branch_unlikely(&page_pool_mem_providers) && pool->mp_ops)
+   page = pool->mp_ops->alloc_pages(pool, gfp);
+   else
+   page = __page_pool_alloc_pages_slow(pool, gfp);
return page;
 }
 EXPORT_SYMBOL(page_pool_alloc_pages);
@@ -576,10 +597,13 @@ void __page_pool_release_page_dma(struct page_pool *pool, struct page *page)
 void page_pool_return_page(struct page_pool *pool, struct page *page)
 {
int count;
+
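
The ownership hand-off described in the commit message can be modelled in user space roughly as follows. This is a sketch under stated assumptions, not the kernel code: struct page is reduced to a refcount, gfp_t is a stand-in typedef, and reuse_provider, pool_alloc() and pool_return() are hypothetical names. It also assumes that release_page() returning false means "the provider keeps the reference, so the pool must not drop it", which the part of the patch shown here does not spell out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef unsigned int gfp_t;	/* stand-in for the kernel type */

struct page { int refcount; };	/* toy page: only a reference count */
struct page_pool;

struct memory_provider_ops {
	int (*init)(struct page_pool *pool);
	void (*destroy)(struct page_pool *pool);
	struct page *(*alloc_pages)(struct page_pool *pool, gfp_t gfp);
	bool (*release_page)(struct page_pool *pool, struct page *page);
};

struct page_pool {
	void *mp_priv;
	const struct memory_provider_ops *mp_ops;
};

#define NPAGES 2

/* Hypothetical provider that recycles a fixed set of pages. */
struct reuse_provider {
	struct page pages[NPAGES];
	bool in_pool[NPAGES];	/* ownership reference currently lent to the pool */
};

static int reuse_init(struct page_pool *pool)
{
	struct reuse_provider *rp = pool->mp_priv;

	for (int i = 0; i < NPAGES; i++) {
		rp->pages[i].refcount = 1;	/* the "ownership" reference */
		rp->in_pool[i] = false;
	}
	return 0;
}

static void reuse_destroy(struct page_pool *pool)
{
	(void)pool;
}

static struct page *reuse_alloc(struct page_pool *pool, gfp_t gfp)
{
	struct reuse_provider *rp = pool->mp_priv;

	(void)gfp;
	for (int i = 0; i < NPAGES; i++) {
		/* Only feed a page back once all other references are gone. */
		if (!rp->in_pool[i] && rp->pages[i].refcount == 1) {
			rp->in_pool[i] = true;
			return &rp->pages[i];
		}
	}
	return NULL;
}

static bool reuse_release(struct page_pool *pool, struct page *page)
{
	struct reuse_provider *rp = pool->mp_priv;

	/* Ownership moves from the pool back to the provider. */
	rp->in_pool[page - rp->pages] = false;
	return false;	/* assumption: false == pool must not put the page */
}

const struct memory_provider_ops reuse_ops = {
	.init = reuse_init,
	.destroy = reuse_destroy,
	.alloc_pages = reuse_alloc,
	.release_page = reuse_release,
};

/* Simplified pool entry points; the regular slow path is omitted. */
struct page *pool_alloc(struct page_pool *pool, gfp_t gfp)
{
	return pool->mp_ops ? pool->mp_ops->alloc_pages(pool, gfp) : NULL;
}

void pool_return(struct page_pool *pool, struct page *page)
{
	bool put = true;

	if (pool->mp_ops)
		put = pool->mp_ops->release_page(pool, page);
	if (put)
		page->refcount--;
}
```

In this model a page that is still referenced elsewhere (refcount above 1) stays with the provider after pool_return() and only becomes allocatable again once the extra reference is dropped.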

[net-next v1 01/16] net: page_pool: factor out releasing DMA from releasing the page

2023-12-07 Thread Mina Almasry

From: Jakub Kicinski 

Releasing the DMA mapping will be useful for other types
of pages, so factor it out. Make sure compiler inlines it,
to avoid any regressions.

Signed-off-by: Jakub Kicinski 
Signed-off-by: Mina Almasry 

---

This is implemented by Jakub in his RFC:

https://lore.kernel.org/netdev/f8270765-a27b-6ccf-33ea-cda097168...@redhat.com/T/

I take no credit for the idea or implementation. This is a critical
dependency of device memory TCP and thus I'm pulling it into this series
to make it reviewable and mergeable.

---
 net/core/page_pool.c | 25 -
 1 file changed, 16 insertions(+), 9 deletions(-)

diff --git a/net/core/page_pool.c b/net/core/page_pool.c
index c2e7c9a6efbe..ca1b3b65c9b5 100644
--- a/net/core/page_pool.c
+++ b/net/core/page_pool.c
@@ -548,21 +548,16 @@ s32 page_pool_inflight(const struct page_pool *pool, bool strict)
return inflight;
 }
 
-/* Disconnects a page (from a page_pool).  API users can have a need
- * to disconnect a page (from a page_pool), to allow it to be used as
- * a regular page (that will eventually be returned to the normal
- * page-allocator via put_page).
- */
-static void page_pool_return_page(struct page_pool *pool, struct page *page)
+static __always_inline
+void __page_pool_release_page_dma(struct page_pool *pool, struct page *page)
 {
dma_addr_t dma;
-   int count;
 
if (!(pool->p.flags & PP_FLAG_DMA_MAP))
/* Always account for inflight pages, even if we didn't
 * map them
 */
-   goto skip_dma_unmap;
+   return;
 
dma = page_pool_get_dma_addr(page);
 
@@ -571,7 +566,19 @@ static void page_pool_return_page(struct page_pool *pool, struct page *page)
 PAGE_SIZE << pool->p.order, pool->p.dma_dir,
 DMA_ATTR_SKIP_CPU_SYNC | DMA_ATTR_WEAK_ORDERING);
page_pool_set_dma_addr(page, 0);
-skip_dma_unmap:
+}
+
+/* Disconnects a page (from a page_pool).  API users can have a need
+ * to disconnect a page (from a page_pool), to allow it to be used as
+ * a regular page (that will eventually be returned to the normal
+ * page-allocator via put_page).
+ */
+void page_pool_return_page(struct page_pool *pool, struct page *page)
+{
+   int count;
+
+   __page_pool_release_page_dma(pool, page);
+
page_pool_clear_pp_info(page);
 
/* This may be the last page returned, releasing the pool, so
-- 
2.43.0.472.g3155946c3a-goog

[net-next v1 00/16] Device Memory TCP

2023-12-07 Thread Mina Almasry

Major changes in v1:
--

1. Implemented MVP queue API ndos to remove the userspace-visible
   driver reset.

2. Fixed issues in the napi_pp_put_page() devmem frag unref path.

3. Removed RFC tag.

Many smaller addressed comments across all the patches (patches have
individual change log).

Full tree including the rest of the GVE driver changes:
https://github.com/mina/linux/commits/tcpdevmem-v1

Cc: Yunsheng Lin 
Cc: Shailend Chand 
Cc: Harshitha Ramamurthy 

Changes in RFC v3:
--

1. Pulled in the memory-provider dependency from Jakub's RFC[1] to make the
   series reviewable and mergeable.

2. Implemented multi-rx-queue binding which was a todo in v2.

3. Fix to cmsg handling.

The sticking point in RFC v2[2] was the device reset required to refill
the device rx-queues after the dmabuf bind/unbind. The solution
suggested as I understand is a subset of the per-queue management ops
Jakub suggested or similar:

https://lore.kernel.org/netdev/20230815171638.4c057...@kernel.org/

This is not addressed in this revision, because:

1. This point was discussed at netconf & netdev and there is openness to
   using the current approach of requiring a device reset.

2. Implementing individual queue resetting seems to be difficult for my
   test bed with GVE. My prototype to test this ran into issues with the
   rx-queues not coming back up properly if reset individually. At the
   moment I'm unsure if it's a mistake in the POC or a genuine issue in
   the virtualization stack behind GVE, which currently doesn't test
   individual rx-queue restart.

3. Our usecases are not bothered by requiring a device reset to refill
   the buffer queues, and we'd like to support NICs that run into this
   limitation with resetting individual queues.

My thought is that drivers that have trouble with per-queue configs can
use the support in this series, while drivers that support new netdev
ops to reset individual queues can automatically reset the queue as
part of the dma-buf bind/unbind.

The same approach with device resets is presented again for consideration
with other sticking points addressed.

This proposal includes the rx devmem path only proposed for merge. For a
snapshot of my entire tree which includes the GVE POC page pool support &
device memory support:

https://github.com/torvalds/linux/compare/master...mina:linux:tcpdevmem-v3

[1] 
https://lore.kernel.org/netdev/f8270765-a27b-6ccf-33ea-cda097168...@redhat.com/T/
[2] 
https://lore.kernel.org/netdev/cahs8izovjgjh5wf68osrwfkjid1_huzzuk+hpkblcl4psod...@mail.gmail.com/T/

Cc: Shakeel Butt 
Cc: Jeroen de Borst 
Cc: Praveen Kaligineedi 

Changes in RFC v2:
--

The sticking point in RFC v1[1] was the dma-buf pages approach we used to
deliver the device memory to the TCP stack. RFC v2 is a proof-of-concept
that attempts to resolve this by implementing scatterlist support in the
networking stack, such that we can import the dma-buf scatterlist
directly. This is the approach proposed at a high level here[2].

Detailed changes:
1. Replaced dma-buf pages approach with importing scatterlist into the
   page pool.
2. Replace the dma-buf pages centric API with a netlink API.
3. Removed the TX path implementation - there is no issue with
   implementing the TX path with scatterlist approach, but leaving
   out the TX path makes it easier to review.
4. Functionality is tested with this proposal, but I have not conducted
   perf testing yet. I'm not sure there are regressions, but I removed
   perf claims from the cover letter until they can be re-confirmed.
5. Added Signed-off-by: contributors to the implementation.
6. Fixed some bugs with the RX path since RFC v1.

Any feedback welcome, but specifically the biggest pending questions
needing feedback IMO are:

1. Feedback on the scatterlist-based approach in general.
2. Netlink API (Patch 1 & 2).
3. Approach to handle all the drivers that expect to receive pages from
   the page pool (Patch 6).

[1] 
https://lore.kernel.org/netdev/dfe4bae7-13a0-3c5d-d671-f61b375cb...@gmail.com/T/
[2] 
https://lore.kernel.org/netdev/CAHS8izPm6XRS54LdCDZVd0C75tA1zHSu6jLVO8nzTLXCc=h...@mail.gmail.com/

--

* TL;DR:

Device memory TCP (devmem TCP) is a proposal for transferring data to and/or
from device memory efficiently, without bouncing the data to a host memory
buffer.

* Problem:

A large number of data transfers have device memory as the source and/or
destination. Accelerators have drastically increased the volume of such transfers.
Some examples include:
- ML accelerators transferring large amounts of training data from storage into
  GPU/TPU memory. In some cases ML training setup time can be as long as 50% of
  TPU compute time, so improving data transfer throughput & efficiency can help
  improve GPU/TPU utilization.

- Distributed training, where ML accelerators, such as GPUs on different hosts,
  exchange data among them.

- Distributed raw block storage applications transfer large

Re:Re: [v3 5/6] drm/vs: Add hdmi driver

2023-12-07 Thread Andy Yan

Hi Keith:

At 2023-12-07 18:48:13, "Keith Zhao"  wrote:
>
>
>On 2023/12/7 17:02, Andy Yan wrote:
>>
>> Hi Keith:
>>
>> At 2023-12-06 22:11:33, "Keith Zhao"  wrote:
>>>
>>>
>>>On 2023/12/6 20:56, Maxime Ripard wrote:
 On Wed, Dec 06, 2023 at 08:02:55PM +0800, Keith Zhao wrote:
> >> +static const struct of_device_id starfive_hdmi_dt_ids[] = {
> >> +  { .compatible = "starfive,jh7110-inno-hdmi",},
> > 
> > So it's inno hdmi, just like Rockchip then?
> > 
> > This should be a common driver.
>
> Rockchip has an inno hdmi IP, and Starfive has an inno hdmi IP,
> but the hardware difference between them is big, so it is not easy to use a
> common driver.
> Maybe I need the inno hdmi version here to make a distinction.
 
 I just had a look at the rockchip header file: all the registers but the
 STARFIVE_* ones are identical.
 
 There's no need to have two identical drivers then, please use the
 rockchip driver instead.
 
 Maxime
>>>
>>>ok, have a simple test , edid can get . i will continue 
>> 
>> Maybe you can take drivers/gpu/drm/bridge/synopsys/dw-hdmi as a reference;
>> this is also an hdmi ip used by rockchip/meson/sunxi/jz/imx.
>> We finally made them share one driver.
>>>
>hi Andy:
>
>dw_hdmi seems a good choice; it can handle inno hdmi hardware by defining its
>dw_hdmi_plat_data.
>Does it mean I can write my own driver file, such as dw_hdmi-starfive.c, based
>on dw_hdmi instead of adding plat_data in inno_hdmi.c?
>

I think the process may be like this:

1. Split the inno_hdmi.c under rockchip into inno_hdmi.c (the common part) and
   inno_hdmi-rockchip.c (the SoC-specific part).
2. Move the common part, inno_hdmi.c, to drivers/gpu/drm/bridge/innosilicon/.
3. Add the StarFive-specific part, inno_hdmi-starfive.c.

The git log below, from the kernel tree, shows how we converted dw_hdmi to a
common driver:



12b9f204e804 drm: bridge/dw_hdmi: add rockchip rk3288 support
74af9e4d03b8 dt-bindings: Add documentation for rockchip dw hdmi
d346c14eeea9 drm: bridge/dw_hdmi: add function dw_hdmi_phy_enable_spare
a4d3b8b050d5 drm: bridge/dw_hdmi: clear i2cmphy_stat0 reg in hdmi_phy_wait_i2c_done
632d035bace2 drm: bridge/dw_hdmi: add mode_valid support
0cd9d1428322 drm: bridge/dw_hdmi: add support for multi-byte register width access
cd152393967e dt-bindings: add document for dw_hdmi
b21f4b658df8 drm: imx: imx-hdmi: move imx-hdmi to bridge/dw_hdmi
aaa757a092c2 drm: imx: imx-hdmi: split phy configuration to platform driver
3d1b35a3d9f3 drm: imx: imx-hdmi: convert imx-hdmi to drm_bridge mode
c2c3848851a7 drm: imx: imx-hdmi: return defer if can't get ddc i2c adapter
b587833933de drm: imx: imx-hdmi: make checkpatch happy


>Thanks for pointing this out!!!
>
>>>
>>>___
>>>linux-riscv mailing list
>>>linux-ri...@lists.infradead.org
>>>http://lists.infradead.org/mailman/listinfo/linux-riscv
>
>___
>linux-riscv mailing list
>linux-ri...@lists.infradead.org
>http://lists.infradead.org/mailman/listinfo/linux-riscv

[PATCH -next] drm/imagination: Remove unneeded semicolon

2023-12-07 Thread Yang Li

./drivers/gpu/drm/imagination/pvr_fw_trace.c:251:2-3: Unneeded semicolon

Reported-by: Abaci Robot 
Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=7694
Signed-off-by: Yang Li 
---
 drivers/gpu/drm/imagination/pvr_fw_trace.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/imagination/pvr_fw_trace.c b/drivers/gpu/drm/imagination/pvr_fw_trace.c
index 7159fc479001..31199e45b72e 100644
--- a/drivers/gpu/drm/imagination/pvr_fw_trace.c
+++ b/drivers/gpu/drm/imagination/pvr_fw_trace.c
@@ -248,7 +248,7 @@ static bool fw_trace_get_next(struct pvr_fw_trace_seq_data *trace_seq_data)
continue;
 
return true;
-   };
+   }
 
/* Hit end of trace data. */
return false;
-- 
2.20.1.7.g153144c

Re: [PATCH v4 1/2] drm/mediatek: fix kernel oops if no crtc is found

2023-12-07 Thread Chun-Kuang Hu

Hi, Michael:

Michael Walle  wrote on Tue, Sep 5, 2023 at 4:49 PM:
>
> drm_crtc_from_index(0) might return NULL if there are no CRTCs
> registered at all which will lead to a kernel oops in
> mtk_drm_crtc_dma_dev_get(). Add the missing return value check.

Applied to mediatek-drm-fixes [1], thanks.

[1] 
https://git.kernel.org/pub/scm/linux/kernel/git/chunkuang.hu/linux.git/log/?h=mediatek-drm-fixes

Regards,
Chun-Kuang.

>
> Fixes: 0d9eee9118b7 ("drm/mediatek: Add drm ovl_adaptor sub driver for MT8195")
> Signed-off-by: Michael Walle 
> Reviewed-by: Nícolas F. R. A. Prado 
> Tested-by: Nícolas F. R. A. Prado 
> Reviewed-by: AngeloGioacchino Del Regno 
> 
> ---
> v4:
>  - collected tags
> v3:
>  - none
> v2:
>  - collected tags
>  - fixed typos
> ---
>  drivers/gpu/drm/mediatek/mtk_drm_drv.c | 5 -
>  1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/mediatek/mtk_drm_drv.c b/drivers/gpu/drm/mediatek/mtk_drm_drv.c
> index 93552d76b6e7..2c582498817e 100644
> --- a/drivers/gpu/drm/mediatek/mtk_drm_drv.c
> +++ b/drivers/gpu/drm/mediatek/mtk_drm_drv.c
> @@ -420,6 +420,7 @@ static int mtk_drm_kms_init(struct drm_device *drm)
> struct mtk_drm_private *private = drm->dev_private;
> struct mtk_drm_private *priv_n;
> struct device *dma_dev = NULL;
> +   struct drm_crtc *crtc;
> int ret, i, j;
>
> if (drm_firmware_drivers_only())
> @@ -494,7 +495,9 @@ static int mtk_drm_kms_init(struct drm_device *drm)
> }
>
> /* Use OVL device for all DMA memory allocations */
> -   dma_dev = mtk_drm_crtc_dma_dev_get(drm_crtc_from_index(drm, 0));
> +   crtc = drm_crtc_from_index(drm, 0);
> +   if (crtc)
> +   dma_dev = mtk_drm_crtc_dma_dev_get(crtc);
> if (!dma_dev) {
> ret = -ENODEV;
> dev_err(drm->dev, "Need at least one OVL device\n");
> --
> 2.39.2
>

Re: [RFC PATCH] of/platform: Disable sysfb if a simple-framebuffer node is found

2023-12-07 Thread Javier Martinez Canillas

Rob Herring  writes:

> On Mon, Dec 04, 2023 at 05:05:30PM +0100, Javier Martinez Canillas wrote:
>> Rob Herring  writes:
>> 
>> > On Mon, Dec 4, 2023 at 3:39 AM Javier Martinez Canillas
>> >  wrote:
>> >> Rob Herring  writes:
>> >> > On Fri, Dec 1, 2023 at 4:21 AM Javier Martinez Canillas
>> 
>> [...]
>> 
>> >>
>> >> > However, there might be one other issue with that and this fix. The DT
>> >> > simplefb can have resources such as clocks and regulators. With
>> >> > fw_devlink, the driver won't probe until those dependencies are met.
>> >> > So if you want the framebuffer console up early, then you may want to
>> >> > register the EFI framebuffer first and then handoff to the DT simplefb
>> >> > when it probes (rather than registering the device).
>> >> >
>> >> > But I agree, probably better to take this patch now and have those
>> >> > quirks instead of flat out not working.
>> >> >
>> >>
>> >> If we do that what's the plan? Are you thinking about merging this patch
>> >> through your OF tree or do you want to go through drm-misc with your ack?
>> >
>> > I can take it. Do we need this in 6.7 and stable?
>> >
>> 
>> IMO this can wait for v6.8 since is not a fix for a change introduced in
>> the v6.7 merge window and something that only happens on a very specific
>> setup (DT systems booting with u-boot EFI and providing an EFI-GOP table).
>> 
>> Also the -rc cycle is already in -rc5, so it seems risky to push a change
>> at this point. And distros can pick the patch if want to have it earlier.
>
> Okay, I've applied it for 6.8.
>

Great, thanks a lot.

> Rob
>

-- 
Best regards,

Javier Martinez Canillas
Core Platforms
Red Hat

Re: [PATCH v3] drm/mediatek: Stop using iommu_present()

2023-12-07 Thread Chun-Kuang Hu

Hi, Robin:

Robin Murphy  wrote on Thu, Nov 23, 2023 at 9:41 PM:
>
> Remove the pointless check. If an IOMMU is providing transparent DMA API
> ops for any device(s) we care about, the DT code will have enforced the
> appropriate probe ordering already. And if the IOMMU *is* entirely
> absent, then attempting to go ahead with CMA and either succeeding or
> failing decisively seems more useful than deferring forever.

Applied to mediatek-drm-next [1], thanks.

[1] 
https://git.kernel.org/pub/scm/linux/kernel/git/chunkuang.hu/linux.git/log/?h=mediatek-drm-next

Regards,
Chun-Kuang.

>
> Signed-off-by: Robin Murphy 
> ---
>
> I realised that last time I sent this I probably should have CCed a
> wider audience of reviewers, so here's one with an updated commit
> message as well to make the resend more worthwhile.
>
>  drivers/gpu/drm/mediatek/mtk_drm_drv.c | 4 
>  1 file changed, 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/mediatek/mtk_drm_drv.c b/drivers/gpu/drm/mediatek/mtk_drm_drv.c
> index 2dfaa613276a..48581da51857 100644
> --- a/drivers/gpu/drm/mediatek/mtk_drm_drv.c
> +++ b/drivers/gpu/drm/mediatek/mtk_drm_drv.c
> @@ -5,7 +5,6 @@
>   */
>
>  #include 
> -#include 
>  #include 
>  #include 
>  #include 
> @@ -608,9 +607,6 @@ static int mtk_drm_bind(struct device *dev)
> struct drm_device *drm;
> int ret, i;
>
> -   if (!iommu_present(&platform_bus_type))
> -   return -EPROBE_DEFER;
> -
> pdev = of_find_device_by_node(private->mutex_node);
> if (!pdev) {
> dev_err(dev, "Waiting for disp-mutex device %pOF\n",
> --
> 2.39.2.101.g768bb238c484.dirty
>

Re: [PATCH 5/5] drm/todo: Add entry to rename drm_atomic_state

2023-12-07 Thread Daniel Vetter

On Mon, Dec 04, 2023 at 01:17:07PM +0100, Maxime Ripard wrote:
> The name of the structure drm_atomic_state is confusing. Let's add an
> entry to our todo list to rename it.
> 
> Signed-off-by: Maxime Ripard 
> ---
>  Documentation/gpu/todo.rst | 18 ++
>  1 file changed, 18 insertions(+)
> 
> diff --git a/Documentation/gpu/todo.rst b/Documentation/gpu/todo.rst
> index b62c7fa0c2bc..fe95aea89d67 100644
> --- a/Documentation/gpu/todo.rst
> +++ b/Documentation/gpu/todo.rst
> @@ -120,6 +120,24 @@ Contact: Daniel Vetter, respective driver maintainers
>  
>  Level: Advanced
>  
> +Rename drm_atomic_state
> +---
> +
> +The KMS framework uses two slightly different definitions for the ``state``
> +concept. For a given object (plane, CRTC, encoder, etc., so
> +``drm_$OBJECT_state``), the state is the entire state of that object. However,
> +at the device level, ``drm_atomic_state`` refers to a state update for a
> +limited number of objects.

That's a very generous description of my screw-up of calling a commit a
state and making a big mess out of a lot of concepts :-)

> +
> +The state isn't the entire device state anymore, but only the full state of

s/anymore// since it was always meant to be an incremental/partial
update/commit structure.

> +some objects in that device. This is confusing to newcomers, and
> +``drm_atomic_state`` should be renamed to something clearer like
> +``drm_atomic_update``.

My choice would be drm_atomic_commit, because that's what we're calling
these everywhere else in the code. See drm_crtc_commit for the per-crtc
tracking thing of a drm_atomic_commit. If you want update, there's quite a
lot of other things we also need to rename to the _update suffix.

Also this should have some pointers to the functions that need adjusting,
like drm_atomic_state_alloc|get/put/init/ since without also renaming
those this is just going to create even more confusion.

With my comments addressed:

Reviewed-by: Daniel Vetter 

> +
> +Contact: Maxime Ripard 
> +
> +Level: Advanced
> +
>  Fallout from atomic KMS
>  ---
>  
> -- 
> 2.43.0
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

Re: [PATCH 4/5] drm/atomic: Make the drm_atomic_state documentation less ambiguous

2023-12-07 Thread Daniel Vetter

On Mon, Dec 04, 2023 at 01:17:06PM +0100, Maxime Ripard wrote:
> The current documentation of drm_atomic_state says that it's the "global
> state object". This is confusing since, while it does contain all the
> objects affected by an update and their respective states, if an object
> isn't affected by this update it won't be part of it.
> 
> Thus, it's not truly a "global state", unlike object state structures
> that do contain the entire state of a given object.
> 
> Signed-off-by: Maxime Ripard 

So this is probably the biggest naming fumble I've committed, because this
is the drm_atomic_commit structure: It's not just a state structure, but
it represents the transition from a set of old to new states. Which is
also why we have both old and new state pointers in it.

> ---
>  include/drm/drm_atomic.h | 10 +-
>  1 file changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/include/drm/drm_atomic.h b/include/drm/drm_atomic.h
> index 914574b58ae7..81ad7369b90d 100644
> --- a/include/drm/drm_atomic.h
> +++ b/include/drm/drm_atomic.h
> @@ -346,11 +346,19 @@ struct __drm_private_objs_state {
>  };
>  
>  /**
> - * struct drm_atomic_state - the global state object for atomic updates
> + * struct drm_atomic_state - Atomic Update structure

I think we're using "commit" more often than "update"

> + *
> + * This structure is the kernel counterpart of @drm_mode_atomic and contains
> + * all the objects affected by an atomic modeset update and their states.

My suggestion:

This structure is the kernel counterpart of @drm_mode_atomic and
represents an atomic commit that transitions from an old to a new display
state. It contains all the objects affected by an atomic commit and both
the new state structures and pointers to the old state structures for
these.

>   *
>   * States are added to an atomic update by calling drm_atomic_get_crtc_state(),
>   * drm_atomic_get_plane_state(), drm_atomic_get_connector_state(), or for
>   * private state structures, drm_atomic_get_private_obj_state().
> + *
> + * NOTE: While this structure looks to be global and affecting the whole DRM
> + * device, it only contains the objects affected by the atomic commit.
> + * Unaffected objects will not be part of that update, unless they have been
> + * explicitly added by either the framework or the driver.

Since you remove the global in the header summary I wouldn't reintroduce
it here. Seems to just add to the confusion again instead of clarifying.

If you want maybe clarify that an atomic commit does not need to contain
all the objects of a _device, or something like that.

With the comments suitably addressed:

Reviewed-by: Daniel Vetter 

>   */
>  struct drm_atomic_state {
>   /**
> -- 
> 2.43.0
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
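
To make the "commit, not global state" point concrete, here is a small user-space model of a commit that tracks only the objects it touches, with both old and new state pointers per entry. This is a sketch, not the DRM code: atomic_commit, commit_plane_entry, commit_get_plane_state() and commit_num_planes() are hypothetical names loosely patterned on drm_atomic_get_plane_state(), and plane state is reduced to two integers.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PLANES 4

struct plane_state { int x, y; };	/* full state of one object */

struct plane {
	int index;
	struct plane_state *state;	/* currently committed state */
};

/* One slot per plane; ptr stays NULL for planes the commit never touched. */
struct commit_plane_entry {
	struct plane *ptr;
	struct plane_state *old_state;
	struct plane_state *new_state;
};

/* Hypothetical commit object: a transition for a subset of objects. */
struct atomic_commit {
	struct commit_plane_entry planes[MAX_PLANES];
	struct plane_state new_storage[MAX_PLANES];	/* backing for new states */
};

/*
 * Add a plane to the commit on first use by duplicating its current state,
 * and return the new state for the caller to modify.
 */
struct plane_state *commit_get_plane_state(struct atomic_commit *c,
					   struct plane *p)
{
	struct commit_plane_entry *e = &c->planes[p->index];

	if (!e->ptr) {
		c->new_storage[p->index] = *p->state;
		e->ptr = p;
		e->old_state = p->state;
		e->new_state = &c->new_storage[p->index];
	}
	return e->new_state;
}

/* Count the planes that are actually part of this commit. */
int commit_num_planes(const struct atomic_commit *c)
{
	int n = 0;

	for (int i = 0; i < MAX_PLANES; i++)
		if (c->planes[i].ptr)
			n++;
	return n;
}
```

A device with several planes can therefore have a commit that contains exactly one of them, which is why describing the structure as the "global state" is misleading.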

Re: [PATCH 3/5] drm/atomic: Rework the object doc a bit

2023-12-07 Thread Daniel Vetter

On Mon, Dec 04, 2023 at 01:17:05PM +0100, Maxime Ripard wrote:
> The doc for the planes, crtcs, connectors and private_objs fields
> mention that they are pointers to an array of structures with
> per-$OBJECT data.
> 
> While these fields are indeed pointers to an array, each item of that
> array contain a pointer to the object structure affected by the update,
> and its old and new state. There's no per-object data there.
> 
> Let's rephrase those fields a bit to better match the current situation.

Yeah that wasn't updated as part of 5d943aa6c0d4 ("drm: Consolidate crtc
arrays in drm_atomic_state") and b8b5342b699b ("drm: Consolidate plane
arrays in drm_atomic_state"). With that added to the commit message:

Reviewed-by: Daniel Vetter 

> 
> Signed-off-by: Maxime Ripard 
> ---
>  include/drm/drm_atomic.h | 20 
>  1 file changed, 16 insertions(+), 4 deletions(-)
> 
> diff --git a/include/drm/drm_atomic.h b/include/drm/drm_atomic.h
> index 13cecdc1257d..914574b58ae7 100644
> --- a/include/drm/drm_atomic.h
> +++ b/include/drm/drm_atomic.h
> @@ -403,12 +403,18 @@ struct drm_atomic_state {
>   bool duplicated : 1;
>  
>   /**
> -  * @planes: pointer to array of structures with per-plane data
> +  * @planes:
> +  *
> +  * Pointer to array of @drm_plane and @drm_plane_state part of this
> +  * update.
>*/
>   struct __drm_planes_state *planes;
>  
>   /**
> -  * @crtcs: pointer to array of CRTC pointers
> +  * @crtcs:
> +  *
> +  * Pointer to array of @drm_crtc and @drm_crtc_state part of this
> +  * update.
>*/
>   struct __drm_crtcs_state *crtcs;
>  
> @@ -418,7 +424,10 @@ struct drm_atomic_state {
>   int num_connector;
>  
>   /**
> -  * @connectors: pointer to array of structures with per-connector data
> +  * @connectors:
> +  *
> +  * Pointer to array of @drm_connector and @drm_connector_state part of
> +  * this update.
>*/
>   struct __drm_connnectors_state *connectors;
>  
> @@ -428,7 +437,10 @@ struct drm_atomic_state {
>   int num_private_objs;
>  
>   /**
> -  * @private_objs: pointer to array of private object pointers
> +  * @private_objs:
> +  *
> +  * Pointer to array of @drm_private_obj and @drm_private_obj_state part
> +  * of this update.
>*/
>   struct __drm_private_objs_state *private_objs;
>  
> -- 
> 2.43.0
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

Re: [PATCH 2/5] drm/atomic: Remove inexistent reference

2023-12-07 Thread Daniel Vetter

On Mon, Dec 04, 2023 at 01:17:04PM +0100, Maxime Ripard wrote:
> The num_connectors field documentation mentions a connector_states field
> that has never been part of this structure.

Not entirely correct, this is an oversight from 63e83c1dba54 ("drm:
Consolidate connector arrays in drm_atomic_state"). With the commit
message suitably updated.

Reviewed-by: Daniel Vetter 

> 
> Signed-off-by: Maxime Ripard 
> ---
>  include/drm/drm_atomic.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/include/drm/drm_atomic.h b/include/drm/drm_atomic.h
> index 2a08030fcd75..13cecdc1257d 100644
> --- a/include/drm/drm_atomic.h
> +++ b/include/drm/drm_atomic.h
> @@ -413,7 +413,7 @@ struct drm_atomic_state {
>   struct __drm_crtcs_state *crtcs;
>  
>   /**
> -  * @num_connector: size of the @connectors and @connector_states arrays
> +  * @num_connector: size of the @connectors array
>*/
>   int num_connector;
>  
> -- 
> 2.43.0
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

Re: [PATCH 1/5] drm/atomic: Move the drm_atomic_state field doc inline

2023-12-07 Thread Daniel Vetter

On Mon, Dec 04, 2023 at 01:17:03PM +0100, Maxime Ripard wrote:
> Some fields of drm_atomic_state have been documented in-line, but some
> were documented in the main kerneldoc block before the structure.
> 
> Since the former is the preferred option in DRM, let's move all the
> fields to an inline documentation.
> 
> Signed-off-by: Maxime Ripard 

Acked-by: Daniel Vetter 

> ---
>  include/drm/drm_atomic.h | 50 
>  1 file changed, 40 insertions(+), 10 deletions(-)
> 
> diff --git a/include/drm/drm_atomic.h b/include/drm/drm_atomic.h
> index cf8e1220a4ac..2a08030fcd75 100644
> --- a/include/drm/drm_atomic.h
> +++ b/include/drm/drm_atomic.h
> @@ -347,24 +347,22 @@ struct __drm_private_objs_state {
>  
>  /**
>   * struct drm_atomic_state - the global state object for atomic updates
> - * @ref: count of all references to this state (will not be freed until zero)
> - * @dev: parent DRM device
> - * @async_update: hint for asynchronous plane update
> - * @planes: pointer to array of structures with per-plane data
> - * @crtcs: pointer to array of CRTC pointers
> - * @num_connector: size of the @connectors and @connector_states arrays
> - * @connectors: pointer to array of structures with per-connector data
> - * @num_private_objs: size of the @private_objs array
> - * @private_objs: pointer to array of private object pointers
> - * @acquire_ctx: acquire context for this atomic modeset state update
>   *
>   * States are added to an atomic update by calling drm_atomic_get_crtc_state(),
>   * drm_atomic_get_plane_state(), drm_atomic_get_connector_state(), or for
>   * private state structures, drm_atomic_get_private_obj_state().
>   */
>  struct drm_atomic_state {
> + /**
> +  * @ref:
> +  *
> +  * Count of all references to this update (will not be freed until zero).
> +  */
>   struct kref ref;
>  
> + /**
> +  * @dev: Parent DRM Device.
> +  */
>   struct drm_device *dev;
>  
>   /**
> @@ -388,7 +386,12 @@ struct drm_atomic_state {
>* flag are not allowed.
>*/
>   bool legacy_cursor_update : 1;
> +
> + /**
> +  * @async_update: hint for asynchronous plane update
> +  */
>   bool async_update : 1;
> +
>   /**
>* @duplicated:
>*
> @@ -398,13 +401,40 @@ struct drm_atomic_state {
>* states.
>*/
>   bool duplicated : 1;
> +
> + /**
> +  * @planes: pointer to array of structures with per-plane data
> +  */
>   struct __drm_planes_state *planes;
> +
> + /**
> +  * @crtcs: pointer to array of CRTC pointers
> +  */
>   struct __drm_crtcs_state *crtcs;
> +
> + /**
> +  * @num_connector: size of the @connectors and @connector_states arrays
> +  */
>   int num_connector;
> +
> + /**
> +  * @connectors: pointer to array of structures with per-connector data
> +  */
>   struct __drm_connnectors_state *connectors;
> +
> + /**
> +  * @num_private_objs: size of the @private_objs array
> +  */
>   int num_private_objs;
> +
> + /**
> +  * @private_objs: pointer to array of private object pointers
> +  */
>   struct __drm_private_objs_state *private_objs;
>  
> + /**
> +  * @acquire_ctx: acquire context for this atomic modeset state update
> +  */
>   struct drm_modeset_acquire_ctx *acquire_ctx;
>  
>   /**
> -- 
> 2.43.0
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

Re: [PATCH][next] drm/amd/display: Fix spelling mistake "SMC_MSG_AllowZstatesEntr" -> "SMC_MSG_AllowZstatesEntry"

2023-12-07 Thread Alex Deucher

Applied.  Thanks!

Alex

On Thu, Dec 7, 2023 at 6:32 AM Colin Ian King  wrote:
>
> There is a spelling mistake in a smu_print message. Fix it.
>
> Signed-off-by: Colin Ian King 
> ---
>  drivers/gpu/drm/amd/display/dc/clk_mgr/dcn35/dcn35_smu.c | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn35/dcn35_smu.c b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn35/dcn35_smu.c
> index d6db9d7fced2..6d4a1ffab5ed 100644
> --- a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn35/dcn35_smu.c
> +++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn35/dcn35_smu.c
> @@ -361,26 +361,26 @@ void dcn35_smu_set_zstate_support(struct clk_mgr_internal *clk_mgr, enum dcn_zst
> case DCN_ZSTATE_SUPPORT_ALLOW:
> msg_id = VBIOSSMC_MSG_AllowZstatesEntry;
> param = (1 << 10) | (1 << 9) | (1 << 8);
> -   smu_print("%s: SMC_MSG_AllowZstatesEntr msg = ALLOW, param = %d\n", __func__, param);
> +   smu_print("%s: SMC_MSG_AllowZstatesEntry msg = ALLOW, param = %d\n", __func__, param);
> break;
>
> case DCN_ZSTATE_SUPPORT_DISALLOW:
> msg_id = VBIOSSMC_MSG_AllowZstatesEntry;
> param = 0;
> -   smu_print("%s: SMC_MSG_AllowZstatesEntr msg_id = DISALLOW, param = %d\n",  __func__, param);
> +   smu_print("%s: SMC_MSG_AllowZstatesEntry msg_id = DISALLOW, param = %d\n",  __func__, param);
> break;
>
>
> case DCN_ZSTATE_SUPPORT_ALLOW_Z10_ONLY:
> msg_id = VBIOSSMC_MSG_AllowZstatesEntry;
> param = (1 << 10);
> -   smu_print("%s: SMC_MSG_AllowZstatesEntr msg = ALLOW_Z10_ONLY, param = %d\n", __func__, param);
> +   smu_print("%s: SMC_MSG_AllowZstatesEntry msg = ALLOW_Z10_ONLY, param = %d\n", __func__, param);
> break;
>
> case DCN_ZSTATE_SUPPORT_ALLOW_Z8_Z10_ONLY:
> msg_id = VBIOSSMC_MSG_AllowZstatesEntry;
> param = (1 << 10) | (1 << 8);
> -   smu_print("%s: SMC_MSG_AllowZstatesEntr msg = ALLOW_Z8_Z10_ONLY, param = %d\n", __func__, param);
> +   smu_print("%s: SMC_MSG_AllowZstatesEntry msg = ALLOW_Z8_Z10_ONLY, param = %d\n", __func__, param);
> break;
>
> case DCN_ZSTATE_SUPPORT_ALLOW_Z8_ONLY:
> --
> 2.39.2
>
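
For readers following the hunk above: the param values it prints form a
small bitmask. A minimal sketch of that mapping, assuming bit 8 stands
for Z8 and bit 10 for Z10 as the state names suggest. The name given to
bit 9 is a guess, and the enum and function below are hypothetical
stand-ins rather than the driver's own definitions:

```c
#include <stdint.h>

/* Hypothetical stand-in for the driver's zstate support enum. */
enum zstate_support {
	ZSTATE_DISALLOW,
	ZSTATE_ALLOW,
	ZSTATE_ALLOW_Z10_ONLY,
	ZSTATE_ALLOW_Z8_Z10_ONLY,
};

#define ZSTATE_BIT_Z8  (1u << 8)
#define ZSTATE_BIT_Z9  (1u << 9)  /* assumption: the patch does not name bit 9 */
#define ZSTATE_BIT_Z10 (1u << 10)

/* Return the message parameter for a requested zstate policy. */
uint32_t zstate_param(enum zstate_support state)
{
	switch (state) {
	case ZSTATE_ALLOW:
		return ZSTATE_BIT_Z10 | ZSTATE_BIT_Z9 | ZSTATE_BIT_Z8;
	case ZSTATE_ALLOW_Z10_ONLY:
		return ZSTATE_BIT_Z10;
	case ZSTATE_ALLOW_Z8_Z10_ONLY:
		return ZSTATE_BIT_Z10 | ZSTATE_BIT_Z8;
	case ZSTATE_DISALLOW:
	default:
		return 0;
	}
}
```

The constants mirror the shifts in the quoted code; nothing else about
the message format is implied.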

Re: [PATCH v2 2/6] drm/msm/dsi: set video mode widebus enable bit when widebus is enabled

2023-12-07 Thread Jessica Zhang





On 11/14/2023 2:58 PM, Jonathan Marek wrote:
> The value returned by msm_dsi_wide_bus_enabled() doesn't match what the
> driver is doing in video mode. Fix that by actually enabling widebus for
> video mode.
>
> Fixes: efcbd6f9cdeb ("drm/msm/dsi: Enable widebus for DSI")
> Signed-off-by: Jonathan Marek 
> ---
>   drivers/gpu/drm/msm/dsi/dsi.xml.h  | 1 +
>   drivers/gpu/drm/msm/dsi/dsi_host.c | 2 ++
>   2 files changed, 3 insertions(+)
>
> diff --git a/drivers/gpu/drm/msm/dsi/dsi.xml.h b/drivers/gpu/drm/msm/dsi/dsi.xml.h
> index 2a7d980e12c3..f0b3cdc020a1 100644
> --- a/drivers/gpu/drm/msm/dsi/dsi.xml.h
> +++ b/drivers/gpu/drm/msm/dsi/dsi.xml.h
> @@ -231,6 +231,7 @@ static inline uint32_t DSI_VID_CFG0_TRAFFIC_MODE(enum dsi_traffic_mode val)
>   #define DSI_VID_CFG0_HSA_POWER_STOP   0x0001
>   #define DSI_VID_CFG0_HBP_POWER_STOP   0x0010
>   #define DSI_VID_CFG0_HFP_POWER_STOP   0x0100
> +#define DSI_VID_CFG0_DATABUS_WIDEN 0x0200
>   #define DSI_VID_CFG0_PULSE_MODE_HSA_HE   0x1000
>
>   #define REG_DSI_VID_CFG1	0x001c
>
> diff --git a/drivers/gpu/drm/msm/dsi/dsi_host.c b/drivers/gpu/drm/msm/dsi/dsi_host.c
> index deeecdfd6c4e..f2c1cbd08d4d 100644
> --- a/drivers/gpu/drm/msm/dsi/dsi_host.c
> +++ b/drivers/gpu/drm/msm/dsi/dsi_host.c
> @@ -745,6 +745,8 @@ static void dsi_ctrl_enable(struct msm_dsi_host *msm_host,
> data |= DSI_VID_CFG0_TRAFFIC_MODE(dsi_get_traffic_mode(flags));
> data |= DSI_VID_CFG0_DST_FORMAT(dsi_get_vid_fmt(mipi_fmt));
> data |= DSI_VID_CFG0_VIRT_CHANNEL(msm_host->channel);
> +   if (msm_dsi_host_is_wide_bus_enabled(&msm_host->base))
> +   data |= DSI_VID_CFG0_DATABUS_WIDEN;

Hi Jonathan,

Now that widebus is enabled for video mode, I think you can also drop
the TODO here [1]. Other than that, this LGTM.

Reviewed-by: Jessica Zhang 

Thanks,

Jessica Zhang

[1] https://elixir.bootlin.com/linux/v6.7-rc3/source/drivers/gpu/drm/msm/dsi/dsi_host.c#L772

> dsi_write(msm_host, REG_DSI_VID_CFG0, data);
>
>   		/* Do not swap RGB colors */
>
> --
> 2.26.1
Re: [PATCH v2 5/6] drm/msm/dsi: support DSC configurations with slice_per_pkt > 1

2023-12-07 Thread Jessica Zhang





On 11/14/2023 2:58 PM, Jonathan Marek wrote:
> Add a dsc_slice_per_pkt field to mipi_dsi_device struct and the necessary
> changes to msm driver to support this field.
>
> Note that the removed "pkt_per_line = slice_per_intf * slice_per_pkt"
> comment is incorrect.

Hi John,

Thanks for catching the typo.

>
> Signed-off-by: Jonathan Marek 
> ---
>   drivers/gpu/drm/msm/dsi/dsi_host.c | 25 ++---
>   include/drm/drm_mipi_dsi.h |  1 +
>   2 files changed, 11 insertions(+), 15 deletions(-)
>
> diff --git a/drivers/gpu/drm/msm/dsi/dsi_host.c b/drivers/gpu/drm/msm/dsi/dsi_host.c
> index 842765063b1b..892a463a7e03 100644
> --- a/drivers/gpu/drm/msm/dsi/dsi_host.c
> +++ b/drivers/gpu/drm/msm/dsi/dsi_host.c
> @@ -161,6 +161,7 @@ struct msm_dsi_host {
>
>   	struct drm_display_mode *mode;
> struct drm_dsc_config *dsc;
> +   unsigned int dsc_slice_per_pkt;
>
>   	/* connected device info */
> unsigned int channel;
> @@ -857,17 +858,10 @@ static void dsi_update_dsc_timing(struct msm_dsi_host *msm_host, bool is_cmd_mod
> slice_per_intf = msm_dsc_get_slices_per_intf(dsc, hdisplay);
>
>   	total_bytes_per_intf = dsc->slice_chunk_size * slice_per_intf;
> -   bytes_per_pkt = dsc->slice_chunk_size; /* * slice_per_pkt; */
> +   bytes_per_pkt = dsc->slice_chunk_size * msm_host->dsc_slice_per_pkt;
>
>   	eol_byte_num = total_bytes_per_intf % 3;
> -
> -   /*
> -* Typically, pkt_per_line = slice_per_intf * slice_per_pkt.
> -*
> -* Since the current driver only supports slice_per_pkt = 1,
> -* pkt_per_line will be equal to slice per intf for now.
> -*/
> -   pkt_per_line = slice_per_intf;
> +   pkt_per_line = slice_per_intf / msm_host->dsc_slice_per_pkt;
>
>   	if (is_cmd_mode) /* packet data type */
> reg = DSI_COMMAND_COMPRESSION_MODE_CTRL_STREAM0_DATATYPE(MIPI_DSI_DCS_LONG_WRITE);
> @@ -1004,12 +998,8 @@ static void dsi_timing_setup(struct msm_dsi_host *msm_host, bool is_bonded_dsi)
> else
> /*
>  * When DSC is enabled, WC = slice_chunk_size * slice_per_pkt + 1.
> -* Currently, the driver only supports default value of slice_per_pkt = 1
> -*
> -* TODO: Expand mipi_dsi_device struct to hold slice_per_pkt info
> -*   and adjust DSC math to account for slice_per_pkt.
>  */
> -   wc = msm_host->dsc->slice_chunk_size + 1;
> +   wc = msm_host->dsc->slice_chunk_size * msm_host->dsc_slice_per_pkt + 1;

Maybe we can reuse bytes_per_pkt here.

>
>   		dsi_write(msm_host, REG_DSI_CMD_MDP_STREAM0_CTRL,
> DSI_CMD_MDP_STREAM0_CTRL_WORD_COUNT(wc) |
> @@ -1636,8 +1626,13 @@ static int dsi_host_attach(struct mipi_dsi_host *host,
> msm_host->lanes = dsi->lanes;
> msm_host->format = dsi->format;
> msm_host->mode_flags = dsi->mode_flags;
> -   if (dsi->dsc)
> +   if (dsi->dsc) {
> msm_host->dsc = dsi->dsc;
> +   msm_host->dsc_slice_per_pkt = dsi->dsc_slice_per_pkt;
> +   /* for backwards compatibility, assume 1 if not set */
> +   if (!msm_host->dsc_slice_per_pkt)
> +   msm_host->dsc_slice_per_pkt = 1;
> +   }
>
>   	/* Some gpios defined in panel DT need to be controlled by host */
> ret = dsi_host_init_panel_gpios(msm_host, &dsi->dev);
> diff --git a/include/drm/drm_mipi_dsi.h b/include/drm/drm_mipi_dsi.h
> index c9df0407980c..3e32fa52d94b 100644
> --- a/include/drm/drm_mipi_dsi.h
> +++ b/include/drm/drm_mipi_dsi.h
> @@ -193,6 +193,7 @@ struct mipi_dsi_device {
> unsigned long hs_rate;
> unsigned long lp_rate;
> struct drm_dsc_config *dsc;

Any reason for not putting this in drm_dsc_config?

Thanks,

Jessica Zhang

> +   unsigned int dsc_slice_per_pkt;
>   };
>
>   #define MIPI_DSI_MODULE_PREFIX "mipi-dsi:"
>
> --
> 2.26.1
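
The packet arithmetic in the patch above can be sketched in isolation.
This is only an illustration of the four formulas from the diff, with
the "assume 1 if not set" fallback folded in; the struct and function
names are hypothetical and not part of the msm driver:

```c
#include <stdint.h>

/* Hypothetical container for the values the patch computes. */
struct dsc_pkt_params {
	uint32_t bytes_per_pkt;
	uint32_t pkt_per_line;
	uint32_t eol_byte_num;
	uint32_t word_count;
};

struct dsc_pkt_params dsc_pkt_math(uint32_t slice_chunk_size,
				   uint32_t slice_per_intf,
				   uint32_t slice_per_pkt)
{
	struct dsc_pkt_params p;
	uint32_t total_bytes_per_intf;

	/* for backwards compatibility, assume 1 if not set */
	if (!slice_per_pkt)
		slice_per_pkt = 1;

	total_bytes_per_intf = slice_chunk_size * slice_per_intf;

	p.bytes_per_pkt = slice_chunk_size * slice_per_pkt;
	p.pkt_per_line = slice_per_intf / slice_per_pkt;
	p.eol_byte_num = total_bytes_per_intf % 3;
	/* WC = slice_chunk_size * slice_per_pkt + 1, i.e. bytes_per_pkt + 1 */
	p.word_count = p.bytes_per_pkt + 1;

	return p;
}
```

Written this way, the word count is visibly bytes_per_pkt + 1, which is
the reuse suggested in the review comment.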

Re: [PATCH] drm/amd/display: fix cursor-plane-only atomic commits not triggering pageflips

2023-12-07 Thread Harry Wentland





On 2023-12-07 14:30, Xaver Hugl wrote:
> Sorry, it looks like I sent this too soon. I tested the patch on a
> second PC and it doesn't fix the issue there.

Ah, too bad. Won't merge it then.

Harry

> On Thu, 7 Dec 2023 at 19:25, Xaver Hugl wrote:
>>
>> With VRR, every atomic commit affecting a given display must trigger
>> a new scanout cycle, so that userspace is able to control the refresh
>> rate of the display. Before this commit, this was not the case for
>> atomic commits that only contain cursor plane properties.
>>
>> Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3034
>> Cc: sta...@vger.kernel.org
>>
>> Signed-off-by: Xaver Hugl 
>> ---
>>   drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 10 --
>>   1 file changed, 8 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
>> index b452796fc6d3..b379c859fbef 100644
>> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
>> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
>> @@ -8149,9 +8149,15 @@ static void amdgpu_dm_commit_planes(struct drm_atomic_state *state,
>>  /* Cursor plane is handled after stream updates */
>>  if (plane->type == DRM_PLANE_TYPE_CURSOR) {
>>  if ((fb && crtc == pcrtc) ||
>> -   (old_plane_state->fb && old_plane_state->crtc == pcrtc))
>> +   (old_plane_state->fb && old_plane_state->crtc == pcrtc)) {
>>  cursor_update = true;
>> -
>> +   /*
>> +* With atomic modesetting, cursor changes must
>> +* also trigger a new refresh period with vrr
>> +*/
>> +   if (!state->legacy_cursor_update)
>> +   pflip_present = true;
>> +   }
>>  continue;
>>  }
>>
>> --
>> 2.43.0

Re: [PATCH] drm/amd/display: fix cursor-plane-only atomic commits not triggering pageflips

2023-12-07 Thread Harry Wentland


On 2023-12-07 13:25, Xaver Hugl wrote:
> With VRR, every atomic commit affecting a given display must trigger
> a new scanout cycle, so that userspace is able to control the refresh
> rate of the display. Before this commit, this was not the case for
> atomic commits that only contain cursor plane properties.
>
> Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3034
> Cc: sta...@vger.kernel.org
>
> Signed-off-by: Xaver Hugl 

Reviewed-by: Harry Wentland 

Harry

> ---
>   drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 10 --
>   1 file changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> index b452796fc6d3..b379c859fbef 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> @@ -8149,9 +8149,15 @@ static void amdgpu_dm_commit_planes(struct drm_atomic_state *state,
> /* Cursor plane is handled after stream updates */
> if (plane->type == DRM_PLANE_TYPE_CURSOR) {
> if ((fb && crtc == pcrtc) ||
> -   (old_plane_state->fb && old_plane_state->crtc == pcrtc))
> +   (old_plane_state->fb && old_plane_state->crtc == pcrtc)) {
> cursor_update = true;
> -
> +   /*
> +* With atomic modesetting, cursor changes must
> +* also trigger a new refresh period with vrr
> +*/
> +   if (!state->legacy_cursor_update)
> +   pflip_present = true;
> +   }
> continue;
> }
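
The decision the hunk above makes for a cursor plane can be modelled on
its own. This is a simplified sketch of the flag logic only; the struct
fields are hypothetical booleans standing in for the fb/crtc comparisons
in amdgpu_dm_commit_planes(), and it makes no claim about whether the
change is sufficient in practice:

```c
#include <stdbool.h>

/* Hypothetical flattened view of one cursor plane in a commit. */
struct cursor_commit {
	bool has_fb;               /* new state has a framebuffer */
	bool on_crtc;              /* new state targets this CRTC */
	bool old_has_fb;           /* old state had a framebuffer */
	bool old_on_crtc;          /* old state targeted this CRTC */
	bool legacy_cursor_update; /* commit came from the legacy cursor ioctl */
};

struct commit_flags {
	bool cursor_update;
	bool pflip_present;
};

struct commit_flags cursor_plane_flags(const struct cursor_commit *c)
{
	struct commit_flags f = { false, false };

	if ((c->has_fb && c->on_crtc) ||
	    (c->old_has_fb && c->old_on_crtc)) {
		f.cursor_update = true;
		/*
		 * With atomic modesetting, a cursor change should also
		 * start a new refresh period; legacy cursor updates do not.
		 */
		if (!c->legacy_cursor_update)
			f.pflip_present = true;
	}

	return f;
}
```

A cursor-only atomic commit sets both flags, while a legacy cursor
update sets cursor_update alone.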

Re: [PATCH v2 1/1] drm/msm/adreno: Add support for SM7150 SoC machine

2023-12-07 Thread Konrad Dybcio





On 12/7/23 20:46, Akhil P Oommen wrote:
> On Thu, Nov 23, 2023 at 12:03:56AM +0300, Danila Tikhonov wrote:
>>
>> sc7180/sm7125 (atoll) expects speedbins from atoll.dtsi:
>> And has a parameter: /delete-property/ qcom,gpu-speed-bin;
>> 107 for 504Mhz max freq, pwrlevel 4
>> 130 for 610Mhz max freq, pwrlevel 3
>> 159 for 750Mhz max freq, pwrlevel 5
>> 169 for 800Mhz max freq, pwrlevel 2
>> 174 for 825Mhz max freq, pwrlevel 1 (Downstream says 172, but that's probably
>> a typo)
> A bit confused. Where do you see 172 in the downstream code? It is 174 in the
> downstream code when I checked.
>> For the rest of the speed bins, the speed-bin value is calculated as
>> FMAX/4.8MHz + 2, rounded up to the nearest integer.
>>
>> sm7150 (sdmmagpie) expects speedbins from sdmmagpie-gpu.dtsi:
>> 128 for 610Mhz max freq, pwrlevel 3
>> 146 for 700Mhz max freq, pwrlevel 2
>> 167 for 800Mhz max freq, pwrlevel 4
>> 172 for 504Mhz max freq, pwrlevel 1
>> For the rest of the speed bins, the speed-bin value is calculated as
>> FMAX/4.8 MHz, rounded up to the nearest integer.
>>
>> Creating a new entry does not make much sense.
>> I can suggest expanding the standard entry:
>>
>> .speedbins = ADRENO_SPEEDBINS(
>>     { 0, 0 },
>>     /* sc7180/sm7125 */
>>     { 107, 3 },
>>     { 130, 4 },
>>     { 159, 5 },
>>     { 168, 1 }, has already
>>     { 174, 2 }, has already
>>     /* sm7150 */
>>     { 128, 1 },
>>     { 146, 2 },
>>     { 167, 3 },
>>     { 172, 4 }, ),
>>
>
> A difference I see between atoll and sdmmagpie is that the former
> doesn't support 180Mhz. If you want to do the same, then you need to use
> a new bit in the supported-hw bitfield instead of reusing an existing one.
> Generally it is better to stick to exactly what downstream does.

OK I take my doubts back, let's go with adding a new one.

Konrad
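
As a check on the arithmetic quoted in this thread, the two rules
(FMAX/4.8 MHz rounded up, plus 2 on atoll and plus 0 on sdmmagpie) can
be written with integer math. The function name and the offset parameter
are hypothetical; only the formula and the table values come from the
messages above:

```c
/*
 * Speed-bin value as described in the thread: ceil(fmax / 4.8 MHz) plus
 * a per-SoC offset. Dividing by 4.8 is done as (fmax * 10) / 48 so that
 * no floating point is needed; adding 47 before the division rounds up.
 */
unsigned int speed_bin_from_fmax(unsigned int fmax_mhz, unsigned int offset)
{
	return (fmax_mhz * 10 + 47) / 48 + offset;
}
```

With an offset of 2 this gives 107, 130, 159, 169 and 174 for 504, 610,
750, 800 and 825 MHz, and with an offset of 0 it gives 128, 146 and 167
for 610, 700 and 800 MHz. The 172 listed for 504 MHz on sdmmagpie does
not follow the formula, which is consistent with the rule being stated
only for "the rest of the speed bins".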

Re: [PATCH 3/3] arm64: dts: qcom: sm8650: Add DisplayPort device nodes

2023-12-07 Thread Konrad Dybcio





On 12/7/23 17:37, Neil Armstrong wrote:
> Declare the displayport controller present on the Qualcomm SM8650 SoC
> and connected to the USB3/DP Combo PHY.
>
> Signed-off-by: Neil Armstrong 
> ---

[...]

> +   clocks = < DISP_CC_MDSS_AHB_CLK>,
> +< DISP_CC_MDSS_DPTX0_AUX_CLK>,
> +< DISP_CC_MDSS_DPTX0_LINK_CLK>,
> +< DISP_CC_MDSS_DPTX0_LINK_INTF_CLK>,
> +< DISP_CC_MDSS_DPTX0_PIXEL0_CLK>;

What about PIXEL1 clocks?

[...]

> +   opp-16200 {
> +   opp-hz = /bits/ 64 <16200>;
> +   required-opps = <_opp_low_svs_d1>;
> +   };
> +
> +   opp-27000 {
> +   opp-hz = /bits/ 64 <27000>;
> +   required-opps = <_opp_low_svs>;
> +   };
> +
> +   opp-54000 {
> +   opp-hz = /bits/ 64 <54000>;
> +   required-opps = <_opp_svs_l1>;
> +   };
> +
> +   opp-81000 {
> +   opp-hz = /bits/ 64 <81000>;
> +   required-opps = <_opp_nom>;
> +   };
> +   };
> +   };
> };
>
>   		dispcc: clock-controller@af0 {
> @@ -2996,8 +3086,8 @@ dispcc: clock-controller@af0 {
>  <_dsi0_phy 1>,
>  <_dsi1_phy 0>,
>  <_dsi1_phy 1>,
> -<0>, /* dp0 */
> -<0>,
> +<_dp_qmpphy QMP_USB43DP_DP_LINK_CLK>,
> +<_dp_qmpphy QMP_USB43DP_DP_VCO_DIV_CLK>,
>  <0>, /* dp1 */
>  <0>,
>  <0>, /* dp2 */

I noticed that this is not in line with your mdss patch [1]
where there are only two DP INTFs available.. Unless all of
these controllers can work using some sharing/only some at
one time...

Konrad

[1] https://lore.kernel.org/all/20231030-topic-sm8650-upstream-mdss-v2-5-43f1887c8...@linaro.org/

Re: [PATCH v2 1/1] drm/msm/adreno: Add support for SM7150 SoC machine

2023-12-07 Thread Akhil P Oommen

On Thu, Nov 23, 2023 at 12:03:56AM +0300, Danila Tikhonov wrote:
> 
> sc7180/sm7125 (atoll) expects speedbins from atoll.dtsi:
> And has a parameter: /delete-property/ qcom,gpu-speed-bin;
> 107 for 504Mhz max freq, pwrlevel 4
> 130 for 610Mhz max freq, pwrlevel 3
> 159 for 750Mhz max freq, pwrlevel 5
> 169 for 800Mhz max freq, pwrlevel 2
> 174 for 825Mhz max freq, pwrlevel 1 (Downstream says 172, but that's probably
> a typo)
A bit confused. Where do you see 172 in the downstream code? It is 174 in the
downstream code when I checked.
> For the rest of the speed bins, the speed-bin value is calculated as
> FMAX/4.8MHz + 2, rounded up to the nearest integer.
> 
> sm7150 (sdmmagpie) expects speedbins from sdmmagpie-gpu.dtsi:
> 128 for 610Mhz max freq, pwrlevel 3
> 146 for 700Mhz max freq, pwrlevel 2
> 167 for 800Mhz max freq, pwrlevel 4
> 172 for 504Mhz max freq, pwrlevel 1
> For the rest of the speed bins, the speed-bin value is calculated as
> FMAX/4.8 MHz, rounded up to the nearest integer.
> 
> Creating a new entry does not make much sense.
> I can suggest expanding the standard entry:
> 
> .speedbins = ADRENO_SPEEDBINS(
>     { 0, 0 },
>     /* sc7180/sm7125 */
>     { 107, 3 },
>     { 130, 4 },
>     { 159, 5 },
>     { 168, 1 }, has already
>     { 174, 2 }, has already
>     /* sm7150 */
>     { 128, 1 },
>     { 146, 2 },
>     { 167, 3 },
>     { 172, 4 }, ),
> 

A difference I see between atoll and sdmmagpie is that the former
doesn't support 180Mhz. If you want to do the same, then you need to use
a new bit in the supported-hw bitfield instead of reusing an existing one.
Generally it is better to stick to exactly what downstream does.

-Akhil.

> All the best,
> Danila
> 
> On 11/22/23 23:28, Konrad Dybcio wrote:
> > 
> > 
> > On 10/16/23 16:32, Dmitry Baryshkov wrote:
> > > On 26/09/2023 23:03, Konrad Dybcio wrote:
> > > > On 26.09.2023 21:10, Danila Tikhonov wrote:
> > > > > 
> > > > > I think you mean by name downstream dt - sdmmagpie-gpu.dtsi
> > > > > 
> > > > > You can see the forked version of the mainline here:
> > > > > https://github.com/sm7150-mainline/linux/blob/next/arch/arm64/boot/dts/qcom/sm7150.dtsi
> > > > > 
> > > > > 
> > > > > All fdt that we got here, if it is useful for you:
> > > > > https://github.com/sm7150-mainline/downstream-fdt
> > > > > 
> > > > > Best wishes, Danila
> > > > Taking a look at downstream, atoll.dtsi (SC7180) includes
> > > > sdmmagpie-gpu.dtsi.
> > > > 
> > > > Bottom line is, they share the speed bins, so it should be
> > > > fine to just extend the existing entry.
> > > 
> > > But then atoll.dtsi rewrites speed bins and pwrlevel bins. So they
> > > are not shared.
> > +Akhil
> > 
> > could you please check internally?
> > 
> > Konrad
>

Re: [PATCH v2 4/4] drm/panel-edp: Add some panels with conservative timings

2023-12-07 Thread Doug Anderson

Hi,

On Thu, Dec 7, 2023 at 10:58 AM Maxime Ripard  wrote:
>
> On Thu, Dec 07, 2023 at 10:23:53AM -0800, Doug Anderson wrote:
> > Hi,
> >
> > On Thu, Dec 7, 2023 at 12:18 AM Pin-yen Lin  wrote:
> > >
> > > These panels are used by Mediatek MT8173 Chromebooks but we can't find
> > > the corresponding data sheets, and these panels used to work on v4.19
> > > kernel without any specified delays.
> > >
> > > Therefore, instead of having them use the default conservative timings,
> > > update them with less-conservative timings from other panels of the same
> > > vendor. The panels should still work under those timings, and we can
> > > save some delays and suppress the warnings.
> > >
> > > Signed-off-by: Pin-yen Lin 
> > >
> > > ---
> > >
> > > (no changes since v1)
> > >
> > >  drivers/gpu/drm/panel/panel-edp.c | 31 +++
> > >  1 file changed, 31 insertions(+)
> >
> > Reviewed-by: Douglas Anderson 
> >
> > Repeating my comments from v1 here too, since I expect this patch to
> > sit on the lists for a little while:
> >
> >
> > This is OK w/ me, but it will need time on the mailing lists before
> > landing in case anyone else has opinions.
>
> Generally speaking, I'm not really a fan of big patches that dump
> whatever ChromeOS is doing ...
>
> > Specific thoughts:
> >
> > * I at least feel fairly confident that this is OK since these panels
> > essentially booted without _any_ delays back on the old downstream
> > v4.19 kernel. Presumably the panels just had fairly robust timing
> > controllers and so worked OK, but it's better to get the timing more
> > correct.
>
> ... especially since you have to rely on the recollection of engineers
> involved at the time and you have no real way to test and make things
> clearer anymore, and we have to take patches in that are handwavy "trust
> us, it's doing the right thing".
>
> I'd really prefer to have these patches sent as they are found out.

It's probably not clear enough from the commit message, but this isn't
just a dump from downstream 4.19. What happened was:

1. Downstream chromeos-4.19 used the "little white lie" approach. They
all claimed a specific panel's compatible string even though there
were a whole pile of panels out there actually being used. Personally,
this was not something I was ever a fan of (and I wasn't personally
involved in this project), but it was the "state of the art" before
the generic panel-edp. Getting out of the "little white lie" situation
was why I spent so much time on the generic edp-panel solution
upstream.

2. These devices have now been uprevved to a newer kernel and I
believe that there were issues seen that necessitated a move to the
proper generic panel-edp code.

3. We are now getting field reports from our warning collector about a
whole pile of panels that are falling back to the "conservative"
timings, which means that they turn on/off much more slowly than they
should.

Pin-yen made an attempt to search for panel data sheets that matched
up with the IDs that came in from the field reports but there were
some panels that he just couldn't find.

So basically we're stuck. Options:

1. Leave customers who have these panels stuck with the hardware
behaving worse than it used to. This is not acceptable to me.

2. Land Pin-yen's patch as a downstream-only patch in ChromeOS. This
would solve the problem, but it would make me sad. If anyone ever
wants to take these old laptops and run some other Linux distribution
on them (and there are several that target old Chromebooks) then
they'd be again stuck with old timings.

3. Land a patch like this one that at least gets us into not such a bad shape.

While I don't love this patch (and that's why I made it clear that it
needs to spend time on the list), it seems better than the
alternatives. Do you have a proposal for something else? If not, can
you confirm you're OK with #3 given this explanation? ...and perhaps
more details in the commit message?

I would also note that, hopefully, patches like this shouldn't be a
recurring pattern. Any new Chromebooks using panel-edp will get
flagged much earlier and we should be able to get real/proper timings
added. I believe that we've also added a factory test so that
(assuming it doesn't get ignored by someone) devices that aren't
supported don't even make it out of the factory.

> Also, the fact that the 4.19 kernel mentioned in the commit log is
> actually a downstream tree needs to be made much clearer.

Yeah, that would help too.
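
To make the trade-off discussed above concrete, here is a minimal sketch
of a panel table lookup that falls back to conservative delays when a
panel ID is unknown. It is not the panel-edp code: the struct layout,
panel IDs and delay values are all made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical power-sequencing delays, in milliseconds. */
struct panel_delay {
	unsigned int unprepare_ms;
	unsigned int enable_ms;
};

struct panel_entry {
	uint32_t panel_id;
	struct panel_delay delay;
};

struct panel_lookup {
	struct panel_delay delay;
	int used_fallback; /* 1 if the conservative timings were used */
};

/* Made-up values: deliberately slow so any panel should tolerate them. */
static const struct panel_delay conservative_delay = { 2000, 200 };

/* Made-up IDs and delays standing in for entries with known timings. */
static const struct panel_entry known_panels[] = {
	{ 0x1001, { 500, 50 } },
	{ 0x1002, { 500, 80 } },
};

struct panel_lookup panel_lookup_delay(uint32_t panel_id)
{
	struct panel_lookup res;
	size_t i;

	for (i = 0; i < sizeof(known_panels) / sizeof(known_panels[0]); i++) {
		if (known_panels[i].panel_id == panel_id) {
			res.delay = known_panels[i].delay;
			res.used_fallback = 0;
			return res;
		}
	}

	/* Unknown panel: still works, but powers on and off more slowly. */
	res.delay = conservative_delay;
	res.used_fallback = 1;
	return res;
}
```

In this model, adding an entry for a panel whose data sheet is missing
moves it from the fallback path to the table path, which is the effect
option 3 is after.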

[PULL] drm-intel-next

2023-12-07 Thread Rodrigo Vivi

Hi Dave and Daniel,

Here goes another pull-request towards 6.8.
We are likely going to send another one in 2 weeks,
but I'd like to get this in right now so we can
get a clean drm-xe-next on top of drm-next for our
first Xe pull request.

Thanks,
Rodrigo.

drm-intel-next-2023-12-07:
- Improve display debug msgs and other general clean-ups (Ville, Rahul)
- PSR fixes and improvements around selective fetch (Jouni, Ville)
- Remove FBC restrictions for Xe2LPD displays (Vinod)
- Skip some timing checks on BXT/GLK DSI transcoders (Ville)
- DP MST Fixes (Ville)
- Correct the input parameter on _intel_dsb_commit (heminhong)
- Fix IP version of the display WAs (Bala)
- DGFX uses direct VBT pin mapping (Clint)
- Proper handling of bool on PIPE_CONF_CHECK macros (Jani)
- Skip state verification with TBT-ALT mode (Mika Kahola)
- General organization of display code for reuse with Xe
  (Jouni, Luca, Jani, Maarten)
- Squelch a sparse warning (Jani)
- Don't use "proxy" headers (Andy Shevchenko)
- Use devm_gpiod_get() for all GPIOs (Hans)
- Fix ADL+ tiled plane stride (Ville)
- Use octal permissions in display debugfs (Jani)

Thanks,
Rodrigo.

The following changes since commit deac453244d309ad7a94d0501eb5e0f9d8d1f1df:

  drm/i915: Fix glk+ degamma LUT conversions (2023-11-23 15:11:47 +0200)

are available in the Git repository at:

  git://anongit.freedesktop.org/drm/drm-intel tags/drm-intel-next-2023-12-07

for you to fetch changes up to 10690b8a49bceafb1badf0ad91842a359e796d8b:

  drm/i915/display: Add intel_fb_bo_framebuffer_fini (2023-12-07 17:31:02 +0200)



Andy Shevchenko (1):
  drm/i915/display: Don't use "proxy" headers

Balasubramani Vivekanandan (1):
  drm/i915/display: Fix IP version of the WAs

Clint Taylor (1):
  drm/i915/dgfx: DGFX uses direct VBT pin mapping

Hans de Goede (1):
  drm/i915/dsi: Use devm_gpiod_get() for all GPIOs

Jani Nikula (7):
  drm/i915: use PIPE_CONF_CHECK_BOOL() for bool members
  drm/i915: add bool type checks in PIPE_CONF_CHECK_*
  drm/i915/syncmap: squelch a sparse warning
  drm/i915/rpm: add rpm_to_i915() helper around container_of()
  drm/i915: use intel_connector in intel_connector_debugfs_add()
  drm/i915: pass struct intel_connector to connector debugfs fops
  drm/i915: use octal permissions in display debugfs

Jouni Högander (9):
  drm/i915/psr: Move plane sel fetch configuration into plane source files
  drm/i915/psr: Add proper handling for disabling sel fetch for planes
  drm/i915/display: split i915 specific code from intel_fbdev
  drm/i915/display: use intel_bo_to_drm_bo in intel_fbdev
  drm/i915/display: use intel_bo_to_drm_bo in intel_fb.c
  drm/i915/display: Convert intel_fb_modifier_to_tiling as non-static
  drm/i915/display: Handle invalid fb_modifier in intel_fb_modifier_to_tiling
  drm/i915/display: Split i915 specific code away from intel_fb.c
  drm/i915/display: Add intel_fb_bo_framebuffer_fini

Luca Coelho (1):
  drm/i915: handle uncore spinlock when not available

Maarten Lankhorst (1):
  drm/i915/display: Use i915_gem_object_get_dma_address to get dma address

Mika Kahola (1):
  drm/i915/display: Skip state verification with TBT-ALT mode

Rahul Rameshbabu (1):
  drm/i915/irq: Improve error logging for unexpected DE Misc interrupts

Ville Syrjälä (8):
  drm/i915: Stop printing pipe name as hex
  drm/i915: Move the SDP split debug spew to the correct place
  drm/i915/psr: Include some basic PSR information in the state dump
  drm/i915: Skip some timing checks on BXT/GLK DSI transcoders
  drm/i915/mst: Fix .mode_valid_ctx() return values
  drm/i915/mst: Reject modes that require the bigjoiner
  drm/i915: Clean up some DISPLAY_VER checks
  drm/i915: Fix ADL+ tiled plane stride when the POT stride is smaller than the original

Vinod Govindapillai (1):
  drm/i915/xe2lpd: remove the FBC restriction if PSR2 is enabled

heminhong (1):
  drm/i915: correct the input

Re: [PATCH v2 4/4] drm/panel-edp: Add some panels with conservative timings

2023-12-07 Thread Maxime Ripard

On Thu, Dec 07, 2023 at 10:23:53AM -0800, Doug Anderson wrote:
> Hi,
> 
> On Thu, Dec 7, 2023 at 12:18 AM Pin-yen Lin  wrote:
> >
> > These panels are used by Mediatek MT8173 Chromebooks but we can't find
> > the corresponding data sheets, and these panels used to work on v4.19
> > kernel without any specified delays.
> >
> > Therefore, instead of having them use the default conservative timings,
> > update them with less-conservative timings from other panels of the same
> > vendor. The panels should still work under those timings, and we can
> > save some delays and suppress the warnings.
> >
> > Signed-off-by: Pin-yen Lin 
> >
> > ---
> >
> > (no changes since v1)
> >
> >  drivers/gpu/drm/panel/panel-edp.c | 31 +++
> >  1 file changed, 31 insertions(+)
> 
> Reviewed-by: Douglas Anderson 
> 
> Repeating my comments from v1 here too, since I expect this patch to
> sit on the lists for a little while:
> 
> 
> This is OK w/ me, but it will need time on the mailing lists before
> landing in case anyone else has opinions.

Generally speaking, I'm not really a fan of big patches that dump
whatever ChromeOS is doing ...

> Specific thoughts:
> 
> * I at least feel fairly confident that this is OK since these panels
> essentially booted without _any_ delays back on the old downstream
> v4.19 kernel. Presumably the panels just had fairly robust timing
> controllers and so worked OK, but it's better to get the timing more
> correct.

... especially since you have to rely on the recollection of engineers
involved at the time and you have no real way to test and make things
clearer anymore, and we have to take patches in that are handwavy "trust
us, it's doing the right thing".

I'd really prefer to have these patches sent as they are found out.

Also, the fact that the 4.19 kernel mentioned in the commit log is
actually a downstream tree needs to be made much clearer.

Maxime



Re: [PATCH v2 4/4] drm/panel-edp: Add some panels with conservative timings

2023-12-07 Thread Doug Anderson

Hi,

On Thu, Dec 7, 2023 at 12:18 AM Pin-yen Lin  wrote:
>
> These panels are used by Mediatek MT8173 Chromebooks but we can't find
> the corresponding data sheets, and these panels used to work on v4.19
> kernel without any specified delays.
>
> Therefore, instead of having them use the default conservative timings,
> update them with less-conservative timings from other panels of the same
> vendor. The panels should still work under those timings, and we can
> save some delays and suppress the warnings.
>
> Signed-off-by: Pin-yen Lin 
>
> ---
>
> (no changes since v1)
>
>  drivers/gpu/drm/panel/panel-edp.c | 31 +++
>  1 file changed, 31 insertions(+)

Reviewed-by: Douglas Anderson 

Repeating my comments from v1 here too, since I expect this patch to
sit on the lists for a little while:

This is OK w/ me, but it will need time on the mailing lists before
landing in case anyone else has opinions. Specific thoughts:

* I at least feel fairly confident that this is OK since these panels
essentially booted without _any_ delays back on the old downstream
v4.19 kernel. Presumably the panels just had fairly robust timing
controllers and so worked OK, but it's better to get the timing more
correct.

* This is definitely better than the very conservative timings and the
WARN_ON splat.

* I don't love the "Unknown" string, but it doesn't do anything other
than print to dmesg anyway and at least it conveys to anyone else
reading the table that the timings may not be quite as tight.

Re: [PATCH v2 3/4] drm/edp-panel: Add panels delay entries

2023-12-07 Thread Doug Anderson

Hi,

On Thu, Dec 7, 2023 at 12:18 AM Pin-yen Lin  wrote:
>
> Add panels used by Mediatek MT8173 Chromebooks.
>
> Signed-off-by: Pin-yen Lin 
> ---
>
> (no changes since v1)
>
>  drivers/gpu/drm/panel/panel-edp.c | 39 +++
>  1 file changed, 39 insertions(+)

Reviewed-by: Douglas Anderson

Re: [PATCH v2 2/4] drm/panel-edp: Add powered_on_to_enable delay

2023-12-07 Thread Doug Anderson

Hi,

On Thu, Dec 7, 2023 at 12:18 AM Pin-yen Lin  wrote:
>
> Add the support of powered_on_to_enable delay as the minimum time that
> needs to have passed between the panel powered on and enable may begin.
>
> This delay is seen in BOE panels as the minimum delay of T3+T4+T5+T6+T8
> in the eDP timing diagrams.
>
> Signed-off-by: Pin-yen Lin 
> ---
>
> (no changes since v1)
>
>  drivers/gpu/drm/panel/panel-edp.c | 20 
>  1 file changed, 20 insertions(+)

Should have carried my tag from v1 since there were no changes:

Reviewed-by: Douglas Anderson 

As per my response in v1: This needs to bake a little while on the
lists (1-2 weeks) before I apply it in case others have opinions.
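
As a rough illustration of the powered_on_to_enable semantics described in
the quoted patch, the wait can be thought of as "sleep only for whatever part
of the minimum has not already elapsed". The sketch below is userspace C, not
the driver code; remaining_delay_ms() and its millisecond parameters are
hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not from panel-edp.c): given the time the panel was
 * powered on and the current time, return how many more milliseconds must
 * pass before at least min_ms have elapsed since power-on.
 */
static uint64_t remaining_delay_ms(uint64_t powered_on_ms, uint64_t now_ms,
				   uint64_t min_ms)
{
	uint64_t elapsed;

	/* The clock is assumed to be monotonic. */
	assert(now_ms >= powered_on_ms);
	elapsed = now_ms - powered_on_ms;

	if (elapsed >= min_ms)
		return 0;		/* minimum already satisfied, no wait */
	return min_ms - elapsed;	/* wait only for the remainder */
}
```

With this shape, time spent on other work between power-on and enable counts
towards the minimum, so a slow path pays no extra delay.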

Re: [PATCH v2 1/4] drm/edp-panel: Move the KDC panel to a separate group

2023-12-07 Thread Doug Anderson

Hi,

On Thu, Dec 7, 2023 at 12:18 AM Pin-yen Lin  wrote:
>
> Move the KDC panel entry to make the list sorted by the vendor string.
>
> Signed-off-by: Pin-yen Lin 
> ---
>
> (no changes since v1)
>
>  drivers/gpu/drm/panel/panel-edp.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)

Reviewed-by: Douglas Anderson 

Pushed to drm-misc-next:

67a5f0ff3429 drm/edp-panel: Move the KDC panel to a separate group

[PATCH v2 0/7] IOMMU related FW parsing cleanup

2023-12-07 Thread Jason Gunthorpe

These are the patches from the prior series without the "fwspec
polishing":
 https://lore.kernel.org/r/0-v2-36a0088ecaa7+22c6e-iommu_fwspec_...@nvidia.com

Does a few things to prepare for the next:

- Clean up the call chains around dma_configure so the iommu_ops isn't being
  exposed.

- Add additional lockdep annotations now that we can.

- Fix some missed places that need to call tegra_dev_iommu_get_stream_id()

Based on Joerg's for-next with Robin's bus changes.

Robin's dma_base/size cleanup squashes the first patch, but we can't do
the ops removal in the other parts without it, so let's keep it
unsquashed.

v2:
 - Remove comments and bracket around tegra_dev_iommu_get_stream_id()
   in gp10b.c
 - Remove WARN_ON() in tegra186_mc_client_sid_override(), just return 0
 - Push the locking change to a later series
 - Drop the COMPILE_TEST improvement, not important enough to argue.
v1: 
https://lore.kernel.org/r/0-v1-720585788a7d+811b-iommu_fwspec_p1_...@nvidia.com

Jason Gunthorpe (7):
  iommu: Remove struct iommu_ops *iommu from arch_setup_dma_ops()
  iommmu/of: Do not return struct iommu_ops from of_iommu_configure()
  iommu/of: Use -ENODEV consistently in of_iommu_configure()
  iommu: Mark dev_iommu_get() with lockdep
  iommu: Mark dev_iommu_priv_set() with a lockdep
  acpi: Do not return struct iommu_ops from acpi_iommu_configure_id()
  iommu/tegra: Use tegra_dev_iommu_get_stream_id() in the remaining
places

 arch/arc/mm/dma.c |  2 +-
 arch/arm/mm/dma-mapping-nommu.c   |  2 +-
 arch/arm/mm/dma-mapping.c | 10 +--
 arch/arm64/mm/dma-mapping.c   |  4 +-
 arch/mips/mm/dma-noncoherent.c|  2 +-
 arch/riscv/mm/dma-noncoherent.c   |  2 +-
 drivers/acpi/scan.c   | 32 ++
 drivers/dma/tegra186-gpc-dma.c|  8 +--
 .../gpu/drm/nouveau/nvkm/subdev/ltc/gp10b.c   |  9 +--
 drivers/hv/hv_common.c|  2 +-
 drivers/iommu/amd/iommu.c |  2 -
 drivers/iommu/apple-dart.c|  1 -
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c   |  1 -
 drivers/iommu/arm/arm-smmu/arm-smmu.c |  1 -
 drivers/iommu/intel/iommu.c   |  2 -
 drivers/iommu/iommu.c | 11 
 drivers/iommu/of_iommu.c  | 64 ---
 drivers/iommu/omap-iommu.c|  1 -
 drivers/memory/tegra/tegra186.c   | 14 ++--
 drivers/of/device.c   | 24 ---
 include/linux/dma-map-ops.h   |  4 +-
 include/linux/iommu.h |  5 +-
 include/linux/of_iommu.h  | 13 ++--
 23 files changed, 105 insertions(+), 111 deletions(-)


base-commit: 173ff345925a394284250bfa6e47d231e62031c7
-- 
2.43.0

[PATCH v2 1/7] iommu: Remove struct iommu_ops *iommu from arch_setup_dma_ops()

2023-12-07 Thread Jason Gunthorpe

This is not being used to pass ops, it is just a way to tell if an
iommu driver was probed. These days this can be detected directly via
device_iommu_mapped(). Call device_iommu_mapped() in the two places that
need to check it and remove the iommu parameter everywhere.
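
A minimal userspace sketch of the idea, assuming that "IOMMU mapped" boils
down to the device having an IOMMU group attached. The mock_* types and
functions are hypothetical stand-ins, not the kernel's definitions:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct device; only the fields the sketch needs. */
struct mock_device {
	void *iommu_group;	/* assumed non-NULL once an IOMMU driver probed */
	bool dma_coherent;
	bool uses_iommu_dma;
};

/* Mirrors the role of device_iommu_mapped(): query the device itself. */
static bool mock_device_iommu_mapped(const struct mock_device *dev)
{
	return dev->iommu_group != NULL;
}

/* No iommu_ops parameter: the function asks the device instead. */
static void mock_arch_setup_dma_ops(struct mock_device *dev, bool coherent)
{
	assert(dev != NULL);
	dev->dma_coherent = coherent;
	if (mock_device_iommu_mapped(dev))
		dev->uses_iommu_dma = true;
}
```

The point of the change is visible in the signature: callers no longer have
to thread a pointer through several layers just to convey a yes/no answer.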

Reviewed-by: Jerry Snitselaar 
Reviewed-by: Lu Baolu 
Reviewed-by: Moritz Fischer 
Acked-by: Christoph Hellwig 
Acked-by: Rob Herring 
Tested-by: Hector Martin 
Signed-off-by: Jason Gunthorpe 
---
 arch/arc/mm/dma.c   |  2 +-
 arch/arm/mm/dma-mapping-nommu.c |  2 +-
 arch/arm/mm/dma-mapping.c   | 10 +-
 arch/arm64/mm/dma-mapping.c |  4 ++--
 arch/mips/mm/dma-noncoherent.c  |  2 +-
 arch/riscv/mm/dma-noncoherent.c |  2 +-
 drivers/acpi/scan.c |  3 +--
 drivers/hv/hv_common.c  |  2 +-
 drivers/of/device.c |  2 +-
 include/linux/dma-map-ops.h |  4 ++--
 10 files changed, 16 insertions(+), 17 deletions(-)

diff --git a/arch/arc/mm/dma.c b/arch/arc/mm/dma.c
index 2a7fbbb83b7056..197707bc765889 100644
--- a/arch/arc/mm/dma.c
+++ b/arch/arc/mm/dma.c
@@ -91,7 +91,7 @@ void arch_sync_dma_for_cpu(phys_addr_t paddr, size_t size,
  * Plug in direct dma map ops.
  */
 void arch_setup_dma_ops(struct device *dev, u64 dma_base, u64 size,
-   const struct iommu_ops *iommu, bool coherent)
+   bool coherent)
 {
/*
 * IOC hardware snoops all DMA traffic keeping the caches consistent
diff --git a/arch/arm/mm/dma-mapping-nommu.c b/arch/arm/mm/dma-mapping-nommu.c
index cfd9c933d2f09c..b94850b579952a 100644
--- a/arch/arm/mm/dma-mapping-nommu.c
+++ b/arch/arm/mm/dma-mapping-nommu.c
@@ -34,7 +34,7 @@ void arch_sync_dma_for_cpu(phys_addr_t paddr, size_t size,
 }
 
 void arch_setup_dma_ops(struct device *dev, u64 dma_base, u64 size,
-   const struct iommu_ops *iommu, bool coherent)
+   bool coherent)
 {
if (IS_ENABLED(CONFIG_CPU_V7M)) {
/*
diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index 5409225b4abc06..6c359a3af8d9c7 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -1713,7 +1713,7 @@ void arm_iommu_detach_device(struct device *dev)
 EXPORT_SYMBOL_GPL(arm_iommu_detach_device);
 
 static void arm_setup_iommu_dma_ops(struct device *dev, u64 dma_base, u64 size,
-   const struct iommu_ops *iommu, bool 
coherent)
+   bool coherent)
 {
struct dma_iommu_mapping *mapping;
 
@@ -1748,7 +1748,7 @@ static void arm_teardown_iommu_dma_ops(struct device *dev)
 #else
 
 static void arm_setup_iommu_dma_ops(struct device *dev, u64 dma_base, u64 size,
-   const struct iommu_ops *iommu, bool 
coherent)
+   bool coherent)
 {
 }
 
@@ -1757,7 +1757,7 @@ static void arm_teardown_iommu_dma_ops(struct device 
*dev) { }
 #endif /* CONFIG_ARM_DMA_USE_IOMMU */
 
 void arch_setup_dma_ops(struct device *dev, u64 dma_base, u64 size,
-   const struct iommu_ops *iommu, bool coherent)
+   bool coherent)
 {
/*
 * Due to legacy code that sets the ->dma_coherent flag from a bus
@@ -1776,8 +1776,8 @@ void arch_setup_dma_ops(struct device *dev, u64 dma_base, 
u64 size,
if (dev->dma_ops)
return;
 
-   if (iommu)
-   arm_setup_iommu_dma_ops(dev, dma_base, size, iommu, coherent);
+   if (device_iommu_mapped(dev))
+   arm_setup_iommu_dma_ops(dev, dma_base, size, coherent);
 
xen_setup_dma_ops(dev);
dev->archdata.dma_ops_setup = true;
diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
index 3cb101e8cb29ba..61886e43e3a10f 100644
--- a/arch/arm64/mm/dma-mapping.c
+++ b/arch/arm64/mm/dma-mapping.c
@@ -47,7 +47,7 @@ void arch_teardown_dma_ops(struct device *dev)
 #endif
 
 void arch_setup_dma_ops(struct device *dev, u64 dma_base, u64 size,
-   const struct iommu_ops *iommu, bool coherent)
+   bool coherent)
 {
int cls = cache_line_size_of_cpu();
 
@@ -58,7 +58,7 @@ void arch_setup_dma_ops(struct device *dev, u64 dma_base, u64 
size,
   ARCH_DMA_MINALIGN, cls);
 
dev->dma_coherent = coherent;
-   if (iommu)
+   if (device_iommu_mapped(dev))
iommu_setup_dma_ops(dev, dma_base, dma_base + size - 1);
 
xen_setup_dma_ops(dev);
diff --git a/arch/mips/mm/dma-noncoherent.c b/arch/mips/mm/dma-noncoherent.c
index 3c4fc97b9f394b..0f3cec663a12cd 100644
--- a/arch/mips/mm/dma-noncoherent.c
+++ b/arch/mips/mm/dma-noncoherent.c
@@ -138,7 +138,7 @@ void arch_sync_dma_for_cpu(phys_addr_t paddr, size_t size,
 
 #ifdef CONFIG_ARCH_HAS_SETUP_DMA_OPS
 void arch_setup_dma_ops(struct device *dev, u64 dma_base, u64 size,
-   const struct iommu_ops *iommu, bool coherent)
+   bool coherent)
 {

[PATCH v2 5/7] iommu: Mark dev_iommu_priv_set() with a lockdep

2023-12-07 Thread Jason Gunthorpe

A perfect driver would only call dev_iommu_priv_set() from its probe
callback. We've made it functionally correct to call it from the of_xlate
by adding a lock around that call.

Add a lockdep assertion that iommu_probe_device_lock is held to discourage
misuse.

Exclude PPC kernels with CONFIG_FSL_PAMU turned on because FSL_PAMU uses a
global static for its priv and abuses priv for its domain.

Remove the pointless stores of NULL, all these are on paths where the core
code will free dev->iommu after the op returns.
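
The shape of the check can be sketched in userspace as follows. This is only
an illustration under stated assumptions: the mock lock is a plain flag, the
"warning" is a counter rather than a lockdep splat, and every mock_* name is
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical lock model: a flag instead of a real mutex plus lockdep. */
struct mock_lock {
	bool held;
};

static struct mock_lock mock_probe_lock;
static unsigned int mock_warnings;	/* counts failed "held" assertions */

static void mock_assert_held(const struct mock_lock *lock)
{
	if (!lock->held)
		mock_warnings++;	/* lockdep would print a warning here */
}

struct mock_dev_iommu {
	void *priv;
};

/*
 * The store always happens; the assertion only reports misuse. The
 * exempt flag models the CONFIG_FSL_PAMU exclusion from the patch.
 */
static void mock_priv_set(struct mock_dev_iommu *di, void *priv, bool exempt)
{
	assert(di != NULL);
	if (!exempt)
		mock_assert_held(&mock_probe_lock);
	di->priv = priv;
}
```

Because the assertion does not change behaviour, existing callers keep
working; they are merely flagged when they run outside the lock.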

Reviewed-by: Lu Baolu 
Reviewed-by: Jerry Snitselaar 
Tested-by: Hector Martin 
Signed-off-by: Jason Gunthorpe 
---
 drivers/iommu/amd/iommu.c   | 2 --
 drivers/iommu/apple-dart.c  | 1 -
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 1 -
 drivers/iommu/arm/arm-smmu/arm-smmu.c   | 1 -
 drivers/iommu/intel/iommu.c | 2 --
 drivers/iommu/iommu.c   | 9 +
 drivers/iommu/omap-iommu.c  | 1 -
 include/linux/iommu.h   | 5 +
 8 files changed, 10 insertions(+), 12 deletions(-)

diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 9f706436082833..be58644a6fa518 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -551,8 +551,6 @@ static void amd_iommu_uninit_device(struct device *dev)
if (dev_data->domain)
detach_device(dev);
 
-   dev_iommu_priv_set(dev, NULL);
-
/*
 * We keep dev_data around for unplugged devices and reuse it when the
 * device is re-plugged - not doing so would introduce a ton of races.
diff --git a/drivers/iommu/apple-dart.c b/drivers/iommu/apple-dart.c
index 7438e9c82ba982..25135440b5dd54 100644
--- a/drivers/iommu/apple-dart.c
+++ b/drivers/iommu/apple-dart.c
@@ -743,7 +743,6 @@ static void apple_dart_release_device(struct device *dev)
 {
struct apple_dart_master_cfg *cfg = dev_iommu_priv_get(dev);
 
-   dev_iommu_priv_set(dev, NULL);
kfree(cfg);
 }
 
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c 
b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index fc4317c25b6d53..1855d3892b15f8 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2695,7 +2695,6 @@ static struct iommu_device *arm_smmu_probe_device(struct 
device *dev)
 
 err_free_master:
kfree(master);
-   dev_iommu_priv_set(dev, NULL);
return ERR_PTR(ret);
 }
 
diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c 
b/drivers/iommu/arm/arm-smmu/arm-smmu.c
index 4d09c004789274..adc7937fd8a3a3 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
@@ -1420,7 +1420,6 @@ static void arm_smmu_release_device(struct device *dev)
 
arm_smmu_rpm_put(cfg->smmu);
 
-   dev_iommu_priv_set(dev, NULL);
kfree(cfg);
 }
 
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 897159dba47de4..511589341074f0 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -4461,7 +4461,6 @@ static struct iommu_device 
*intel_iommu_probe_device(struct device *dev)
ret = intel_pasid_alloc_table(dev);
if (ret) {
dev_err(dev, "PASID table allocation failed\n");
-   dev_iommu_priv_set(dev, NULL);
kfree(info);
return ERR_PTR(ret);
}
@@ -4479,7 +4478,6 @@ static void intel_iommu_release_device(struct device *dev)
dmar_remove_one_dev_info(dev);
intel_pasid_free_table(dev);
intel_iommu_debugfs_remove_dev(info);
-   dev_iommu_priv_set(dev, NULL);
kfree(info);
set_dma_ops(dev, NULL);
 }
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 4323b6276e977f..08f29a1dfcd5f8 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -387,6 +387,15 @@ static u32 dev_iommu_get_max_pasids(struct device *dev)
return min_t(u32, max_pasids, dev->iommu->iommu_dev->max_pasids);
 }
 
+void dev_iommu_priv_set(struct device *dev, void *priv)
+{
+   /* FSL_PAMU does something weird */
+   if (!IS_ENABLED(CONFIG_FSL_PAMU))
+   lockdep_assert_held(&iommu_probe_device_lock);
+   dev->iommu->priv = priv;
+}
+EXPORT_SYMBOL_GPL(dev_iommu_priv_set);
+
 /*
  * Init the dev->iommu and dev->iommu_group in the struct device and get the
  * driver probed
diff --git a/drivers/iommu/omap-iommu.c b/drivers/iommu/omap-iommu.c
index c66b070841dd41..c9528065a59afa 100644
--- a/drivers/iommu/omap-iommu.c
+++ b/drivers/iommu/omap-iommu.c
@@ -1719,7 +1719,6 @@ static void omap_iommu_release_device(struct device *dev)
if (!dev->of_node || !arch_data)
return;
 
-   dev_iommu_priv_set(dev, NULL);
kfree(arch_data);
 
 }
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index c7394b39599c84..c24933a1d0d643 100644
--- a/include/linux/iommu.h

[PATCH v2 2/7] iommmu/of: Do not return struct iommu_ops from of_iommu_configure()

2023-12-07 Thread Jason Gunthorpe

Nothing needs this pointer. Return a normal error code with the usual
IOMMU semantic that ENODEV means 'there is no IOMMU driver'.
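
To make the return convention concrete, here is a small userspace sketch of
how a caller might classify the result. The enum and classify_iommu_ret() are
hypothetical; MOCK_EPROBE_DEFER stands in for the kernel-internal
EPROBE_DEFER, which userspace errno.h does not define.

```c
#include <assert.h>
#include <errno.h>

/* Stand-in for the kernel-internal EPROBE_DEFER (517 in the kernel). */
#define MOCK_EPROBE_DEFER 517

static_assert(MOCK_EPROBE_DEFER != ENODEV, "codes must be distinguishable");

/* Hypothetical caller-side view of the four documented outcomes. */
enum iommu_outcome {
	IOMMU_PRESENT,		/* 0: an IOMMU was configured */
	IOMMU_ABSENT,		/* -ENODEV: no IOMMU for this device */
	IOMMU_RETRY,		/* -EPROBE_DEFER: try probing again later */
	IOMMU_ERROR_IGNORED,	/* other -errno: logged, probing continues */
};

static enum iommu_outcome classify_iommu_ret(int ret)
{
	if (ret == -MOCK_EPROBE_DEFER)
		return IOMMU_RETRY;
	if (ret == -ENODEV)
		return IOMMU_ABSENT;
	if (ret)
		return IOMMU_ERROR_IGNORED;
	return IOMMU_PRESENT;
}
```

A plain int carries all four cases, which is why the ops pointer (and the
ERR_PTR()/PTR_ERR() round trip) is no longer needed.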

Reviewed-by: Jerry Snitselaar 
Reviewed-by: Lu Baolu 
Acked-by: Rob Herring 
Tested-by: Hector Martin 
Signed-off-by: Jason Gunthorpe 
---
 drivers/iommu/of_iommu.c | 31 +++
 drivers/of/device.c  | 22 +++---
 include/linux/of_iommu.h | 13 ++---
 3 files changed, 40 insertions(+), 26 deletions(-)

diff --git a/drivers/iommu/of_iommu.c b/drivers/iommu/of_iommu.c
index 5ecca53847d325..c6510d7e7b241b 100644
--- a/drivers/iommu/of_iommu.c
+++ b/drivers/iommu/of_iommu.c
@@ -107,16 +107,22 @@ static int of_iommu_configure_device(struct device_node 
*master_np,
  of_iommu_configure_dev(master_np, dev);
 }
 
-const struct iommu_ops *of_iommu_configure(struct device *dev,
-  struct device_node *master_np,
-  const u32 *id)
+/*
+ * Returns:
+ *  0 on success, an iommu was configured
+ *  -ENODEV if the device does not have any IOMMU
+ *  -EPROBEDEFER if probing should be tried again
+ *  -errno fatal errors
+ */
+int of_iommu_configure(struct device *dev, struct device_node *master_np,
+  const u32 *id)
 {
const struct iommu_ops *ops = NULL;
struct iommu_fwspec *fwspec;
int err = NO_IOMMU;
 
if (!master_np)
-   return NULL;
+   return -ENODEV;
 
/* Serialise to make dev->iommu stable under our potential fwspec */
mutex_lock(&iommu_probe_device_lock);
@@ -124,7 +130,7 @@ const struct iommu_ops *of_iommu_configure(struct device 
*dev,
if (fwspec) {
if (fwspec->ops) {
mutex_unlock(&iommu_probe_device_lock);
-   return fwspec->ops;
+   return 0;
}
/* In the deferred case, start again from scratch */
iommu_fwspec_free(dev);
@@ -169,14 +175,15 @@ const struct iommu_ops *of_iommu_configure(struct device 
*dev,
err = iommu_probe_device(dev);
 
/* Ignore all other errors apart from EPROBE_DEFER */
-   if (err == -EPROBE_DEFER) {
-   ops = ERR_PTR(err);
-   } else if (err < 0) {
-   dev_dbg(dev, "Adding to IOMMU failed: %d\n", err);
-   ops = NULL;
+   if (err < 0) {
+   if (err == -EPROBE_DEFER)
+   return err;
+   dev_dbg(dev, "Adding to IOMMU failed: %pe\n", ERR_PTR(err));
+   return err;
}
-
-   return ops;
+   if (!ops)
+   return -ENODEV;
+   return 0;
 }
 
 static enum iommu_resv_type __maybe_unused
diff --git a/drivers/of/device.c b/drivers/of/device.c
index 65c71be71a8d45..873d933e8e6d1d 100644
--- a/drivers/of/device.c
+++ b/drivers/of/device.c
@@ -93,12 +93,12 @@ of_dma_set_restricted_buffer(struct device *dev, struct 
device_node *np)
 int of_dma_configure_id(struct device *dev, struct device_node *np,
bool force_dma, const u32 *id)
 {
-   const struct iommu_ops *iommu;
const struct bus_dma_region *map = NULL;
struct device_node *bus_np;
u64 dma_start = 0;
u64 mask, end, size = 0;
bool coherent;
+   int iommu_ret;
int ret;
 
if (np == dev->of_node)
@@ -181,21 +181,29 @@ int of_dma_configure_id(struct device *dev, struct 
device_node *np,
dev_dbg(dev, "device is%sdma coherent\n",
coherent ? " " : " not ");
 
-   iommu = of_iommu_configure(dev, np, id);
-   if (PTR_ERR(iommu) == -EPROBE_DEFER) {
+   iommu_ret = of_iommu_configure(dev, np, id);
+   if (iommu_ret == -EPROBE_DEFER) {
/* Don't touch range map if it wasn't set from a valid 
dma-ranges */
if (!ret)
dev->dma_range_map = NULL;
kfree(map);
return -EPROBE_DEFER;
-   }
+   } else if (iommu_ret == -ENODEV) {
+   dev_dbg(dev, "device is not behind an iommu\n");
+   } else if (iommu_ret) {
+   dev_err(dev, "iommu configuration for device failed with %pe\n",
+   ERR_PTR(iommu_ret));
 
-   dev_dbg(dev, "device is%sbehind an iommu\n",
-   iommu ? " " : " not ");
+   /*
+* Historically this routine doesn't fail driver probing
+* due to errors in of_iommu_configure()
+*/
+   } else
+   dev_dbg(dev, "device is behind an iommu\n");
 
arch_setup_dma_ops(dev, dma_start, size, coherent);
 
-   if (!iommu)
+   if (iommu_ret)
of_dma_set_restricted_buffer(dev, np);
 
return 0;
diff --git a/include/linux/of_iommu.h b/include/linux/of_iommu.h
index 9a5e6b410dd2fb..e61cbbe12dac6f 100644
--- a/include/linux/of_iommu.h
+++

[PATCH v2 4/7] iommu: Mark dev_iommu_get() with lockdep

2023-12-07 Thread Jason Gunthorpe

Allocation of dev->iommu must be done under the
iommu_probe_device_lock. Mark this with lockdep to discourage future
mistakes.

Reviewed-by: Jerry Snitselaar 
Tested-by: Hector Martin 
Reviewed-by: Lu Baolu 
Reviewed-by: Moritz Fischer 
Signed-off-by: Jason Gunthorpe 
---
 drivers/iommu/iommu.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 0d25468d53a68a..4323b6276e977f 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -334,6 +334,8 @@ static struct dev_iommu *dev_iommu_get(struct device *dev)
 {
struct dev_iommu *param = dev->iommu;
 
+   lockdep_assert_held(&iommu_probe_device_lock);
+
if (param)
return param;
 
-- 
2.43.0

[PATCH v2 6/7] acpi: Do not return struct iommu_ops from acpi_iommu_configure_id()

2023-12-07 Thread Jason Gunthorpe

Nothing needs this pointer. Return a normal error code with the usual
IOMMU semantic that ENODEV means 'there is no IOMMU driver'.

Acked-by: Rafael J. Wysocki 
Reviewed-by: Jerry Snitselaar 
Reviewed-by: Lu Baolu 
Reviewed-by: Moritz Fischer 
Tested-by: Hector Martin 
Signed-off-by: Jason Gunthorpe 
---
 drivers/acpi/scan.c | 29 +
 1 file changed, 17 insertions(+), 12 deletions(-)

diff --git a/drivers/acpi/scan.c b/drivers/acpi/scan.c
index 444a0b3c72f2d8..340ba720c72129 100644
--- a/drivers/acpi/scan.c
+++ b/drivers/acpi/scan.c
@@ -1562,8 +1562,7 @@ static inline const struct iommu_ops 
*acpi_iommu_fwspec_ops(struct device *dev)
return fwspec ? fwspec->ops : NULL;
 }
 
-static const struct iommu_ops *acpi_iommu_configure_id(struct device *dev,
-  const u32 *id_in)
+static int acpi_iommu_configure_id(struct device *dev, const u32 *id_in)
 {
int err;
const struct iommu_ops *ops;
@@ -1577,7 +1576,7 @@ static const struct iommu_ops 
*acpi_iommu_configure_id(struct device *dev,
ops = acpi_iommu_fwspec_ops(dev);
if (ops) {
mutex_unlock(&iommu_probe_device_lock);
-   return ops;
+   return 0;
}
 
err = iort_iommu_configure_id(dev, id_in);
@@ -1594,12 +1593,14 @@ static const struct iommu_ops 
*acpi_iommu_configure_id(struct device *dev,
 
/* Ignore all other errors apart from EPROBE_DEFER */
if (err == -EPROBE_DEFER) {
-   return ERR_PTR(err);
+   return err;
} else if (err) {
dev_dbg(dev, "Adding to IOMMU failed: %d\n", err);
-   return NULL;
+   return -ENODEV;
}
-   return acpi_iommu_fwspec_ops(dev);
+   if (!acpi_iommu_fwspec_ops(dev))
+   return -ENODEV;
+   return 0;
 }
 
 #else /* !CONFIG_IOMMU_API */
@@ -1611,10 +1612,9 @@ int acpi_iommu_fwspec_init(struct device *dev, u32 id,
return -ENODEV;
 }
 
-static const struct iommu_ops *acpi_iommu_configure_id(struct device *dev,
-  const u32 *id_in)
+static int acpi_iommu_configure_id(struct device *dev, const u32 *id_in)
 {
-   return NULL;
+   return -ENODEV;
 }
 
 #endif /* !CONFIG_IOMMU_API */
@@ -1628,7 +1628,7 @@ static const struct iommu_ops 
*acpi_iommu_configure_id(struct device *dev,
 int acpi_dma_configure_id(struct device *dev, enum dev_dma_attr attr,
  const u32 *input_id)
 {
-   const struct iommu_ops *iommu;
+   int ret;
 
if (attr == DEV_DMA_NOT_SUPPORTED) {
set_dma_ops(dev, _dummy_ops);
@@ -1637,10 +1637,15 @@ int acpi_dma_configure_id(struct device *dev, enum 
dev_dma_attr attr,
 
acpi_arch_dma_setup(dev);
 
-   iommu = acpi_iommu_configure_id(dev, input_id);
-   if (PTR_ERR(iommu) == -EPROBE_DEFER)
+   ret = acpi_iommu_configure_id(dev, input_id);
+   if (ret == -EPROBE_DEFER)
return -EPROBE_DEFER;
 
+   /*
+* Historically this routine doesn't fail driver probing due to errors
+* in acpi_iommu_configure_id()
+*/
+
arch_setup_dma_ops(dev, 0, U64_MAX, attr == DEV_DMA_COHERENT);
 
return 0;
-- 
2.43.0

[PATCH v2 3/7] iommu/of: Use -ENODEV consistently in of_iommu_configure()

2023-12-07 Thread Jason Gunthorpe

Instead of returning 1 and trying to handle positive error codes just
stick to the convention of returning -ENODEV. Remove references to ops
from of_iommu_configure(), a NULL ops will already generate an error code.

There is no reason to check dev->bus, if err=0 at this point then the
called configure functions thought there was an iommu and we should try to
probe it. Remove it.

Reviewed-by: Jerry Snitselaar 
Reviewed-by: Moritz Fischer 
Tested-by: Hector Martin 
Signed-off-by: Jason Gunthorpe 
---
 drivers/iommu/of_iommu.c | 49 
 1 file changed, 15 insertions(+), 34 deletions(-)

diff --git a/drivers/iommu/of_iommu.c b/drivers/iommu/of_iommu.c
index c6510d7e7b241b..164317bfb8a81f 100644
--- a/drivers/iommu/of_iommu.c
+++ b/drivers/iommu/of_iommu.c
@@ -17,8 +17,6 @@
 #include 
 #include 
 
-#define NO_IOMMU   1
-
 static int of_iommu_xlate(struct device *dev,
  struct of_phandle_args *iommu_spec)
 {
@@ -29,7 +27,7 @@ static int of_iommu_xlate(struct device *dev,
ops = iommu_ops_from_fwnode(fwnode);
if ((ops && !ops->of_xlate) ||
!of_device_is_available(iommu_spec->np))
-   return NO_IOMMU;
+   return -ENODEV;
 
ret = iommu_fwspec_init(dev, _spec->np->fwnode, ops);
if (ret)
@@ -61,7 +59,7 @@ static int of_iommu_configure_dev_id(struct device_node 
*master_np,
 "iommu-map-mask", _spec.np,
 iommu_spec.args);
if (err)
-   return err == -ENODEV ? NO_IOMMU : err;
+   return err;
 
err = of_iommu_xlate(dev, _spec);
of_node_put(iommu_spec.np);
@@ -72,7 +70,7 @@ static int of_iommu_configure_dev(struct device_node 
*master_np,
  struct device *dev)
 {
struct of_phandle_args iommu_spec;
-   int err = NO_IOMMU, idx = 0;
+   int err = -ENODEV, idx = 0;
 
while (!of_parse_phandle_with_args(master_np, "iommus",
   "#iommu-cells",
@@ -117,9 +115,8 @@ static int of_iommu_configure_device(struct device_node 
*master_np,
 int of_iommu_configure(struct device *dev, struct device_node *master_np,
   const u32 *id)
 {
-   const struct iommu_ops *ops = NULL;
struct iommu_fwspec *fwspec;
-   int err = NO_IOMMU;
+   int err;
 
if (!master_np)
return -ENODEV;
@@ -153,37 +150,21 @@ int of_iommu_configure(struct device *dev, struct 
device_node *master_np,
} else {
err = of_iommu_configure_device(master_np, dev, id);
}
-
-   /*
-* Two success conditions can be represented by non-negative err here:
-* >0 : there is no IOMMU, or one was unavailable for non-fatal reasons
-*  0 : we found an IOMMU, and dev->fwspec is initialised appropriately
-* <0 : any actual error
-*/
-   if (!err) {
-   /* The fwspec pointer changed, read it again */
-   fwspec = dev_iommu_fwspec_get(dev);
-   ops= fwspec->ops;
-   }
mutex_unlock(&iommu_probe_device_lock);
 
-   /*
-* If we have reason to believe the IOMMU driver missed the initial
-* probe for dev, replay it to get things in order.
-*/
-   if (!err && dev->bus)
-   err = iommu_probe_device(dev);
-
-   /* Ignore all other errors apart from EPROBE_DEFER */
-   if (err < 0) {
-   if (err == -EPROBE_DEFER)
-   return err;
-   dev_dbg(dev, "Adding to IOMMU failed: %pe\n", ERR_PTR(err));
+   if (err == -ENODEV || err == -EPROBE_DEFER)
return err;
-   }
-   if (!ops)
-   return -ENODEV;
+   if (err)
+   goto err_log;
+
+   err = iommu_probe_device(dev);
+   if (err)
+   goto err_log;
return 0;
+
+err_log:
+   dev_dbg(dev, "Adding to IOMMU failed: %pe\n", ERR_PTR(err));
+   return err;
 }
 
 static enum iommu_resv_type __maybe_unused
-- 
2.43.0

[PATCH v2 7/7] iommu/tegra: Use tegra_dev_iommu_get_stream_id() in the remaining places

2023-12-07 Thread Jason Gunthorpe

This API was defined to formalize the access to internal iommu details on
some Tegra SOCs, but a few callers got missed. Add them.

The helper already masks by 0x so remove this code from the callers.
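
For readers unfamiliar with the helper, the pattern is "return whether a
stream ID exists, and hand back an already-masked value". The userspace
sketch below only illustrates that pattern. The mock_* names are
hypothetical, and both the 0xffff mask width and the single-ID check are
assumptions, since the mask value is garbled in the archived text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the firmware-provided IOMMU spec. */
struct mock_fwspec {
	unsigned int num_ids;
	uint32_t ids[1];
};

/*
 * Returns true and stores the masked stream ID if exactly one ID is
 * present; returns false (leaving *stream_id untouched) otherwise.
 * The 0xffff mask is an assumption made for this sketch.
 */
static bool mock_get_stream_id(const struct mock_fwspec *fwspec,
			       uint32_t *stream_id)
{
	assert(stream_id != NULL);

	if (fwspec && fwspec->num_ids == 1) {
		*stream_id = fwspec->ids[0] & 0xffff;
		return true;
	}
	return false;
}
```

Callers then shrink to a single if statement, and none of them needs to know
how the ID is stored or masked.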

Suggested-by: Thierry Reding 
Reviewed-by: Thierry Reding 
Signed-off-by: Jason Gunthorpe 
---
 drivers/dma/tegra186-gpc-dma.c  |  8 +++-
 drivers/gpu/drm/nouveau/nvkm/subdev/ltc/gp10b.c |  9 ++---
 drivers/memory/tegra/tegra186.c | 14 --
 3 files changed, 13 insertions(+), 18 deletions(-)

diff --git a/drivers/dma/tegra186-gpc-dma.c b/drivers/dma/tegra186-gpc-dma.c
index fa4d4142a68a21..88547a23825b18 100644
--- a/drivers/dma/tegra186-gpc-dma.c
+++ b/drivers/dma/tegra186-gpc-dma.c
@@ -1348,8 +1348,8 @@ static int tegra_dma_program_sid(struct tegra_dma_channel 
*tdc, int stream_id)
 static int tegra_dma_probe(struct platform_device *pdev)
 {
const struct tegra_dma_chip_data *cdata = NULL;
-   struct iommu_fwspec *iommu_spec;
-   unsigned int stream_id, i;
+   unsigned int i;
+   u32 stream_id;
struct tegra_dma *tdma;
int ret;
 
@@ -1378,12 +1378,10 @@ static int tegra_dma_probe(struct platform_device *pdev)
 
tdma->dma_dev.dev = >dev;
 
-   iommu_spec = dev_iommu_fwspec_get(>dev);
-   if (!iommu_spec) {
+   if (!tegra_dev_iommu_get_stream_id(>dev, _id)) {
dev_err(>dev, "Missing iommu stream-id\n");
return -EINVAL;
}
-   stream_id = iommu_spec->ids[0] & 0x;
 
ret = device_property_read_u32(>dev, "dma-channel-mask",
   >chan_mask);
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/ltc/gp10b.c 
b/drivers/gpu/drm/nouveau/nvkm/subdev/ltc/gp10b.c
index e7e8fdf3adab7a..29682722b0b36b 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/ltc/gp10b.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/ltc/gp10b.c
@@ -28,19 +28,14 @@ static void
 gp10b_ltc_init(struct nvkm_ltc *ltc)
 {
struct nvkm_device *device = ltc->subdev.device;
-   struct iommu_fwspec *spec;
+   u32 sid;
 
nvkm_wr32(device, 0x17e27c, ltc->ltc_nr);
nvkm_wr32(device, 0x17e000, ltc->ltc_nr);
nvkm_wr32(device, 0x100800, ltc->ltc_nr);
 
-   spec = dev_iommu_fwspec_get(device->dev);
-   if (spec) {
-   u32 sid = spec->ids[0] & 0x;
-
-   /* stream ID */
+   if (tegra_dev_iommu_get_stream_id(device->dev, &sid))
nvkm_wr32(device, 0x16, sid << 2);
-   }
 }
 
 static const struct nvkm_ltc_func
diff --git a/drivers/memory/tegra/tegra186.c b/drivers/memory/tegra/tegra186.c
index 533f85a4b2bdb7..9cbf22a10a8270 100644
--- a/drivers/memory/tegra/tegra186.c
+++ b/drivers/memory/tegra/tegra186.c
@@ -111,9 +111,12 @@ static void tegra186_mc_client_sid_override(struct 
tegra_mc *mc,
 static int tegra186_mc_probe_device(struct tegra_mc *mc, struct device *dev)
 {
 #if IS_ENABLED(CONFIG_IOMMU_API)
-   struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
struct of_phandle_args args;
unsigned int i, index = 0;
+   u32 sid;
+
+   if (!tegra_dev_iommu_get_stream_id(dev, &sid))
+   return 0;
 
while (!of_parse_phandle_with_args(dev->of_node, "interconnects", 
"#interconnect-cells",
   index, )) {
@@ -121,11 +124,10 @@ static int tegra186_mc_probe_device(struct tegra_mc *mc, 
struct device *dev)
for (i = 0; i < mc->soc->num_clients; i++) {
const struct tegra_mc_client *client = 
>soc->clients[i];
 
-   if (client->id == args.args[0]) {
-   u32 sid = fwspec->ids[0] & 
MC_SID_STREAMID_OVERRIDE_MASK;
-
-   tegra186_mc_client_sid_override(mc, 
client, sid);
-   }
+   if (client->id == args.args[0])
+   tegra186_mc_client_sid_override(
+   mc, client,
+   sid & 
MC_SID_STREAMID_OVERRIDE_MASK);
}
}
 
-- 
2.43.0

[PATCH 2/2] drm/amdgpu: add shared fdinfo stats

2023-12-07 Thread Alex Deucher

Add shared stats.  Useful for seeing shared memory.

v2: take dma-buf into account as well
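
The accounting rule can be sketched in userspace as below: a buffer counts as
shared when it has more than one handle or has been exported as a dma-buf,
and its size is then added to both the total and the shared counter of its
domain. All mock_* types are hypothetical simplifications of the driver
structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum mock_domain { MOCK_VRAM, MOCK_GTT, MOCK_CPU };

/* Hypothetical, trimmed-down view of a buffer object. */
struct mock_bo {
	uint64_t size;
	unsigned int handle_count;
	bool has_dma_buf;
	enum mock_domain domain;
};

struct mock_stats {
	uint64_t vram, vram_shared;
	uint64_t gtt, gtt_shared;
	uint64_t cpu, cpu_shared;
};

static void mock_account(const struct mock_bo *bo, struct mock_stats *stats)
{
	bool shared;

	assert(bo != NULL && stats != NULL);
	shared = bo->handle_count > 1 || bo->has_dma_buf;

	switch (bo->domain) {
	case MOCK_VRAM:
		stats->vram += bo->size;
		if (shared)
			stats->vram_shared += bo->size;
		break;
	case MOCK_GTT:
		stats->gtt += bo->size;
		if (shared)
			stats->gtt_shared += bo->size;
		break;
	case MOCK_CPU:
	default:
		stats->cpu += bo->size;
		if (shared)
			stats->cpu_shared += bo->size;
		break;
	}
}
```

Note that the shared counters are subsets of the totals, not separate pools,
so shared usage is never larger than the total for the same domain.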

Signed-off-by: Alex Deucher 
Cc: Rob Clark 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c |  4 
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 11 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.h |  6 ++
 3 files changed, 21 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c
index 5706b282a0c7..c7df7fa3459f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c
@@ -97,6 +97,10 @@ void amdgpu_show_fdinfo(struct drm_printer *p, struct 
drm_file *file)
   stats.requested_visible_vram/1024UL);
drm_printf(p, "amd-requested-gtt:\t%llu KiB\n",
   stats.requested_gtt/1024UL);
+   drm_printf(p, "drm-shared-vram:\t%llu KiB\n", stats.vram_shared/1024UL);
+   drm_printf(p, "drm-shared-gtt:\t%llu KiB\n", stats.gtt_shared/1024UL);
+   drm_printf(p, "drm-shared-cpu:\t%llu KiB\n", stats.cpu_shared/1024UL);
+
for (hw_ip = 0; hw_ip < AMDGPU_HW_IP_NUM; ++hw_ip) {
if (!usage[hw_ip])
continue;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index d79b4ca1ecfc..1b37d95475b8 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -1287,25 +1287,36 @@ void amdgpu_bo_get_memory(struct amdgpu_bo *bo,
  struct amdgpu_mem_stats *stats)
 {
uint64_t size = amdgpu_bo_size(bo);
+   struct drm_gem_object *obj;
unsigned int domain;
+   bool shared;
 
/* Abort if the BO doesn't currently have a backing store */
if (!bo->tbo.resource)
return;
 
+   obj = &bo->tbo.base;
+   shared = (obj->handle_count > 1) || obj->dma_buf;
+
domain = amdgpu_mem_type_to_domain(bo->tbo.resource->mem_type);
switch (domain) {
case AMDGPU_GEM_DOMAIN_VRAM:
stats->vram += size;
if (amdgpu_bo_in_cpu_visible_vram(bo))
stats->visible_vram += size;
+   if (shared)
+   stats->vram_shared += size;
break;
case AMDGPU_GEM_DOMAIN_GTT:
stats->gtt += size;
+   if (shared)
+   stats->gtt_shared += size;
break;
case AMDGPU_GEM_DOMAIN_CPU:
default:
stats->cpu += size;
+   if (shared)
+   stats->cpu_shared += size;
break;
}
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
index d28e21baef16..0503af75dc26 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
@@ -138,12 +138,18 @@ struct amdgpu_bo_vm {
 struct amdgpu_mem_stats {
/* current VRAM usage, includes visible VRAM */
uint64_t vram;
+   /* current shared VRAM usage, includes visible VRAM */
+   uint64_t vram_shared;
/* current visible VRAM usage */
uint64_t visible_vram;
/* current GTT usage */
uint64_t gtt;
+   /* current shared GTT usage */
+   uint64_t gtt_shared;
/* current system memory usage */
uint64_t cpu;
+   /* current shared system memory usage */
+   uint64_t cpu_shared;
/* sum of evicted buffers, includes visible VRAM */
uint64_t evicted_vram;
/* sum of evicted buffers due to CPU access */
-- 
2.42.0

[PATCH 1/2] drm: update drm_show_memory_stats() for dma-bufs

2023-12-07 Thread Alex Deucher

Show buffers as shared if they are shared via dma-buf as well
(e.g., shared with v4l or some other subsystem).

Signed-off-by: Alex Deucher 
Cc: Rob Clark 
---
 drivers/gpu/drm/drm_file.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/drm_file.c b/drivers/gpu/drm/drm_file.c
index 5ddaffd32586..5d5f93b9c263 100644
--- a/drivers/gpu/drm/drm_file.c
+++ b/drivers/gpu/drm/drm_file.c
@@ -973,7 +973,7 @@ void drm_show_memory_stats(struct drm_printer *p, struct 
drm_file *file)
DRM_GEM_OBJECT_PURGEABLE;
}
 
-   if (obj->handle_count > 1) {
+   if ((obj->handle_count > 1) || obj->dma_buf) {
status.shared += obj->size;
} else {
status.private += obj->size;
-- 
2.42.0

[PATCH 0/2] fdinfo shared stats

2023-12-07 Thread Alex Deucher

We had a request to add shared buffer stats to fdinfo for amdgpu and
while implementing that, Christian mentioned that just looking at
the GEM handle count doesn't take into account buffers shared with other
subsystems like V4L or RDMA.  Those subsystems don't use GEM, so it
doesn't really matter from a GPU top perspective, but it's more
correct if you actually want to see shared buffers.
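
As a rough illustration of the rule both patches apply, here is a standalone
sketch. It uses hypothetical mock types (mock_gem_object, mock_mem_stats)
rather than the real drm_gem_object, and only models the two fields the
patches read: the GEM handle count and the dma-buf pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the drm_gem_object fields the patches look at. */
struct mock_gem_object {
	unsigned int handle_count; /* number of GEM handles referring to the object */
	void *dma_buf;             /* non-NULL once the object has a dma-buf attached */
	uint64_t size;             /* object size in bytes */
};

/* Hypothetical stand-in for the shared/private totals. */
struct mock_mem_stats {
	uint64_t shared;
	uint64_t private_;
};

/* Shared means: more than one GEM handle, or exported/imported via dma-buf. */
static bool mock_obj_is_shared(const struct mock_gem_object *obj)
{
	assert(obj != NULL);
	return obj->handle_count > 1 || obj->dma_buf != NULL;
}

/* Add the object's size to exactly one of the two totals. */
static void mock_account(struct mock_mem_stats *stats,
			 const struct mock_gem_object *obj)
{
	assert(stats != NULL);
	if (mock_obj_is_shared(obj))
		stats->shared += obj->size;
	else
		stats->private_ += obj->size;
}
```

With only the handle count, a buffer that has a single GEM handle but is also
shared with V4L through a dma-buf would be counted as private; with the
combined check it is counted as shared.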

Alex Deucher (2):
  drm: update drm_show_memory_stats() for dma-bufs
  drm/amdgpu: add shared fdinfo stats

 drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c |  4 
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 11 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.h |  6 ++
 drivers/gpu/drm/drm_file.c |  2 +-
 4 files changed, 22 insertions(+), 1 deletion(-)

-- 
2.42.0

Re: [PATCH] drm/bridge: aux-hpd: Replace of_device.h with explicit include

2023-12-07 Thread Dmitry Baryshkov


On 07/12/2023 18:25, Rob Herring wrote:
> The DT of_device.h and of_platform.h date back to the separate
> of_platform_bus_type before it was merged into the regular platform bus.
> As part of that merge prepping Arm DT support 13 years ago, they
> "temporarily" include each other. They also include platform_device.h
> and of.h. Soon the implicit includes are going to be removed.
>
> of_device.h isn't needed, but of.h is for of_node_put().
>
> Reported-by: Stephen Rothwell 
> Signed-off-by: Rob Herring 

Reviewed-by: Dmitry Baryshkov 

> ---
>   drivers/gpu/drm/bridge/aux-hpd-bridge.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)


--
With best wishes
Dmitry

Re: [RFC PATCH] of/platform: Disable sysfb if a simple-framebuffer node is found

2023-12-07 Thread Rob Herring

On Mon, Dec 04, 2023 at 05:05:30PM +0100, Javier Martinez Canillas wrote:
> Rob Herring  writes:
> 
> > On Mon, Dec 4, 2023 at 3:39 AM Javier Martinez Canillas
> >  wrote:
> >> Rob Herring  writes:
> >> > On Fri, Dec 1, 2023 at 4:21 AM Javier Martinez Canillas
> 
> [...]
> 
> >>
> >> > However, there might be one other issue with that and this fix. The DT
> >> > simplefb can have resources such as clocks and regulators. With
> >> > fw_devlink, the driver won't probe until those dependencies are met.
> >> > So if you want the framebuffer console up early, then you may want to
> >> > register the EFI framebuffer first and then handoff to the DT simplefb
> >> > when it probes (rather than registering the device).
> >> >
> >> > But I agree, probably better to take this patch now and have those
> >> > quirks instead of flat out not working.
> >> >
> >>
> >> If we do that what's the plan? Are you thinking about merging this patch
> >> through your OF tree or do you want to go through drm-misc with your ack?
> >
> > I can take it. Do we need this in 6.7 and stable?
> >
> 
> IMO this can wait for v6.8 since it is not a fix for a change introduced in
> the v6.7 merge window and is something that only happens on a very specific
> setup (DT systems booting with u-boot EFI and providing an EFI-GOP table).
> 
> Also the -rc cycle is already in -rc5, so it seems risky to push a change
> at this point. And distros can pick the patch if they want to have it earlier.

Okay, I've applied it for 6.8.

Rob

Re: [PATCH] x86/vmware: Add TDX hypercall support

2023-12-07 Thread Dave Hansen

On 12/5/23 23:15, Alexey Makhalov wrote:
> +#ifdef CONFIG_INTEL_TDX_GUEST
> +/* Export tdx hypercall and allow it only for VMware guests. */
> +void vmware_tdx_hypercall_args(struct tdx_module_args *args)
> +{
> + if (hypervisor_is_type(X86_HYPER_VMWARE))
> + __tdx_hypercall(args);
> +}
> +EXPORT_SYMBOL_GPL(vmware_tdx_hypercall_args);
> +#endif

I think this is still too generic.  This still allows anything setting
X86_HYPER_VMWARE to make any TDX hypercall.

I'd *much* rather you export something like vmware_tdx_hypercall() or
even the high-level calls like hypervisor_ppn_reset_all().  The higher
level and more specialized the interface, the less likely it is to be
abused.
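
To make the idea concrete, here is a minimal standalone sketch. Everything in
it is hypothetical: mock_hcall_args stands in for struct tdx_module_args,
MOCK_VMWARE_LEAF is a made-up value, and mock_tdx_hypercall() only records its
arguments. The point is the shape of the interface: the generic primitive stays
file-local and the exported wrapper fixes the vendor-specific fields itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct tdx_module_args; only two fields modelled. */
struct mock_hcall_args {
	uint64_t leaf; /* which hypercall family is requested */
	uint64_t cmd;  /* command within that family */
};

#define MOCK_VMWARE_LEAF UINT64_C(0x1234) /* made-up value for illustration */

static struct mock_hcall_args mock_last_call;
static unsigned int mock_call_count;

/* Stand-in for the low-level primitive. It is static, i.e. not exported. */
static void mock_tdx_hypercall(const struct mock_hcall_args *args)
{
	assert(args != NULL);
	mock_last_call = *args;
	mock_call_count++;
}

/*
 * The only entry point other code would see. The caller picks a command,
 * but the leaf is set here, so arbitrary hypercalls cannot be issued.
 */
static bool mock_vmware_hypercall(bool is_vmware_guest, uint64_t cmd)
{
	struct mock_hcall_args args = { .leaf = MOCK_VMWARE_LEAF, .cmd = cmd };

	if (!is_vmware_guest)
		return false;

	mock_tdx_hypercall(&args);
	return true;
}
```

A wrapper per high-level operation would narrow this further, since the
command would then be fixed inside the wrapper as well.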

Re: [PATCH 2/3] drm/msm/dp: Add DisplayPort controller for SM8650

2023-12-07 Thread Dmitry Baryshkov

On Thu, 7 Dec 2023 at 18:37, Neil Armstrong  wrote:
>
> The Qualcomm SM8650 platform comes with a DisplayPort controller
> with a different base offset than the previous SM8550 SoC,
> add support for this in the DisplayPort driver.
>
> Signed-off-by: Neil Armstrong 

Reviewed-by: Dmitry Baryshkov 

> ---
>  drivers/gpu/drm/msm/dp/dp_display.c | 6 ++
>  1 file changed, 6 insertions(+)

-- 
With best wishes
Dmitry

Re: [PATCH] drm/panel: re-alphabetize the menu list

2023-12-07 Thread Randy Dunlap




On 12/7/23 01:52, Aradhya Bhatia wrote:
> Hi Randy,
> 
> Thanks for the patch!
> 
> On 07/12/23 11:52, Randy Dunlap wrote:
>> A few of the DRM_PANEL entries have become out of alphabetical order,
>> so move them around a bit to restore alpha order.
>>
>> Signed-off-by: Randy Dunlap 
>> Cc: Neil Armstrong 
>> Cc: Jessica Zhang 
>> Cc: Sam Ravnborg 
>> Cc: Maarten Lankhorst 
>> Cc: Maxime Ripard 
>> Cc: Thomas Zimmermann 
>> Cc: David Airlie 
>> Cc: Daniel Vetter 
>> Cc: dri-devel@lists.freedesktop.org
>> ---
>>  drivers/gpu/drm/panel/Kconfig |   90 
>>  1 file changed, 45 insertions(+), 45 deletions(-)
>>
>> diff -- a/drivers/gpu/drm/panel/Kconfig b/drivers/gpu/drm/panel/Kconfig
>> --- a/drivers/gpu/drm/panel/Kconfig
>> +++ b/drivers/gpu/drm/panel/Kconfig
>> @@ -95,34 +95,6 @@ config DRM_PANEL_LVDS
>>handling of power supplies or control signals. It implements automatic
>>backlight handling if the panel is attached to a backlight controller.
>>  
>> -config DRM_PANEL_SIMPLE
>> -tristate "support for simple panels (other than eDP ones)"
>> -depends on OF
>> -depends on BACKLIGHT_CLASS_DEVICE
>> -depends on PM
>> -select VIDEOMODE_HELPERS
>> -help
>> -  DRM panel driver for dumb non-eDP panels that need at most a regulator
>> -  and a GPIO to be powered up. Optionally a backlight can be attached so
>> -  that it can be automatically turned off when the panel goes into a
>> -  low power state.
>> -
>> -config DRM_PANEL_EDP
>> -tristate "support for simple Embedded DisplayPort panels"
>> -depends on OF
>> -depends on BACKLIGHT_CLASS_DEVICE
>> -depends on PM
>> -select VIDEOMODE_HELPERS
>> -select DRM_DISPLAY_DP_HELPER
>> -select DRM_DISPLAY_HELPER
>> -select DRM_DP_AUX_BUS
>> -select DRM_KMS_HELPER
>> -help
>> -  DRM panel driver for dumb eDP panels that need at most a regulator and
>> -  a GPIO to be powered up. Optionally a backlight can be attached so
>> -  that it can be automatically turned off when the panel goes into a
>> -  low power state.
>> -
>>  config DRM_PANEL_EBBG_FT8719
>>  tristate "EBBG FT8719 panel driver"
>>  depends on OF
>> @@ -317,12 +289,6 @@ config DRM_PANEL_LEADTEK_LTK500HD1829
>>24 bit RGB per pixel. It provides a MIPI DSI interface to
>>the host and has a built-in LED backlight.
>>  
>> -config DRM_PANEL_SAMSUNG_LD9040
>> -tristate "Samsung LD9040 RGB/SPI panel"
>> -depends on OF && SPI
>> -depends on BACKLIGHT_CLASS_DEVICE
>> -select VIDEOMODE_HELPERS
>> -
>>  config DRM_PANEL_LG_LB035Q02
>>  tristate "LG LB035Q024573 RGB panel"
>>  depends on GPIOLIB && OF && SPI
>> @@ -350,6 +316,17 @@ config DRM_PANEL_MAGNACHIP_D53E6EA8966
>>with the Magnachip D53E6EA8966 panel IC. This panel receives
>>video data via DSI but commands via 9-bit SPI using DBI.
>>  
>> +config DRM_PANEL_MANTIX_MLAF057WE51
>> +tristate "Mantix MLAF057WE51-X MIPI-DSI LCD panel"
>> +depends on OF
>> +depends on DRM_MIPI_DSI
>> +depends on BACKLIGHT_CLASS_DEVICE
>> +help
>> +  Say Y here if you want to enable support for the Mantix
>> +  MLAF057WE51-X MIPI DSI panel as e.g. used in the Librem 5. It
>> +  has a resolution of 720x1440 pixels, a built in backlight and touch
>> +  controller.
>> +
>>  config DRM_PANEL_NEC_NL8048HL11
>>  tristate "NEC NL8048HL11 RGB panel"
>>  depends on GPIOLIB && OF && SPI
>> @@ -438,17 +415,6 @@ config DRM_PANEL_NOVATEK_NT39016
>>Say Y here if you want to enable support for the panels built
>>around the Novatek NT39016 display controller.
>>  
>> -config DRM_PANEL_MANTIX_MLAF057WE51
>> -tristate "Mantix MLAF057WE51-X MIPI-DSI LCD panel"
>> -depends on OF
>> -depends on DRM_MIPI_DSI
>> -depends on BACKLIGHT_CLASS_DEVICE
>> -help
>> -  Say Y here if you want to enable support for the Mantix
>> -  MLAF057WE51-X MIPI DSI panel as e.g. used in the Librem 5. It
>> -  has a resolution of 720x1440 pixels, a built in backlight and touch
>> -  controller.
>> -
>>  config DRM_PANEL_OLIMEX_LCD_OLINUXINO
>>  tristate "Olimex LCD-OLinuXino panel"
>>  depends on OF
>> @@ -566,6 +532,12 @@ config DRM_PANEL_SAMSUNG_DB7430
>>DB7430 DPI display controller used in such devices as the
>>LMS397KF04 480x800 DPI panel.
>>  
>> +config DRM_PANEL_SAMSUNG_LD9040
>> +tristate "Samsung LD9040 RGB/SPI panel"
>> +depends on OF && SPI
>> +depends on BACKLIGHT_CLASS_DEVICE
>> +select VIDEOMODE_HELPERS
>> +
>>  config DRM_PANEL_SAMSUNG_S6D16D0
>>  tristate "Samsung S6D16D0 DSI video mode panel"
>>  depends on OF
>> @@ -774,6 +746,34 @@ config DRM_PANEL_STARTEK_KD070FHFID015
>>with a resolution of 1024 x 600 pixels. It provides a MIPI DSI 
>> interface to
>>the host, a built-in LED backlight and touch controller.
>>  
>> +config DRM_PANEL_EDP
>> +

Re: [PATCH v5 04/10] drm: bridge: samsung-dsim: complete the CLKLANE_STOP setting

2023-12-07 Thread Frieder Schrempf

On 07.12.23 15:16, Dario Binacchi wrote:
> The patch completes the setting of CLKLANE_STOP for the imx8mn and imx8mp
> platforms (i. e. not exynos).

This also affects i.MX8MM, so better just mention i.MX in general in the
commit message.

> 
> Co-developed-by: Michael Trimarchi 
> Signed-off-by: Michael Trimarchi 
> Signed-off-by: Dario Binacchi 
> ---
> 
> (no changes since v1)
> 
>  drivers/gpu/drm/bridge/samsung-dsim.c | 7 ++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/bridge/samsung-dsim.c 
> b/drivers/gpu/drm/bridge/samsung-dsim.c
> index 15bf05b2bbe4..13f181c99d7e 100644
> --- a/drivers/gpu/drm/bridge/samsung-dsim.c
> +++ b/drivers/gpu/drm/bridge/samsung-dsim.c
> @@ -96,6 +96,7 @@
>  #define DSIM_MFLUSH_VS   BIT(29)
>  /* This flag is valid only for exynos3250/3472/5260/5430 */
>  #define DSIM_CLKLANE_STOPBIT(30)
> +#define DSIM_NON_CONTINUOUS_CLKLANE  BIT(31)
>  
>  /* DSIM_ESCMODE */
>  #define DSIM_TX_TRIGGER_RST  BIT(4)
> @@ -945,8 +946,12 @@ static int samsung_dsim_init_link(struct samsung_dsim 
> *dsi)
>* power consumption.
>*/
>   if (driver_data->has_clklane_stop &&
> - dsi->mode_flags & MIPI_DSI_CLOCK_NON_CONTINUOUS)
> + dsi->mode_flags & MIPI_DSI_CLOCK_NON_CONTINUOUS) {
> + if (!samsung_dsim_hw_is_exynos(dsi->plat_data->hw_type))
> + reg |= DSIM_NON_CONTINUOUS_CLKLANE;
> +
>   reg |= DSIM_CLKLANE_STOP;
> + }

I really wonder what the difference between DSIM_NON_CONTINUOUS_CLKLANE
and DSIM_CLKLANE_STOP is.

If Exynos only has the latter, it's pretty clear what to use. But as
i.MX has both of these bits, should both be set? Or is setting
DSIM_NON_CONTINUOUS_CLKLANE enough and we should leave DSIM_CLKLANE_STOP
alone?

Maybe someone has a clue here. The description of the bits in the RM is:

DSIM_NON_CONTINUOUS_CLKLANE - Non-continuous clock mode
DSIM_CLKLANE_STOP -  PHY clock lane On/Off for ESD

>   samsung_dsim_write(dsi, DSIM_CONFIG_REG, reg);
>  
>   lanes_mask = BIT(dsi->lanes) - 1;
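
For reference, the behaviour of the quoted hunk can be modelled as a small
standalone function. This is only a sketch: the MOCK_* names and the boolean
parameters are hypothetical stand-ins for the driver state, and the bit
positions are the ones from the quoted defines (bit 30 and bit 31).

```c
#include <stdbool.h>
#include <stdint.h>

#define MOCK_BIT(n)                  (UINT32_C(1) << (n))
#define MOCK_CLKLANE_STOP            MOCK_BIT(30)
#define MOCK_NON_CONTINUOUS_CLKLANE  MOCK_BIT(31)

/*
 * Returns the config register value after applying the clock-lane bits.
 * has_clklane_stop: the platform supports stopping the clock lane
 * non_continuous:   the panel asked for non-continuous clock mode
 * is_exynos:        Exynos variants only get CLKLANE_STOP
 */
static uint32_t mock_clklane_bits(uint32_t reg, bool has_clklane_stop,
				  bool non_continuous, bool is_exynos)
{
	if (has_clklane_stop && non_continuous) {
		if (!is_exynos)
			reg |= MOCK_NON_CONTINUOUS_CLKLANE;

		reg |= MOCK_CLKLANE_STOP;
	}

	return reg;
}
```

So with the patch as posted, non-Exynos hardware gets both bits set, which is
exactly the case the question above is about.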

Re: [PATCH 2/2] drm/amdgpu: Enable clear page functionality

2023-12-07 Thread Alex Deucher

On Thu, Dec 7, 2023 at 10:12 AM Arunpravin Paneer Selvam
 wrote:
>
> Add clear page support in vram memory region.
>
> Signed-off-by: Arunpravin Paneer Selvam 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_object.c| 13 +++--
>  .../gpu/drm/amd/amdgpu/amdgpu_res_cursor.h| 25 ++
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c   | 50 +++
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h   |  4 ++
>  drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c  | 14 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.h  |  5 ++
>  6 files changed, 105 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> index cef920a93924..bc4ea87f8b5e 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> @@ -39,6 +39,7 @@
>  #include "amdgpu.h"
>  #include "amdgpu_trace.h"
>  #include "amdgpu_amdkfd.h"
> +#include "amdgpu_vram_mgr.h"
>
>  /**
>   * DOC: amdgpu_object
> @@ -629,15 +630,17 @@ int amdgpu_bo_create(struct amdgpu_device *adev,
>
> if (bp->flags & AMDGPU_GEM_CREATE_VRAM_CLEARED &&
> bo->tbo.resource->mem_type == TTM_PL_VRAM) {
> -   struct dma_fence *fence;
> +   struct dma_fence *fence = NULL;
>
> -   r = amdgpu_fill_buffer(bo, 0, bo->tbo.base.resv, &fence, true);
> +   r = amdgpu_clear_buffer(bo, bo->tbo.base.resv, &fence, true);
> if (unlikely(r))
> goto fail_unreserve;
>
> -   dma_resv_add_fence(bo->tbo.base.resv, fence,
> -  DMA_RESV_USAGE_KERNEL);
> -   dma_fence_put(fence);
> +   if (fence) {
> +   dma_resv_add_fence(bo->tbo.base.resv, fence,
> +  DMA_RESV_USAGE_KERNEL);
> +   dma_fence_put(fence);
> +   }
> }
> if (!bp->resv)
> amdgpu_bo_unreserve(bo);
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_res_cursor.h 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_res_cursor.h
> index 381101d2bf05..50fcd86e1033 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_res_cursor.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_res_cursor.h
> @@ -164,4 +164,29 @@ static inline void amdgpu_res_next(struct 
> amdgpu_res_cursor *cur, uint64_t size)
> }
>  }
>
> +/**
> + * amdgpu_res_cleared - check if blocks are cleared
> + *
> + * @cur: the cursor to extract the block
> + *
> + * Check if the @cur block is cleared
> + */
> +static inline bool amdgpu_res_cleared(struct amdgpu_res_cursor *cur)
> +{
> +   struct drm_buddy_block *block;
> +
> +   switch (cur->mem_type) {
> +   case TTM_PL_VRAM:
> +   block = cur->node;
> +
> +   if (!amdgpu_vram_mgr_is_cleared(block))
> +   return false;
> +   break;
> +   default:
> +   return false;
> +   }
> +
> +   return true;
> +}
> +
>  #endif
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> index 05991c5c8ddb..6d7514e8f40c 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> @@ -,6 +,56 @@ static int amdgpu_ttm_fill_mem(struct amdgpu_ring 
> *ring, uint32_t src_data,
> return 0;
>  }
>
> +int amdgpu_clear_buffer(struct amdgpu_bo *bo,

amdgpu_ttm_clear_buffer() for naming consistency.

Alex

> +   struct dma_resv *resv,
> +   struct dma_fence **fence,
> +   bool delayed)
> +{
> +   struct amdgpu_device *adev = amdgpu_ttm_adev(bo->tbo.bdev);
> +   struct amdgpu_ring *ring = adev->mman.buffer_funcs_ring;
> +   struct amdgpu_res_cursor cursor;
> +   struct dma_fence *f = NULL;
> +   u64 addr;
> +   int r;
> +
> +   if (!adev->mman.buffer_funcs_enabled)
> +   return -EINVAL;
> +
> +   amdgpu_res_first(bo->tbo.resource, 0, amdgpu_bo_size(bo), &cursor);
> +
> +   mutex_lock(&adev->mman.gtt_window_lock);
> +   while (cursor.remaining) {
> +   struct dma_fence *next = NULL;
> +   u64 size;
> +
> +   /* Never clear more than 256MiB at once to avoid timeouts */
> +   size = min(cursor.size, 256ULL << 20);
> +
> +   if (!amdgpu_res_cleared(&cursor)) {
> +   r = amdgpu_ttm_map_buffer(&bo->tbo, bo->tbo.resource, &cursor,
> + 1, ring, false, &size, &addr);
> +   if (r)
> +   goto err;
> +
> +   r = amdgpu_ttm_fill_mem(ring, 0, addr, size, resv,
> +   &next, true, delayed);
> +   if (r)
> +   goto err;
> +   }
> +   dma_fence_put(f);
> +   f = next;
> +
> +
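
The loop quoted above combines two ideas: skip blocks that are already marked
cleared, and never submit more than 256 MiB in one fill. A simplified
standalone sketch of that pattern follows. It is not the driver code:
mock_block and mock_clear_blocks() are hypothetical, and the fill is reduced
to a counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MOCK_MAX_CHUNK (UINT64_C(256) << 20) /* 256 MiB, as in the quoted comment */

/* Hypothetical stand-in for one allocated block and its cleared state. */
struct mock_block {
	uint64_t size;
	bool cleared;
};

/*
 * Walk all blocks, skip the ones already cleared, and split the rest into
 * chunks of at most MOCK_MAX_CHUNK bytes. Returns the number of fill
 * operations that would have been submitted.
 */
static unsigned int mock_clear_blocks(struct mock_block *blocks, size_t n)
{
	unsigned int fills = 0;

	assert(blocks != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		uint64_t remaining = blocks[i].size;

		if (blocks[i].cleared)
			continue;

		while (remaining) {
			uint64_t chunk = remaining < MOCK_MAX_CHUNK ?
					 remaining : MOCK_MAX_CHUNK;

			/* a real driver would submit a fill job for chunk bytes here */
			fills++;
			remaining -= chunk;
		}
		blocks[i].cleared = true;
	}

	return fills;
}
```

For example, a 600 MiB dirty block needs three fills (256 + 256 + 88 MiB),
while a block that is already cleared needs none.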

Re: [PATCH] drm/bridge: samsung-dsim: check the return value only if necessary

2023-12-07 Thread Frieder Schrempf

On 07.12.23 17:10, Dario Binacchi wrote:
> It was useless to check the "ret" variable again if the function
> register_host() was not called.
> 
> Signed-off-by: Dario Binacchi 

Reviewed-by: Frieder Schrempf 

> ---
> 
>  drivers/gpu/drm/bridge/samsung-dsim.c | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/bridge/samsung-dsim.c 
> b/drivers/gpu/drm/bridge/samsung-dsim.c
> index be5914caa17d..98cd589e4427 100644
> --- a/drivers/gpu/drm/bridge/samsung-dsim.c
> +++ b/drivers/gpu/drm/bridge/samsung-dsim.c
> @@ -2020,11 +2020,11 @@ int samsung_dsim_probe(struct platform_device *pdev)
>   else
>   dsi->bridge.timings = &samsung_dsim_bridge_timings_de_high;
>  
> - if (dsi->plat_data->host_ops && dsi->plat_data->host_ops->register_host)
> + if (dsi->plat_data->host_ops && 
> dsi->plat_data->host_ops->register_host) {
>   ret = dsi->plat_data->host_ops->register_host(dsi);
> -
> - if (ret)
> - goto err_disable_runtime;
> + if (ret)
> + goto err_disable_runtime;
> + }
>  
>   return 0;
>
