Re: [PATCH 07/13] drm/i915/guc: New definition of the CTB registration action

2021-06-09 Thread Matthew Brost
On Wed, Jun 09, 2021 at 10:07:21PM +0200, Michal Wajdeczko wrote:
> 
> 
> On 09.06.2021 19:36, John Harrison wrote:
> > On 6/7/2021 18:23, Daniele Ceraolo Spurio wrote:
> >> On 6/7/2021 11:03 AM, Matthew Brost wrote:
> >>> From: Michal Wajdeczko 
> >>>
> >>> Definition of the CTB registration action has changed.
> >>> Add some ABI documentation and implement required changes.
> >>>
> >>> Signed-off-by: Michal Wajdeczko 
> >>> Signed-off-by: Matthew Brost 
> >>> Cc: Piotr Piórkowski  #4
> >>> ---
> >>>   .../gpu/drm/i915/gt/uc/abi/guc_actions_abi.h  | 107 ++
> >>>   .../gt/uc/abi/guc_communication_ctb_abi.h |   4 -
> >>>   drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c |  76 -
> >>>   3 files changed, 152 insertions(+), 35 deletions(-)
> >>>
> >>> diff --git a/drivers/gpu/drm/i915/gt/uc/abi/guc_actions_abi.h
> >>> b/drivers/gpu/drm/i915/gt/uc/abi/guc_actions_abi.h
> >>> index 90efef8a73e4..6426fc183692 100644
> >>> --- a/drivers/gpu/drm/i915/gt/uc/abi/guc_actions_abi.h
> >>> +++ b/drivers/gpu/drm/i915/gt/uc/abi/guc_actions_abi.h
> >>> @@ -6,6 +6,113 @@
> >>>   #ifndef _ABI_GUC_ACTIONS_ABI_H
> >>>   #define _ABI_GUC_ACTIONS_ABI_H
> >>>   +/**
> >>> + * DOC: HOST2GUC_REGISTER_CTB
> >>> + *
> >>> + * This message is used as part of the `CTB based communication`_
> >>> setup.
> >>> + *
> >>> + * This message must be sent as `MMIO HXG Message`_.
> >>> + *
> >>> + *
> >>> +---+---+--+
> >>>
> >>> + *  |   | Bits  |
> >>> Description  |
> >>> + *
> >>> +===+===+==+
> >>>
> >>> + *  | 0 |    31 | ORIGIN =
> >>> GUC_HXG_ORIGIN_HOST_    |
> >>> + *  |
> >>> +---+--+
> >>> + *  |   | 30:28 | TYPE =
> >>> GUC_HXG_TYPE_REQUEST_ |
> >>> + *  |
> >>> +---+--+
> >>> + *  |   | 27:16 | DATA0 =
> >>> MBZ  |
> >>> + *  |
> >>> +---+--+
> >>> + *  |   |  15:0 | ACTION = _`GUC_ACTION_HOST2GUC_REGISTER_CTB` =
> >>> 0x5200    |
> >>
> >> Specs says 4505
> >>
> >>> + *
> >>> +---+---+--+
> >>>
> >>> + *  | 1 | 31:12 | RESERVED =
> >>> MBZ   |
> >>> + *  |
> >>> +---+--+
> >>> + *  |   |  11:8 | **TYPE** - type for the `CT
> >>> Buffer`_ |
> >>> + *  |   |
> >>> |  |
> >>> + *  |   |   |   - _`GUC_CTB_TYPE_HOST2GUC` =
> >>> 0 |
> >>> + *  |   |   |   - _`GUC_CTB_TYPE_GUC2HOST` =
> >>> 1 |
> >>> + *  |
> >>> +---+--+
> >>> + *  |   |   7:0 | **SIZE** - size of the `CT Buffer`_ in 4K units
> >>> minus 1  |
> >>> + *
> >>> +---+---+--+
> >>>
> >>> + *  | 2 |  31:0 | **DESC_ADDR** - GGTT address of the `CTB
> >>> Descriptor`_    |
> >>> + *
> >>> +---+---+--+
> >>>
> >>> + *  | 3 |  31:0 | **BUFF_ADDR** - GGTT address of the `CT
> >>> Buffer`_ |
> >>> + *
> >>> +---+---+--+
> >>>
> >>> +*
> >>> + *
> >>> +---+---+--+
> >>>
> >>> + *  |   | Bits  |
> >>> Description  |
> >>> + *
> >>> +===+===+==+
> >>>
> >>> + *  | 0 |    31 | ORIGIN =
> >>> GUC_HXG_ORIGIN_GUC_ |
> >>> + *  |
> >>> +---+--+
> >>> + *  |   | 30:28 | TYPE =
> >>> GUC_HXG_TYPE_RESPONSE_SUCCESS_    |
> >>> + *  |
> >>> +---+--+
> >>> + *  |   |  27:0 | DATA0 =
> >>> MBZ  |
> >>> + *
> >>> +---+---+--+
> >>>
> >>> + */
> >>> +#define GUC_ACTION_HOST2GUC_REGISTER_CTB    0x4505 // FIXME 0x5200
> >>
> >> Why FIXME? AFAICS the specs still says 4505, even if we plan to update
> >> at some point I don't think this deserves a FIXME since nothing is
> >> incorrect.
> >>
> >>> +
> >>> +#define HOST2GUC_REGISTER_CTB_REQUEST_MSG_LEN
> >>> (GUC_HXG_REQUEST_MSG_MIN_LEN + 3u)
> >>> +#define HOST2GUC_REGISTER_CTB_REQUEST_MSG_0_MBZ
> >>> 

Re: [Intel-gfx] [PATCH 00/13] Update firmware to v62.0.0

2021-06-09 Thread Matthew Brost
On Wed, Jun 09, 2021 at 09:36:36PM -0700, Matthew Brost wrote:
> As part of enabling GuC submission [1] we need to update to the latest
> and greatest firmware. This series does that. This is a destructive
> change. e.g. Without all the patches in this series it will break the
> i915 driver. As such, after we review most of these patches they will
> squashed into a single patch for merging.
> 
> v2: Address comments, looking for remaining RBs so patches can be
> squashed and sent for CI
> 

Ugh, forgot to include some RBs in this rev. Just looking for RBs 1-2,
and 6-8 in this rev.

Matt

> Signed-off-by: Matthew Brost 
> 
> [1] https://patchwork.freedesktop.org/series/89844/
> 
> John Harrison (3):
>   drm/i915/guc: Support per context scheduling policies
>   drm/i915/guc: Unified GuC log
>   drm/i915/guc: Update firmware to v62.0.0
> 
> Michal Wajdeczko (10):
>   drm/i915/guc: Introduce unified HXG messages
>   drm/i915/guc: Update MMIO based communication
>   drm/i915/guc: Update CTB response status definition
>   drm/i915/guc: Add flag for mark broken CTB
>   drm/i915/guc: New definition of the CTB descriptor
>   drm/i915/guc: New definition of the CTB registration action
>   drm/i915/guc: New CTB based communication
>   drm/i915/doc: Include GuC ABI documentation
>   drm/i915/guc: Kill guc_clients.ct_pool
>   drm/i915/guc: Kill ads.client_info
> 
>  Documentation/gpu/i915.rst|   8 +
>  .../gpu/drm/i915/gt/uc/abi/guc_actions_abi.h  | 107 ++
>  .../gt/uc/abi/guc_communication_ctb_abi.h | 128 +--
>  .../gt/uc/abi/guc_communication_mmio_abi.h|  65 ++--
>  .../gpu/drm/i915/gt/uc/abi/guc_messages_abi.h | 213 +++
>  drivers/gpu/drm/i915/gt/uc/intel_guc.c| 107 --
>  drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c|  45 +--
>  drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 356 +-
>  drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h |   6 +-
>  drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h   |  75 +---
>  drivers/gpu/drm/i915/gt/uc/intel_guc_log.c|  29 +-
>  drivers/gpu/drm/i915/gt/uc/intel_guc_log.h|   6 +-
>  drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c  |  26 +-
>  13 files changed, 750 insertions(+), 421 deletions(-)
> 
> -- 
> 2.28.0
> 
> ___
> Intel-gfx mailing list
> intel-...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[PATCH 13/13] drm/i915/guc: Update firmware to v62.0.0

2021-06-09 Thread Matthew Brost
From: John Harrison 

Signed-off-by: John Harrison 
Signed-off-by: Michal Wajdeczko 
Signed-off-by: Matthew Brost 
---
 drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c | 26 
 1 file changed, 13 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c 
b/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
index df647c9a8d56..9f23e9de3237 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
@@ -48,19 +48,19 @@ void intel_uc_fw_change_status(struct intel_uc_fw *uc_fw,
  * firmware as TGL.
  */
 #define INTEL_UC_FIRMWARE_DEFS(fw_def, guc_def, huc_def) \
-   fw_def(ALDERLAKE_S, 0, guc_def(tgl, 49, 0, 1), huc_def(tgl,  7, 5, 0)) \
-   fw_def(ROCKETLAKE,  0, guc_def(tgl, 49, 0, 1), huc_def(tgl,  7, 5, 0)) \
-   fw_def(TIGERLAKE,   0, guc_def(tgl, 49, 0, 1), huc_def(tgl,  7, 5, 0)) \
-   fw_def(JASPERLAKE,  0, guc_def(ehl, 49, 0, 1), huc_def(ehl,  9, 0, 0)) \
-   fw_def(ELKHARTLAKE, 0, guc_def(ehl, 49, 0, 1), huc_def(ehl,  9, 0, 0)) \
-   fw_def(ICELAKE, 0, guc_def(icl, 49, 0, 1), huc_def(icl,  9, 0, 0)) \
-   fw_def(COMETLAKE,   5, guc_def(cml, 49, 0, 1), huc_def(cml,  4, 0, 0)) \
-   fw_def(COMETLAKE,   0, guc_def(kbl, 49, 0, 1), huc_def(kbl,  4, 0, 0)) \
-   fw_def(COFFEELAKE,  0, guc_def(kbl, 49, 0, 1), huc_def(kbl,  4, 0, 0)) \
-   fw_def(GEMINILAKE,  0, guc_def(glk, 49, 0, 1), huc_def(glk,  4, 0, 0)) \
-   fw_def(KABYLAKE,0, guc_def(kbl, 49, 0, 1), huc_def(kbl,  4, 0, 0)) \
-   fw_def(BROXTON, 0, guc_def(bxt, 49, 0, 1), huc_def(bxt,  2, 0, 0)) \
-   fw_def(SKYLAKE, 0, guc_def(skl, 49, 0, 1), huc_def(skl,  2, 0, 0))
+   fw_def(ALDERLAKE_S, 0, guc_def(tgl, 62, 0, 0), huc_def(tgl,  7, 5, 0)) \
+   fw_def(ROCKETLAKE,  0, guc_def(tgl, 62, 0, 0), huc_def(tgl,  7, 5, 0)) \
+   fw_def(TIGERLAKE,   0, guc_def(tgl, 62, 0, 0), huc_def(tgl,  7, 5, 0)) \
+   fw_def(JASPERLAKE,  0, guc_def(ehl, 62, 0, 0), huc_def(ehl,  9, 0, 0)) \
+   fw_def(ELKHARTLAKE, 0, guc_def(ehl, 62, 0, 0), huc_def(ehl,  9, 0, 0)) \
+   fw_def(ICELAKE, 0, guc_def(icl, 62, 0, 0), huc_def(icl,  9, 0, 0)) \
+   fw_def(COMETLAKE,   5, guc_def(cml, 62, 0, 0), huc_def(cml,  4, 0, 0)) \
+   fw_def(COMETLAKE,   0, guc_def(kbl, 62, 0, 0), huc_def(kbl,  4, 0, 0)) \
+   fw_def(COFFEELAKE,  0, guc_def(kbl, 62, 0, 0), huc_def(kbl,  4, 0, 0)) \
+   fw_def(GEMINILAKE,  0, guc_def(glk, 62, 0, 0), huc_def(glk,  4, 0, 0)) \
+   fw_def(KABYLAKE,0, guc_def(kbl, 62, 0, 0), huc_def(kbl,  4, 0, 0)) \
+   fw_def(BROXTON, 0, guc_def(bxt, 62, 0, 0), huc_def(bxt,  2, 0, 0)) \
+   fw_def(SKYLAKE, 0, guc_def(skl, 62, 0, 0), huc_def(skl,  2, 0, 0))
 
 #define __MAKE_UC_FW_PATH(prefix_, name_, major_, minor_, patch_) \
"i915/" \
-- 
2.28.0



[PATCH 06/13] drm/i915/guc: New definition of the CTB descriptor

2021-06-09 Thread Matthew Brost
From: Michal Wajdeczko 

Definition of the CTB descriptor has changed, leaving only
minimal shared fields like HEAD/TAIL/STATUS.

Both HEAD and TAIL are now in dwords.

Add some ABI documentation and implement required changes.

v2:
 (Daniele)
  - Drop GUC_CTB_STATUS_NO_BACKCHANNEL, GUC_CTB_STATUS_MALFORMED_MSG

Signed-off-by: Michal Wajdeczko 
Signed-off-by: Matthew Brost 
---
 .../gt/uc/abi/guc_communication_ctb_abi.h | 68 +-
 drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 70 +--
 drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h |  2 +-
 3 files changed, 83 insertions(+), 57 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/abi/guc_communication_ctb_abi.h 
b/drivers/gpu/drm/i915/gt/uc/abi/guc_communication_ctb_abi.h
index d38935f47ecf..88f1fc2a19e0 100644
--- a/drivers/gpu/drm/i915/gt/uc/abi/guc_communication_ctb_abi.h
+++ b/drivers/gpu/drm/i915/gt/uc/abi/guc_communication_ctb_abi.h
@@ -7,6 +7,56 @@
 #define _ABI_GUC_COMMUNICATION_CTB_ABI_H
 
 #include 
+#include 
+
+#include "guc_messages_abi.h"
+
+/**
+ * DOC: CT Buffer
+ *
+ * TBD
+ */
+
+/**
+ * DOC: CTB Descriptor
+ *
+ *  
+---+---+--+
+ *  |   | Bits  | Description  
|
+ *  
+===+===+==+
+ *  | 0 |  31:0 | **HEAD** - offset (in dwords) to the last dword that was 
|
+ *  |   |   | read from the `CT Buffer`_.  
|
+ *  |   |   | It can only be updated by the receiver.  
|
+ *  
+---+---+--+
+ *  | 1 |  31:0 | **TAIL** - offset (in dwords) to the last dword that was 
|
+ *  |   |   | written to the `CT Buffer`_. 
|
+ *  |   |   | It can only be updated by the sender.
|
+ *  
+---+---+--+
+ *  | 2 |  31:0 | **STATUS** - status of the CTB   
|
+ *  |   |   |  
|
+ *  |   |   |   - _`GUC_CTB_STATUS_NO_ERROR` = 0 (normal operation)
|
+ *  |   |   |   - _`GUC_CTB_STATUS_OVERFLOW` = 1 (head/tail too large) 
|
+ *  |   |   |   - _`GUC_CTB_STATUS_UNDERFLOW` = 2 (truncated message)  
|
+ *  |   |   |   - _`GUC_CTB_STATUS_MISMATCH` = 4 (head/tail modified)  
|
+ *  
+---+---+--+
+ *  |...|   | RESERVED = MBZ   
|
+ *  
+---+---+--+
+ *  | 15|  31:0 | RESERVED = MBZ   
|
+ *  
+---+---+--+
+ */
+
+struct guc_ct_buffer_desc {
+   u32 head;
+   u32 tail;
+   u32 status;
+#define GUC_CTB_STATUS_NO_ERROR0
+#define GUC_CTB_STATUS_OVERFLOW(1 << 0)
+#define GUC_CTB_STATUS_UNDERFLOW   (1 << 1)
+#define GUC_CTB_STATUS_MISMATCH(1 << 2)
+#define GUC_CTB_STATUS_NO_BACKCHANNEL  (1 << 3)
+#define GUC_CTB_STATUS_MALFORMED_MSG   (1 << 4)
+   u32 reserved[13];
+} __packed;
+static_assert(sizeof(struct guc_ct_buffer_desc) == 64);
 
 /**
  * DOC: CTB based communication
@@ -60,24 +110,6 @@
  * - **flags**, holds various bits to control message handling
  */
 
-/*
- * Describes single command transport buffer.
- * Used by both guc-master and clients.
- */
-struct guc_ct_buffer_desc {
-   u32 addr;   /* gfx address */
-   u64 host_private;   /* host private data */
-   u32 size;   /* size in bytes */
-   u32 head;   /* offset updated by GuC*/
-   u32 tail;   /* offset updated by owner */
-   u32 is_in_error;/* error indicator */
-   u32 reserved1;
-   u32 reserved2;
-   u32 owner;  /* id of the channel owner */
-   u32 owner_sub_id;   /* owner-defined field for extra tracking */
-   u32 reserved[5];
-} __packed;
-
 /* Type of command transport buffer */
 #define INTEL_GUC_CT_BUFFER_TYPE_SEND  0x0u
 #define INTEL_GUC_CT_BUFFER_TYPE_RECV  0x1u
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
index 63056ea0631e..3241a477196f 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
@@ -112,32 +112,28 @@ static inline const char *guc_ct_buffer_type_to_str(u32 
type)
}
 }
 
-static void guc_ct_buffer_desc_init(struct guc_ct_buffer_desc *desc,
-   u32 cmds_addr, u32 size)
+static void 

[PATCH 12/13] drm/i915/guc: Unified GuC log

2021-06-09 Thread Matthew Brost
From: John Harrison 

GuC v57 unified the 'DPC' and 'ISR' buffers into a single buffer with
the option for it to be larger.

Signed-off-by: Matthew Brost 
Signed-off-by: John Harrison 
Cc: Alan Previn 
---
 drivers/gpu/drm/i915/gt/uc/intel_guc.c  | 15 ---
 drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h |  9 +++
 drivers/gpu/drm/i915/gt/uc/intel_guc_log.c  | 29 +++--
 drivers/gpu/drm/i915/gt/uc/intel_guc_log.h  |  6 ++---
 4 files changed, 20 insertions(+), 39 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
index b773567cb080..6661dcb02239 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
@@ -219,24 +219,19 @@ static u32 guc_ctl_log_params_flags(struct intel_guc *guc)
 
BUILD_BUG_ON(!CRASH_BUFFER_SIZE);
BUILD_BUG_ON(!IS_ALIGNED(CRASH_BUFFER_SIZE, UNIT));
-   BUILD_BUG_ON(!DPC_BUFFER_SIZE);
-   BUILD_BUG_ON(!IS_ALIGNED(DPC_BUFFER_SIZE, UNIT));
-   BUILD_BUG_ON(!ISR_BUFFER_SIZE);
-   BUILD_BUG_ON(!IS_ALIGNED(ISR_BUFFER_SIZE, UNIT));
+   BUILD_BUG_ON(!DEBUG_BUFFER_SIZE);
+   BUILD_BUG_ON(!IS_ALIGNED(DEBUG_BUFFER_SIZE, UNIT));
 
BUILD_BUG_ON((CRASH_BUFFER_SIZE / UNIT - 1) >
(GUC_LOG_CRASH_MASK >> GUC_LOG_CRASH_SHIFT));
-   BUILD_BUG_ON((DPC_BUFFER_SIZE / UNIT - 1) >
-   (GUC_LOG_DPC_MASK >> GUC_LOG_DPC_SHIFT));
-   BUILD_BUG_ON((ISR_BUFFER_SIZE / UNIT - 1) >
-   (GUC_LOG_ISR_MASK >> GUC_LOG_ISR_SHIFT));
+   BUILD_BUG_ON((DEBUG_BUFFER_SIZE / UNIT - 1) >
+   (GUC_LOG_DEBUG_MASK >> GUC_LOG_DEBUG_SHIFT));
 
flags = GUC_LOG_VALID |
GUC_LOG_NOTIFY_ON_HALF_FULL |
FLAG |
((CRASH_BUFFER_SIZE / UNIT - 1) << GUC_LOG_CRASH_SHIFT) |
-   ((DPC_BUFFER_SIZE / UNIT - 1) << GUC_LOG_DPC_SHIFT) |
-   ((ISR_BUFFER_SIZE / UNIT - 1) << GUC_LOG_ISR_SHIFT) |
+   ((DEBUG_BUFFER_SIZE / UNIT - 1) << GUC_LOG_DEBUG_SHIFT) |
(offset << GUC_LOG_BUF_ADDR_SHIFT);
 
#undef UNIT
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h
index f2df5c11c11d..617ec601648d 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h
@@ -81,10 +81,8 @@
 #define   GUC_LOG_ALLOC_IN_MEGABYTE(1 << 3)
 #define   GUC_LOG_CRASH_SHIFT  4
 #define   GUC_LOG_CRASH_MASK   (0x3 << GUC_LOG_CRASH_SHIFT)
-#define   GUC_LOG_DPC_SHIFT6
-#define   GUC_LOG_DPC_MASK (0x7 << GUC_LOG_DPC_SHIFT)
-#define   GUC_LOG_ISR_SHIFT9
-#define   GUC_LOG_ISR_MASK (0x7 << GUC_LOG_ISR_SHIFT)
+#define   GUC_LOG_DEBUG_SHIFT  6
+#define   GUC_LOG_DEBUG_MASK   (0xF << GUC_LOG_DEBUG_SHIFT)
 #define   GUC_LOG_BUF_ADDR_SHIFT   12
 
 #define GUC_CTL_WA 1
@@ -311,8 +309,7 @@ struct guc_ads {
 /* GuC logging structures */
 
 enum guc_log_buffer_type {
-   GUC_ISR_LOG_BUFFER,
-   GUC_DPC_LOG_BUFFER,
+   GUC_DEBUG_LOG_BUFFER,
GUC_CRASH_DUMP_LOG_BUFFER,
GUC_MAX_LOG_BUFFER
 };
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_log.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_log.c
index c36d5eb5bbb9..ac0931f0374b 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_log.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_log.c
@@ -197,10 +197,8 @@ static bool guc_check_log_buf_overflow(struct 
intel_guc_log *log,
 static unsigned int guc_get_log_buffer_size(enum guc_log_buffer_type type)
 {
switch (type) {
-   case GUC_ISR_LOG_BUFFER:
-   return ISR_BUFFER_SIZE;
-   case GUC_DPC_LOG_BUFFER:
-   return DPC_BUFFER_SIZE;
+   case GUC_DEBUG_LOG_BUFFER:
+   return DEBUG_BUFFER_SIZE;
case GUC_CRASH_DUMP_LOG_BUFFER:
return CRASH_BUFFER_SIZE;
default:
@@ -245,7 +243,7 @@ static void guc_read_update_log_buffer(struct intel_guc_log 
*log)
src_data += PAGE_SIZE;
dst_data += PAGE_SIZE;
 
-   for (type = GUC_ISR_LOG_BUFFER; type < GUC_MAX_LOG_BUFFER; type++) {
+   for (type = GUC_DEBUG_LOG_BUFFER; type < GUC_MAX_LOG_BUFFER; type++) {
/*
 * Make a copy of the state structure, inside GuC log buffer
 * (which is uncached mapped), on the stack to avoid reading
@@ -463,21 +461,16 @@ int intel_guc_log_create(struct intel_guc_log *log)
 *  +===+ 00B
 *  |Crash dump state header|
 *  +---+ 32B
-*  |   DPC state header|
+*  |  Debug state header   |
 *  +---+ 64B
-*  |   ISR state header|
-*  +---+ 96B
 *  |   |
 

[PATCH 05/13] drm/i915/guc: Add flag for mark broken CTB

2021-06-09 Thread Matthew Brost
From: Michal Wajdeczko 

Once the CTB descriptor is found in an error state, whether set by the
GuC or by us, there is no need to continue checking the descriptor any
more; we can rely on our internal flag.

Signed-off-by: Matthew Brost 
Signed-off-by: Michal Wajdeczko 
Cc: Piotr Piórkowski 
---
 drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 13 +++--
 drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h |  2 ++
 2 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
index 3f7f48611487..63056ea0631e 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
@@ -123,6 +123,7 @@ static void guc_ct_buffer_desc_init(struct 
guc_ct_buffer_desc *desc,
 
 static void guc_ct_buffer_reset(struct intel_guc_ct_buffer *ctb, u32 cmds_addr)
 {
+   ctb->broken = false;
guc_ct_buffer_desc_init(ctb->desc, cmds_addr, ctb->size);
 }
 
@@ -387,9 +388,12 @@ static int ct_write(struct intel_guc_ct *ct,
u32 *cmds = ctb->cmds;
unsigned int i;
 
-   if (unlikely(desc->is_in_error))
+   if (unlikely(ctb->broken))
return -EPIPE;
 
+   if (unlikely(desc->is_in_error))
+   goto corrupted;
+
if (unlikely(!IS_ALIGNED(head | tail, 4) ||
 (tail | head) >= size))
goto corrupted;
@@ -451,6 +455,7 @@ static int ct_write(struct intel_guc_ct *ct,
CT_ERROR(ct, "Corrupted descriptor addr=%#x head=%u tail=%u size=%u\n",
 desc->addr, desc->head, desc->tail, desc->size);
desc->is_in_error = 1;
+   ctb->broken = true;
return -EPIPE;
 }
 
@@ -632,9 +637,12 @@ static int ct_read(struct intel_guc_ct *ct, struct 
ct_incoming_msg **msg)
unsigned int i;
u32 header;
 
-   if (unlikely(desc->is_in_error))
+   if (unlikely(ctb->broken))
return -EPIPE;
 
+   if (unlikely(desc->is_in_error))
+   goto corrupted;
+
if (unlikely(!IS_ALIGNED(head | tail, 4) ||
 (tail | head) >= size))
goto corrupted;
@@ -698,6 +706,7 @@ static int ct_read(struct intel_guc_ct *ct, struct 
ct_incoming_msg **msg)
CT_ERROR(ct, "Corrupted descriptor addr=%#x head=%u tail=%u size=%u\n",
 desc->addr, desc->head, desc->tail, desc->size);
desc->is_in_error = 1;
+   ctb->broken = true;
return -EPIPE;
 }
 
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h
index cb222f202301..7d3cd375d6a7 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h
@@ -32,12 +32,14 @@ struct intel_guc;
  * @desc: pointer to the buffer descriptor
  * @cmds: pointer to the commands buffer
  * @size: size of the commands buffer
+ * @broken: flag to indicate if descriptor data is broken
  */
 struct intel_guc_ct_buffer {
spinlock_t lock;
struct guc_ct_buffer_desc *desc;
u32 *cmds;
u32 size;
+   bool broken;
 };
 
 
-- 
2.28.0



[PATCH 10/13] drm/i915/guc: Kill guc_clients.ct_pool

2021-06-09 Thread Matthew Brost
From: Michal Wajdeczko 

CTB pool is now maintained internally by the GuC as part of its
"private data". No need to allocate separate buffer and pass it
to GuC as yet another ADS.

Signed-off-by: Matthew Brost  #v4
Signed-off-by: Michal Wajdeczko 
Cc: Janusz Krzysztofik 
---
 drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c  | 12 
 drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h | 12 +---
 2 files changed, 1 insertion(+), 23 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
index 4fcbe4b921f9..6e26fe04ce92 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
@@ -26,8 +26,6 @@
  *  +---+
  *  | guc_clients_info  |
  *  +---+
- *  | guc_ct_pool_entry[size]   |
- *  +---+
  *  | padding   |
  *  +---+ <== 4K aligned
  *  | private data  |
@@ -40,7 +38,6 @@ struct __guc_ads_blob {
struct guc_policies policies;
struct guc_gt_system_info system_info;
struct guc_clients_info clients_info;
-   struct guc_ct_pool_entry ct_pool[GUC_CT_POOL_SIZE];
 } __packed;
 
 static u32 guc_ads_private_data_size(struct intel_guc *guc)
@@ -68,11 +65,6 @@ static void guc_policies_init(struct guc_policies *policies)
policies->is_valid = 1;
 }
 
-static void guc_ct_pool_entries_init(struct guc_ct_pool_entry *pool, u32 num)
-{
-   memset(pool, 0, num * sizeof(*pool));
-}
-
 static void guc_mapping_table_init(struct intel_gt *gt,
   struct guc_gt_system_info *system_info)
 {
@@ -161,11 +153,7 @@ static void __guc_ads_init(struct intel_guc *guc)
base = intel_guc_ggtt_offset(guc, guc->ads_vma);
 
/* Clients info  */
-   guc_ct_pool_entries_init(blob->ct_pool, ARRAY_SIZE(blob->ct_pool));
-
blob->clients_info.clients_num = 1;
-   blob->clients_info.ct_pool_addr = base + ptr_offset(blob, ct_pool);
-   blob->clients_info.ct_pool_count = ARRAY_SIZE(blob->ct_pool);
 
/* ADS */
blob->ads.scheduler_policies = base + ptr_offset(blob, policies);
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h
index 251c3836bd2c..2266444d074f 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h
@@ -295,19 +295,9 @@ struct guc_gt_system_info {
 } __packed;
 
 /* Clients info */
-struct guc_ct_pool_entry {
-   struct guc_ct_buffer_desc desc;
-   u32 reserved[7];
-} __packed;
-
-#define GUC_CT_POOL_SIZE   2
-
 struct guc_clients_info {
u32 clients_num;
-   u32 reserved0[13];
-   u32 ct_pool_addr;
-   u32 ct_pool_count;
-   u32 reserved[4];
+   u32 reserved[19];
 } __packed;
 
 /* GuC Additional Data Struct */
-- 
2.28.0



[PATCH 09/13] drm/i915/doc: Include GuC ABI documentation

2021-06-09 Thread Matthew Brost
From: Michal Wajdeczko 

GuC ABI documentation is now ready to be included in i915.rst

Signed-off-by: Michal Wajdeczko 
Signed-off-by: Matthew Brost 
Cc: Piotr Piórkowski 
---
 Documentation/gpu/i915.rst | 8 
 1 file changed, 8 insertions(+)

diff --git a/Documentation/gpu/i915.rst b/Documentation/gpu/i915.rst
index 42ce0196930a..c7846b1d9293 100644
--- a/Documentation/gpu/i915.rst
+++ b/Documentation/gpu/i915.rst
@@ -518,6 +518,14 @@ GuC-based command submission
 .. kernel-doc:: drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
:doc: GuC-based command submission
 
+GuC ABI
+
+
+.. kernel-doc:: drivers/gpu/drm/i915/gt/uc/abi/guc_messages_abi.h
+.. kernel-doc:: drivers/gpu/drm/i915/gt/uc/abi/guc_communication_mmio_abi.h
+.. kernel-doc:: drivers/gpu/drm/i915/gt/uc/abi/guc_communication_ctb_abi.h
+.. kernel-doc:: drivers/gpu/drm/i915/gt/uc/abi/guc_actions_abi.h
+
 HuC
 ---
 .. kernel-doc:: drivers/gpu/drm/i915/gt/uc/intel_huc.c
-- 
2.28.0



[PATCH 11/13] drm/i915/guc: Kill ads.client_info

2021-06-09 Thread Matthew Brost
From: Michal Wajdeczko 

New GuC does not require it any more.

Reviewed-by: Matthew Brost 
Signed-off-by: Michal Wajdeczko 
Cc: Piotr Piórkowski 
---
 drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c  | 7 ---
 drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h | 8 +---
 2 files changed, 1 insertion(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
index 6e26fe04ce92..b82145652d57 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
@@ -24,8 +24,6 @@
  *  +---+
  *  | guc_gt_system_info|
  *  +---+
- *  | guc_clients_info  |
- *  +---+
  *  | padding   |
  *  +---+ <== 4K aligned
  *  | private data  |
@@ -37,7 +35,6 @@ struct __guc_ads_blob {
struct guc_ads ads;
struct guc_policies policies;
struct guc_gt_system_info system_info;
-   struct guc_clients_info clients_info;
 } __packed;
 
 static u32 guc_ads_private_data_size(struct intel_guc *guc)
@@ -152,13 +149,9 @@ static void __guc_ads_init(struct intel_guc *guc)
 
base = intel_guc_ggtt_offset(guc, guc->ads_vma);
 
-   /* Clients info  */
-   blob->clients_info.clients_num = 1;
-
/* ADS */
blob->ads.scheduler_policies = base + ptr_offset(blob, policies);
blob->ads.gt_system_info = base + ptr_offset(blob, system_info);
-   blob->ads.clients_info = base + ptr_offset(blob, clients_info);
 
/* Private Data */
blob->ads.private_data = base + guc_ads_private_data_offset(guc);
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h
index 2266444d074f..f2df5c11c11d 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h
@@ -294,19 +294,13 @@ struct guc_gt_system_info {
u32 generic_gt_sysinfo[GUC_GENERIC_GT_SYSINFO_MAX];
 } __packed;
 
-/* Clients info */
-struct guc_clients_info {
-   u32 clients_num;
-   u32 reserved[19];
-} __packed;
-
 /* GuC Additional Data Struct */
 struct guc_ads {
struct guc_mmio_reg_set 
reg_state_list[GUC_MAX_ENGINE_CLASSES][GUC_MAX_INSTANCES_PER_CLASS];
u32 reserved0;
u32 scheduler_policies;
u32 gt_system_info;
-   u32 clients_info;
+   u32 reserved1;
u32 control_data;
u32 golden_context_lrca[GUC_MAX_ENGINE_CLASSES];
u32 eng_state_size[GUC_MAX_ENGINE_CLASSES];
-- 
2.28.0



[PATCH 03/13] drm/i915/guc: Update CTB response status definition

2021-06-09 Thread Matthew Brost
From: Michal Wajdeczko 

Format of the STATUS dword in CTB response message now follows
definition of the HXG header. Update our code and remove any
obsolete legacy definitions.

GuC: 55.0.0
Signed-off-by: Matthew Brost 
Signed-off-by: Michal Wajdeczko 
Acked-by: Piotr Piórkowski 
Reviewed-by: Daniele Ceraolo Spurio 
---
 drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c   | 14 --
 drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h | 17 -
 2 files changed, 8 insertions(+), 23 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
index 8f7b148fef58..3f7f48611487 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
@@ -477,7 +477,9 @@ static int wait_for_ct_request_update(struct ct_request 
*req, u32 *status)
 * up to that length of time, then switch to a slower sleep-wait loop.
 * No GuC command should ever take longer than 10ms.
 */
-#define done INTEL_GUC_MSG_IS_RESPONSE(READ_ONCE(req->status))
+#define done \
+   (FIELD_GET(GUC_HXG_MSG_0_ORIGIN, READ_ONCE(req->status)) == \
+GUC_HXG_ORIGIN_GUC)
err = wait_for_us(done, 10);
if (err)
err = wait_for(done, 10);
@@ -532,21 +534,21 @@ static int ct_send(struct intel_guc_ct *ct,
if (unlikely(err))
goto unlink;
 
-   if (!INTEL_GUC_MSG_IS_RESPONSE_SUCCESS(*status)) {
+   if (FIELD_GET(GUC_HXG_MSG_0_TYPE, *status) != 
GUC_HXG_TYPE_RESPONSE_SUCCESS) {
err = -EIO;
goto unlink;
}
 
if (response_buf) {
/* There shall be no data in the status */
-   WARN_ON(INTEL_GUC_MSG_TO_DATA(request.status));
+   WARN_ON(FIELD_GET(GUC_HXG_RESPONSE_MSG_0_DATA0, 
request.status));
/* Return actual response len */
err = request.response_len;
} else {
/* There shall be no response payload */
WARN_ON(request.response_len);
/* Return data decoded from the status dword */
-   err = INTEL_GUC_MSG_TO_DATA(*status);
+   err = FIELD_GET(GUC_HXG_RESPONSE_MSG_0_DATA0, *status);
}
 
 unlink:
@@ -741,8 +743,8 @@ static int ct_handle_response(struct intel_guc_ct *ct, 
struct ct_incoming_msg *r
status = response->msg[2];
datalen = len - 2;
 
-   /* Format of the status follows RESPONSE message */
-   if (unlikely(!INTEL_GUC_MSG_IS_RESPONSE(status))) {
+   /* Format of the status dword follows HXG header */
+   if (unlikely(FIELD_GET(GUC_HXG_MSG_0_ORIGIN, status) != 
GUC_HXG_ORIGIN_GUC)) {
CT_ERROR(ct, "Corrupted response (status %#x)\n", status);
return -EPROTO;
}
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h
index e9a9d85e2aa3..fb04e2211b79 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h
@@ -414,23 +414,6 @@ struct guc_shared_ctx_data {
struct guc_ctx_report preempt_ctx_report[GUC_MAX_ENGINES_NUM];
 } __packed;
 
-#define __INTEL_GUC_MSG_GET(T, m) \
-   (((m) & INTEL_GUC_MSG_ ## T ## _MASK) >> INTEL_GUC_MSG_ ## T ## _SHIFT)
-#define INTEL_GUC_MSG_TO_TYPE(m)   __INTEL_GUC_MSG_GET(TYPE, m)
-#define INTEL_GUC_MSG_TO_DATA(m)   __INTEL_GUC_MSG_GET(DATA, m)
-#define INTEL_GUC_MSG_TO_CODE(m)   __INTEL_GUC_MSG_GET(CODE, m)
-
-#define __INTEL_GUC_MSG_TYPE_IS(T, m) \
-   (INTEL_GUC_MSG_TO_TYPE(m) == INTEL_GUC_MSG_TYPE_ ## T)
-#define INTEL_GUC_MSG_IS_REQUEST(m)__INTEL_GUC_MSG_TYPE_IS(REQUEST, m)
-#define INTEL_GUC_MSG_IS_RESPONSE(m)   __INTEL_GUC_MSG_TYPE_IS(RESPONSE, m)
-
-#define INTEL_GUC_MSG_IS_RESPONSE_SUCCESS(m) \
-(typecheck(u32, (m)) && \
- ((m) & (INTEL_GUC_MSG_TYPE_MASK | INTEL_GUC_MSG_CODE_MASK)) == \
- ((INTEL_GUC_MSG_TYPE_RESPONSE << INTEL_GUC_MSG_TYPE_SHIFT) | \
-  (INTEL_GUC_RESPONSE_STATUS_SUCCESS << INTEL_GUC_MSG_CODE_SHIFT)))
-
 /* This action will be programmed in C1BC - SOFT_SCRATCH_15_REG */
 enum intel_guc_recv_message {
INTEL_GUC_RECV_MSG_CRASH_DUMP_POSTED = BIT(1),
-- 
2.28.0



[PATCH 04/13] drm/i915/guc: Support per context scheduling policies

2021-06-09 Thread Matthew Brost
From: John Harrison 

GuC firmware v53.0.0 introduced per context scheduling policies. This
includes changes to some of the ADS structures which are required to
load the firmware even if not using GuC submission.

Signed-off-by: John Harrison 
Signed-off-by: Matthew Brost 
---
 drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c  | 26 +++--
 drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h | 31 +
 2 files changed, 11 insertions(+), 46 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
index 9abfbc6edbd6..4fcbe4b921f9 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
@@ -59,30 +59,12 @@ static u32 guc_ads_blob_size(struct intel_guc *guc)
   guc_ads_private_data_size(guc);
 }
 
-static void guc_policy_init(struct guc_policy *policy)
-{
-   policy->execution_quantum = POLICY_DEFAULT_EXECUTION_QUANTUM_US;
-   policy->preemption_time = POLICY_DEFAULT_PREEMPTION_TIME_US;
-   policy->fault_time = POLICY_DEFAULT_FAULT_TIME_US;
-   policy->policy_flags = 0;
-}
-
 static void guc_policies_init(struct guc_policies *policies)
 {
-   struct guc_policy *policy;
-   u32 p, i;
-
-   policies->dpc_promote_time = POLICY_DEFAULT_DPC_PROMOTE_TIME_US;
-   policies->max_num_work_items = POLICY_MAX_NUM_WI;
-
-   for (p = 0; p < GUC_CLIENT_PRIORITY_NUM; p++) {
-   for (i = 0; i < GUC_MAX_ENGINE_CLASSES; i++) {
-   policy = >policy[p][i];
-
-   guc_policy_init(policy);
-   }
-   }
-
+   policies->dpc_promote_time = GLOBAL_POLICY_DEFAULT_DPC_PROMOTE_TIME_US;
+   policies->max_num_work_items = GLOBAL_POLICY_MAX_NUM_WI;
+   /* Disable automatic resets as not yet supported. */
+   policies->global_flags = GLOBAL_POLICY_DISABLE_ENGINE_RESET;
policies->is_valid = 1;
 }
 
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h
index fb04e2211b79..251c3836bd2c 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h
@@ -247,32 +247,14 @@ struct guc_stage_desc {
 
 /* Scheduling policy settings */
 
-/* Reset engine upon preempt failure */
-#define POLICY_RESET_ENGINE(1<<0)
-/* Preempt to idle on quantum expiry */
-#define POLICY_PREEMPT_TO_IDLE (1<<1)
-
-#define POLICY_MAX_NUM_WI 15
-#define POLICY_DEFAULT_DPC_PROMOTE_TIME_US 50
-#define POLICY_DEFAULT_EXECUTION_QUANTUM_US 100
-#define POLICY_DEFAULT_PREEMPTION_TIME_US 50
-#define POLICY_DEFAULT_FAULT_TIME_US 25
-
-struct guc_policy {
-   /* Time for one workload to execute. (in micro seconds) */
-   u32 execution_quantum;
-   /* Time to wait for a preemption request to completed before issuing a
-* reset. (in micro seconds). */
-   u32 preemption_time;
-   /* How much time to allow to run after the first fault is observed.
-* Then preempt afterwards. (in micro seconds) */
-   u32 fault_time;
-   u32 policy_flags;
-   u32 reserved[8];
-} __packed;
+#define GLOBAL_POLICY_MAX_NUM_WI 15
+
+/* Don't reset an engine upon preemption failure */
+#define GLOBAL_POLICY_DISABLE_ENGINE_RESET BIT(0)
+
+#define GLOBAL_POLICY_DEFAULT_DPC_PROMOTE_TIME_US 50
 
 struct guc_policies {
-   struct guc_policy 
policy[GUC_CLIENT_PRIORITY_NUM][GUC_MAX_ENGINE_CLASSES];
u32 submission_queue_depth[GUC_MAX_ENGINE_CLASSES];
/* In micro seconds. How much time to allow before DPC processing is
 * called back via interrupt (to prevent DPC queue drain starving).
@@ -286,6 +268,7 @@ struct guc_policies {
 * idle. */
u32 max_num_work_items;
 
+   u32 global_flags;
u32 reserved[4];
 } __packed;
 
-- 
2.28.0



[PATCH 02/13] drm/i915/guc: Update MMIO based communication

2021-06-09 Thread Matthew Brost
From: Michal Wajdeczko 

The MMIO based Host-to-GuC communication protocol has been
updated to use unified HXG messages.

Update our intel_guc_send_mmio() function to correctly handle
BUSY, RETRY and FAILURE replies. Also update our documentation.

Since some of the new MMIO actions may use DATA0 from MMIO HXG
response, we must update intel_guc_send_mmio() to copy full response,
including HXG header. There will be no impact to existing users as all
of them are only relying just on return code.

v2:
 (Daniele)
  - preffered -> preferred
  - Max MMIO DW set to 4
  - Update commit message

GuC: 55.0.0
Signed-off-by: Matthew Brost 
Signed-off-by: Michal Wajdeczko 
Cc: Piotr Piórkowski 
Cc: Michal Winiarski  #v3
---
 .../gt/uc/abi/guc_communication_mmio_abi.h| 65 +++--
 drivers/gpu/drm/i915/gt/uc/intel_guc.c| 92 ++-
 2 files changed, 98 insertions(+), 59 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/abi/guc_communication_mmio_abi.h 
b/drivers/gpu/drm/i915/gt/uc/abi/guc_communication_mmio_abi.h
index be066a62e9e0..bbf1ddb77434 100644
--- a/drivers/gpu/drm/i915/gt/uc/abi/guc_communication_mmio_abi.h
+++ b/drivers/gpu/drm/i915/gt/uc/abi/guc_communication_mmio_abi.h
@@ -7,46 +7,43 @@
 #define _ABI_GUC_COMMUNICATION_MMIO_ABI_H
 
 /**
- * DOC: MMIO based communication
+ * DOC: GuC MMIO based communication
  *
- * The MMIO based communication between Host and GuC uses software scratch
- * registers, where first register holds data treated as message header,
- * and other registers are used to hold message payload.
+ * The MMIO based communication between Host and GuC relies on special
+ * hardware registers which format could be defined by the software
+ * (so called scratch registers).
  *
- * For Gen9+, GuC uses software scratch registers 0xC180-0xC1B8,
- * but no H2G command takes more than 8 parameters and the GuC FW
- * itself uses an 8-element array to store the H2G message.
+ * Each MMIO based message, both Host to GuC (H2G) and GuC to Host (G2H)
+ * messages, which maximum length depends on number of available scratch
+ * registers, is directly written into those scratch registers.
  *
- *  +---+-+-+-+
- *  |  MMIO[0]  | MMIO[1] |   ...   | MMIO[n] |
- *  +---+-+-+-+
- *  | header|  optional payload   |
- *  +==++=+=+=+
- *  | 31:28|type| | | |
- *  +--++ | | |
- *  | 27:16|data| | | |
- *  +--++ | | |
- *  |  15:0|code| | | |
- *  +--++-+-+-+
+ * For Gen9+, there are 16 software scratch registers 0xC180-0xC1B8,
+ * but no H2G command takes more than 4 parameters and the GuC firmware
+ * itself uses an 4-element array to store the H2G message.
  *
- * The message header consists of:
+ * For Gen11+, there are additional 4 registers 0x190240-0x19024C, which
+ * are, regardless on lower count, preferred over legacy ones.
  *
- * - **type**, indicates message type
- * - **code**, indicates message code, is specific for **type**
- * - **data**, indicates message data, optional, depends on **code**
- *
- * The following message **types** are supported:
- *
- * - **REQUEST**, indicates Host-to-GuC request, requested GuC action code
- *   must be priovided in **code** field. Optional action specific parameters
- *   can be provided in remaining payload registers or **data** field.
- *
- * - **RESPONSE**, indicates GuC-to-Host response from earlier GuC request,
- *   action response status will be provided in **code** field. Optional
- *   response data can be returned in remaining payload registers or **data**
- *   field.
+ * The MMIO based communication is mainly used during driver initialization
+ * phase to setup the `CTB based communication`_ that will be used afterwards.
  */
 
-#define GUC_MAX_MMIO_MSG_LEN   8
+#define GUC_MAX_MMIO_MSG_LEN   4
+
+/**
+ * DOC: MMIO HXG Message
+ *
+ * Format of the MMIO messages follows definitions of `HXG Message`_.
+ *
+ *  
+---+---+--+
+ *  |   | Bits  | Description  
|
+ *  
+===+===+==+
+ *  | 0 |  31:0 |  ++  
|
+ *  +---+---+  ||  
|
+ *  |...|   |  |  Embedded `HXG Message`_   |  
|
+ *  +---+---+  ||  
|
+ *  | n |  31:0 |  ++  
|
+ *  
+---+---+--+
+ */
 
 #endif /* _ABI_GUC_COMMUNICATION_MMIO_ABI_H 

[PATCH 01/13] drm/i915/guc: Introduce unified HXG messages

2021-06-09 Thread Matthew Brost
From: Michal Wajdeczko 

New GuC firmware will unify format of MMIO and CTB H2G messages.
Introduce their definitions now to allow gradual transition of
our code to match new changes.

Signed-off-by: Matthew Brost 
Signed-off-by: Michal Wajdeczko 
Cc: Michał Winiarski 
---
 .../gpu/drm/i915/gt/uc/abi/guc_messages_abi.h | 213 ++
 1 file changed, 213 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/uc/abi/guc_messages_abi.h 
b/drivers/gpu/drm/i915/gt/uc/abi/guc_messages_abi.h
index 775e21f3058c..29ac823acd4c 100644
--- a/drivers/gpu/drm/i915/gt/uc/abi/guc_messages_abi.h
+++ b/drivers/gpu/drm/i915/gt/uc/abi/guc_messages_abi.h
@@ -6,6 +6,219 @@
 #ifndef _ABI_GUC_MESSAGES_ABI_H
 #define _ABI_GUC_MESSAGES_ABI_H
 
+/**
+ * DOC: HXG Message
+ *
+ * All messages exchanged with GuC are defined using 32 bit dwords.
+ * First dword is treated as a message header. Remaining dwords are optional.
+ *
+ *  
+---+---+--+
+ *  |   | Bits  | Description  
|
+ *  
+===+===+==+
+ *  |   |   |  
|
+ *  | 0 |31 | **ORIGIN** - originator of the message   
|
+ *  |   |   |   - _`GUC_HXG_ORIGIN_HOST` = 0   
|
+ *  |   |   |   - _`GUC_HXG_ORIGIN_GUC` = 1
|
+ *  |   |   |  
|
+ *  |   
+---+--+
+ *  |   | 30:28 | **TYPE** - message type  
|
+ *  |   |   |   - _`GUC_HXG_TYPE_REQUEST` = 0  
|
+ *  |   |   |   - _`GUC_HXG_TYPE_EVENT` = 1
|
+ *  |   |   |   - _`GUC_HXG_TYPE_NO_RESPONSE_BUSY` = 3 
|
+ *  |   |   |   - _`GUC_HXG_TYPE_NO_RESPONSE_RETRY` = 5
|
+ *  |   |   |   - _`GUC_HXG_TYPE_RESPONSE_FAILURE` = 6 
|
+ *  |   |   |   - _`GUC_HXG_TYPE_RESPONSE_SUCCESS` = 7 
|
+ *  |   
+---+--+
+ *  |   |  27:0 | **AUX** - auxiliary data (depends on TYPE)   
|
+ *  
+---+---+--+
+ *  | 1 |  31:0 |  
|
+ *  +---+---+  
|
+ *  |...|   | **PAYLOAD** - optional payload (depends on TYPE) 
|
+ *  +---+---+  
|
+ *  | n |  31:0 |  
|
+ *  
+---+---+--+
+ */
+
+#define GUC_HXG_MSG_MIN_LEN1u
+#define GUC_HXG_MSG_0_ORIGIN   (0x1 << 31)
+#define   GUC_HXG_ORIGIN_HOST  0u
+#define   GUC_HXG_ORIGIN_GUC   1u
+#define GUC_HXG_MSG_0_TYPE (0x7 << 28)
+#define   GUC_HXG_TYPE_REQUEST 0u
+#define   GUC_HXG_TYPE_EVENT   1u
+#define   GUC_HXG_TYPE_NO_RESPONSE_BUSY3u
+#define   GUC_HXG_TYPE_NO_RESPONSE_RETRY   5u
+#define   GUC_HXG_TYPE_RESPONSE_FAILURE6u
+#define   GUC_HXG_TYPE_RESPONSE_SUCCESS7u
+#define GUC_HXG_MSG_0_AUX  (0xfff << 0)
+#define GUC_HXG_MSG_n_PAYLOAD  (0x << 0)
+
+/**
+ * DOC: HXG Request
+ *
+ * The `HXG Request`_ message should be used to initiate synchronous activity
+ * for which confirmation or return data is expected.
+ *
+ * The recipient of this message shall use `HXG Response`_, `HXG Failure`_
+ * or `HXG Retry`_ message as a definite reply, and may use `HXG Busy`_
+ * message as a intermediate reply.
+ *
+ * Format of @DATA0 and all @DATAn fields depends on the @ACTION code.
+ *
+ *  
+---+---+--+
+ *  |   | Bits  | Description  
|
+ *  
+===+===+==+
+ *  | 0 |31 | ORIGIN   
|
+ *  |   
+---+--+
+ *  |   | 30:28 | TYPE = GUC_HXG_TYPE_REQUEST_ 
|
+ *  |   
+---+--+
+ *  |   | 27:16 | **DATA0** - request data (depends on ACTION) 
|
+ *  |   
+---+--+
+ *  |   |  15:0 | **ACTION** - requested action code   

[PATCH 07/13] drm/i915/guc: New definition of the CTB registration action

2021-06-09 Thread Matthew Brost
From: Michal Wajdeczko 

Definition of the CTB registration action has changed.
Add some ABI documentation and implement required changes.

v2:
 (Checkpoint)
  - Fix warnings
 (Daniele)
  - Drop FIXME
 (John H)
  - Drop value in kernel doc, just use define

Signed-off-by: Michal Wajdeczko 
Signed-off-by: Matthew Brost 
Cc: Piotr Piórkowski  #4
---
 .../gpu/drm/i915/gt/uc/abi/guc_actions_abi.h  | 107 ++
 .../gt/uc/abi/guc_communication_ctb_abi.h |   4 -
 drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c |  76 -
 3 files changed, 152 insertions(+), 35 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/abi/guc_actions_abi.h 
b/drivers/gpu/drm/i915/gt/uc/abi/guc_actions_abi.h
index 90efef8a73e4..dfaea0b54370 100644
--- a/drivers/gpu/drm/i915/gt/uc/abi/guc_actions_abi.h
+++ b/drivers/gpu/drm/i915/gt/uc/abi/guc_actions_abi.h
@@ -6,6 +6,113 @@
 #ifndef _ABI_GUC_ACTIONS_ABI_H
 #define _ABI_GUC_ACTIONS_ABI_H
 
+/**
+ * DOC: HOST2GUC_REGISTER_CTB
+ *
+ * This message is used as part of the `CTB based communication`_ setup.
+ *
+ * This message must be sent as `MMIO HXG Message`_.
+ *
+ *  
+---+---+--+
+ *  |   | Bits  | Description  
|
+ *  
+===+===+==+
+ *  | 0 |31 | ORIGIN = GUC_HXG_ORIGIN_HOST_
|
+ *  |   
+---+--+
+ *  |   | 30:28 | TYPE = GUC_HXG_TYPE_REQUEST_ 
|
+ *  |   
+---+--+
+ *  |   | 27:16 | DATA0 = MBZ  
|
+ *  |   
+---+--+
+ *  |   |  15:0 | ACTION = _`GUC_ACTION_HOST2GUC_REGISTER_CTB` 
|
+ *  
+---+---+--+
+ *  | 1 | 31:12 | RESERVED = MBZ   
|
+ *  |   
+---+--+
+ *  |   |  11:8 | **TYPE** - type for the `CT Buffer`_ 
|
+ *  |   |   |  
|
+ *  |   |   |   - _`GUC_CTB_TYPE_HOST2GUC` = 0 
|
+ *  |   |   |   - _`GUC_CTB_TYPE_GUC2HOST` = 1 
|
+ *  |   
+---+--+
+ *  |   |   7:0 | **SIZE** - size of the `CT Buffer`_ in 4K units minus 1  
|
+ *  
+---+---+--+
+ *  | 2 |  31:0 | **DESC_ADDR** - GGTT address of the `CTB Descriptor`_
|
+ *  
+---+---+--+
+ *  | 3 |  31:0 | **BUFF_ADDF** - GGTT address of the `CT Buffer`_ 
|
+ *  
+---+---+--+
+ *
+ *  
+---+---+--+
+ *  |   | Bits  | Description  
|
+ *  
+===+===+==+
+ *  | 0 |31 | ORIGIN = GUC_HXG_ORIGIN_GUC_ 
|
+ *  |   
+---+--+
+ *  |   | 30:28 | TYPE = GUC_HXG_TYPE_RESPONSE_SUCCESS_
|
+ *  |   
+---+--+
+ *  |   |  27:0 | DATA0 = MBZ  
|
+ *  
+---+---+--+
+ */
+#define GUC_ACTION_HOST2GUC_REGISTER_CTB   0x4505
+
+#define HOST2GUC_REGISTER_CTB_REQUEST_MSG_LEN  
(GUC_HXG_REQUEST_MSG_MIN_LEN + 3u)
+#define HOST2GUC_REGISTER_CTB_REQUEST_MSG_0_MBZ
GUC_HXG_REQUEST_MSG_0_DATA0
+#define HOST2GUC_REGISTER_CTB_REQUEST_MSG_1_MBZ(0xf << 12)
+#define HOST2GUC_REGISTER_CTB_REQUEST_MSG_1_TYPE   (0xf << 8)
+#define   GUC_CTB_TYPE_HOST2GUC0u
+#define   GUC_CTB_TYPE_GUC2HOST1u
+#define HOST2GUC_REGISTER_CTB_REQUEST_MSG_1_SIZE   (0xff << 0)
+#define HOST2GUC_REGISTER_CTB_REQUEST_MSG_2_DESC_ADDR  
GUC_HXG_REQUEST_MSG_n_DATAn
+#define HOST2GUC_REGISTER_CTB_REQUEST_MSG_3_BUFF_ADDR  
GUC_HXG_REQUEST_MSG_n_DATAn
+
+#define HOST2GUC_REGISTER_CTB_RESPONSE_MSG_LEN 
GUC_HXG_RESPONSE_MSG_MIN_LEN
+#define HOST2GUC_REGISTER_CTB_RESPONSE_MSG_0_MBZ   
GUC_HXG_RESPONSE_MSG_0_DATA0
+
+/**
+ * DOC: HOST2GUC_DEREGISTER_CTB
+ *
+ * This message is used as part of the `CTB based communication`_ teardown.
+ *
+ * This message must be sent as `MMIO HXG Message`_.
+ *
+ 

[PATCH 08/13] drm/i915/guc: New CTB based communication

2021-06-09 Thread Matthew Brost
From: Michal Wajdeczko 

Format of the CTB messages has changed:
 - support for multiple formats
 - message fence is now part of the header
 - reuse of unified HXG message formats

v2:
 (Daniele)
  - Better comment in ct_write()

Signed-off-by: Michal Wajdeczko 
Signed-off-by: Matthew Brost 
Cc: Piotr Piórkowski 
---
 .../gt/uc/abi/guc_communication_ctb_abi.h |  56 +
 drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 195 +++---
 drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h |   2 +-
 3 files changed, 136 insertions(+), 117 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/abi/guc_communication_ctb_abi.h 
b/drivers/gpu/drm/i915/gt/uc/abi/guc_communication_ctb_abi.h
index 6735f1fdaa2a..d3a2e002b6c0 100644
--- a/drivers/gpu/drm/i915/gt/uc/abi/guc_communication_ctb_abi.h
+++ b/drivers/gpu/drm/i915/gt/uc/abi/guc_communication_ctb_abi.h
@@ -58,6 +58,62 @@ struct guc_ct_buffer_desc {
 } __packed;
 static_assert(sizeof(struct guc_ct_buffer_desc) == 64);
 
+/**
+ * DOC: CTB Message
+ *
+ *  
+---+---+--+
+ *  |   | Bits  | Description  
|
+ *  
+===+===+==+
+ *  | 0 | 31:16 | **FENCE** - message identifier   
|
+ *  |   
+---+--+
+ *  |   | 15:12 | **FORMAT** - format of the CTB message   
|
+ *  |   |   |  - _`GUC_CTB_FORMAT_HXG` = 0 - see `CTB HXG Message`_
|
+ *  |   
+---+--+
+ *  |   |  11:8 | **RESERVED** 
|
+ *  |   
+---+--+
+ *  |   |   7:0 | **NUM_DWORDS** - length of the CTB message (w/o header)  
|
+ *  
+---+---+--+
+ *  | 1 |  31:0 | optional (depends on FORMAT) 
|
+ *  +---+---+  
|
+ *  |...|   |  
|
+ *  +---+---+  
|
+ *  | n |  31:0 |  
|
+ *  
+---+---+--+
+ */
+
+#define GUC_CTB_MSG_MIN_LEN1u
+#define GUC_CTB_MSG_MAX_LEN256u
+#define GUC_CTB_MSG_0_FENCE(0x << 16)
+#define GUC_CTB_MSG_0_FORMAT   (0xf << 12)
+#define   GUC_CTB_FORMAT_HXG   0u
+#define GUC_CTB_MSG_0_RESERVED (0xf << 8)
+#define GUC_CTB_MSG_0_NUM_DWORDS   (0xff << 0)
+
+/**
+ * DOC: CTB HXG Message
+ *
+ *  
+---+---+--+
+ *  |   | Bits  | Description  
|
+ *  
+===+===+==+
+ *  | 0 | 31:16 | FENCE
|
+ *  |   
+---+--+
+ *  |   | 15:12 | FORMAT = GUC_CTB_FORMAT_HXG_ 
|
+ *  |   
+---+--+
+ *  |   |  11:8 | RESERVED = MBZ   
|
+ *  |   
+---+--+
+ *  |   |   7:0 | NUM_DWORDS = length (in dwords) of the embedded HXG message  
|
+ *  
+---+---+--+
+ *  | 1 |  31:0 |  ++  
|
+ *  +---+---+  ||  
|
+ *  |...|   |  |  Embedded `HXG Message`_   |  
|
+ *  +---+---+  ||  
|
+ *  | n |  31:0 |  ++  
|
+ *  
+---+---+--+
+ */
+
+#define GUC_CTB_HXG_MSG_MIN_LEN(GUC_CTB_MSG_MIN_LEN + 
GUC_HXG_MSG_MIN_LEN)
+#define GUC_CTB_HXG_MSG_MAX_LENGUC_CTB_MSG_MAX_LEN
+
 /**
  * DOC: CTB based communication
  *
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
index 6a29be779cc9..43409044528e 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
@@ -365,24 +365,6 @@ static void write_barrier(struct intel_guc_ct *ct)
}
 }
 
-/**
- * DOC: CTB Host to GuC request
- *
- * Format of the CTB Host to GuC request message is as 

[PATCH 00/13] Update firmware to v62.0.0

2021-06-09 Thread Matthew Brost
As part of enabling GuC submission [1] we need to update to the latest
and greatest firmware. This series does that. This is a destructive
change. e.g. Without all the patches in this series it will break the
i915 driver. As such, after we review most of these patches they will
squashed into a single patch for merging.

v2: Address comments, looking for remaining RBs so patches can be
squashed and sent for CI

Signed-off-by: Matthew Brost 

[1] https://patchwork.freedesktop.org/series/89844/i

John Harrison (3):
  drm/i915/guc: Support per context scheduling policies
  drm/i915/guc: Unified GuC log
  drm/i915/guc: Update firmware to v62.0.0

Michal Wajdeczko (10):
  drm/i915/guc: Introduce unified HXG messages
  drm/i915/guc: Update MMIO based communication
  drm/i915/guc: Update CTB response status definition
  drm/i915/guc: Add flag for mark broken CTB
  drm/i915/guc: New definition of the CTB descriptor
  drm/i915/guc: New definition of the CTB registration action
  drm/i915/guc: New CTB based communication
  drm/i915/doc: Include GuC ABI documentation
  drm/i915/guc: Kill guc_clients.ct_pool
  drm/i915/guc: Kill ads.client_info

 Documentation/gpu/i915.rst|   8 +
 .../gpu/drm/i915/gt/uc/abi/guc_actions_abi.h  | 107 ++
 .../gt/uc/abi/guc_communication_ctb_abi.h | 128 +--
 .../gt/uc/abi/guc_communication_mmio_abi.h|  65 ++--
 .../gpu/drm/i915/gt/uc/abi/guc_messages_abi.h | 213 +++
 drivers/gpu/drm/i915/gt/uc/intel_guc.c| 107 --
 drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c|  45 +--
 drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 356 +-
 drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h |   6 +-
 drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h   |  75 +---
 drivers/gpu/drm/i915/gt/uc/intel_guc_log.c|  29 +-
 drivers/gpu/drm/i915/gt/uc/intel_guc_log.h|   6 +-
 drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c  |  26 +-
 13 files changed, 750 insertions(+), 421 deletions(-)

-- 
2.28.0



Re: [PATCH 08/13] drm/i915/guc: New CTB based communication

2021-06-09 Thread Matthew Brost
On Mon, Jun 07, 2021 at 07:20:01PM -0700, Daniele Ceraolo Spurio wrote:
> 
> 
> On 6/7/2021 11:03 AM, Matthew Brost wrote:
> > From: Michal Wajdeczko 
> > 
> > Format of the CTB messages has changed:
> >   - support for multiple formats
> >   - message fence is now part of the header
> >   - reuse of unified HXG message formats
> > 
> > Signed-off-by: Michal Wajdeczko 
> > Signed-off-by: Matthew Brost 
> > Cc: Piotr Piórkowski 
> > ---
> >   .../gt/uc/abi/guc_communication_ctb_abi.h |  56 +
> >   drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 194 +++---
> >   drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h |   2 +-
> >   3 files changed, 135 insertions(+), 117 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/gt/uc/abi/guc_communication_ctb_abi.h 
> > b/drivers/gpu/drm/i915/gt/uc/abi/guc_communication_ctb_abi.h
> > index 127b256a662c..92660726c094 100644
> > --- a/drivers/gpu/drm/i915/gt/uc/abi/guc_communication_ctb_abi.h
> > +++ b/drivers/gpu/drm/i915/gt/uc/abi/guc_communication_ctb_abi.h
> > @@ -60,6 +60,62 @@ struct guc_ct_buffer_desc {
> >   } __packed;
> >   static_assert(sizeof(struct guc_ct_buffer_desc) == 64);
> > +/**
> > + * DOC: CTB Message
> > + *
> > + *  
> > +---+---+--+
> > + *  |   | Bits  | Description  
> > |
> > + *  
> > +===+===+==+
> > + *  | 0 | 31:16 | **FENCE** - message identifier   
> > |
> > + *  |   
> > +---+--+
> > + *  |   | 15:12 | **FORMAT** - format of the CTB message   
> > |
> > + *  |   |   |  - _`GUC_CTB_FORMAT_HXG` = 0 - see `CTB HXG Message`_
> > |
> > + *  |   
> > +---+--+
> > + *  |   |  11:8 | **RESERVED** 
> > |
> > + *  |   
> > +---+--+
> > + *  |   |   7:0 | **NUM_DWORDS** - length of the CTB message (w/o header)  
> > |
> > + *  
> > +---+---+--+
> > + *  | 1 |  31:0 | optional (depends on FORMAT) 
> > |
> > + *  +---+---+  
> > |
> > + *  |...|   |  
> > |
> > + *  +---+---+  
> > |
> > + *  | n |  31:0 |  
> > |
> > + *  
> > +---+---+--+
> > + */
> > +
> > +#define GUC_CTB_MSG_MIN_LEN1u
> > +#define GUC_CTB_MSG_MAX_LEN256u
> > +#define GUC_CTB_MSG_0_FENCE(0x << 16)
> > +#define GUC_CTB_MSG_0_FORMAT   (0xf << 12)
> > +#define   GUC_CTB_FORMAT_HXG   0u
> > +#define GUC_CTB_MSG_0_RESERVED (0xf << 8)
> > +#define GUC_CTB_MSG_0_NUM_DWORDS   (0xff << 0)
> > +
> > +/**
> > + * DOC: CTB HXG Message
> > + *
> > + *  
> > +---+---+--+
> > + *  |   | Bits  | Description  
> > |
> > + *  
> > +===+===+==+
> > + *  | 0 | 31:16 | FENCE
> > |
> > + *  |   
> > +---+--+
> > + *  |   | 15:12 | FORMAT = GUC_CTB_FORMAT_HXG_ 
> > |
> > + *  |   
> > +---+--+
> > + *  |   |  11:8 | RESERVED = MBZ   
> > |
> > + *  |   
> > +---+--+
> > + *  |   |   7:0 | NUM_DWORDS = length (in dwords) of the embedded HXG 
> > message  |
> > + *  
> > +---+---+--+
> > + *  | 1 |  31:0 |  
> > ++  |
> > + *  +---+---+  |   
> >  |  |
> > + *  |...|   |  |  Embedded `HXG Message`_  
> >  |  |
> > + *  +---+---+  |   
> >  |  |
> > + *  | n |  31:0 |  
> > ++  |
> > + *  
> > +---+---+--+
> > + */
> > +
> > +#define GUC_CTB_HXG_MSG_MIN_LEN(GUC_CTB_MSG_MIN_LEN + 
> > 

[pull] amdgpu, radeon drm-fixes-5.13

2021-06-09 Thread Alex Deucher
Hi Dave, Daniel,

Fixes for 5.13.

The following changes since commit 614124bea77e452aa6df7a8714e8bc820b489922:

  Linux 5.13-rc5 (2021-06-06 15:47:27 -0700)

are available in the Git repository at:

  https://gitlab.freedesktop.org/agd5f/linux.git 
tags/amd-drm-fixes-5.13-2021-06-09

for you to fetch changes up to ab8363d3875a83f4901eb1cc00ce8afd24de6c85:

  radeon: use memcpy_to/fromio for UVD fw upload (2021-06-08 14:05:11 -0400)


amd-drm-fixes-5.13-2021-06-09:

amdgpu:
- Use kvzmalloc in amdgu_bo_create
- Use drm_dbg_kms for reporting failure to get a GEM FB
- Fix some register offsets for Sienna Cichlid
- Fix fall-through warning

radeon:
- memcpy_to/from_io fixes


Changfeng (1):
  drm/amdgpu: switch kzalloc to kvzalloc in amdgpu_bo_create

Chen Li (1):
  radeon: use memcpy_to/fromio for UVD fw upload

Gustavo A. R. Silva (1):
  drm/amd/pm: Fix fall-through warning for Clang

Michel Dänzer (1):
  drm/amdgpu: Use drm_dbg_kms for reporting failure to get a GEM FB

Rohit Khaire (1):
  drm/amdgpu: Fix incorrect register offsets for Sienna Cichlid

 drivers/gpu/drm/amd/amdgpu/amdgpu_display.c|  4 ++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.c |  4 ++--
 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 26 +-
 .../gpu/drm/amd/pm/powerplay/hwmgr/smu10_hwmgr.c   |  1 +
 drivers/gpu/drm/radeon/radeon_uvd.c|  4 ++--
 5 files changed, 28 insertions(+), 11 deletions(-)


Re: [PATCH] drm/amdgpu: use correct rounding macro for 64-bit

2021-06-09 Thread Alex Deucher
On Wed, Jun 9, 2021 at 11:33 PM Dave Airlie  wrote:
>
> On Thu, 10 Jun 2021 at 13:23, Alex Deucher  wrote:
> >
> > On Wed, Jun 9, 2021 at 11:10 PM Dave Airlie  wrote:
> > >
> > > From: Dave Airlie 
> > >
> > > This fixes 32-bit arm build due to lack of 64-bit divides.
> > >
> > > Fixes: cb1c81467af3 ("drm/ttm: flip the switch for driver allocated 
> > > resources v2")
> > > Signed-off-by: Dave Airlie 
> >
> > Reviewed-by: Alex Deucher 
>
> I'm going to apply this directly to next.

Thanks!

Alex

>
> Dave.


Re: [PATCH] drm/amdgpu: use correct rounding macro for 64-bit

2021-06-09 Thread Dave Airlie
On Thu, 10 Jun 2021 at 13:23, Alex Deucher  wrote:
>
> On Wed, Jun 9, 2021 at 11:10 PM Dave Airlie  wrote:
> >
> > From: Dave Airlie 
> >
> > This fixes 32-bit arm build due to lack of 64-bit divides.
> >
> > Fixes: cb1c81467af3 ("drm/ttm: flip the switch for driver allocated 
> > resources v2")
> > Signed-off-by: Dave Airlie 
>
> Reviewed-by: Alex Deucher 

I'm going to apply this directly to next.

Dave.


Re: [PATCH] drm/amdgpu: use correct rounding macro for 64-bit

2021-06-09 Thread Alex Deucher
On Wed, Jun 9, 2021 at 11:10 PM Dave Airlie  wrote:
>
> From: Dave Airlie 
>
> This fixes 32-bit arm build due to lack of 64-bit divides.
>
> Fixes: cb1c81467af3 ("drm/ttm: flip the switch for driver allocated resources 
> v2")
> Signed-off-by: Dave Airlie 

Reviewed-by: Alex Deucher 

> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
> index 9a6df02477ce..436ec246a7da 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
> @@ -407,7 +407,7 @@ static int amdgpu_vram_mgr_new(struct 
> ttm_resource_manager *man,
>  #endif
> pages_per_node = max_t(uint32_t, pages_per_node,
>tbo->page_alignment);
> -   num_nodes = DIV_ROUND_UP(PFN_UP(mem_bytes), pages_per_node);
> +   num_nodes = DIV_ROUND_UP_ULL(PFN_UP(mem_bytes), 
> pages_per_node);
> }
>
> node = kvmalloc(struct_size(node, mm_nodes, num_nodes),
> --
> 2.25.4
>


[pull] amdgpu, amdkfd, radeon drm-next-5.14

2021-06-09 Thread Alex Deucher
Hi Dave, Daniel,

More new stuff for 5.14.

The following changes since commit 5745d647d5563d3e9d32013ad4e5c629acff04d7:

  Merge tag 'amd-drm-next-5.14-2021-06-02' of 
https://gitlab.freedesktop.org/agd5f/linux into drm-next (2021-06-04 06:13:57 
+1000)

are available in the Git repository at:

  https://gitlab.freedesktop.org/agd5f/linux.git 
tags/amd-drm-next-5.14-2021-06-09

for you to fetch changes up to 2c1b1ac7084edf477309d27c02d9da7f79b33cec:

  drm/amdgpu/vcn: drop gfxoff control for VCN2+ (2021-06-09 22:15:02 -0400)


amd-drm-next-5.14-2021-06-09:

amdgpu:
- SR-IOV fixes
- Smartshift updates
- GPUVM TLB flush updates
- 16bpc fixed point display fix for DCE11
- BACO cleanups and core refactoring
- Aldebaran updates
- Initial Yellow Carp support
- RAS fixes
- PM API cleanup
- DC visual confirm updates
- DC DP MST fixes
- DC DML fixes
- Misc code cleanups and bug fixes

amdkfd:
- Initial Yellow Carp support

radeon:
- memcpy_to/from_io fixes

UAPI:
- Add Yellow Carp chip family id
  Used internally in the kernel driver and by mesa


Aaron Liu (42):
  drm/amdgpu: add yellow carp asic header files (v3)
  drm/amdgpu: add yellow carp asic_type enum
  drm/amdgpu: add uapi to define yellow carp series
  drm/amdgpu: add yellow carp support for gpu_info and ip block setting
  drm/amdgpu: add nv common ip block support for yellow carp
  drm/amdgpu: add yellow carp support for ih block
  drm/amdgpu: add gmc v10 supports for yellow carp
  drm/amdgpu: support fw load type for yellow carp
  drm/amdgpu: add gfx support for yellow carp
  drm/amdgpu: add sdma support for yellow carp
  drm/amdgpu: set ip blocks for yellow carp
  drm/amdkfd: add yellow carp KFD support
  drm/amdgpu: support nbio_7_2_1 for yellow carp
  drm/admgpu/pm: add smu v13 driver interface header for yellow carp (v3)
  drm/amdgpu/pm: add smu v13.0.1 firmware header for yellow carp (V4)
  drm/amdgpu/pm: add smu v13.0.1 smc header for yellow carp (v2)
  drm/amd/pm: add smu13 ip support for moment(V3)
  drm/amd/pm: add yellow_carp_ppt implementation(V3)
  drm/amd/pm: partially enable swsmu for yellow carp(V2)
  drm/amdgpu: add smu ip block for yellow carp(V3)
  drm/amdgpu: add gfx golden settings for yellow carp (v3)
  drm/amdgpu: reserved buffer is not needed with ip discovery enabled
  drm/amdgpu: add psp_v13 support for yellow carp
  drm/amdgpu: enable psp_v13 for yellow carp
  drm/amdgpu/pm: set_pp_feature is unsupport for yellow carp
  drm/amdgpu/pm: add set_driver_table_location implementation for yellow 
carp
  drm/amdgpu: add GFX Clock Gating support for yellow carp
  drm/amdgpu: add MMHUB Clock Gating support for yellow carp
  drm/amdgpu: add GFX Power Gating support for yellow carp
  drm/amdgpu/pm: enable smu_hw_init for yellow carp
  drm/amdgpu/pm: add gfx_off_control for yellow carp
  drm/amdgpu/pm: enable gfx_off in yellow carp smu post init
  drm/amdgpu: add SDMA Clock Gating support for yellow carp
  drm/amdgpu: add HDP Clock Gating support for yellow carp
  drm/amdgpu: add ATHUB Clock Gating support for yellow carp
  drm/amdgpu: add IH Clock Gating support for yellow carp
  drm/amdgpu: enable VCN PG and CG for yellow carp
  drm/amdgpu/pm: support smu_post_init for yellow carp
  drm/amdgpu: add RLC_PG_DELAY_3 for yellow carp
  drm/amdgpu: add timestamp counter query support for yellow carp
  drm/amd/pm: add PrepareMp1ForUnload support for yellow carp
  drm/amdgpu: add mode2 reset support for yellow carp

Alex Deucher (5):
  drm/amdgpu: add yellow_carp_reg_base_init function for yellow carp (v2)
  drm/amdgpu: add mmhub client support for yellow carp
  drm/amdgpu/dc: fix DCN3.1 Makefile for PPC64
  drm/amdgpu/dc: fix DCN3.1 FP handling
  drm/amdgpu/vcn: drop gfxoff control for VCN2+

Anthony Koo (1):
  drm/amd/display: [FW Promotion] Release 0.0.68

Aric Cyr (4):
  drm/amd/display: Change default policy for MPO with multidisplay
  drm/amd/display: 3.2.138
  drm/amd/display: Fix crash during MPO + ODM combine mode recalculation
  drm/amd/display: 3.2.139

Bernard Zhao (1):
  drm/amd/display: remove no need variable

Changfeng (1):
  drm/amdgpu: switch kzalloc to kvzalloc in amdgpu_bo_create

Chen Li (2):
  radeon: fix coding issues reported from sparse
  radeon: use memcpy_to/fromio for UVD fw upload

Christian König (1):
  drm/amdgpu: fix VM handling for GART allocations

Christophe JAILLET (1):
  drm/amdgpu: Fix a a typo in a comment

Colin Ian King (3):
  drm/amdgpu: remove redundant assignment of variable k
  drm/amd/display: remove variable active_disp
  drm/amd/display: Fix two spelling mistakes, clean wide lines

Darren Powell (6):
  amdgpu/pm: reorder 

[PATCH] drm/amdgpu: use correct rounding macro for 64-bit

2021-06-09 Thread Dave Airlie
From: Dave Airlie 

This fixes 32-bit arm build due to lack of 64-bit divides.

Fixes: cb1c81467af3 ("drm/ttm: flip the switch for driver allocated resources 
v2")
Signed-off-by: Dave Airlie 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
index 9a6df02477ce..436ec246a7da 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
@@ -407,7 +407,7 @@ static int amdgpu_vram_mgr_new(struct ttm_resource_manager 
*man,
 #endif
pages_per_node = max_t(uint32_t, pages_per_node,
   tbo->page_alignment);
-   num_nodes = DIV_ROUND_UP(PFN_UP(mem_bytes), pages_per_node);
+   num_nodes = DIV_ROUND_UP_ULL(PFN_UP(mem_bytes), pages_per_node);
}
 
node = kvmalloc(struct_size(node, mm_nodes, num_nodes),
-- 
2.25.4



Re: [PATCH] ARM: config: Refresh mutli v7

2021-06-09 Thread Joel Stanley
On Wed, 9 Jun 2021 at 09:30, Arnd Bergmann  wrote:
>
> On Tue, Jun 8, 2021 at 6:49 PM Hans Verkuil  wrote:
> > On 08/06/2021 18:14, Arnd Bergmann wrote:
> >
> > Right now it is inherent to the driver. It is probably possible to drop 
> > support
> > for video overlay devices if CONFIG_FB=n, but it is not something I have 
> > time
> > for. It's just a test driver (albeit a very useful test driver), so it is no
> > big deal if it is disabled when CONFIG_FB=n.
>
> Ok, thanks for the reply, makes sense.
>
> I checked what other consequences there are if we disable CONFIG_FB
> and CONFIG_DRM_KMS_FB_HELPER=y in all the defconfigs now,
> as the patch from Kees did.
>
> It appears that the only other arm32 framebuffers that remain are
> FB_EFI=y, FB_WM8505=y, FB_MX3=m and FB_SIMPLE=y.

FB_SH_MOBILE_LCDC on arm32 too.

> As long as simplefb, efifb and xenfb are needed though, we probably
> want CONFIG_FB=y anyway and leaving VIVID=m with the dependency
> does not cause problems until those are all turned into drm drivers.

I will go ahead with this for the v7 defconfig.

Cheers,

Joel


Re: [PATCH v10 07/10] mm: Device exclusive memory access

2021-06-09 Thread Alistair Popple
On Thursday, 10 June 2021 2:05:06 AM AEST Peter Xu wrote:
> On Wed, Jun 09, 2021 at 07:38:04PM +1000, Alistair Popple wrote:
> > On Wednesday, 9 June 2021 4:33:52 AM AEST Peter Xu wrote:
> > > On Mon, Jun 07, 2021 at 05:58:52PM +1000, Alistair Popple wrote:

[...]

> > For thp this means we could end up passing
> > tail pages to rmap_walk(), however it doesn't actually walk them.
> >
> > Based on the results of previous testing I had done I assumed rmap_walk()
> > filtered out tail pages. It does, and I didn't hit the BUG_ON above, but the
> > filtering was not as deliberate as assumed.
> >
> > I've gone back and looked at what was happening in my earlier tests and the
> > tail pages get filtered because the VMA is not getting locked in
> > page_lock_anon_vma_read() due to failing this check:
> >
> >   anon_mapping = (unsigned long)READ_ONCE(page->mapping);
> >   if ((anon_mapping & PAGE_MAPPING_FLAGS) != PAGE_MAPPING_ANON)
> >   goto out;
> >
> > And now I'm not sure it makes sense to read page->mapping of a tail page. So
> > it might be best if we explicitly ignore any tail pages returned from GUP, 
> > at
> > least for now (a future series will improve thp support such as adding a pmd
> > version for exclusive entries).
> 
> I feel like it's illegal to access page->mapping of tail pages; I looked at
> what happens if we call page_anon_vma() on a tail page:
> 
> struct anon_vma *page_anon_vma(struct page *page)
> {
> unsigned long mapping;
> 
> page = compound_head(page);
> mapping = (unsigned long)page->mapping;
> if ((mapping & PAGE_MAPPING_FLAGS) != PAGE_MAPPING_ANON)
> return NULL;
> return __page_rmapping(page);
> }
> 
> It'll just take the head's mapping instead.  It makes sense since the tail 
> page
> shouldn't have a different value against the head page, afaiu.

Right, it makes no sense to look at ->mapping on a tail page because the field
is used for something else. On the 1st tail page it is ->compound_nr and on the
2nd tail page it is ->deferred_list. See the definitions of compound_nr() and
page_deferred_list() respectively. I suppose on the rest of the pages it could
be anything.

I think in practice it is probably ok - iuc bit 0 won't be set for compound_nr
and certainly not for deferred_list->next (a pointer). But none of that seems
intentional, so it would be better to be explicit and not walk the tail pages.

> It would be great if thp experts could chim in.  Before that happens, I agree
> with you that a safer approach is to explicitly not walk a tail page for its
> rmap (and I think the rmap of a tail page will be the same of the head
> anyways.. since they seem to share the anon_vma as quoted).
> >
> > > So... for thp mappings, wondering whether we should do normal GUP (without
> > > SPLIT), pass in always normal or head pages into rmap_walk(), but then
> > > unconditionally split_huge_pmd_address() in 
> > > page_make_device_exclusive_one()?
> >
> > That could work (although I think GUP will still return tail pages - see
> > follow_trans_huge_pmd() which is called from follow_pmd_mask() in gup).
> 
> Agreed.
> 
> > The main problem is split_huge_pmd_address() unconditionally calls a mmu
> > notifier so I would need to plumb in passing an owner everywhere which could
> > get messy.
> 
> Could I ask why?  split_huge_pmd_address() will notify with CLEAR, so I'm a 
> bit
> confused why we need to pass over the owner.

Sure, it is the same reason we need to pass it for the exclusive notifier.
Any invalidation during the make exclusive operation will break the mmu read
side critical section forcing a retry of the operation. The owner field is what
is used to filter out invalidations (such as the exclusive invalidation) that
don't need to be retried.
 
> I thought plumb it right before your EXCLUSIVE notifier init would work?

I did try this just to double check and it doesn't work due to the unconditional
notifier.

> ---8<---
> diff --git a/mm/rmap.c b/mm/rmap.c
> index a94d9aed9d95..360ce86f3822 100644
> --- a/mm/rmap.c
> +++ b/mm/rmap.c
> @@ -2042,6 +2042,12 @@ static bool page_make_device_exclusive_one(struct page 
> *page,
> swp_entry_t entry;
> pte_t swp_pte;
> 
> +   /*
> +* Make sure thps split as device exclusive entries only support pte
> +* level for now.
> +*/
> +   split_huge_pmd_address(vma, address, false, page);
> +
> mmu_notifier_range_init_owner(, MMU_NOTIFY_EXCLUSIVE, 0, vma,
>   vma->vm_mm, address, min(vma->vm_end,
>   address + page_size(page)), 
> args->owner);
> ---8<---
> 
> Thanks,
> 
> --
> Peter Xu
> 






Re: [Intel-gfx] [RFC PATCH 36/97] drm/i915/guc: Add non blocking CTB send function

2021-06-09 Thread Matthew Brost
On Wed, Jun 09, 2021 at 04:14:05PM +0200, Michal Wajdeczko wrote:
> 
> 
> On 07.06.2021 19:31, Matthew Brost wrote:
> > On Thu, May 27, 2021 at 04:11:50PM +0100, Tvrtko Ursulin wrote:
> >>
> >> On 27/05/2021 15:35, Matthew Brost wrote:
> >>> On Thu, May 27, 2021 at 11:02:24AM +0100, Tvrtko Ursulin wrote:
> 
>  On 26/05/2021 19:10, Matthew Brost wrote:
> 
>  [snip]
> 
> > +static int ct_send_nb(struct intel_guc_ct *ct,
> > + const u32 *action,
> > + u32 len,
> > + u32 flags)
> > +{
> > +   struct intel_guc_ct_buffer *ctb = >ctbs.send;
> > +   unsigned long spin_flags;
> > +   u32 fence;
> > +   int ret;
> > +
> > +   spin_lock_irqsave(>lock, spin_flags);
> > +
> > +   ret = ctb_has_room(ctb, len + 1);
> > +   if (unlikely(ret))
> > +   goto out;
> > +
> > +   fence = ct_get_next_fence(ct);
> > +   ret = ct_write(ct, action, len, fence, flags);
> > +   if (unlikely(ret))
> > +   goto out;
> > +
> > +   intel_guc_notify(ct_to_guc(ct));
> > +
> > +out:
> > +   spin_unlock_irqrestore(>lock, spin_flags);
> > +
> > +   return ret;
> > +}
> > +
> >  static int ct_send(struct intel_guc_ct *ct,
> >const u32 *action,
> >u32 len,
> > @@ -473,6 +541,7 @@ static int ct_send(struct intel_guc_ct *ct,
> >u32 response_buf_size,
> >u32 *status)
> >  {
> > +   struct intel_guc_ct_buffer *ctb = >ctbs.send;
> > struct ct_request request;
> > unsigned long flags;
> > u32 fence;
> > @@ -482,8 +551,20 @@ static int ct_send(struct intel_guc_ct *ct,
> > GEM_BUG_ON(!len);
> > GEM_BUG_ON(len & ~GUC_CT_MSG_LEN_MASK);
> > GEM_BUG_ON(!response_buf && response_buf_size);
> > +   might_sleep();
> 
>  Sleep is just cond_resched below or there is more?
> 
> >>>
> >>> Yes, the cond_resched.
> >>>
> > +   /*
> > +* We use a lazy spin wait loop here as we believe that if the 
> > CT
> > +* buffers are sized correctly the flow control condition 
> > should be
> > +* rare.
> > +*/
> > +retry:
> > spin_lock_irqsave(>ctbs.send.lock, flags);
> > +   if (unlikely(!ctb_has_room(ctb, len + 1))) {
> > +   spin_unlock_irqrestore(>ctbs.send.lock, flags);
> > +   cond_resched();
> > +   goto retry;
> > +   }
> 
>  If this patch is about adding a non-blocking send function, and 
>  below we can
>  see that it creates a fork:
> 
>  intel_guc_ct_send:
>  ...
>   if (flags & INTEL_GUC_SEND_NB)
>   return ct_send_nb(ct, action, len, flags);
> 
>   ret = ct_send(ct, action, len, response_buf, response_buf_size, 
>  );
> 
>  Then why is there a change in ct_send here, which is not the new
>  non-blocking path?
> 
> >>>
> >>> There is not a change to ct_send(), just to intel_guc_ct_send.
> >>
> >> I was doing by the diff which says:
> >>
> >>static int ct_send(struct intel_guc_ct *ct,
> >>   const u32 *action,
> >>   u32 len,
> >> @@ -473,6 +541,7 @@ static int ct_send(struct intel_guc_ct *ct,
> >>   u32 response_buf_size,
> >>   u32 *status)
> >>{
> >> +  struct intel_guc_ct_buffer *ctb = >ctbs.send;
> >>struct ct_request request;
> >>unsigned long flags;
> >>u32 fence;
> >> @@ -482,8 +551,20 @@ static int ct_send(struct intel_guc_ct *ct,
> >>GEM_BUG_ON(!len);
> >>GEM_BUG_ON(len & ~GUC_CT_MSG_LEN_MASK);
> >>GEM_BUG_ON(!response_buf && response_buf_size);
> >> +  might_sleep();
> >> +  /*
> >> +   * We use a lazy spin wait loop here as we believe that if the 
> >> CT
> >> +   * buffers are sized correctly the flow control condition 
> >> should be
> >> +   * rare.
> >> +   */
> >> +retry:
> >>spin_lock_irqsave(>ctbs.send.lock, flags);
> >> +  if (unlikely(!ctb_has_room(ctb, len + 1))) {
> >> +  spin_unlock_irqrestore(>ctbs.send.lock, flags);
> >> +  cond_resched();
> >> +  goto retry;
> >> +  }
> >>
> >> So it looks like a change to ct_send to me. Is that wrong?
> 
> 

Re: [Intel-gfx] [RFC PATCH 36/97] drm/i915/guc: Add non blocking CTB send function

2021-06-09 Thread Matthew Brost
On Tue, Jun 08, 2021 at 10:46:15AM +0200, Daniel Vetter wrote:
> On Tue, Jun 8, 2021 at 10:39 AM Tvrtko Ursulin
>  wrote:
> >
> >
> > On 07/06/2021 18:31, Matthew Brost wrote:
> > > On Thu, May 27, 2021 at 04:11:50PM +0100, Tvrtko Ursulin wrote:
> > >>
> > >> On 27/05/2021 15:35, Matthew Brost wrote:
> > >>> On Thu, May 27, 2021 at 11:02:24AM +0100, Tvrtko Ursulin wrote:
> > 
> >  On 26/05/2021 19:10, Matthew Brost wrote:
> > 
> >  [snip]
> > 
> > > +static int ct_send_nb(struct intel_guc_ct *ct,
> > > +   const u32 *action,
> > > +   u32 len,
> > > +   u32 flags)
> > > +{
> > > + struct intel_guc_ct_buffer *ctb = >ctbs.send;
> > > + unsigned long spin_flags;
> > > + u32 fence;
> > > + int ret;
> > > +
> > > + spin_lock_irqsave(>lock, spin_flags);
> > > +
> > > + ret = ctb_has_room(ctb, len + 1);
> > > + if (unlikely(ret))
> > > + goto out;
> > > +
> > > + fence = ct_get_next_fence(ct);
> > > + ret = ct_write(ct, action, len, fence, flags);
> > > + if (unlikely(ret))
> > > + goto out;
> > > +
> > > + intel_guc_notify(ct_to_guc(ct));
> > > +
> > > +out:
> > > + spin_unlock_irqrestore(>lock, spin_flags);
> > > +
> > > + return ret;
> > > +}
> > > +
> > >   static int ct_send(struct intel_guc_ct *ct,
> > >  const u32 *action,
> > >  u32 len,
> > > @@ -473,6 +541,7 @@ static int ct_send(struct intel_guc_ct *ct,
> > >  u32 response_buf_size,
> > >  u32 *status)
> > >   {
> > > + struct intel_guc_ct_buffer *ctb = >ctbs.send;
> > >   struct ct_request request;
> > >   unsigned long flags;
> > >   u32 fence;
> > > @@ -482,8 +551,20 @@ static int ct_send(struct intel_guc_ct *ct,
> > >   GEM_BUG_ON(!len);
> > >   GEM_BUG_ON(len & ~GUC_CT_MSG_LEN_MASK);
> > >   GEM_BUG_ON(!response_buf && response_buf_size);
> > > + might_sleep();
> > 
> >  Sleep is just cond_resched below or there is more?
> > 
> > >>>
> > >>> Yes, the cond_resched.
> > >>>
> > > + /*
> > > +  * We use a lazy spin wait loop here as we believe that if 
> > > the CT
> > > +  * buffers are sized correctly the flow control condition 
> > > should be
> > > +  * rare.
> > > +  */
> > > +retry:
> > >   spin_lock_irqsave(>ctbs.send.lock, flags);
> > > + if (unlikely(!ctb_has_room(ctb, len + 1))) {
> > > + spin_unlock_irqrestore(>ctbs.send.lock, flags);
> > > + cond_resched();
> > > + goto retry;
> > > + }
> > 
> >  If this patch is about adding a non-blocking send function, and 
> >  below we can
> >  see that it creates a fork:
> > 
> >  intel_guc_ct_send:
> >  ...
> > if (flags & INTEL_GUC_SEND_NB)
> > return ct_send_nb(ct, action, len, flags);
> > 
> > ret = ct_send(ct, action, len, response_buf, 
> >  response_buf_size, );
> > 
> >  Then why is there a change in ct_send here, which is not the new
> >  non-blocking path?
> > 
> > >>>
> > >>> There is not a change to ct_send(), just to intel_guc_ct_send.
> > >>
> > >> I was doing by the diff which says:
> > >>
> > >> static int ct_send(struct intel_guc_ct *ct,
> > >> const u32 *action,
> > >> u32 len,
> > >> @@ -473,6 +541,7 @@ static int ct_send(struct intel_guc_ct *ct,
> > >> u32 response_buf_size,
> > >> u32 *status)
> > >> {
> > >> +struct intel_guc_ct_buffer *ctb = >ctbs.send;
> > >>  struct ct_request request;
> > >>  unsigned long flags;
> > >>  u32 fence;
> > >> @@ -482,8 +551,20 @@ static int ct_send(struct intel_guc_ct *ct,
> > >>  GEM_BUG_ON(!len);
> > >>  GEM_BUG_ON(len & ~GUC_CT_MSG_LEN_MASK);
> > >>  GEM_BUG_ON(!response_buf && response_buf_size);
> > >> +might_sleep();
> > >> +/*
> > >> + * We use a lazy spin wait loop here as we believe that if 
> > >> the CT
> > >> + * buffers are sized correctly the flow control condition 
> > >> should be
> > >> + 

[PATCH] drm/msm/dpu: Avoid ABBA deadlock between IRQ modules

2021-06-09 Thread Bjorn Andersson
Handling of the interrupt callback lists is done in dpu_core_irq.c,
under the "cb_lock" spinlock. When these operations results in the need
for enabling or disabling the IRQ in the hardware the code jumps to
dpu_hw_interrupts.c, which protects its operations with "irq_lock"
spinlock.

When an interrupt fires, dpu_hw_intr_dispatch_irq() inspects the
hardware state while holding the "irq_lock" spinlock and jumps to
dpu_core_irq_callback_handler() to invoke the registered handlers, which
traverses the callback list under the "cb_lock" spinlock.

As such, in the event that these happen concurrently we'll end up with
a deadlock.

Prior to '1c1e7763a6d4 ("drm/msm/dpu: simplify IRQ enabling/disabling")'
the enable/disable of the hardware interrupt was done outside the
"cb_lock" region, optimistically by using an atomic enable-counter for
each interrupt and a warning print if someone changed the list between
the atomic_read and the time the operation concluded.

Rather than re-introducing the large array of atomics, serialize the
register/unregister operations under a single mutex.

Fixes: 1c1e7763a6d4 ("drm/msm/dpu: simplify IRQ enabling/disabling")
Signed-off-by: Bjorn Andersson 
---
 drivers/gpu/drm/msm/disp/dpu1/dpu_core_irq.c | 10 +++---
 drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h  |  2 ++
 2 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_core_irq.c 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_core_irq.c
index 4f110c428b60..62bbe35eff7b 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_core_irq.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_core_irq.c
@@ -82,11 +82,13 @@ int dpu_core_irq_register_callback(struct dpu_kms *dpu_kms, 
int irq_idx,
 
DPU_DEBUG("[%pS] irq_idx=%d\n", __builtin_return_address(0), irq_idx);
 
+   mutex_lock(&dpu_kms->irq_obj.hw_enable_lock);
spin_lock_irqsave(&dpu_kms->irq_obj.cb_lock, irq_flags);
trace_dpu_core_irq_register_callback(irq_idx, register_irq_cb);
list_del_init(&register_irq_cb->list);
list_add_tail(&register_irq_cb->list,
&dpu_kms->irq_obj.irq_cb_tbl[irq_idx]);
+   spin_unlock_irqrestore(&dpu_kms->irq_obj.cb_lock, irq_flags);
if (list_is_first(&register_irq_cb->list,
&dpu_kms->irq_obj.irq_cb_tbl[irq_idx])) {
int ret = dpu_kms->hw_intr->ops.enable_irq(
@@ -96,8 +98,7 @@ int dpu_core_irq_register_callback(struct dpu_kms *dpu_kms, 
int irq_idx,
DPU_ERROR("Fail to enable IRQ for irq_idx:%d\n",
irq_idx);
}
-
-   spin_unlock_irqrestore(&dpu_kms->irq_obj.cb_lock, irq_flags);
+   mutex_unlock(&dpu_kms->irq_obj.hw_enable_lock);
 
return 0;
 }
@@ -127,9 +128,11 @@ int dpu_core_irq_unregister_callback(struct dpu_kms 
*dpu_kms, int irq_idx,
 
DPU_DEBUG("[%pS] irq_idx=%d\n", __builtin_return_address(0), irq_idx);
 
+   mutex_lock(&dpu_kms->irq_obj.hw_enable_lock);
spin_lock_irqsave(&dpu_kms->irq_obj.cb_lock, irq_flags);
trace_dpu_core_irq_unregister_callback(irq_idx, register_irq_cb);
list_del_init(&register_irq_cb->list);
+   spin_unlock_irqrestore(&dpu_kms->irq_obj.cb_lock, irq_flags);
/* empty callback list but interrupt is still enabled */
if (list_empty(&dpu_kms->irq_obj.irq_cb_tbl[irq_idx])) {
int ret = dpu_kms->hw_intr->ops.disable_irq(
@@ -140,7 +143,7 @@ int dpu_core_irq_unregister_callback(struct dpu_kms 
*dpu_kms, int irq_idx,
irq_idx);
DPU_DEBUG("irq_idx=%d ret=%d\n", irq_idx, ret);
}
-   spin_unlock_irqrestore(&dpu_kms->irq_obj.cb_lock, irq_flags);
+   mutex_unlock(&dpu_kms->irq_obj.hw_enable_lock);
 
return 0;
 }
@@ -207,6 +210,7 @@ void dpu_core_irq_preinstall(struct dpu_kms *dpu_kms)
dpu_disable_all_irqs(dpu_kms);
pm_runtime_put_sync(&dpu_kms->pdev->dev);
 
+   mutex_init(&dpu_kms->irq_obj.hw_enable_lock);
spin_lock_init(&dpu_kms->irq_obj.cb_lock);
 
/* Create irq callbacks for all possible irq_idx */
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h
index f6840b1af6e4..5a162caea29d 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h
@@ -83,6 +83,7 @@ struct dpu_irq_callback {
  * @total_irq:total number of irq_idx obtained from HW interrupts mapping
  * @irq_cb_tbl:   array of IRQ callbacks setting
  * @cb_lock:  callback lock
+ * @hw_enable_lock: lock to synchronize callback register and unregister
  * @debugfs_file: debugfs file for irq statistics
  */
 struct dpu_irq {
@@ -90,6 +91,7 @@ struct dpu_irq {
struct list_head *irq_cb_tbl;
atomic_t *irq_counts;
spinlock_t cb_lock;
+   struct mutex hw_enable_lock;
 };
 
 struct dpu_kms {
-- 
2.29.2



Re: [Intel-gfx] [RFC PATCH 36/97] drm/i915/guc: Add non blocking CTB send function

2021-06-09 Thread Matthew Brost
On Wed, Jun 09, 2021 at 03:58:38PM +0200, Michal Wajdeczko wrote:
> 
> 
> On 08.06.2021 10:39, Tvrtko Ursulin wrote:
> > 
> > On 07/06/2021 18:31, Matthew Brost wrote:
> >> On Thu, May 27, 2021 at 04:11:50PM +0100, Tvrtko Ursulin wrote:
> >>>
> >>> On 27/05/2021 15:35, Matthew Brost wrote:
>  On Thu, May 27, 2021 at 11:02:24AM +0100, Tvrtko Ursulin wrote:
> >
> > On 26/05/2021 19:10, Matthew Brost wrote:
> >
> > [snip]
> >
> >> +static int ct_send_nb(struct intel_guc_ct *ct,
> >> +  const u32 *action,
> >> +  u32 len,
> >> +  u32 flags)
> >> +{
> >> +    struct intel_guc_ct_buffer *ctb = >ctbs.send;
> >> +    unsigned long spin_flags;
> >> +    u32 fence;
> >> +    int ret;
> >> +
> >> +    spin_lock_irqsave(>lock, spin_flags);
> >> +
> >> +    ret = ctb_has_room(ctb, len + 1);
> >> +    if (unlikely(ret))
> >> +    goto out;
> >> +
> >> +    fence = ct_get_next_fence(ct);
> >> +    ret = ct_write(ct, action, len, fence, flags);
> >> +    if (unlikely(ret))
> >> +    goto out;
> >> +
> >> +    intel_guc_notify(ct_to_guc(ct));
> >> +
> >> +out:
> >> +    spin_unlock_irqrestore(>lock, spin_flags);
> >> +
> >> +    return ret;
> >> +}
> >> +
> >>   static int ct_send(struct intel_guc_ct *ct,
> >>  const u32 *action,
> >>  u32 len,
> >> @@ -473,6 +541,7 @@ static int ct_send(struct intel_guc_ct *ct,
> >>  u32 response_buf_size,
> >>  u32 *status)
> >>   {
> >> +    struct intel_guc_ct_buffer *ctb = >ctbs.send;
> >>   struct ct_request request;
> >>   unsigned long flags;
> >>   u32 fence;
> >> @@ -482,8 +551,20 @@ static int ct_send(struct intel_guc_ct *ct,
> >>   GEM_BUG_ON(!len);
> >>   GEM_BUG_ON(len & ~GUC_CT_MSG_LEN_MASK);
> >>   GEM_BUG_ON(!response_buf && response_buf_size);
> >> +    might_sleep();
> >
> > Sleep is just cond_resched below or there is more?
> >
> 
>  Yes, the cond_resched.
> 
> >> +    /*
> >> + * We use a lazy spin wait loop here as we believe that
> >> if the CT
> >> + * buffers are sized correctly the flow control condition
> >> should be
> >> + * rare.
> >> + */
> >> +retry:
> >>   spin_lock_irqsave(>ctbs.send.lock, flags);
> >> +    if (unlikely(!ctb_has_room(ctb, len + 1))) {
> >> +    spin_unlock_irqrestore(>ctbs.send.lock, flags);
> >> +    cond_resched();
> >> +    goto retry;
> >> +    }
> >
> > If this patch is about adding a non-blocking send function, and
> > below we can
> > see that it creates a fork:
> >
> > intel_guc_ct_send:
> > ...
> > if (flags & INTEL_GUC_SEND_NB)
> >     return ct_send_nb(ct, action, len, flags);
> >
> >  ret = ct_send(ct, action, len, response_buf,
> > response_buf_size, );
> >
> > Then why is there a change in ct_send here, which is not the new
> > non-blocking path?
> >
> 
>  There is not a change to ct_send(), just to intel_guc_ct_send.
> >>>
> >>> I was doing by the diff which says:
> >>>
> >>>     static int ct_send(struct intel_guc_ct *ct,
> >>>    const u32 *action,
> >>>    u32 len,
> >>> @@ -473,6 +541,7 @@ static int ct_send(struct intel_guc_ct *ct,
> >>>    u32 response_buf_size,
> >>>    u32 *status)
> >>>     {
> >>> +    struct intel_guc_ct_buffer *ctb = >ctbs.send;
> >>>     struct ct_request request;
> >>>     unsigned long flags;
> >>>     u32 fence;
> >>> @@ -482,8 +551,20 @@ static int ct_send(struct intel_guc_ct *ct,
> >>>     GEM_BUG_ON(!len);
> >>>     GEM_BUG_ON(len & ~GUC_CT_MSG_LEN_MASK);
> >>>     GEM_BUG_ON(!response_buf && response_buf_size);
> >>> +    might_sleep();
> >>> +    /*
> >>> + * We use a lazy spin wait loop here as we believe that if
> >>> the CT
> >>> + * buffers are sized correctly the flow control condition
> >>> should be
> >>> + * rare.
> >>> + */
> >>> +retry:
> >>>     spin_lock_irqsave(>ctbs.send.lock, flags);
> >>> +    if (unlikely(!ctb_has_room(ctb, len + 1))) {
> >>> +    spin_unlock_irqrestore(>ctbs.send.lock, flags);
> >>> +    cond_resched();
> >>> +    goto retry;
> >>> +  

Re: [PATCH v4 2/2] drm/doc: document drm_mode_get_plane

2021-06-09 Thread Leandro Ribeiro



On 6/9/21 8:00 PM, Leandro Ribeiro wrote:
> Add a small description and document struct fields of
> drm_mode_get_plane.
> 
> Signed-off-by: Leandro Ribeiro 
> ---
>  include/uapi/drm/drm_mode.h | 36 
>  1 file changed, 36 insertions(+)
> 
> diff --git a/include/uapi/drm/drm_mode.h b/include/uapi/drm/drm_mode.h
> index a5e76aa06ad5..67bcd8e1931c 100644
> --- a/include/uapi/drm/drm_mode.h
> +++ b/include/uapi/drm/drm_mode.h
> @@ -312,16 +312,52 @@ struct drm_mode_set_plane {
>   __u32 src_w;
>  };
> 
> +/**
> + * struct drm_mode_get_plane - Get plane metadata.
> + *
> + * Userspace can perform a GETPLANE ioctl to retrieve information about a
> + * plane.
> + *
> + * To retrieve the number of formats supported, set @count_format_types to 
> zero
> + * and call the ioctl. @count_format_types will be updated with the value.
> + *
> + * To retrieve these formats, allocate an array with the memory needed to 
> store
> + * @count_format_types formats. Point @format_type_ptr to this array and call
> + * the ioctl again (with @count_format_types still set to the value returned 
> in
> + * the first ioctl call).
> + *
> + * Between one ioctl and the other, the number of formats may change.
> + * Userspace should retry the last ioctl until this number stabilizes. The
> + * kernel won't fill any array which doesn't have the expected length.
> + */

Actually I don't know if this last paragraph applies. For connectors,
for instance, I can see this happening because of hot-plugging. But for
plane formats I have no idea. As in libdrm we have this algorithm, I've
decided to describe it here.

>  struct drm_mode_get_plane {
> + /**
> +  * @plane_id: Object ID of the plane whose information should be
> +  * retrieved. Set by caller.
> +  */
>   __u32 plane_id;
> 
> + /** @crtc_id: Object ID of the current CRTC. */
>   __u32 crtc_id;
> + /** @fb_id: Object ID of the current fb. */
>   __u32 fb_id;
> 
> + /**
> +  * @possible_crtcs: Bitmask of CRTC's compatible with the plane. CRTC's
> +  * are created and they receive an index, which corresponds to their
> +  * position in the bitmask. Bit N corresponds to
> +  * :ref:`CRTC index` N.
> +  */
>   __u32 possible_crtcs;
> + /** @gamma_size: Number of entries of the legacy gamma lookup table. */
>   __u32 gamma_size;
> 
> + /** @count_format_types: Number of formats. */
>   __u32 count_format_types;
> + /**
> +  * @format_type_ptr: Pointer to ``__u32`` array of formats that are
> +  * supported by the plane. These formats do not require modifiers.
> +  */
>   __u64 format_type_ptr;
>  };
> 
> --
> 2.31.1
> 
> 


[PATCH v4 2/2] drm/doc: document drm_mode_get_plane

2021-06-09 Thread Leandro Ribeiro
Add a small description and document struct fields of
drm_mode_get_plane.

Signed-off-by: Leandro Ribeiro 
---
 include/uapi/drm/drm_mode.h | 36 
 1 file changed, 36 insertions(+)

diff --git a/include/uapi/drm/drm_mode.h b/include/uapi/drm/drm_mode.h
index a5e76aa06ad5..67bcd8e1931c 100644
--- a/include/uapi/drm/drm_mode.h
+++ b/include/uapi/drm/drm_mode.h
@@ -312,16 +312,52 @@ struct drm_mode_set_plane {
__u32 src_w;
 };

+/**
+ * struct drm_mode_get_plane - Get plane metadata.
+ *
+ * Userspace can perform a GETPLANE ioctl to retrieve information about a
+ * plane.
+ *
+ * To retrieve the number of formats supported, set @count_format_types to zero
+ * and call the ioctl. @count_format_types will be updated with the value.
+ *
+ * To retrieve these formats, allocate an array with the memory needed to store
+ * @count_format_types formats. Point @format_type_ptr to this array and call
+ * the ioctl again (with @count_format_types still set to the value returned in
+ * the first ioctl call).
+ *
+ * Between one ioctl and the other, the number of formats may change.
+ * Userspace should retry the last ioctl until this number stabilizes. The
+ * kernel won't fill any array which doesn't have the expected length.
+ */
 struct drm_mode_get_plane {
+   /**
+* @plane_id: Object ID of the plane whose information should be
+* retrieved. Set by caller.
+*/
__u32 plane_id;

+   /** @crtc_id: Object ID of the current CRTC. */
__u32 crtc_id;
+   /** @fb_id: Object ID of the current fb. */
__u32 fb_id;

+   /**
+* @possible_crtcs: Bitmask of CRTC's compatible with the plane. CRTC's
+* are created and they receive an index, which corresponds to their
+* position in the bitmask. Bit N corresponds to
+* :ref:`CRTC index` N.
+*/
__u32 possible_crtcs;
+   /** @gamma_size: Number of entries of the legacy gamma lookup table. */
__u32 gamma_size;

+   /** @count_format_types: Number of formats. */
__u32 count_format_types;
+   /**
+* @format_type_ptr: Pointer to ``__u32`` array of formats that are
+* supported by the plane. These formats do not require modifiers.
+*/
__u64 format_type_ptr;
 };

--
2.31.1



[PATCH v4 1/2] drm/doc: document how userspace should find out CRTC index

2021-06-09 Thread Leandro Ribeiro
In this patch we add a section to document what userspace should do to
find out the CRTC index. This is important as they may be many places in
the documentation that need this, so it's better to just point to this
section and avoid repetition.

Signed-off-by: Leandro Ribeiro 
---
 Documentation/gpu/drm-uapi.rst| 13 +
 drivers/gpu/drm/drm_debugfs_crc.c |  8 
 include/uapi/drm/drm.h|  4 ++--
 3 files changed, 19 insertions(+), 6 deletions(-)

diff --git a/Documentation/gpu/drm-uapi.rst b/Documentation/gpu/drm-uapi.rst
index 04bdc7a91d53..7e51dd40bf6e 100644
--- a/Documentation/gpu/drm-uapi.rst
+++ b/Documentation/gpu/drm-uapi.rst
@@ -457,6 +457,19 @@ Userspace API Structures
 .. kernel-doc:: include/uapi/drm/drm_mode.h
:doc: overview

+.. _crtc_index:
+
+CRTC index
+--
+
+CRTC's have both an object ID and an index, and they are not the same thing.
+The index is used in cases where a densely packed identifier for a CRTC is
+needed, for instance a bitmask of CRTC's. The member possible_crtcs of struct
+drm_mode_get_plane is an example.
+
+DRM_IOCTL_MODE_GETRESOURCES populates a structure with an array of CRTC ID's,
+and the CRTC index is its position in this array.
+
 .. kernel-doc:: include/uapi/drm/drm.h
:internal:

diff --git a/drivers/gpu/drm/drm_debugfs_crc.c 
b/drivers/gpu/drm/drm_debugfs_crc.c
index 3dd70d813f69..bbc3bc4ba844 100644
--- a/drivers/gpu/drm/drm_debugfs_crc.c
+++ b/drivers/gpu/drm/drm_debugfs_crc.c
@@ -46,10 +46,10 @@
  * it reached a given hardware component (a CRC sampling "source").
  *
  * Userspace can control generation of CRCs in a given CRTC by writing to the
- * file dri/0/crtc-N/crc/control in debugfs, with N being the index of the 
CRTC.
- * Accepted values are source names (which are driver-specific) and the "auto"
- * keyword, which will let the driver select a default source of frame CRCs
- * for this CRTC.
+ * file dri/0/crtc-N/crc/control in debugfs, with N being the :ref:`index of
+ * the CRTC`. Accepted values are source names (which are
+ * driver-specific) and the "auto" keyword, which will let the driver select a
+ * default source of frame CRCs for this CRTC.
  *
  * Once frame CRC generation is enabled, userspace can capture them by reading
  * the dri/0/crtc-N/crc/data file. Each line in that file contains the frame
diff --git a/include/uapi/drm/drm.h b/include/uapi/drm/drm.h
index 67b94bc3c885..bbf4e76daa55 100644
--- a/include/uapi/drm/drm.h
+++ b/include/uapi/drm/drm.h
@@ -635,8 +635,8 @@ struct drm_gem_open {
 /**
  * DRM_CAP_VBLANK_HIGH_CRTC
  *
- * If set to 1, the kernel supports specifying a CRTC index in the high bits of
- * _wait_vblank_request.type.
+ * If set to 1, the kernel supports specifying a :ref:`CRTC index`
+ * in the high bits of _wait_vblank_request.type.
  *
  * Starting kernel version 2.6.39, this capability is always set to 1.
  */
--
2.31.1



[PATCH v4 0/2] Document drm_mode_get_plane

2021-06-09 Thread Leandro Ribeiro
v2: possible_crtcs field is a bitmask, not a pointer. Suggested by
Ville Syrjälä 

v3: document how userspace should find out CRTC index. Also,
document that field 'gamma_size' represents the number of
entries in the lookup table. Suggested by Pekka Paalanen
 and Daniel Vetter 

v4: document IN and OUT fields and make the description more
concise. Suggested by Pekka Paalanen 

Leandro Ribeiro (2):
  drm/doc: document how userspace should find out CRTC index
  drm/doc: document drm_mode_get_plane

 Documentation/gpu/drm-uapi.rst| 13 +++
 drivers/gpu/drm/drm_debugfs_crc.c |  8 +++
 include/uapi/drm/drm.h|  4 ++--
 include/uapi/drm/drm_mode.h   | 36 +++
 4 files changed, 55 insertions(+), 6 deletions(-)

--
2.31.1



Re: [PATCH v3 2/3] dt-bindings: msm: dsi: document phy-type property for 7nm dsi phy

2021-06-09 Thread Rob Herring
On Tue, Jun 08, 2021 at 03:53:28PM -0400, Jonathan Marek wrote:
> Document a new phy-type property which will be used to determine whether
> the phy should operate in D-PHY or C-PHY mode.
> 
> Signed-off-by: Jonathan Marek 
> ---
>  .../devicetree/bindings/display/msm/dsi-phy-7nm.yaml  | 4 
>  include/dt-bindings/phy/phy.h | 2 ++
>  2 files changed, 6 insertions(+)
> 
> diff --git a/Documentation/devicetree/bindings/display/msm/dsi-phy-7nm.yaml 
> b/Documentation/devicetree/bindings/display/msm/dsi-phy-7nm.yaml
> index bf16b1c65e10..d447b517ea19 100644
> --- a/Documentation/devicetree/bindings/display/msm/dsi-phy-7nm.yaml
> +++ b/Documentation/devicetree/bindings/display/msm/dsi-phy-7nm.yaml
> @@ -34,6 +34,10 @@ properties:
>  description: |
>Connected to VDD_A_DSI_PLL_0P9 pin (or VDDA_DSI{0,1}_PLL_0P9 for 
> sm8150)
>  
> +  phy-type:
> +description: |
> +  D-PHY (default) or C-PHY mode: PHY_TYPE_DSI_DPHY or PHY_TYPE_DSI_CPHY

Don't write prose for what can be schema. Unfortunately, can't do 
defines here, but you need:

enum: [ 10, 11 ]
default: 10

> +
>  required:
>- compatible
>- reg
> diff --git a/include/dt-bindings/phy/phy.h b/include/dt-bindings/phy/phy.h
> index 887a31b250a8..b978dac16bb8 100644
> --- a/include/dt-bindings/phy/phy.h
> +++ b/include/dt-bindings/phy/phy.h
> @@ -20,5 +20,7 @@
>  #define PHY_TYPE_XPCS7
>  #define PHY_TYPE_SGMII   8
>  #define PHY_TYPE_QSGMII  9
> +#define PHY_TYPE_DSI_DPHY10
> +#define PHY_TYPE_DSI_CPHY11
>  
>  #endif /* _DT_BINDINGS_PHY */
> -- 
> 2.26.1


Re: [PATCH v4 6/6] drm/msm: devcoredump iommu fault support

2021-06-09 Thread Rob Clark
On Tue, Jun 8, 2021 at 8:20 AM Jordan Crouse  wrote:
>
> On Tue, Jun 01, 2021 at 03:47:25PM -0700, Rob Clark wrote:
> > From: Rob Clark 
> >
> > Wire up support to stall the SMMU on iova fault, and collect a devcore-
> > dump snapshot for easier debugging of faults.
> >
> > Currently this is a6xx-only, but mostly only because so far it is the
> > only one using adreno-smmu-priv.
> >
> > Signed-off-by: Rob Clark 
> > ---
> >  drivers/gpu/drm/msm/adreno/a6xx_gpu.c   | 29 +--
> >  drivers/gpu/drm/msm/adreno/adreno_gpu.c | 15 
> >  drivers/gpu/drm/msm/msm_gem.h   |  1 +
> >  drivers/gpu/drm/msm/msm_gem_submit.c|  1 +
> >  drivers/gpu/drm/msm/msm_gpu.c   | 48 +
> >  drivers/gpu/drm/msm/msm_gpu.h   | 17 +
> >  drivers/gpu/drm/msm/msm_gpummu.c|  5 +++
> >  drivers/gpu/drm/msm/msm_iommu.c | 11 ++
> >  drivers/gpu/drm/msm/msm_mmu.h   |  1 +
> >  9 files changed, 126 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
> > b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> > index 094dc17fd20f..0dcde917e575 100644
> > --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> > +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> > @@ -1008,6 +1008,16 @@ static int a6xx_fault_handler(void *arg, unsigned 
> > long iova, int flags, void *da
> >   struct msm_gpu *gpu = arg;
> >   struct adreno_smmu_fault_info *info = data;
> >   const char *type = "UNKNOWN";
> > + const char *block;
> > + bool do_devcoredump = info && !READ_ONCE(gpu->crashstate);
> > +
> > + /*
> > +  * If we aren't going to be resuming later from fault_worker, then do
> > +  * it now.
> > +  */
> > + if (!do_devcoredump) {
> > + gpu->aspace->mmu->funcs->resume_translation(gpu->aspace->mmu);
> > + }
> >
> >   /*
> >* Print a default message if we couldn't get the data from the
> > @@ -1031,15 +1041,30 @@ static int a6xx_fault_handler(void *arg, unsigned 
> > long iova, int flags, void *da
> >   else if (info->fsr & ARM_SMMU_FSR_EF)
> >   type = "EXTERNAL";
> >
> > + block = a6xx_fault_block(gpu, info->fsynr1 & 0xff);
> > +
> >   pr_warn_ratelimited("*** gpu fault: ttbr0=%.16llx iova=%.16lx dir=%s 
> > type=%s source=%s (%u,%u,%u,%u)\n",
> >   info->ttbr0, iova,
> > - flags & IOMMU_FAULT_WRITE ? "WRITE" : "READ", type,
> > - a6xx_fault_block(gpu, info->fsynr1 & 0xff),
> > + flags & IOMMU_FAULT_WRITE ? "WRITE" : "READ",
> > + type, block,
> >   gpu_read(gpu, REG_A6XX_CP_SCRATCH_REG(4)),
> >   gpu_read(gpu, REG_A6XX_CP_SCRATCH_REG(5)),
> >   gpu_read(gpu, REG_A6XX_CP_SCRATCH_REG(6)),
> >   gpu_read(gpu, REG_A6XX_CP_SCRATCH_REG(7)));
> >
> > + if (do_devcoredump) {
> > + /* Turn off the hangcheck timer to keep it from bothering us 
> > */
> > + del_timer(>hangcheck_timer);
> > +
> > + gpu->fault_info.ttbr0 = info->ttbr0;
> > + gpu->fault_info.iova  = iova;
> > + gpu->fault_info.flags = flags;
> > + gpu->fault_info.type  = type;
> > + gpu->fault_info.block = block;
> > +
> > + kthread_queue_work(gpu->worker, >fault_work);
> > + }
> > +
> >   return 0;
> >  }
> >
> > diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c 
> > b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
> > index cf897297656f..4e88d4407667 100644
> > --- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c
> > +++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
> > @@ -684,6 +684,21 @@ void adreno_show(struct msm_gpu *gpu, struct 
> > msm_gpu_state *state,
> >   adreno_gpu->info->revn, adreno_gpu->rev.core,
> >   adreno_gpu->rev.major, adreno_gpu->rev.minor,
> >   adreno_gpu->rev.patchid);
> > + /*
> > +  * If this is state collected due to iova fault, so fault related info
> > +  *
> > +  * TTBR0 would not be zero, so this is a good way to distinguish
> > +  */
> > + if (state->fault_info.ttbr0) {
> > + const struct msm_gpu_fault_info *info = >fault_info;
> > +
> > + drm_puts(p, "fault-info:\n");
> > + drm_printf(p, "  - ttbr0=%.16llx\n", info->ttbr0);
> > + drm_printf(p, "  - iova=%.16lx\n", info->iova);
> > + drm_printf(p, "  - dir=%s\n", info->flags & IOMMU_FAULT_WRITE 
> > ? "WRITE" : "READ");
> > + drm_printf(p, "  - type=%s\n", info->type);
> > + drm_printf(p, "  - source=%s\n", info->block);
> > + }
> >
> >   drm_printf(p, "rbbm-status: 0x%08x\n", state->rbbm_status);
> >
> > diff --git a/drivers/gpu/drm/msm/msm_gem.h b/drivers/gpu/drm/msm/msm_gem.h
> > index 03e2cc2a2ce1..405f8411e395 100644
> > --- a/drivers/gpu/drm/msm/msm_gem.h
> > +++ 

Re: [PATCH v4 5/6] drm/msm: Add crashdump support for stalled SMMU

2021-06-09 Thread Rob Clark
On Tue, Jun 8, 2021 at 8:12 AM Jordan Crouse  wrote:
>
> On Tue, Jun 01, 2021 at 03:47:24PM -0700, Rob Clark wrote:
> > From: Rob Clark 
> >
> > For collecting devcoredumps with the SMMU stalled after an iova fault,
> > we need to skip the parts of the GPU state which are normally collected
> > with the hw crashdumper, since with the SMMU stalled the hw would be
> > unable to write out the requested state to memory.
>
> On a5xx and a6xx you can query RBBM_STATUS3 bit 24 to see if the IOMMU is
> stalled.  That could be an alternative option to adding the "stalled"
> infrastructure across all targets.

Hmm, I suppose it is really only a5xx/a6xx that needs to do something
differently in this case, because of crashdumper, so maybe this would
be a reasonable approach

BR,
-R

> Jordan
> >
> > Signed-off-by: Rob Clark 
> > ---
> >  drivers/gpu/drm/msm/adreno/a2xx_gpu.c   |  2 +-
> >  drivers/gpu/drm/msm/adreno/a3xx_gpu.c   |  2 +-
> >  drivers/gpu/drm/msm/adreno/a4xx_gpu.c   |  2 +-
> >  drivers/gpu/drm/msm/adreno/a5xx_gpu.c   |  5 ++-
> >  drivers/gpu/drm/msm/adreno/a6xx_gpu.h   |  2 +-
> >  drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c | 43 -
> >  drivers/gpu/drm/msm/msm_debugfs.c   |  2 +-
> >  drivers/gpu/drm/msm/msm_gpu.c   |  7 ++--
> >  drivers/gpu/drm/msm/msm_gpu.h   |  2 +-
> >  9 files changed, 47 insertions(+), 20 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/msm/adreno/a2xx_gpu.c 
> > b/drivers/gpu/drm/msm/adreno/a2xx_gpu.c
> > index bdc989183c64..d2c31fae64fd 100644
> > --- a/drivers/gpu/drm/msm/adreno/a2xx_gpu.c
> > +++ b/drivers/gpu/drm/msm/adreno/a2xx_gpu.c
> > @@ -434,7 +434,7 @@ static void a2xx_dump(struct msm_gpu *gpu)
> >   adreno_dump(gpu);
> >  }
> >
> > -static struct msm_gpu_state *a2xx_gpu_state_get(struct msm_gpu *gpu)
> > +static struct msm_gpu_state *a2xx_gpu_state_get(struct msm_gpu *gpu, bool 
> > stalled)
> >  {
> >   struct msm_gpu_state *state = kzalloc(sizeof(*state), GFP_KERNEL);
> >
> > diff --git a/drivers/gpu/drm/msm/adreno/a3xx_gpu.c 
> > b/drivers/gpu/drm/msm/adreno/a3xx_gpu.c
> > index 4534633fe7cd..b1a6f87d74ef 100644
> > --- a/drivers/gpu/drm/msm/adreno/a3xx_gpu.c
> > +++ b/drivers/gpu/drm/msm/adreno/a3xx_gpu.c
> > @@ -464,7 +464,7 @@ static void a3xx_dump(struct msm_gpu *gpu)
> >   adreno_dump(gpu);
> >  }
> >
> > -static struct msm_gpu_state *a3xx_gpu_state_get(struct msm_gpu *gpu)
> > +static struct msm_gpu_state *a3xx_gpu_state_get(struct msm_gpu *gpu, bool 
> > stalled)
> >  {
> >   struct msm_gpu_state *state = kzalloc(sizeof(*state), GFP_KERNEL);
> >
> > diff --git a/drivers/gpu/drm/msm/adreno/a4xx_gpu.c 
> > b/drivers/gpu/drm/msm/adreno/a4xx_gpu.c
> > index 82bebb40234d..22780a594d6f 100644
> > --- a/drivers/gpu/drm/msm/adreno/a4xx_gpu.c
> > +++ b/drivers/gpu/drm/msm/adreno/a4xx_gpu.c
> > @@ -549,7 +549,7 @@ static const unsigned int a405_registers[] = {
> >   ~0 /* sentinel */
> >  };
> >
> > -static struct msm_gpu_state *a4xx_gpu_state_get(struct msm_gpu *gpu)
> > +static struct msm_gpu_state *a4xx_gpu_state_get(struct msm_gpu *gpu, bool 
> > stalled)
> >  {
> >   struct msm_gpu_state *state = kzalloc(sizeof(*state), GFP_KERNEL);
> >
> > diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c 
> > b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> > index a0eef5d9b89b..2e7714b1a17f 100644
> > --- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> > +++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> > @@ -1519,7 +1519,7 @@ static void a5xx_gpu_state_get_hlsq_regs(struct 
> > msm_gpu *gpu,
> >   msm_gem_kernel_put(dumper.bo, gpu->aspace, true);
> >  }
> >
> > -static struct msm_gpu_state *a5xx_gpu_state_get(struct msm_gpu *gpu)
> > +static struct msm_gpu_state *a5xx_gpu_state_get(struct msm_gpu *gpu, bool 
> > stalled)
> >  {
> >   struct a5xx_gpu_state *a5xx_state = kzalloc(sizeof(*a5xx_state),
> >   GFP_KERNEL);
> > @@ -1536,7 +1536,8 @@ static struct msm_gpu_state 
> > *a5xx_gpu_state_get(struct msm_gpu *gpu)
> >   a5xx_state->base.rbbm_status = gpu_read(gpu, REG_A5XX_RBBM_STATUS);
> >
> >   /* Get the HLSQ regs with the help of the crashdumper */
> > - a5xx_gpu_state_get_hlsq_regs(gpu, a5xx_state);
> > + if (!stalled)
> > + a5xx_gpu_state_get_hlsq_regs(gpu, a5xx_state);
> >
> >   a5xx_set_hwcg(gpu, true);
> >
> > diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.h 
> > b/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
> > index ce0610c5256f..e0f06ce4e1a9 100644
> > --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
> > +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
> > @@ -86,7 +86,7 @@ unsigned long a6xx_gmu_get_freq(struct msm_gpu *gpu);
> >  void a6xx_show(struct msm_gpu *gpu, struct msm_gpu_state *state,
> >   struct drm_printer *p);
> >
> > -struct msm_gpu_state *a6xx_gpu_state_get(struct msm_gpu *gpu);
> > +struct msm_gpu_state *a6xx_gpu_state_get(struct msm_gpu *gpu, bool 
> > stalled);
> >  int a6xx_gpu_state_put(struct 

[PULL] drm-intel-next

2021-06-09 Thread Rodrigo Vivi
Hi Dave and Daniel,

Here goes the last pull request towards 5.14.
Mostly it is ADL-P enabling related and a few other things.

drm-intel-next-2021-06-09:

Cross-subsystem Changes:

-  x86/gpu: add JasperLake to gen11 early quirks
  (Although the patch lacks the Ack info, it has been Acked by Borislav)

Driver Changes:

- General DMC improves (Anusha)
- More ADL-P enabling (Vandita, Matt, Jose, Mika, Anusha, Imre, Lucas, Jani, 
Manasi, Ville, Stanislav)
- Introduce MBUS relative dbuf offset (Ville)
- PSR fixes and improvements (Gwan, Jose, Ville)
- Re-enable LTTPR non-transparent LT mode for DPCD_REV < 1.4 (Ville)
- Remove duplicated declarations (Shaokun, Wan)
- Check HDMI sink deep color capabilities during .mode_valid (Ville)
- Fix display flicker screen related to console and FBC (Chris)
- Remaining conversions of GRAPHICS_VER (Lucas)
- Drop invalid FIXME (Jose)
- Fix bigjoiner check in dsc_disable (Vandita)

Thanks,
Rodrigo.

The following changes since commit 9a91e5e0af5e03940d0eec72c36364a1701de240:

  Merge tag 'amd-drm-next-5.14-2021-05-21' of 
https://gitlab.freedesktop.org/agd5f/linux into drm-next (2021-05-21 15:59:05 
+1000)

are available in the Git repository at:

  git://anongit.freedesktop.org/drm/drm-intel tags/drm-intel-next-2021-06-09

for you to fetch changes up to 0d6695b112762aa7aad28c46e65561389b6f50d6:

  drm/i915/adl_p: Same slices mask is not same Dbuf state (2021-06-09 17:24:58 
+0300)


Cross-subsystem Changes:

-  x86/gpu: add JasperLake to gen11 early quirks
  (Although the patch lacks the Ack info, it has been Acked by Borislav)

Driver Changes:

- General DMC improves (Anusha)
- More ADL-P enabling (Vandita, Matt, Jose, Mika, Anusha, Imre, Lucas, Jani, 
Manasi, Ville, Stanislav)
- Introduce MBUS relative dbuf offset (Ville)
- PSR fixes and improvements (Gwan, Jose, Ville)
- Re-enable LTTPR non-transparent LT mode for DPCD_REV < 1.4 (Ville)
- Remove duplicated declarations (Shaokun, Wan)
- Check HDMI sink deep color capabilities during .mode_valid (Ville)
- Fix display flicker screen related to console and FBC (Chris)
- Remaining conversions of GRAPHICS_VER (Lucas)
- Drop invalid FIXME (Jose)
- Fix bigjoiner check in dsc_disable (Vandita)


Anusha Srivatsa (13):
  drm/i915/dmc: s/intel_csr/intel_dmc
  drm/i915/dmc: s/HAS_CSR/HAS_DMC
  drm/i915/dmc: Rename macro names containing csr
  drm/i915/dmc: Rename functions names having "csr"
  drm/i915/dmc: s/intel_csr.c/intel_dmc.c and s/intel_csr.h/intel_dmc.h
  drm/i915/adl_p: Setup ports/phys
  drm/i915/adl_p: Add PLL Support
  drm/i915/adlp: Add PIPE_MISC2 programming
  drm/i915/adl_p: Update memory bandwidth parameters
  drm/i915/gvt: Add missing macro name changes
  drm/i915/dmc: s/DRM_ERROR/drm_err
  drm/i915/dmc: Add intel_dmc_has_payload() helper
  drm/i915/dmc: Move struct intel_dmc to intel_dmc.h

Chris Wilson (1):
  drm/i915/display: relax 2big checking around initial fb

Gwan-gyeong Mun (4):
  drm/i915/display: Replace dc3co_enabled with dc3co_exitline on intel_psr 
struct
  drm/i915/display: Add PSR interrupt error check function
  drm/i915/display: Remove a redundant function argument from 
intel_psr_enable_source()
  drm/i915/display: Introduce new intel_psr_pause/resume function

Imre Deak (9):
  drm/i915/adl_p: Program DP/HDMI link rate to DDI_BUF_CTL
  drm/i915: Reenable LTTPR non-transparent LT mode for DPCD_REV<1.4
  drm/i915/adlp: Require DPT FB CCS color planes to be 2MB aligned
  drm/i915/adlp: Fix GEM VM asserts for DPT VMs
  drm/i915/debugfs: Print remap info for DPT VMAs as well
  drm/i915/adlp: Add missing TBT AUX -> PW#2 power domain dependencies
  drm/i915/ddi: Flush encoder power domain ref puts during driver unload
  drm/i915: Fix incorrect assert about pending power domain async-put work
  drm/i915/adlp: Fix AUX power well -> PHY mapping

Jani Nikula (1):
  drm/i915/adl_p: enable MSO on pipe B

José Roberto de Souza (10):
  drm/i915/adl_p: Implement TC sequences
  drm/i915/adl_p: Don't config MBUS and DBUF during display initialization
  drm/i915/display/adl_p: Drop earlier return in tc_has_modular_fia()
  drm/i915/adl_p: Handle TC cold
  drm/i915: WA for zero memory channel
  drm/i915/display/adl_p: Allow DC3CO in pipe and port B
  drm/i915/display/adl_p: Disable PSR2
  drm/i915/display: Fix fastsets involving PSR
  drm/i915/display: Allow fastsets when DP_SDP_VSC infoframe do not match 
with PSR enabled
  drm/i915/display: Drop FIXME about turn off infoframes

Lucas De Marchi (5):
  drm/i915/display: fix typo when returning table
  drm/i915/gvt: replace IS_GEN and friends with GRAPHICS_VER
  drm/i915/display: replace IS_GEN() in commented code
  drm/i915: replace IS_GEN and friends with GRAPHICS_VER
  

[PATCH 5/5] DONOTMERGE: dma-buf: Get rid of dma_fence_get_rcu_safe

2021-06-09 Thread Jason Ekstrand
This helper existed to handle the weird corner-cases caused by using
SLAB_TYPESAFE_BY_RCU for backing dma_fence.  Now that no one is using
that anymore (i915 was the only real user), dma_fence_get_rcu is
sufficient.  The one slightly annoying thing we have to deal with here
is that dma_fence_get_rcu_safe did an rcu_dereference as well as a
SLAB_TYPESAFE_BY_RCU-safe dma_fence_get_rcu.  This means each call site
ends up being 3 lines instead of 1.

Signed-off-by: Jason Ekstrand 
Cc: Daniel Vetter 
Cc: Christian König 
Cc: Matthew Auld 
Cc: Maarten Lankhorst 
---
 drivers/dma-buf/dma-fence-chain.c |  8 ++--
 drivers/dma-buf/dma-resv.c|  4 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c |  4 +-
 drivers/gpu/drm/i915/i915_active.h|  4 +-
 drivers/gpu/drm/i915/i915_vma.c   |  4 +-
 include/drm/drm_syncobj.h |  4 +-
 include/linux/dma-fence.h | 50 ---
 include/linux/dma-resv.h  |  4 +-
 8 files changed, 23 insertions(+), 59 deletions(-)

diff --git a/drivers/dma-buf/dma-fence-chain.c 
b/drivers/dma-buf/dma-fence-chain.c
index 7d129e68ac701..46dfc7d94d8ed 100644
--- a/drivers/dma-buf/dma-fence-chain.c
+++ b/drivers/dma-buf/dma-fence-chain.c
@@ -15,15 +15,17 @@ static bool dma_fence_chain_enable_signaling(struct 
dma_fence *fence);
  * dma_fence_chain_get_prev - use RCU to get a reference to the previous fence
  * @chain: chain node to get the previous node from
  *
- * Use dma_fence_get_rcu_safe to get a reference to the previous fence of the
- * chain node.
+ * Use rcu_dereference and dma_fence_get_rcu to get a reference to the
+ * previous fence of the chain node.
  */
 static struct dma_fence *dma_fence_chain_get_prev(struct dma_fence_chain 
*chain)
 {
struct dma_fence *prev;
 
rcu_read_lock();
-   prev = dma_fence_get_rcu_safe(>prev);
+   prev = rcu_dereference(chain->prev);
+   if (prev)
+   prev = dma_fence_get_rcu(prev);
rcu_read_unlock();
return prev;
 }
diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c
index f26c71747d43a..cfe0db3cca292 100644
--- a/drivers/dma-buf/dma-resv.c
+++ b/drivers/dma-buf/dma-resv.c
@@ -376,7 +376,9 @@ int dma_resv_copy_fences(struct dma_resv *dst, struct 
dma_resv *src)
dst_list = NULL;
}
 
-   new = dma_fence_get_rcu_safe(>fence_excl);
+   new = rcu_dereference(src->fence_excl);
+   if (new)
+   new = dma_fence_get_rcu(new);
rcu_read_unlock();
 
src_list = dma_resv_shared_list(dst);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
index 72d9b92b17547..0aeb6117f3893 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
@@ -161,7 +161,9 @@ int amdgpu_fence_emit(struct amdgpu_ring *ring, struct 
dma_fence **f,
struct dma_fence *old;
 
rcu_read_lock();
-   old = dma_fence_get_rcu_safe(ptr);
+   old = rcu_dereference(*ptr);
+   if (old)
+   old = dma_fence_get_rcu(old);
rcu_read_unlock();
 
if (old) {
diff --git a/drivers/gpu/drm/i915/i915_active.h 
b/drivers/gpu/drm/i915/i915_active.h
index d0feda68b874f..bd89cfc806ca5 100644
--- a/drivers/gpu/drm/i915/i915_active.h
+++ b/drivers/gpu/drm/i915/i915_active.h
@@ -103,7 +103,9 @@ i915_active_fence_get(struct i915_active_fence *active)
struct dma_fence *fence;
 
rcu_read_lock();
-   fence = dma_fence_get_rcu_safe(>fence);
+   fence = rcu_dereference(active->fence);
+   if (fence)
+   fence = dma_fence_get_rcu(fence);
rcu_read_unlock();
 
return fence;
diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index 0f227f28b2802..ed0388d99197e 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -351,7 +351,9 @@ int i915_vma_wait_for_bind(struct i915_vma *vma)
struct dma_fence *fence;
 
rcu_read_lock();
-   fence = dma_fence_get_rcu_safe(>active.excl.fence);
+   fence = rcu_dereference(vma->active.excl.fence);
+   if (fence)
+   fence = dma_fence_get_rcu(fence);
rcu_read_unlock();
if (fence) {
err = dma_fence_wait(fence, MAX_SCHEDULE_TIMEOUT);
diff --git a/include/drm/drm_syncobj.h b/include/drm/drm_syncobj.h
index 6cf7243a1dc5e..6c45d52988bcc 100644
--- a/include/drm/drm_syncobj.h
+++ b/include/drm/drm_syncobj.h
@@ -105,7 +105,9 @@ drm_syncobj_fence_get(struct drm_syncobj *syncobj)
struct dma_fence *fence;
 
rcu_read_lock();
-   fence = dma_fence_get_rcu_safe(>fence);
+   fence = rcu_dereference(syncobj->fence);
+   if (fence)
+   fence = dma_fence_get_rcu(syncobj->fence);

[PATCH 4/5] dma-buf: Stop using SLAB_TYPESAFE_BY_RCU in selftests

2021-06-09 Thread Jason Ekstrand
The only real-world user of SLAB_TYPESAFE_BY_RCU was i915 and it doesn't
use that anymore so there's no need to be testing it in selftests.

Signed-off-by: Jason Ekstrand 
Cc: Daniel Vetter 
Cc: Christian König 
Cc: Matthew Auld 
Cc: Maarten Lankhorst 
---
 drivers/dma-buf/st-dma-fence-chain.c | 24 
 drivers/dma-buf/st-dma-fence.c   | 27 +--
 2 files changed, 9 insertions(+), 42 deletions(-)

diff --git a/drivers/dma-buf/st-dma-fence-chain.c 
b/drivers/dma-buf/st-dma-fence-chain.c
index 9525f7f561194..73010184559fe 100644
--- a/drivers/dma-buf/st-dma-fence-chain.c
+++ b/drivers/dma-buf/st-dma-fence-chain.c
@@ -19,36 +19,27 @@
 
 #define CHAIN_SZ (4 << 10)
 
-static struct kmem_cache *slab_fences;
-
-static inline struct mock_fence {
+struct mock_fence {
struct dma_fence base;
spinlock_t lock;
-} *to_mock_fence(struct dma_fence *f) {
-   return container_of(f, struct mock_fence, base);
-}
+};
 
 static const char *mock_name(struct dma_fence *f)
 {
return "mock";
 }
 
-static void mock_fence_release(struct dma_fence *f)
-{
-   kmem_cache_free(slab_fences, to_mock_fence(f));
-}
-
 static const struct dma_fence_ops mock_ops = {
.get_driver_name = mock_name,
.get_timeline_name = mock_name,
-   .release = mock_fence_release,
+   .release = dma_fence_free,
 };
 
 static struct dma_fence *mock_fence(void)
 {
struct mock_fence *f;
 
-   f = kmem_cache_alloc(slab_fences, GFP_KERNEL);
+   f = kmalloc(sizeof(*f), GFP_KERNEL);
if (!f)
return NULL;
 
@@ -701,14 +692,7 @@ int dma_fence_chain(void)
pr_info("sizeof(dma_fence_chain)=%zu\n",
sizeof(struct dma_fence_chain));
 
-   slab_fences = KMEM_CACHE(mock_fence,
-SLAB_TYPESAFE_BY_RCU |
-SLAB_HWCACHE_ALIGN);
-   if (!slab_fences)
-   return -ENOMEM;
-
ret = subtests(tests, NULL);
 
-   kmem_cache_destroy(slab_fences);
return ret;
 }
diff --git a/drivers/dma-buf/st-dma-fence.c b/drivers/dma-buf/st-dma-fence.c
index c8a12d7ad71ab..ca98cb0b9525b 100644
--- a/drivers/dma-buf/st-dma-fence.c
+++ b/drivers/dma-buf/st-dma-fence.c
@@ -14,25 +14,16 @@
 
 #include "selftest.h"
 
-static struct kmem_cache *slab_fences;
-
-static struct mock_fence {
+struct mock_fence {
struct dma_fence base;
struct spinlock lock;
-} *to_mock_fence(struct dma_fence *f) {
-   return container_of(f, struct mock_fence, base);
-}
+};
 
 static const char *mock_name(struct dma_fence *f)
 {
return "mock";
 }
 
-static void mock_fence_release(struct dma_fence *f)
-{
-   kmem_cache_free(slab_fences, to_mock_fence(f));
-}
-
 struct wait_cb {
struct dma_fence_cb cb;
struct task_struct *task;
@@ -77,14 +68,14 @@ static const struct dma_fence_ops mock_ops = {
.get_driver_name = mock_name,
.get_timeline_name = mock_name,
.wait = mock_wait,
-   .release = mock_fence_release,
+   .release = dma_fence_free,
 };
 
 static struct dma_fence *mock_fence(void)
 {
struct mock_fence *f;
 
-   f = kmem_cache_alloc(slab_fences, GFP_KERNEL);
+   f = kmalloc(sizeof(*f), GFP_KERNEL);
if (!f)
return NULL;
 
@@ -463,7 +454,7 @@ static int thread_signal_callback(void *arg)
 
rcu_read_lock();
do {
-   f2 = dma_fence_get_rcu_safe(>fences[!t->id]);
+   f2 = dma_fence_get_rcu(t->fences[!t->id]);
} while (!f2 && !kthread_should_stop());
rcu_read_unlock();
 
@@ -563,15 +554,7 @@ int dma_fence(void)
 
pr_info("sizeof(dma_fence)=%zu\n", sizeof(struct dma_fence));
 
-   slab_fences = KMEM_CACHE(mock_fence,
-SLAB_TYPESAFE_BY_RCU |
-SLAB_HWCACHE_ALIGN);
-   if (!slab_fences)
-   return -ENOMEM;
-
ret = subtests(tests, NULL);
 
-   kmem_cache_destroy(slab_fences);
-
return ret;
 }
-- 
2.31.1



[PATCH 3/5] drm/i915: Stop using SLAB_TYPESAFE_BY_RCU for i915_request

2021-06-09 Thread Jason Ekstrand
Ever since 0eafec6d3244 ("drm/i915: Enable lockless lookup of request
tracking via RCU"), the i915 driver has used SLAB_TYPESAFE_BY_RCU (it
was called SLAB_DESTROY_BY_RCU at the time) in order to allow RCU on
i915_request.  As nifty as SLAB_TYPESAFE_BY_RCU may be, it comes with
some serious disclaimers.  In particular, objects can get recycled while
RCU readers are still in-flight.  This can be ok if everyone who touches
these objects knows about the disclaimers and is careful.  However,
because we've chosen to use SLAB_TYPESAFE_BY_RCU for i915_request and
because i915_request contains a dma_fence, we've leaked
SLAB_TYPESAFE_BY_RCU and its whole pile of disclaimers to every driver
in the kernel which may consume a dma_fence.

We've tried to keep it somewhat contained by doing most of the hard work
to prevent access of recycled objects via dma_fence_get_rcu_safe().
However, a quick grep of kernel sources says that, of the 30 instances
of dma_fence_get_rcu*, only 11 of them use dma_fence_get_rcu_safe().
It's likely there are bear traps in DRM and related subsystems just waiting
for someone to accidentally step in them.

This commit stops us using SLAB_TYPESAFE_BY_RCU for i915_request
and, instead, does an RCU-safe slab free via call_rcu().  This should
let us keep most of the perf benefits of slab allocation while avoiding
the bear traps inherent in SLAB_TYPESAFE_BY_RCU.

Signed-off-by: Jason Ekstrand 
Cc: Jon Bloomfield 
Cc: Daniel Vetter 
Cc: Christian König 
Cc: Dave Airlie 
Cc: Matthew Auld 
Cc: Maarten Lankhorst 
---
 drivers/gpu/drm/i915/i915_request.c | 76 -
 1 file changed, 43 insertions(+), 33 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_request.c 
b/drivers/gpu/drm/i915/i915_request.c
index e531c74f0b0e2..55fa938126100 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -111,9 +111,44 @@ void intel_engine_free_request_pool(struct intel_engine_cs 
*engine)
if (!engine->request_pool)
return;
 
+   /*
+* It's safe to free this right away because we always put a fresh
+* i915_request in the cache that's never been touched by an RCU
+* reader.
+*/
kmem_cache_free(global.slab_requests, engine->request_pool);
 }
 
+static void __i915_request_free(struct rcu_head *head)
+{
+   struct i915_request *rq = container_of(head, typeof(*rq), fence.rcu);
+
+   kmem_cache_free(global.slab_requests, rq);
+}
+
+static void i915_request_free_rcu(struct i915_request *rq)
+{
+   /*
+* Because we're on a slab allocator, memory may be re-used the
+* moment we free it.  There is no kfree_rcu() equivalent for
+* slabs.  Instead, we hand-roll it here with call_rcu().  This
+* gives us all the perf benefits to slab allocation while ensuring
+* that we never release a request back to the slab until there are
+* no more readers.
+*
+* We do have to be careful, though, when calling kmem_cache_destroy()
+* as there may be outstanding free requests.  This is solved by
+* inserting an rcu_barrier() before kmem_cache_destroy().  An RCU
+* barrier is sufficient and we don't need synchronize_rcu()
+* because the call_rcu() here will wait on any outstanding RCU
+* readers and the rcu_barrier() will wait on any outstanding
+* call_rcu() callbacks.  So, if there are any readers who once had
+* valid references to a request, rcu_barrier() will end up waiting
+* on them by transitivity.
+*/
+   call_rcu(>fence.rcu, __i915_request_free);
+}
+
 static void i915_fence_release(struct dma_fence *fence)
 {
struct i915_request *rq = to_request(fence);
@@ -127,8 +162,7 @@ static void i915_fence_release(struct dma_fence *fence)
 */
i915_sw_fence_fini(>submit);
i915_sw_fence_fini(>semaphore);
-
-   kmem_cache_free(global.slab_requests, rq);
+   i915_request_free_rcu(rq);
 }
 
 const struct dma_fence_ops i915_fence_ops = {
@@ -933,35 +967,6 @@ __i915_request_create(struct intel_context *ce, gfp_t gfp)
 */
ensure_cached_request(>engine->request_pool, gfp);
 
-   /*
-* Beware: Dragons be flying overhead.
-*
-* We use RCU to look up requests in flight. The lookups may
-* race with the request being allocated from the slab freelist.
-* That is the request we are writing to here, may be in the process
-* of being read by __i915_active_request_get_rcu(). As such,
-* we have to be very careful when overwriting the contents. During
-* the RCU lookup, we change chase the request->engine pointer,
-* read the request->global_seqno and increment the reference count.
-*
-* The reference count is incremented atomically. If it is zero,
-* the lookup knows the request is unallocated and complete. Otherwise,
-* 

[PATCH 2/5] drm/i915: Use a simpler scheme for caching i915_request

2021-06-09 Thread Jason Ekstrand
Instead of attempting to recycle a request in to the cache when it
retires, stuff a new one in the cache every time we allocate a request
for some other reason.

Signed-off-by: Jason Ekstrand 
Cc: Jon Bloomfield 
Cc: Daniel Vetter 
Cc: Matthew Auld 
Cc: Maarten Lankhorst 
---
 drivers/gpu/drm/i915/i915_request.c | 66 ++---
 1 file changed, 31 insertions(+), 35 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_request.c 
b/drivers/gpu/drm/i915/i915_request.c
index 48c5f8527854b..e531c74f0b0e2 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -128,41 +128,6 @@ static void i915_fence_release(struct dma_fence *fence)
i915_sw_fence_fini(>submit);
i915_sw_fence_fini(>semaphore);
 
-   /*
-* Keep one request on each engine for reserved use under mempressure
-*
-* We do not hold a reference to the engine here and so have to be
-* very careful in what rq->engine we poke. The virtual engine is
-* referenced via the rq->context and we released that ref during
-* i915_request_retire(), ergo we must not dereference a virtual
-* engine here. Not that we would want to, as the only consumer of
-* the reserved engine->request_pool is the power management parking,
-* which must-not-fail, and that is only run on the physical engines.
-*
-* Since the request must have been executed to be have completed,
-* we know that it will have been processed by the HW and will
-* not be unsubmitted again, so rq->engine and rq->execution_mask
-* at this point is stable. rq->execution_mask will be a single
-* bit if the last and _only_ engine it could execution on was a
-* physical engine, if it's multiple bits then it started on and
-* could still be on a virtual engine. Thus if the mask is not a
-* power-of-two we assume that rq->engine may still be a virtual
-* engine and so a dangling invalid pointer that we cannot dereference
-*
-* For example, consider the flow of a bonded request through a virtual
-* engine. The request is created with a wide engine mask (all engines
-* that we might execute on). On processing the bond, the request mask
-* is reduced to one or more engines. If the request is subsequently
-* bound to a single engine, it will then be constrained to only
-* execute on that engine and never returned to the virtual engine
-* after timeslicing away, see __unwind_incomplete_requests(). Thus we
-* know that if the rq->execution_mask is a single bit, rq->engine
-* can be a physical engine with the exact corresponding mask.
-*/
-   if (is_power_of_2(rq->execution_mask) &&
-   !cmpxchg(>engine->request_pool, NULL, rq))
-   return;
-
kmem_cache_free(global.slab_requests, rq);
 }
 
@@ -869,6 +834,29 @@ static void retire_requests(struct intel_timeline *tl)
break;
 }
 
+static void
+ensure_cached_request(struct i915_request **rsvd, gfp_t gfp)
+{
+   struct i915_request *rq;
+
+   /* Don't try to add to the cache if we don't allow blocking.  That
+    * just increases the chance that the actual allocation will fail.
+    */
+   if (!gfpflags_allow_blocking(gfp))
+   return;
+
+   if (READ_ONCE(*rsvd))
+   return;
+
+   rq = kmem_cache_alloc(global.slab_requests,
+ gfp | __GFP_RETRY_MAYFAIL | __GFP_NOWARN);
+   if (!rq)
+   return; /* Oops but nothing we can do */
+
+   if (cmpxchg(rsvd, NULL, rq))
+   kmem_cache_free(global.slab_requests, rq);
+}
+
 static noinline struct i915_request *
 request_alloc_slow(struct intel_timeline *tl,
   struct i915_request **rsvd,
@@ -937,6 +925,14 @@ __i915_request_create(struct intel_context *ce, gfp_t gfp)
/* Check that the caller provided an already pinned context */
__intel_context_pin(ce);
 
+   /* Before we do anything, try to make sure we have at least one
+    * request in the engine's cache.  If we get here with GFP_NOWAIT
+    * (this can happen when switching to a kernel context), we want
+    * to try very hard to not fail and we fall back to this cache.
+    * Top it off with a fresh request whenever it's empty.
+    */
+   ensure_cached_request(&ce->engine->request_pool, gfp);
+
/*
 * Beware: Dragons be flying overhead.
 *
-- 
2.31.1



[PATCH 1/5] drm/i915: Move intel_engine_free_request_pool to i915_request.c

2021-06-09 Thread Jason Ekstrand
This appears to break encapsulation by moving an intel_engine_cs
function to a i915_request file.  However, this function is
intrinsically tied to the lifetime rules and allocation scheme of
i915_request and having it in intel_engine_cs.c leaks details of
i915_request.  We have an abstraction leak either way.  Since
i915_request's allocation scheme is far more subtle than the simple
pointer that is intel_engine_cs.request_pool, it's probably better to
keep i915_request's details to itself.

Signed-off-by: Jason Ekstrand 
Cc: Jon Bloomfield 
Cc: Daniel Vetter 
Cc: Matthew Auld 
Cc: Maarten Lankhorst 
---
 drivers/gpu/drm/i915/gt/intel_engine_cs.c | 8 
 drivers/gpu/drm/i915/i915_request.c   | 7 +--
 drivers/gpu/drm/i915/i915_request.h   | 2 --
 3 files changed, 5 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c 
b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 9ceddfbb1687d..df6b80ec84199 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -422,14 +422,6 @@ void intel_engines_release(struct intel_gt *gt)
}
 }
 
-void intel_engine_free_request_pool(struct intel_engine_cs *engine)
-{
-   if (!engine->request_pool)
-   return;
-
-   kmem_cache_free(i915_request_slab_cache(), engine->request_pool);
-}
-
 void intel_engines_free(struct intel_gt *gt)
 {
struct intel_engine_cs *engine;
diff --git a/drivers/gpu/drm/i915/i915_request.c 
b/drivers/gpu/drm/i915/i915_request.c
index 1014c71cf7f52..48c5f8527854b 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -106,9 +106,12 @@ static signed long i915_fence_wait(struct dma_fence *fence,
 timeout);
 }
 
-struct kmem_cache *i915_request_slab_cache(void)
+void intel_engine_free_request_pool(struct intel_engine_cs *engine)
 {
-   return global.slab_requests;
+   if (!engine->request_pool)
+   return;
+
+   kmem_cache_free(global.slab_requests, engine->request_pool);
 }
 
 static void i915_fence_release(struct dma_fence *fence)
diff --git a/drivers/gpu/drm/i915/i915_request.h 
b/drivers/gpu/drm/i915/i915_request.h
index 270f6cd37650c..f84c38d29f988 100644
--- a/drivers/gpu/drm/i915/i915_request.h
+++ b/drivers/gpu/drm/i915/i915_request.h
@@ -300,8 +300,6 @@ static inline bool dma_fence_is_i915(const struct dma_fence 
*fence)
return fence->ops == _fence_ops;
 }
 
-struct kmem_cache *i915_request_slab_cache(void);
-
 struct i915_request * __must_check
 __i915_request_create(struct intel_context *ce, gfp_t gfp);
 struct i915_request * __must_check
-- 
2.31.1



[PATCH 0/5] dma-fence, i915: Stop allowing SLAB_TYPESAFE_BY_RCU for dma_fence

2021-06-09 Thread Jason Ekstrand
Ever since 0eafec6d3244 ("drm/i915: Enable lockless lookup of request
tracking via RCU"), the i915 driver has used SLAB_TYPESAFE_BY_RCU (it
was called SLAB_DESTROY_BY_RCU at the time) in order to allow RCU on
i915_request.  As nifty as SLAB_TYPESAFE_BY_RCU may be, it comes with
some serious disclaimers.  In particular, objects can get recycled while
RCU readers are still in-flight.  This can be ok if everyone who touches
these objects knows about the disclaimers and is careful.  However,
because we've chosen to use SLAB_TYPESAFE_BY_RCU for i915_request and
because i915_request contains a dma_fence, we've leaked
SLAB_TYPESAFE_BY_RCU and its whole pile of disclaimers to every driver
in the kernel which may consume a dma_fence.

We've tried to keep it somewhat contained by doing most of the hard work
to prevent access of recycled objects via dma_fence_get_rcu_safe().
However, a quick grep of kernel sources says that, of the 30 instances
of dma_fence_get_rcu*, only 11 of them use dma_fence_get_rcu_safe().
It's likely there are bear traps in DRM and related subsystems just waiting
for someone to accidentally step in them.

This patch series stops us using SLAB_TYPESAFE_BY_RCU for i915_request
and, instead, does an RCU-safe slab free via call_rcu().  This should
let us keep most of the perf benefits of slab allocation while avoiding
the bear traps inherent in SLAB_TYPESAFE_BY_RCU.  It then removes support
for SLAB_TYPESAFE_BY_RCU from dma_fence entirely.

Note: The last patch is labled DONOTMERGE.  This was at Daniel Vetter's
request as we may want to let this bake for a couple releases before we
rip out dma_fence_get_rcu_safe entirely.

Signed-off-by: Jason Ekstrand 
Cc: Jon Bloomfield 
Cc: Daniel Vetter 
Cc: Christian König 
Cc: Dave Airlie 
Cc: Matthew Auld 
Cc: Maarten Lankhorst 

Jason Ekstrand (5):
  drm/i915: Move intel_engine_free_request_pool to i915_request.c
  drm/i915: Use a simpler scheme for caching i915_request
  drm/i915: Stop using SLAB_TYPESAFE_BY_RCU for i915_request
  dma-buf: Stop using SLAB_TYPESAFE_BY_RCU in selftests
  DONOTMERGE: dma-buf: Get rid of dma_fence_get_rcu_safe

 drivers/dma-buf/dma-fence-chain.c |   8 +-
 drivers/dma-buf/dma-resv.c|   4 +-
 drivers/dma-buf/st-dma-fence-chain.c  |  24 +---
 drivers/dma-buf/st-dma-fence.c|  27 +---
 drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c |   4 +-
 drivers/gpu/drm/i915/gt/intel_engine_cs.c |   8 --
 drivers/gpu/drm/i915/i915_active.h|   4 +-
 drivers/gpu/drm/i915/i915_request.c   | 147 --
 drivers/gpu/drm/i915/i915_request.h   |   2 -
 drivers/gpu/drm/i915/i915_vma.c   |   4 +-
 include/drm/drm_syncobj.h |   4 +-
 include/linux/dma-fence.h |  50 
 include/linux/dma-resv.h  |   4 +-
 13 files changed, 110 insertions(+), 180 deletions(-)

-- 
2.31.1



[PATCH v2 7/7] drm/connector: add ref to drm_connector_get in iter docs

2021-06-09 Thread Simon Ser
Mention that connectors need to be referenced manually if they are
to be accessed after the iteration has progressed or ended.

Signed-off-by: Simon Ser 
---
 include/drm/drm_connector.h | 5 +
 1 file changed, 5 insertions(+)

diff --git a/include/drm/drm_connector.h b/include/drm/drm_connector.h
index 714d1a01c065..c1af1e4ca560 100644
--- a/include/drm/drm_connector.h
+++ b/include/drm/drm_connector.h
@@ -1735,6 +1735,11 @@ void drm_mode_put_tile_group(struct drm_device *dev,
  * drm_connector_list_iter_begin(), drm_connector_list_iter_end() and
  * drm_connector_list_iter_next() respectively the convenience macro
  * drm_for_each_connector_iter().
+ *
+ * Note that the return value of drm_connector_list_iter_next() is only valid
+ * up to the next drm_connector_list_iter_next() or
+ * drm_connector_list_iter_end() call. If you want to use the connector later,
+ * then you need to grab your own reference first using drm_connector_get().
  */
 struct drm_connector_list_iter {
 /* private: */
-- 
2.31.1




[PATCH v2 6/7] i915/display/dp: send a more fine-grained link-status uevent

2021-06-09 Thread Simon Ser
When link-status changes, send a hotplug uevent which contains the
connector and property ID. That way, user-space can more easily
figure out that only the link-status property of this connector has
been updated.

Signed-off-by: Simon Ser 
---
 drivers/gpu/drm/i915/display/intel_dp.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/i915/display/intel_dp.c 
b/drivers/gpu/drm/i915/display/intel_dp.c
index 5c983044..0ce44a97dd14 100644
--- a/drivers/gpu/drm/i915/display/intel_dp.c
+++ b/drivers/gpu/drm/i915/display/intel_dp.c
@@ -5276,6 +5276,8 @@ static void intel_dp_modeset_retry_work_fn(struct 
work_struct *work)
mutex_unlock(&connector->dev->mode_config.mutex);
/* Send Hotplug uevent so userspace can reprobe */
drm_kms_helper_hotplug_event(connector->dev);
+   drm_sysfs_connector_status_event(connector,
+
connector->dev->mode_config.link_status_property);
 }
 
 bool
-- 
2.31.1




[PATCH v2 5/7] drm/probe-helper: use drm_kms_helper_connector_hotplug_event

2021-06-09 Thread Simon Ser
If an hotplug event only updates a single connector, use
drm_kms_helper_connector_hotplug_event instead of
drm_kms_helper_hotplug_event.

Signed-off-by: Simon Ser 
---
 drivers/gpu/drm/drm_probe_helper.c | 19 +++
 1 file changed, 15 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/drm_probe_helper.c 
b/drivers/gpu/drm/drm_probe_helper.c
index 8cc673267cba..f4130c1a90e2 100644
--- a/drivers/gpu/drm/drm_probe_helper.c
+++ b/drivers/gpu/drm/drm_probe_helper.c
@@ -843,7 +843,7 @@ EXPORT_SYMBOL(drm_kms_helper_poll_fini);
  */
 bool drm_helper_hpd_irq_event(struct drm_device *dev)
 {
-   struct drm_connector *connector;
+   struct drm_connector *connector, *changed_connector = NULL;
struct drm_connector_list_iter conn_iter;
enum drm_connector_status old_status;
bool changed = false;
@@ -883,16 +883,27 @@ bool drm_helper_hpd_irq_event(struct drm_device *dev)
 * Check if epoch counter had changed, meaning that we need
 * to send a uevent.
 */
-   if (old_epoch_counter != connector->epoch_counter)
+   if (old_epoch_counter != connector->epoch_counter) {
+   if (changed) {
+   if (changed_connector)
+   drm_connector_put(changed_connector);
+   changed_connector = NULL;
+   } else {
+   drm_connector_get(connector);
+   changed_connector = connector;
+   }
changed = true;
+   }
 
}
drm_connector_list_iter_end(&conn_iter);
mutex_unlock(&dev->mode_config.mutex);
 
-   if (changed) {
+   if (changed_connector) {
+   drm_kms_helper_connector_hotplug_event(changed_connector);
+   drm_connector_put(changed_connector);
+   } else if (changed) {
drm_kms_helper_hotplug_event(dev);
-   DRM_DEBUG_KMS("Sent hotplug event\n");
}
 
return changed;
-- 
2.31.1




[PATCH v2 4/7] amdgpu: use drm_kms_helper_connector_hotplug_event

2021-06-09 Thread Simon Ser
When updating a single connector, use
drm_kms_helper_connector_hotplug_event instead of
drm_kms_helper_hotplug_event.

Signed-off-by: Simon Ser 
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 8 
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c | 4 ++--
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 3267eb2e35dd..4b91534ff324 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -2638,7 +2638,7 @@ static void handle_hpd_irq(void *param)
drm_modeset_unlock_all(dev);
 
if (aconnector->base.force == DRM_FORCE_UNSPECIFIED)
-   drm_kms_helper_hotplug_event(dev);
+   drm_kms_helper_connector_hotplug_event(connector);
 
} else if (dc_link_detect(aconnector->dc_link, DETECT_REASON_HPD)) {
if (new_connection_type == dc_connection_none &&
@@ -2652,7 +2652,7 @@ static void handle_hpd_irq(void *param)
drm_modeset_unlock_all(dev);
 
if (aconnector->base.force == DRM_FORCE_UNSPECIFIED)
-   drm_kms_helper_hotplug_event(dev);
+   drm_kms_helper_connector_hotplug_event(connector);
}
mutex_unlock(&aconnector->hpd_lock);
 
@@ -2805,7 +2805,7 @@ static void handle_hpd_rx_irq(void *param)
dm_restore_drm_connector_state(dev, connector);
drm_modeset_unlock_all(dev);
 
-   drm_kms_helper_hotplug_event(dev);
+   drm_kms_helper_connector_hotplug_event(connector);
} else if (dc_link_detect(dc_link, DETECT_REASON_HPDRX)) {
 
if (aconnector->fake_enable)
@@ -2818,7 +2818,7 @@ static void handle_hpd_rx_irq(void *param)
dm_restore_drm_connector_state(dev, connector);
drm_modeset_unlock_all(dev);
 
-   drm_kms_helper_hotplug_event(dev);
+   drm_kms_helper_connector_hotplug_event(connector);
}
}
 #ifdef CONFIG_DRM_AMD_DC_HDCP
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c
index 9fbbd0159119..221242b6e528 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c
@@ -1200,7 +1200,7 @@ static ssize_t trigger_hotplug(struct file *f, const char 
__user *buf,
dm_restore_drm_connector_state(dev, connector);
drm_modeset_unlock_all(dev);
 
-   drm_kms_helper_hotplug_event(dev);
+   drm_kms_helper_connector_hotplug_event(connector);
} else if (param[0] == 0) {
if (!aconnector->dc_link)
goto unlock;
@@ -1222,7 +1222,7 @@ static ssize_t trigger_hotplug(struct file *f, const char 
__user *buf,
dm_restore_drm_connector_state(dev, connector);
drm_modeset_unlock_all(dev);
 
-   drm_kms_helper_hotplug_event(dev);
+   drm_kms_helper_connector_hotplug_event(connector);
}
 
 unlock:
-- 
2.31.1




[PATCH v2 3/7] drm/connector: use drm_sysfs_connector_hotplug_event

2021-06-09 Thread Simon Ser
In drm_connector_register, use drm_sysfs_connector_hotplug_event
instead of drm_sysfs_hotplug_event, because the hotplug event
only updates a single connector.

Signed-off-by: Simon Ser 
---
 drivers/gpu/drm/drm_connector.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/drm_connector.c b/drivers/gpu/drm/drm_connector.c
index da39e7ff6965..76930e0b8949 100644
--- a/drivers/gpu/drm/drm_connector.c
+++ b/drivers/gpu/drm/drm_connector.c
@@ -530,7 +530,7 @@ int drm_connector_register(struct drm_connector *connector)
connector->registration_state = DRM_CONNECTOR_REGISTERED;
 
/* Let userspace know we have a new connector */
-   drm_sysfs_hotplug_event(connector->dev);
+   drm_sysfs_connector_hotplug_event(connector);
 
goto unlock;
 
-- 
2.31.1




[PATCH v2 2/7] drm/probe-helper: add drm_kms_helper_connector_hotplug_event

2021-06-09 Thread Simon Ser
This function is the same as drm_kms_helper_hotplug_event, but takes
a connector instead of a device.

Signed-off-by: Simon Ser 
---
 drivers/gpu/drm/drm_probe_helper.c | 23 +++
 include/drm/drm_probe_helper.h |  1 +
 2 files changed, 24 insertions(+)

diff --git a/drivers/gpu/drm/drm_probe_helper.c 
b/drivers/gpu/drm/drm_probe_helper.c
index e7e1ee2aa352..8cc673267cba 100644
--- a/drivers/gpu/drm/drm_probe_helper.c
+++ b/drivers/gpu/drm/drm_probe_helper.c
@@ -604,6 +604,9 @@ EXPORT_SYMBOL(drm_helper_probe_single_connector_modes);
  *
  * This function must be called from process context with no mode
  * setting locks held.
+ *
+ * If only a single connector has changed, consider calling
+ * drm_kms_helper_connector_hotplug_event() instead.
  */
 void drm_kms_helper_hotplug_event(struct drm_device *dev)
 {
@@ -616,6 +619,26 @@ void drm_kms_helper_hotplug_event(struct drm_device *dev)
 }
 EXPORT_SYMBOL(drm_kms_helper_hotplug_event);
 
+/**
+ * drm_kms_helper_connector_hotplug_event - fire off a KMS connector hotplug 
event
+ * @connector: drm_connector which has changed
+ *
+ * This is the same as drm_kms_helper_hotplug_event(), except it fires a more
+ * fine-grained uevent for a single connector.
+ */
+void drm_kms_helper_connector_hotplug_event(struct drm_connector *connector)
+{
+   struct drm_device *dev = connector->dev;
+
+   /* send a uevent + call fbdev */
+   drm_sysfs_connector_hotplug_event(connector);
+   if (dev->mode_config.funcs->output_poll_changed)
+   dev->mode_config.funcs->output_poll_changed(dev);
+
+   drm_client_dev_hotplug(dev);
+}
+EXPORT_SYMBOL(drm_kms_helper_connector_hotplug_event);
+
 static void output_poll_execute(struct work_struct *work)
 {
struct delayed_work *delayed_work = to_delayed_work(work);
diff --git a/include/drm/drm_probe_helper.h b/include/drm/drm_probe_helper.h
index 8d3ed2834d34..733147ea89be 100644
--- a/include/drm/drm_probe_helper.h
+++ b/include/drm/drm_probe_helper.h
@@ -19,6 +19,7 @@ void drm_kms_helper_poll_init(struct drm_device *dev);
 void drm_kms_helper_poll_fini(struct drm_device *dev);
 bool drm_helper_hpd_irq_event(struct drm_device *dev);
 void drm_kms_helper_hotplug_event(struct drm_device *dev);
+void drm_kms_helper_connector_hotplug_event(struct drm_connector *connector);
 
 void drm_kms_helper_poll_disable(struct drm_device *dev);
 void drm_kms_helper_poll_enable(struct drm_device *dev);
-- 
2.31.1




[PATCH v2 1/7] drm/sysfs: introduce drm_sysfs_connector_hotplug_event

2021-06-09 Thread Simon Ser
This function sends a hotplug uevent with a CONNECTOR property.

Signed-off-by: Simon Ser 
---
 drivers/gpu/drm/drm_sysfs.c | 25 +
 include/drm/drm_sysfs.h |  1 +
 2 files changed, 26 insertions(+)

diff --git a/drivers/gpu/drm/drm_sysfs.c b/drivers/gpu/drm/drm_sysfs.c
index 968a9560b4aa..8423e44c3035 100644
--- a/drivers/gpu/drm/drm_sysfs.c
+++ b/drivers/gpu/drm/drm_sysfs.c
@@ -343,6 +343,31 @@ void drm_sysfs_hotplug_event(struct drm_device *dev)
 }
 EXPORT_SYMBOL(drm_sysfs_hotplug_event);
 
+/**
+ * drm_sysfs_connector_hotplug_event - generate a DRM uevent for any connector
+ * change
+ * @connector: connector which has changed
+ *
+ * Send a uevent for the DRM connector specified by @connector. This will send
+ * a uevent with the properties HOTPLUG=1 and CONNECTOR.
+ */
+void drm_sysfs_connector_hotplug_event(struct drm_connector *connector)
+{
+   struct drm_device *dev = connector->dev;
+   char hotplug_str[] = "HOTPLUG=1", conn_id[21];
+   char *envp[] = { hotplug_str, conn_id, NULL };
+
+   snprintf(conn_id, sizeof(conn_id),
+"CONNECTOR=%u", connector->base.id);
+
+   drm_dbg_kms(connector->dev,
+   "[CONNECTOR:%d:%s] generating connector hotplug event\n",
+   connector->base.id, connector->name);
+
+   kobject_uevent_env(&dev->primary->kdev->kobj, KOBJ_CHANGE, envp);
+}
+EXPORT_SYMBOL(drm_sysfs_connector_hotplug_event);
+
 /**
  * drm_sysfs_connector_status_event - generate a DRM uevent for connector
  * property status change
diff --git a/include/drm/drm_sysfs.h b/include/drm/drm_sysfs.h
index d454ef617b2c..6273cac44e47 100644
--- a/include/drm/drm_sysfs.h
+++ b/include/drm/drm_sysfs.h
@@ -11,6 +11,7 @@ int drm_class_device_register(struct device *dev);
 void drm_class_device_unregister(struct device *dev);
 
 void drm_sysfs_hotplug_event(struct drm_device *dev);
+void drm_sysfs_connector_hotplug_event(struct drm_connector *connector);
 void drm_sysfs_connector_status_event(struct drm_connector *connector,
  struct drm_property *property);
 #endif
-- 
2.31.1




[PATCH v2 0/7] drm: add per-connector hotplug events

2021-06-09 Thread Simon Ser
When a uevent only updates a single connector, add a CONNECTOR property
to the uevent. This allows user-space to ignore other connectors when
handling the uevent. This is purely an optimization, drivers can still
send a uevent without the CONNECTOR property.

The CONNECTOR property is already set when sending HDCP property update
uevents, see drm_sysfs_connector_status_event.

This has been tested with a wlroots patch [1].

amdgpu and the probe-helper has been updated to use these new fine-grained
uevents.

[1]: https://github.com/swaywm/wlroots/pull/2959

Simon Ser (7):
  drm/sysfs: introduce drm_sysfs_connector_hotplug_event
  drm/probe-helper: add drm_kms_helper_connector_hotplug_event
  drm/connector: use drm_sysfs_connector_hotplug_event
  amdgpu: use drm_kms_helper_connector_hotplug_event
  drm/probe-helper: use drm_kms_helper_connector_hotplug_event
  i915/display/dp: send a more fine-grained link-status uevent
  drm/connector: add ref to drm_connector_get in iter docs

 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c |  8 ++--
 .../amd/display/amdgpu_dm/amdgpu_dm_debugfs.c |  4 +-
 drivers/gpu/drm/drm_connector.c   |  2 +-
 drivers/gpu/drm/drm_probe_helper.c| 42 +--
 drivers/gpu/drm/drm_sysfs.c   | 25 +++
 drivers/gpu/drm/i915/display/intel_dp.c   |  2 +
 include/drm/drm_connector.h   |  5 +++
 include/drm/drm_probe_helper.h|  1 +
 include/drm/drm_sysfs.h   |  1 +
 9 files changed, 79 insertions(+), 11 deletions(-)

-- 
2.31.1




[RFC 6/6] drm/msm/kms: drop set_encoder_mode callback

2021-06-09 Thread Dmitry Baryshkov
set_encoder_mode callback is completely unused now. Drop it from
msm_kms_func().

Signed-off-by: Dmitry Baryshkov 
---
 drivers/gpu/drm/msm/msm_kms.h | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_kms.h b/drivers/gpu/drm/msm/msm_kms.h
index 086a2d59b8c8..9484e8b62630 100644
--- a/drivers/gpu/drm/msm/msm_kms.h
+++ b/drivers/gpu/drm/msm/msm_kms.h
@@ -117,9 +117,6 @@ struct msm_kms_funcs {
struct drm_encoder *encoder,
struct drm_encoder *slave_encoder,
bool is_cmd_mode);
-   void (*set_encoder_mode)(struct msm_kms *kms,
-struct drm_encoder *encoder,
-bool cmd_mode);
/* cleanup: */
void (*destroy)(struct msm_kms *kms);
 
-- 
2.30.2



[RFC 3/6] drm/msm/mdp5: move mdp5_encoder_set_intf_mode after msm_dsi_modeset_init

2021-06-09 Thread Dmitry Baryshkov
Move a call to mdp5_encoder_set_intf_mode() after
msm_dsi_modeset_init(), removing set_encoder_mode callback.

Signed-off-by: Dmitry Baryshkov 
---
 drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c | 11 +++
 1 file changed, 3 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c 
b/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c
index 15aed45022bc..b3b42672b2d4 100644
--- a/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c
+++ b/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c
@@ -209,13 +209,6 @@ static int mdp5_set_split_display(struct msm_kms *kms,
  slave_encoder);
 }
 
-static void mdp5_set_encoder_mode(struct msm_kms *kms,
- struct drm_encoder *encoder,
- bool cmd_mode)
-{
-   mdp5_encoder_set_intf_mode(encoder, cmd_mode);
-}
-
 static void mdp5_kms_destroy(struct msm_kms *kms)
 {
struct mdp5_kms *mdp5_kms = to_mdp5_kms(to_mdp_kms(kms));
@@ -287,7 +280,6 @@ static const struct mdp_kms_funcs kms_funcs = {
.get_format  = mdp_get_format,
.round_pixclk= mdp5_round_pixclk,
.set_split_display = mdp5_set_split_display,
-   .set_encoder_mode = mdp5_set_encoder_mode,
.destroy = mdp5_kms_destroy,
 #ifdef CONFIG_DEBUG_FS
.debugfs_init= mdp5_kms_debugfs_init,
@@ -448,6 +440,9 @@ static int modeset_init_intf(struct mdp5_kms *mdp5_kms,
}
 
ret = msm_dsi_modeset_init(priv->dsi[dsi_id], dev, encoder);
+   if (!ret)
+   mdp5_encoder_set_intf_mode(encoder, 
msm_dsi_is_cmd_mode(priv->dsi[dsi_id]));
+
break;
}
default:
-- 
2.30.2



[RFC 4/6] drm/msm/dp: stop calling set_encoder_mode callback

2021-06-09 Thread Dmitry Baryshkov
None of the display drivers now implement set_encoder_mode callback.
Stop calling it from the modeset init code.

Signed-off-by: Dmitry Baryshkov 
---
 drivers/gpu/drm/msm/dp/dp_display.c | 18 --
 1 file changed, 18 deletions(-)

diff --git a/drivers/gpu/drm/msm/dp/dp_display.c 
b/drivers/gpu/drm/msm/dp/dp_display.c
index 051c1be1de7e..70b319a8fe83 100644
--- a/drivers/gpu/drm/msm/dp/dp_display.c
+++ b/drivers/gpu/drm/msm/dp/dp_display.c
@@ -102,8 +102,6 @@ struct dp_display_private {
struct dp_display_mode dp_mode;
struct msm_dp dp_display;
 
-   bool encoder_mode_set;
-
/* wait for audio signaling */
struct completion audio_comp;
 
@@ -283,20 +281,6 @@ static void dp_display_send_hpd_event(struct msm_dp 
*dp_display)
 }
 
 
-static void dp_display_set_encoder_mode(struct dp_display_private *dp)
-{
-   struct msm_drm_private *priv = dp->dp_display.drm_dev->dev_private;
-   struct msm_kms *kms = priv->kms;
-
-   if (!dp->encoder_mode_set && dp->dp_display.encoder &&
-   kms->funcs->set_encoder_mode) {
-   kms->funcs->set_encoder_mode(kms,
-   dp->dp_display.encoder, false);
-
-   dp->encoder_mode_set = true;
-   }
-}
-
 static int dp_display_send_hpd_notification(struct dp_display_private *dp,
bool hpd)
 {
@@ -369,8 +353,6 @@ static void dp_display_host_init(struct dp_display_private 
*dp, int reset)
if (dp->usbpd->orientation == ORIENTATION_CC2)
flip = true;
 
-   dp_display_set_encoder_mode(dp);
-
dp_power_init(dp->power, flip);
dp_ctrl_host_init(dp->ctrl, flip, reset);
dp_aux_init(dp->aux);
-- 
2.30.2



[RFC 5/6] drm/msm/dsi: stop calling set_encoder_mode callback

2021-06-09 Thread Dmitry Baryshkov
None of the display drivers now implement set_encoder_mode callback.
Stop calling it from the modeset init code.

Signed-off-by: Dmitry Baryshkov 
---
 drivers/gpu/drm/msm/dsi/dsi.c |  2 --
 drivers/gpu/drm/msm/dsi/dsi.h |  1 -
 drivers/gpu/drm/msm/dsi/dsi_manager.c | 12 
 3 files changed, 15 deletions(-)

diff --git a/drivers/gpu/drm/msm/dsi/dsi.c b/drivers/gpu/drm/msm/dsi/dsi.c
index 874c1527d300..881f14bc022d 100644
--- a/drivers/gpu/drm/msm/dsi/dsi.c
+++ b/drivers/gpu/drm/msm/dsi/dsi.c
@@ -250,8 +250,6 @@ int msm_dsi_modeset_init(struct msm_dsi *msm_dsi, struct 
drm_device *dev,
goto fail;
}
 
-   msm_dsi_manager_setup_encoder(msm_dsi->id);
-
priv->bridges[priv->num_bridges++]   = msm_dsi->bridge;
priv->connectors[priv->num_connectors++] = msm_dsi->connector;
 
diff --git a/drivers/gpu/drm/msm/dsi/dsi.h b/drivers/gpu/drm/msm/dsi/dsi.h
index 9b8e9b07eced..c0bdbe63050a 100644
--- a/drivers/gpu/drm/msm/dsi/dsi.h
+++ b/drivers/gpu/drm/msm/dsi/dsi.h
@@ -80,7 +80,6 @@ struct drm_connector *msm_dsi_manager_connector_init(u8 id);
 struct drm_connector *msm_dsi_manager_ext_bridge_init(u8 id);
 int msm_dsi_manager_cmd_xfer(int id, const struct mipi_dsi_msg *msg);
 bool msm_dsi_manager_cmd_xfer_trigger(int id, u32 dma_base, u32 len);
-void msm_dsi_manager_setup_encoder(int id);
 int msm_dsi_manager_register(struct msm_dsi *msm_dsi);
 void msm_dsi_manager_unregister(struct msm_dsi *msm_dsi);
 bool msm_dsi_manager_validate_current_config(u8 id);
diff --git a/drivers/gpu/drm/msm/dsi/dsi_manager.c 
b/drivers/gpu/drm/msm/dsi/dsi_manager.c
index 7d4f6fae1ab0..1996b40d2ae9 100644
--- a/drivers/gpu/drm/msm/dsi/dsi_manager.c
+++ b/drivers/gpu/drm/msm/dsi/dsi_manager.c
@@ -217,18 +217,6 @@ static int dsi_mgr_bridge_get_id(struct drm_bridge *bridge)
return dsi_bridge->id;
 }
 
-void msm_dsi_manager_setup_encoder(int id)
-{
-   struct msm_dsi *msm_dsi = dsi_mgr_get_dsi(id);
-   struct msm_drm_private *priv = msm_dsi->dev->dev_private;
-   struct msm_kms *kms = priv->kms;
-   struct drm_encoder *encoder = msm_dsi_get_encoder(msm_dsi);
-
-   if (encoder && kms->funcs->set_encoder_mode)
-   kms->funcs->set_encoder_mode(kms, encoder,
-msm_dsi_is_cmd_mode(msm_dsi));
-}
-
 static int msm_dsi_manager_panel_init(struct drm_connector *conn, u8 id)
 {
struct msm_drm_private *priv = conn->dev->dev_private;
-- 
2.30.2



[RFC 2/6] drm/msm/dpu: support setting up two independent DSI connectors

2021-06-09 Thread Dmitry Baryshkov
Move setting up encoders from set_encoder_mode to
_dpu_kms_initialize_dsi() / _dpu_kms_initialize_displayport(). This
allows us to support not only "single DSI" and "dual DSI" but also "two
independent DSI" configurations. In future this would also help adding
support for multiple DP connectors.

Signed-off-by: Dmitry Baryshkov 
---
 drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c | 89 -
 1 file changed, 44 insertions(+), 45 deletions(-)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c
index 1d3a4f395e74..b63e1c948ff2 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c
@@ -471,30 +471,55 @@ static int _dpu_kms_initialize_dsi(struct drm_device *dev,
struct dpu_kms *dpu_kms)
 {
struct drm_encoder *encoder = NULL;
+   struct msm_display_info info;
int i, rc = 0;
 
if (!(priv->dsi[0] || priv->dsi[1]))
return rc;
 
-   /*TODO: Support two independent DSI connectors */
-   encoder = dpu_encoder_init(dev, DRM_MODE_ENCODER_DSI);
-   if (IS_ERR(encoder)) {
-   DPU_ERROR("encoder init failed for dsi display\n");
-   return PTR_ERR(encoder);
-   }
-
-   priv->encoders[priv->num_encoders++] = encoder;
-
for (i = 0; i < ARRAY_SIZE(priv->dsi); i++) {
if (!priv->dsi[i])
continue;
 
+   if (!encoder) {
+   encoder = dpu_encoder_init(dev, DRM_MODE_ENCODER_DSI);
+   if (IS_ERR(encoder)) {
+   DPU_ERROR("encoder init failed for dsi 
display\n");
+   return PTR_ERR(encoder);
+   }
+
+   priv->encoders[priv->num_encoders++] = encoder;
+
+   memset(&info, 0, sizeof(info));
+   info.intf_type = encoder->encoder_type;
+   info.capabilities = msm_dsi_is_cmd_mode(priv->dsi[i]) ?
+   MSM_DISPLAY_CAP_CMD_MODE :
+   MSM_DISPLAY_CAP_VID_MODE;
+   }
+
rc = msm_dsi_modeset_init(priv->dsi[i], dev, encoder);
if (rc) {
DPU_ERROR("modeset_init failed for dsi[%d], rc = %d\n",
i, rc);
break;
}
+
+   info.h_tile_instance[info.num_of_h_tiles++] = i;
+
+   if (!msm_dsi_is_dual_dsi(priv->dsi[i])) {
+   rc = dpu_encoder_setup(dev, encoder, &info);
+   if (rc)
+   DPU_ERROR("failed to setup DPU encoder %d: 
rc:%d\n",
+   encoder->base.id, rc);
+   encoder = NULL;
+   }
+   }
+
+   if (encoder) {
+   rc = dpu_encoder_setup(dev, encoder, &info);
+   if (rc)
+   DPU_ERROR("failed to setup DPU encoder %d: rc:%d\n",
+   encoder->base.id, rc);
}
 
return rc;
@@ -505,6 +530,7 @@ static int _dpu_kms_initialize_displayport(struct 
drm_device *dev,
struct dpu_kms *dpu_kms)
 {
struct drm_encoder *encoder = NULL;
+   struct msm_display_info info;
int rc = 0;
 
if (!priv->dp)
@@ -516,6 +542,7 @@ static int _dpu_kms_initialize_displayport(struct 
drm_device *dev,
return PTR_ERR(encoder);
}
 
+   memset(&info, 0, sizeof(info));
rc = msm_dp_modeset_init(priv->dp, dev, encoder);
if (rc) {
DPU_ERROR("modeset_init failed for DP, rc = %d\n", rc);
@@ -524,6 +551,14 @@ static int _dpu_kms_initialize_displayport(struct 
drm_device *dev,
}
 
priv->encoders[priv->num_encoders++] = encoder;
+
+   info.num_of_h_tiles = 1;
+   info.capabilities = MSM_DISPLAY_CAP_VID_MODE;
+   info.intf_type = encoder->encoder_type;
+   rc = dpu_encoder_setup(dev, encoder, &info);
+   if (rc)
+   DPU_ERROR("failed to setup DPU encoder %d: rc:%d\n",
+   encoder->base.id, rc);
return rc;
 }
 
@@ -726,41 +761,6 @@ static void dpu_kms_destroy(struct msm_kms *kms)
msm_kms_destroy(_kms->base);
 }
 
-static void _dpu_kms_set_encoder_mode(struct msm_kms *kms,
-struct drm_encoder *encoder,
-bool cmd_mode)
-{
-   struct msm_display_info info;
-   struct msm_drm_private *priv = encoder->dev->dev_private;
-   int i, rc = 0;
-
-   memset(&info, 0, sizeof(info));
-
-   info.intf_type = encoder->encoder_type;
-   info.capabilities = cmd_mode ? MSM_DISPLAY_CAP_CMD_MODE :
-   MSM_DISPLAY_CAP_VID_MODE;
-
-   switch (info.intf_type) {
-   case DRM_MODE_ENCODER_DSI:
-  

[RFC 1/6] drm/msm/dsi: add two helper functions

2021-06-09 Thread Dmitry Baryshkov
Add two helper functions to be used by display drivers for setting up
encoders.

Signed-off-by: Dmitry Baryshkov 
---
 drivers/gpu/drm/msm/dsi/dsi.c |  6 ++
 drivers/gpu/drm/msm/dsi/dsi_manager.c | 14 ++
 drivers/gpu/drm/msm/msm_drv.h | 12 ++--
 3 files changed, 22 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/msm/dsi/dsi.c b/drivers/gpu/drm/msm/dsi/dsi.c
index 75afc12a7b25..874c1527d300 100644
--- a/drivers/gpu/drm/msm/dsi/dsi.c
+++ b/drivers/gpu/drm/msm/dsi/dsi.c
@@ -13,6 +13,12 @@ struct drm_encoder *msm_dsi_get_encoder(struct msm_dsi 
*msm_dsi)
return msm_dsi->encoder;
 }
 
+bool msm_dsi_is_cmd_mode(struct msm_dsi *msm_dsi)
+{
+   unsigned long host_flags = msm_dsi_host_get_mode_flags(msm_dsi->host);
+   return !(host_flags & MIPI_DSI_MODE_VIDEO);
+}
+
 static int dsi_get_phy(struct msm_dsi *msm_dsi)
 {
struct platform_device *pdev = msm_dsi->pdev;
diff --git a/drivers/gpu/drm/msm/dsi/dsi_manager.c 
b/drivers/gpu/drm/msm/dsi/dsi_manager.c
index cd016576e8c5..7d4f6fae1ab0 100644
--- a/drivers/gpu/drm/msm/dsi/dsi_manager.c
+++ b/drivers/gpu/drm/msm/dsi/dsi_manager.c
@@ -217,12 +217,6 @@ static int dsi_mgr_bridge_get_id(struct drm_bridge *bridge)
return dsi_bridge->id;
 }
 
-static bool dsi_mgr_is_cmd_mode(struct msm_dsi *msm_dsi)
-{
-   unsigned long host_flags = msm_dsi_host_get_mode_flags(msm_dsi->host);
-   return !(host_flags & MIPI_DSI_MODE_VIDEO);
-}
-
 void msm_dsi_manager_setup_encoder(int id)
 {
struct msm_dsi *msm_dsi = dsi_mgr_get_dsi(id);
@@ -232,7 +226,7 @@ void msm_dsi_manager_setup_encoder(int id)
 
if (encoder && kms->funcs->set_encoder_mode)
kms->funcs->set_encoder_mode(kms, encoder,
-dsi_mgr_is_cmd_mode(msm_dsi));
+msm_dsi_is_cmd_mode(msm_dsi));
 }
 
 static int msm_dsi_manager_panel_init(struct drm_connector *conn, u8 id)
@@ -277,7 +271,7 @@ static int msm_dsi_manager_panel_init(struct drm_connector 
*conn, u8 id)
if (other_dsi && other_dsi->panel && kms->funcs->set_split_display) {
kms->funcs->set_split_display(kms, master_dsi->encoder,
  slave_dsi->encoder,
- dsi_mgr_is_cmd_mode(msm_dsi));
+ msm_dsi_is_cmd_mode(msm_dsi));
}
 
 out:
@@ -840,3 +834,7 @@ void msm_dsi_manager_unregister(struct msm_dsi *msm_dsi)
msm_dsim->dsi[msm_dsi->id] = NULL;
 }
 
+bool msm_dsi_is_dual_dsi(struct msm_dsi *msm_dsi)
+{
+   return IS_DUAL_DSI();
+}
diff --git a/drivers/gpu/drm/msm/msm_drv.h b/drivers/gpu/drm/msm/msm_drv.h
index 3352125ce428..826cc5e25bcb 100644
--- a/drivers/gpu/drm/msm/msm_drv.h
+++ b/drivers/gpu/drm/msm/msm_drv.h
@@ -343,7 +343,8 @@ void __exit msm_dsi_unregister(void);
 int msm_dsi_modeset_init(struct msm_dsi *msm_dsi, struct drm_device *dev,
 struct drm_encoder *encoder);
 void msm_dsi_snapshot(struct msm_disp_state *disp_state, struct msm_dsi 
*msm_dsi);
-
+bool msm_dsi_is_cmd_mode(struct msm_dsi *msm_dsi);
+bool msm_dsi_is_dual_dsi(struct msm_dsi *msm_dsi);
 #else
 static inline void __init msm_dsi_register(void)
 {
@@ -360,7 +361,14 @@ static inline int msm_dsi_modeset_init(struct msm_dsi 
*msm_dsi,
 static inline void msm_dsi_snapshot(struct msm_disp_state *disp_state, struct 
msm_dsi *msm_dsi)
 {
 }
-
+static inline bool msm_dsi_is_cmd_mode(struct msm_dsi *msm_dsi)
+{
+   return false;
+}
+static bool msm_dsi_is_dual_dsi(struct msm_dsi *msm_dsi)
+{
+   return false;
+}
 #endif
 
 #ifdef CONFIG_DRM_MSM_DP
-- 
2.30.2



[RFC 0/6] drm/msm/dpu: add support for independent DSI config

2021-06-09 Thread Dmitry Baryshkov
This patchseries adds support for independent DSI config to DPU1 display
subdriver. This results in ability to drop one of msm_kms_funcs
callbacks.

This code was tested on RB5 (dpu, dsi). Neither DP nor MDP5 changes were
tested (thus the RFC tag).




Re: [PATCH] drm: display: Remove duplicated argument in dcn31

2021-06-09 Thread Alex Deucher
Applied.  Thanks!

On Wed, Jun 9, 2021 at 2:48 PM Rodrigo Siqueira
 wrote:
>
> On 06/09, Wan Jiabing wrote:
> > Fix the following coccicheck warning:
> > ./drivers/gpu/drm/amd/display/dc/dml/dcn31/display_mode_vba_31.c:
> > 3539:12-42: duplicated argument to && or ||
> > ./drivers/gpu/drm/amd/display/dc/dml/dcn31/display_mode_vba_31.c:
> > 5677:87-123: duplicated argument to && or ||
> >
> > Signed-off-by: Wan Jiabing 
> > ---
> >  .../gpu/drm/amd/display/dc/dml/dcn31/display_mode_vba_31.c| 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn31/display_mode_vba_31.c 
> > b/drivers/gpu/drm/amd/display/dc/dml/dcn31/display_mode_vba_31.c
> > index d655655baaba..06fac59a3d40 100644
> > --- a/drivers/gpu/drm/amd/display/dc/dml/dcn31/display_mode_vba_31.c
> > +++ b/drivers/gpu/drm/amd/display/dc/dml/dcn31/display_mode_vba_31.c
> > @@ -3536,7 +3536,7 @@ static bool CalculateBytePerPixelAnd256BBlockSizes(
> >   *BytePerPixelDETC = 0;
> >   *BytePerPixelY = 4;
> >   *BytePerPixelC = 0;
> > - } else if (SourcePixelFormat == dm_444_16 || SourcePixelFormat == 
> > dm_444_16) {
> > + } else if (SourcePixelFormat == dm_444_16) {
> >   *BytePerPixelDETY = 2;
> >   *BytePerPixelDETC = 0;
> >   *BytePerPixelY = 2;
> > @@ -5674,7 +5674,7 @@ void 
> > dml31_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_l
> >   for (k = 0; k < v->NumberOfActivePlanes; k++) {
> >   if (v->ViewportWidth[k] > v->SurfaceWidthY[k] || 
> > v->ViewportHeight[k] > v->SurfaceHeightY[k]) {
> >   ViewportExceedsSurface = true;
> > - if (v->SourcePixelFormat[k] != dm_444_64 && 
> > v->SourcePixelFormat[k] != dm_444_32 && v->SourcePixelFormat[k] != dm_444_16
> > + if (v->SourcePixelFormat[k] != dm_444_64 && 
> > v->SourcePixelFormat[k] != dm_444_32
> >   && v->SourcePixelFormat[k] != 
> > dm_444_16 && v->SourcePixelFormat[k] != dm_444_8
> >   && v->SourcePixelFormat[k] != 
> > dm_rgbe) {
> >   if (v->ViewportWidthChroma[k] > 
> > v->SurfaceWidthC[k]
> > --
> > 2.20.1
> >
>
> + Anson
>
> Reviewed-by: Rodrigo Siqueira 
>
> --
> Rodrigo Siqueira
> https://siqueira.tech
> ___
> amd-gfx mailing list
> amd-...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm: display: Remove duplicate include in dce110

2021-06-09 Thread Alex Deucher
Applied.  Thanks!

On Wed, Jun 9, 2021 at 2:43 PM Rodrigo Siqueira
 wrote:
>
> On 06/08, Wan Jiabing wrote:
> > Fix the following checkincludes.pl warning:
> > ./drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c
> > 35  #include "dce110_hw_sequencer.h"
> > 69  #include "dce110_hw_sequencer.h"
> >
> >
> > Signed-off-by: Wan Jiabing 
> > ---
> >  drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c | 1 -
> >  1 file changed, 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c 
> > b/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c
> > index a08cd52f6ba8..e20d4def3eb9 100644
> > --- a/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c
> > +++ b/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c
> > @@ -66,7 +66,6 @@
> >
> >  #include "atomfirmware.h"
> >
> > -#include "dce110_hw_sequencer.h"
> >  #include "dcn10/dcn10_hw_sequencer.h"
> >
> >  #define GAMMA_HW_POINTS_NUM 256
> > --
> > 2.20.1
> >
>
> lgtm,
>
> Thanks
>
> Reviewed-by: Rodrigo Siqueira 
>
> --
> Rodrigo Siqueira
> https://siqueira.tech
> ___
> amd-gfx mailing list
> amd-...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm/amd/display: use ARRAY_SIZE for base60_refresh_rates

2021-06-09 Thread Alex Deucher
Applied.  Thanks!

On Wed, Jun 9, 2021 at 6:09 AM Jiapeng Chong
 wrote:
>
> Use ARRAY_SIZE instead of dividing sizeof array with sizeof an
> element.
>
> Clean up the following coccicheck warning:
>
> ./drivers/gpu/drm/amd/display/dc/core/dc_resource.c:448:47-48: WARNING:
> Use ARRAY_SIZE.
>
> Reported-by: Abaci Robot 
> Signed-off-by: Jiapeng Chong 
> ---
>  drivers/gpu/drm/amd/display/dc/core/dc_resource.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_resource.c 
> b/drivers/gpu/drm/amd/display/dc/core/dc_resource.c
> index 57afe71..3f00989 100644
> --- a/drivers/gpu/drm/amd/display/dc/core/dc_resource.c
> +++ b/drivers/gpu/drm/amd/display/dc/core/dc_resource.c
> @@ -445,7 +445,7 @@ bool resource_are_vblanks_synchronizable(
>  {
> uint32_t base60_refresh_rates[] = {10, 20, 5};
> uint8_t i;
> -   uint8_t rr_count = 
> sizeof(base60_refresh_rates)/sizeof(base60_refresh_rates[0]);
> +   uint8_t rr_count = ARRAY_SIZE(base60_refresh_rates);
> uint64_t frame_time_diff;
>
> if (stream1->ctx->dc->config.vblank_alignment_dto_params &&
> --
> 1.8.3.1
>


[PATCH] drm/msm/dsi: do not enable PHYs when called for the slave DSI interface

2021-06-09 Thread Dmitry Baryshkov
Move the call to dsi_mgr_phy_enable after checking whether the DSI
interface is slave, so that PHY enablement happens together with the
host enablement.

Signed-off-by: Dmitry Baryshkov 
---
 drivers/gpu/drm/msm/dsi/dsi_manager.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/msm/dsi/dsi_manager.c 
b/drivers/gpu/drm/msm/dsi/dsi_manager.c
index cd016576e8c5..9243edada9ba 100644
--- a/drivers/gpu/drm/msm/dsi/dsi_manager.c
+++ b/drivers/gpu/drm/msm/dsi/dsi_manager.c
@@ -373,14 +373,14 @@ static void dsi_mgr_bridge_pre_enable(struct drm_bridge 
*bridge)
if (!msm_dsi_device_connected(msm_dsi))
return;
 
-   ret = dsi_mgr_phy_enable(id, phy_shared_timings);
-   if (ret)
-   goto phy_en_fail;
-
/* Do nothing with the host if it is slave-DSI in case of dual DSI */
if (is_dual_dsi && !IS_MASTER_DSI_LINK(id))
return;
 
+   ret = dsi_mgr_phy_enable(id, phy_shared_timings);
+   if (ret)
+   goto phy_en_fail;
+
ret = msm_dsi_host_power_on(host, _shared_timings[id], is_dual_dsi);
if (ret) {
pr_err("%s: power on host %d failed, %d\n", __func__, id, ret);
-- 
2.30.2



Re: [PATCH] drm/amd/display: Fix duplicate included clk_mgr.h

2021-06-09 Thread Alex Deucher
Applied.  Thanks!

On Wed, Jun 9, 2021 at 6:05 AM Jiapeng Chong
 wrote:
>
> Clean up the following includecheck warning:
>
> ./drivers/gpu/drm/amd/display/dc/dcn31/dcn31_hwseq.c: clk_mgr.h is
> included more than once.
>
> No functional change.
>
> Reported-by: Abaci Robot 
> Signed-off-by: Jiapeng Chong 
> ---
>  drivers/gpu/drm/amd/display/dc/dcn31/dcn31_hwseq.c | 1 -
>  1 file changed, 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_hwseq.c 
> b/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_hwseq.c
> index c0e544d..1007051 100644
> --- a/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_hwseq.c
> +++ b/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_hwseq.c
> @@ -33,7 +33,6 @@
>  #include "clk_mgr.h"
>  #include "reg_helper.h"
>  #include "abm.h"
> -#include "clk_mgr.h"
>  #include "hubp.h"
>  #include "dchubbub.h"
>  #include "timing_generator.h"
> --
> 1.8.3.1
>


[PATCH v4] drm/msm/dsi: add continuous clock support for 7nm PHY

2021-06-09 Thread Dmitry Baryshkov
Unlike previous generations, 7nm PHYs are required to collaborate with
the host for continuous clock mode. Add changes necessary to enable
continuous clock mode in the 7nm DSI PHYs.

Signed-off-by: Dmitry Baryshkov 
---
Changes since v3:
 - Invert the DSI_LANE_CTRL_HS_REQ_SEL_PHY bit logic, as noted by
   Abhinav.

Changes since v2:
 - Really drop msm_dsi_phy_needs_hs_phy_sel()

Changes since v1:
 - Remove the need for a separate msm_dsi_phy_needs_hs_phy_sel() call
 - Fix setting continuous clock for a dual DSI case.
---
 drivers/gpu/drm/msm/dsi/dsi.h |  3 ++-
 drivers/gpu/drm/msm/dsi/dsi.xml.h |  1 +
 drivers/gpu/drm/msm/dsi/dsi_host.c| 12 
 drivers/gpu/drm/msm/dsi/dsi_manager.c |  4 ++--
 drivers/gpu/drm/msm/dsi/phy/dsi_phy.c |  9 +
 drivers/gpu/drm/msm/dsi/phy/dsi_phy.h |  1 +
 drivers/gpu/drm/msm/dsi/phy/dsi_phy_7nm.c | 17 +
 7 files changed, 40 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/msm/dsi/dsi.h b/drivers/gpu/drm/msm/dsi/dsi.h
index 9b8e9b07eced..58e63bf34fe9 100644
--- a/drivers/gpu/drm/msm/dsi/dsi.h
+++ b/drivers/gpu/drm/msm/dsi/dsi.h
@@ -109,7 +109,7 @@ int msm_dsi_host_enable(struct mipi_dsi_host *host);
 int msm_dsi_host_disable(struct mipi_dsi_host *host);
 int msm_dsi_host_power_on(struct mipi_dsi_host *host,
struct msm_dsi_phy_shared_timings *phy_shared_timings,
-   bool is_dual_dsi);
+   bool is_dual_dsi, struct msm_dsi_phy *phy);
 int msm_dsi_host_power_off(struct mipi_dsi_host *host);
 int msm_dsi_host_set_display_mode(struct mipi_dsi_host *host,
  const struct drm_display_mode *mode);
@@ -175,6 +175,7 @@ int msm_dsi_phy_get_clk_provider(struct msm_dsi_phy *phy,
 void msm_dsi_phy_pll_save_state(struct msm_dsi_phy *phy);
 int msm_dsi_phy_pll_restore_state(struct msm_dsi_phy *phy);
 void msm_dsi_phy_snapshot(struct msm_disp_state *disp_state, struct 
msm_dsi_phy *phy);
+bool msm_dsi_phy_set_continuous_clock(struct msm_dsi_phy *phy, bool enable);
 
 #endif /* __DSI_CONNECTOR_H__ */
 
diff --git a/drivers/gpu/drm/msm/dsi/dsi.xml.h 
b/drivers/gpu/drm/msm/dsi/dsi.xml.h
index 50eb4d1b8fdd..9762af6035e9 100644
--- a/drivers/gpu/drm/msm/dsi/dsi.xml.h
+++ b/drivers/gpu/drm/msm/dsi/dsi.xml.h
@@ -510,6 +510,7 @@ static inline uint32_t 
DSI_CLKOUT_TIMING_CTRL_T_CLK_POST(uint32_t val)
 #define DSI_LANE_STATUS_DLN0_DIRECTION 0x0001
 
 #define REG_DSI_LANE_CTRL  0x00a8
+#define DSI_LANE_CTRL_HS_REQ_SEL_PHY   0x0100
 #define DSI_LANE_CTRL_CLKLN_HS_FORCE_REQUEST   0x1000
 
 #define REG_DSI_LANE_SWAP_CTRL 0x00ac
diff --git a/drivers/gpu/drm/msm/dsi/dsi_host.c 
b/drivers/gpu/drm/msm/dsi/dsi_host.c
index ed504fe5074f..3558e5cd400f 100644
--- a/drivers/gpu/drm/msm/dsi/dsi_host.c
+++ b/drivers/gpu/drm/msm/dsi/dsi_host.c
@@ -834,7 +834,7 @@ static inline enum dsi_cmd_dst_format dsi_get_cmd_fmt(
 }
 
 static void dsi_ctrl_config(struct msm_dsi_host *msm_host, bool enable,
-   struct msm_dsi_phy_shared_timings *phy_shared_timings)
+   struct msm_dsi_phy_shared_timings *phy_shared_timings, 
struct msm_dsi_phy *phy)
 {
u32 flags = msm_host->mode_flags;
enum mipi_dsi_pixel_format mipi_fmt = msm_host->format;
@@ -929,6 +929,10 @@ static void dsi_ctrl_config(struct msm_dsi_host *msm_host, 
bool enable,
 
if (!(flags & MIPI_DSI_CLOCK_NON_CONTINUOUS)) {
lane_ctrl = dsi_read(msm_host, REG_DSI_LANE_CTRL);
+
+   if (msm_dsi_phy_set_continuous_clock(phy, enable))
+   lane_ctrl &= ~DSI_LANE_CTRL_HS_REQ_SEL_PHY;
+
dsi_write(msm_host, REG_DSI_LANE_CTRL,
lane_ctrl | DSI_LANE_CTRL_CLKLN_HS_FORCE_REQUEST);
}
@@ -2354,7 +2358,7 @@ static void msm_dsi_sfpb_config(struct msm_dsi_host 
*msm_host, bool enable)
 
 int msm_dsi_host_power_on(struct mipi_dsi_host *host,
struct msm_dsi_phy_shared_timings *phy_shared_timings,
-   bool is_dual_dsi)
+   bool is_dual_dsi, struct msm_dsi_phy *phy)
 {
struct msm_dsi_host *msm_host = to_msm_dsi_host(host);
const struct msm_dsi_cfg_handler *cfg_hnd = msm_host->cfg_hnd;
@@ -2394,7 +2398,7 @@ int msm_dsi_host_power_on(struct mipi_dsi_host *host,
 
dsi_timing_setup(msm_host, is_dual_dsi);
dsi_sw_reset(msm_host);
-   dsi_ctrl_config(msm_host, true, phy_shared_timings);
+   dsi_ctrl_config(msm_host, true, phy_shared_timings, phy);
 
if (msm_host->disp_en_gpio)
gpiod_set_value(msm_host->disp_en_gpio, 1);
@@ -2425,7 +2429,7 @@ int msm_dsi_host_power_off(struct mipi_dsi_host *host)
goto unlock_ret;
}
 
-   dsi_ctrl_config(msm_host, false, NULL);
+   

Re: [PATCH 07/13] drm/i915/guc: New definition of the CTB registration action

2021-06-09 Thread Michal Wajdeczko



On 09.06.2021 19:36, John Harrison wrote:
> On 6/7/2021 18:23, Daniele Ceraolo Spurio wrote:
>> On 6/7/2021 11:03 AM, Matthew Brost wrote:
>>> From: Michal Wajdeczko 
>>>
>>> Definition of the CTB registration action has changed.
>>> Add some ABI documentation and implement required changes.
>>>
>>> Signed-off-by: Michal Wajdeczko 
>>> Signed-off-by: Matthew Brost 
>>> Cc: Piotr Piórkowski  #4
>>> ---
>>>   .../gpu/drm/i915/gt/uc/abi/guc_actions_abi.h  | 107 ++
>>>   .../gt/uc/abi/guc_communication_ctb_abi.h |   4 -
>>>   drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c |  76 -
>>>   3 files changed, 152 insertions(+), 35 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/gt/uc/abi/guc_actions_abi.h
>>> b/drivers/gpu/drm/i915/gt/uc/abi/guc_actions_abi.h
>>> index 90efef8a73e4..6426fc183692 100644
>>> --- a/drivers/gpu/drm/i915/gt/uc/abi/guc_actions_abi.h
>>> +++ b/drivers/gpu/drm/i915/gt/uc/abi/guc_actions_abi.h
>>> @@ -6,6 +6,113 @@
>>>   #ifndef _ABI_GUC_ACTIONS_ABI_H
>>>   #define _ABI_GUC_ACTIONS_ABI_H
>>>   +/**
>>> + * DOC: HOST2GUC_REGISTER_CTB
>>> + *
>>> + * This message is used as part of the `CTB based communication`_
>>> setup.
>>> + *
>>> + * This message must be sent as `MMIO HXG Message`_.
>>> + *
>>> + *
>>> +---+---+--+
>>>
>>> + *  |   | Bits  |
>>> Description  |
>>> + *
>>> +===+===+==+
>>>
>>> + *  | 0 |    31 | ORIGIN =
>>> GUC_HXG_ORIGIN_HOST_    |
>>> + *  |
>>> +---+--+
>>> + *  |   | 30:28 | TYPE =
>>> GUC_HXG_TYPE_REQUEST_ |
>>> + *  |
>>> +---+--+
>>> + *  |   | 27:16 | DATA0 =
>>> MBZ  |
>>> + *  |
>>> +---+--+
>>> + *  |   |  15:0 | ACTION = _`GUC_ACTION_HOST2GUC_REGISTER_CTB` =
>>> 0x5200    |
>>
>> Specs says 4505
>>
>>> + *
>>> +---+---+--+
>>>
>>> + *  | 1 | 31:12 | RESERVED =
>>> MBZ   |
>>> + *  |
>>> +---+--+
>>> + *  |   |  11:8 | **TYPE** - type for the `CT
>>> Buffer`_ |
>>> + *  |   |
>>> |  |
>>> + *  |   |   |   - _`GUC_CTB_TYPE_HOST2GUC` =
>>> 0 |
>>> + *  |   |   |   - _`GUC_CTB_TYPE_GUC2HOST` =
>>> 1 |
>>> + *  |
>>> +---+--+
>>> + *  |   |   7:0 | **SIZE** - size of the `CT Buffer`_ in 4K units
>>> minus 1  |
>>> + *
>>> +---+---+--+
>>>
>>> + *  | 2 |  31:0 | **DESC_ADDR** - GGTT address of the `CTB
>>> Descriptor`_    |
>>> + *
>>> +---+---+--+
>>>
>>> + *  | 3 |  31:0 | **BUFF_ADDF** - GGTT address of the `CT
>>> Buffer`_ |
>>> + *
>>> +---+---+--+
>>>
>>> +*
>>> + *
>>> +---+---+--+
>>>
>>> + *  |   | Bits  |
>>> Description  |
>>> + *
>>> +===+===+==+
>>>
>>> + *  | 0 |    31 | ORIGIN =
>>> GUC_HXG_ORIGIN_GUC_ |
>>> + *  |
>>> +---+--+
>>> + *  |   | 30:28 | TYPE =
>>> GUC_HXG_TYPE_RESPONSE_SUCCESS_    |
>>> + *  |
>>> +---+--+
>>> + *  |   |  27:0 | DATA0 =
>>> MBZ  |
>>> + *
>>> +---+---+--+
>>>
>>> + */
>>> +#define GUC_ACTION_HOST2GUC_REGISTER_CTB    0x4505 // FIXME 0x5200
>>
>> Why FIXME? AFAICS the specs still says 4505, even if we plan to update
>> at some point I don;t think this deserves a FIXME since nothing is
>> incorrect.
>>
>>> +
>>> +#define HOST2GUC_REGISTER_CTB_REQUEST_MSG_LEN
>>> (GUC_HXG_REQUEST_MSG_MIN_LEN + 3u)
>>> +#define HOST2GUC_REGISTER_CTB_REQUEST_MSG_0_MBZ
>>> GUC_HXG_REQUEST_MSG_0_DATA0
>>> +#define HOST2GUC_REGISTER_CTB_REQUEST_MSG_1_MBZ    (0xf << 12)
>>> +#define HOST2GUC_REGISTER_CTB_REQUEST_MSG_1_TYPE    (0xf << 8)
>>> +#define   GUC_CTB_TYPE_HOST2GUC    0u
>>> +#define   GUC_CTB_TYPE_GUC2HOST    1u
>>> +#define 

Re: nouveau broken on Riva TNT2 in 5.13.0-rc4: NULL pointer dereference in nouveau_bo_sync_for_device

2021-06-09 Thread Ondrej Zary
On Wednesday 09 June 2021 11:21:05 Christian König wrote:
> Am 09.06.21 um 09:10 schrieb Ondrej Zary:
> > On Wednesday 09 June 2021, Christian König wrote:
> >> Am 09.06.21 um 08:57 schrieb Ondrej Zary:
> >>> [SNIP]
>  Thanks for the heads up. So the problem with my patch is already fixed,
>  isn't it?
> >>> The NULL pointer dereference in nouveau_bo_wr16 introduced in
> >>> 141b15e59175aa174ca1f7596188bd15a7ca17ba was fixed by
> >>> aea656b0d05ec5b8ed5beb2f94c4dd42ea834e9d.
> >>>
> >>> That's the bug I hit when bisecting the original problem:
> >>> NULL pointer dereference in nouveau_bo_sync_for_device
> >>> It's caused by:
> >>> # first bad commit: [e34b8feeaa4b65725b25f49c9b08a0f8707e8e86] drm/ttm: 
> >>> merge ttm_dma_tt back into ttm_tt
> >> Good that I've asked :)
> >>
> >> Ok that's a bit strange. e34b8feeaa4b65725b25f49c9b08a0f8707e8e86 was
> >> created mostly automated.
> >>
> >> Do you have the original backtrace of that NULL pointer deref once more?
> > The original backtrace is here: 
> > https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flkml.org%2Flkml%2F2021%2F6%2F5%2F350data=04%7C01%7Cchristian.koenig%40amd.com%7Ce905b6bd2aa842ace15508d92b15b96d%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637588195000729460%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000sdata=zFqheBbJcOHtYgqG%2Fs63AT1dwuk4REmUDJWHvzaLAlc%3Dreserved=0
> 
> And the problem is that ttm_dma->dma_address is NULL, right? Mhm, I 
> don't see how that can happen since nouveau is using ttm_sg_tt_init().
> 
> Apart from that what nouveau does here is rather questionable since you 
> need a coherent architecture for most things anyway, but that's not what 
> we are trying to fix here.
> 
> Can you try to narrow down if ttm_sg_tt_init is called before calling 
> this function for the tt object in question?

ttm_sg_tt_init is not called:
[   12.150124] nouveau :01:00.0: DRM: VRAM: 31 MiB
[   12.150133] nouveau :01:00.0: DRM: GART: 128 MiB
[   12.150143] nouveau :01:00.0: DRM: BMP version 5.6
[   12.150151] nouveau :01:00.0: DRM: No DCB data found in VBIOS
[   12.151362] ttm_tt_init
[   12.151370] ttm_tt_init_fields
[   12.151374] ttm_tt_alloc_page_directory
[   12.151615] BUG: kernel NULL pointer dereference, address: 



-- 
Ondrej Zary


Re: [PATCH 07/13] drm/i915/guc: New definition of the CTB registration action

2021-06-09 Thread Michal Wajdeczko



On 08.06.2021 03:23, Daniele Ceraolo Spurio wrote:
> 
> 
> On 6/7/2021 11:03 AM, Matthew Brost wrote:
>> From: Michal Wajdeczko 
>>
>> Definition of the CTB registration action has changed.
>> Add some ABI documentation and implement required changes.
>>
>> Signed-off-by: Michal Wajdeczko 
>> Signed-off-by: Matthew Brost 
>> Cc: Piotr Piórkowski  #4
>> ---
>>   .../gpu/drm/i915/gt/uc/abi/guc_actions_abi.h  | 107 ++
>>   .../gt/uc/abi/guc_communication_ctb_abi.h |   4 -
>>   drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c |  76 -
>>   3 files changed, 152 insertions(+), 35 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/gt/uc/abi/guc_actions_abi.h
>> b/drivers/gpu/drm/i915/gt/uc/abi/guc_actions_abi.h
>> index 90efef8a73e4..6426fc183692 100644
>> --- a/drivers/gpu/drm/i915/gt/uc/abi/guc_actions_abi.h
>> +++ b/drivers/gpu/drm/i915/gt/uc/abi/guc_actions_abi.h
>> @@ -6,6 +6,113 @@
>>   #ifndef _ABI_GUC_ACTIONS_ABI_H
>>   #define _ABI_GUC_ACTIONS_ABI_H
>>   +/**
>> + * DOC: HOST2GUC_REGISTER_CTB
>> + *
>> + * This message is used as part of the `CTB based communication`_ setup.
>> + *
>> + * This message must be sent as `MMIO HXG Message`_.
>> + *
>> + * 
>> +---+---+--+
>>
>> + *  |   | Bits  |
>> Description  |
>> + * 
>> +===+===+==+
>>
>> + *  | 0 |    31 | ORIGIN =
>> GUC_HXG_ORIGIN_HOST_    |
>> + *  |  
>> +---+--+
>> + *  |   | 30:28 | TYPE =
>> GUC_HXG_TYPE_REQUEST_ |
>> + *  |  
>> +---+--+
>> + *  |   | 27:16 | DATA0 =
>> MBZ  |
>> + *  |  
>> +---+--+
>> + *  |   |  15:0 | ACTION = _`GUC_ACTION_HOST2GUC_REGISTER_CTB` =
>> 0x5200    |
> 
> Specs says 4505

but draft was saying 5200 ;)

> 
>> + * 
>> +---+---+--+
>>
>> + *  | 1 | 31:12 | RESERVED =
>> MBZ   |
>> + *  |  
>> +---+--+
>> + *  |   |  11:8 | **TYPE** - type for the `CT
>> Buffer`_ |
>> + *  |   |  
>> |  |
>> + *  |   |   |   - _`GUC_CTB_TYPE_HOST2GUC` =
>> 0 |
>> + *  |   |   |   - _`GUC_CTB_TYPE_GUC2HOST` =
>> 1 |
>> + *  |  
>> +---+--+
>> + *  |   |   7:0 | **SIZE** - size of the `CT Buffer`_ in 4K units
>> minus 1  |
>> + * 
>> +---+---+--+
>>
>> + *  | 2 |  31:0 | **DESC_ADDR** - GGTT address of the `CTB
>> Descriptor`_    |
>> + * 
>> +---+---+--+
>>
>> + *  | 3 |  31:0 | **BUFF_ADDF** - GGTT address of the `CT
>> Buffer`_ |
>> + * 
>> +---+---+--+
>>
>> +*
>> + * 
>> +---+---+--+
>>
>> + *  |   | Bits  |
>> Description  |
>> + * 
>> +===+===+==+
>>
>> + *  | 0 |    31 | ORIGIN =
>> GUC_HXG_ORIGIN_GUC_ |
>> + *  |  
>> +---+--+
>> + *  |   | 30:28 | TYPE =
>> GUC_HXG_TYPE_RESPONSE_SUCCESS_    |
>> + *  |  
>> +---+--+
>> + *  |   |  27:0 | DATA0 =
>> MBZ  |
>> + * 
>> +---+---+--+
>>
>> + */
>> +#define GUC_ACTION_HOST2GUC_REGISTER_CTB    0x4505 // FIXME 0x5200
> 
> Why FIXME? AFAICS the specs still says 4505, even if we plan to update
> at some point I don;t think this deserves a FIXME since nothing is
> incorrect.

patch was prepared based on the draft spec and this FIXME was added just as
a heads-up since we were expecting GuC to make this change soon, but since
we are going with GuC 62 that uses 4505, agree, we need to drop this FIXME

> 
>> +
>> +#define HOST2GUC_REGISTER_CTB_REQUEST_MSG_LEN   
>> (GUC_HXG_REQUEST_MSG_MIN_LEN + 3u)
>> +#define HOST2GUC_REGISTER_CTB_REQUEST_MSG_0_MBZ   
>> GUC_HXG_REQUEST_MSG_0_DATA0
>> +#define HOST2GUC_REGISTER_CTB_REQUEST_MSG_1_MBZ    (0xf << 12)
>> +#define HOST2GUC_REGISTER_CTB_REQUEST_MSG_1_TYPE    (0xf << 8)
>> +#define   

Re: [RFC PATCH v2 1/8] ext4/xfs: add page refcount helper

2021-06-09 Thread Matthew Wilcox
On Mon, Jun 07, 2021 at 03:42:19PM -0500, Alex Sierra wrote:
> +++ b/include/linux/dax.h
> @@ -243,6 +243,16 @@ static inline bool dax_mapping(struct address_space 
> *mapping)
>   return mapping->host && IS_DAX(mapping->host);
>  }
>  
> +static inline bool dax_layout_is_idle_page(struct page *page)
> +{
> + return page_ref_count(page) == 1;
> +}

We already have something called an idle page, and that's quite a
different thing from this.  How about dax_page_unused() (it's a use
count, so once it's got down to its minimum value, it's unused)?



Re: [PATCH] drm: display: Remove duplicated argument in dcn31

2021-06-09 Thread Rodrigo Siqueira
On 06/09, Wan Jiabing wrote:
> Fix the following coccicheck warning:
> ./drivers/gpu/drm/amd/display/dc/dml/dcn31/display_mode_vba_31.c:
> 3539:12-42: duplicated argument to && or ||
> ./drivers/gpu/drm/amd/display/dc/dml/dcn31/display_mode_vba_31.c:
> 5677:87-123: duplicated argument to && or ||
> 
> Signed-off-by: Wan Jiabing 
> ---
>  .../gpu/drm/amd/display/dc/dml/dcn31/display_mode_vba_31.c| 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn31/display_mode_vba_31.c 
> b/drivers/gpu/drm/amd/display/dc/dml/dcn31/display_mode_vba_31.c
> index d655655baaba..06fac59a3d40 100644
> --- a/drivers/gpu/drm/amd/display/dc/dml/dcn31/display_mode_vba_31.c
> +++ b/drivers/gpu/drm/amd/display/dc/dml/dcn31/display_mode_vba_31.c
> @@ -3536,7 +3536,7 @@ static bool CalculateBytePerPixelAnd256BBlockSizes(
>   *BytePerPixelDETC = 0;
>   *BytePerPixelY = 4;
>   *BytePerPixelC = 0;
> - } else if (SourcePixelFormat == dm_444_16 || SourcePixelFormat == 
> dm_444_16) {
> + } else if (SourcePixelFormat == dm_444_16) {
>   *BytePerPixelDETY = 2;
>   *BytePerPixelDETC = 0;
>   *BytePerPixelY = 2;
> @@ -5674,7 +5674,7 @@ void dml31_ModeSupportAndSystemConfigurationFull(struct 
> display_mode_lib *mode_l
>   for (k = 0; k < v->NumberOfActivePlanes; k++) {
>   if (v->ViewportWidth[k] > v->SurfaceWidthY[k] || 
> v->ViewportHeight[k] > v->SurfaceHeightY[k]) {
>   ViewportExceedsSurface = true;
> - if (v->SourcePixelFormat[k] != dm_444_64 && 
> v->SourcePixelFormat[k] != dm_444_32 && v->SourcePixelFormat[k] != dm_444_16
> + if (v->SourcePixelFormat[k] != dm_444_64 && 
> v->SourcePixelFormat[k] != dm_444_32
>   && v->SourcePixelFormat[k] != dm_444_16 
> && v->SourcePixelFormat[k] != dm_444_8
>   && v->SourcePixelFormat[k] != dm_rgbe) {
>   if (v->ViewportWidthChroma[k] > 
> v->SurfaceWidthC[k]
> -- 
> 2.20.1
>

+ Anson

Reviewed-by: Rodrigo Siqueira  

-- 
Rodrigo Siqueira
https://siqueira.tech


signature.asc
Description: PGP signature


Re: [PATCH] drm: display: Remove duplicate include in dce110

2021-06-09 Thread Rodrigo Siqueira
On 06/08, Wan Jiabing wrote:
> Fix the following checkincludes.pl warning:
> ./drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c
> 35  #include "dce110_hw_sequencer.h"
> 69  #include "dce110_hw_sequencer.h"
> 
> 
> Signed-off-by: Wan Jiabing 
> ---
>  drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c 
> b/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c
> index a08cd52f6ba8..e20d4def3eb9 100644
> --- a/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c
> +++ b/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c
> @@ -66,7 +66,6 @@
>  
>  #include "atomfirmware.h"
>  
> -#include "dce110_hw_sequencer.h"
>  #include "dcn10/dcn10_hw_sequencer.h"
>  
>  #define GAMMA_HW_POINTS_NUM 256
> -- 
> 2.20.1
>

lgtm,

Thanks

Reviewed-by: Rodrigo Siqueira  

-- 
Rodrigo Siqueira
https://siqueira.tech


signature.asc
Description: PGP signature


[PATCH] udmabuf: Add support for mapping hugepages (v4)

2021-06-09 Thread Vivek Kasireddy
If the VMM's (Qemu) memory backend is backed up by memfd + Hugepages
(hugetlbfs and not THP), we have to first find the hugepage(s) where
the Guest allocations are located and then extract the regular 4k
sized subpages from them.

v2: Ensure that the subpage and hugepage offsets are calculated correctly
when the range of subpage allocations cuts across multiple hugepages.

v3: Instead of repeatedly looking up the hugepage for each subpage,
only do it when the subpage allocation crosses over into a different
hugepage. (suggested by Gerd and DW)

v4: Fix the following warning identified by checkpatch:
CHECK:OPEN_ENDED_LINE: Lines should not end with a '('

Cc: Gerd Hoffmann 
Signed-off-by: Vivek Kasireddy 
Signed-off-by: Dongwon Kim 
---
 drivers/dma-buf/udmabuf.c | 50 +--
 1 file changed, 43 insertions(+), 7 deletions(-)

diff --git a/drivers/dma-buf/udmabuf.c b/drivers/dma-buf/udmabuf.c
index db732f71e59a..d509f0d60794 100644
--- a/drivers/dma-buf/udmabuf.c
+++ b/drivers/dma-buf/udmabuf.c
@@ -11,6 +11,7 @@
 #include 
 #include 
 #include 
+#include 
 
 static const u32list_limit = 1024;  /* udmabuf_create_list->count limit */
 static const size_t size_limit_mb = 64; /* total dmabuf size, in megabytes  */
@@ -160,10 +161,13 @@ static long udmabuf_create(struct miscdevice *device,
 {
DEFINE_DMA_BUF_EXPORT_INFO(exp_info);
struct file *memfd = NULL;
+   struct address_space *mapping = NULL;
struct udmabuf *ubuf;
struct dma_buf *buf;
pgoff_t pgoff, pgcnt, pgidx, pgbuf = 0, pglimit;
-   struct page *page;
+   struct page *page, *hpage = NULL;
+   pgoff_t subpgoff, maxsubpgs;
+   struct hstate *hpstate;
int seals, ret = -EINVAL;
u32 i, flags;
 
@@ -194,7 +198,8 @@ static long udmabuf_create(struct miscdevice *device,
memfd = fget(list[i].memfd);
if (!memfd)
goto err;
-   if (!shmem_mapping(file_inode(memfd)->i_mapping))
+   mapping = file_inode(memfd)->i_mapping;
+   if (!shmem_mapping(mapping) && !is_file_hugepages(memfd))
goto err;
seals = memfd_fcntl(memfd, F_GET_SEALS, 0);
if (seals == -EINVAL)
@@ -205,17 +210,48 @@ static long udmabuf_create(struct miscdevice *device,
goto err;
pgoff = list[i].offset >> PAGE_SHIFT;
pgcnt = list[i].size   >> PAGE_SHIFT;
+   if (is_file_hugepages(memfd)) {
+   hpstate = hstate_file(memfd);
+   pgoff = list[i].offset >> huge_page_shift(hpstate);
+   subpgoff = (list[i].offset &
+   ~huge_page_mask(hpstate)) >> PAGE_SHIFT;
+   maxsubpgs = huge_page_size(hpstate) >> PAGE_SHIFT;
+   }
for (pgidx = 0; pgidx < pgcnt; pgidx++) {
-   page = shmem_read_mapping_page(
-   file_inode(memfd)->i_mapping, pgoff + pgidx);
-   if (IS_ERR(page)) {
-   ret = PTR_ERR(page);
-   goto err;
+   if (is_file_hugepages(memfd)) {
+   if (!hpage) {
+   hpage = find_get_page_flags(mapping,
+   pgoff, FGP_ACCESSED);
+   if (IS_ERR(hpage)) {
+   ret = PTR_ERR(hpage);
+   goto err;
+   }
+   }
+   page = hpage + subpgoff;
+   get_page(page);
+   subpgoff++;
+   if (subpgoff == maxsubpgs) {
+   put_page(hpage);
+   hpage = NULL;
+   subpgoff = 0;
+   pgoff++;
+   }
+   } else {
+   page = shmem_read_mapping_page(mapping,
+  pgoff + pgidx);
+   if (IS_ERR(page)) {
+   ret = PTR_ERR(page);
+   goto err;
+   }
}
ubuf->pages[pgbuf++] = page;
}
fput(memfd);
memfd = NULL;
+   if (hpage) {
+   put_page(hpage);
+   hpage = NULL;
+   }
}
 
exp_info.ops  = _ops;
-- 
2.30.2



Re: [Mesa-dev] Linux Graphics Next: Userspace submission update

2021-06-09 Thread Daniel Vetter
On Wed, Jun 09, 2021 at 03:58:26PM +0200, Christian König wrote:
> Am 09.06.21 um 15:19 schrieb Daniel Vetter:
> > [SNIP]
> > > Yeah, we call this the lightweight and the heavyweight tlb flush.
> > > 
> > > The lighweight can be used when you are sure that you don't have any of 
> > > the
> > > PTEs currently in flight in the 3D/DMA engine and you just need to
> > > invalidate the TLB.
> > > 
> > > The heavyweight must be used when you need to invalidate the TLB *AND* 
> > > make
> > > sure that no concurrently operation moves new stuff into the TLB.
> > > 
> > > The problem is for this use case we have to use the heavyweight one.
> > Just for my own curiosity: So the lightweight flush is only for in-between
> > CS when you know access is idle? Or does that also not work if userspace
> > has a CS on a dma engine going at the same time because the tlb aren't
> > isolated enough between engines?
> 
> More or less correct, yes.
> 
> The problem is a lightweight flush only invalidates the TLB, but doesn't
> take care of entries which have been handed out to the different engines.
> 
> In other words what can happen is the following:
> 
> 1. Shader asks TLB to resolve address X.
> 2. TLB looks into its cache and can't find address X so it asks the walker
> to resolve.
> 3. Walker comes back with result for address X and TLB puts that into its
> cache and gives it to Shader.
> 4. Shader starts doing some operation using result for address X.
> 5. You send lightweight TLB invalidate and TLB throws away cached values for
> address X.
> 6. Shader happily still uses whatever the TLB gave to it in step 3 to
> accesses address X
> 
> See it like the shader has their own 1 entry L0 TLB cache which is not
> affected by the lightweight flush.
> 
> The heavyweight flush on the other hand sends out a broadcast signal to
> everybody and only comes back when we are sure that an address is not in use
> any more.

Ah makes sense. On intel the shaders only operate in VA, everything goes
around as explicit async messages to IO blocks. So we don't have this, the
only difference in tlb flushes is between tlb flush in the IB and an mmio
one which is independent for anything currently being executed on an
egine.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [Intel-gfx] [PATCH 1/1] drm/i915/uc: Use platform specific defaults for GuC/HuC enabling

2021-06-09 Thread Daniele Ceraolo Spurio




On 6/3/2021 9:48 AM, Matthew Brost wrote:

From: John Harrison 

The meaning of 'default' for the enable_guc module parameter has been
updated to accurately reflect what is supported on current platforms.
So start using the defaults instead of forcing everything off.
Although, note that right now, the default is for everything to be off
anyway. So this is not a change for current platforms.

Signed-off-by: John Harrison 
Signed-off-by: Matthew Brost 
Reviewed-by: Daniele Ceraolo Spurio 


Double checked the CI results and the 2 errors are unrelated.
Pushed to gt-next.

Daniele


---
  drivers/gpu/drm/i915/i915_params.c | 2 +-
  drivers/gpu/drm/i915/i915_params.h | 2 +-
  2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_params.c 
b/drivers/gpu/drm/i915/i915_params.c
index 0320878d96b0..e07f4cfea63a 100644
--- a/drivers/gpu/drm/i915/i915_params.c
+++ b/drivers/gpu/drm/i915/i915_params.c
@@ -160,7 +160,7 @@ i915_param_named_unsafe(edp_vswing, int, 0400,
  i915_param_named_unsafe(enable_guc, int, 0400,
"Enable GuC load for GuC submission and/or HuC load. "
"Required functionality can be selected using bitmask values. "
-   "(-1=auto, 0=disable [default], 1=GuC submission, 2=HuC load)");
+   "(-1=auto [default], 0=disable, 1=GuC submission, 2=HuC load)");
  
  i915_param_named(guc_log_level, int, 0400,

"GuC firmware logging level. Requires GuC to be loaded. "
diff --git a/drivers/gpu/drm/i915/i915_params.h 
b/drivers/gpu/drm/i915/i915_params.h
index 4a114a5ad000..f27eceb82c0f 100644
--- a/drivers/gpu/drm/i915/i915_params.h
+++ b/drivers/gpu/drm/i915/i915_params.h
@@ -59,7 +59,7 @@ struct drm_printer;
param(int, disable_power_well, -1, 0400) \
param(int, enable_ips, 1, 0600) \
param(int, invert_brightness, 0, 0600) \
-   param(int, enable_guc, 0, 0400) \
+   param(int, enable_guc, -1, 0400) \
param(int, guc_log_level, -1, 0400) \
param(char *, guc_firmware_path, NULL, 0400) \
param(char *, huc_firmware_path, NULL, 0400) \




Re: [PATCH 06/13] drm/i915/guc: New definition of the CTB descriptor

2021-06-09 Thread Michal Wajdeczko



On 08.06.2021 02:59, Daniele Ceraolo Spurio wrote:
> 
> 
> On 6/7/2021 11:03 AM, Matthew Brost wrote:
>> From: Michal Wajdeczko 
>>
>> Definition of the CTB descriptor has changed, leaving only
>> minimal shared fields like HEAD/TAIL/STATUS.
>>
>> Both HEAD and TAIL are now in dwords.
>>
>> Add some ABI documentation and implement required changes.
>>
>> Signed-off-by: Michal Wajdeczko 
>> Signed-off-by: Matthew Brost 
>> ---
>>   .../gt/uc/abi/guc_communication_ctb_abi.h | 70 ++-
>>   drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 70 +--
>>   drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h |  2 +-
>>   3 files changed, 85 insertions(+), 57 deletions(-)
>>
>> diff --git
>> a/drivers/gpu/drm/i915/gt/uc/abi/guc_communication_ctb_abi.h
>> b/drivers/gpu/drm/i915/gt/uc/abi/guc_communication_ctb_abi.h
>> index d38935f47ecf..c2a069a78e01 100644
>> --- a/drivers/gpu/drm/i915/gt/uc/abi/guc_communication_ctb_abi.h
>> +++ b/drivers/gpu/drm/i915/gt/uc/abi/guc_communication_ctb_abi.h
>> @@ -7,6 +7,58 @@
>>   #define _ABI_GUC_COMMUNICATION_CTB_ABI_H
>>     #include 
>> +#include 
>> +
>> +#include "guc_messages_abi.h"
>> +
>> +/**
>> + * DOC: CT Buffer
>> + *
>> + * TBD
> 
> What's the plan with this TBD here?

Plan was to add some updated text based on old "DOC: CTB based
communication" section

> 
>> + */
>> +
>> +/**
>> + * DOC: CTB Descriptor
>> + *
>> + * 
>> +---+---+--+
>>
>> + *  |   | Bits  |
>> Description  |
>> + * 
>> +===+===+==+
>>
>> + *  | 0 |  31:0 | **HEAD** - offset (in dwords) to the last dword
>> that was |
>> + *  |   |   | read from the `CT
>> Buffer`_.  |
>> + *  |   |   | It can only be updated by the
>> receiver.  |
>> + * 
>> +---+---+--+
>>
>> + *  | 1 |  31:0 | **TAIL** - offset (in dwords) to the last dword
>> that was |
>> + *  |   |   | written to the `CT
>> Buffer`_. |
>> + *  |   |   | It can only be updated by the
>> sender.    |
>> + * 
>> +---+---+--+
>>
>> + *  | 2 |  31:0 | **STATUS** - status of the
>> CTB   |
>> + *  |   |  
>> |  |
>> + *  |   |   |   - _`GUC_CTB_STATUS_NO_ERROR` = 0 (normal
>> operation)    |
>> + *  |   |   |   - _`GUC_CTB_STATUS_OVERFLOW` = 1 (head/tail too
>> large) |
>> + *  |   |   |   - _`GUC_CTB_STATUS_UNDERFLOW` = 2 (truncated
>> message)  |
>> + *  |   |   |   - _`GUC_CTB_STATUS_MISMATCH` = 4 (head/tail
>> modified)  |
>> + *  |   |   |   - _`GUC_CTB_STATUS_NO_BACKCHANNEL` =
>> 8 |
>> + *  |   |   |   - _`GUC_CTB_STATUS_MALFORMED_MSG` =
>> 16 |
> 
> I don't see the last 2 error (8 & 16) in the 62.0.0 specs. Where is the
> reference for them?

both were discussed on various meetings but likely didn't make into
final spec 62, so for now we can drop them both

> 
>> + * 
>> +---+---+--+
>>
>> + *  |...|   | RESERVED =
>> MBZ   |
>> + * 
>> +---+---+--+
>>
>> + *  | 15|  31:0 | RESERVED =
>> MBZ   |
>> + * 
>> +---+---+--+
>>
>> + */
>> +
>> +struct guc_ct_buffer_desc {
>> +    u32 head;
>> +    u32 tail;
>> +    u32 status;
>> +#define GUC_CTB_STATUS_NO_ERROR    0
>> +#define GUC_CTB_STATUS_OVERFLOW    (1 << 0)
>> +#define GUC_CTB_STATUS_UNDERFLOW    (1 << 1)
>> +#define GUC_CTB_STATUS_MISMATCH    (1 << 2)
>> +#define GUC_CTB_STATUS_NO_BACKCHANNEL    (1 << 3)
>> +#define GUC_CTB_STATUS_MALFORMED_MSG    (1 << 4)
> 
> use BIT() ?

as explained before, on ABI headers we didn't want any dependency and
just use plain C

> 
>> +    u32 reserved[13];
>> +} __packed;
>> +static_assert(sizeof(struct guc_ct_buffer_desc) == 64);
>>     /**
>>    * DOC: CTB based communication
>> @@ -60,24 +112,6 @@
>>    * - **flags**, holds various bits to control message handling
>>    */
>>   -/*
>> - * Describes single command transport buffer.
>> - * Used by both guc-master and clients.
>> - */
>> -struct guc_ct_buffer_desc {
>> -    u32 addr;    /* gfx address */
>> -    u64 host_private;    /* host private data */
>> -    u32 size;    /* size in bytes */
>> -    u32 head;    /* offset updated by GuC*/
>> -    u32 tail;    /* offset updated by owner */
>> -    u32 is_in_error;    /* error 

Re: [PATCH v2 02/10] drm/arm: malidp: Use fourcc_mod_is_vendor() helper

2021-06-09 Thread Daniel Vetter
On Fri, Mar 26, 2021 at 03:51:31PM +0100, Thierry Reding wrote:
> From: Thierry Reding 
> 
> Rather than open-coding the vendor extraction operation, use the newly
> introduced helper macro.
> 
> Signed-off-by: Thierry Reding 

On the first two patches:

Reviewed-by: Daniel Vetter 

> ---
>  drivers/gpu/drm/arm/malidp_planes.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/arm/malidp_planes.c 
> b/drivers/gpu/drm/arm/malidp_planes.c
> index ddbba67f0283..cd218883cff8 100644
> --- a/drivers/gpu/drm/arm/malidp_planes.c
> +++ b/drivers/gpu/drm/arm/malidp_planes.c
> @@ -165,7 +165,7 @@ bool malidp_format_mod_supported(struct drm_device *drm,
>   return !malidp_hw_format_is_afbc_only(format);
>   }
>  
> - if ((modifier >> 56) != DRM_FORMAT_MOD_VENDOR_ARM) {
> + if (!fourcc_mod_is_vendor(modifier, ARM)) {
>   DRM_ERROR("Unknown modifier (not Arm)\n");
>   return false;
>   }
> -- 
> 2.30.2
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH] drm/nouveau: init the base GEM fields for internal BOs

2021-06-09 Thread Mikko Perttunen

On 6/9/21 8:29 PM, Christian König wrote:

TTMs buffer objects are based on GEM objects for quite a while
and rely on initializing those fields before initializing the TTM BO.

Noveau now doesn't init the GEM object for internally allocated BOs,


Nouveau


so make sure that we at least initialize some necessary fields.

Signed-off-by: Christian König 
---
  drivers/gpu/drm/nouveau/nouveau_bo.c | 6 ++
  1 file changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.c 
b/drivers/gpu/drm/nouveau/nouveau_bo.c
index 520b1ea9d16c..085023624fb0 100644
--- a/drivers/gpu/drm/nouveau/nouveau_bo.c
+++ b/drivers/gpu/drm/nouveau/nouveau_bo.c
@@ -149,6 +149,8 @@ nouveau_bo_del_ttm(struct ttm_buffer_object *bo)
 */
if (bo->base.dev)
drm_gem_object_release(>base);
+   else
+   dma_resv_fini(>base._resv);
  
  	kfree(nvbo);

  }
@@ -330,6 +332,10 @@ nouveau_bo_new(struct nouveau_cli *cli, u64 size, int 
align,
if (IS_ERR(nvbo))
return PTR_ERR(nvbo);
  
+	nvbo->bo.base.size = size;

+   dma_resv_init(>bo.base._resv);
+   drm_vma_node_reset(>bo.base.vma_node);
+
ret = nouveau_bo_init(nvbo, size, align, domain, sg, robj);
if (ret)
return ret;



That works, thanks for the fix!

Tested-by: Mikko Perttunen 

Mikko


[PATCH 31/31] HACK: Always finalize contexts

2021-06-09 Thread Jason Ekstrand
Only for verifying the previous patch with I-G-T.  DO NOT MERGE!
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 7d6f52d8a8012..9395d9d7f9530 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -1996,7 +1996,7 @@ int i915_gem_context_create_ioctl(struct drm_device *dev, 
void *data,
goto err_pc;
}
 
-   if (GRAPHICS_VER(i915) > 12) {
+   if (1 || (GRAPHICS_VER(i915) > 12)) {
struct i915_gem_context *ctx;
 
/* Get ourselves a context ID */
-- 
2.31.1



[PATCH 30/31] drm/i915: Finalize contexts in GEM_CONTEXT_CREATE on version 13+

2021-06-09 Thread Jason Ekstrand
All the proto-context stuff for context creation exists to allow older
userspace drivers to set VMs and engine sets via SET_CONTEXT_PARAM.
Drivers need to update to use CONTEXT_CREATE_EXT_* for this going
forward.  Force the issue by blocking the old mechanism on any future
hardware generations.

Signed-off-by: Jason Ekstrand 
Cc: Jon Bloomfield 
Cc: Carl Zhang 
Cc: Michal Mrozek 
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c | 39 -
 1 file changed, 30 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index c67e305f5bc74..7d6f52d8a8012 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -1996,9 +1996,28 @@ int i915_gem_context_create_ioctl(struct drm_device 
*dev, void *data,
goto err_pc;
}
 
-   ret = proto_context_register(ext_data.fpriv, ext_data.pc, );
-   if (ret < 0)
-   goto err_pc;
+   if (GRAPHICS_VER(i915) > 12) {
+   struct i915_gem_context *ctx;
+
+   /* Get ourselves a context ID */
+   ret = xa_alloc(_data.fpriv->context_xa, , NULL,
+  xa_limit_32b, GFP_KERNEL);
+   if (ret)
+   goto err_pc;
+
+   ctx = i915_gem_create_context(i915, ext_data.pc);
+   if (IS_ERR(ctx)) {
+   ret = PTR_ERR(ctx);
+   goto err_pc;
+   }
+
+   proto_context_close(ext_data.pc);
+   gem_context_register(ctx, ext_data.fpriv, id);
+   } else {
+   ret = proto_context_register(ext_data.fpriv, ext_data.pc, );
+   if (ret < 0)
+   goto err_pc;
+   }
 
args->ctx_id = id;
drm_dbg(>drm, "HW context %d created\n", args->ctx_id);
@@ -2181,15 +2200,17 @@ int i915_gem_context_setparam_ioctl(struct drm_device 
*dev, void *data,
mutex_lock(_priv->proto_context_lock);
ctx = __context_lookup(file_priv, args->ctx_id);
if (!ctx) {
-   /* FIXME: We should consider disallowing SET_CONTEXT_PARAM
-* for most things on future platforms.  Clients should be
-* using CONTEXT_CREATE_EXT_PARAM instead.
-*/
pc = xa_load(_priv->proto_context_xa, args->ctx_id);
-   if (pc)
+   if (pc) {
+   /* Contexts should be finalized inside
+* GEM_CONTEXT_CREATE starting with graphics
+* version 13.
+*/
+   WARN_ON(GRAPHICS_VER(file_priv->dev_priv) > 12);
ret = set_proto_ctx_param(file_priv, pc, args);
-   else
+   } else {
ret = -ENOENT;
+   }
}
mutex_unlock(_priv->proto_context_lock);
 
-- 
2.31.1



[PATCH 29/31] drm/i915/gem: Roll all of context creation together

2021-06-09 Thread Jason Ekstrand
Now that we have the whole engine set and VM at context creation time,
we can just assign those fields instead of creating first and handling
the VM and engines later.  This lets us avoid creating useless VMs and
engine sets and lets us get rid of the complex VM setting code.

Signed-off-by: Jason Ekstrand 
Reviewed-by: Daniel Vetter 
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c   | 176 ++
 .../gpu/drm/i915/gem/selftests/mock_context.c |  33 ++--
 2 files changed, 73 insertions(+), 136 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 5f5375b15c530..c67e305f5bc74 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -1279,56 +1279,6 @@ static int __context_set_persistence(struct 
i915_gem_context *ctx, bool state)
return 0;
 }
 
-static struct i915_gem_context *
-__create_context(struct drm_i915_private *i915,
-const struct i915_gem_proto_context *pc)
-{
-   struct i915_gem_context *ctx;
-   struct i915_gem_engines *e;
-   int err;
-   int i;
-
-   ctx = kzalloc(sizeof(*ctx), GFP_KERNEL);
-   if (!ctx)
-   return ERR_PTR(-ENOMEM);
-
-   kref_init(>ref);
-   ctx->i915 = i915;
-   ctx->sched = pc->sched;
-   mutex_init(>mutex);
-   INIT_LIST_HEAD(>link);
-
-   spin_lock_init(>stale.lock);
-   INIT_LIST_HEAD(>stale.engines);
-
-   mutex_init(>engines_mutex);
-   e = default_engines(ctx, pc->legacy_rcs_sseu);
-   if (IS_ERR(e)) {
-   err = PTR_ERR(e);
-   goto err_free;
-   }
-   RCU_INIT_POINTER(ctx->engines, e);
-
-   INIT_RADIX_TREE(>handles_vma, GFP_KERNEL);
-   mutex_init(>lut_mutex);
-
-   /* NB: Mark all slices as needing a remap so that when the context first
-* loads it will restore whatever remap state already exists. If there
-* is no remap info, it will be a NOP. */
-   ctx->remap_slice = ALL_L3_SLICES(i915);
-
-   ctx->user_flags = pc->user_flags;
-
-   for (i = 0; i < ARRAY_SIZE(ctx->hang_timestamp); i++)
-   ctx->hang_timestamp[i] = jiffies - CONTEXT_FAST_HANG_JIFFIES;
-
-   return ctx;
-
-err_free:
-   kfree(ctx);
-   return ERR_PTR(err);
-}
-
 static inline struct i915_gem_engines *
 __context_engines_await(const struct i915_gem_context *ctx,
bool *user_engines)
@@ -1372,54 +1322,31 @@ context_apply_all(struct i915_gem_context *ctx,
i915_sw_fence_complete(>fence);
 }
 
-static void __apply_ppgtt(struct intel_context *ce, void *vm)
-{
-   i915_vm_put(ce->vm);
-   ce->vm = i915_vm_get(vm);
-}
-
-static struct i915_address_space *
-__set_ppgtt(struct i915_gem_context *ctx, struct i915_address_space *vm)
-{
-   struct i915_address_space *old;
-
-   old = rcu_replace_pointer(ctx->vm,
- i915_vm_open(vm),
- lockdep_is_held(>mutex));
-   GEM_BUG_ON(old && i915_vm_is_4lvl(vm) != i915_vm_is_4lvl(old));
-
-   context_apply_all(ctx, __apply_ppgtt, vm);
-
-   return old;
-}
-
-static void __assign_ppgtt(struct i915_gem_context *ctx,
-  struct i915_address_space *vm)
-{
-   if (vm == rcu_access_pointer(ctx->vm))
-   return;
-
-   vm = __set_ppgtt(ctx, vm);
-   if (vm)
-   i915_vm_close(vm);
-}
-
 static struct i915_gem_context *
 i915_gem_create_context(struct drm_i915_private *i915,
const struct i915_gem_proto_context *pc)
 {
struct i915_gem_context *ctx;
-   int ret;
+   struct i915_address_space *vm = NULL;
+   struct i915_gem_engines *e;
+   int err;
+   int i;
 
-   ctx = __create_context(i915, pc);
-   if (IS_ERR(ctx))
-   return ctx;
+   ctx = kzalloc(sizeof(*ctx), GFP_KERNEL);
+   if (!ctx)
+   return ERR_PTR(-ENOMEM);
+
+   kref_init(>ref);
+   ctx->i915 = i915;
+   ctx->sched = pc->sched;
+   mutex_init(>mutex);
+   INIT_LIST_HEAD(>link);
+
+   spin_lock_init(>stale.lock);
+   INIT_LIST_HEAD(>stale.engines);
 
if (pc->vm) {
-   /* __assign_ppgtt() requires this mutex to be held */
-   mutex_lock(>mutex);
-   __assign_ppgtt(ctx, pc->vm);
-   mutex_unlock(>mutex);
+   vm = i915_vm_get(pc->vm);
} else if (HAS_FULL_PPGTT(i915)) {
struct i915_ppgtt *ppgtt;
 
@@ -1427,50 +1354,65 @@ i915_gem_create_context(struct drm_i915_private *i915,
if (IS_ERR(ppgtt)) {
drm_dbg(>drm, "PPGTT setup failed (%ld)\n",
PTR_ERR(ppgtt));
-   context_close(ctx);
-   return ERR_CAST(ppgtt);
+   err = PTR_ERR(ppgtt);
+   goto err_ctx;
   

[PATCH 28/31] i915/gem/selftests: Assign the VM at context creation in igt_shared_ctx_exec

2021-06-09 Thread Jason Ekstrand
We want to delete __assign_ppgtt and, generally, stop setting the VM
after context creation.  This is the one place I could find in the
selftests where we set a VM after the fact.

Signed-off-by: Jason Ekstrand 
Reviewed-by: Daniel Vetter 
---
 drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c | 6 +-
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
index 3e59746afdc82..8eb5050f8cb3e 100644
--- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
@@ -813,16 +813,12 @@ static int igt_shared_ctx_exec(void *arg)
struct i915_gem_context *ctx;
struct intel_context *ce;
 
-   ctx = kernel_context(i915, NULL);
+   ctx = kernel_context(i915, ctx_vm(parent));
if (IS_ERR(ctx)) {
err = PTR_ERR(ctx);
goto out_test;
}
 
-   mutex_lock(>mutex);
-   __assign_ppgtt(ctx, ctx_vm(parent));
-   mutex_unlock(>mutex);
-
ce = i915_gem_context_get_engine(ctx, 
engine->legacy_idx);
GEM_BUG_ON(IS_ERR(ce));
 
-- 
2.31.1



[PATCH 22/31] drm/i915/gem: Return an error ptr from context_lookup

2021-06-09 Thread Jason Ekstrand
We're about to start doing lazy context creation which means contexts
get created in i915_gem_context_lookup and we may start having more
errors than -ENOENT.

Signed-off-by: Jason Ekstrand 
Reviewed-by: Daniel Vetter 
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c| 12 ++--
 drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c |  4 ++--
 drivers/gpu/drm/i915/i915_drv.h|  2 +-
 drivers/gpu/drm/i915/i915_perf.c   |  4 ++--
 4 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 4972b8c91d942..7045e3afa7113 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -2636,8 +2636,8 @@ int i915_gem_context_getparam_ioctl(struct drm_device 
*dev, void *data,
int ret = 0;
 
ctx = i915_gem_context_lookup(file_priv, args->ctx_id);
-   if (!ctx)
-   return -ENOENT;
+   if (IS_ERR(ctx))
+   return PTR_ERR(ctx);
 
switch (args->param) {
case I915_CONTEXT_PARAM_GTT_SIZE:
@@ -2705,8 +2705,8 @@ int i915_gem_context_setparam_ioctl(struct drm_device 
*dev, void *data,
int ret;
 
ctx = i915_gem_context_lookup(file_priv, args->ctx_id);
-   if (!ctx)
-   return -ENOENT;
+   if (IS_ERR(ctx))
+   return PTR_ERR(ctx);
 
ret = ctx_setparam(file_priv, ctx, args);
 
@@ -2725,8 +2725,8 @@ int i915_gem_context_reset_stats_ioctl(struct drm_device 
*dev,
return -EINVAL;
 
ctx = i915_gem_context_lookup(file->driver_priv, args->ctx_id);
-   if (!ctx)
-   return -ENOENT;
+   if (IS_ERR(ctx))
+   return PTR_ERR(ctx);
 
/*
 * We opt for unserialised reads here. This may result in tearing
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 720487ad6a5a4..4b4d3de61a157 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -739,8 +739,8 @@ static int eb_select_context(struct i915_execbuffer *eb)
struct i915_gem_context *ctx;
 
ctx = i915_gem_context_lookup(eb->file->driver_priv, eb->args->rsvd1);
-   if (unlikely(!ctx))
-   return -ENOENT;
+   if (unlikely(IS_ERR(ctx)))
+   return PTR_ERR(ctx);
 
eb->gem_context = ctx;
if (rcu_access_pointer(ctx->vm))
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index b191946229746..6aa91b795784c 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1861,7 +1861,7 @@ i915_gem_context_lookup(struct drm_i915_file_private 
*file_priv, u32 id)
ctx = NULL;
rcu_read_unlock();
 
-   return ctx;
+   return ctx ? ctx : ERR_PTR(-ENOENT);
 }
 
 static inline struct i915_address_space *
diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index 9f94914958c39..b4ec114a4698b 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -3414,10 +3414,10 @@ i915_perf_open_ioctl_locked(struct i915_perf *perf,
struct drm_i915_file_private *file_priv = file->driver_priv;
 
specific_ctx = i915_gem_context_lookup(file_priv, ctx_handle);
-   if (!specific_ctx) {
+   if (IS_ERR(specific_ctx)) {
DRM_DEBUG("Failed to look up context with ID %u for 
opening perf stream\n",
  ctx_handle);
-   ret = -ENOENT;
+   ret = PTR_ERR(specific_ctx);
goto err;
}
}
-- 
2.31.1



[PATCH 27/31] drm/i915/selftests: Take a VM in kernel_context()

2021-06-09 Thread Jason Ekstrand
This better models where we want to go with contexts in general where
things like the VM and engine set are create parameters instead of being
set after the fact.

Signed-off-by: Jason Ekstrand 
Reviewed-by: Daniel Vetter 
---
 .../drm/i915/gem/selftests/i915_gem_context.c |  4 ++--
 .../gpu/drm/i915/gem/selftests/mock_context.c |  9 -
 .../gpu/drm/i915/gem/selftests/mock_context.h |  4 +++-
 drivers/gpu/drm/i915/gt/selftest_execlists.c  | 20 +--
 drivers/gpu/drm/i915/gt/selftest_hangcheck.c  |  2 +-
 5 files changed, 24 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
index 92544a174cc9a..3e59746afdc82 100644
--- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
@@ -680,7 +680,7 @@ static int igt_ctx_exec(void *arg)
struct i915_gem_context *ctx;
struct intel_context *ce;
 
-   ctx = kernel_context(i915);
+   ctx = kernel_context(i915, NULL);
if (IS_ERR(ctx)) {
err = PTR_ERR(ctx);
goto out_file;
@@ -813,7 +813,7 @@ static int igt_shared_ctx_exec(void *arg)
struct i915_gem_context *ctx;
struct intel_context *ce;
 
-   ctx = kernel_context(i915);
+   ctx = kernel_context(i915, NULL);
if (IS_ERR(ctx)) {
err = PTR_ERR(ctx);
goto out_test;
diff --git a/drivers/gpu/drm/i915/gem/selftests/mock_context.c 
b/drivers/gpu/drm/i915/gem/selftests/mock_context.c
index 61aaac4a334cf..500ef27ba4771 100644
--- a/drivers/gpu/drm/i915/gem/selftests/mock_context.c
+++ b/drivers/gpu/drm/i915/gem/selftests/mock_context.c
@@ -150,7 +150,8 @@ live_context_for_engine(struct intel_engine_cs *engine, 
struct file *file)
 }
 
 struct i915_gem_context *
-kernel_context(struct drm_i915_private *i915)
+kernel_context(struct drm_i915_private *i915,
+  struct i915_address_space *vm)
 {
struct i915_gem_context *ctx;
struct i915_gem_proto_context *pc;
@@ -159,6 +160,12 @@ kernel_context(struct drm_i915_private *i915)
if (IS_ERR(pc))
return ERR_CAST(pc);
 
+   if (vm) {
+   if (pc->vm)
+   i915_vm_put(pc->vm);
+   pc->vm = i915_vm_get(vm);
+   }
+
ctx = i915_gem_create_context(i915, pc);
proto_context_close(pc);
if (IS_ERR(ctx))
diff --git a/drivers/gpu/drm/i915/gem/selftests/mock_context.h 
b/drivers/gpu/drm/i915/gem/selftests/mock_context.h
index 2a6121d33352d..7a02fd9b5866a 100644
--- a/drivers/gpu/drm/i915/gem/selftests/mock_context.h
+++ b/drivers/gpu/drm/i915/gem/selftests/mock_context.h
@@ -10,6 +10,7 @@
 struct file;
 struct drm_i915_private;
 struct intel_engine_cs;
+struct i915_address_space;
 
 void mock_init_contexts(struct drm_i915_private *i915);
 
@@ -25,7 +26,8 @@ live_context(struct drm_i915_private *i915, struct file 
*file);
 struct i915_gem_context *
 live_context_for_engine(struct intel_engine_cs *engine, struct file *file);
 
-struct i915_gem_context *kernel_context(struct drm_i915_private *i915);
+struct i915_gem_context *kernel_context(struct drm_i915_private *i915,
+   struct i915_address_space *vm);
 void kernel_context_close(struct i915_gem_context *ctx);
 
 #endif /* !__MOCK_CONTEXT_H */
diff --git a/drivers/gpu/drm/i915/gt/selftest_execlists.c 
b/drivers/gpu/drm/i915/gt/selftest_execlists.c
index 780939005554f..5eedb9b2e08f3 100644
--- a/drivers/gpu/drm/i915/gt/selftest_execlists.c
+++ b/drivers/gpu/drm/i915/gt/selftest_execlists.c
@@ -1522,12 +1522,12 @@ static int live_busywait_preempt(void *arg)
 * preempt the busywaits used to synchronise between rings.
 */
 
-   ctx_hi = kernel_context(gt->i915);
+   ctx_hi = kernel_context(gt->i915, NULL);
if (!ctx_hi)
return -ENOMEM;
ctx_hi->sched.priority = I915_CONTEXT_MAX_USER_PRIORITY;
 
-   ctx_lo = kernel_context(gt->i915);
+   ctx_lo = kernel_context(gt->i915, NULL);
if (!ctx_lo)
goto err_ctx_hi;
ctx_lo->sched.priority = I915_CONTEXT_MIN_USER_PRIORITY;
@@ -1724,12 +1724,12 @@ static int live_preempt(void *arg)
if (igt_spinner_init(_lo, gt))
goto err_spin_hi;
 
-   ctx_hi = kernel_context(gt->i915);
+   ctx_hi = kernel_context(gt->i915, NULL);
if (!ctx_hi)
goto err_spin_lo;
ctx_hi->sched.priority = I915_CONTEXT_MAX_USER_PRIORITY;
 
-   ctx_lo = kernel_context(gt->i915);
+   ctx_lo = kernel_context(gt->i915, NULL);
if (!ctx_lo)
goto err_ctx_hi;
ctx_lo->sched.priority = 

[PATCH 20/31] drm/i915/gem: Make an alignment check more sensible

2021-06-09 Thread Jason Ekstrand
What we really want to check is that size of the engines array, i.e.
args->size - sizeof(*user) is divisible by the element size, i.e.
sizeof(*user->engines) because that's what's required for computing the
array length right below the check.  However, we're currently not doing
this and instead doing a compile-time check that sizeof(*user) is
divisible by sizeof(*user->engines) and avoiding the subtraction.  As
far as I can tell, the only reason for the more confusing pair of checks
is to avoid a single subtraction of a constant.

The other thing the BUILD_BUG_ON might be trying to implicitly check is
that offsetof(user->engines) == sizeof(*user) and we don't have any
weird padding throwing us off.  However, that's not the check it's doing
and it's not even a reliable way to do that check.

Signed-off-by: Jason Ekstrand 
Reviewed-by: Daniel Vetter 
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 8e7c0e3f070ed..c9bae1a1726e1 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -1764,9 +1764,8 @@ set_engines(struct i915_gem_context *ctx,
goto replace;
}
 
-   BUILD_BUG_ON(!IS_ALIGNED(sizeof(*user), sizeof(*user->engines)));
if (args->size < sizeof(*user) ||
-   !IS_ALIGNED(args->size, sizeof(*user->engines))) {
+   !IS_ALIGNED(args->size -  sizeof(*user), sizeof(*user->engines))) {
drm_dbg(>drm, "Invalid size for engine array: %d\n",
args->size);
return -EINVAL;
-- 
2.31.1



[PATCH 25/31] drm/i915/gem: Don't allow changing the VM on running contexts (v4)

2021-06-09 Thread Jason Ekstrand
When the APIs were added to manage VMs more directly from userspace, the
questionable choice was made to allow changing out the VM on a context
at any time.  This is horribly racy and there's absolutely no reason why
any userspace would want to do this outside of testing that exact race.
By removing support for CONTEXT_PARAM_VM from ctx_setparam, we make it
impossible to change out the VM after the context has been fully
created.  This lets us delete a bunch of deferred task code as well as a
duplicated (and slightly different) copy of the code which programs the
PPGTT registers.

v2 (Jason Ekstrand):
 - Expand the commit message

v3 (Daniel Vetter):
 - Don't drop the __rcu on the vm pointer

v4 (Jason Ekstrand):
 - Make it more obvious that I915_CONTEXT_PARAM_VM returns -EINVAL

Signed-off-by: Jason Ekstrand 
Reviewed-by: Daniel Vetter 
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c   | 263 +-
 .../drm/i915/gem/selftests/i915_gem_context.c | 119 
 .../drm/i915/selftests/i915_mock_selftests.h  |   1 -
 3 files changed, 1 insertion(+), 382 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index c4f89e4b1665f..40acecfbbe5b5 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -1633,120 +1633,6 @@ int i915_gem_vm_destroy_ioctl(struct drm_device *dev, 
void *data,
return 0;
 }
 
-struct context_barrier_task {
-   struct i915_active base;
-   void (*task)(void *data);
-   void *data;
-};
-
-static void cb_retire(struct i915_active *base)
-{
-   struct context_barrier_task *cb = container_of(base, typeof(*cb), base);
-
-   if (cb->task)
-   cb->task(cb->data);
-
-   i915_active_fini(>base);
-   kfree(cb);
-}
-
-I915_SELFTEST_DECLARE(static intel_engine_mask_t context_barrier_inject_fault);
-static int context_barrier_task(struct i915_gem_context *ctx,
-   intel_engine_mask_t engines,
-   bool (*skip)(struct intel_context *ce, void 
*data),
-   int (*pin)(struct intel_context *ce, struct 
i915_gem_ww_ctx *ww, void *data),
-   int (*emit)(struct i915_request *rq, void 
*data),
-   void (*task)(void *data),
-   void *data)
-{
-   struct context_barrier_task *cb;
-   struct i915_gem_engines_iter it;
-   struct i915_gem_engines *e;
-   struct i915_gem_ww_ctx ww;
-   struct intel_context *ce;
-   int err = 0;
-
-   GEM_BUG_ON(!task);
-
-   cb = kmalloc(sizeof(*cb), GFP_KERNEL);
-   if (!cb)
-   return -ENOMEM;
-
-   i915_active_init(>base, NULL, cb_retire, 0);
-   err = i915_active_acquire(>base);
-   if (err) {
-   kfree(cb);
-   return err;
-   }
-
-   e = __context_engines_await(ctx, NULL);
-   if (!e) {
-   i915_active_release(>base);
-   return -ENOENT;
-   }
-
-   for_each_gem_engine(ce, e, it) {
-   struct i915_request *rq;
-
-   if (I915_SELFTEST_ONLY(context_barrier_inject_fault &
-  ce->engine->mask)) {
-   err = -ENXIO;
-   break;
-   }
-
-   if (!(ce->engine->mask & engines))
-   continue;
-
-   if (skip && skip(ce, data))
-   continue;
-
-   i915_gem_ww_ctx_init(, true);
-retry:
-   err = intel_context_pin_ww(ce, );
-   if (err)
-   goto err;
-
-   if (pin)
-   err = pin(ce, , data);
-   if (err)
-   goto err_unpin;
-
-   rq = i915_request_create(ce);
-   if (IS_ERR(rq)) {
-   err = PTR_ERR(rq);
-   goto err_unpin;
-   }
-
-   err = 0;
-   if (emit)
-   err = emit(rq, data);
-   if (err == 0)
-   err = i915_active_add_request(>base, rq);
-
-   i915_request_add(rq);
-err_unpin:
-   intel_context_unpin(ce);
-err:
-   if (err == -EDEADLK) {
-   err = i915_gem_ww_ctx_backoff();
-   if (!err)
-   goto retry;
-   }
-   i915_gem_ww_ctx_fini();
-
-   if (err)
-   break;
-   }
-   i915_sw_fence_complete(>fence);
-
-   cb->task = err ? NULL : task; /* caller needs to unwind instead */
-   cb->data = data;
-
-   i915_active_release(>base);
-
-   return err;
-}
-
 static int get_ppgtt(struct drm_i915_file_private *file_priv,
 struct i915_gem_context *ctx,
 struct 

[PATCH 26/31] drm/i915/gem: Don't allow changing the engine set on running contexts (v3)

2021-06-09 Thread Jason Ekstrand
When the APIs were added to manage the engine set on a GEM context
directly from userspace, the questionable choice was made to allow
changing the engine set on a context at any time.  This is horribly racy
and there's absolutely no reason why any userspace would want to do this
outside of trying to exercise interesting race conditions.  By removing
support for CONTEXT_PARAM_ENGINES from ctx_setparam, we make it
impossible to change the engine set after the context has been fully
created.

This doesn't yet let us delete all the deferred engine clean-up code as
that's still used for handling the case where the client dies or calls
GEM_CONTEXT_DESTROY while work is in flight.  However, moving to an API
where the engine set is effectively immutable gives us more options to
potentially clean that code up a bit going forward.  It also removes a
whole class of ways in which a client can hurt itself or try to get
around kernel context banning.

v2 (Jason Ekstrand):
 - Expand the commit message

v3 (Jason Ekstrand):
 - Make it more obvious that I915_CONTEXT_PARAM_ENGINES returns -EINVAL

Signed-off-by: Jason Ekstrand 
Reviewed-by: Daniel Vetter 
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c | 304 +---
 1 file changed, 1 insertion(+), 303 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 40acecfbbe5b5..5f5375b15c530 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -1819,305 +1819,6 @@ static int set_sseu(struct i915_gem_context *ctx,
return ret;
 }
 
-struct set_engines {
-   struct i915_gem_context *ctx;
-   struct i915_gem_engines *engines;
-};
-
-static int
-set_engines__load_balance(struct i915_user_extension __user *base, void *data)
-{
-   struct i915_context_engines_load_balance __user *ext =
-   container_of_user(base, typeof(*ext), base);
-   const struct set_engines *set = data;
-   struct drm_i915_private *i915 = set->ctx->i915;
-   struct intel_engine_cs *stack[16];
-   struct intel_engine_cs **siblings;
-   struct intel_context *ce;
-   struct intel_sseu null_sseu = {};
-   u16 num_siblings, idx;
-   unsigned int n;
-   int err;
-
-   if (!HAS_EXECLISTS(i915))
-   return -ENODEV;
-
-   if (intel_uc_uses_guc_submission(>gt.uc))
-   return -ENODEV; /* not implement yet */
-
-   if (get_user(idx, >engine_index))
-   return -EFAULT;
-
-   if (idx >= set->engines->num_engines) {
-   drm_dbg(>drm, "Invalid placement value, %d >= %d\n",
-   idx, set->engines->num_engines);
-   return -EINVAL;
-   }
-
-   idx = array_index_nospec(idx, set->engines->num_engines);
-   if (set->engines->engines[idx]) {
-   drm_dbg(>drm,
-   "Invalid placement[%d], already occupied\n", idx);
-   return -EEXIST;
-   }
-
-   if (get_user(num_siblings, >num_siblings))
-   return -EFAULT;
-
-   err = check_user_mbz(>flags);
-   if (err)
-   return err;
-
-   err = check_user_mbz(>mbz64);
-   if (err)
-   return err;
-
-   siblings = stack;
-   if (num_siblings > ARRAY_SIZE(stack)) {
-   siblings = kmalloc_array(num_siblings,
-sizeof(*siblings),
-GFP_KERNEL);
-   if (!siblings)
-   return -ENOMEM;
-   }
-
-   for (n = 0; n < num_siblings; n++) {
-   struct i915_engine_class_instance ci;
-
-   if (copy_from_user(, >engines[n], sizeof(ci))) {
-   err = -EFAULT;
-   goto out_siblings;
-   }
-
-   siblings[n] = intel_engine_lookup_user(i915,
-  ci.engine_class,
-  ci.engine_instance);
-   if (!siblings[n]) {
-   drm_dbg(>drm,
-   "Invalid sibling[%d]: { class:%d, inst:%d }\n",
-   n, ci.engine_class, ci.engine_instance);
-   err = -EINVAL;
-   goto out_siblings;
-   }
-   }
-
-   ce = intel_execlists_create_virtual(siblings, n);
-   if (IS_ERR(ce)) {
-   err = PTR_ERR(ce);
-   goto out_siblings;
-   }
-
-   intel_context_set_gem(ce, set->ctx, null_sseu);
-
-   if (cmpxchg(>engines->engines[idx], NULL, ce)) {
-   intel_context_put(ce);
-   err = -EEXIST;
-   goto out_siblings;
-   }
-
-out_siblings:
-   if (siblings != stack)
-   kfree(siblings);
-
-   return err;
-}
-
-static int
-set_engines__bond(struct i915_user_extension __user *base, void 

[PATCH 24/31] drm/i915/gem: Delay context creation (v3)

2021-06-09 Thread Jason Ekstrand
The current context uAPI allows for two methods of setting context
parameters: SET_CONTEXT_PARAM and CONTEXT_CREATE_EXT_SETPARAM.  The
former is allowed to be called at any time while the later happens as
part of GEM_CONTEXT_CREATE.  Currently, everything settable via one is
settable via the other.  While some params are fairly simple and setting
them on a live context is harmless such as the context priority, others
are far trickier such as the VM or the set of engines.  In order to swap
out the VM, for instance, we have to delay until all current in-flight
work is complete, swap in the new VM, and then continue.  This leads to
a plethora of potential race conditions we'd really rather avoid.

In previous patches, we added a i915_gem_proto_context struct which is
capable of storing and tracking all such create parameters.  This commit
delays the creation of the actual context until after the client is done
configuring it with SET_CONTEXT_PARAM.  From the perspective of the
client, it has the same u32 context ID the whole time.  From the
perspective of i915, however, it's an i915_gem_proto_context right up
until the point where we attempt to do something which the proto-context
can't handle.  Then the real context gets created.

This is accomplished via a little xarray dance.  When GEM_CONTEXT_CREATE
is called, we create a proto-context, reserve a slot in context_xa but
leave it NULL, and store the proto-context in the corresponding slot in
proto_context_xa.  Then, whenever we go to look up a context, we first
check context_xa.  If it's there, we return the i915_gem_context and
we're done.  If it's not, we look in proto_context_xa and, if we find it
there, we create the actual context and kill the proto-context.

In order for this dance to work properly, everything which ever touches
a proto-context is guarded by drm_i915_file_private::proto_context_lock,
including context creation.  Yes, this means context creation now takes
a giant global lock but it can't really be helped and that should never
be on any driver's fast-path anyway.

v2 (Daniel Vetter):
 - Commit message grammatical fixes.
 - Use WARN_ON instead of GEM_BUG_ON
 - Rename lazy_create_context_locked to finalize_create_context_locked
 - Rework the control-flow logic in the setparam ioctl
 - Better documentation all around

v3 (kernel test robot):
 - Make finalize_create_context_locked static

Signed-off-by: Jason Ekstrand 
Reviewed-by: Daniel Vetter 
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c   | 203 ++
 drivers/gpu/drm/i915/gem/i915_gem_context.h   |   3 +
 .../gpu/drm/i915/gem/i915_gem_context_types.h |  54 +
 .../gpu/drm/i915/gem/selftests/mock_context.c |   5 +-
 drivers/gpu/drm/i915/i915_drv.h   |  76 +--
 5 files changed, 283 insertions(+), 58 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 5a1402544d48d..c4f89e4b1665f 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -278,6 +278,42 @@ proto_context_create(struct drm_i915_private *i915, 
unsigned int flags)
return err;
 }
 
+static int proto_context_register_locked(struct drm_i915_file_private *fpriv,
+struct i915_gem_proto_context *pc,
+u32 *id)
+{
+   int ret;
+   void *old;
+
+   lockdep_assert_held(>proto_context_lock);
+
+   ret = xa_alloc(>context_xa, id, NULL, xa_limit_32b, GFP_KERNEL);
+   if (ret)
+   return ret;
+
+   old = xa_store(>proto_context_xa, *id, pc, GFP_KERNEL);
+   if (xa_is_err(old)) {
+   xa_erase(>context_xa, *id);
+   return xa_err(old);
+   }
+   WARN_ON(old);
+
+   return 0;
+}
+
+static int proto_context_register(struct drm_i915_file_private *fpriv,
+ struct i915_gem_proto_context *pc,
+ u32 *id)
+{
+   int ret;
+
+   mutex_lock(>proto_context_lock);
+   ret = proto_context_register_locked(fpriv, pc, id);
+   mutex_unlock(>proto_context_lock);
+
+   return ret;
+}
+
 static int set_proto_ctx_vm(struct drm_i915_file_private *fpriv,
struct i915_gem_proto_context *pc,
const struct drm_i915_gem_context_param *args)
@@ -1448,12 +1484,12 @@ void i915_gem_init__contexts(struct drm_i915_private 
*i915)
init_contexts(>gem.contexts);
 }
 
-static int gem_context_register(struct i915_gem_context *ctx,
-   struct drm_i915_file_private *fpriv,
-   u32 *id)
+static void gem_context_register(struct i915_gem_context *ctx,
+struct drm_i915_file_private *fpriv,
+u32 id)
 {
struct drm_i915_private *i915 = ctx->i915;
-   int ret;
+   void *old;
 
ctx->file_priv = 

[PATCH 23/31] drm/i915/gt: Drop i915_address_space::file (v2)

2021-06-09 Thread Jason Ekstrand
There's a big comment saying how useful it is but no one is using this
for anything anymore.

It was added in 2bfa996e031b ("drm/i915: Store owning file on the
i915_address_space") and used for debugfs at the time as well as telling
the difference between the global GTT and a PPGTT.  In f6e8aa387171
("drm/i915: Report the number of closed vma held by each context in
debugfs") we removed one use of it by switching to a context walk and
comparing with the VM in the context.  Finally, VM stats for debugfs
were entirely nuked in db80a1294c23 ("drm/i915/gem: Remove per-client
stats from debugfs/i915_gem_objects")

v2 (Daniel Vetter):
 - Delete a struct drm_i915_file_private pre-declaration
 - Add a comment to the commit message about history

Signed-off-by: Jason Ekstrand 
Reviewed-by: Daniel Vetter 
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c |  9 -
 drivers/gpu/drm/i915/gt/intel_gtt.h | 11 ---
 drivers/gpu/drm/i915/selftests/mock_gtt.c   |  1 -
 3 files changed, 21 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 7045e3afa7113..5a1402544d48d 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -1453,17 +1453,10 @@ static int gem_context_register(struct i915_gem_context 
*ctx,
u32 *id)
 {
struct drm_i915_private *i915 = ctx->i915;
-   struct i915_address_space *vm;
int ret;
 
ctx->file_priv = fpriv;
 
-   mutex_lock(>mutex);
-   vm = i915_gem_context_vm(ctx);
-   if (vm)
-   WRITE_ONCE(vm->file, fpriv); /* XXX */
-   mutex_unlock(>mutex);
-
ctx->pid = get_task_pid(current, PIDTYPE_PID);
snprintf(ctx->name, sizeof(ctx->name), "%s[%d]",
 current->comm, pid_nr(ctx->pid));
@@ -1562,8 +1555,6 @@ int i915_gem_vm_create_ioctl(struct drm_device *dev, void 
*data,
if (IS_ERR(ppgtt))
return PTR_ERR(ppgtt);
 
-   ppgtt->vm.file = file_priv;
-
if (args->extensions) {
err = i915_user_extensions(u64_to_user_ptr(args->extensions),
   NULL, 0,
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.h 
b/drivers/gpu/drm/i915/gt/intel_gtt.h
index edea95b97c36e..474eae483ab0e 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.h
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.h
@@ -140,7 +140,6 @@ typedef u64 gen8_pte_t;
 
 enum i915_cache_level;
 
-struct drm_i915_file_private;
 struct drm_i915_gem_object;
 struct i915_fence_reg;
 struct i915_vma;
@@ -220,16 +219,6 @@ struct i915_address_space {
struct intel_gt *gt;
struct drm_i915_private *i915;
struct device *dma;
-   /*
-* Every address space belongs to a struct file - except for the global
-* GTT that is owned by the driver (and so @file is set to NULL). In
-* principle, no information should leak from one context to another
-* (or between files/processes etc) unless explicitly shared by the
-* owner. Tracking the owner is important in order to free up per-file
-* objects along with the file, to aide resource tracking, and to
-* assign blame.
-*/
-   struct drm_i915_file_private *file;
u64 total;  /* size addr space maps (ex. 2GB for ggtt) */
u64 reserved;   /* size addr space reserved */
 
diff --git a/drivers/gpu/drm/i915/selftests/mock_gtt.c 
b/drivers/gpu/drm/i915/selftests/mock_gtt.c
index 5c7ae40bba634..cc047ec594f93 100644
--- a/drivers/gpu/drm/i915/selftests/mock_gtt.c
+++ b/drivers/gpu/drm/i915/selftests/mock_gtt.c
@@ -73,7 +73,6 @@ struct i915_ppgtt *mock_ppgtt(struct drm_i915_private *i915, 
const char *name)
ppgtt->vm.gt = >gt;
ppgtt->vm.i915 = i915;
ppgtt->vm.total = round_down(U64_MAX, PAGE_SIZE);
-   ppgtt->vm.file = ERR_PTR(-ENODEV);
ppgtt->vm.dma = i915->drm.dev;
 
i915_address_space_init(>vm, VM_CLASS_PPGTT);
-- 
2.31.1



[PATCH 17/31] drm/i915/gem: Rework error handling in default_engines

2021-06-09 Thread Jason Ekstrand
Since free_engines works for partially constructed engine sets, we can
use the usual goto pattern.

Signed-off-by: Jason Ekstrand 
Reviewed-by: Daniel Vetter 
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c | 13 -
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index f44faad296249..93579fa0324d1 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -407,7 +407,7 @@ static struct i915_gem_engines *default_engines(struct 
i915_gem_context *ctx)
 {
const struct intel_gt *gt = >i915->gt;
struct intel_engine_cs *engine;
-   struct i915_gem_engines *e;
+   struct i915_gem_engines *e, *err;
enum intel_engine_id id;
 
e = alloc_engines(I915_NUM_ENGINES);
@@ -425,18 +425,21 @@ static struct i915_gem_engines *default_engines(struct 
i915_gem_context *ctx)
 
ce = intel_context_create(engine);
if (IS_ERR(ce)) {
-   __free_engines(e, e->num_engines + 1);
-   return ERR_CAST(ce);
+   err = ERR_CAST(ce);
+   goto free_engines;
}
 
intel_context_set_gem(ce, ctx);
 
e->engines[engine->legacy_idx] = ce;
-   e->num_engines = max(e->num_engines, engine->legacy_idx);
+   e->num_engines = max(e->num_engines, engine->legacy_idx + 1);
}
-   e->num_engines++;
 
return e;
+
+free_engines:
+   free_engines(e);
+   return err;
 }
 
 void i915_gem_context_release(struct kref *ref)
-- 
2.31.1



[PATCH 21/31] drm/i915/gem: Use the proto-context to handle create parameters (v4)

2021-06-09 Thread Jason Ekstrand
This means that the proto-context needs to grow support for engine
configuration information as well as setparam logic.  Fortunately, we'll
be deleting a lot of setparam logic on the primary context shortly so it
will hopefully balance out.

There's an extra bit of fun here when it comes to setting SSEU and the
way it interacts with PARAM_ENGINES.  Unfortunately, thanks to
SET_CONTEXT_PARAM and not being allowed to pick the order in which we
handle certain parameters, we have to think about those interactions.

v2 (Daniel Vetter):
 - Add a proto_context_free_user_engines helper
 - Comment on SSEU in the commit message
 - Use proto_context_set_persistence in set_proto_ctx_param

v3 (Daniel Vetter):
 - Fix a doc comment
 - Do an explicit HAS_FULL_PPGTT check in set_proto_ctx_vm instead of
   relying on pc->vm != NULL.
 - Handle errors for CONTEXT_PARAM_PERSISTENCE
 - Don't allow more resetting user engines
 - Rework initialization of UCONTEXT_PERSISTENCE

v4 (Jason Ekstrand):
 - Move hand-rolled initialization of UCONTEXT_PERSISTENCE to an
   earlier patch

Signed-off-by: Jason Ekstrand 
Reviewed-by: Daniel Vetter 
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c   | 536 +-
 .../gpu/drm/i915/gem/i915_gem_context_types.h |  58 ++
 2 files changed, 577 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index c9bae1a1726e1..4972b8c91d942 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -193,8 +193,15 @@ static int validate_priority(struct drm_i915_private *i915,
 
 static void proto_context_close(struct i915_gem_proto_context *pc)
 {
+   int i;
+
if (pc->vm)
i915_vm_put(pc->vm);
+   if (pc->user_engines) {
+   for (i = 0; i < pc->num_user_engines; i++)
+   kfree(pc->user_engines[i].siblings);
+   kfree(pc->user_engines);
+   }
kfree(pc);
 }
 
@@ -248,6 +255,8 @@ proto_context_create(struct drm_i915_private *i915, 
unsigned int flags)
if (!pc)
return ERR_PTR(-ENOMEM);
 
+   pc->num_user_engines = -1;
+   pc->user_engines = NULL;
pc->user_flags = BIT(UCONTEXT_BANNABLE) |
 BIT(UCONTEXT_RECOVERABLE);
if (i915->params.enable_hangcheck)
@@ -269,6 +278,430 @@ proto_context_create(struct drm_i915_private *i915, 
unsigned int flags)
return err;
 }
 
+static int set_proto_ctx_vm(struct drm_i915_file_private *fpriv,
+   struct i915_gem_proto_context *pc,
+   const struct drm_i915_gem_context_param *args)
+{
+   struct drm_i915_private *i915 = fpriv->dev_priv;
+   struct i915_address_space *vm;
+
+   if (args->size)
+   return -EINVAL;
+
+   if (!HAS_FULL_PPGTT(i915))
+   return -ENODEV;
+
+   if (upper_32_bits(args->value))
+   return -ENOENT;
+
+   vm = i915_gem_vm_lookup(fpriv, args->value);
+   if (!vm)
+   return -ENOENT;
+
+   if (pc->vm)
+   i915_vm_put(pc->vm);
+   pc->vm = vm;
+
+   return 0;
+}
+
+struct set_proto_ctx_engines {
+   struct drm_i915_private *i915;
+   unsigned num_engines;
+   struct i915_gem_proto_engine *engines;
+};
+
+static int
+set_proto_ctx_engines_balance(struct i915_user_extension __user *base,
+ void *data)
+{
+   struct i915_context_engines_load_balance __user *ext =
+   container_of_user(base, typeof(*ext), base);
+   const struct set_proto_ctx_engines *set = data;
+   struct drm_i915_private *i915 = set->i915;
+   struct intel_engine_cs **siblings;
+   u16 num_siblings, idx;
+   unsigned int n;
+   int err;
+
+   if (!HAS_EXECLISTS(i915))
+   return -ENODEV;
+
+   if (intel_uc_uses_guc_submission(>gt.uc))
+   return -ENODEV; /* not implement yet */
+
+   if (get_user(idx, >engine_index))
+   return -EFAULT;
+
+   if (idx >= set->num_engines) {
+   drm_dbg(>drm, "Invalid placement value, %d >= %d\n",
+   idx, set->num_engines);
+   return -EINVAL;
+   }
+
+   idx = array_index_nospec(idx, set->num_engines);
+   if (set->engines[idx].type != I915_GEM_ENGINE_TYPE_INVALID) {
+   drm_dbg(>drm,
+   "Invalid placement[%d], already occupied\n", idx);
+   return -EEXIST;
+   }
+
+   if (get_user(num_siblings, >num_siblings))
+   return -EFAULT;
+
+   err = check_user_mbz(>flags);
+   if (err)
+   return err;
+
+   err = check_user_mbz(>mbz64);
+   if (err)
+   return err;
+
+   if (num_siblings == 0)
+   return 0;
+
+   siblings = kmalloc_array(num_siblings, sizeof(*siblings), GFP_KERNEL);
+   if (!siblings)
+ 

[PATCH 18/31] drm/i915/gem: Optionally set SSEU in intel_context_set_gem

2021-06-09 Thread Jason Ekstrand
For now this is a no-op because everyone passes in a null SSEU but it
lets us get some of the error handling and selftest refactoring plumbed
through.

Signed-off-by: Jason Ekstrand 
Reviewed-by: Daniel Vetter 
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c   | 41 +++
 .../gpu/drm/i915/gem/selftests/mock_context.c |  6 ++-
 2 files changed, 36 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 93579fa0324d1..e62482477c771 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -307,9 +307,12 @@ context_get_vm_rcu(struct i915_gem_context *ctx)
} while (1);
 }
 
-static void intel_context_set_gem(struct intel_context *ce,
- struct i915_gem_context *ctx)
+static int intel_context_set_gem(struct intel_context *ce,
+struct i915_gem_context *ctx,
+struct intel_sseu sseu)
 {
+   int ret = 0;
+
GEM_BUG_ON(rcu_access_pointer(ce->gem_context));
RCU_INIT_POINTER(ce->gem_context, ctx);
 
@@ -336,6 +339,12 @@ static void intel_context_set_gem(struct intel_context *ce,
 
intel_context_set_watchdog_us(ce, (u64)timeout_ms * 1000);
}
+
+   /* A valid SSEU has no zero fields */
+   if (sseu.slice_mask && !WARN_ON(ce->engine->class != RENDER_CLASS))
+   ret = intel_context_reconfigure_sseu(ce, sseu);
+
+   return ret;
 }
 
 static void __free_engines(struct i915_gem_engines *e, unsigned int count)
@@ -403,7 +412,8 @@ static struct i915_gem_engines *alloc_engines(unsigned int 
count)
return e;
 }
 
-static struct i915_gem_engines *default_engines(struct i915_gem_context *ctx)
+static struct i915_gem_engines *default_engines(struct i915_gem_context *ctx,
+   struct intel_sseu rcs_sseu)
 {
const struct intel_gt *gt = >i915->gt;
struct intel_engine_cs *engine;
@@ -416,6 +426,8 @@ static struct i915_gem_engines *default_engines(struct 
i915_gem_context *ctx)
 
for_each_engine(engine, gt, id) {
struct intel_context *ce;
+   struct intel_sseu sseu = {};
+   int ret;
 
if (engine->legacy_idx == INVALID_ENGINE)
continue;
@@ -429,10 +441,18 @@ static struct i915_gem_engines *default_engines(struct 
i915_gem_context *ctx)
goto free_engines;
}
 
-   intel_context_set_gem(ce, ctx);
-
e->engines[engine->legacy_idx] = ce;
e->num_engines = max(e->num_engines, engine->legacy_idx + 1);
+
+   if (engine->class == RENDER_CLASS)
+   sseu = rcs_sseu;
+
+   ret = intel_context_set_gem(ce, ctx, sseu);
+   if (ret) {
+   err = ERR_PTR(ret);
+   goto free_engines;
+   }
+
}
 
return e;
@@ -746,6 +766,7 @@ __create_context(struct drm_i915_private *i915,
 {
struct i915_gem_context *ctx;
struct i915_gem_engines *e;
+   struct intel_sseu null_sseu = {};
int err;
int i;
 
@@ -763,7 +784,7 @@ __create_context(struct drm_i915_private *i915,
INIT_LIST_HEAD(>stale.engines);
 
mutex_init(>engines_mutex);
-   e = default_engines(ctx);
+   e = default_engines(ctx, null_sseu);
if (IS_ERR(e)) {
err = PTR_ERR(e);
goto err_free;
@@ -1549,6 +1570,7 @@ set_engines__load_balance(struct i915_user_extension 
__user *base, void *data)
struct intel_engine_cs *stack[16];
struct intel_engine_cs **siblings;
struct intel_context *ce;
+   struct intel_sseu null_sseu = {};
u16 num_siblings, idx;
unsigned int n;
int err;
@@ -1621,7 +1643,7 @@ set_engines__load_balance(struct i915_user_extension 
__user *base, void *data)
goto out_siblings;
}
 
-   intel_context_set_gem(ce, set->ctx);
+   intel_context_set_gem(ce, set->ctx, null_sseu);
 
if (cmpxchg(>engines->engines[idx], NULL, ce)) {
intel_context_put(ce);
@@ -1729,6 +1751,7 @@ set_engines(struct i915_gem_context *ctx,
struct drm_i915_private *i915 = ctx->i915;
struct i915_context_param_engines __user *user =
u64_to_user_ptr(args->value);
+   struct intel_sseu null_sseu = {};
struct set_engines set = { .ctx = ctx };
unsigned int num_engines, n;
u64 extensions;
@@ -1738,7 +1761,7 @@ set_engines(struct i915_gem_context *ctx,
if (!i915_gem_context_user_engines(ctx))
return 0;
 
-   set.engines = default_engines(ctx);
+   set.engines = default_engines(ctx, null_sseu);
if (IS_ERR(set.engines))
 

[PATCH 19/31] drm/i915: Add an i915_gem_vm_lookup helper

2021-06-09 Thread Jason Ekstrand
This is the VM equivalent of i915_gem_context_lookup.  It's only used
once in this patch but future patches will need to duplicate this lookup
code so it's better to have it in a helper.

Signed-off-by: Jason Ekstrand 
Reviewed-by: Daniel Vetter 
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c |  6 +-
 drivers/gpu/drm/i915/i915_drv.h | 14 ++
 2 files changed, 15 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index e62482477c771..8e7c0e3f070ed 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -1352,11 +1352,7 @@ static int set_ppgtt(struct drm_i915_file_private 
*file_priv,
if (upper_32_bits(args->value))
return -ENOENT;
 
-   rcu_read_lock();
-   vm = xa_load(_priv->vm_xa, args->value);
-   if (vm && !kref_get_unless_zero(>ref))
-   vm = NULL;
-   rcu_read_unlock();
+   vm = i915_gem_vm_lookup(file_priv, args->value);
if (!vm)
return -ENOENT;
 
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index fed14ffc52437..b191946229746 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1864,6 +1864,20 @@ i915_gem_context_lookup(struct drm_i915_file_private 
*file_priv, u32 id)
return ctx;
 }
 
+static inline struct i915_address_space *
+i915_gem_vm_lookup(struct drm_i915_file_private *file_priv, u32 id)
+{
+   struct i915_address_space *vm;
+
+   rcu_read_lock();
+   vm = xa_load(_priv->vm_xa, id);
+   if (vm && !kref_get_unless_zero(>ref))
+   vm = NULL;
+   rcu_read_unlock();
+
+   return vm;
+}
+
 /* i915_gem_evict.c */
 int __must_check i915_gem_evict_something(struct i915_address_space *vm,
  u64 min_size, u64 alignment,
-- 
2.31.1



[PATCH 16/31] drm/i915/gem: Add an intermediate proto_context struct (v5)

2021-06-09 Thread Jason Ekstrand
The current context uAPI allows for two methods of setting context
parameters: SET_CONTEXT_PARAM and CONTEXT_CREATE_EXT_SETPARAM.  The
former is allowed to be called at any time while the later happens as
part of GEM_CONTEXT_CREATE.  Currently, everything settable via one is
settable via the other.  While some params are fairly simple and setting
them on a live context is harmless such as the context priority, others are
far trickier such as the VM or the set of engines.  In order to swap out
the VM, for instance, we have to delay until all current in-flight work
is complete, swap in the new VM, and then continue.  This leads to a
plethora of potential race conditions we'd really rather avoid.

Unfortunately, both methods of setting the VM and the engine set are in
active use today so we can't simply disallow setting the VM or engine
set via SET_CONTEXT_PARAM.  In order to work around this wart, this
commit adds a proto-context struct which contains all the context create
parameters.

v2 (Daniel Vetter):
 - Better commit message
 - Use __set/clear_bit instead of set/clear_bit because there's no race
   and we don't need the atomics

v3 (Daniel Vetter):
 - Use manual bitops and BIT() instead of __set_bit

v4 (Daniel Vetter):
 - Add a changelog to the commit message
 - Better hyperlinking in docs
 - Create the default PPGTT in i915_gem_create_context

v5 (Daniel Vetter):
 - Hand-roll the initialization of UCONTEXT_PERSISTENCE

Signed-off-by: Jason Ekstrand 
Reviewed-by: Daniel Vetter 
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c   | 125 +++---
 .../gpu/drm/i915/gem/i915_gem_context_types.h |  22 +++
 .../gpu/drm/i915/gem/selftests/mock_context.c |  16 ++-
 3 files changed, 146 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index f9a6eac78c0ae..f44faad296249 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -191,6 +191,84 @@ static int validate_priority(struct drm_i915_private *i915,
return 0;
 }
 
+static void proto_context_close(struct i915_gem_proto_context *pc)
+{
+   if (pc->vm)
+   i915_vm_put(pc->vm);
+   kfree(pc);
+}
+
+static int proto_context_set_persistence(struct drm_i915_private *i915,
+struct i915_gem_proto_context *pc,
+bool persist)
+{
+   if (persist) {
+   /*
+* Only contexts that are short-lived [that will expire or be
+* reset] are allowed to survive past termination. We require
+* hangcheck to ensure that the persistent requests are healthy.
+*/
+   if (!i915->params.enable_hangcheck)
+   return -EINVAL;
+
+   pc->user_flags |= BIT(UCONTEXT_PERSISTENCE);
+   } else {
+   /* To cancel a context we use "preempt-to-idle" */
+   if (!(i915->caps.scheduler & I915_SCHEDULER_CAP_PREEMPTION))
+   return -ENODEV;
+
+   /*
+* If the cancel fails, we then need to reset, cleanly!
+*
+* If the per-engine reset fails, all hope is lost! We resort
+* to a full GPU reset in that unlikely case, but realistically
+* if the engine could not reset, the full reset does not fare
+* much better. The damage has been done.
+*
+* However, if we cannot reset an engine by itself, we cannot
+* cleanup a hanging persistent context without causing
+* colateral damage, and we should not pretend we can by
+* exposing the interface.
+*/
+   if (!intel_has_reset_engine(>gt))
+   return -ENODEV;
+
+   pc->user_flags &= ~BIT(UCONTEXT_PERSISTENCE);
+   }
+
+   return 0;
+}
+
+static struct i915_gem_proto_context *
+proto_context_create(struct drm_i915_private *i915, unsigned int flags)
+{
+   struct i915_gem_proto_context *pc, *err;
+
+   pc = kzalloc(sizeof(*pc), GFP_KERNEL);
+   if (!pc)
+   return ERR_PTR(-ENOMEM);
+
+   pc->user_flags = BIT(UCONTEXT_BANNABLE) |
+BIT(UCONTEXT_RECOVERABLE);
+   if (i915->params.enable_hangcheck)
+   pc->user_flags |= BIT(UCONTEXT_PERSISTENCE);
+   pc->sched.priority = I915_PRIORITY_NORMAL;
+
+   if (flags & I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE) {
+   if (!HAS_EXECLISTS(i915)) {
+   err = ERR_PTR(-EINVAL);
+   goto proto_close;
+   }
+   pc->single_timeline = true;
+   }
+
+   return pc;
+
+proto_close:
+   proto_context_close(pc);
+   return err;
+}
+
 static struct i915_address_space *
 context_get_vm_rcu(struct i915_gem_context 

[PATCH 14/31] drm/i915/gem: Add a separate validate_priority helper

2021-06-09 Thread Jason Ekstrand
With the proto-context stuff added later in this series, we end up
having to duplicate set_priority.  This lets us avoid duplicating the
validation logic.

Signed-off-by: Jason Ekstrand 
Reviewed-by: Daniel Vetter 
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c | 42 +
 1 file changed, 27 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 61fe6d18d4068..f9a6eac78c0ae 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -169,6 +169,28 @@ lookup_user_engine(struct i915_gem_context *ctx,
return i915_gem_context_get_engine(ctx, idx);
 }
 
+static int validate_priority(struct drm_i915_private *i915,
+const struct drm_i915_gem_context_param *args)
+{
+   s64 priority = args->value;
+
+   if (args->size)
+   return -EINVAL;
+
+   if (!(i915->caps.scheduler & I915_SCHEDULER_CAP_PRIORITY))
+   return -ENODEV;
+
+   if (priority > I915_CONTEXT_MAX_USER_PRIORITY ||
+   priority < I915_CONTEXT_MIN_USER_PRIORITY)
+   return -EINVAL;
+
+   if (priority > I915_CONTEXT_DEFAULT_PRIORITY &&
+   !capable(CAP_SYS_NICE))
+   return -EPERM;
+
+   return 0;
+}
+
 static struct i915_address_space *
 context_get_vm_rcu(struct i915_gem_context *ctx)
 {
@@ -1744,23 +1766,13 @@ static void __apply_priority(struct intel_context *ce, 
void *arg)
 static int set_priority(struct i915_gem_context *ctx,
const struct drm_i915_gem_context_param *args)
 {
-   s64 priority = args->value;
-
-   if (args->size)
-   return -EINVAL;
-
-   if (!(ctx->i915->caps.scheduler & I915_SCHEDULER_CAP_PRIORITY))
-   return -ENODEV;
-
-   if (priority > I915_CONTEXT_MAX_USER_PRIORITY ||
-   priority < I915_CONTEXT_MIN_USER_PRIORITY)
-   return -EINVAL;
+   int err;
 
-   if (priority > I915_CONTEXT_DEFAULT_PRIORITY &&
-   !capable(CAP_SYS_NICE))
-   return -EPERM;
+   err = validate_priority(ctx->i915, args);
+   if (err)
+   return err;
 
-   ctx->sched.priority = priority;
+   ctx->sched.priority = args->value;
context_apply_all(ctx, __apply_priority, ctx);
 
return 0;
-- 
2.31.1



[PATCH 15/31] drm/i915: Add gem/i915_gem_context.h to the docs

2021-06-09 Thread Jason Ekstrand
In order to prevent kernel doc warnings, also fill out docs for any
missing fields and fix those that forgot the "@".

Signed-off-by: Jason Ekstrand 
Reviewed-by: Daniel Vetter 
---
 Documentation/gpu/i915.rst|  2 +
 .../gpu/drm/i915/gem/i915_gem_context_types.h | 43 ---
 2 files changed, 38 insertions(+), 7 deletions(-)

diff --git a/Documentation/gpu/i915.rst b/Documentation/gpu/i915.rst
index 42ce0196930a1..b452f84c9ef2b 100644
--- a/Documentation/gpu/i915.rst
+++ b/Documentation/gpu/i915.rst
@@ -422,6 +422,8 @@ Batchbuffer Parsing
 User Batchbuffer Execution
 --
 
+.. kernel-doc:: drivers/gpu/drm/i915/gem/i915_gem_context_types.h
+
 .. kernel-doc:: drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
:doc: User command execution
 
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h 
b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
index df76767f0c41b..5f0673a2129f9 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
@@ -30,19 +30,39 @@ struct i915_address_space;
 struct intel_timeline;
 struct intel_ring;
 
+/**
+ * struct i915_gem_engines - A set of engines
+ */
 struct i915_gem_engines {
union {
+   /** @link: Link in i915_gem_context::stale::engines */
struct list_head link;
+
+   /** @rcu: RCU to use when freeing */
struct rcu_head rcu;
};
+
+   /** @fence: Fence used for delayed destruction of engines */
struct i915_sw_fence fence;
+
+   /** @ctx: i915_gem_context backpointer */
struct i915_gem_context *ctx;
+
+   /** @num_engines: Number of engines in this set */
unsigned int num_engines;
+
+   /** @engines: Array of engines */
struct intel_context *engines[];
 };
 
+/**
+ * struct i915_gem_engines_iter - Iterator for an i915_gem_engines set
+ */
 struct i915_gem_engines_iter {
+   /** @idx: Index into i915_gem_engines::engines */
unsigned int idx;
+
+   /** @engines: Engine set being iterated */
const struct i915_gem_engines *engines;
 };
 
@@ -53,10 +73,10 @@ struct i915_gem_engines_iter {
  * logical hardware state for a particular client.
  */
 struct i915_gem_context {
-   /** i915: i915 device backpointer */
+   /** @i915: i915 device backpointer */
struct drm_i915_private *i915;
 
-   /** file_priv: owning file descriptor */
+   /** @file_priv: owning file descriptor */
struct drm_i915_file_private *file_priv;
 
/**
@@ -81,7 +101,9 @@ struct i915_gem_context {
 * CONTEXT_USER_ENGINES flag is set).
 */
struct i915_gem_engines __rcu *engines;
-   struct mutex engines_mutex; /* guards writes to engines */
+
+   /** @engines_mutex: guards writes to engines */
+   struct mutex engines_mutex;
 
/**
 * @syncobj: Shared timeline syncobj
@@ -118,7 +140,7 @@ struct i915_gem_context {
 */
struct pid *pid;
 
-   /** link: place with &drm_i915_private.context_list */
+   /** @link: place with &drm_i915_private.context_list */
struct list_head link;
 
/**
@@ -153,11 +175,13 @@ struct i915_gem_context {
 #define CONTEXT_CLOSED 0
 #define CONTEXT_USER_ENGINES   1
 
+   /** @mutex: guards everything that isn't engines or handles_vma */
struct mutex mutex;
 
+   /** @sched: scheduler parameters */
struct i915_sched_attr sched;
 
-   /** guilty_count: How many times this context has caused a GPU hang. */
+   /** @guilty_count: How many times this context has caused a GPU hang. */
atomic_t guilty_count;
/**
 * @active_count: How many times this context was active during a GPU
@@ -171,15 +195,17 @@ struct i915_gem_context {
unsigned long hang_timestamp[2];
 #define CONTEXT_FAST_HANG_JIFFIES (120 * HZ) /* 3 hangs within 120s? Banned! */
 
-   /** remap_slice: Bitmask of cache lines that need remapping */
+   /** @remap_slice: Bitmask of cache lines that need remapping */
u8 remap_slice;
 
/**
-* handles_vma: rbtree to look up our context specific obj/vma for
+* @handles_vma: rbtree to look up our context specific obj/vma for
 * the user handle. (user handles are per fd, but the binding is
 * per vm, which may be one per context or shared with the global GTT)
 */
struct radix_tree_root handles_vma;
+
+   /** @lut_mutex: Locks handles_vma */
struct mutex lut_mutex;
 
/**
@@ -191,8 +217,11 @@ struct i915_gem_context {
 */
char name[TASK_COMM_LEN + 8];
 
+   /** @stale: tracks stale engines to be destroyed */
struct {
+   /** @lock: guards engines */
spinlock_t lock;
+   /** @engines: list of stale engines */
struct list_head engines;
} stale;
 };

[PATCH 13/31] drm/i915: Stop manually RCU banging in reset_stats_ioctl (v2)

2021-06-09 Thread Jason Ekstrand
As far as I can tell, the only real reason for this is to avoid taking a
reference to the i915_gem_context.  The cost of those two atomics
probably pales in comparison to the cost of the ioctl itself so we're
really not buying ourselves anything here.  We're about to make context
lookup a tiny bit more complicated, so let's get rid of the one hand-
rolled case.

Some usermode drivers such as our Vulkan driver call GET_RESET_STATS on
every execbuf so the perf here could theoretically be an issue.  If this
ever does become a performance issue for any such userspace drivers,
they can use set CONTEXT_PARAM_RECOVERABLE to false and look for -EIO
coming from execbuf to check for hangs instead.

v2 (Daniel Vetter):
 - Add a comment in the commit message about recoverable contexts

Signed-off-by: Jason Ekstrand 
Reviewed-by: Daniel Vetter 
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c | 13 -
 drivers/gpu/drm/i915/i915_drv.h |  8 +---
 2 files changed, 5 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 0ba8506fb966f..61fe6d18d4068 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -2090,16 +2090,13 @@ int i915_gem_context_reset_stats_ioctl(struct 
drm_device *dev,
struct drm_i915_private *i915 = to_i915(dev);
struct drm_i915_reset_stats *args = data;
struct i915_gem_context *ctx;
-   int ret;
 
if (args->flags || args->pad)
return -EINVAL;
 
-   ret = -ENOENT;
-   rcu_read_lock();
-   ctx = __i915_gem_context_lookup_rcu(file->driver_priv, args->ctx_id);
+   ctx = i915_gem_context_lookup(file->driver_priv, args->ctx_id);
if (!ctx)
-   goto out;
+   return -ENOENT;
 
/*
 * We opt for unserialised reads here. This may result in tearing
@@ -2116,10 +2113,8 @@ int i915_gem_context_reset_stats_ioctl(struct drm_device 
*dev,
	args->batch_active = atomic_read(&ctx->guilty_count);
	args->batch_pending = atomic_read(&ctx->active_count);
 
-   ret = 0;
-out:
-   rcu_read_unlock();
-   return ret;
+   i915_gem_context_put(ctx);
+   return 0;
 }
 
 /* GEM context-engines iterator: for_each_gem_engine() */
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 38ff2fb897443..fed14ffc52437 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1850,19 +1850,13 @@ struct drm_gem_object *i915_gem_prime_import(struct 
drm_device *dev,
 
 struct dma_buf *i915_gem_prime_export(struct drm_gem_object *gem_obj, int 
flags);
 
-static inline struct i915_gem_context *
-__i915_gem_context_lookup_rcu(struct drm_i915_file_private *file_priv, u32 id)
-{
-   return xa_load(&file_priv->context_xa, id);
-}
-
 static inline struct i915_gem_context *
 i915_gem_context_lookup(struct drm_i915_file_private *file_priv, u32 id)
 {
struct i915_gem_context *ctx;
 
rcu_read_lock();
-   ctx = __i915_gem_context_lookup_rcu(file_priv, id);
+   ctx = xa_load(&file_priv->context_xa, id);
if (ctx && !kref_get_unless_zero(>ref))
ctx = NULL;
rcu_read_unlock();
-- 
2.31.1



[PATCH 12/31] drm/i915/gem: Disallow creating contexts with too many engines

2021-06-09 Thread Jason Ekstrand
There's no sense in allowing userspace to create more engines than it
can possibly access via execbuf.

Signed-off-by: Jason Ekstrand 
Reviewed-by: Daniel Vetter 
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 5eca91ded3423..0ba8506fb966f 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -1639,11 +1639,11 @@ set_engines(struct i915_gem_context *ctx,
return -EINVAL;
}
 
-   /*
-* Note that I915_EXEC_RING_MASK limits execbuf to only using the
-* first 64 engines defined here.
-*/
num_engines = (args->size - sizeof(*user)) / sizeof(*user->engines);
+   /* RING_MASK has no shift so we can use it directly here */
+   if (num_engines > I915_EXEC_RING_MASK + 1)
+   return -EINVAL;
+
set.engines = alloc_engines(num_engines);
if (!set.engines)
return -ENOMEM;
-- 
2.31.1



[PATCH 10/31] drm/i915/gem: Remove engine auto-magic with FENCE_SUBMIT (v2)

2021-06-09 Thread Jason Ekstrand
Even though FENCE_SUBMIT is only documented to wait until the request in
the in-fence starts instead of waiting until it completes, it has a bit
more magic than that.  If FENCE_SUBMIT is used to submit something to a
balanced engine, we would wait to assign engines until the primary
request was ready to start and then attempt to assign it to a different
engine than the primary.  There is an IGT test (the bonded-slice subtest
of gem_exec_balancer) which exercises this by submitting a primary batch
to a specific VCS and then using FENCE_SUBMIT to submit a secondary
which can run on any VCS and have i915 figure out which VCS to run it on
such that they can run in parallel.

However, this functionality has never been used in the real world.  The
media driver (the only user of FENCE_SUBMIT) always picks exactly two
physical engines to bond and never asks us to pick which to use.

v2 (Daniel Vetter):
 - Mention the exact IGT test this breaks

Signed-off-by: Jason Ekstrand 
Reviewed-by: Daniel Vetter 
---
 drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c  |  2 +-
 drivers/gpu/drm/i915/gt/intel_engine_types.h|  7 ---
 .../drm/i915/gt/intel_execlists_submission.c| 17 -
 3 files changed, 1 insertion(+), 25 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index a6a3b67aa0019..88e7cbf8fc5f8 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -3474,7 +3474,7 @@ i915_gem_do_execbuffer(struct drm_device *dev,
if (args->flags & I915_EXEC_FENCE_SUBMIT)
err = i915_request_await_execution(eb.request,
   in_fence,
-  
eb.engine->bond_execute);
+  NULL);
else
err = i915_request_await_dma_fence(eb.request,
   in_fence);
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h 
b/drivers/gpu/drm/i915/gt/intel_engine_types.h
index e113f93b32745..eeedb2f457ae5 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
@@ -447,13 +447,6 @@ struct intel_engine_cs {
 */
void(*submit_request)(struct i915_request *rq);
 
-   /*
-* Called on signaling of a SUBMIT_FENCE, passing along the signaling
-* request down to the bonded pairs.
-*/
-   void(*bond_execute)(struct i915_request *rq,
-   struct dma_fence *signal);
-
/*
 * Call when the priority on a request has changed and it and its
 * dependencies may need rescheduling. Note the request itself may
diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c 
b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
index 38fe91205c77d..01e77ba397372 100644
--- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
+++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
@@ -3587,22 +3587,6 @@ static void virtual_submit_request(struct i915_request 
*rq)
	spin_unlock_irqrestore(&ve->base.active.lock, flags);
 }
 
-static void
-virtual_bond_execute(struct i915_request *rq, struct dma_fence *signal)
-{
-   intel_engine_mask_t allowed, exec;
-
-   allowed = ~to_request(signal)->engine->mask;
-
-   /* Restrict the bonded request to run on only the available engines */
-   exec = READ_ONCE(rq->execution_mask);
-   while (!try_cmpxchg(&rq->execution_mask, &exec, exec & allowed))
-   ;
-
-   /* Prevent the master from being re-run on the bonded engines */
-   to_request(signal)->execution_mask &= ~allowed;
-}
-
 struct intel_context *
 intel_execlists_create_virtual(struct intel_engine_cs **siblings,
   unsigned int count)
@@ -3656,7 +3640,6 @@ intel_execlists_create_virtual(struct intel_engine_cs 
**siblings,
 
ve->base.schedule = i915_schedule;
ve->base.submit_request = virtual_submit_request;
-   ve->base.bond_execute = virtual_bond_execute;
 
INIT_LIST_HEAD(virtual_queue(ve));
ve->base.execlists.queue_priority_hint = INT_MIN;
-- 
2.31.1



[PATCH 11/31] drm/i915/request: Remove the hook from await_execution

2021-06-09 Thread Jason Ekstrand
This was only ever used for FENCE_SUBMIT automatic engine selection
which was removed in the previous commit.

Signed-off-by: Jason Ekstrand 
Reviewed-by: Daniel Vetter 
---
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c|  3 +-
 drivers/gpu/drm/i915/i915_request.c   | 42 ---
 drivers/gpu/drm/i915/i915_request.h   |  4 +-
 3 files changed, 9 insertions(+), 40 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 88e7cbf8fc5f8..720487ad6a5a4 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -3473,8 +3473,7 @@ i915_gem_do_execbuffer(struct drm_device *dev,
if (in_fence) {
if (args->flags & I915_EXEC_FENCE_SUBMIT)
err = i915_request_await_execution(eb.request,
-  in_fence,
-  NULL);
+  in_fence);
else
err = i915_request_await_dma_fence(eb.request,
   in_fence);
diff --git a/drivers/gpu/drm/i915/i915_request.c 
b/drivers/gpu/drm/i915/i915_request.c
index 1014c71cf7f52..bb142f944f550 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -49,7 +49,6 @@
 struct execute_cb {
struct irq_work work;
struct i915_sw_fence *fence;
-   void (*hook)(struct i915_request *rq, struct dma_fence *signal);
struct i915_request *signal;
 };
 
@@ -180,17 +179,6 @@ static void irq_execute_cb(struct irq_work *wrk)
kmem_cache_free(global.slab_execute_cbs, cb);
 }
 
-static void irq_execute_cb_hook(struct irq_work *wrk)
-{
-   struct execute_cb *cb = container_of(wrk, typeof(*cb), work);
-
-   cb->hook(container_of(cb->fence, struct i915_request, submit),
-	     &cb->signal->fence);
-   i915_request_put(cb->signal);
-
-   irq_execute_cb(wrk);
-}
-
 static __always_inline void
 __notify_execute_cb(struct i915_request *rq, bool (*fn)(struct irq_work *wrk))
 {
@@ -517,17 +505,12 @@ static bool __request_in_flight(const struct i915_request 
*signal)
 static int
 __await_execution(struct i915_request *rq,
  struct i915_request *signal,
- void (*hook)(struct i915_request *rq,
-  struct dma_fence *signal),
  gfp_t gfp)
 {
struct execute_cb *cb;
 
-   if (i915_request_is_active(signal)) {
-   if (hook)
-   hook(rq, >fence);
+   if (i915_request_is_active(signal))
return 0;
-   }
 
cb = kmem_cache_alloc(global.slab_execute_cbs, gfp);
if (!cb)
@@ -537,12 +520,6 @@ __await_execution(struct i915_request *rq,
i915_sw_fence_await(cb->fence);
init_irq_work(>work, irq_execute_cb);
 
-   if (hook) {
-   cb->hook = hook;
-   cb->signal = i915_request_get(signal);
-   cb->work.func = irq_execute_cb_hook;
-   }
-
/*
 * Register the callback first, then see if the signaler is already
 * active. This ensures that if we race with the
@@ -1253,7 +1230,7 @@ emit_semaphore_wait(struct i915_request *to,
goto await_fence;
 
/* Only submit our spinner after the signaler is running! */
-   if (__await_execution(to, from, NULL, gfp))
+   if (__await_execution(to, from, gfp))
goto await_fence;
 
if (__emit_semaphore_wait(to, from, from->fence.seqno))
@@ -1284,16 +1261,14 @@ static int intel_timeline_sync_set_start(struct 
intel_timeline *tl,
 
 static int
 __i915_request_await_execution(struct i915_request *to,
-  struct i915_request *from,
-  void (*hook)(struct i915_request *rq,
-   struct dma_fence *signal))
+  struct i915_request *from)
 {
int err;
 
GEM_BUG_ON(intel_context_is_barrier(from->context));
 
/* Submit both requests at the same time */
-   err = __await_execution(to, from, hook, I915_FENCE_GFP);
+   err = __await_execution(to, from, I915_FENCE_GFP);
if (err)
return err;
 
@@ -1406,9 +1381,7 @@ i915_request_await_external(struct i915_request *rq, 
struct dma_fence *fence)
 
 int
 i915_request_await_execution(struct i915_request *rq,
-struct dma_fence *fence,
-void (*hook)(struct i915_request *rq,
- struct dma_fence *signal))
+struct dma_fence *fence)
 {
	struct dma_fence **child = &fence;
unsigned int nchild = 1;
@@ -1441,8 +1414,7 @@ i915_request_await_execution(struct 

[PATCH 08/31] drm/i915: Drop getparam support for I915_CONTEXT_PARAM_ENGINES

2021-06-09 Thread Jason Ekstrand
This has never been used by any userspace except IGT and provides no
real functionality beyond parroting back parameters userspace passed in
as part of context creation or via setparam.  If the context is in
legacy mode (where you use I915_EXEC_RENDER and friends), it returns
success with zero data so it's not useful for discovering what engines
are in the context.  It's also not a replacement for the recently
removed I915_CONTEXT_CLONE_ENGINES because it doesn't return any of the
balancing or bonding information.

Signed-off-by: Jason Ekstrand 
Reviewed-by: Daniel Vetter 
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c | 77 +
 1 file changed, 1 insertion(+), 76 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 249bd36f14019..e36e3b1ae14e4 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -1724,78 +1724,6 @@ set_engines(struct i915_gem_context *ctx,
return 0;
 }
 
-static int
-get_engines(struct i915_gem_context *ctx,
-   struct drm_i915_gem_context_param *args)
-{
-   struct i915_context_param_engines __user *user;
-   struct i915_gem_engines *e;
-   size_t n, count, size;
-   bool user_engines;
-   int err = 0;
-
-   e = __context_engines_await(ctx, &user_engines);
-   if (!e)
-   return -ENOENT;
-
-   if (!user_engines) {
-   i915_sw_fence_complete(>fence);
-   args->size = 0;
-   return 0;
-   }
-
-   count = e->num_engines;
-
-   /* Be paranoid in case we have an impedance mismatch */
-   if (!check_struct_size(user, engines, count, &size)) {
-   err = -EINVAL;
-   goto err_free;
-   }
-   if (overflows_type(size, args->size)) {
-   err = -EINVAL;
-   goto err_free;
-   }
-
-   if (!args->size) {
-   args->size = size;
-   goto err_free;
-   }
-
-   if (args->size < size) {
-   err = -EINVAL;
-   goto err_free;
-   }
-
-   user = u64_to_user_ptr(args->value);
-   if (put_user(0, &user->extensions)) {
-   err = -EFAULT;
-   goto err_free;
-   }
-
-   for (n = 0; n < count; n++) {
-   struct i915_engine_class_instance ci = {
-   .engine_class = I915_ENGINE_CLASS_INVALID,
-   .engine_instance = I915_ENGINE_CLASS_INVALID_NONE,
-   };
-
-   if (e->engines[n]) {
-   ci.engine_class = e->engines[n]->engine->uabi_class;
-   ci.engine_instance = 
e->engines[n]->engine->uabi_instance;
-   }
-
-   if (copy_to_user(&user->engines[n], &ci, sizeof(ci))) {
-   err = -EFAULT;
-   goto err_free;
-   }
-   }
-
-   args->size = size;
-
-err_free:
-   i915_sw_fence_complete(>fence);
-   return err;
-}
-
 static int
 set_persistence(struct i915_gem_context *ctx,
const struct drm_i915_gem_context_param *args)
@@ -2126,10 +2054,6 @@ int i915_gem_context_getparam_ioctl(struct drm_device 
*dev, void *data,
ret = get_ppgtt(file_priv, ctx, args);
break;
 
-   case I915_CONTEXT_PARAM_ENGINES:
-   ret = get_engines(ctx, args);
-   break;
-
case I915_CONTEXT_PARAM_PERSISTENCE:
args->size = 0;
args->value = i915_gem_context_is_persistent(ctx);
@@ -2137,6 +2061,7 @@ int i915_gem_context_getparam_ioctl(struct drm_device 
*dev, void *data,
 
case I915_CONTEXT_PARAM_NO_ZEROMAP:
case I915_CONTEXT_PARAM_BAN_PERIOD:
+   case I915_CONTEXT_PARAM_ENGINES:
case I915_CONTEXT_PARAM_RINGSIZE:
default:
ret = -EINVAL;
-- 
2.31.1



[PATCH 09/31] drm/i915/gem: Disallow bonding of virtual engines (v3)

2021-06-09 Thread Jason Ekstrand
This adds a bunch of complexity which the media driver has never
actually used.  The media driver does technically bond a balanced engine
to another engine but the balanced engine only has one engine in the
sibling set.  This doesn't actually result in a virtual engine.

This functionality was originally added to handle cases where we may
have more than two video engines and media might want to load-balance
their bonded submits by, for instance, submitting to a balanced vcs0-1
as the primary and then vcs2-3 as the secondary.  However, no such
hardware has shipped thus far and, if we ever want to enable such
use-cases in the future, we'll use the up-and-coming parallel submit API
which targets GuC submission.

This makes I915_CONTEXT_ENGINES_EXT_BOND a total no-op.  We leave the
validation code in place in case we ever decide we want to do something
interesting with the bonding information.

v2 (Jason Ekstrand):
 - Don't delete quite as much code.

v3 (Tvrtko Ursulin):
 - Add some history to the commit message

Signed-off-by: Jason Ekstrand 
Reviewed-by: Daniel Vetter 
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c   |  18 +-
 .../drm/i915/gt/intel_execlists_submission.c  |  69 --
 .../drm/i915/gt/intel_execlists_submission.h  |   5 +-
 drivers/gpu/drm/i915/gt/selftest_execlists.c  | 229 --
 4 files changed, 8 insertions(+), 313 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index e36e3b1ae14e4..5eca91ded3423 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -1552,6 +1552,12 @@ set_engines__bond(struct i915_user_extension __user 
*base, void *data)
}
virtual = set->engines->engines[idx]->engine;
 
+   if (intel_engine_is_virtual(virtual)) {
+   drm_dbg(&i915->drm,
+   "Bonding with virtual engines not allowed\n");
+   return -EINVAL;
+   }
+
err = check_user_mbz(>flags);
if (err)
return err;
@@ -1592,18 +1598,6 @@ set_engines__bond(struct i915_user_extension __user 
*base, void *data)
n, ci.engine_class, ci.engine_instance);
return -EINVAL;
}
-
-   /*
-* A non-virtual engine has no siblings to choose between; and
-* a submit fence will always be directed to the one engine.
-*/
-   if (intel_engine_is_virtual(virtual)) {
-   err = intel_virtual_engine_attach_bond(virtual,
-  master,
-  bond);
-   if (err)
-   return err;
-   }
}
 
return 0;
diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c 
b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
index f9ffaece12213..38fe91205c77d 100644
--- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
+++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
@@ -182,18 +182,6 @@ struct virtual_engine {
int prio;
} nodes[I915_NUM_ENGINES];
 
-   /*
-* Keep track of bonded pairs -- restrictions upon on our selection
-* of physical engines any particular request may be submitted to.
-* If we receive a submit-fence from a master engine, we will only
-* use one of sibling_mask physical engines.
-*/
-   struct ve_bond {
-   const struct intel_engine_cs *master;
-   intel_engine_mask_t sibling_mask;
-   } *bonds;
-   unsigned int num_bonds;
-
/* And finally, which physical engines this virtual engine maps onto. */
unsigned int num_siblings;
struct intel_engine_cs *siblings[];
@@ -3347,7 +3335,6 @@ static void rcu_virtual_context_destroy(struct 
work_struct *wrk)
intel_breadcrumbs_free(ve->base.breadcrumbs);
intel_engine_free_request_pool(>base);
 
-   kfree(ve->bonds);
kfree(ve);
 }
 
@@ -3600,33 +3587,13 @@ static void virtual_submit_request(struct i915_request 
*rq)
spin_unlock_irqrestore(>base.active.lock, flags);
 }
 
-static struct ve_bond *
-virtual_find_bond(struct virtual_engine *ve,
- const struct intel_engine_cs *master)
-{
-   int i;
-
-   for (i = 0; i < ve->num_bonds; i++) {
-   if (ve->bonds[i].master == master)
-   return &ve->bonds[i];
-   }
-
-   return NULL;
-}
-
 static void
 virtual_bond_execute(struct i915_request *rq, struct dma_fence *signal)
 {
-   struct virtual_engine *ve = to_virtual_engine(rq->engine);
intel_engine_mask_t allowed, exec;
-   struct ve_bond *bond;
 
allowed = ~to_request(signal)->engine->mask;
 
-   bond = virtual_find_bond(ve, to_request(signal)->engine);
-   if 

  1   2   3   >