Re: [Intel-gfx] [PATCH] drm/i915/guc: Stop using mutex while sending CTB messages

2020-01-31 Thread Michal Wajdeczko
On Fri, 31 Jan 2020 17:40:12 +0100, Matthew Brost  
 wrote:



On Fri, Jan 31, 2020 at 03:33:55PM +, Chris Wilson wrote:

Quoting Michal Wajdeczko (2020-01-31 14:58:34)

While we are always using CT "send" buffer to send request messages
to GuC, we usually don't ask GuC to use CT "receive" buffer to send
back response messages, since almost all returned data can fit into
reserved bits in status dword inside CT descriptor. However, relying
on data modifications inside CT descriptor requires use of mutex to
allow only single CT request in flight, until we read back that status
dword from the CT descriptor.


Q. do we need the same lock for ct_read() and ct_write()?

Could ct_read() use a lock-free ringbuffer, and then if I've read it
right, you wouldn't have any overlapping spinlock between the interrupt
handler and the rest (thus avoiding the interrupt-off).
-Chris


Agree with Chris, it would nice if ct_read() didn't need a lock. At a  
minimum I

think it needs a different lock than the write path.


two options:
1) reuse gt->irq_lock for ct_read() and use ct->lock for ct_write()
2) define lock as part of ct->ctbs[] for dedicated use by ct_read()/write()

I guess 2 is clearer
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] drm/i915/guc: Stop using mutex while sending CTB messages

2020-01-31 Thread Michal Wajdeczko
On Fri, 31 Jan 2020 16:33:55 +0100, Chris Wilson  
 wrote:



Quoting Michal Wajdeczko (2020-01-31 14:58:34)

While we are always using CT "send" buffer to send request messages
to GuC, we usually don't ask GuC to use CT "receive" buffer to send
back response messages, since almost all returned data can fit into
reserved bits in status dword inside CT descriptor. However, relying
on data modifications inside CT descriptor requires use of mutex to
allow only single CT request in flight, until we read back that status
dword from the CT descriptor.


Q. do we need the same lock for ct_read() and ct_write()?


No. And additionally I missed that ct_read() was already implicitly locked
with gt->irq_lock by gen11_gt_irq_handler and with i915->irq_lock by
guc_enable_communication() - need to fix that too ;(



Could ct_read() use a lock-free ringbuffer, and then if I've read it
right, you wouldn't have any overlapping spinlock between the interrupt
handler and the rest (thus avoiding the interrupt-off).


Not sure if can go lock-free with two callers, will look for gt->irq_lock
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] drm/i915/guc: Stop using mutex while sending CTB messages

2020-01-31 Thread Matthew Brost

On Fri, Jan 31, 2020 at 03:33:55PM +, Chris Wilson wrote:

Quoting Michal Wajdeczko (2020-01-31 14:58:34)

While we are always using CT "send" buffer to send request messages
to GuC, we usually don't ask GuC to use CT "receive" buffer to send
back response messages, since almost all returned data can fit into
reserved bits in status dword inside CT descriptor. However, relying
on data modifications inside CT descriptor requires use of mutex to
allow only single CT request in flight, until we read back that status
dword from the CT descriptor.


Q. do we need the same lock for ct_read() and ct_write()?

Could ct_read() use a lock-free ringbuffer, and then if I've read it
right, you wouldn't have any overlapping spinlock between the interrupt
handler and the rest (thus avoiding the interrupt-off).
-Chris


Agree with Chris, it would nice if ct_read() didn't need a lock. At a minimum I
think it needs a different lock than the write path.

Matt 
___

Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] drm/i915/guc: Stop using mutex while sending CTB messages

2020-01-31 Thread Chris Wilson
Quoting Michal Wajdeczko (2020-01-31 14:58:34)
> While we are always using CT "send" buffer to send request messages
> to GuC, we usually don't ask GuC to use CT "receive" buffer to send
> back response messages, since almost all returned data can fit into
> reserved bits in status dword inside CT descriptor. However, relying
> on data modifications inside CT descriptor requires use of mutex to
> allow only single CT request in flight, until we read back that status
> dword from the CT descriptor.

Q. do we need the same lock for ct_read() and ct_write()?

Could ct_read() use a lock-free ringbuffer, and then if I've read it
right, you wouldn't have any overlapping spinlock between the interrupt
handler and the rest (thus avoiding the interrupt-off).
-Chris
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH] drm/i915/guc: Stop using mutex while sending CTB messages

2020-01-31 Thread Michal Wajdeczko
While we are always using CT "send" buffer to send request messages
to GuC, we usually don't ask GuC to use CT "receive" buffer to send
back response messages, since almost all returned data can fit into
reserved bits in status dword inside CT descriptor. However, relying
on data modifications inside CT descriptor requires use of mutex to
allow only single CT request in flight, until we read back that status
dword from the CT descriptor.

But some H2G actions (like AUTHENTICATE_HUC, and more to come) are
like one-way requests for which we don't care about immediate status,
since we will use a different way to confirm that given action was
completed (ie. HUC_STATUS reg is used to verify HuC authentication).

If we ask GuC to always send response messages over "receive" buffer
for all requests for which we care about their status, then we can
use CT descriptor option only for our new one-way requests, for which
status can be temporary ignored.

Since we only need to protect CT descriptor during reading/writing
from the command buffer, we can drop mutex and switch to spinlock.

Signed-off-by: Michal Wajdeczko 
Cc: Chris Wilson 
Cc: Daniele Ceraolo Spurio 
Cc: John Harrison 
Cc: Matthew Brost 
---
 drivers/gpu/drm/i915/gt/uc/intel_guc.c|   2 +-
 drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 147 --
 drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h |   3 +
 3 files changed, 57 insertions(+), 95 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
index c4c1523da7a6..d5938c1d44a2 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
@@ -519,7 +519,7 @@ int intel_guc_sample_forcewake(struct intel_guc *guc)
 int intel_guc_auth_huc(struct intel_guc *guc, u32 rsa_offset)
 {
u32 action[] = {
-   INTEL_GUC_ACTION_AUTHENTICATE_HUC,
+   INTEL_GUC_ACTION_AUTHENTICATE_HUC | GUC_SEND_FLAG_NO_RESPONSE,
rsa_offset
};
 
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
index d84812683364..760e03cc2bad 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
@@ -41,6 +41,7 @@ static void ct_incoming_request_worker_func(struct 
work_struct *w);
  */
 void intel_guc_ct_init_early(struct intel_guc_ct *ct)
 {
+   spin_lock_init(>lock);
spin_lock_init(>requests.lock);
INIT_LIST_HEAD(>requests.pending);
INIT_LIST_HEAD(>requests.incoming);
@@ -88,13 +89,6 @@ static void guc_ct_buffer_desc_init(struct 
guc_ct_buffer_desc *desc,
desc->owner = CTB_OWNER_HOST;
 }
 
-static void guc_ct_buffer_desc_reset(struct guc_ct_buffer_desc *desc)
-{
-   desc->head = 0;
-   desc->tail = 0;
-   desc->is_in_error = 0;
-}
-
 static int guc_action_register_ct_buffer(struct intel_guc *guc,
 u32 desc_addr,
 u32 type)
@@ -317,6 +311,7 @@ static int ct_write(struct intel_guc_ct *ct,
 {
struct intel_guc_ct_buffer *ctb = >ctbs[CTB_SEND];
struct guc_ct_buffer_desc *desc = ctb->desc;
+   u32 action_code = action[0] & GUC_CT_MSG_ACTION_MASK;
u32 head = desc->head;
u32 tail = desc->tail;
u32 size = desc->size;
@@ -325,6 +320,8 @@ static int ct_write(struct intel_guc_ct *ct,
u32 *cmds = ctb->cmds;
unsigned int i;
 
+   lockdep_assert_held(>lock);
+
if (unlikely(desc->is_in_error))
return -EPIPE;
 
@@ -359,7 +356,7 @@ static int ct_write(struct intel_guc_ct *ct,
header = (len << GUC_CT_MSG_LEN_SHIFT) |
 (GUC_CT_MSG_WRITE_FENCE_TO_DESC) |
 (want_response ? GUC_CT_MSG_SEND_STATUS : 0) |
-(action[0] << GUC_CT_MSG_ACTION_SHIFT);
+(action_code << GUC_CT_MSG_ACTION_SHIFT);
 
CT_DEBUG(ct, "writing %*ph %*ph %*ph\n",
 4, , 4, , 4 * (len - 1), [1]);
@@ -387,62 +384,11 @@ static int ct_write(struct intel_guc_ct *ct,
return -EPIPE;
 }
 
-/**
- * wait_for_ctb_desc_update - Wait for the CT buffer descriptor update.
- * @desc:  buffer descriptor
- * @fence: response fence
- * @status:placeholder for status
- *
- * Guc will update CT buffer descriptor with new fence and status
- * after processing the command identified by the fence. Wait for
- * specified fence and then read from the descriptor status of the
- * command.
- *
- * Return:
- * *   0 response received (status is valid)
- * *   -ETIMEDOUT no response within hardcoded timeout
- * *   -EPROTO no response, CT buffer is in error
- */
-static int wait_for_ctb_desc_update(struct guc_ct_buffer_desc *desc,
-   u32 fence,
-   u32 *status)
-{
-   int err;
-
-   /*
-* Fast commands should complete in less than 10us, so sample quickly
-* up to that length of