Re: [Intel-gfx] [PATCH 3/4] drm/i915: peel dma-fence-chains wait fences

2020-04-12 Thread Venkata Sandeep Dhanalakota
On 20/04/11 11:50, Lionel Landwerlin wrote:
> On 10/04/2020 19:51, Venkata Sandeep Dhanalakota wrote:
> > From: Lionel Landwerlin 
> > 
> > To allow faster engine-to-engine synchronization, peel the layer of
> > dma-fence-chain to expose potential i915 fences so that the
> > i915-request code can emit HW semaphore wait/signal operations in the
> > ring, which is faster than waking up the host to submit unblocked
> > workloads after interrupt notification.
> > 
> > Signed-off-by: Lionel Landwerlin 
> > ---
> >   .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 39 +--
> >   1 file changed, 35 insertions(+), 4 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> > index 8dd651cdca39..e43b76d7e9fd 100644
> > --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> > +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> > @@ -2508,6 +2508,7 @@ await_fence_array(struct i915_execbuffer *eb,
> > for (n = 0; n < nfences; n++) {
> > struct drm_syncobj *syncobj;
> > +   struct dma_fence_chain *chain;
> > unsigned int flags;
> > syncobj = ptr_unpack_bits(fences[n].syncobj, &flags, 2);
> > @@ -2515,10 +2516,40 @@ await_fence_array(struct i915_execbuffer *eb,
> > if (!fences[n].dma_fence)
> > continue;
> > -   err = i915_request_await_dma_fence(eb->request,
> > -  fences[n].dma_fence);
> > -   if (err < 0)
> > -   return err;
> > +   /*
> > +* If we're dealing with a dma-fence-chain, peel the chain by
> > +* adding waits on all of the unsignaled fences
> > +* the chain points to (dma_fence_chain_for_each
> > +* walks those for us).
> > +*
> > +* This enables us to identify waits on i915 fences and allows
> > +* for faster engine-to-engine synchronization using HW
> > +* semaphores.
> > +*/
> > +   chain = to_dma_fence_chain(fences[n].dma_fence);
> > +   if (chain) {
> > +   struct dma_fence *iter;
> > +
> > +   dma_fence_chain_for_each(iter, fences[n].dma_fence) {
> 
> 
> The kbuild bot made me think of an interesting case.
> 
> It is possible to build a chain where the first element isn't a
> dma_fence_chain.
> 
Yes, agreed: we could have a valid fence chain whose first element is a
normal dma_fence, in which case iter_chain can be NULL. I will address
this in the next revision of the patch.

> 
> We should handle this case like this:
> 
> 
> if (iter_chain)
>     err = i915_request_await_dma_fence(eb->request, iter_chain->fence);
> else
>     err = i915_request_await_dma_fence(eb->request, iter);
> if (err < 0) {
>     dma_fence_put(iter);
>     return err;
> }
> 
> 
> > +   struct dma_fence_chain *iter_chain =
> > +   to_dma_fence_chain(iter);
> > +
> > +   GEM_BUG_ON(!iter_chain);
> > +
> > +   err = i915_request_await_dma_fence(eb->request,
> > +  iter_chain->fence);
> > +   if (err < 0) {
> > +   dma_fence_put(iter);
> > +   return err;
> > +   }
> > +   }
> > +
> > +   } else {
> > +   err = i915_request_await_dma_fence(eb->request,
> > +  fences[n].dma_fence);
> > +   if (err < 0)
> > +   return err;
> > +   }
> > }
> > return 0;
> 
> 
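
For reference, a minimal sketch of the peeling loop with the NULL-head
handling folded in (an illustration built from the suggestion above, not
the posted next revision; it assumes the dma-fence-chain helpers from
include/linux/dma-fence-chain.h, where to_dma_fence_chain() returns NULL
for a plain dma_fence):

	chain = to_dma_fence_chain(fences[n].dma_fence);
	if (chain) {
		struct dma_fence *iter;

		dma_fence_chain_for_each(iter, fences[n].dma_fence) {
			struct dma_fence_chain *iter_chain =
				to_dma_fence_chain(iter);

			/*
			 * The oldest link of a chain may be a plain
			 * dma_fence rather than a dma_fence_chain node,
			 * so fall back to waiting on the fence itself.
			 */
			if (iter_chain)
				err = i915_request_await_dma_fence(eb->request,
								   iter_chain->fence);
			else
				err = i915_request_await_dma_fence(eb->request,
								   iter);
			if (err < 0) {
				/* the iterator still holds a reference on iter */
				dma_fence_put(iter);
				return err;
			}
		}
	}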


Re: [Intel-gfx] [PATCH 3/4] drm/i915: peel dma-fence-chains wait fences

2020-04-11 Thread Lionel Landwerlin

On 10/04/2020 19:51, Venkata Sandeep Dhanalakota wrote:

From: Lionel Landwerlin 

To allow faster engine-to-engine synchronization, peel the layer of
dma-fence-chain to expose potential i915 fences so that the
i915-request code can emit HW semaphore wait/signal operations in the
ring, which is faster than waking up the host to submit unblocked
workloads after interrupt notification.

Signed-off-by: Lionel Landwerlin 
---
  .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 39 +--
  1 file changed, 35 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 8dd651cdca39..e43b76d7e9fd 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -2508,6 +2508,7 @@ await_fence_array(struct i915_execbuffer *eb,
  
  	for (n = 0; n < nfences; n++) {

struct drm_syncobj *syncobj;
+   struct dma_fence_chain *chain;
unsigned int flags;
  
  		syncobj = ptr_unpack_bits(fences[n].syncobj, &flags, 2);

@@ -2515,10 +2516,40 @@ await_fence_array(struct i915_execbuffer *eb,
if (!fences[n].dma_fence)
continue;
  
-   err = i915_request_await_dma_fence(eb->request,
-  fences[n].dma_fence);
-   if (err < 0)
-   return err;
+   /*
+* If we're dealing with a dma-fence-chain, peel the chain by
+* adding waits on all of the unsignaled fences
+* the chain points to (dma_fence_chain_for_each
+* walks those for us).
+*
+* This enables us to identify waits on i915 fences and allows
+* for faster engine-to-engine synchronization using HW
+* semaphores.
+*/
+   chain = to_dma_fence_chain(fences[n].dma_fence);
+   if (chain) {
+   struct dma_fence *iter;
+
+   dma_fence_chain_for_each(iter, fences[n].dma_fence) {



The kbuild bot made me think of an interesting case.

It is possible to build a chain where the first element isn't a 
dma_fence_chain.



We should handle this case like this:


if (iter_chain)
    err = i915_request_await_dma_fence(eb->request, iter_chain->fence);
else
    err = i915_request_await_dma_fence(eb->request, iter);
if (err < 0) {
    dma_fence_put(iter);
    return err;
}



+   struct dma_fence_chain *iter_chain =
+   to_dma_fence_chain(iter);
+
+   GEM_BUG_ON(!iter_chain);
+
+   err = i915_request_await_dma_fence(eb->request,
+  iter_chain->fence);
+   if (err < 0) {
+   dma_fence_put(iter);
+   return err;
+   }
+   }
+
+   } else {
+   err = i915_request_await_dma_fence(eb->request,
+  fences[n].dma_fence);
+   if (err < 0)
+   return err;
+   }
}
  
  	return 0;
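
A note on the dma_fence_put(iter) in the error path above:
dma_fence_chain_for_each() manages fence references while it walks. A
sketch of the helper as it reads in include/linux/dma-fence-chain.h of
this era (quoted from memory, so treat as approximate):

	#define dma_fence_chain_for_each(iter, head)	\
		for (iter = dma_fence_get(head); iter;	\
		     iter = dma_fence_chain_walk(iter))

dma_fence_chain_walk() drops the reference on the fence it was handed
and returns the next unsignaled link with a fresh reference, so breaking
out of the loop early leaves exactly one reference held on iter, which
the error path must put before returning.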





Re: [Intel-gfx] [PATCH 3/4] drm/i915: peel dma-fence-chains wait fences

2020-04-10 Thread kbuild test robot
Hi Venkata,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on drm-tip/drm-tip]
[cannot apply to drm-intel/for-linux-next linus/master v5.6 next-20200410]
[if your patch is applied to the wrong git tree, please drop us a note to help
improve the system. BTW, we also suggest using the '--base' option to specify the
base tree in git format-patch; please see https://stackoverflow.com/a/37406982]

url:
https://github.com/0day-ci/linux/commits/Venkata-Sandeep-Dhanalakota/drm-i915-introduce-a-mechanism-to-extend-execbuf2/20200411-031057
base:   git://anongit.freedesktop.org/drm/drm-tip drm-tip

If you fix the issue, kindly add the following tag as appropriate
Reported-by: kbuild test robot 


cppcheck warnings: (new ones prefixed by >>)

>> drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c:2550:12: warning: Either the condition '!iter_chain' is redundant or there is possible null pointer dereference: iter_chain. [nullPointerRedundantCheck]
      iter_chain->fence);
      ^
   drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c:2547:16: note: Assuming that condition '!iter_chain' is not redundant
      GEM_BUG_ON(!iter_chain);
      ^
   drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c:2545:24: note: Assignment 'iter_chain=to_dma_fence_chain(iter)', assigned value is 0
      to_dma_fence_chain(iter);
      ^
   drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c:2550:12: note: Null pointer dereference
      iter_chain->fence);
      ^
   drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c:1423:6: warning: The scope of the variable 'err' can be reduced. [variableScope]
      int err;
      ^
   drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c:3158:1: warning: Label 'end_user' is not used. [unusedLabel]
      end_user:
      ^
   drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c:1346:21: warning: Clarify calculation precedence for '&' and '?'. [clarifyCalculation]
      len = offset & 7 ? 8 : 5;
      ^
   drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c:1401:24: warning: 'vaddr' is of type 'void *'. When using void pointers in calculations, the behaviour is undefined. [arithOperationsOnVoidPointer]
      clflush_write32(vaddr + offset_in_page(offset),
      ^

vim +2550 drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c

  2510  
  2511  static int
  2512  await_fence_array(struct i915_execbuffer *eb,
  2513                    struct i915_eb_fences *fences,
  2514                    int nfences)
  2515  {
  2516          unsigned int n;
  2517          int err;
  2518  
  2519          for (n = 0; n < nfences; n++) {
  2520                  struct drm_syncobj *syncobj;
  2521                  struct dma_fence_chain *chain;
  2522                  unsigned int flags;
  2523  
  2524                  syncobj = ptr_unpack_bits(fences[n].syncobj, &flags, 2);
  2525  
  2526                  if (!fences[n].dma_fence)
  2527                          continue;
  2528  
  2529                  /*
  2530                   * If we're dealing with a dma-fence-chain, peel the chain by
  2531                   * adding waits on all of the unsignaled fences
  2532                   * the chain points to (dma_fence_chain_for_each
  2533                   * walks those for us).
  2534                   *
  2535                   * This enables us to identify waits on i915 fences and allows
  2536                   * for faster engine-to-engine synchronization using HW
  2537                   * semaphores.
  2538                   */
  2539                  chain = to_dma_fence_chain(fences[n].dma_fence);
  2540                  if (chain) {
  2541                          struct dma_fence *iter;
  2542  
  2543                          dma_fence_chain_for_each(iter, fences[n].dma_fence) {
  2544                                  struct dma_fence_chain *iter_chain =
  2545                                          to_dma_fence_chain(iter);
  2546  
  2547                                  GEM_BUG_ON(!iter_chain);
  2548  
  2549                                  err = i915_request_await_dma_fence(eb->request,
> 2550                                                                     iter_chain->fence);
  2551                                  if (err < 0) {
  2552                                          dma_fence_put(iter);
  2553                                          return err;
  2554                                  }
  2555                          }
  2556  
  2557                  } else {
  2558                          err = i915_request_await_dma_fence(eb->request,
  2559                                                             fences[n].dma_fence);
  2560                          if (err < 0)
  2561                                  return err;
  2562                  }
  2563          }
  2564  
  2565          return 0;
  2566  }
  2567  
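
The nullPointerRedundantCheck warning boils down to GEM_BUG_ON() being
compiled out unless the i915 GEM debug config is enabled, so the
assertion at 2547 gives neither cppcheck nor a release kernel any NULL
guarantee before the dereference at 2550. A reduced, self-contained
sketch of the flagged pattern (toy types, with standard assert()
standing in for GEM_BUG_ON()):

	#include <assert.h>

	struct chain { int fence; };

	/* node may be NULL, like to_dma_fence_chain() on a plain fence */
	static int await(struct chain *node)
	{
		assert(node);		/* like GEM_BUG_ON(!node): gone with NDEBUG */

		return node->fence;	/* possible NULL dereference once the assert is gone */
	}

	int main(void)
	{
		struct chain c = { .fence = 0 };

		return await(&c);	/* await(NULL) would crash in an NDEBUG build */
	}

The fix discussed in the thread removes the assertion and treats a NULL
iter_chain as a legitimate input instead.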

---
0-DAY CI Kernel Test Service, Intel Corporation

[Intel-gfx] [PATCH 3/4] drm/i915: peel dma-fence-chains wait fences

2020-04-10 Thread Venkata Sandeep Dhanalakota
From: Lionel Landwerlin 

To allow faster engine-to-engine synchronization, peel the layer of
dma-fence-chain to expose potential i915 fences so that the
i915-request code can emit HW semaphore wait/signal operations in the
ring, which is faster than waking up the host to submit unblocked
workloads after interrupt notification.

Signed-off-by: Lionel Landwerlin 
---
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 39 +--
 1 file changed, 35 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 8dd651cdca39..e43b76d7e9fd 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -2508,6 +2508,7 @@ await_fence_array(struct i915_execbuffer *eb,
 
for (n = 0; n < nfences; n++) {
struct drm_syncobj *syncobj;
+   struct dma_fence_chain *chain;
unsigned int flags;
 
	syncobj = ptr_unpack_bits(fences[n].syncobj, &flags, 2);
@@ -2515,10 +2516,40 @@ await_fence_array(struct i915_execbuffer *eb,
if (!fences[n].dma_fence)
continue;
 
-   err = i915_request_await_dma_fence(eb->request,
-  fences[n].dma_fence);
-   if (err < 0)
-   return err;
+   /*
+* If we're dealing with a dma-fence-chain, peel the chain by
+* adding waits on all of the unsignaled fences
+* the chain points to (dma_fence_chain_for_each
+* walks those for us).
+*
+* This enables us to identify waits on i915 fences and allows
+* for faster engine-to-engine synchronization using HW
+* semaphores.
+*/
+   chain = to_dma_fence_chain(fences[n].dma_fence);
+   if (chain) {
+   struct dma_fence *iter;
+
+   dma_fence_chain_for_each(iter, fences[n].dma_fence) {
+   struct dma_fence_chain *iter_chain =
+   to_dma_fence_chain(iter);
+
+   GEM_BUG_ON(!iter_chain);
+
+   err = i915_request_await_dma_fence(eb->request,
+  iter_chain->fence);
+   if (err < 0) {
+   dma_fence_put(iter);
+   return err;
+   }
+   }
+
+   } else {
+   err = i915_request_await_dma_fence(eb->request,
+  fences[n].dma_fence);
+   if (err < 0)
+   return err;
+   }
}
 
return 0;
-- 
2.21.0.5.gaeb582a983
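
For context on how a chain whose bottom link is a plain fence arises in
the first place: timeline syncobjs append one dma_fence_chain node per
signal point, and the very first point can sit on top of an ordinary
fence. A minimal sketch of that construction (illustrative only; error
handling trimmed, and the helper name timeline_append is invented for
this example):

	#include <linux/dma-fence.h>
	#include <linux/dma-fence-chain.h>
	#include <linux/slab.h>

	/*
	 * Append @fence to a timeline at @seqno. @prev is the current
	 * chain head; for the very first point it may be a plain
	 * dma_fence (or NULL). dma_fence_chain_init() takes over the
	 * references passed in.
	 */
	static struct dma_fence *timeline_append(struct dma_fence *prev,
						 struct dma_fence *fence,
						 u64 seqno)
	{
		struct dma_fence_chain *node;

		node = kmalloc(sizeof(*node), GFP_KERNEL);
		if (!node)
			return NULL;

		dma_fence_chain_init(node, prev, fence, seqno);
		return &node->base;
	}

Walking such a chain with dma_fence_chain_for_each() eventually hands
back prev itself when prev is not a chain node, which is exactly the
case the GEM_BUG_ON() in this patch trips over.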

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx