Re: [Intel-gfx] [PATCH 01/18] drm/i915/execlists: Set queue priority from secondary port

2018-04-11 Thread Chris Wilson
Quoting Tvrtko Ursulin (2018-04-11 11:47:00)
> 
> On 11/04/2018 11:36, Chris Wilson wrote:
> > Quoting Tvrtko Ursulin (2018-04-11 11:23:01)
> >>
> >> On 10/04/2018 15:56, Chris Wilson wrote:
> >>> Quoting Tvrtko Ursulin (2018-04-10 15:42:07)
> 
>  On 10/04/2018 15:24, Chris Wilson wrote:
> > Quoting Tvrtko Ursulin (2018-04-10 15:05:33)
> >>
> >> On 09/04/2018 12:13, Chris Wilson wrote:
> >>> We can refine our current execlists->queue_priority if we inspect
> >>> ELSP[1] rather than the head of the unsubmitted queue. Currently, we 
> >>> use
> >>> the unsubmitted queue and say that if a subsequent request is more 
> >>> than
> >>> important than the current queue, we will rerun the submission tasklet
> >>
> >> s/more than important/more important/
> >>
> >>> to evaluate the need for preemption. However, we only want to preempt 
> >>> if
> >>> we need to jump ahead of a currently executing request in ELSP. The
> >>> second reason for running the submission tasklet is amalgamate 
> >>> requests
> >>> into the active context on ELSP[0] to avoid a stall when ELSP[0] 
> >>> drains.
> >>> (Though repeatedly amalgamating requests into the active context and
> >>> triggering many lite-restore is off question gain, the goal really is 
> >>> to
> >>> put a context into ELSP[1] to cover the interrupt.) So if instead of
> >>> looking at the head of the queue, we look at the context in ELSP[1] we
> >>> can answer both of the questions more accurately -- we don't need to
> >>> rerun the submission tasklet unless our new request is important 
> >>> enough
> >>> to feed into, at least, ELSP[1].
> >>>
> >>> References: f6322eddaff7 ("drm/i915/preemption: Allow preemption 
> >>> between submission ports")
> >>> Signed-off-by: Chris Wilson 
> >>> Cc: Michał Winiarski 
> >>> Cc: Michel Thierry 
> >>> Cc: Mika Kuoppala 
> >>> Cc: Tvrtko Ursulin 
> >>> ---
> >>>  drivers/gpu/drm/i915/intel_lrc.c | 3 ++-
> >>>  1 file changed, 2 insertions(+), 1 deletion(-)
> >>>
> >>> diff --git a/drivers/gpu/drm/i915/intel_lrc.c 
> >>> b/drivers/gpu/drm/i915/intel_lrc.c
> >>> index 3592288e4696..b47d53d284ca 100644
> >>> --- a/drivers/gpu/drm/i915/intel_lrc.c
> >>> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> >>> @@ -713,7 +713,8 @@ static void execlists_dequeue(struct 
> >>> intel_engine_cs *engine)
> >>>  kmem_cache_free(engine->i915->priorities, p);
> >>>  }
> >>>  done:
> >>> - execlists->queue_priority = rb ? to_priolist(rb)->priority : 
> >>> INT_MIN;
> >>> + execlists->queue_priority =
> >>> + port != execlists->port ? rq_prio(last) : INT_MIN;
> >>
> >> Please bear with my questions since I am not 100% up to date with
> >> preemption.
> >>
> >> Should this be rq_prio("port[1]") for future proofing? Or you really
> >> mean last port? In which case it wouldn't be the highest pending
> >> priority as kerneldoc for execlists->queue_priority says.
> >
> > I mean "secondary" port, so yes using last executing port under the
> > assumption that we grow into a ring of many useless submission ports.
> > The kerneldoc is no more or no less accurate. :)
> 
>  "That we _don't_ grow"?
> >>>
> >>> Hmm, no, when we get "last_port" it slots right into here. I just don't
> >>> have the future facing code to prevent Mika from having to think a
> >>> little. The intent is that when there is an ELSP slot available,
> >>> queue_priority is INT_MIN, when there are none, then rq_prio(last).
> >>>
> >>> My bad for remembering what I want the code to be without remembering
> >>> what the code says.
> >>>
> >> Although I failed to understand what we do in both cases if a new
> >> request arrives of higher prio than the one in ELSP[1]. Looks like
> >> nothing? Wait until GPU moves it to ELSP[0] and preempt then? Is this 
> >> so
> >> because we can't safely or I misread something?
> >
> > This is covered by igt/gem_exec_schedule/preempt-queue*. If we receive a
> > request higher than ELSP[1], we start a preemption as
> >
> > if (need_preempt(engine, last, execlists->queue_priority)) {
> >
> > will evaluate to true. It's either the lowest executing priority (new
> > code), or lowest pending priority (old code). In either case, if the new
> > request is more important than the queue_priority, we preempt.
> 
>  How when "last" is request from ELSP[0]? And also
>  execlists->queue_priority has not yet been updated to reflect the new
>  priority?
> >>>
> >>> When we start executing last on ELSP[0] there will have been another
> >>> execlists_dequeue() where we see an empty slot (or fill it) and update
> >>> queue_priority. If we are down to the last request, it will be set to
> >>> INT_MIN. Upon its completion, it will remain INT_MIN.

Re: [Intel-gfx] [PATCH 01/18] drm/i915/execlists: Set queue priority from secondary port

2018-04-11 Thread Tvrtko Ursulin


On 11/04/2018 11:36, Chris Wilson wrote:

Quoting Tvrtko Ursulin (2018-04-11 11:23:01)


On 10/04/2018 15:56, Chris Wilson wrote:

Quoting Tvrtko Ursulin (2018-04-10 15:42:07)


On 10/04/2018 15:24, Chris Wilson wrote:

Quoting Tvrtko Ursulin (2018-04-10 15:05:33)


On 09/04/2018 12:13, Chris Wilson wrote:

We can refine our current execlists->queue_priority if we inspect
ELSP[1] rather than the head of the unsubmitted queue. Currently, we use
the unsubmitted queue and say that if a subsequent request is more than
important than the current queue, we will rerun the submission tasklet


s/more than important/more important/


to evaluate the need for preemption. However, we only want to preempt if
we need to jump ahead of a currently executing request in ELSP. The
second reason for running the submission tasklet is amalgamate requests
into the active context on ELSP[0] to avoid a stall when ELSP[0] drains.
(Though repeatedly amalgamating requests into the active context and
triggering many lite-restore is off question gain, the goal really is to
put a context into ELSP[1] to cover the interrupt.) So if instead of
looking at the head of the queue, we look at the context in ELSP[1] we
can answer both of the questions more accurately -- we don't need to
rerun the submission tasklet unless our new request is important enough
to feed into, at least, ELSP[1].

References: f6322eddaff7 ("drm/i915/preemption: Allow preemption between submission 
ports")
Signed-off-by: Chris Wilson 
Cc: Michał Winiarski 
Cc: Michel Thierry 
Cc: Mika Kuoppala 
Cc: Tvrtko Ursulin 
---
 drivers/gpu/drm/i915/intel_lrc.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 3592288e4696..b47d53d284ca 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -713,7 +713,8 @@ static void execlists_dequeue(struct intel_engine_cs 
*engine)
 kmem_cache_free(engine->i915->priorities, p);
 }
 done:
- execlists->queue_priority = rb ? to_priolist(rb)->priority : INT_MIN;
+ execlists->queue_priority =
+ port != execlists->port ? rq_prio(last) : INT_MIN;


Please bear with my questions since I am not 100% up to date with
preemption.

Should this be rq_prio("port[1]") for future proofing? Or you really
mean last port? In which case it wouldn't be the highest pending
priority as kerneldoc for execlists->queue_priority says.


I mean "secondary" port, so yes using last executing port under the
assumption that we grow into a ring of many useless submission ports.
The kerneldoc is no more or no less accurate. :)


"That we _don't_ grow"?


Hmm, no, when we get "last_port" it slots right into here. I just don't
have the future facing code to prevent Mika from having to think a
little. The intent is that when there is an ELSP slot available,
queue_priority is INT_MIN, when there are none, then rq_prio(last).

My bad for remembering what I want the code to be without remembering
what the code says.
   

Although I failed to understand what we do in both cases if a new
request arrives of higher prio than the one in ELSP[1]. Looks like
nothing? Wait until GPU moves it to ELSP[0] and preempt then? Is this so
because we can't safely or I misread something?


This is covered by igt/gem_exec_schedule/preempt-queue*. If we receive a
request higher than ELSP[1], we start a preemption as

if (need_preempt(engine, last, execlists->queue_priority)) {

will evaluate to true. It's either the lowest executing priority (new
code), or lowest pending priority (old code). In either case, if the new
request is more important than the queue_priority, we preempt.


How when "last" is request from ELSP[0]? And also
execlists->queue_priority has not yet been updated to reflect the new
priority?


When we start executing last on ELSP[0] there will have been another
execlists_dequeue() where we see an empty slot (or fill it) and update
queue_priority. If we are down to the last request, it will be set to
INT_MIN. Upon its completion, it will remain INT_MIN.


I don't see it yet, let me walk through it:

0. Initial situation, GPU busy with two requests, no outstanding ones:

ELSP[0] = prio 2
ELSP[1] = prio 0

1. queue_priority = 0
2. New execbuf comes along with prio=1.
3. We choose to schedule the tasklet - good.
4. execlists_dequeue runs

last = prio 2

if (need_preempt(engine, last, execlists->queue_priority))

queue_priority = 0, so will not preempt last which is prio 2 - so no
preemption - good.

queue_priority remains at zero since we "goto unlock" with both ports
busy and no preemption is triggered.

5. ELSP[0] completes, new ELSP[0] with prio 0.

(Before we missed the opportunity to replace ELSP[1] with higher prio 1
waiting request before ELSP[0] completed - perhaps we have no choice?
But ok.. carrying on..)


We don't want to interrupt the higher priority task in ELSP[0] to sort

Re: [Intel-gfx] [PATCH 01/18] drm/i915/execlists: Set queue priority from secondary port

2018-04-11 Thread Chris Wilson
Quoting Tvrtko Ursulin (2018-04-11 11:23:01)
> 
> On 10/04/2018 15:56, Chris Wilson wrote:
> > Quoting Tvrtko Ursulin (2018-04-10 15:42:07)
> >>
> >> On 10/04/2018 15:24, Chris Wilson wrote:
> >>> Quoting Tvrtko Ursulin (2018-04-10 15:05:33)
> 
>  On 09/04/2018 12:13, Chris Wilson wrote:
> > We can refine our current execlists->queue_priority if we inspect
> > ELSP[1] rather than the head of the unsubmitted queue. Currently, we use
> > the unsubmitted queue and say that if a subsequent request is more than
> > important than the current queue, we will rerun the submission tasklet
> 
>  s/more than important/more important/
> 
> > to evaluate the need for preemption. However, we only want to preempt if
> > we need to jump ahead of a currently executing request in ELSP. The
> > second reason for running the submission tasklet is amalgamate requests
> > into the active context on ELSP[0] to avoid a stall when ELSP[0] drains.
> > (Though repeatedly amalgamating requests into the active context and
> > triggering many lite-restore is off question gain, the goal really is to
> > put a context into ELSP[1] to cover the interrupt.) So if instead of
> > looking at the head of the queue, we look at the context in ELSP[1] we
> > can answer both of the questions more accurately -- we don't need to
> > rerun the submission tasklet unless our new request is important enough
> > to feed into, at least, ELSP[1].
> >
> > References: f6322eddaff7 ("drm/i915/preemption: Allow preemption 
> > between submission ports")
> > Signed-off-by: Chris Wilson 
> > Cc: Michał Winiarski 
> > Cc: Michel Thierry 
> > Cc: Mika Kuoppala 
> > Cc: Tvrtko Ursulin 
> > ---
> > drivers/gpu/drm/i915/intel_lrc.c | 3 ++-
> > 1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/i915/intel_lrc.c 
> > b/drivers/gpu/drm/i915/intel_lrc.c
> > index 3592288e4696..b47d53d284ca 100644
> > --- a/drivers/gpu/drm/i915/intel_lrc.c
> > +++ b/drivers/gpu/drm/i915/intel_lrc.c
> > @@ -713,7 +713,8 @@ static void execlists_dequeue(struct 
> > intel_engine_cs *engine)
> > kmem_cache_free(engine->i915->priorities, p);
> > }
> > done:
> > - execlists->queue_priority = rb ? to_priolist(rb)->priority : 
> > INT_MIN;
> > + execlists->queue_priority =
> > + port != execlists->port ? rq_prio(last) : INT_MIN;
> 
>  Please bear with my questions since I am not 100% up to date with
>  preemption.
> 
>  Should this be rq_prio("port[1]") for future proofing? Or you really
>  mean last port? In which case it wouldn't be the highest pending
>  priority as kerneldoc for execlists->queue_priority says.
> >>>
> >>> I mean "secondary" port, so yes using last executing port under the
> >>> assumption that we grow into a ring of many useless submission ports.
> >>> The kerneldoc is no more or no less accurate. :)
> >>
> >> "That we _don't_ grow"?
> > 
> > Hmm, no, when we get "last_port" it slots right into here. I just don't
> > have the future facing code to prevent Mika from having to think a
> > little. The intent is that when there is an ELSP slot available,
> > queue_priority is INT_MIN, when there are none, then rq_prio(last).
> > 
> > My bad for remembering what I want the code to be without remembering
> > what the code says.
> >   
>  Although I failed to understand what we do in both cases if a new
>  request arrives of higher prio than the one in ELSP[1]. Looks like
>  nothing? Wait until GPU moves it to ELSP[0] and preempt then? Is this so
>  because we can't safely or I misread something?
> >>>
> >>> This is covered by igt/gem_exec_schedule/preempt-queue*. If we receive a
> >>> request higher than ELSP[1], we start a preemption as
> >>>
> >>>if (need_preempt(engine, last, execlists->queue_priority)) {
> >>>
> >>> will evaluate to true. It's either the lowest executing priority (new
> >>> code), or lowest pending priority (old code). In either case, if the new
> >>> request is more important than the queue_priority, we preempt.
> >>
> >> How when "last" is request from ELSP[0]? And also
> >> execlists->queue_priority has not yet been updated to reflect the new
> >> priority?
> > 
> > When we start executing last on ELSP[0] there will have been another
> > execlists_dequeue() where we see an empty slot (or fill it) and update
> > queue_priority. If we are down to the last request, it will be set to
> > INT_MIN. Upon its completion, it will remain INT_MIN.
> 
> I don't see it yet, let me walk through it:
> 
> 0. Initial situation, GPU busy with two requests, no outstanding ones:
> 
> ELSP[0] = prio 2
> ELSP[1] = prio 0
> 
> 1. queue_priority = 0
> 2. New execbuf comes along with prio=1.
> 3. We choose to schedule the tasklet - good.
> 4. execlists_dequeue runs

Re: [Intel-gfx] [PATCH 01/18] drm/i915/execlists: Set queue priority from secondary port

2018-04-11 Thread Tvrtko Ursulin


On 10/04/2018 15:56, Chris Wilson wrote:

Quoting Tvrtko Ursulin (2018-04-10 15:42:07)


On 10/04/2018 15:24, Chris Wilson wrote:

Quoting Tvrtko Ursulin (2018-04-10 15:05:33)


On 09/04/2018 12:13, Chris Wilson wrote:

We can refine our current execlists->queue_priority if we inspect
ELSP[1] rather than the head of the unsubmitted queue. Currently, we use
the unsubmitted queue and say that if a subsequent request is more than
important than the current queue, we will rerun the submission tasklet


s/more than important/more important/


to evaluate the need for preemption. However, we only want to preempt if
we need to jump ahead of a currently executing request in ELSP. The
second reason for running the submission tasklet is amalgamate requests
into the active context on ELSP[0] to avoid a stall when ELSP[0] drains.
(Though repeatedly amalgamating requests into the active context and
triggering many lite-restore is off question gain, the goal really is to
put a context into ELSP[1] to cover the interrupt.) So if instead of
looking at the head of the queue, we look at the context in ELSP[1] we
can answer both of the questions more accurately -- we don't need to
rerun the submission tasklet unless our new request is important enough
to feed into, at least, ELSP[1].

References: f6322eddaff7 ("drm/i915/preemption: Allow preemption between submission 
ports")
Signed-off-by: Chris Wilson 
Cc: Michał Winiarski 
Cc: Michel Thierry 
Cc: Mika Kuoppala 
Cc: Tvrtko Ursulin 
---
drivers/gpu/drm/i915/intel_lrc.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 3592288e4696..b47d53d284ca 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -713,7 +713,8 @@ static void execlists_dequeue(struct intel_engine_cs 
*engine)
kmem_cache_free(engine->i915->priorities, p);
}
done:
- execlists->queue_priority = rb ? to_priolist(rb)->priority : INT_MIN;
+ execlists->queue_priority =
+ port != execlists->port ? rq_prio(last) : INT_MIN;


Please bear with my questions since I am not 100% up to date with
preemption.

Should this be rq_prio("port[1]") for future proofing? Or you really
mean last port? In which case it wouldn't be the highest pending
priority as kerneldoc for execlists->queue_priority says.


I mean "secondary" port, so yes using last executing port under the
assumption that we grow into a ring of many useless submission ports.
The kerneldoc is no more or no less accurate. :)


"That we _don't_ grow"?


Hmm, no, when we get "last_port" it slots right into here. I just don't
have the future facing code to prevent Mika from having to think a
little. The intent is that when there is an ELSP slot available,
queue_priority is INT_MIN, when there are none, then rq_prio(last).

My bad for remembering what I want the code to be without remembering
what the code says.
  

Although I failed to understand what we do in both cases if a new
request arrives of higher prio than the one in ELSP[1]. Looks like
nothing? Wait until GPU moves it to ELSP[0] and preempt then? Is this so
because we can't safely or I misread something?


This is covered by igt/gem_exec_schedule/preempt-queue*. If we receive a
request higher than ELSP[1], we start a preemption as

   if (need_preempt(engine, last, execlists->queue_priority)) {

will evaluate to true. It's either the lowest executing priority (new
code), or lowest pending priority (old code). In either case, if the new
request is more important than the queue_priority, we preempt.


How when "last" is request from ELSP[0]? And also
execlists->queue_priority has not yet been updated to reflect the new
priority?


When we start executing last on ELSP[0] there will have been another
execlists_dequeue() where we see an empty slot (or fill it) and update
queue_priority. If we are down to the last request, it will be set to
INT_MIN. Upon its completion, it will remain INT_MIN.


I don't see it yet, let me walk through it:

0. Initial situation, GPU busy with two requests, no outstanding ones:

ELSP[0] = prio 2
ELSP[1] = prio 0

1. queue_priority = 0
2. New execbuf comes along with prio=1.
3. We choose to schedule the tasklet - good.
4. execlists_dequeue runs

last = prio 2

if (need_preempt(engine, last, execlists->queue_priority))

queue_priority = 0, so will not preempt last which is prio 2 - so no 
preemption - good.


queue_priority remains at zero since we "goto unlock" with both ports 
busy and no preemption is triggered.


5. ELSP[0] completes, new ELSP[0] with prio 0.

(Before we missed the opportunity to replace ELSP[1] with higher prio 1 
waiting request before ELSP[0] completed - perhaps we have no choice? 
But ok.. carrying on..)


6. execlists_dequeue runs

last = prio 0

if (need_preempt(engine, last, execlists->queue_priority))

queue_priority = 0, so again preemption not triggered.
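The walkthrough above can be replayed against a toy model of the checks involved. This is an illustrative sketch only: queue_priority(), kick_tasklet() and need_preempt() below are simplified stand-ins for the driver logic described in the thread, not the real i915 functions.

```c
#include <assert.h>
#include <limits.h>

/* New rule: queue_priority is rq_prio(last) when every port is busy,
 * INT_MIN while a port is still free. */
static int queue_priority(int ports_busy, int n_ports, int last_prio)
{
	return ports_busy == n_ports ? last_prio : INT_MIN;
}

/* i915_request_add side: schedule the submission tasklet only if the
 * new request could feed into at least the last port. */
static int kick_tasklet(int new_prio, int qprio)
{
	return new_prio > qprio;
}

/* execlists_dequeue side: preempt only if the queue outranks the
 * request currently executing in ELSP[0]. */
static int need_preempt(int elsp0_prio, int qprio)
{
	return qprio > elsp0_prio;
}
```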

Perhaps 

Re: [Intel-gfx] [PATCH 01/18] drm/i915/execlists: Set queue priority from secondary port

2018-04-10 Thread Chris Wilson
Quoting Chris Wilson (2018-04-10 15:56:03)
> Quoting Tvrtko Ursulin (2018-04-10 15:42:07)
> > 
> > On 10/04/2018 15:24, Chris Wilson wrote:
> > > Quoting Tvrtko Ursulin (2018-04-10 15:05:33)
> > >> Although I failed to understand what we do in both cases if a new
> > >> request arrives of higher prio than the one in ELSP[1]. Looks like
> > >> nothing? Wait until GPU moves it to ELSP[0] and preempt then? Is this so
> > >> because we can't safely or I misread something?
> > > 
> > > This is covered by igt/gem_exec_schedule/preempt-queue*. If we receive a
> > > request higher than ELSP[1], we start a preemption as
> > > 
> > >   if (need_preempt(engine, last, execlists->queue_priority)) {
> > > 
> > > will evaluate to true. It's either the lowest executing priority (new
> > > code), or lowest pending priority (old code). In either case, if the new
> > > request is more important than the queue_priority, we preempt.
> > 
> > How when "last" is request from ELSP[0]? And also 
> > execlists->queue_priority has not yet been updated to reflect the new 
> > priority?
> 
> When we start executing last on ELSP[0] there will have been another
> execlists_dequeue() where we see an empty slot (or fill it) and update
> queue_priority. If we are down to the last request, it will be set to
> INT_MIN. Upon its completion, it will remain INT_MIN.
> 
> > Then there is also "if (port_count(port0)) goto unlock;" suggesting that 
> > if there were any appends to ELSP[0] we will also fail to act in this 
> > situation?
> 
> If we only write into ELSP[0], then ELSP[1] remains empty and the
> queue_priority still needs to be INT_MIN so that we service any new
> i915_request_add immediately.
>  
> > > We won't evaluate preemption if we are still awaiting the HWACK from the
> > > last ELSP write, or if we are still preempting. In both of those cases,
> > > we expect to receive an interrupt promptly, upon which we then redo our
> > > evaluations.
> > > 
> > >> Also, don't you need to manage execlists->queue_priority after CSB
> > >> processing now? So that it correctly reflects the priority of request in
> > >> ELSP[1] after ELSP[0] gets completed? It seems that without it would get
> > >> stuck at the previous value and then submission could decide to skip
> > >> scheduling the tasklet if new priority is lower than what was in ELSP[1]
> > >> before, and so would fail to fill ELSP[1].
> > > 
> > > Yes, that is also done here. Since we are always looking one request
> > > ahead, we either update the priority based on the queue following the
> > > resubmission on interrupt, or it is left as INT_MIN on completion.
> > > Indeed, making sure we reset back to INT_MIN is essential so that we
> > > don't miss any future submissions from idle.
> > 
> > Okay I see it - because execlists_dequeue is called and runs to the 
> > queue_priority update bit even when there is nothing in the queue.
> 
> Phew, I can get away from having to draw ascii diagrams. I'll leave that
> to Mika as he figures out how to hook up N ports ;)

/*
 * Here be a bit of magic! Or sleight-of-hand, whichever you prefer.
 *
 * We choose queue_priority such that if we add a request of greater
 * priority than this, we kick the submission tasklet to decide on
 * the right order of submitting the requests to hardware. We must
 * also be prepared to reorder requests as they are in-flight on the
 * HW. We derive the queue_priority then as the first "hole" in
 * the HW submission ports and if there are no available slots,
 * it is the priority of the lowest executing request, i.e. the last one.
 */
execlists->queue_priority =
port != execlists->port ? rq_prio(last) : INT_MIN;

Does that help, or do I need ASCII art?
-Chris
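The "first hole" rule in that comment generalizes directly to the N-port ring mentioned earlier in the thread. A minimal sketch under invented names (PORT_EMPTY and the plain priority array are illustrative, not the driver's port representation):

```c
#include <limits.h>
#include <stddef.h>

#define PORT_EMPTY INT_MAX	/* invented sentinel for an unoccupied port */

/* queue_priority as described in the comment: INT_MIN while any
 * submission port is still a "hole", otherwise the priority of the
 * lowest executing request, i.e. the one in the last port. */
static int derive_queue_priority(const int *port_prio, size_t n_ports)
{
	size_t i;

	for (i = 0; i < n_ports; i++)
		if (port_prio[i] == PORT_EMPTY)
			return INT_MIN;	/* a free slot: accept anything */
	return port_prio[n_ports - 1];
}
```

With two ports this reduces exactly to `port != execlists->port ? rq_prio(last) : INT_MIN` from the patch.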
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 01/18] drm/i915/execlists: Set queue priority from secondary port

2018-04-10 Thread Chris Wilson
Quoting Tvrtko Ursulin (2018-04-10 15:42:07)
> 
> On 10/04/2018 15:24, Chris Wilson wrote:
> > Quoting Tvrtko Ursulin (2018-04-10 15:05:33)
> >>
> >> On 09/04/2018 12:13, Chris Wilson wrote:
> >>> We can refine our current execlists->queue_priority if we inspect
> >>> ELSP[1] rather than the head of the unsubmitted queue. Currently, we use
> >>> the unsubmitted queue and say that if a subsequent request is more than
> >>> important than the current queue, we will rerun the submission tasklet
> >>
> >> s/more than important/more important/
> >>
> >>> to evaluate the need for preemption. However, we only want to preempt if
> >>> we need to jump ahead of a currently executing request in ELSP. The
> >>> second reason for running the submission tasklet is amalgamate requests
> >>> into the active context on ELSP[0] to avoid a stall when ELSP[0] drains.
> >>> (Though repeatedly amalgamating requests into the active context and
> >>> triggering many lite-restore is off question gain, the goal really is to
> >>> put a context into ELSP[1] to cover the interrupt.) So if instead of
> >>> looking at the head of the queue, we look at the context in ELSP[1] we
> >>> can answer both of the questions more accurately -- we don't need to
> >>> rerun the submission tasklet unless our new request is important enough
> >>> to feed into, at least, ELSP[1].
> >>>
> >>> References: f6322eddaff7 ("drm/i915/preemption: Allow preemption between 
> >>> submission ports")
> >>> Signed-off-by: Chris Wilson 
> >>> Cc: Michał Winiarski 
> >>> Cc: Michel Thierry 
> >>> Cc: Mika Kuoppala 
> >>> Cc: Tvrtko Ursulin 
> >>> ---
> >>>drivers/gpu/drm/i915/intel_lrc.c | 3 ++-
> >>>1 file changed, 2 insertions(+), 1 deletion(-)
> >>>
> >>> diff --git a/drivers/gpu/drm/i915/intel_lrc.c 
> >>> b/drivers/gpu/drm/i915/intel_lrc.c
> >>> index 3592288e4696..b47d53d284ca 100644
> >>> --- a/drivers/gpu/drm/i915/intel_lrc.c
> >>> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> >>> @@ -713,7 +713,8 @@ static void execlists_dequeue(struct intel_engine_cs 
> >>> *engine)
> >>>kmem_cache_free(engine->i915->priorities, p);
> >>>}
> >>>done:
> >>> - execlists->queue_priority = rb ? to_priolist(rb)->priority : 
> >>> INT_MIN;
> >>> + execlists->queue_priority =
> >>> + port != execlists->port ? rq_prio(last) : INT_MIN;
> >>
> >> Please bear with my questions since I am not 100% up to date with
> >> preemption.
> >>
> >> Should this be rq_prio("port[1]") for future proofing? Or you really
> >> mean last port? In which case it wouldn't be the highest pending
> >> priority as kerneldoc for execlists->queue_priority says.
> > 
> > I mean "secondary" port, so yes using last executing port under the
> > assumption that we grow into a ring of many useless submission ports.
> > The kerneldoc is no more or no less accurate. :)
> 
> "That we _don't_ grow"?

Hmm, no, when we get "last_port" it slots right into here. I just don't
have the future facing code to prevent Mika from having to think a
little. The intent is that when there is an ELSP slot available,
queue_priority is INT_MIN, when there are none, then rq_prio(last).

My bad for remembering what I want the code to be without remembering
what the code says.
 
> >> Although I failed to understand what we do in both cases if a new
> >> request arrives of higher prio than the one in ELSP[1]. Looks like
> >> nothing? Wait until GPU moves it to ELSP[0] and preempt then? Is this so
> >> because we can't safely or I misread something?
> > 
> > This is covered by igt/gem_exec_schedule/preempt-queue*. If we receive a
> > request higher than ELSP[1], we start a preemption as
> > 
> >   if (need_preempt(engine, last, execlists->queue_priority)) {
> > 
> > will evaluate to true. It's either the lowest executing priority (new
> > code), or lowest pending priority (old code). In either case, if the new
> > request is more important than the queue_priority, we preempt.
> 
> How when "last" is request from ELSP[0]? And also 
> execlists->queue_priority has not yet been updated to reflect the new 
> priority?

When we start executing last on ELSP[0] there will have been another
execlists_dequeue() where we see an empty slot (or fill it) and update
queue_priority. If we are down to the last request, it will be set to
INT_MIN. Upon its completion, it will remain INT_MIN.

> Then there is also "if (port_count(port0)) goto unlock;" suggesting that 
> if there were any appends to ELSP[0] we will also fail to act in this 
> situation?

If we only write into ELSP[0], then ELSP[1] remains empty and the
queue_priority still needs to be INT_MIN so that we service any new
i915_request_add immediately.
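The invariant described here, where a free ELSP slot forces queue_priority to INT_MIN so that any new i915_request_add is serviced immediately, and it stays INT_MIN once the engine idles, can be sketched with a toy state machine (all names invented for illustration; not the driver code):

```c
#include <assert.h>
#include <limits.h>

struct toy_engine {
	int elsp[2];		/* priorities of the two submitted requests */
	int occupied[2];	/* nonzero if the slot holds a request */
	int queue_priority;
};

/* Recompute queue_priority at the end of a (toy) execlists_dequeue. */
static void toy_dequeue(struct toy_engine *e)
{
	if (e->occupied[0] && e->occupied[1])
		e->queue_priority = e->elsp[1];	/* rq_prio(last) */
	else
		e->queue_priority = INT_MIN;	/* a slot is free */
}

/* Toy i915_request_add: nonzero means the tasklet would be kicked. */
static int toy_request_add(const struct toy_engine *e, int prio)
{
	return prio > e->queue_priority;
}
```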
 
> > We won't evaluate preemption if we are still awaiting the HWACK from the
> > last ELSP write, or if we are still preempting. In both of those cases,
> > we expect to receive an interrupt promptly, upon which we then redo our
> > evaluations.
> > 
> >> Also, don't you need to manage execlists->queue_priority after CSB
> >> processing now?

Re: [Intel-gfx] [PATCH 01/18] drm/i915/execlists: Set queue priority from secondary port

2018-04-10 Thread Tvrtko Ursulin


On 10/04/2018 15:24, Chris Wilson wrote:

Quoting Tvrtko Ursulin (2018-04-10 15:05:33)


On 09/04/2018 12:13, Chris Wilson wrote:

We can refine our current execlists->queue_priority if we inspect
ELSP[1] rather than the head of the unsubmitted queue. Currently, we use
the unsubmitted queue and say that if a subsequent request is more than
important than the current queue, we will rerun the submission tasklet


s/more than important/more important/


to evaluate the need for preemption. However, we only want to preempt if
we need to jump ahead of a currently executing request in ELSP. The
second reason for running the submission tasklet is amalgamate requests
into the active context on ELSP[0] to avoid a stall when ELSP[0] drains.
(Though repeatedly amalgamating requests into the active context and
triggering many lite-restore is off question gain, the goal really is to
put a context into ELSP[1] to cover the interrupt.) So if instead of
looking at the head of the queue, we look at the context in ELSP[1] we
can answer both of the questions more accurately -- we don't need to
rerun the submission tasklet unless our new request is important enough
to feed into, at least, ELSP[1].

References: f6322eddaff7 ("drm/i915/preemption: Allow preemption between submission 
ports")
Signed-off-by: Chris Wilson 
Cc: Michał Winiarski 
Cc: Michel Thierry 
Cc: Mika Kuoppala 
Cc: Tvrtko Ursulin 
---
   drivers/gpu/drm/i915/intel_lrc.c | 3 ++-
   1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 3592288e4696..b47d53d284ca 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -713,7 +713,8 @@ static void execlists_dequeue(struct intel_engine_cs 
*engine)
   kmem_cache_free(engine->i915->priorities, p);
   }
   done:
- execlists->queue_priority = rb ? to_priolist(rb)->priority : INT_MIN;
+ execlists->queue_priority =
+ port != execlists->port ? rq_prio(last) : INT_MIN;


Please bear with my questions since I am not 100% up to date with
preemption.

Should this be rq_prio("port[1]") for future proofing? Or you really
mean last port? In which case it wouldn't be the highest pending
priority as kerneldoc for execlists->queue_priority says.


I mean "secondary" port, so yes using last executing port under the
assumption that we grow into a ring of many useless submission ports.
The kerneldoc is no more or no less accurate. :)


"That we _don't_ grow"?




So this patch changes the meaning of "pending". From pending == "not
submitted to ELSP" to pending == "not submitted to ELSP[0]". Which seems
to make sense, although it is not the easiest job to figure out the
consequences.


Yes.
  

It even feels like a bugfix since it prevents tasklet scheduling when
all ports are filled with higher priority requests than the new one.


It won't fix any bugs, since we just reduce the number of kicks. Kicking
and evaluating we have nothing to do is just wasted work. So yes I do
agree that it is a bug fix, just not enough to merit a Fixes.


Yeah that's fine.

  

Although I failed to understand what we do in both cases if a new
request arrives of higher prio than the one in ELSP[1]. Looks like
nothing? Wait until GPU moves it to ELSP[0] and preempt then? Is this so
because we can't safely or I misread something?


This is covered by igt/gem_exec_schedule/preempt-queue*. If we receive a
request higher than ELSP[1], we start a preemption as

if (need_preempt(engine, last, execlists->queue_priority)) {

will evaluate to true. It's either the lowest executing priority (new
code), or lowest pending priority (old code). In either case, if the new
request is more important than the queue_priority, we preempt.


How, when "last" is the request from ELSP[0]? And also 
execlists->queue_priority has not yet been updated to reflect the new 
priority?


Then there is also "if (port_count(port0)) goto unlock;" suggesting that 
if there were any appends to ELSP[0] we will also fail to act in this 
situation?



We won't evaluate preemption if we are still awaiting the HWACK from the
last ELSP write, or if we are still preempting. In both of those cases,
we expect to receive an interrupt promptly, upon which we then redo our
evaluations.


Also, don't you need to manage execlists->queue_priority after CSB
processing now? So that it correctly reflects the priority of the request
in ELSP[1] after ELSP[0] completes? It seems that without that it would
get stuck at the previous value, and then submission could decide to skip
scheduling the tasklet if the new priority is lower than what was in
ELSP[1] before, and so would fail to fill ELSP[1].


Yes, that is also done here. Since we are always looking one request
ahead, we either update the priority based on the queue following the
resubmission on interrupt, or it is left as INT_MIN on completion.
Indeed, making sure we reset back to INT_MIN is essential so that we
don't miss any future submissions from idle.

We can add GEM_BUG_ON(queue_priority != INT_MIN) in engines_park to
check this condition.
-Chris
