Re: [PATCH] drm/sced: Add FIFO policy for scheduler rq

2022-08-25 Thread Andrey Grodzovsky



On 2022-08-24 22:29, Luben Tuikov wrote:

Inlined:

On 2022-08-24 12:21, Andrey Grodzovsky wrote:

On 2022-08-23 17:37, Luben Tuikov wrote:

On 2022-08-23 14:57, Andrey Grodzovsky wrote:

On 2022-08-23 14:30, Luben Tuikov wrote:


On 2022-08-23 14:13, Andrey Grodzovsky wrote:

On 2022-08-23 12:58, Luben Tuikov wrote:

Inlined:

On 2022-08-22 16:09, Andrey Grodzovsky wrote:

Poblem: Given many entities competing for same rq on

^Problem


same scheduler an uncceptabliy long wait time for some

^unacceptably


jobs waiting stuck in rq before being picked up are
observed (seen using  GPUVis).
The issue is due to Round Robin policy used by scheduler
to pick up the next entity for execution. Under stress
of many entities and long job queus within entity some

^queues


jobs could be stack for very long time in it's entity's
queue before being popped from the queue and executed
while for other entites with samller job queues a job

^entities; smaller


might execute ealier even though that job arrived later

^earlier


then the job in the long queue.

Fix:
Add FIFO selection policy to entites in RQ, chose next enitity
on rq in such order that if job on one entity arrived
ealrier then job on another entity the first job will start
executing ealier regardless of the length of the entity's job
queue.

Signed-off-by: Andrey Grodzovsky 
Tested-by: Li Yunxiang (Teddy) 
---
 drivers/gpu/drm/scheduler/sched_entity.c |  2 +
 drivers/gpu/drm/scheduler/sched_main.c   | 65 ++--
 include/drm/gpu_scheduler.h  |  8 +++
 3 files changed, 71 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/sched_entity.c 
b/drivers/gpu/drm/scheduler/sched_entity.c
index 6b25b2f4f5a3..3bb7f69306ef 100644
--- a/drivers/gpu/drm/scheduler/sched_entity.c
+++ b/drivers/gpu/drm/scheduler/sched_entity.c
@@ -507,6 +507,8 @@ void drm_sched_entity_push_job(struct drm_sched_job 
*sched_job)
atomic_inc(entity->rq->sched->score);
WRITE_ONCE(entity->last_user, current->group_leader);
first = spsc_queue_push(&entity->job_queue, &sched_job->queue_node);
+   sched_job->submit_ts = ktime_get();
+
 
 	/* first job wakes up scheduler */

if (first) {
diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
b/drivers/gpu/drm/scheduler/sched_main.c
index 68317d3a7a27..c123aa120d06 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -59,6 +59,19 @@
 #define CREATE_TRACE_POINTS
 #include "gpu_scheduler_trace.h"
 
+

+
+int drm_sched_policy = -1;
+
+/**
+ * DOC: sched_policy (int)
+ * Used to override default entites scheduling policy in a run queue.
+ */
+MODULE_PARM_DESC(sched_policy,
+   "specify schedule policy for entites on a runqueue (-1 = 
auto(default) value, 0 = Round Robin,1  = use FIFO");
+module_param_named(sched_policy, drm_sched_policy, int, 0444);

As per Christian's comments, you can drop the "auto" and perhaps leave one as 
the default,
say the RR.

I do think it is beneficial to have a module parameter control the scheduling 
policy, as shown above.
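
(For illustration only - not part of the posted patch - dropping the "auto" value
and making Round Robin the implicit default could look roughly like this:)

int drm_sched_policy = 0;

/**
 * DOC: sched_policy (int)
 * Used to override the default entity scheduling policy in a run queue.
 */
MODULE_PARM_DESC(sched_policy,
	"specify schedule policy for entities on a runqueue (0 = Round Robin (default), 1 = FIFO)");
module_param_named(sched_policy, drm_sched_policy, int, 0444);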

Christian is not against it, just against adding 'auto' here - like the
default.

Exactly what I said.

Also, I still think an O(1) scheduling (picking next to run) should be
what we strive for in such a FIFO patch implementation.
A FIFO mechanism is by its nature an O(1) mechanism for picking the next
element.

Regards,
Luben

The only solution I see for this now is keeping a global per-rq job
list parallel to the SPSC queue per entity - we use this list when we switch
to FIFO scheduling; we can even start building it ONLY when we switch
to FIFO, building it gradually as more jobs come. Do you have another solution
in mind?

The idea is to "sort" on insertion, not on picking the next one to run.

cont'd below:


Andrey


+
+
 #define to_drm_sched_job(sched_job)\
container_of((sched_job), struct drm_sched_job, queue_node)
 
@@ -120,14 +133,16 @@ void drm_sched_rq_remove_entity(struct drm_sched_rq *rq,

 }
 
 /**

- * drm_sched_rq_select_entity - Select an entity which could provide a job to 
run
+ * drm_sched_rq_select_entity_rr - Select an entity which could provide a job 
to run
  *
  * @rq: scheduler run queue to check.
  *
- * Try to find a ready entity, returns NULL if none found.
+ * Try to find a ready entity, in round robin manner.
+ *
+ * Returns NULL if none found.
  */
 static struct drm_sched_entity *
-drm_sched_rq_select_entity(struct drm_sched_rq *rq)
+drm_sched_rq_select_entity_rr(struct drm_sched_rq *rq)
 {
struct drm_sched_entity *entity;
 
@@ -163,6 +178,45 @@ drm_sched_rq_select_entity(struct drm_sched_rq *rq)

return NULL;
 }
 
+/**

+ * drm_sched_rq_select_entity_fifo - Select an entity which could provide a 
job to run
+ *
+ * @rq: scheduler run queue to check.
+ *
+ * Try to find a ready entity, based on FIFO order 

Re: [PATCH] drm/sced: Add FIFO policy for scheduler rq

2022-08-25 Thread Andrey Grodzovsky


On 2022-08-23 17:37, Luben Tuikov wrote:


On 2022-08-23 14:57, Andrey Grodzovsky wrote:

On 2022-08-23 14:30, Luben Tuikov wrote:


On 2022-08-23 14:13, Andrey Grodzovsky wrote:

On 2022-08-23 12:58, Luben Tuikov wrote:

Inlined:

On 2022-08-22 16:09, Andrey Grodzovsky wrote:

Poblem: Given many entities competing for same rq on

^Problem


same scheduler an uncceptabliy long wait time for some

^unacceptably


jobs waiting stuck in rq before being picked up are
observed (seen using  GPUVis).
The issue is due to Round Robin policy used by scheduler
to pick up the next entity for execution. Under stress
of many entities and long job queus within entity some

^queues


jobs could be stack for very long time in it's entity's
queue before being popped from the queue and executed
while for other entites with samller job queues a job

^entities; smaller


might execute ealier even though that job arrived later

^earlier


then the job in the long queue.

Fix:
Add FIFO selection policy to entites in RQ, chose next enitity
on rq in such order that if job on one entity arrived
ealrier then job on another entity the first job will start
executing ealier regardless of the length of the entity's job
queue.

Signed-off-by: Andrey Grodzovsky 
Tested-by: Li Yunxiang (Teddy) 
---
drivers/gpu/drm/scheduler/sched_entity.c |  2 +
drivers/gpu/drm/scheduler/sched_main.c   | 65 ++--
include/drm/gpu_scheduler.h  |  8 +++
3 files changed, 71 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/sched_entity.c 
b/drivers/gpu/drm/scheduler/sched_entity.c
index 6b25b2f4f5a3..3bb7f69306ef 100644
--- a/drivers/gpu/drm/scheduler/sched_entity.c
+++ b/drivers/gpu/drm/scheduler/sched_entity.c
@@ -507,6 +507,8 @@ void drm_sched_entity_push_job(struct drm_sched_job 
*sched_job)
atomic_inc(entity->rq->sched->score);
WRITE_ONCE(entity->last_user, current->group_leader);
first = spsc_queue_push(&entity->job_queue, &sched_job->queue_node);
+   sched_job->submit_ts = ktime_get();
+

	/* first job wakes up scheduler */

if (first) {
diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
b/drivers/gpu/drm/scheduler/sched_main.c
index 68317d3a7a27..c123aa120d06 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -59,6 +59,19 @@
#define CREATE_TRACE_POINTS
#include "gpu_scheduler_trace.h"

+

+
+int drm_sched_policy = -1;
+
+/**
+ * DOC: sched_policy (int)
+ * Used to override default entites scheduling policy in a run queue.
+ */
+MODULE_PARM_DESC(sched_policy,
+   "specify schedule policy for entites on a runqueue (-1 = 
auto(default) value, 0 = Round Robin,1  = use FIFO");
+module_param_named(sched_policy, drm_sched_policy, int, 0444);

As per Christian's comments, you can drop the "auto" and perhaps leave one as 
the default,
say the RR.

I do think it is beneficial to have a module parameter control the scheduling 
policy, as shown above.

Christian is not against it, just against adding 'auto' here - like the
default.

Exactly what I said.

Also, I still think an O(1) scheduling (picking next to run) should be
what we strive for in such a FIFO patch implementation.
A FIFO mechanism is by its nature an O(1) mechanism for picking the next
element.

Regards,
Luben


The only solution I see for this now is keeping a global per-rq job
list parallel to the SPSC queue per entity - we use this list when we switch
to FIFO scheduling; we can even start building it ONLY when we switch
to FIFO, building it gradually as more jobs come. Do you have another solution
in mind?

The idea is to "sort" on insertion, not on picking the next one to run.

cont'd below:


Andrey


+
+
#define to_drm_sched_job(sched_job) \
container_of((sched_job), struct drm_sched_job, queue_node)

@@ -120,14 +133,16 @@ void drm_sched_rq_remove_entity(struct drm_sched_rq *rq,

}

/**

- * drm_sched_rq_select_entity - Select an entity which could provide a job to 
run
+ * drm_sched_rq_select_entity_rr - Select an entity which could provide a job 
to run
 *
 * @rq: scheduler run queue to check.
 *
- * Try to find a ready entity, returns NULL if none found.
+ * Try to find a ready entity, in round robin manner.
+ *
+ * Returns NULL if none found.
 */
static struct drm_sched_entity *
-drm_sched_rq_select_entity(struct drm_sched_rq *rq)
+drm_sched_rq_select_entity_rr(struct drm_sched_rq *rq)
{
struct drm_sched_entity *entity;

@@ -163,6 +178,45 @@ drm_sched_rq_select_entity(struct drm_sched_rq *rq)

return NULL;
}

+/**

+ * drm_sched_rq_select_entity_fifo - Select an entity which could provide a 
job to run
+ *
+ * @rq: scheduler run queue to check.
+ *
+ * Try to find a ready entity, based on FIFO order of jobs arrivals.
+ *
+ * Returns NULL if none found.
+ */
+static struct drm_sched_entity *

Re: [PATCH] drm/sced: Add FIFO policy for scheduler rq

2022-08-25 Thread Luben Tuikov
Inlined:

On 2022-08-24 12:21, Andrey Grodzovsky wrote:
> 
> On 2022-08-23 17:37, Luben Tuikov wrote:
>>
>> On 2022-08-23 14:57, Andrey Grodzovsky wrote:
>>> On 2022-08-23 14:30, Luben Tuikov wrote:
>>>
 On 2022-08-23 14:13, Andrey Grodzovsky wrote:
> On 2022-08-23 12:58, Luben Tuikov wrote:
>> Inlined:
>>
>> On 2022-08-22 16:09, Andrey Grodzovsky wrote:
>>> Poblem: Given many entities competing for same rq on
>> ^Problem
>>
>>> same scheduler an uncceptabliy long wait time for some
>> ^unacceptably
>>
>>> jobs waiting stuck in rq before being picked up are
>>> observed (seen using  GPUVis).
>>> The issue is due to Round Robin policy used by scheduler
>>> to pick up the next entity for execution. Under stress
>>> of many entities and long job queus within entity some
>> ^queues
>>
>>> jobs could be stack for very long time in it's entity's
>>> queue before being popped from the queue and executed
>>> while for other entites with samller job queues a job
>> ^entities; smaller
>>
>>> might execute ealier even though that job arrived later
>> ^earlier
>>
>>> then the job in the long queue.
>>>
>>> Fix:
>>> Add FIFO selection policy to entites in RQ, chose next enitity
>>> on rq in such order that if job on one entity arrived
>>> ealrier then job on another entity the first job will start
>>> executing ealier regardless of the length of the entity's job
>>> queue.
>>>
>>> Signed-off-by: Andrey Grodzovsky 
>>> Tested-by: Li Yunxiang (Teddy) 
>>> ---
>>> drivers/gpu/drm/scheduler/sched_entity.c |  2 +
>>> drivers/gpu/drm/scheduler/sched_main.c   | 65 
>>> ++--
>>> include/drm/gpu_scheduler.h  |  8 +++
>>> 3 files changed, 71 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/scheduler/sched_entity.c 
>>> b/drivers/gpu/drm/scheduler/sched_entity.c
>>> index 6b25b2f4f5a3..3bb7f69306ef 100644
>>> --- a/drivers/gpu/drm/scheduler/sched_entity.c
>>> +++ b/drivers/gpu/drm/scheduler/sched_entity.c
>>> @@ -507,6 +507,8 @@ void drm_sched_entity_push_job(struct drm_sched_job 
>>> *sched_job)
>>> atomic_inc(entity->rq->sched->score);
>>> WRITE_ONCE(entity->last_user, current->group_leader);
>>> first = spsc_queue_push(&entity->job_queue, &sched_job->queue_node);
>>> +   sched_job->submit_ts = ktime_get();
>>> +
>>> 
>>> /* first job wakes up scheduler */
>>> if (first) {
>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
>>> b/drivers/gpu/drm/scheduler/sched_main.c
>>> index 68317d3a7a27..c123aa120d06 100644
>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>>> @@ -59,6 +59,19 @@
>>> #define CREATE_TRACE_POINTS
>>> #include "gpu_scheduler_trace.h"
>>> 
>>> +
>>> +
>>> +int drm_sched_policy = -1;
>>> +
>>> +/**
>>> + * DOC: sched_policy (int)
>>> + * Used to override default entites scheduling policy in a run queue.
>>> + */
>>> +MODULE_PARM_DESC(sched_policy,
>>> +   "specify schedule policy for entites on a runqueue (-1 
>>> = auto(default) value, 0 = Round Robin,1  = use FIFO");
>>> +module_param_named(sched_policy, drm_sched_policy, int, 0444);
>> As per Christian's comments, you can drop the "auto" and perhaps leave 
>> one as the default,
>> say the RR.
>>
>> I do think it is beneficial to have a module parameter control the 
>> scheduling policy, as shown above.
> Christian is not against it, just against adding 'auto' here - like the
> default.
 Exactly what I said.

 Also, I still think an O(1) scheduling (picking next to run) should be
 what we strive for in such a FIFO patch implementation.
 A FIFO mechanism is by its nature an O(1) mechanism for picking the next
 element.

 Regards,
 Luben
>>>
>>> The only solution I see for this now is keeping a global per-rq job
>>> list parallel to the SPSC queue per entity - we use this list when we switch
>>> to FIFO scheduling; we can even start building it ONLY when we switch
>>> to FIFO, building it gradually as more jobs come. Do you have another solution
>>> in mind?
>> The idea is to "sort" on insertion, not on picking the next one to run.
>>
>> cont'd below:
>>
>>> Andrey
>>>
>>> +
>>> +
>>> #define to_drm_sched_job(sched_job) \
>>> container_of((sched_job), struct drm_sched_job, 
>>> queue_node)
>>> 
>>> @@ -120,14 +133,16 @@ void drm_sched_rq_remove_entity(struct 
>>> drm_sched_rq *rq,
>>> }
>>> 
>>> /**
>>> - * drm_sched_rq_select_entity - Select an entity which could provide a 
>>> job to run
>>> + * 

Re: [PATCH] drm/sced: Add FIFO policy for scheduler rq

2022-08-24 Thread Andrey Grodzovsky

On 2022-08-24 04:29, Michel Dänzer wrote:


On 2022-08-22 22:09, Andrey Grodzovsky wrote:

Poblem: Given many entities competing for same rq on
same scheduler an uncceptabliy long wait time for some
jobs waiting stuck in rq before being picked up are
observed (seen using  GPUVis).
The issue is due to Round Robin policy used by scheduler
to pick up the next entity for execution. Under stress
of many entities and long job queus within entity some
jobs could be stack for very long time in it's entity's
queue before being popped from the queue and executed
while for other entites with samller job queues a job
might execute ealier even though that job arrived later
then the job in the long queue.

Fix:
Add FIFO selection policy to entites in RQ, chose next enitity
on rq in such order that if job on one entity arrived
ealrier then job on another entity the first job will start
executing ealier regardless of the length of the entity's job
queue.

Instead of ordering based on when jobs are added, might it be possible to order 
them based on when they become ready to run?

Otherwise it seems possible to e.g. submit a large number of inter-dependent 
jobs at once, and they would all run before any jobs from another queue get a 
chance.



While a job is not ready (i.e. still has an unfulfilled
dependency) it will not be chosen to run (see
drm_sched_entity_is_ready). In this scenario, if an earlier job
from entity E1 is not ready to run it will be skipped and a later job
from entity E2 (which is ready) will be chosen to run, so the E1 job is not
blocking the E2 job. The moment the E1 job
does become ready it seems logical to me to let it run ASAP, since by
now it has spent the most time of anyone waiting for execution, and I don't
think it matters that part of this time
was because it waited for its dependency job to complete its run.

Andrey







Re: [PATCH] drm/sced: Add FIFO policy for scheduler rq

2022-08-24 Thread Michel Dänzer
On 2022-08-22 22:09, Andrey Grodzovsky wrote:
> Poblem: Given many entities competing for same rq on
> same scheduler an uncceptabliy long wait time for some
> jobs waiting stuck in rq before being picked up are
> observed (seen using  GPUVis).
> The issue is due to Round Robin policy used by scheduler
> to pick up the next entity for execution. Under stress
> of many entities and long job queus within entity some
> jobs could be stack for very long time in it's entity's
> queue before being popped from the queue and executed
> while for other entites with samller job queues a job
> might execute ealier even though that job arrived later
> then the job in the long queue.
> 
> Fix:
> Add FIFO selection policy to entites in RQ, chose next enitity
> on rq in such order that if job on one entity arrived
> ealrier then job on another entity the first job will start
> executing ealier regardless of the length of the entity's job
> queue.

Instead of ordering based on when jobs are added, might it be possible to order 
them based on when they become ready to run?

Otherwise it seems possible to e.g. submit a large number of inter-dependent 
jobs at once, and they would all run before any jobs from another queue get a 
chance.
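
(Purely as an illustration of this suggestion, not something in the patch: the
FIFO key could be re-stamped when a job becomes runnable rather than when it is
pushed, e.g. from whatever callback fires once the last dependency fence has
signalled - the helper name here is made up:)

/* Hypothetical helper: record the moment the job became runnable so the
 * FIFO selector orders by readiness instead of submission time. */
static void drm_sched_job_stamp_ready(struct drm_sched_job *sched_job)
{
	sched_job->submit_ts = ktime_get();
}

That way a large batch of inter-dependent jobs would re-enter the ordering one
by one as each becomes ready, instead of all of them carrying the timestamp of
the initial burst.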


-- 
Earthling Michel Dänzer|  https://redhat.com
Libre software enthusiast  | Mesa and Xwayland developer



Re: [PATCH] drm/sced: Add FIFO policy for scheduler rq

2022-08-23 Thread Luben Tuikov



On 2022-08-23 14:57, Andrey Grodzovsky wrote:
> On 2022-08-23 14:30, Luben Tuikov wrote:
> 
>>
>> On 2022-08-23 14:13, Andrey Grodzovsky wrote:
>>> On 2022-08-23 12:58, Luben Tuikov wrote:
 Inlined:

 On 2022-08-22 16:09, Andrey Grodzovsky wrote:
> Poblem: Given many entities competing for same rq on
 ^Problem

> same scheduler an uncceptabliy long wait time for some
 ^unacceptably

> jobs waiting stuck in rq before being picked up are
> observed (seen using  GPUVis).
> The issue is due to Round Robin policy used by scheduler
> to pick up the next entity for execution. Under stress
> of many entities and long job queus within entity some
 ^queues

> jobs could be stack for very long time in it's entity's
> queue before being popped from the queue and executed
> while for other entites with samller job queues a job
 ^entities; smaller

> might execute ealier even though that job arrived later
 ^earlier

> then the job in the long queue.
>
> Fix:
> Add FIFO selection policy to entites in RQ, chose next enitity
> on rq in such order that if job on one entity arrived
> ealrier then job on another entity the first job will start
> executing ealier regardless of the length of the entity's job
> queue.
>
> Signed-off-by: Andrey Grodzovsky 
> Tested-by: Li Yunxiang (Teddy) 
> ---
>drivers/gpu/drm/scheduler/sched_entity.c |  2 +
>drivers/gpu/drm/scheduler/sched_main.c   | 65 ++--
>include/drm/gpu_scheduler.h  |  8 +++
>3 files changed, 71 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/scheduler/sched_entity.c 
> b/drivers/gpu/drm/scheduler/sched_entity.c
> index 6b25b2f4f5a3..3bb7f69306ef 100644
> --- a/drivers/gpu/drm/scheduler/sched_entity.c
> +++ b/drivers/gpu/drm/scheduler/sched_entity.c
> @@ -507,6 +507,8 @@ void drm_sched_entity_push_job(struct drm_sched_job 
> *sched_job)
>   atomic_inc(entity->rq->sched->score);
>   WRITE_ONCE(entity->last_user, current->group_leader);
>   first = spsc_queue_push(&entity->job_queue, &sched_job->queue_node);
> + sched_job->submit_ts = ktime_get();
> +
>
>   /* first job wakes up scheduler */
>   if (first) {
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
> b/drivers/gpu/drm/scheduler/sched_main.c
> index 68317d3a7a27..c123aa120d06 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -59,6 +59,19 @@
>#define CREATE_TRACE_POINTS
>#include "gpu_scheduler_trace.h"
>
> +
> +
> +int drm_sched_policy = -1;
> +
> +/**
> + * DOC: sched_policy (int)
> + * Used to override default entites scheduling policy in a run queue.
> + */
> +MODULE_PARM_DESC(sched_policy,
> + "specify schedule policy for entites on a runqueue (-1 = 
> auto(default) value, 0 = Round Robin,1  = use FIFO");
> +module_param_named(sched_policy, drm_sched_policy, int, 0444);
 As per Christian's comments, you can drop the "auto" and perhaps leave one 
 as the default,
 say the RR.

 I do think it is beneficial to have a module parameter control the 
 scheduling policy, as shown above.
>>>
>>> Christian is not against it, just against adding 'auto' here - like the
>>> default.
>> Exactly what I said.
>>
>> Also, I still think an O(1) scheduling (picking next to run) should be
>> what we strive for in such a FIFO patch implementation.
>> A FIFO mechanism is by its nature an O(1) mechanism for picking the next
>> element.
>>
>> Regards,
>> Luben
> 
> 
> The only solution I see for this now is keeping a global per-rq job
> list parallel to the SPSC queue per entity - we use this list when we switch
> to FIFO scheduling; we can even start building it ONLY when we switch
> to FIFO, building it gradually as more jobs come. Do you have another solution
> in mind?

The idea is to "sort" on insertion, not on picking the next one to run.

cont'd below:

> 
> Andrey
> 
>>
>>>
> +
> +
>#define to_drm_sched_job(sched_job)\
>   container_of((sched_job), struct drm_sched_job, 
> queue_node)
>
> @@ -120,14 +133,16 @@ void drm_sched_rq_remove_entity(struct drm_sched_rq 
> *rq,
>}
>
>/**
> - * drm_sched_rq_select_entity - Select an entity which could provide a 
> job to run
> + * drm_sched_rq_select_entity_rr - Select an entity which could provide 
> a job to run
> *
> * @rq: scheduler run queue to check.
> *
> - * Try to find a ready entity, returns NULL if none found.
> + * Try to find a ready entity, in round robin manner.
> + *
> + * Returns NULL if none found.
> */
>   

Re: [PATCH] drm/sced: Add FIFO policy for scheduler rq

2022-08-23 Thread Andrey Grodzovsky

On 2022-08-23 14:30, Luben Tuikov wrote:



On 2022-08-23 14:13, Andrey Grodzovsky wrote:

On 2022-08-23 12:58, Luben Tuikov wrote:

Inlined:

On 2022-08-22 16:09, Andrey Grodzovsky wrote:

Poblem: Given many entities competing for same rq on

^Problem


same scheduler an uncceptabliy long wait time for some

^unacceptably


jobs waiting stuck in rq before being picked up are
observed (seen using  GPUVis).
The issue is due to Round Robin policy used by scheduler
to pick up the next entity for execution. Under stress
of many entities and long job queus within entity some

^queues


jobs could be stack for very long time in it's entity's
queue before being popped from the queue and executed
while for other entites with samller job queues a job

^entities; smaller


might execute ealier even though that job arrived later

^earlier


then the job in the long queue.

Fix:
Add FIFO selection policy to entites in RQ, chose next enitity
on rq in such order that if job on one entity arrived
ealrier then job on another entity the first job will start
executing ealier regardless of the length of the entity's job
queue.

Signed-off-by: Andrey Grodzovsky 
Tested-by: Li Yunxiang (Teddy) 
---
   drivers/gpu/drm/scheduler/sched_entity.c |  2 +
   drivers/gpu/drm/scheduler/sched_main.c   | 65 ++--
   include/drm/gpu_scheduler.h  |  8 +++
   3 files changed, 71 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/sched_entity.c 
b/drivers/gpu/drm/scheduler/sched_entity.c
index 6b25b2f4f5a3..3bb7f69306ef 100644
--- a/drivers/gpu/drm/scheduler/sched_entity.c
+++ b/drivers/gpu/drm/scheduler/sched_entity.c
@@ -507,6 +507,8 @@ void drm_sched_entity_push_job(struct drm_sched_job 
*sched_job)
atomic_inc(entity->rq->sched->score);
WRITE_ONCE(entity->last_user, current->group_leader);
first = spsc_queue_push(&entity->job_queue, &sched_job->queue_node);
+   sched_job->submit_ts = ktime_get();
+
   
   	/* first job wakes up scheduler */

if (first) {
diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
b/drivers/gpu/drm/scheduler/sched_main.c
index 68317d3a7a27..c123aa120d06 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -59,6 +59,19 @@
   #define CREATE_TRACE_POINTS
   #include "gpu_scheduler_trace.h"
   
+

+
+int drm_sched_policy = -1;
+
+/**
+ * DOC: sched_policy (int)
+ * Used to override default entites scheduling policy in a run queue.
+ */
+MODULE_PARM_DESC(sched_policy,
+   "specify schedule policy for entites on a runqueue (-1 = 
auto(default) value, 0 = Round Robin,1  = use FIFO");
+module_param_named(sched_policy, drm_sched_policy, int, 0444);

As per Christian's comments, you can drop the "auto" and perhaps leave one as 
the default,
say the RR.

I do think it is beneficial to have a module parameter control the scheduling 
policy, as shown above.


Christian is not against it, just against adding 'auto' here - like the
default.

Exactly what I said.

Also, I still think an O(1) scheduling (picking next to run) should be
what we strive for in such a FIFO patch implementation.
A FIFO mechanism is by its nature an O(1) mechanism for picking the next
element.

Regards,
Luben



The only solution I see for this now is keeping a global per-rq job
list parallel to the SPSC queue per entity - we use this list when we switch
to FIFO scheduling; we can even start building it ONLY when we switch
to FIFO, building it gradually as more jobs come. Do you have another solution
in mind?

Andrey






+
+
   #define to_drm_sched_job(sched_job)  \
container_of((sched_job), struct drm_sched_job, queue_node)
   
@@ -120,14 +133,16 @@ void drm_sched_rq_remove_entity(struct drm_sched_rq *rq,

   }
   
   /**

- * drm_sched_rq_select_entity - Select an entity which could provide a job to 
run
+ * drm_sched_rq_select_entity_rr - Select an entity which could provide a job 
to run
*
* @rq: scheduler run queue to check.
*
- * Try to find a ready entity, returns NULL if none found.
+ * Try to find a ready entity, in round robin manner.
+ *
+ * Returns NULL if none found.
*/
   static struct drm_sched_entity *
-drm_sched_rq_select_entity(struct drm_sched_rq *rq)
+drm_sched_rq_select_entity_rr(struct drm_sched_rq *rq)
   {
struct drm_sched_entity *entity;
   
@@ -163,6 +178,45 @@ drm_sched_rq_select_entity(struct drm_sched_rq *rq)

return NULL;
   }
   
+/**

+ * drm_sched_rq_select_entity_fifo - Select an entity which could provide a 
job to run
+ *
+ * @rq: scheduler run queue to check.
+ *
+ * Try to find a ready entity, based on FIFO order of jobs arrivals.
+ *
+ * Returns NULL if none found.
+ */
+static struct drm_sched_entity *
+drm_sched_rq_select_entity_fifo(struct drm_sched_rq *rq)
+{
+   struct drm_sched_entity *tmp, *entity = NULL;
+   ktime_t oldest_ts = KTIME_MAX;
+   struct drm_sched_job *sched_job;
+
+   spin_lock(&rq->lock);
+
+   

Re: [PATCH] drm/sced: Add FIFO policy for scheduler rq

2022-08-23 Thread Luben Tuikov



On 2022-08-23 14:13, Andrey Grodzovsky wrote:
> 
> On 2022-08-23 12:58, Luben Tuikov wrote:
>> Inlined:
>>
>> On 2022-08-22 16:09, Andrey Grodzovsky wrote:
>>> Poblem: Given many entities competing for same rq on
>> ^Problem
>>
>>> same scheduler an uncceptabliy long wait time for some
>> ^unacceptably
>>
>>> jobs waiting stuck in rq before being picked up are
>>> observed (seen using  GPUVis).
>>> The issue is due to Round Robin policy used by scheduler
>>> to pick up the next entity for execution. Under stress
>>> of many entities and long job queus within entity some
>> ^queues
>>
>>> jobs could be stack for very long time in it's entity's
>>> queue before being popped from the queue and executed
>>> while for other entites with samller job queues a job
>> ^entities; smaller
>>
>>> might execute ealier even though that job arrived later
>> ^earlier
>>
>>> then the job in the long queue.
>>>
>>> Fix:
>>> Add FIFO selection policy to entites in RQ, chose next enitity
>>> on rq in such order that if job on one entity arrived
>>> ealrier then job on another entity the first job will start
>>> executing ealier regardless of the length of the entity's job
>>> queue.
>>>
>>> Signed-off-by: Andrey Grodzovsky 
>>> Tested-by: Li Yunxiang (Teddy) 
>>> ---
>>>   drivers/gpu/drm/scheduler/sched_entity.c |  2 +
>>>   drivers/gpu/drm/scheduler/sched_main.c   | 65 ++--
>>>   include/drm/gpu_scheduler.h  |  8 +++
>>>   3 files changed, 71 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/scheduler/sched_entity.c 
>>> b/drivers/gpu/drm/scheduler/sched_entity.c
>>> index 6b25b2f4f5a3..3bb7f69306ef 100644
>>> --- a/drivers/gpu/drm/scheduler/sched_entity.c
>>> +++ b/drivers/gpu/drm/scheduler/sched_entity.c
>>> @@ -507,6 +507,8 @@ void drm_sched_entity_push_job(struct drm_sched_job 
>>> *sched_job)
>>> atomic_inc(entity->rq->sched->score);
>>> WRITE_ONCE(entity->last_user, current->group_leader);
>>> first = spsc_queue_push(&entity->job_queue, &sched_job->queue_node);
>>> +   sched_job->submit_ts = ktime_get();
>>> +
>>>   
>>> /* first job wakes up scheduler */
>>> if (first) {
>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
>>> b/drivers/gpu/drm/scheduler/sched_main.c
>>> index 68317d3a7a27..c123aa120d06 100644
>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>>> @@ -59,6 +59,19 @@
>>>   #define CREATE_TRACE_POINTS
>>>   #include "gpu_scheduler_trace.h"
>>>   
>>> +
>>> +
>>> +int drm_sched_policy = -1;
>>> +
>>> +/**
>>> + * DOC: sched_policy (int)
>>> + * Used to override default entites scheduling policy in a run queue.
>>> + */
>>> +MODULE_PARM_DESC(sched_policy,
>>> +   "specify schedule policy for entites on a runqueue (-1 = 
>>> auto(default) value, 0 = Round Robin,1  = use FIFO");
>>> +module_param_named(sched_policy, drm_sched_policy, int, 0444);
>> As per Christian's comments, you can drop the "auto" and perhaps leave one 
>> as the default,
>> say the RR.
>>
>> I do think it is beneficial to have a module parameter control the 
>> scheduling policy, as shown above.
> 
> 
> Christian is not against it, just against adding 'auto' here - like the 
> default.

Exactly what I said.

Also, I still think an O(1) scheduling (picking next to run) should be
what we strive for in such a FIFO patch implementation.
A FIFO mechanism is by its nature an O(1) mechanism for picking the next
element.

Regards,
Luben

> 
> 
>>
>>> +
>>> +
>>>   #define to_drm_sched_job(sched_job)   \
>>> container_of((sched_job), struct drm_sched_job, queue_node)
>>>   
>>> @@ -120,14 +133,16 @@ void drm_sched_rq_remove_entity(struct drm_sched_rq 
>>> *rq,
>>>   }
>>>   
>>>   /**
>>> - * drm_sched_rq_select_entity - Select an entity which could provide a job 
>>> to run
>>> + * drm_sched_rq_select_entity_rr - Select an entity which could provide a 
>>> job to run
>>>*
>>>* @rq: scheduler run queue to check.
>>>*
>>> - * Try to find a ready entity, returns NULL if none found.
>>> + * Try to find a ready entity, in round robin manner.
>>> + *
>>> + * Returns NULL if none found.
>>>*/
>>>   static struct drm_sched_entity *
>>> -drm_sched_rq_select_entity(struct drm_sched_rq *rq)
>>> +drm_sched_rq_select_entity_rr(struct drm_sched_rq *rq)
>>>   {
>>> struct drm_sched_entity *entity;
>>>   
>>> @@ -163,6 +178,45 @@ drm_sched_rq_select_entity(struct drm_sched_rq *rq)
>>> return NULL;
>>>   }
>>>   
>>> +/**
>>> + * drm_sched_rq_select_entity_fifo - Select an entity which could provide 
>>> a job to run
>>> + *
>>> + * @rq: scheduler run queue to check.
>>> + *
>>> + * Try to find a ready entity, based on FIFO order of jobs arrivals.
>>> + *
>>> + * Returns NULL if none found.
>>> + */
>>> +static struct drm_sched_entity *
>>> +drm_sched_rq_select_entity_fifo(struct drm_sched_rq *rq)
>>> +{
>>> +   struct drm_sched_entity *tmp, *entity = NULL;
>>> +   ktime_t oldest_ts = KTIME_MAX;

Re: [PATCH] drm/sced: Add FIFO policy for scheduler rq

2022-08-23 Thread Andrey Grodzovsky



On 2022-08-23 12:58, Luben Tuikov wrote:

Inlined:

On 2022-08-22 16:09, Andrey Grodzovsky wrote:

Poblem: Given many entities competing for same rq on

^Problem


same scheduler an uncceptabliy long wait time for some

^unacceptably


jobs waiting stuck in rq before being picked up are
observed (seen using  GPUVis).
The issue is due to Round Robin policy used by scheduler
to pick up the next entity for execution. Under stress
of many entities and long job queus within entity some

^queues


jobs could be stack for very long time in it's entity's
queue before being popped from the queue and executed
while for other entites with samller job queues a job

^entities; smaller


might execute ealier even though that job arrived later

^earlier


then the job in the long queue.

Fix:
Add FIFO selection policy to entites in RQ, chose next enitity
on rq in such order that if job on one entity arrived
ealrier then job on another entity the first job will start
executing ealier regardless of the length of the entity's job
queue.

Signed-off-by: Andrey Grodzovsky 
Tested-by: Li Yunxiang (Teddy) 
---
  drivers/gpu/drm/scheduler/sched_entity.c |  2 +
  drivers/gpu/drm/scheduler/sched_main.c   | 65 ++--
  include/drm/gpu_scheduler.h  |  8 +++
  3 files changed, 71 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/sched_entity.c 
b/drivers/gpu/drm/scheduler/sched_entity.c
index 6b25b2f4f5a3..3bb7f69306ef 100644
--- a/drivers/gpu/drm/scheduler/sched_entity.c
+++ b/drivers/gpu/drm/scheduler/sched_entity.c
@@ -507,6 +507,8 @@ void drm_sched_entity_push_job(struct drm_sched_job 
*sched_job)
atomic_inc(entity->rq->sched->score);
WRITE_ONCE(entity->last_user, current->group_leader);
first = spsc_queue_push(&entity->job_queue, &sched_job->queue_node);
+   sched_job->submit_ts = ktime_get();
+
  
  	/* first job wakes up scheduler */

if (first) {
diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
b/drivers/gpu/drm/scheduler/sched_main.c
index 68317d3a7a27..c123aa120d06 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -59,6 +59,19 @@
  #define CREATE_TRACE_POINTS
  #include "gpu_scheduler_trace.h"
  
+

+
+int drm_sched_policy = -1;
+
+/**
+ * DOC: sched_policy (int)
+ * Used to override default entites scheduling policy in a run queue.
+ */
+MODULE_PARM_DESC(sched_policy,
+   "specify schedule policy for entites on a runqueue (-1 = 
auto(default) value, 0 = Round Robin,1  = use FIFO");
+module_param_named(sched_policy, drm_sched_policy, int, 0444);

As per Christian's comments, you can drop the "auto" and perhaps leave one as 
the default,
say the RR.

I do think it is beneficial to have a module parameter control the scheduling 
policy, as shown above.



Christian is not against it, just against adding 'auto' here - like the 
default.






+
+
  #define to_drm_sched_job(sched_job)   \
container_of((sched_job), struct drm_sched_job, queue_node)
  
@@ -120,14 +133,16 @@ void drm_sched_rq_remove_entity(struct drm_sched_rq *rq,

  }
  
  /**

- * drm_sched_rq_select_entity - Select an entity which could provide a job to 
run
+ * drm_sched_rq_select_entity_rr - Select an entity which could provide a job 
to run
   *
   * @rq: scheduler run queue to check.
   *
- * Try to find a ready entity, returns NULL if none found.
+ * Try to find a ready entity, in round robin manner.
+ *
+ * Returns NULL if none found.
   */
  static struct drm_sched_entity *
-drm_sched_rq_select_entity(struct drm_sched_rq *rq)
+drm_sched_rq_select_entity_rr(struct drm_sched_rq *rq)
  {
struct drm_sched_entity *entity;
  
@@ -163,6 +178,45 @@ drm_sched_rq_select_entity(struct drm_sched_rq *rq)

return NULL;
  }
  
+/**

+ * drm_sched_rq_select_entity_fifo - Select an entity which could provide a 
job to run
+ *
+ * @rq: scheduler run queue to check.
+ *
+ * Try to find a ready entity, based on FIFO order of jobs arrivals.
+ *
+ * Returns NULL if none found.
+ */
+static struct drm_sched_entity *
+drm_sched_rq_select_entity_fifo(struct drm_sched_rq *rq)
+{
+   struct drm_sched_entity *tmp, *entity = NULL;
+   ktime_t oldest_ts = KTIME_MAX;
+   struct drm_sched_job *sched_job;
+
+   spin_lock(&rq->lock);
+
+   list_for_each_entry(tmp, &rq->entities, list) {
+
+   if (drm_sched_entity_is_ready(tmp)) {
+   sched_job = to_drm_sched_job(spsc_queue_peek(&tmp->job_queue));
+
+   if (ktime_before(sched_job->submit_ts, oldest_ts)) {
+   oldest_ts = sched_job->submit_ts;
+   entity = tmp;
+   }
+   }
+   }

Here I think we need an O(1) lookup of the next job to pick out to run.
I see a number of optimizations, for instance keeping the current/oldest
timestamp in the rq struct itself,
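
(A sketch of the cached-timestamp idea, with rq->oldest_submit_ts as a
hypothetical field that is not in the posted patch: cheap to maintain when a
job is pushed, though popping the oldest job would still need a rescan or a
tree/heap to find the new minimum:)

static void drm_sched_rq_note_push(struct drm_sched_rq *rq, ktime_t submit_ts)
{
	spin_lock(&rq->lock);
	if (ktime_before(submit_ts, rq->oldest_submit_ts))
		rq->oldest_submit_ts = submit_ts;
	spin_unlock(&rq->lock);
}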



This was my original design with rb tree based min 

Re: [PATCH] drm/sced: Add FIFO policy for scheduler rq

2022-08-23 Thread Luben Tuikov
Inlined:

On 2022-08-22 16:09, Andrey Grodzovsky wrote:
> Poblem: Given many entities competing for same rq on
^Problem

> same scheduler an uncceptabliy long wait time for some
^unacceptably

> jobs waiting stuck in rq before being picked up are
> observed (seen using  GPUVis).
> The issue is due to Round Robin policy used by scheduler
> to pick up the next entity for execution. Under stress
> of many entities and long job queus within entity some
^queues

> jobs could be stack for very long time in it's entity's
> queue before being popped from the queue and executed
> while for other entites with samller job queues a job
^entities; smaller

> might execute ealier even though that job arrived later
^earlier

> then the job in the long queue.
> 
> Fix:
> Add FIFO selection policy to entites in RQ, chose next enitity
> on rq in such order that if job on one entity arrived
> ealrier then job on another entity the first job will start
> executing ealier regardless of the length of the entity's job
> queue.
> 
> Signed-off-by: Andrey Grodzovsky 
> Tested-by: Li Yunxiang (Teddy) 
> ---
>  drivers/gpu/drm/scheduler/sched_entity.c |  2 +
>  drivers/gpu/drm/scheduler/sched_main.c   | 65 ++--
>  include/drm/gpu_scheduler.h  |  8 +++
>  3 files changed, 71 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/scheduler/sched_entity.c 
> b/drivers/gpu/drm/scheduler/sched_entity.c
> index 6b25b2f4f5a3..3bb7f69306ef 100644
> --- a/drivers/gpu/drm/scheduler/sched_entity.c
> +++ b/drivers/gpu/drm/scheduler/sched_entity.c
> @@ -507,6 +507,8 @@ void drm_sched_entity_push_job(struct drm_sched_job 
> *sched_job)
>   atomic_inc(entity->rq->sched->score);
>   WRITE_ONCE(entity->last_user, current->group_leader);
>   first = spsc_queue_push(&entity->job_queue, &sched_job->queue_node);
> + sched_job->submit_ts = ktime_get();
> +
>  
>   /* first job wakes up scheduler */
>   if (first) {
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
> b/drivers/gpu/drm/scheduler/sched_main.c
> index 68317d3a7a27..c123aa120d06 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -59,6 +59,19 @@
>  #define CREATE_TRACE_POINTS
>  #include "gpu_scheduler_trace.h"
>  
> +
> +
> +int drm_sched_policy = -1;
> +
> +/**
> + * DOC: sched_policy (int)
> + * Used to override default entites scheduling policy in a run queue.
> + */
> +MODULE_PARM_DESC(sched_policy,
> + "specify schedule policy for entites on a runqueue (-1 = 
> auto(default) value, 0 = Round Robin,1  = use FIFO");
> +module_param_named(sched_policy, drm_sched_policy, int, 0444);

As per Christian's comments, you can drop the "auto" and perhaps leave one as 
the default,
say the RR.

I do think it is beneficial to have a module parameter control the scheduling 
policy, as shown above.

> +
> +
>  #define to_drm_sched_job(sched_job)  \
>   container_of((sched_job), struct drm_sched_job, queue_node)
>  
> @@ -120,14 +133,16 @@ void drm_sched_rq_remove_entity(struct drm_sched_rq *rq,
>  }
>  
>  /**
> - * drm_sched_rq_select_entity - Select an entity which could provide a job 
> to run
> + * drm_sched_rq_select_entity_rr - Select an entity which could provide a 
> job to run
>   *
>   * @rq: scheduler run queue to check.
>   *
> - * Try to find a ready entity, returns NULL if none found.
> + * Try to find a ready entity, in round robin manner.
> + *
> + * Returns NULL if none found.
>   */
>  static struct drm_sched_entity *
> -drm_sched_rq_select_entity(struct drm_sched_rq *rq)
> +drm_sched_rq_select_entity_rr(struct drm_sched_rq *rq)
>  {
>   struct drm_sched_entity *entity;
>  
> @@ -163,6 +178,45 @@ drm_sched_rq_select_entity(struct drm_sched_rq *rq)
>   return NULL;
>  }
>  
> +/**
> + * drm_sched_rq_select_entity_fifo - Select an entity which could provide a 
> job to run
> + *
> + * @rq: scheduler run queue to check.
> + *
> + * Try to find a ready entity, based on FIFO order of jobs arrivals.
> + *
> + * Returns NULL if none found.
> + */
> +static struct drm_sched_entity *
> +drm_sched_rq_select_entity_fifo(struct drm_sched_rq *rq)
> +{
> + struct drm_sched_entity *tmp, *entity = NULL;
> + ktime_t oldest_ts = KTIME_MAX;
> + struct drm_sched_job *sched_job;
> +
> + spin_lock(&rq->lock);
> +
> + list_for_each_entry(tmp, &rq->entities, list) {
> +
> + if (drm_sched_entity_is_ready(tmp)) {
> + sched_job = to_drm_sched_job(spsc_queue_peek(&tmp->job_queue));
> +
> + if (ktime_before(sched_job->submit_ts, oldest_ts)) {
> + oldest_ts = sched_job->submit_ts;
> + entity = tmp;
> + }
> + }
> + }

Here I think we need an O(1) lookup of the next job to pick out to run.
I see a number of optimizations, for instance keeping the current/oldest
timestamp in the rq struct itself, or better yet 

Re: [PATCH] drm/sced: Add FIFO policy for scheduler rq

2022-08-23 Thread Andrey Grodzovsky

On 2022-08-23 08:15, Christian König wrote:




Am 22.08.22 um 22:09 schrieb Andrey Grodzovsky:

Poblem: Given many entities competing for same rq on
same scheduler an uncceptabliy long wait time for some
jobs waiting stuck in rq before being picked up are
observed (seen using  GPUVis).
The issue is due to Round Robin policy used by scheduler
to pick up the next entity for execution. Under stress
of many entities and long job queus within entity some
jobs could be stack for very long time in it's entity's
queue before being popped from the queue and executed
while for other entites with samller job queues a job
might execute ealier even though that job arrived later
then the job in the long queue.

Fix:
Add FIFO selection policy to entites in RQ, chose next enitity
on rq in such order that if job on one entity arrived
ealrier then job on another entity the first job will start
executing ealier regardless of the length of the entity's job
queue.

Signed-off-by: Andrey Grodzovsky 
Tested-by: Li Yunxiang (Teddy) 
---
  drivers/gpu/drm/scheduler/sched_entity.c |  2 +
  drivers/gpu/drm/scheduler/sched_main.c   | 65 ++--
  include/drm/gpu_scheduler.h  |  8 +++
  3 files changed, 71 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/sched_entity.c 
b/drivers/gpu/drm/scheduler/sched_entity.c

index 6b25b2f4f5a3..3bb7f69306ef 100644
--- a/drivers/gpu/drm/scheduler/sched_entity.c
+++ b/drivers/gpu/drm/scheduler/sched_entity.c
@@ -507,6 +507,8 @@ void drm_sched_entity_push_job(struct 
drm_sched_job *sched_job)

  atomic_inc(entity->rq->sched->score);
  WRITE_ONCE(entity->last_user, current->group_leader);
first = spsc_queue_push(&entity->job_queue, &sched_job->queue_node);

+    sched_job->submit_ts = ktime_get();
+
    /* first job wakes up scheduler */
  if (first) {
diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
b/drivers/gpu/drm/scheduler/sched_main.c

index 68317d3a7a27..c123aa120d06 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -59,6 +59,19 @@
  #define CREATE_TRACE_POINTS
  #include "gpu_scheduler_trace.h"
  +
+
+int drm_sched_policy = -1;
+
+/**
+ * DOC: sched_policy (int)
+ * Used to override default entites scheduling policy in a run queue.
+ */
+MODULE_PARM_DESC(sched_policy,
+    "specify schedule policy for entites on a runqueue (-1 = 
auto(default) value, 0 = Round Robin,1  = use FIFO");


Well we don't really have an autodetect at the moment, so I would drop 
that.



+module_param_named(sched_policy, drm_sched_policy, int, 0444);
+
+
  #define to_drm_sched_job(sched_job)    \
  container_of((sched_job), struct drm_sched_job, queue_node)
  @@ -120,14 +133,16 @@ void drm_sched_rq_remove_entity(struct 
drm_sched_rq *rq,

  }
    /**
- * drm_sched_rq_select_entity - Select an entity which could provide 
a job to run
+ * drm_sched_rq_select_entity_rr - Select an entity which could 
provide a job to run

   *
   * @rq: scheduler run queue to check.
   *
- * Try to find a ready entity, returns NULL if none found.
+ * Try to find a ready entity, in round robin manner.
+ *
+ * Returns NULL if none found.
   */
  static struct drm_sched_entity *
-drm_sched_rq_select_entity(struct drm_sched_rq *rq)
+drm_sched_rq_select_entity_rr(struct drm_sched_rq *rq)
  {
  struct drm_sched_entity *entity;
  @@ -163,6 +178,45 @@ drm_sched_rq_select_entity(struct drm_sched_rq 
*rq)

  return NULL;
  }
  +/**
+ * drm_sched_rq_select_entity_fifo - Select an entity which could 
provide a job to run

+ *
+ * @rq: scheduler run queue to check.
+ *
+ * Try to find a ready entity, based on FIFO order of jobs arrivals.
+ *
+ * Returns NULL if none found.
+ */
+static struct drm_sched_entity *
+drm_sched_rq_select_entity_fifo(struct drm_sched_rq *rq)
+{
+    struct drm_sched_entity *tmp, *entity = NULL;
+    ktime_t oldest_ts = KTIME_MAX;
+    struct drm_sched_job *sched_job;
+
+    spin_lock(&rq->lock);
+
+    list_for_each_entry(tmp, &rq->entities, list) {
+
+    if (drm_sched_entity_is_ready(tmp)) {
+    sched_job = to_drm_sched_job(spsc_queue_peek(&tmp->job_queue));

+
+    if (ktime_before(sched_job->submit_ts, oldest_ts)) {
+    oldest_ts = sched_job->submit_ts;
+    entity = tmp;
+    }
+    }
+    }
+
+    if (entity) {
+    rq->current_entity = entity;
+    reinit_completion(&entity->entity_idle);
+    }


That should probably be a separate function or at least outside of 
this here.


Apart from that totally straight forward implementation. Any idea how 
much extra overhead that is?


Regards,
Christian.



Well, memory wise you have the extra long for each job struct for the
timestamp, and then for each next-job extraction you have to iterate
the entire rq to find the entity with the oldest job, so it is always linear
in the number of entities. Today the worst case is also O(# entities) in
case none of them are ready, but usually it's not the 

Re: [PATCH] drm/sced: Add FIFO policy for scheduler rq

2022-08-23 Thread Christian König




Am 22.08.22 um 22:09 schrieb Andrey Grodzovsky:

Poblem: Given many entities competing for same rq on
same scheduler an uncceptabliy long wait time for some
jobs waiting stuck in rq before being picked up are
observed (seen using  GPUVis).
The issue is due to Round Robin policy used by scheduler
to pick up the next entity for execution. Under stress
of many entities and long job queus within entity some
jobs could be stack for very long time in it's entity's
queue before being popped from the queue and executed
while for other entites with samller job queues a job
might execute ealier even though that job arrived later
then the job in the long queue.

Fix:
Add FIFO selection policy to entites in RQ, chose next enitity
on rq in such order that if job on one entity arrived
ealrier then job on another entity the first job will start
executing ealier regardless of the length of the entity's job
queue.

Signed-off-by: Andrey Grodzovsky 
Tested-by: Li Yunxiang (Teddy) 
---
  drivers/gpu/drm/scheduler/sched_entity.c |  2 +
  drivers/gpu/drm/scheduler/sched_main.c   | 65 ++--
  include/drm/gpu_scheduler.h  |  8 +++
  3 files changed, 71 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/sched_entity.c 
b/drivers/gpu/drm/scheduler/sched_entity.c
index 6b25b2f4f5a3..3bb7f69306ef 100644
--- a/drivers/gpu/drm/scheduler/sched_entity.c
+++ b/drivers/gpu/drm/scheduler/sched_entity.c
@@ -507,6 +507,8 @@ void drm_sched_entity_push_job(struct drm_sched_job 
*sched_job)
atomic_inc(entity->rq->sched->score);
WRITE_ONCE(entity->last_user, current->group_leader);
first = spsc_queue_push(&entity->job_queue, &sched_job->queue_node);
+   sched_job->submit_ts = ktime_get();
+
  
  	/* first job wakes up scheduler */

if (first) {
diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
b/drivers/gpu/drm/scheduler/sched_main.c
index 68317d3a7a27..c123aa120d06 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -59,6 +59,19 @@
  #define CREATE_TRACE_POINTS
  #include "gpu_scheduler_trace.h"
  
+

+
+int drm_sched_policy = -1;
+
+/**
+ * DOC: sched_policy (int)
+ * Used to override default entites scheduling policy in a run queue.
+ */
+MODULE_PARM_DESC(sched_policy,
+   "specify schedule policy for entites on a runqueue (-1 = 
auto(default) value, 0 = Round Robin,1  = use FIFO");


Well we don't really have an autodetect at the moment, so I would drop that.


+module_param_named(sched_policy, drm_sched_policy, int, 0444);
+
+
  #define to_drm_sched_job(sched_job)   \
container_of((sched_job), struct drm_sched_job, queue_node)
  
@@ -120,14 +133,16 @@ void drm_sched_rq_remove_entity(struct drm_sched_rq *rq,

  }
  
  /**

- * drm_sched_rq_select_entity - Select an entity which could provide a job to 
run
+ * drm_sched_rq_select_entity_rr - Select an entity which could provide a job 
to run
   *
   * @rq: scheduler run queue to check.
   *
- * Try to find a ready entity, returns NULL if none found.
+ * Try to find a ready entity, in round robin manner.
+ *
+ * Returns NULL if none found.
   */
  static struct drm_sched_entity *
-drm_sched_rq_select_entity(struct drm_sched_rq *rq)
+drm_sched_rq_select_entity_rr(struct drm_sched_rq *rq)
  {
struct drm_sched_entity *entity;
  
@@ -163,6 +178,45 @@ drm_sched_rq_select_entity(struct drm_sched_rq *rq)

return NULL;
  }
  
+/**

+ * drm_sched_rq_select_entity_fifo - Select an entity which could provide a 
job to run
+ *
+ * @rq: scheduler run queue to check.
+ *
+ * Try to find a ready entity, based on FIFO order of jobs arrivals.
+ *
+ * Returns NULL if none found.
+ */
+static struct drm_sched_entity *
+drm_sched_rq_select_entity_fifo(struct drm_sched_rq *rq)
+{
+   struct drm_sched_entity *tmp, *entity = NULL;
+   ktime_t oldest_ts = KTIME_MAX;
+   struct drm_sched_job *sched_job;
+
+   spin_lock(&rq->lock);
+
+   list_for_each_entry(tmp, &rq->entities, list) {
+
+   if (drm_sched_entity_is_ready(tmp)) {
+   sched_job = to_drm_sched_job(spsc_queue_peek(&tmp->job_queue));
+
+   if (ktime_before(sched_job->submit_ts, oldest_ts)) {
+   oldest_ts = sched_job->submit_ts;
+   entity = tmp;
+   }
+   }
+   }
+
+   if (entity) {
+   rq->current_entity = entity;
+   reinit_completion(&entity->entity_idle);
+   }


That should probably be a separate function or at least outside of this 
here.
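
(For illustration, one way that split could look - the helper name is invented
here, and both the RR and FIFO selectors would call it instead of open-coding
the two lines:)

static void drm_sched_rq_set_current(struct drm_sched_rq *rq,
				     struct drm_sched_entity *entity)
{
	rq->current_entity = entity;
	reinit_completion(&entity->entity_idle);
}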


Apart from that totally straight forward implementation. Any idea how 
much extra overhead that is?


Regards,
Christian.


+
+   spin_unlock(&rq->lock);
+   return entity;
+}
+
  /**
   * drm_sched_job_done - complete a job
   * @s_job: pointer to the job which is done
@@ -804,7 +858,10 @@ drm_sched_select_entity(struct drm_gpu_scheduler *sched)
  
  	/* Kernel
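
(The archived message is truncated here. Going by the two selectors added
above, the remaining hunk in drm_sched_select_entity() presumably dispatches on
the module parameter along these lines - a sketch, not the literal diff:)

	for (i = DRM_SCHED_PRIORITY_COUNT - 1; i >= DRM_SCHED_PRIORITY_MIN; i--) {
		entity = drm_sched_policy != 1 ?
			 drm_sched_rq_select_entity_rr(&sched->sched_rq[i]) :
			 drm_sched_rq_select_entity_fifo(&sched->sched_rq[i]);
		if (entity)
			break;
	}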