On 18/03/2026 11:44, Tvrtko Ursulin wrote:
>
> On 17/03/2026 17:27, Marco Pagani wrote:
>> On 25/02/2026 10:03, Tvrtko Ursulin wrote:
>>>
>>> On 25/02/2026 07:47, Philipp Stanner wrote:
>>>> On Mon, 2026-02-23 at 16:25 +0000, Tvrtko Ursulin wrote:
>>>>>
>>>>> On 19/02/2026 14:07, Marco Pagani wrote:
>>>>>> Add a new test suite to simulate concurrent job submissions to the DRM
>>>>>> scheduler. The suite includes two initial test cases: (i) a primary test
>>>>>> case for parallel job submission and (ii) a secondary test case for
>>>>>> interleaved job submission and completion. In the first test case, worker
>>>>>> threads concurrently submit jobs to the scheduler and then the timeline
>>>>>> is
>>>>>> manually advanced. In the second test case, worker threads periodically
>>>>>> submit a sequence of jobs to the mock scheduler. Periods are chosen as
>>>>>> harmonic, starting from a base period, to allow interleaving between
>>>>>> submission and completion. To avoid impractically large execution times,
>>>>>> periods are cycled. The timeline is advanced automatically when the jobs
>>>>>> completes. This models how job submission from userspace (in process
>>>>>> context) may interleave with job completion (hrtimer callback interrupt
>>>>>> context, in the test case) by a physical device.
>>>>
>>>> I still maintain the opinion expressed the last time: that the commit
>>>> message should make explicit why the patch / test is added. Which this
>>>> doesn't do. It just says: "We add X", but not "Currently, the problem
>>>> is that YZ, thus we need X".
>>>> (also breaking longer commit messages into paragraphs is nice)
>>>>
>>>> Also see my comments about interleaved submits below.
>>>
>>> I'll address the ones addressed to me.
>>>
>>> 8><
>>>
>>>>>> +struct drm_sched_parallel_params {
>>>>>> + const char *description;
>>>>>> + unsigned int num_jobs;
>>>>>> + unsigned int num_workers;
>>>>>> +};
>>>>>> +
>>>>>> +static const struct drm_sched_parallel_params
>>>>>> drm_sched_parallel_cases[] = {
>>>>>> + {
>>>>>> + .description = "Workers submitting multiple jobs
>>>>>> against a single entity",
>>>>>
>>>>> Each worker has own entity so the description is a bit confusing.
>>>>
>>>> Do you have a suggestion for a better one?
>>>
>>> Along the lines of:
>>>
>>> "Multiple workers submitting multiple jobs from their entity"
>>>
>>> 8><
>>>
>>>>>> + }
>>>>>> +
>>>>>> + for (i = 0; i < params->num_workers; i++) {
>>>>>> + INIT_WORK(&workers[i].work, drm_sched_parallel_worker);
>>>>>> + queue_work(ctx->sub_wq, &workers[i].work);
>>>>>> + }
>>>>>> +
>>>>>> + complete_all(&ctx->wait_go);
>>>>>> + flush_workqueue(ctx->sub_wq);
>>>>>> +
>>>>>> + for (i = 0; i < params->num_workers; i++) {
>>>>>> + for (j = 0; j < params->num_jobs; j++) {
>>>>>> + done =
>>>>>> drm_mock_sched_job_wait_scheduled(workers[i].jobs[j],
>>>>>> + HZ);
>>>>
>>>> same
>>>>
>>>>>> + KUNIT_ASSERT_TRUE(test, done);
>>>>>> +
>>>>>> + done =
>>>>>> drm_mock_sched_job_is_finished(workers[i].jobs[j]);
>>>>>> + KUNIT_ASSERT_FALSE(test, done);
>>>>>
>>>>> This second assert does not seem to be bringing much value, but instead
>>>>> makes the reader ask themselves why it is there. Remove it?
>>>>>
>>>>> Hmm in fact this whole loop could be removed. All that it is needed
>>>>> below is to wait for the last job to be completed.
>>>>
>>>> I suppose it's being tested whether all jobs are finished. That sounds
>>>> clean and not harmful to me.
>>>
>>> No, it is assert false. It is testing the jobs have been scheduled but
>>> not completed before the timeline is manually advanced. Both those
>>> behaviours are already covered by the existing basic test cases.
>>>
>>> In my view the general best practice is to focus on the thing being
>>> tested, which in this case is the submission side of things. The rest
>>> can just distract the reader. And in this case that is parallel
>>> submission, which is all done and dusted by the time flush_workqueue
>>> above finishes. Everything after that point is just test cleanup.
>>
>> I see what you mean. By that point, concurrent submission is complete, so it
>> may be considered outside the scope of this test. However, my intent was to
>> check that the basic behaviour of the scheduler also holds after the
>> concurrent
>> submissions of multiple jobs, which is not covered by the basic test case.
>> That said, I'm open to removing those asserts if you and Philipp believe
>> they are redundant.
>
> Some time has passed and I don't exactly remember now with the trimmed
> quote. If this part of the test is using manual timeline advancement
> then the assert is basically testing the mock scheduler (not the DRM
> scheduler).
This is a good point. I'm probably going to remove these asserts.
> For me that is lukewarm, but if you insist feel free to keep it.
Thanks,
Marco