On Thu, Jan 12, 2023 at 04:56:07PM +0000, Tvrtko Ursulin 
<tvrtko.ursu...@linux.intel.com> wrote:
> +static int drmcs_can_attach(struct cgroup_taskset *tset)
> +{
> +     int ret;
> +
> +     /*
> +      * As processes are getting moved between groups we need to ensure
> +      * both that the old group does not see a sudden downward jump in the
> +      * GPU utilisation, and that the new group does not see a sudden jump
> +      * up with all the GPU time clients belonging to the migrated process
> +      * have accumulated.
> +      *
> +      * To achieve that we suspend the scanner until the migration is
> +      * completed where the resume at the end ensures both groups start
> +      * observing GPU utilisation from a reset state.
> +      */
> +
> +     ret = mutex_lock_interruptible(&drmcg_mutex);
> +     if (ret)
> +             return ret;
> +     start_suspend_scanning();
> +     mutex_unlock(&drmcg_mutex);
> +
> +     finish_suspend_scanning();

Here's scanning suspension, communicated via 

        root_drmcs.scanning_suspended = true;
        root_drmcs.suspended_period_us = root_drmcs.period_us;
        root_drmcs.period_us = 0;

but I don't see those used in scan_worker() and the scanning traversal
can apparently run concurrently with a task migration.

> [...]
> +static bool
> +__start_scanning(struct drm_cgroup_state *root, unsigned int period_us)
> [...]
> +     css_for_each_descendant_post(node, &root->css) {
> [...]
> +             active = drmcs_get_active_time_us(drmcs);
> +             if (period_us && active > drmcs->prev_active_us)
> +                     drmcs->active_us += active - drmcs->prev_active_us;
> +             drmcs->prev_active_us = active;

drmcs_get_active_time_us() could count a task's contribution here,
the task would migrate to a different drmcs,
and it'd be counted 2nd time.


Attachment: signature.asc
Description: Digital signature

Reply via email to