On 10/9/23 1:09 PM, Nathan Lynch wrote:
Hi Haren,

Haren Myneni <ha...@linux.ibm.com> writes:
The hypervisor returns migration failure unless all VAS windows are
closed. During the pre-migration stage, vas_migration_handler() sets
the migration_in_progress flag and closes all windows on the list.
The allocate VAS window routine checks the migration flag, sets up
the window, and then adds it to the list. So there is a possibility
of the migration handler missing a window that is still in the
process of being set up.

t1: Allocate and open VAS       t2: Migration event
     window

lock vas_pseries_mutex
If migration_in_progress set
   unlock vas_pseries_mutex
   return
open window HCALL
unlock vas_pseries_mutex
Modify window HCALL             lock vas_pseries_mutex
setup window                    migration_in_progress=true
                                Closes all windows from
                                the list
                                unlock vas_pseries_mutex
lock vas_pseries_mutex          return
if nr_closed_windows == 0
   // No DLPAR CPU or migration
   add to the list
   unlock vas_pseries_mutex
   return
unlock vas_pseries_mutex
Close VAS window
// due to DLPAR CPU or migration
return -EBUSY

Could the path t1 takes simply hold the mutex for the duration of
its execution instead of dropping and reacquiring it in the middle?

Here's the relevant code from vas_allocate_window():

        mutex_lock(&vas_pseries_mutex);
        if (migration_in_progress)
                rc = -EBUSY;
        else
                rc = allocate_setup_window(txwin, (u64 *)&domain[0],
                                   cop_feat_caps->win_type);
        mutex_unlock(&vas_pseries_mutex);
        if (rc)
                goto out;

        rc = h_modify_vas_window(txwin);
        if (!rc)
                rc = get_vas_user_win_ref(&txwin->vas_win.task_ref);
        if (rc)
                goto out_free;

        txwin->win_type = cop_feat_caps->win_type;
        mutex_lock(&vas_pseries_mutex);
        if (!caps->nr_close_wins) {
                list_add(&txwin->win_list, &caps->list);
                caps->nr_open_windows++;
                mutex_unlock(&vas_pseries_mutex);
                vas_user_win_add_mm_context(&txwin->vas_win.task_ref);
                return &txwin->vas_win;
        }
        mutex_unlock(&vas_pseries_mutex);

Is there something about h_modify_vas_window() or get_vas_user_win_ref()
that requires temporarily dropping the lock?


Thanks Nathan for your comments.

vas_pseries_mutex protects window ID and IRQ allocation between the allocate and free window HCALLs, as well as the window list. Generally we try not to hold a mutex across HCALLs, but these HCALLs need it.

We could call h_modify_vas_window() or get_vas_user_win_ref() within the mutex context, but it is not needed. Also, the free HCALL can take longer depending on pending NX requests, since the hypervisor waits for all requests to complete before closing the window.

Applications can issue window open/free calls concurrently, which can cause mutex contention, especially on large systems where hundreds of credits are available. Another example: the migration event can wait longer (or time out) acquiring this mutex if many threads are issuing open/free window calls. Hence h_modify_vas_window() (the modify HCALL) and get_vas_user_win_ref() are called outside of the mutex.

Thanks
Haren
