Re: [PATCH] powerpc/pseries/vas: Migration suspend waits for no in-progress open windows

2023-10-16 Thread Haren Myneni




On 10/16/23 1:30 PM, Nathan Lynch wrote:

Nathan Lynch  writes:

Haren Myneni  writes:

Haren Myneni  writes:

The hypervisor returns a migration failure if all VAS windows are not
closed. During the pre-migration stage, vas_migration_handler() sets the
migration_in_progress flag and closes all windows on the list. The VAS
window allocation routine checks the migration flag, sets up the window,
and then adds it to the list. So it is possible for the migration handler
to miss a window that is still in the process of being set up.

t1: Allocate and open VAS       t2: Migration event
    window

lock vas_pseries_mutex
If migration_in_progress set
   unlock vas_pseries_mutex
   return
open window HCALL
unlock vas_pseries_mutex
Modify window HCALL             lock vas_pseries_mutex
setup window                    migration_in_progress=true
                                Closes all windows from
                                the list
                                unlock vas_pseries_mutex
lock vas_pseries_mutex          return
if nr_closed_windows == 0
   // No DLPAR CPU or migration
   add to the list
   unlock vas_pseries_mutex
   return
unlock vas_pseries_mutex
Close VAS window
// due to DLPAR CPU or migration
return -EBUSY


Could the path t1 takes simply hold the mutex for the duration of
its execution instead of dropping and reacquiring it in the middle?

Here's the relevant code from vas_allocate_window():

mutex_lock(&vas_pseries_mutex);
if (migration_in_progress)
        rc = -EBUSY;
else
        rc = allocate_setup_window(txwin, (u64 *)&domain[0],
                                   cop_feat_caps->win_type);
mutex_unlock(&vas_pseries_mutex);
if (rc)
        goto out;

rc = h_modify_vas_window(txwin);
if (!rc)
        rc = get_vas_user_win_ref(&txwin->vas_win.task_ref);
if (rc)
        goto out_free;

txwin->win_type = cop_feat_caps->win_type;
mutex_lock(&vas_pseries_mutex);
if (!caps->nr_close_wins) {
        list_add(&txwin->win_list, &caps->list);
        caps->nr_open_windows++;
        mutex_unlock(&vas_pseries_mutex);
        vas_user_win_add_mm_context(&txwin->vas_win.task_ref);
        return &txwin->vas_win;
}
mutex_unlock(&vas_pseries_mutex);

Is there something about h_modify_vas_window() or get_vas_user_win_ref()
that requires temporarily dropping the lock?
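
For illustration, a rough sketch of that alternative (not the posted patch;
error paths are abbreviated and the out_unlock/out_free_unlock labels are
hypothetical): holding vas_pseries_mutex from the migration check through
list insertion means vas_migration_handler() can never observe a window that
has been opened but is not yet on caps->list.

mutex_lock(&vas_pseries_mutex);
if (migration_in_progress) {
        rc = -EBUSY;
        goto out_unlock;
}
rc = allocate_setup_window(txwin, (u64 *)&domain[0],
                           cop_feat_caps->win_type);
if (rc)
        goto out_unlock;

/* The modify HCALL and task reference now also run with the mutex held. */
rc = h_modify_vas_window(txwin);
if (!rc)
        rc = get_vas_user_win_ref(&txwin->vas_win.task_ref);
if (rc)
        goto out_free_unlock;   /* hypothetical: free the window, then unlock */

txwin->win_type = cop_feat_caps->win_type;
if (!caps->nr_close_wins) {
        list_add(&txwin->win_list, &caps->list);
        caps->nr_open_windows++;
        mutex_unlock(&vas_pseries_mutex);
        vas_user_win_add_mm_context(&txwin->vas_win.task_ref);
        return &txwin->vas_win;
}
mutex_unlock(&vas_pseries_mutex);

The trade-off, discussed later in the thread, is that the modify HCALL and
any long-busy retries it performs would then also hold the mutex.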



Thanks Nathan for your comments.

vas_pseries_mutex protects window ID and IRQ allocation between the alloc
and free window HCALLs, and the window list. Generally we try not to hold
the mutex across HCALLs, but we need this mutex around these HCALLs.

We could call h_modify_vas_window() or get_vas_user_win_ref() within the
mutex context, but it is not needed.


Hmm. I contend that it would fix your bug in a simpler way that
eliminates the race instead of coping with it by adding more state and
complicating the locking model. With your change, readers of the
migration_in_progress flag check it under the mutex, but the writer
updates it outside of the mutex, which seems strange and unlikely to be
correct.


Expanding on this, with your change, migration_in_progress becomes a
boolean atomic_t flag accessed only with atomic_set() and
atomic_read(). These are non-RMW operations. Documentation/atomic_t.txt
says:

   Non-RMW ops:

   The non-RMW ops are (typically) regular LOADs and STOREs and are
   canonically implemented using READ_ONCE(), WRITE_ONCE(),
   smp_load_acquire() and smp_store_release() respectively. Therefore, if
   you find yourself only using the Non-RMW operations of atomic_t, you
   do not in fact need atomic_t at all and are doing it wrong.

So making migration_in_progress an atomic_t does not confer any
advantageous properties to it that it lacks as a plain boolean.
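
For comparison, a minimal sketch of the plain-boolean variant this points at,
assuming both sides simply rely on vas_pseries_mutex for ordering
(hypothetical, error handling omitted):

/* Writer, in vas_migration_handler(): set the flag under the mutex. */
mutex_lock(&vas_pseries_mutex);
migration_in_progress = true;   /* plain bool, protected by the mutex */
rc = reconfig_close_windows(vcaps, vcaps->nr_open_windows, true);
mutex_unlock(&vas_pseries_mutex);

/* Reader, in vas_allocate_window(): same lock, so no barriers are needed. */
mutex_lock(&vas_pseries_mutex);
if (migration_in_progress)
        rc = -EBUSY;
mutex_unlock(&vas_pseries_mutex);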

Considering also (from the same document):

  - non-RMW operations are unordered;

  - RMW operations that have no return value are unordered;

I am concerned that threads executing these segments of code will not
always observe each others' effects in the intended order:

// vas_allocate_window()

mutex_lock(&vas_pseries_mutex);
if (!caps->nr_close_wins && !atomic_read(&migration_in_progress)) {
        list_add(&txwin->win_list, &caps->list);
        caps->nr_open_windows++;
        atomic_dec(&caps->nr_open_wins_progress);
        mutex_unlock(&vas_pseries_mutex);
        vas_user_win_add_mm_context(&txwin->vas_win.task_ref);
        return &txwin->vas_win;
}
mutex_unlock(&vas_pseries_mutex);
...
atomic_dec(&caps->nr_open_wins_progress);
wake_up(&open_win_progress_wq);

// vas_migration_handler()

atomic_set(&migration_in_progress, 1);
...
mutex_lock(&vas_pseries_mutex);
rc = reconfig_close_windows(vcaps, vcaps->nr_open_windows,
                            true);
 

Re: [PATCH] powerpc/pseries/vas: Migration suspend waits for no in-progress open windows

2023-10-16 Thread Nathan Lynch
Nathan Lynch  writes:
> Haren Myneni  writes:
>>> Haren Myneni  writes:
The hypervisor returns a migration failure if all VAS windows are not
closed. During the pre-migration stage, vas_migration_handler() sets the
migration_in_progress flag and closes all windows on the list. The VAS
window allocation routine checks the migration flag, sets up the window,
and then adds it to the list. So it is possible for the migration handler
to miss a window that is still in the process of being set up.

t1: Allocate and open VAS       t2: Migration event
    window

lock vas_pseries_mutex
If migration_in_progress set
   unlock vas_pseries_mutex
   return
open window HCALL
unlock vas_pseries_mutex
Modify window HCALL             lock vas_pseries_mutex
setup window                    migration_in_progress=true
                                Closes all windows from
                                the list
                                unlock vas_pseries_mutex
lock vas_pseries_mutex          return
if nr_closed_windows == 0
   // No DLPAR CPU or migration
   add to the list
   unlock vas_pseries_mutex
   return
unlock vas_pseries_mutex
Close VAS window
// due to DLPAR CPU or migration
return -EBUSY
>>> 
>>> Could the path t1 takes simply hold the mutex for the duration of
>>> its execution instead of dropping and reacquiring it in the middle?
>>> 
>>> Here's the relevant code from vas_allocate_window():
>>> 
>>> mutex_lock(&vas_pseries_mutex);
>>> if (migration_in_progress)
>>>         rc = -EBUSY;
>>> else
>>>         rc = allocate_setup_window(txwin, (u64 *)&domain[0],
>>>                                    cop_feat_caps->win_type);
>>> mutex_unlock(&vas_pseries_mutex);
>>> if (rc)
>>>         goto out;
>>>
>>> rc = h_modify_vas_window(txwin);
>>> if (!rc)
>>>         rc = get_vas_user_win_ref(&txwin->vas_win.task_ref);
>>> if (rc)
>>>         goto out_free;
>>>
>>> txwin->win_type = cop_feat_caps->win_type;
>>> mutex_lock(&vas_pseries_mutex);
>>> if (!caps->nr_close_wins) {
>>>         list_add(&txwin->win_list, &caps->list);
>>>         caps->nr_open_windows++;
>>>         mutex_unlock(&vas_pseries_mutex);
>>>         vas_user_win_add_mm_context(&txwin->vas_win.task_ref);
>>>         return &txwin->vas_win;
>>> }
>>> mutex_unlock(&vas_pseries_mutex);
>>> 
>>> Is there something about h_modify_vas_window() or get_vas_user_win_ref()
>>> that requires temporarily dropping the lock?
>>> 
>>
>> Thanks Nathan for your comments.
>>
>> vas_pseries_mutex protects window ID and IRQ allocation between the alloc
>> and free window HCALLs, and the window list. Generally we try not to hold
>> the mutex across HCALLs, but we need this mutex around these HCALLs.
>>
>> We could call h_modify_vas_window() or get_vas_user_win_ref() within the
>> mutex context, but it is not needed.
>
> Hmm. I contend that it would fix your bug in a simpler way that
> eliminates the race instead of coping with it by adding more state and
> complicating the locking model. With your change, readers of the
> migration_in_progress flag check it under the mutex, but the writer
> updates it outside of the mutex, which seems strange and unlikely to be
> correct.

Expanding on this, with your change, migration_in_progress becomes a
boolean atomic_t flag accessed only with atomic_set() and
atomic_read(). These are non-RMW operations. Documentation/atomic_t.txt
says:

  Non-RMW ops:

  The non-RMW ops are (typically) regular LOADs and STOREs and are
  canonically implemented using READ_ONCE(), WRITE_ONCE(),
  smp_load_acquire() and smp_store_release() respectively. Therefore, if
  you find yourself only using the Non-RMW operations of atomic_t, you
  do not in fact need atomic_t at all and are doing it wrong.

So making migration_in_progress an atomic_t does not confer any
advantageous properties to it that it lacks as a plain boolean.

Considering also (from the same document):

 - non-RMW operations are unordered;

 - RMW operations that have no return value are unordered;

I am concerned that threads executing these segments of code will not
always observe each others' effects in the intended order:

// vas_allocate_window()

mutex_lock(&vas_pseries_mutex);
if (!caps->nr_close_wins && !atomic_read(&migration_in_progress)) {
        list_add(&txwin->win_list, &caps->list);
        caps->nr_open_windows++;
        atomic_dec(&caps->nr_open_wins_progress);
        mutex_unlock(&vas_pseries_mutex);
        vas_user_win_add_mm_context(&txwin->vas_win.task_ref);
        return &txwin->vas_win;
}
mutex_unlock(&vas_pseries_mutex);
...
atomic_dec(&caps->nr_open_wins_progress);
wake_up(&open_win_progress_wq);

// vas_migration_handler()

atomic_set(&migration_in_progress, 1);
...
mutex_lock(&vas_pseries_mutex);
rc = 

Re: [PATCH] powerpc/pseries/vas: Migration suspend waits for no in-progress open windows

2023-10-13 Thread Haren Myneni




On 10/11/23 1:36 PM, Nathan Lynch wrote:

Haren Myneni  writes:

Haren Myneni  writes:

The hypervisor returns a migration failure if all VAS windows are not
closed. During the pre-migration stage, vas_migration_handler() sets the
migration_in_progress flag and closes all windows on the list. The VAS
window allocation routine checks the migration flag, sets up the window,
and then adds it to the list. So it is possible for the migration handler
to miss a window that is still in the process of being set up.

t1: Allocate and open VAS       t2: Migration event
    window

lock vas_pseries_mutex
If migration_in_progress set
   unlock vas_pseries_mutex
   return
open window HCALL
unlock vas_pseries_mutex
Modify window HCALL             lock vas_pseries_mutex
setup window                    migration_in_progress=true
                                Closes all windows from
                                the list
                                unlock vas_pseries_mutex
lock vas_pseries_mutex          return
if nr_closed_windows == 0
   // No DLPAR CPU or migration
   add to the list
   unlock vas_pseries_mutex
   return
unlock vas_pseries_mutex
Close VAS window
// due to DLPAR CPU or migration
return -EBUSY


Could the path t1 takes simply hold the mutex for the duration of
its execution instead of dropping and reacquiring it in the middle?

Here's the relevant code from vas_allocate_window():

mutex_lock(&vas_pseries_mutex);
if (migration_in_progress)
        rc = -EBUSY;
else
        rc = allocate_setup_window(txwin, (u64 *)&domain[0],
                                   cop_feat_caps->win_type);
mutex_unlock(&vas_pseries_mutex);
if (rc)
        goto out;

rc = h_modify_vas_window(txwin);
if (!rc)
        rc = get_vas_user_win_ref(&txwin->vas_win.task_ref);
if (rc)
        goto out_free;

txwin->win_type = cop_feat_caps->win_type;
mutex_lock(&vas_pseries_mutex);
if (!caps->nr_close_wins) {
        list_add(&txwin->win_list, &caps->list);
        caps->nr_open_windows++;
        mutex_unlock(&vas_pseries_mutex);
        vas_user_win_add_mm_context(&txwin->vas_win.task_ref);
        return &txwin->vas_win;
}
mutex_unlock(&vas_pseries_mutex);

Is there something about h_modify_vas_window() or get_vas_user_win_ref()
that requires temporarily dropping the lock?



Thanks Nathan for your comments.

vas_pseries_mutex protects window ID and IRQ allocation between the alloc
and free window HCALLs, and the window list. Generally we try not to hold
the mutex across HCALLs, but we need this mutex around these HCALLs.

We could call h_modify_vas_window() or get_vas_user_win_ref() within the
mutex context, but it is not needed.


Hmm. I contend that it would fix your bug in a simpler way that
eliminates the race instead of coping with it by adding more state and
complicating the locking model. With your change, readers of the
migration_in_progress flag check it under the mutex, but the writer
updates it outside of the mutex, which seems strange and unlikely to be
correct.


The migration thread is the only writer that changes the
migration_in_progress flag. Setting this flag is moved outside of the
mutex in this patch. The window open path is the only reader of this
flag, and it reads it while holding the mutex.


Reason for moving the setting outside of the mutex:

Suppose many threads have called open window and are waiting on the mutex
when the migration thread starts. In that case the migration thread has
to wait on the mutex until all of those window open threads complete
their open window HCALLs in the hypervisor, and then it has to close all
of those windows immediately. If the flag is set outside of the mutex,
the later open window threads can exit from this function quickly without
opening windows.


Reason for keeping the migration_in_progress check inside the mutex
section with the above change (setting outside of the mutex):


If a reader thread checked this flag only before taking the mutex, it
could end up opening a window that will be closed anyway by the migration
thread. Checking again under the mutex also lets the later open threads
return -EBUSY quickly.
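
A rough sketch of the ordering being described, using the names from the
patch below (only the relevant lines of the suspend path, not the complete
handler):

/*
 * Publish the flag before taking the mutex so that open-window threads
 * already queued on vas_pseries_mutex see it as soon as they get the
 * lock and return -EBUSY, instead of opening windows that the migration
 * thread would immediately have to close again.
 */
atomic_set(&migration_in_progress, 1);

mutex_lock(&vas_pseries_mutex);
/* Close the windows that are already on the list. */
rc = reconfig_close_windows(vcaps, vcaps->nr_open_windows, true);
mutex_unlock(&vas_pseries_mutex);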






Also, the free HCALL can take longer depending on pending NX requests,
since the hypervisor waits for all requests to be completed before
closing the window.

Applications can issue window open/free calls at the same time, which
can lead to mutex contention, especially on large systems where hundreds
of credits are available. Another example: the migration event can wait
longer (or time out) to get this mutex if many threads issue open/free
window calls. Hence h_modify_vas_window() (the modify HCALL) and
get_vas_user_win_ref() are called outside of the mutex.


OK. I believe you're referring to this code, which can run under the lock:

static long hcall_return_busy_check(long rc)
{
/* Check if we are stalled for some time */
if (H_IS_LONG_BUSY(rc)) {
msleep(get_longbusy_msecs(rc));
 

Re: [PATCH] powerpc/pseries/vas: Migration suspend waits for no in-progress open windows

2023-10-11 Thread Nathan Lynch
Haren Myneni  writes:
>> Haren Myneni  writes:
>>> The hypervisor returns a migration failure if all VAS windows are not
>>> closed. During the pre-migration stage, vas_migration_handler() sets the
>>> migration_in_progress flag and closes all windows on the list. The VAS
>>> window allocation routine checks the migration flag, sets up the window,
>>> and then adds it to the list. So it is possible for the migration handler
>>> to miss a window that is still in the process of being set up.
>>>
>>> t1: Allocate and open VAS       t2: Migration event
>>>     window
>>>
>>> lock vas_pseries_mutex
>>> If migration_in_progress set
>>>    unlock vas_pseries_mutex
>>>    return
>>> open window HCALL
>>> unlock vas_pseries_mutex
>>> Modify window HCALL             lock vas_pseries_mutex
>>> setup window                    migration_in_progress=true
>>>                                 Closes all windows from
>>>                                 the list
>>>                                 unlock vas_pseries_mutex
>>> lock vas_pseries_mutex          return
>>> if nr_closed_windows == 0
>>>    // No DLPAR CPU or migration
>>>    add to the list
>>>    unlock vas_pseries_mutex
>>>    return
>>> unlock vas_pseries_mutex
>>> Close VAS window
>>> // due to DLPAR CPU or migration
>>> return -EBUSY
>> 
>> Could the path t1 takes simply hold the mutex for the duration of
>> its execution instead of dropping and reacquiring it in the middle?
>> 
>> Here's the relevant code from vas_allocate_window():
>> 
>> mutex_lock(&vas_pseries_mutex);
>> if (migration_in_progress)
>>         rc = -EBUSY;
>> else
>>         rc = allocate_setup_window(txwin, (u64 *)&domain[0],
>>                                    cop_feat_caps->win_type);
>> mutex_unlock(&vas_pseries_mutex);
>> if (rc)
>>         goto out;
>>
>> rc = h_modify_vas_window(txwin);
>> if (!rc)
>>         rc = get_vas_user_win_ref(&txwin->vas_win.task_ref);
>> if (rc)
>>         goto out_free;
>>
>> txwin->win_type = cop_feat_caps->win_type;
>> mutex_lock(&vas_pseries_mutex);
>> if (!caps->nr_close_wins) {
>>         list_add(&txwin->win_list, &caps->list);
>>         caps->nr_open_windows++;
>>         mutex_unlock(&vas_pseries_mutex);
>>         vas_user_win_add_mm_context(&txwin->vas_win.task_ref);
>>         return &txwin->vas_win;
>> }
>> mutex_unlock(&vas_pseries_mutex);
>> 
>> Is there something about h_modify_vas_window() or get_vas_user_win_ref()
>> that requires temporarily dropping the lock?
>> 
>
> Thanks Nathan for your comments.
>
> vas_pseries_mutex protects window ID and IRQ allocation between the alloc
> and free window HCALLs, and the window list. Generally we try not to hold
> the mutex across HCALLs, but we need this mutex around these HCALLs.
>
> We could call h_modify_vas_window() or get_vas_user_win_ref() within the
> mutex context, but it is not needed.

Hmm. I contend that it would fix your bug in a simpler way that
eliminates the race instead of coping with it by adding more state and
complicating the locking model. With your change, readers of the
migration_in_progress flag check it under the mutex, but the writer
updates it outside of the mutex, which seems strange and unlikely to be
correct.

> Also, the free HCALL can take longer depending on pending NX requests,
> since the hypervisor waits for all requests to be completed before
> closing the window.
>
> Applications can issue window open/free calls at the same time, which
> can lead to mutex contention, especially on large systems where hundreds
> of credits are available. Another example: the migration event can wait
> longer (or time out) to get this mutex if many threads issue open/free
> window calls. Hence h_modify_vas_window() (the modify HCALL) and
> get_vas_user_win_ref() are called outside of the mutex.

OK. I believe you're referring to this code, which can run under the lock:

static long hcall_return_busy_check(long rc)
{
/* Check if we are stalled for some time */
if (H_IS_LONG_BUSY(rc)) {
msleep(get_longbusy_msecs(rc));
rc = H_BUSY;
} else if (rc == H_BUSY) {
cond_resched();
}

return rc;
}

...

static int h_deallocate_vas_window(u64 winid)
{
long rc;

do {
rc = plpar_hcall_norets(H_DEALLOCATE_VAS_WINDOW, winid);

rc = hcall_return_busy_check(rc);
} while (rc == H_BUSY);

...

[ with similar loops in the window allocate and modify functions ]

If get_longbusy_msecs() typically returns low values (<20), then you
should prefer usleep_range() over msleep() to avoid over-sleeping. For
example, msleep(1) can schedule away for much longer than 1ms -- often
more like 20ms.

If mutex hold times have been a problem in this code, then it's probably
worth improving the way it handles the busy/retry statuses under the
lock.
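
As a sketch of that suggestion (not part of the posted patch),
hcall_return_busy_check() could use usleep_range() for the short long-busy
delays; the 20ms cutoff below is only an example value:

static long hcall_return_busy_check(long rc)
{
        /* Check if we are stalled for some time */
        if (H_IS_LONG_BUSY(rc)) {
                unsigned int ms = get_longbusy_msecs(rc);

                /*
                 * usleep_range() avoids msleep()'s coarse timer rounding
                 * for short delays; fall back to msleep() for longer ones.
                 */
                if (ms < 20)
                        usleep_range(ms * USEC_PER_MSEC,
                                     (ms + 1) * USEC_PER_MSEC);
                else
                        msleep(ms);
                rc = H_BUSY;
        } else if (rc == H_BUSY) {
                cond_resched();
        }

        return rc;
}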


Re: [PATCH] powerpc/pseries/vas: Migration suspend waits for no in-progress open windows

2023-10-10 Thread Haren Myneni




On 10/9/23 1:09 PM, Nathan Lynch wrote:

Hi Haren,

Haren Myneni  writes:

The hypervisor returns a migration failure if all VAS windows are not
closed. During the pre-migration stage, vas_migration_handler() sets the
migration_in_progress flag and closes all windows on the list. The VAS
window allocation routine checks the migration flag, sets up the window,
and then adds it to the list. So it is possible for the migration handler
to miss a window that is still in the process of being set up.

t1: Allocate and open VAS       t2: Migration event
    window

lock vas_pseries_mutex
If migration_in_progress set
   unlock vas_pseries_mutex
   return
open window HCALL
unlock vas_pseries_mutex
Modify window HCALL             lock vas_pseries_mutex
setup window                    migration_in_progress=true
                                Closes all windows from
                                the list
                                unlock vas_pseries_mutex
lock vas_pseries_mutex          return
if nr_closed_windows == 0
   // No DLPAR CPU or migration
   add to the list
   unlock vas_pseries_mutex
   return
unlock vas_pseries_mutex
Close VAS window
// due to DLPAR CPU or migration
return -EBUSY


Could the path t1 takes simply hold the mutex for the duration of
its execution instead of dropping and reacquiring it in the middle?

Here's the relevant code from vas_allocate_window():

mutex_lock(&vas_pseries_mutex);
if (migration_in_progress)
        rc = -EBUSY;
else
        rc = allocate_setup_window(txwin, (u64 *)&domain[0],
                                   cop_feat_caps->win_type);
mutex_unlock(&vas_pseries_mutex);
if (rc)
        goto out;

rc = h_modify_vas_window(txwin);
if (!rc)
        rc = get_vas_user_win_ref(&txwin->vas_win.task_ref);
if (rc)
        goto out_free;

txwin->win_type = cop_feat_caps->win_type;
mutex_lock(&vas_pseries_mutex);
if (!caps->nr_close_wins) {
        list_add(&txwin->win_list, &caps->list);
        caps->nr_open_windows++;
        mutex_unlock(&vas_pseries_mutex);
        vas_user_win_add_mm_context(&txwin->vas_win.task_ref);
        return &txwin->vas_win;
}
mutex_unlock(&vas_pseries_mutex);

Is there something about h_modify_vas_window() or get_vas_user_win_ref()
that requires temporarily dropping the lock?



Thanks Nathan for your comments.

vas_pseries_mutex protects window ID and IRQ allocation between the alloc
and free window HCALLs, and the window list. Generally we try not to hold
the mutex across HCALLs, but we need this mutex around these HCALLs.


We could call h_modify_vas_window() or get_vas_user_win_ref() within the
mutex context, but it is not needed. Also, the free HCALL can take longer
depending on pending NX requests, since the hypervisor waits for all
requests to be completed before closing the window.


Applications can issue window open/free calls at the same time, which
can lead to mutex contention, especially on large systems where hundreds
of credits are available. Another example: the migration event can wait
longer (or time out) to get this mutex if many threads issue open/free
window calls. Hence h_modify_vas_window() (the modify HCALL) and
get_vas_user_win_ref() are called outside of the mutex.


Thanks
Haren












Re: [PATCH] powerpc/pseries/vas: Migration suspend waits for no in-progress open windows

2023-10-09 Thread Nathan Lynch
Hi Haren,

Haren Myneni  writes:
> The hypervisor returns a migration failure if all VAS windows are not
> closed. During the pre-migration stage, vas_migration_handler() sets the
> migration_in_progress flag and closes all windows on the list. The VAS
> window allocation routine checks the migration flag, sets up the window,
> and then adds it to the list. So it is possible for the migration handler
> to miss a window that is still in the process of being set up.
>
> t1: Allocate and open VAS       t2: Migration event
>     window
>
> lock vas_pseries_mutex
> If migration_in_progress set
>    unlock vas_pseries_mutex
>    return
> open window HCALL
> unlock vas_pseries_mutex
> Modify window HCALL             lock vas_pseries_mutex
> setup window                    migration_in_progress=true
>                                 Closes all windows from
>                                 the list
>                                 unlock vas_pseries_mutex
> lock vas_pseries_mutex          return
> if nr_closed_windows == 0
>    // No DLPAR CPU or migration
>    add to the list
>    unlock vas_pseries_mutex
>    return
> unlock vas_pseries_mutex
> Close VAS window
> // due to DLPAR CPU or migration
> return -EBUSY

Could the path t1 takes simply hold the mutex for the duration of
its execution instead of dropping and reacquiring it in the middle?

Here's the relevant code from vas_allocate_window():

mutex_lock(&vas_pseries_mutex);
if (migration_in_progress)
        rc = -EBUSY;
else
        rc = allocate_setup_window(txwin, (u64 *)&domain[0],
                                   cop_feat_caps->win_type);
mutex_unlock(&vas_pseries_mutex);
if (rc)
        goto out;

rc = h_modify_vas_window(txwin);
if (!rc)
        rc = get_vas_user_win_ref(&txwin->vas_win.task_ref);
if (rc)
        goto out_free;

txwin->win_type = cop_feat_caps->win_type;
mutex_lock(&vas_pseries_mutex);
if (!caps->nr_close_wins) {
        list_add(&txwin->win_list, &caps->list);
        caps->nr_open_windows++;
        mutex_unlock(&vas_pseries_mutex);
        vas_user_win_add_mm_context(&txwin->vas_win.task_ref);
        return &txwin->vas_win;
}
mutex_unlock(&vas_pseries_mutex);

Is there something about h_modify_vas_window() or get_vas_user_win_ref()
that requires temporarily dropping the lock?


[PATCH] powerpc/pseries/vas: Migration suspend waits for no in-progress open windows

2023-09-26 Thread Haren Myneni
The hypervisor returns a migration failure if all VAS windows are not
closed. During the pre-migration stage, vas_migration_handler() sets the
migration_in_progress flag and closes all windows on the list. The VAS
window allocation routine checks the migration flag, sets up the window,
and then adds it to the list. So it is possible for the migration handler
to miss a window that is still in the process of being set up.

t1: Allocate and open VAS       t2: Migration event
    window

lock vas_pseries_mutex
If migration_in_progress set
   unlock vas_pseries_mutex
   return
open window HCALL
unlock vas_pseries_mutex
Modify window HCALL             lock vas_pseries_mutex
setup window                    migration_in_progress=true
                                Closes all windows from
                                the list
                                unlock vas_pseries_mutex
lock vas_pseries_mutex          return
if nr_closed_windows == 0
   // No DLPAR CPU or migration
   add to the list
   unlock vas_pseries_mutex
   return
unlock vas_pseries_mutex
Close VAS window
// due to DLPAR CPU or migration
return -EBUSY

This patch resolves the issue with the following steps:
- Define migration_in_progress as atomic so that the migration
  handler can set this flag without holding the mutex.
- Introduce an nr_open_wins_progress counter in the VAS capabilities
  struct.
- This counter tracks the number of window opens that are still in
  progress.
- The allocate/setup window thread closes the window if migration is
  in progress and decrements the nr_open_wins_progress counter.
- The migration handler waits until there are no in-progress open
  windows (a sketch of that wait appears after the diff below).

Fixes: 37e6764895ef ("powerpc/pseries/vas: Add VAS migration handler")
Signed-off-by: Haren Myneni 
---
 arch/powerpc/platforms/pseries/vas.c | 51 ++--
 arch/powerpc/platforms/pseries/vas.h |  2 ++
 2 files changed, 43 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/vas.c b/arch/powerpc/platforms/pseries/vas.c
index 15d958e38eca..efdaf12ffe49 100644
--- a/arch/powerpc/platforms/pseries/vas.c
+++ b/arch/powerpc/platforms/pseries/vas.c
@@ -31,7 +31,8 @@ static struct hv_vas_cop_feat_caps hv_cop_caps;
 
 static struct vas_caps vascaps[VAS_MAX_FEAT_TYPE];
 static DEFINE_MUTEX(vas_pseries_mutex);
-static bool migration_in_progress;
+static atomic_t migration_in_progress = ATOMIC_INIT(0);
+static DECLARE_WAIT_QUEUE_HEAD(open_win_progress_wq);
 
 static long hcall_return_busy_check(long rc)
 {
@@ -384,11 +385,15 @@ static struct vas_window *vas_allocate_window(int vas_id, u64 flags,
 	 * same fault IRQ is not freed by the OS before.
 	 */
 	mutex_lock(&vas_pseries_mutex);
-	if (migration_in_progress)
+	if (atomic_read(&migration_in_progress)) {
 		rc = -EBUSY;
-	else
+	} else {
 		rc = allocate_setup_window(txwin, (u64 *)&domain[0],
 					   cop_feat_caps->win_type);
+		if (!rc)
+			atomic_inc(&caps->nr_open_wins_progress);
+	}
+
 	mutex_unlock(&vas_pseries_mutex);
if (rc)
goto out;
@@ -403,8 +408,17 @@ static struct vas_window *vas_allocate_window(int vas_id, u64 flags,
goto out_free;
 
txwin->win_type = cop_feat_caps->win_type;
-	mutex_lock(&vas_pseries_mutex);
+
/*
+* The migration SUSPEND thread sets migration_in_progress and
+* closes all open windows from the list. But the window is
+* added to the list after open and modify HCALLs. So possible
+* that migration_in_progress is set before modify HCALL which
+* may cause some windows are still open when the hypervisor
+* initiates the migration.
+* So checks the migration_in_progress flag again and close all
+* open windows.
+*
 * Possible to lose the acquired credit with DLPAR core
 * removal after the window is opened. So if there are any
 * closed windows (means with lost credits), do not give new
@@ -412,9 +426,11 @@ static struct vas_window *vas_allocate_window(int vas_id, u64 flags,
 * after the existing windows are reopened when credits are
 * available.
 */
-   if (!caps->nr_close_wins) {
+	mutex_lock(&vas_pseries_mutex);
+	if (!caps->nr_close_wins && !atomic_read(&migration_in_progress)) {
 		list_add(&txwin->win_list, &caps->list);
 		caps->nr_open_windows++;
+		atomic_dec(&caps->nr_open_wins_progress);
 		mutex_unlock(&vas_pseries_mutex);
 		vas_user_win_add_mm_context(&txwin->vas_win.task_ref);
 		return &txwin->vas_win;
@@ -432,6 +448,8 @@ static struct vas_window *vas_allocate_window(int vas_id, u64 flags,
 */
free_irq_setup(txwin);
h_deallocate_vas_window(txwin->vas_win.winid);
+	atomic_dec(&caps->nr_open_wins_progress);
+	wake_up(&open_win_progress_wq);
 out:
 	atomic_dec(&cop_feat_caps->nr_used_credits);
kfree(txwin);
@@ -936,18 +954,18 @@ int 
