[PATCH v6] powerpc/pseries/vas: Use usleep_range() to support HCALL delay

2024-01-15 Thread Haren Myneni
The VAS allocate, modify and deallocate HCALLs return
H_LONG_BUSY_ORDER_1_MSEC or H_LONG_BUSY_ORDER_10_MSEC to indicate a
busy delay, and the OS is expected to reissue the HCALL after that
delay. But msleep() will often sleep for at least 20 msecs even
though the hypervisor suggests that the OS reissue these HCALLs
after 1 or 10 msecs.

The open and close VAS window functions hold a mutex while issuing
these HCALLs. So these operations can take longer than necessary
when multiple threads open or close windows simultaneously, which
can especially hurt performance when a window is opened and closed
for each compression request.

Multiple tasks can open / close VAS windows at the same time,
limited by the available VAS credits. For example, a 240-core
system provides 4800 VAS credits, which means 4800 tasks can
issue open VAS window HCALLs, all serialized by the mutex. Since
each msleep() will often sleep for more than 20 msecs, some tasks
wait more than 120 secs to acquire the mutex. This can cause hung
task traces for these tasks in dmesg due to mutex contention
around the open/close HCALLs.

Instead of msleep(), use usleep_range() so that a task does not
sleep much longer than the suggested delay before reissuing the
HCALL. Since each task then sleeps for at most 10 msecs, this
patch allows more tasks to issue open/close VAS calls without any
hung task traces in dmesg.

Signed-off-by: Haren Myneni 
Suggested-by: Nathan Lynch 

---
v1 -> v2:
- Use usleep_range instead of using RTAS sleep routine as
  suggested by Nathan
v2 -> v3:
- Sleep 10MSecs even for HCALL delay > 10MSecs and the other
  commit / comment changes as suggested by Nathan and Ellerman.
v3 -> v4:
- More description in the commit log with the visible impact for
  the current code as suggested by Aneesh
v4 -> v5:
- Use USEC_PER_MSEC macro in usleep_range as suggested by Aneesh
v5 -> v6:
- Use USEC_PER_MSEC macro to calculate all ranges in usleep_range()
  and more description in the commit log.
---
 arch/powerpc/platforms/pseries/vas.c | 22 +-
 1 file changed, 21 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/platforms/pseries/vas.c b/arch/powerpc/platforms/pseries/vas.c
index 71d52a670d95..8e8934564557 100644
--- a/arch/powerpc/platforms/pseries/vas.c
+++ b/arch/powerpc/platforms/pseries/vas.c
@@ -38,7 +38,27 @@ static long hcall_return_busy_check(long rc)
 {
 	/* Check if we are stalled for some time */
 	if (H_IS_LONG_BUSY(rc)) {
-		msleep(get_longbusy_msecs(rc));
+		unsigned int ms;
+		/*
+		 * Allocate, Modify and Deallocate HCALLs returns
+		 * H_LONG_BUSY_ORDER_1_MSEC or H_LONG_BUSY_ORDER_10_MSEC
+		 * for the long delay. So the sleep time should always
+		 * be either 1 or 10msecs, but in case if the HCALL
+		 * returns the long delay > 10 msecs, clamp the sleep
+		 * time to 10msecs.
+		 */
+		ms = clamp(get_longbusy_msecs(rc), 1, 10);
+
+		/*
+		 * msleep() will often sleep at least 20 msecs even
+		 * though the hypervisor suggests that the OS reissue
+		 * HCALLs after 1 or 10msecs. Also the delay hint from
+		 * the HCALL is just a suggestion. So OK to pause for
+		 * less time than the hinted delay. Use usleep_range()
+		 * to ensure we don't sleep much longer than actually
+		 * needed.
+		 */
+		usleep_range(ms * (USEC_PER_MSEC / 10), ms * USEC_PER_MSEC);
 		rc = H_BUSY;
 	} else if (rc == H_BUSY) {
 		cond_resched();
-- 
2.26.3
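
For reference, the range arithmetic in the hunk above can be checked
in userspace. The following is a minimal sketch, not the kernel code:
clamp_uint() and vas_busy_sleep_range() are hypothetical helpers that
stand in for the kernel's clamp() and for the two arguments passed to
usleep_range(), and the hinted delay is passed in directly instead of
being decoded from the HCALL return code.

```c
#define USEC_PER_MSEC 1000U

/* Bounds (in microseconds) that would be handed to usleep_range(). */
struct sleep_range {
	unsigned int min_us;
	unsigned int max_us;
};

/* Userspace stand-in for the kernel's clamp() macro (hypothetical helper). */
static unsigned int clamp_uint(unsigned int v, unsigned int lo, unsigned int hi)
{
	return v < lo ? lo : (v > hi ? hi : v);
}

/*
 * Mirror the v6 logic: clamp the hinted delay to 1..10 msecs, then
 * sleep for somewhere between a tenth of that delay and the full delay.
 */
static struct sleep_range vas_busy_sleep_range(unsigned int hinted_ms)
{
	unsigned int ms = clamp_uint(hinted_ms, 1, 10);
	struct sleep_range r = {
		.min_us = ms * (USEC_PER_MSEC / 10),
		.max_us = ms * USEC_PER_MSEC,
	};

	return r;
}
```

With a 1 msec hint this gives a 100..1000 usec range, and any hint of
10 msecs or more gives 1000..10000 usecs, so the upper bound never
exceeds the 10 msecs mentioned in the commit log.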



Re: [PATCH v5] powerpc/pseries/vas: Use usleep_range() to support HCALL delay

2024-01-15 Thread Haren Myneni




On 1/11/24 9:27 AM, Nathan Lynch wrote:

Haren Myneni  writes:

VAS allocate, modify and deallocate HCALLs returns
H_LONG_BUSY_ORDER_1_MSEC or H_LONG_BUSY_ORDER_10_MSEC for busy
delay and expects OS to reissue HCALL after that delay. But using
msleep() will often sleep at least 20 msecs even though the
hypervisor suggests OS reissue these HCALLs after 1 or 10msecs.

The open and close VAS window functions hold mutex and then issue
these HCALLs. So these operations can take longer than the
necessary when multiple threads issue open or close window APIs
simultaneously, especially might affect the performance in the
case of repeat open/close APIs for each compression request.
On the large machine configuration which allows more simultaneous
open/close windows (Ex: 240 cores provides 4800 VAS credits), the
user can observe hung task traces in dmesg due to mutex contention
around open/close HCALLs.


Is this because the workload queues enough tasks on the mutex to trigger
the hung task watchdog? With a threshold of 120 seconds, something on
the order of ~6000 tasks each taking 20ms or more to traverse this
critical section would cause the problem I think you're describing.

Presumably this change improves the situation, but the commit message
isn't explicit. Have you measured the "throughput" of window open/close
activity before and after? Anything that quantifies the improvement
would be welcome.


Yes, tested on a large system which allows 4800 windows to be opened or 
closed at the same time (i.e. 4800 tasks). I noticed sleeps of more than 
20 msecs for some tasks, and hung task traces for some tasks since the 
combined waiting time is more than 120 seconds. With this patch, the 
maximum sleep is 10 msecs and I did not see these traces on this system. 
I will add more description to the commit log.


Thanks
Haren
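
The figures in this exchange can be sanity-checked with a rough
model. The sketch below assumes that tasks pass through the critical
section strictly one after another and that each holder keeps the
mutex for a fixed time; both helper names are hypothetical, and the
120 second threshold is the one quoted above. Real hold times vary,
since an HCALL may be retried more than once.

```c
#include <stdint.h>

#define MSEC_PER_SEC 1000ULL

/*
 * Time (in msecs) the last of nr_tasks waiters spends queued on the
 * mutex if every task ahead of it holds the mutex for hold_ms.
 */
static uint64_t last_waiter_delay_ms(uint64_t nr_tasks, uint64_t hold_ms)
{
	return nr_tasks ? (nr_tasks - 1) * hold_ms : 0;
}

/* Would the last waiter be blocked longer than the hung task threshold? */
static int exceeds_hung_task_timeout(uint64_t nr_tasks, uint64_t hold_ms,
				     uint64_t timeout_secs)
{
	return last_waiter_delay_ms(nr_tasks, hold_ms) >
	       timeout_secs * MSEC_PER_SEC;
}
```

Under these assumptions, 4800 tasks at exactly 20 msecs each stay just
under the threshold (about 96 seconds), at 30 msecs each the last
waiter is blocked for about 144 seconds, and at 10 msecs each for
about 48 seconds.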





[PATCH v5] powerpc/pseries/vas: Use usleep_range() to support HCALL delay

2024-01-10 Thread Haren Myneni
The VAS allocate, modify and deallocate HCALLs return
H_LONG_BUSY_ORDER_1_MSEC or H_LONG_BUSY_ORDER_10_MSEC to indicate a
busy delay, and the OS is expected to reissue the HCALL after that
delay. But msleep() will often sleep for at least 20 msecs even
though the hypervisor suggests that the OS reissue these HCALLs
after 1 or 10 msecs.

The open and close VAS window functions hold a mutex while issuing
these HCALLs. So these operations can take longer than necessary
when multiple threads open or close windows simultaneously, which
can especially hurt performance when a window is opened and closed
for each compression request. On a large machine configuration
which allows more simultaneous open/close windows (e.g. 240 cores
provide 4800 VAS credits), the user can observe hung task traces
in dmesg due to mutex contention around the open/close HCALLs.

So instead of msleep(), use usleep_range() so that a task does not
sleep much longer than the suggested delay before reissuing the
HCALL.

Signed-off-by: Haren Myneni 
Suggested-by: Nathan Lynch 

---
v1 -> v2:
- Use usleep_range instead of using RTAS sleep routine as
  suggested by Nathan
v2 -> v3:
- Sleep 10MSecs even for HCALL delay > 10MSecs and the other
  commit / comment changes as suggested by Nathan and Ellerman.
v3 -> v4:
- More description in the commit log with the visible impact for
  the current code as suggested by Aneesh
v4 -> v5:
- Use USEC_PER_MSEC macro in usleep_range as suggested by Aneesh
---
 arch/powerpc/platforms/pseries/vas.c | 22 +-
 1 file changed, 21 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/platforms/pseries/vas.c b/arch/powerpc/platforms/pseries/vas.c
index 71d52a670d95..79ffe8868c04 100644
--- a/arch/powerpc/platforms/pseries/vas.c
+++ b/arch/powerpc/platforms/pseries/vas.c
@@ -38,7 +38,27 @@ static long hcall_return_busy_check(long rc)
 {
 	/* Check if we are stalled for some time */
 	if (H_IS_LONG_BUSY(rc)) {
-		msleep(get_longbusy_msecs(rc));
+		unsigned int ms;
+		/*
+		 * Allocate, Modify and Deallocate HCALLs returns
+		 * H_LONG_BUSY_ORDER_1_MSEC or H_LONG_BUSY_ORDER_10_MSEC
+		 * for the long delay. So the sleep time should always
+		 * be either 1 or 10msecs, but in case if the HCALL
+		 * returns the long delay > 10 msecs, clamp the sleep
+		 * time to 10msecs.
+		 */
+		ms = clamp(get_longbusy_msecs(rc), 1, 10);
+
+		/*
+		 * msleep() will often sleep at least 20 msecs even
+		 * though the hypervisor suggests that the OS reissue
+		 * HCALLs after 1 or 10msecs. Also the delay hint from
+		 * the HCALL is just a suggestion. So OK to pause for
+		 * less time than the hinted delay. Use usleep_range()
+		 * to ensure we don't sleep much longer than actually
+		 * needed.
+		 */
+		usleep_range(ms * 100, ms * USEC_PER_MSEC);
 		rc = H_BUSY;
 	} else if (rc == H_BUSY) {
 		cond_resched();
-- 
2.26.3



[PATCH v4] powerpc/pseries/vas: Use usleep_range() to support HCALL delay

2023-12-27 Thread Haren Myneni
The VAS allocate, modify and deallocate HCALLs return
H_LONG_BUSY_ORDER_1_MSEC or H_LONG_BUSY_ORDER_10_MSEC to indicate a
busy delay, and the OS is expected to reissue the HCALL after that
delay. But msleep() will often sleep for at least 20 msecs even
though the hypervisor suggests that the OS reissue these HCALLs
after 1 or 10 msecs.

The open and close VAS window functions hold a mutex while issuing
these HCALLs. So these operations can take longer than necessary
when multiple threads open or close windows simultaneously, which
can especially hurt performance when a window is opened and closed
for each compression request. On a large machine configuration
which allows more simultaneous open/close windows (e.g. 240 cores
provide 4800 VAS credits), the user can observe hung task traces
in dmesg due to mutex contention around the open/close HCALLs.

So instead of msleep(), use usleep_range() so that a task does not
sleep much longer than the suggested delay before reissuing the
HCALL.

Signed-off-by: Haren Myneni 
Suggested-by: Nathan Lynch 

---
v1 -> v2:
- Use usleep_range instead of using RTAS sleep routine as
  suggested by Nathan
v2 -> v3:
- Sleep 10MSecs even for HCALL delay > 10MSecs and the other
  commit / comment changes as suggested by Nathan and Ellerman.
v3 -> v4:
- More description in the commit log with the visible impact for
  the current code as suggested by Aneesh
---
 arch/powerpc/platforms/pseries/vas.c | 25 -
 1 file changed, 24 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/platforms/pseries/vas.c b/arch/powerpc/platforms/pseries/vas.c
index 71d52a670d95..5cf81c564d4b 100644
--- a/arch/powerpc/platforms/pseries/vas.c
+++ b/arch/powerpc/platforms/pseries/vas.c
@@ -38,7 +38,30 @@ static long hcall_return_busy_check(long rc)
 {
 	/* Check if we are stalled for some time */
 	if (H_IS_LONG_BUSY(rc)) {
-		msleep(get_longbusy_msecs(rc));
+		unsigned int ms;
+		/*
+		 * Allocate, Modify and Deallocate HCALLs returns
+		 * H_LONG_BUSY_ORDER_1_MSEC or H_LONG_BUSY_ORDER_10_MSEC
+		 * for the long delay. So the sleep time should always
+		 * be either 1 or 10msecs, but in case if the HCALL
+		 * returns the long delay > 10 msecs, clamp the sleep
+		 * time to 10msecs.
+		 */
+		ms = clamp(get_longbusy_msecs(rc), 1, 10);
+
+		/*
+		 * msleep() will often sleep at least 20 msecs even
+		 * though the hypervisor suggests that the OS reissue
+		 * HCALLs after 1 or 10msecs. Also the delay hint from
+		 * the HCALL is just a suggestion. So OK to pause for
+		 * less time than the hinted delay. Use usleep_range()
+		 * to ensure we don't sleep much longer than actually
+		 * needed.
+		 *
+		 * See Documentation/timers/timers-howto.rst for
+		 * explanation of the range used here.
+		 */
+		usleep_range(ms * 100, ms * 1000);
 		rc = H_BUSY;
 	} else if (rc == H_BUSY) {
 		cond_resched();
-- 
2.26.3



Re: [PATCH v3] powerpc/pseries/vas: Use usleep_range() to support HCALL delay

2023-12-05 Thread Haren Myneni




On 12/4/23 6:05 AM, Aneesh Kumar K.V (IBM) wrote:

Haren Myneni  writes:


VAS allocate, modify and deallocate HCALLs returns
H_LONG_BUSY_ORDER_1_MSEC or H_LONG_BUSY_ORDER_10_MSEC for busy
delay and expects OS to reissue HCALL after that delay. But using
msleep() will often sleep at least 20 msecs even though the
hypervisor suggests OS reissue these HCALLs after 1 or 10msecs.
The open and close VAS window functions hold mutex and then issue
these HCALLs. So these operations can take longer than the
necessary when multiple threads issue open or close window APIs
simultaneously.

So instead of msleep(), use usleep_range() to ensure sleep with
the expected value before issuing HCALL again.



Can you summarize whether there is a user-observable impact from the
current code? We have other code paths using msleep(get_longbusy_msecs()).
Should we audit those usages?


As mentioned in the description, the open and close VAS window APIs can 
take longer with simultaneous calls, which can especially hurt 
performance when windows are opened and closed for each compression 
request. On a large machine configuration which allows more 
simultaneous open windows (e.g. 240 cores provide 4800 VAS credits), the 
user can observe mutex contention around the open/close HCALLs and hung 
task traces in dmesg. I will repost the patch with this update in the 
commit message.


I think a similar approach is applicable to other HCALLs (like in 
rtas_busy_delay()), but I have not seen any impact so far with other 
HCALLs. So we can add this change later.


Thanks
Haren






Signed-off-by: Haren Myneni 
Suggested-by: Nathan Lynch 

---
v1 -> v2:
- Use usleep_range instead of using RTAS sleep routine as
   suggested by Nathan
v2 -> v3:
- Sleep 10MSecs even for HCALL delay > 10MSecs and the other
   commit / comment changes as suggested by Nathan and Ellerman.
---
  arch/powerpc/platforms/pseries/vas.c | 25 -
  1 file changed, 24 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/platforms/pseries/vas.c b/arch/powerpc/platforms/pseries/vas.c
index 71d52a670d95..5cf81c564d4b 100644
--- a/arch/powerpc/platforms/pseries/vas.c
+++ b/arch/powerpc/platforms/pseries/vas.c
@@ -38,7 +38,30 @@ static long hcall_return_busy_check(long rc)
  {
/* Check if we are stalled for some time */
if (H_IS_LONG_BUSY(rc)) {
-   msleep(get_longbusy_msecs(rc));
+   unsigned int ms;
+   /*
+* Allocate, Modify and Deallocate HCALLs returns
+* H_LONG_BUSY_ORDER_1_MSEC or H_LONG_BUSY_ORDER_10_MSEC
+* for the long delay. So the sleep time should always
+* be either 1 or 10msecs, but in case if the HCALL
+* returns the long delay > 10 msecs, clamp the sleep
+* time to 10msecs.
+*/
+   ms = clamp(get_longbusy_msecs(rc), 1, 10);
+
+   /*
+* msleep() will often sleep at least 20 msecs even
+* though the hypervisor suggests that the OS reissue
+* HCALLs after 1 or 10msecs. Also the delay hint from
+* the HCALL is just a suggestion. So OK to pause for
+* less time than the hinted delay. Use usleep_range()
+* to ensure we don't sleep much longer than actually
+* needed.
+*
+* See Documentation/timers/timers-howto.rst for
+* explanation of the range used here.
+*/
+   usleep_range(ms * 100, ms * 1000);
rc = H_BUSY;
} else if (rc == H_BUSY) {
cond_resched();
--
2.26.3


[PATCH v3] powerpc/pseries/vas: Use usleep_range() to support HCALL delay

2023-12-02 Thread Haren Myneni
The VAS allocate, modify and deallocate HCALLs return
H_LONG_BUSY_ORDER_1_MSEC or H_LONG_BUSY_ORDER_10_MSEC to indicate a
busy delay, and the OS is expected to reissue the HCALL after that
delay. But msleep() will often sleep for at least 20 msecs even
though the hypervisor suggests that the OS reissue these HCALLs
after 1 or 10 msecs. The open and close VAS window functions hold a
mutex while issuing these HCALLs. So these operations can take
longer than necessary when multiple threads open or close windows
simultaneously.

So instead of msleep(), use usleep_range() so that a task does not
sleep much longer than the suggested delay before reissuing the
HCALL.

Signed-off-by: Haren Myneni 
Suggested-by: Nathan Lynch 

---
v1 -> v2:
- Use usleep_range instead of using RTAS sleep routine as
  suggested by Nathan
v2 -> v3:
- Sleep 10MSecs even for HCALL delay > 10MSecs and the other
  commit / comment changes as suggested by Nathan and Ellerman.
---
 arch/powerpc/platforms/pseries/vas.c | 25 -
 1 file changed, 24 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/platforms/pseries/vas.c b/arch/powerpc/platforms/pseries/vas.c
index 71d52a670d95..5cf81c564d4b 100644
--- a/arch/powerpc/platforms/pseries/vas.c
+++ b/arch/powerpc/platforms/pseries/vas.c
@@ -38,7 +38,30 @@ static long hcall_return_busy_check(long rc)
 {
 	/* Check if we are stalled for some time */
 	if (H_IS_LONG_BUSY(rc)) {
-		msleep(get_longbusy_msecs(rc));
+		unsigned int ms;
+		/*
+		 * Allocate, Modify and Deallocate HCALLs returns
+		 * H_LONG_BUSY_ORDER_1_MSEC or H_LONG_BUSY_ORDER_10_MSEC
+		 * for the long delay. So the sleep time should always
+		 * be either 1 or 10msecs, but in case if the HCALL
+		 * returns the long delay > 10 msecs, clamp the sleep
+		 * time to 10msecs.
+		 */
+		ms = clamp(get_longbusy_msecs(rc), 1, 10);
+
+		/*
+		 * msleep() will often sleep at least 20 msecs even
+		 * though the hypervisor suggests that the OS reissue
+		 * HCALLs after 1 or 10msecs. Also the delay hint from
+		 * the HCALL is just a suggestion. So OK to pause for
+		 * less time than the hinted delay. Use usleep_range()
+		 * to ensure we don't sleep much longer than actually
+		 * needed.
+		 *
+		 * See Documentation/timers/timers-howto.rst for
+		 * explanation of the range used here.
+		 */
+		usleep_range(ms * 100, ms * 1000);
 		rc = H_BUSY;
 	} else if (rc == H_BUSY) {
 		cond_resched();
-- 
2.26.3



Re: [PATCH v2] powerpc/pseries/vas: Use usleep_range() to support HCALL delay

2023-11-30 Thread Haren Myneni




On 11/29/23 6:07 PM, Michael Ellerman wrote:

Haren Myneni  writes:

VAS allocate, modify and deallocate HCALLs returns
H_LONG_BUSY_ORDER_1_MSEC or H_LONG_BUSY_ORDER_10_MSEC for busy
delay and expects OS to reissue HCALL after that delay. But using
msleep() will often sleep at least 20 msecs even though the
hypervisor expects to reissue these HCALLs after 1 or 10msecs.
It might cause these HCALLs takes longer when multiple threads
issue open or close VAS windows simultaneously.

So instead of msleep(), use usleep_range() to ensure sleep with
the expected value before issuing HCALL again.

Signed-off-by: Haren Myneni 
Suggested-by: Nathan Lynch 

---
v1 -> v2:
- Use usleep_range instead of using RTAS sleep routine as
   suggested by Nathan
---
  arch/powerpc/platforms/pseries/vas.c | 24 +++-
  1 file changed, 23 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/platforms/pseries/vas.c b/arch/powerpc/platforms/pseries/vas.c
index 71d52a670d95..bade4402741f 100644
--- a/arch/powerpc/platforms/pseries/vas.c
+++ b/arch/powerpc/platforms/pseries/vas.c
@@ -36,9 +36,31 @@ static bool migration_in_progress;
  
  static long hcall_return_busy_check(long rc)

  {
+   unsigned int ms;
+
/* Check if we are stalled for some time */
if (H_IS_LONG_BUSY(rc)) {
-   msleep(get_longbusy_msecs(rc));
+   ms = get_longbusy_msecs(rc);
+   /*
+* Allocate, Modify and Deallocate HCALLs returns
+* H_LONG_BUSY_ORDER_1_MSEC or H_LONG_BUSY_ORDER_10_MSEC
+* for the long delay. So the delay should always be 1
+* or 10msecs, but sleeps 1msec in case if the long
+* delay is > H_LONG_BUSY_ORDER_10_MSEC.
+*/
+   if (ms > 10)
+   ms = 1;
  
I don't understand this. The hypervisor asked you to sleep for more than
10 milliseconds, so instead you sleep for 1?

I can understand that we don't want to usleep() for the longer durations
that could be returned, but so shouldn't the code be using msleep() for
those values?

Sleeping for a very short duration definitely seems wrong.


The allocate, modify and deallocate HCALLs return only 1 msec and 
10 msec long delays. We should not expect > 10 msecs for these HCALLs. 
Hence ms = 1 if ms > 10.


But it is confusing, so I will use ms = 10 for ms >= 10 as Nathan suggested.





+   /*
+* msleep() will often sleep at least 20 msecs even
+* though the hypervisor expects to reissue these
  
That makes it sound like the hypervisor is reissuing the hcalls.


Better would be "the hypervisor suggests the kernel should reissue the
hcall after ...".


+* HCALLs after 1 or 10msecs. So use usleep_range()
+* to sleep with the expected value.
+*
+* See Documentation/timers/timers-howto.rst on using
+* the value range in usleep_range().
+*/
+   usleep_range(ms * 100, ms * 1000);


If ms == 1, then that's 100 usecs, which is not 1 millisecond?

Please use USEC_PER_MSEC.


This uses usleep_range() the same way as rtas_busy_delay() does.


Thanks
Haren




rc = H_BUSY;
} else if (rc == H_BUSY) {
cond_resched();


cheers



Re: [PATCH v2] powerpc/pseries/vas: Use usleep_range() to support HCALL delay

2023-11-30 Thread Haren Myneni




On 11/29/23 5:43 PM, Nathan Lynch wrote:

Haren Myneni  writes:

VAS allocate, modify and deallocate HCALLs returns
H_LONG_BUSY_ORDER_1_MSEC or H_LONG_BUSY_ORDER_10_MSEC for busy
delay and expects OS to reissue HCALL after that delay. But using
msleep() will often sleep at least 20 msecs even though the
hypervisor expects to reissue these HCALLs after 1 or 10msecs.


I would word this as "the architecture suggests that the OS reissue
these [...]" instead of framing it as something the platform "expects".


It might cause these HCALLs takes longer when multiple threads
issue open or close VAS windows simultaneously.


This is imprecise. Over-sleeping by the OS doesn't cause individual
hcalls to take longer. It is more accurate to say that the higher-level
operation (allocate, modify, free) may take longer than necessary in
cases where the OS must retry the hcalls involved.


Correct, the operation takes longer with multiple threads opening/closing 
windows. I will make that clear.





So instead of msleep(), use usleep_range() to ensure sleep with
the expected value before issuing HCALL again.

Signed-off-by: Haren Myneni 
Suggested-by: Nathan Lynch 

---
v1 -> v2:
- Use usleep_range instead of using RTAS sleep routine as
   suggested by Nathan
---
  arch/powerpc/platforms/pseries/vas.c | 24 +++-
  1 file changed, 23 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/platforms/pseries/vas.c b/arch/powerpc/platforms/pseries/vas.c
index 71d52a670d95..bade4402741f 100644
--- a/arch/powerpc/platforms/pseries/vas.c
+++ b/arch/powerpc/platforms/pseries/vas.c
@@ -36,9 +36,31 @@ static bool migration_in_progress;
  
  static long hcall_return_busy_check(long rc)

  {
+   unsigned int ms;


This should move down into the H_IS_LONG_BUSY() block if it's not used
outside of it.


+
/* Check if we are stalled for some time */
if (H_IS_LONG_BUSY(rc)) {
-   msleep(get_longbusy_msecs(rc));
+   ms = get_longbusy_msecs(rc);
+   /*
+* Allocate, Modify and Deallocate HCALLs returns
+* H_LONG_BUSY_ORDER_1_MSEC or H_LONG_BUSY_ORDER_10_MSEC
+* for the long delay. So the delay should always be 1
+* or 10msecs, but sleeps 1msec in case if the long
+* delay is > H_LONG_BUSY_ORDER_10_MSEC.
+*/
+   if (ms > 10)
+   ms = 1;


It's strange to coerce ms to 1 when it's greater than 10. Just clamp it
to 10, e.g.

 ms = clamp(get_longbusy_msecs(rc), 1, 10);


Sure, these HCALLs should not return > H_LONG_BUSY_ORDER_10_MSEC.




+
+   /*
+* msleep() will often sleep at least 20 msecs even
+* though the hypervisor expects to reissue these
+* HCALLs after 1 or 10msecs. So use usleep_range()
+* to sleep with the expected value.
+*
+* See Documentation/timers/timers-howto.rst on using
+* the value range in usleep_range().
+*/
+   usleep_range(ms * 100, ms * 1000);


If there's going to be commentary here I think it should just explain
why potentially sleeping for less than the suggested time is OK. There
is wording you can crib in rtas_busy_delay().



rc = H_BUSY;
} else if (rc == H_BUSY) {
cond_resched();
--
2.26.3


[PATCH v2] powerpc/pseries/vas: Use usleep_range() to support HCALL delay

2023-11-28 Thread Haren Myneni
VAS allocate, modify and deallocate HCALLs returns
H_LONG_BUSY_ORDER_1_MSEC or H_LONG_BUSY_ORDER_10_MSEC for busy
delay and expects OS to reissue HCALL after that delay. But using
msleep() will often sleep at least 20 msecs even though the
hypervisor expects to reissue these HCALLs after 1 or 10msecs.
It might cause these HCALLs takes longer when multiple threads
issue open or close VAS windows simultaneously.

So instead of msleep(), use usleep_range() to ensure sleep with
the expected value before issuing HCALL again.

Signed-off-by: Haren Myneni 
Suggested-by: Nathan Lynch 

---
v1 -> v2:
- Use usleep_range instead of using RTAS sleep routine as
  suggested by Nathan
---
 arch/powerpc/platforms/pseries/vas.c | 24 +++-
 1 file changed, 23 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/platforms/pseries/vas.c b/arch/powerpc/platforms/pseries/vas.c
index 71d52a670d95..bade4402741f 100644
--- a/arch/powerpc/platforms/pseries/vas.c
+++ b/arch/powerpc/platforms/pseries/vas.c
@@ -36,9 +36,31 @@ static bool migration_in_progress;
 
 static long hcall_return_busy_check(long rc)
 {
+	unsigned int ms;
+
 	/* Check if we are stalled for some time */
 	if (H_IS_LONG_BUSY(rc)) {
-		msleep(get_longbusy_msecs(rc));
+		ms = get_longbusy_msecs(rc);
+		/*
+		 * Allocate, Modify and Deallocate HCALLs returns
+		 * H_LONG_BUSY_ORDER_1_MSEC or H_LONG_BUSY_ORDER_10_MSEC
+		 * for the long delay. So the delay should always be 1
+		 * or 10msecs, but sleeps 1msec in case if the long
+		 * delay is > H_LONG_BUSY_ORDER_10_MSEC.
+		 */
+		if (ms > 10)
+			ms = 1;
+
+		/*
+		 * msleep() will often sleep at least 20 msecs even
+		 * though the hypervisor expects to reissue these
+		 * HCALLs after 1 or 10msecs. So use usleep_range()
+		 * to sleep with the expected value.
+		 *
+		 * See Documentation/timers/timers-howto.rst on using
+		 * the value range in usleep_range().
+		 */
+		usleep_range(ms * 100, ms * 1000);
 		rc = H_BUSY;
 	} else if (rc == H_BUSY) {
 		cond_resched();
-- 
2.26.3



[PATCH 2/2] powerpc/pseries/vas: Call rtas_busy_sleep() to support HCALL delay

2023-11-27 Thread Haren Myneni
VAS allocate, modify and deallocate HCALLs returns
H_LONG_BUSY_ORDER_1_MSEC or H_LONG_BUSY_ORDER_10_MSEC for busy
delay and expects OS to reissue HCALL after that delay. But using
msleep() will often sleep at least 20 msecs even though the
hypervisor expects to reissue these HCALLs after 1 or 10msecs.
It might cause these HCALLs takes longer when multiple threads
issue open or close VAS windows simultaneously.

So instead of using msleep(), call rtas_busy_sleep() which uses
usleep_range() if the delay is <= 20msecs.

Signed-off-by: Haren Myneni 
Suggested-by: Nathan Lynch 
---
 arch/powerpc/platforms/pseries/vas.c | 9 -
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/platforms/pseries/vas.c b/arch/powerpc/platforms/pseries/vas.c
index 71d52a670d95..c0ffdfc51f96 100644
--- a/arch/powerpc/platforms/pseries/vas.c
+++ b/arch/powerpc/platforms/pseries/vas.c
@@ -18,6 +18,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include "vas.h"
 
@@ -38,7 +39,13 @@ static long hcall_return_busy_check(long rc)
 {
 	/* Check if we are stalled for some time */
 	if (H_IS_LONG_BUSY(rc)) {
-		msleep(get_longbusy_msecs(rc));
+		/*
+		 * Allocate, Modify and Deallocate HCALLs can return
+		 * H_LONG_BUSY_ORDER_1_MSEC or H_LONG_BUSY_ORDER_10_MSEC
+		 * and expects OS to reissue HCALL after 1msec or
+		 * 10msecs.
+		 */
+		rtas_busy_sleep(rc);
 		rc = H_BUSY;
 	} else if (rc == H_BUSY) {
 		cond_resched();
-- 
2.26.3



[PATCH 1/2] powerpc/rtas: Create rtas_busy_sleep function

2023-11-27 Thread Haren Myneni
Move the RTAS delay sleep code into a new rtas_busy_sleep() function.
It can be called from HCALL delay code that needs to use either
usleep_range() or msleep(), depending on the delay value.

Signed-off-by: Haren Myneni 
---
 arch/powerpc/include/asm/rtas.h |  1 +
 arch/powerpc/kernel/rtas.c  | 56 ++---
 2 files changed, 32 insertions(+), 25 deletions(-)

diff --git a/arch/powerpc/include/asm/rtas.h b/arch/powerpc/include/asm/rtas.h
index c697c3c74694..b389351a0045 100644
--- a/arch/powerpc/include/asm/rtas.h
+++ b/arch/powerpc/include/asm/rtas.h
@@ -435,6 +435,7 @@ extern void rtas_get_rtc_time(struct rtc_time *rtc_time);
 extern int rtas_set_rtc_time(struct rtc_time *rtc_time);
 
 extern unsigned int rtas_busy_delay_time(int status);
+extern void rtas_busy_sleep(int value);
 bool rtas_busy_delay(int status);
 
 extern int early_init_dt_scan_rtas(unsigned long node,
diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c
index eddc031c4b95..aa0bd7c4dcf1 100644
--- a/arch/powerpc/kernel/rtas.c
+++ b/arch/powerpc/kernel/rtas.c
@@ -1250,6 +1250,36 @@ static bool __init rtas_busy_delay_early(int status)
 	return retry;
 }
 
+void rtas_busy_sleep(int value)
+{
+	unsigned int ms;
+
+	ms = rtas_busy_delay_time(value);
+	/*
+	 * The extended delay hint can be as high as 100 seconds.
+	 * Surely any function returning such a status is either
+	 * buggy or isn't going to be significantly slowed by us
+	 * polling at 1HZ. Clamp the sleep time to one second.
+	 */
+	ms = clamp(ms, 1U, 1000U);
+	/*
+	 * The delay hint is an order-of-magnitude suggestion, not
+	 * a minimum. It is fine, possibly even advantageous, for
+	 * us to pause for less time than hinted. For small values,
+	 * use usleep_range() to ensure we don't sleep much longer
+	 * than actually needed.
+	 *
+	 * See Documentation/timers/timers-howto.rst for
+	 * explanation of the threshold used here. In effect we use
+	 * usleep_range() for 9900 and 9901, msleep() for
+	 * 9902-9905.
+	 */
+	if (ms <= 20)
+		usleep_range(ms * 100, ms * 1000);
+	else
+		msleep(ms);
+}
+
 /**
  * rtas_busy_delay() - helper for RTAS busy and extended delay statuses
  *
@@ -1270,7 +1300,6 @@ static bool __init rtas_busy_delay_early(int status)
  */
 bool __ref rtas_busy_delay(int status)
 {
-	unsigned int ms;
 	bool ret;
 
 	/*
@@ -1282,30 +1311,7 @@ bool __ref rtas_busy_delay(int status)
 	switch (status) {
 	case RTAS_EXTENDED_DELAY_MIN...RTAS_EXTENDED_DELAY_MAX:
 		ret = true;
-		ms = rtas_busy_delay_time(status);
-		/*
-		 * The extended delay hint can be as high as 100 seconds.
-		 * Surely any function returning such a status is either
-		 * buggy or isn't going to be significantly slowed by us
-		 * polling at 1HZ. Clamp the sleep time to one second.
-		 */
-		ms = clamp(ms, 1U, 1000U);
-		/*
-		 * The delay hint is an order-of-magnitude suggestion, not
-		 * a minimum. It is fine, possibly even advantageous, for
-		 * us to pause for less time than hinted. For small values,
-		 * use usleep_range() to ensure we don't sleep much longer
-		 * than actually needed.
-		 *
-		 * See Documentation/timers/timers-howto.rst for
-		 * explanation of the threshold used here. In effect we use
-		 * usleep_range() for 9900 and 9901, msleep() for
-		 * 9902-9905.
-		 */
-		if (ms <= 20)
-			usleep_range(ms * 100, ms * 1000);
-		else
-			msleep(ms);
+		rtas_busy_sleep(status);
 		break;
 	case RTAS_BUSY:
 		ret = true;
-- 
2.26.3
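
To illustrate the threshold described in the comment above, here is a
small userspace sketch of the decision rtas_busy_sleep() makes. It is
a model only: model_delay_ms() and plan_busy_sleep() are hypothetical
names, and the mapping from status to delay (10^(status - 9900)
msecs) is an assumption chosen to be consistent with the "as high as
100 seconds" remark and the 9900-9905 range in the comment.

```c
#include <stdbool.h>

#define RTAS_EXTENDED_DELAY_MIN 9900
#define RTAS_EXTENDED_DELAY_MAX 9905

/* Which primitive the helper would call, and with what arguments. */
struct sleep_plan {
	bool use_usleep_range;	/* false means msleep(ms) */
	unsigned int min_us;	/* only meaningful for usleep_range() */
	unsigned int max_us;
	unsigned int ms;	/* delay after clamping to 1..1000 */
};

/* Assumed model of the hint: 9900 -> 1 ms, 9901 -> 10 ms, ... */
static unsigned int model_delay_ms(int status)
{
	unsigned int ms = 1;
	int order;

	if (status < RTAS_EXTENDED_DELAY_MIN || status > RTAS_EXTENDED_DELAY_MAX)
		return 0;
	for (order = status - RTAS_EXTENDED_DELAY_MIN; order > 0; order--)
		ms *= 10;
	return ms;
}

static struct sleep_plan plan_busy_sleep(int status)
{
	struct sleep_plan p = { 0 };
	unsigned int ms = model_delay_ms(status);

	/* Same bounds as clamp(ms, 1U, 1000U) in the patch. */
	if (ms < 1)
		ms = 1;
	if (ms > 1000)
		ms = 1000;
	p.ms = ms;

	if (ms <= 20) {
		p.use_usleep_range = true;
		p.min_us = ms * 100;
		p.max_us = ms * 1000;
	}
	return p;
}
```

In this model 9900 and 9901 select usleep_range() and 9902-9905 select
msleep(), with 9904 and 9905 both capped at one second.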



[PATCH v5] powerpc/pseries/vas: Migration suspend waits for no in-progress open windows

2023-11-25 Thread Haren Myneni
The hypervisor returns a migration failure unless all VAS windows are
closed. During the pre-migration stage, vas_migration_handler() sets
the migration_in_progress flag and closes all windows on the list.
The allocate VAS window routine checks the migration flag, sets up
the window and then adds it to the list. So the migration handler may
miss a window that is still in the process of being set up.

t1: Allocate and open VAS       t2: Migration event
    window

lock vas_pseries_mutex
If migration_in_progress set
  unlock vas_pseries_mutex
  return
open window HCALL
unlock vas_pseries_mutex
Modify window HCALL             lock vas_pseries_mutex
setup window                    migration_in_progress=true
                                Closes all windows from the list
                                // May miss windows that are
                                // not in the list
                                unlock vas_pseries_mutex
lock vas_pseries_mutex          return
if nr_closed_windows == 0
  // No DLPAR CPU or migration
  add window to the list
  // Window will be added to the
  // list after the setup is completed
  unlock vas_pseries_mutex
  return
unlock vas_pseries_mutex
Close VAS window
// due to DLPAR CPU or migration
return -EBUSY

This patch resolves the issue with the following steps:
- Set the migration_in_progress flag without holding the mutex.
- Introduce the nr_open_wins_progress counter in the VAS capabilities
  struct.
- This counter tracks the number of window opens that are still in
  progress.
- The allocate setup window thread closes the window if the migration
  flag is set and decrements the nr_open_wins_progress counter.
- The migration handler waits until there are no in-progress open
  windows.

The code flow with the fix is as follows:

t1: Allocate and open VAS           t2: Migration event
    window

lock vas_pseries_mutex
If migration_in_progress set
   unlock vas_pseries_mutex
   return
open window HCALL
nr_open_wins_progress++
// Window opened, but not
// added to the list yet
unlock vas_pseries_mutex
Modify window HCALL                 migration_in_progress=true
setup window                        lock vas_pseries_mutex
                                    Closes all windows from the list
                                    While nr_open_wins_progress {
                                       unlock vas_pseries_mutex
lock vas_pseries_mutex                 sleep
if nr_closed_windows == 0              // Wait if any open window in
and migration is not started           // progress. The open window
   // No DLPAR CPU or migration        // thread closes the window without
   add window to the list              // adding to the list and return if
   nr_open_wins_progress--             // the migration is in progress.
   unlock vas_pseries_mutex
   return
Close VAS window
nr_open_wins_progress--
unlock vas_pseries_mutex
return -EBUSY                          lock vas_pseries_mutex
                                    }
                                    unlock vas_pseries_mutex
                                    return
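
The fixed flow above can be replayed step by step with a small userspace
model. This is only a sketch: struct vas_model and the model_*() helpers
are hypothetical names, the HCALLs and the nr_closed_windows check are not
modeled, and each function stands for one section that the patch runs
under vas_pseries_mutex (the flag write in model_migration_begin() is the
exception, as described above). It is single-threaded on purpose so that
one interleaving can be checked deterministically.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of the state the patch touches; not kernel code. */
struct vas_model {
	bool migration_in_progress;
	int nr_open_wins_progress;	/* opened, but not yet on the list */
	int nr_open_windows;		/* windows on the list */
};

/* t1, first locked section: check the flag, then "open window HCALL". */
int model_open_begin(struct vas_model *m)
{
	if (m->migration_in_progress)
		return -EBUSY;
	m->nr_open_wins_progress++;
	return 0;
}

/* t1, second locked section, after the modify HCALL (not modeled). */
int model_open_finish(struct vas_model *m)
{
	int rc = 0;

	if (!m->migration_in_progress)
		m->nr_open_windows++;	/* add window to the list */
	else
		rc = -EBUSY;		/* close window, never listed */
	m->nr_open_wins_progress--;
	return rc;
}

/* t2: set the flag, then close every window that is on the list. */
void model_migration_begin(struct vas_model *m)
{
	m->migration_in_progress = true;
	m->nr_open_windows = 0;
}

/* t2: one pass of the wait loop; true once no open is in progress. */
bool model_migration_may_proceed(const struct vas_model *m)
{
	return m->nr_open_wins_progress == 0;
}
```

Replaying the race from the first diagram (open begins, migration starts,
open finishes) shows the handler waiting until the in-flight open has
closed its own window and dropped the counter.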

Fixes: 37e6764895ef ("powerpc/pseries/vas: Add VAS migration handler")
Signed-off-by: Haren Myneni 

---
v1 -> v2:
- Do not define the migration_in_progress flag as atomic as
  suggested by Nathan

v2 -> v3:
- Use wait_event() instead of wait_event_interruptible() so that
  it returns only after all windows are closed, as suggested by Nathan

v3 -> v4:
- remove atomic for nr_open_wins_progress counter as suggested by
  Nathan and Michael Ellerman
- Use sleep instead of wait_event_interruptible() to check
  nr_open_wins_progress counter under mutex.

v4 -> v5:
- Update the commit message with comments in the code flow and add
  the second code flow with the fix as suggested by Michael Ellerman
---
 arch/powerpc/platforms/pseries/vas.c | 51 
 arch/powerpc/platforms/pseries/vas.h |  2 ++
 2 files changed, 46 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/vas.c 
b/arch/powerpc/platforms/pseries/vas.c
index b1f25bac280b..71d52a670d95 100644
--- a/arch/powerpc/platforms/pseries/vas.c
+++ b/arch/powerpc/platforms/pseries/vas.c
@@ -385,11 +385,15 @@ static struct vas_window *vas_allocate_window(int vas_id, 
u64 flags,
 * same fault IRQ is not freed by the OS before.
 */
mutex_lock(_pseries_mutex);
-   if (migration_in_progress)
+   if (migration_in_progress) {
rc = -EBUSY;
-   else
+   } else {
rc = allocate_setup_window(txwin, (u64 *)[0],
   cop_feat_caps->win_type);
+   if (!rc)
+   caps->nr_open_wins_progress++;
+   }
+
mutex_unlock(_pseries_mutex);
if (rc)
goto out;
@@ -404,8 +408,17 @@ static struct vas_window *vas_allocate_window(int vas_id, 
u64 flags,
goto out_free;
 
txwin->win_type = cop_feat_caps->win_type;
-   mutex_lock(_pseries_mutex);
+
/*
+*

[PATCH v4] powerpc/pseries/vas: Migration suspend waits for no in-progress open windows

2023-11-12 Thread Haren Myneni
The hypervisor returns migration failure unless all VAS windows are
closed. During the pre-migration stage, vas_migration_handler() sets
the migration_in_progress flag and closes all windows on the list.
The allocate VAS window routine checks the migration flag, sets up
the window and then adds it to the list. So the migration handler
can miss a window that is still in the process of being set up.

t1: Allocate and open VAS       t2: Migration event
    window

lock vas_pseries_mutex
If migration_in_progress set
  unlock vas_pseries_mutex
  return
open window HCALL
unlock vas_pseries_mutex
Modify window HCALL             lock vas_pseries_mutex
setup window                    migration_in_progress=true
                                Closes all windows from
                                the list
                                unlock vas_pseries_mutex
lock vas_pseries_mutex          return
if nr_closed_windows == 0
  // No DLPAR CPU or migration
  add to the list
  unlock vas_pseries_mutex
  return
unlock vas_pseries_mutex
Close VAS window
// due to DLPAR CPU or migration
return -EBUSY

This patch resolves the issue with the following steps:
- Set the migration_in_progress flag without holding the mutex.
- Introduce the nr_open_wins_progress counter in the VAS
  capabilities struct.
- This counter tracks the number of window opens that are still in
  progress.
- The allocate setup window thread closes the window if the
  migration flag is set, and decrements the nr_open_wins_progress
  counter.
- The migration handler waits until there are no in-progress open
  windows.

Fixes: 37e6764895ef ("powerpc/pseries/vas: Add VAS migration handler")
Signed-off-by: Haren Myneni 

---
v1 -> v2:
- Do not define the migration_in_progress flag as atomic as
  suggested by Nathan

v2 -> v3:
- Use wait_event() instead of wait_event_interruptible() so that
  it returns only after all windows are closed, as suggested by Nathan

v3 -> v4:
- remove atomic for nr_open_wins_progress counter as suggested by
  Nathan and Michael Ellerman
- Use sleep instead of wait_event_interruptible() to check
  nr_open_wins_progress counter under mutex.
---
 arch/powerpc/platforms/pseries/vas.c | 51 
 arch/powerpc/platforms/pseries/vas.h |  2 ++
 2 files changed, 46 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/vas.c 
b/arch/powerpc/platforms/pseries/vas.c
index b1f25bac280b..71d52a670d95 100644
--- a/arch/powerpc/platforms/pseries/vas.c
+++ b/arch/powerpc/platforms/pseries/vas.c
@@ -385,11 +385,15 @@ static struct vas_window *vas_allocate_window(int vas_id, 
u64 flags,
 * same fault IRQ is not freed by the OS before.
 */
mutex_lock(_pseries_mutex);
-   if (migration_in_progress)
+   if (migration_in_progress) {
rc = -EBUSY;
-   else
+   } else {
rc = allocate_setup_window(txwin, (u64 *)[0],
   cop_feat_caps->win_type);
+   if (!rc)
+   caps->nr_open_wins_progress++;
+   }
+
mutex_unlock(_pseries_mutex);
if (rc)
goto out;
@@ -404,8 +408,17 @@ static struct vas_window *vas_allocate_window(int vas_id, 
u64 flags,
goto out_free;
 
txwin->win_type = cop_feat_caps->win_type;
-   mutex_lock(_pseries_mutex);
+
/*
+* The migration SUSPEND thread sets migration_in_progress and
+* closes all open windows from the list. But the window is
+* added to the list after open and modify HCALLs. So possible
+* that migration_in_progress is set before modify HCALL which
+* may cause some windows are still open when the hypervisor
+* initiates the migration.
+* So checks the migration_in_progress flag again and close all
+* open windows.
+*
 * Possible to lose the acquired credit with DLPAR core
 * removal after the window is opened. So if there are any
 * closed windows (means with lost credits), do not give new
@@ -413,9 +426,11 @@ static struct vas_window *vas_allocate_window(int vas_id, 
u64 flags,
 * after the existing windows are reopened when credits are
 * available.
 */
-   if (!caps->nr_close_wins) {
+   mutex_lock(_pseries_mutex);
+   if (!caps->nr_close_wins && !migration_in_progress) {
list_add(>win_list, >list);
caps->nr_open_windows++;
+   caps->nr_open_wins_progress--;
mutex_unlock(_pseries_mutex);
vas_user_win_add_mm_context(>vas_win.task_ref);
return >vas_win;
@@ -433,6 +448,12 @@ static struct vas_window *vas_allocate_window(int vas_id, 
u64 flags,
 */
free_irq_setup(txwin);
h_deallocate_vas_window(txwin->vas_win.winid);
+   /*
+* Hold mutex and reduce nr_open_wins_progress counter.
+*/
+

[PATCH v3] powerpc/pseries/vas: Migration suspend waits for no in-progress open windows

2023-10-19 Thread Haren Myneni
The hypervisor returns migration failure unless all VAS windows are
closed. During the pre-migration stage, vas_migration_handler() sets
the migration_in_progress flag and closes all windows on the list.
The allocate VAS window routine checks the migration flag, sets up
the window and then adds it to the list. So the migration handler
can miss a window that is still in the process of being set up.

t1: Allocate and open VAS       t2: Migration event
    window

lock vas_pseries_mutex
If migration_in_progress set
  unlock vas_pseries_mutex
  return
open window HCALL
unlock vas_pseries_mutex
Modify window HCALL             lock vas_pseries_mutex
setup window                    migration_in_progress=true
                                Closes all windows from
                                the list
                                unlock vas_pseries_mutex
lock vas_pseries_mutex          return
if nr_closed_windows == 0
  // No DLPAR CPU or migration
  add to the list
  unlock vas_pseries_mutex
  return
unlock vas_pseries_mutex
Close VAS window
// due to DLPAR CPU or migration
return -EBUSY

This patch resolves the issue with the following steps:
- Set the migration_in_progress flag without holding the mutex.
- Introduce the nr_open_wins_progress counter in the VAS
  capabilities struct.
- This counter tracks the number of window opens that are still in
  progress.
- The allocate setup window thread closes the window if the
  migration flag is set, and decrements the nr_open_wins_progress
  counter.
- The migration handler waits until there are no in-progress open
  windows.

Fixes: 37e6764895ef ("powerpc/pseries/vas: Add VAS migration handler")
Signed-off-by: Haren Myneni 

---
v1 -> v2:
- Do not define the migration_in_progress flag as atomic as
  suggested by Nathan

v2 -> v3:
- Use wait_event() instead of wait_event_interruptible() so that
  it returns only after all windows are closed, as suggested by Nathan
---
 arch/powerpc/platforms/pseries/vas.c | 45 +++-
 arch/powerpc/platforms/pseries/vas.h |  2 ++
 2 files changed, 40 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/vas.c 
b/arch/powerpc/platforms/pseries/vas.c
index 88bee56ff92b..c2c6cc2b22ea 100644
--- a/arch/powerpc/platforms/pseries/vas.c
+++ b/arch/powerpc/platforms/pseries/vas.c
@@ -32,6 +32,7 @@ static struct hv_vas_cop_feat_caps hv_cop_caps;
 static struct vas_caps vascaps[VAS_MAX_FEAT_TYPE];
 static DEFINE_MUTEX(vas_pseries_mutex);
 static bool migration_in_progress;
+static DECLARE_WAIT_QUEUE_HEAD(open_win_progress_wq);
 
 static long hcall_return_busy_check(long rc)
 {
@@ -384,11 +385,15 @@ static struct vas_window *vas_allocate_window(int vas_id, 
u64 flags,
 * same fault IRQ is not freed by the OS before.
 */
mutex_lock(_pseries_mutex);
-   if (migration_in_progress)
+   if (migration_in_progress) {
rc = -EBUSY;
-   else
+   } else {
rc = allocate_setup_window(txwin, (u64 *)[0],
   cop_feat_caps->win_type);
+   if (!rc)
+   atomic_inc(>nr_open_wins_progress);
+   }
+
mutex_unlock(_pseries_mutex);
if (rc)
goto out;
@@ -403,8 +408,17 @@ static struct vas_window *vas_allocate_window(int vas_id, 
u64 flags,
goto out_free;
 
txwin->win_type = cop_feat_caps->win_type;
-   mutex_lock(_pseries_mutex);
+
/*
+* The migration SUSPEND thread sets migration_in_progress and
+* closes all open windows from the list. But the window is
+* added to the list after open and modify HCALLs. So possible
+* that migration_in_progress is set before modify HCALL which
+* may cause some windows are still open when the hypervisor
+* initiates the migration.
+* So checks the migration_in_progress flag again and close all
+* open windows.
+*
 * Possible to lose the acquired credit with DLPAR core
 * removal after the window is opened. So if there are any
 * closed windows (means with lost credits), do not give new
@@ -412,9 +426,11 @@ static struct vas_window *vas_allocate_window(int vas_id, 
u64 flags,
 * after the existing windows are reopened when credits are
 * available.
 */
-   if (!caps->nr_close_wins) {
+   mutex_lock(_pseries_mutex);
+   if (!caps->nr_close_wins && !migration_in_progress) {
list_add(>win_list, >list);
caps->nr_open_windows++;
+   atomic_dec(>nr_open_wins_progress);
mutex_unlock(_pseries_mutex);
vas_user_win_add_mm_context(>vas_win.task_ref);
return >vas_win;
@@ -432,6 +448,8 @@ static struct vas_window *vas_allocate_window(int vas_id, 
u64 flags,
 */
free_irq_setup(txwin);
h_deallocate_vas_window(txwin->vas_win.win

[PATCH v2] powerpc/vas: Limit open window failure messages in log buffer

2023-10-19 Thread Haren Myneni
The VAS open window call prints an error message and returns -EBUSY
from the time the migration suspend event is initiated until the
resume event completes on the destination system. This can fill the
log buffer with these error messages if user space issues open
window calls continuously. The same applies to a DLPAR CPU remove
event: no credits are available until credits are freed or another
DLPAR CPU add event occurs.

So change the code to use pr_err_ratelimited() instead of pr_err()
to display the open window failure and the not-available credits
error messages.
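
As a rough illustration of what rate limiting buys here, the following
userspace sketch models an interval/burst limiter of the kind that
pr_err_ratelimited() relies on. The struct and function names are
hypothetical, time is passed in as a plain tick count, and the interval
of 5 and burst of 10 used in the note below are chosen to mirror what I
understand to be the kernel defaults (5 seconds, 10 messages). It is not
the kernel implementation.

```c
#include <stdbool.h>

/* Hypothetical userspace model of an interval/burst rate limiter. */
struct ratelimit_model {
	long interval;	/* window length, in caller-defined ticks */
	int burst;	/* messages allowed per window */
	long begin;	/* start of the current window */
	int printed;	/* messages emitted in the current window */
	int missed;	/* messages suppressed so far */
};

/* Returns true if a message may be emitted at time 'now'. */
bool ratelimit_model_ok(struct ratelimit_model *rs, long now)
{
	/* Start a new window once the interval has elapsed. */
	if (now - rs->begin >= rs->interval) {
		rs->begin = now;
		rs->printed = 0;
	}

	if (rs->printed < rs->burst) {
		rs->printed++;
		return true;
	}

	rs->missed++;
	return false;
}
```

With interval 5 and burst 10, a loop of 100 failed open calls within one
window emits 10 messages and suppresses 90, which is the effect the patch
is after for repeated -EBUSY failures.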

Use pr_fmt() and make the corresponding changes so that all pr_*()
messages in vas-api.c have a consistent prefix.
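
The prefix mechanism works by string-literal concatenation at compile
time. A minimal userspace sketch, assuming a hypothetical model_pr_err()
macro that formats into a buffer instead of logging (the kernel's pr_err()
is not available in userspace), and a hypothetical helper
format_open_failure():

```c
#include <stdio.h>

/* Defined before the logging macros are used, as in the patch. */
#define pr_fmt(fmt) "VAS-API: " fmt

/*
 * Hypothetical stand-in for pr_err(): pr_fmt() is applied to the format
 * string, so adjacent string literals are merged by the compiler.
 * ##__VA_ARGS__ is a GNU extension accepted by gcc and clang.
 */
#define model_pr_err(buf, size, fmt, ...) \
	snprintf(buf, size, pr_fmt(fmt), ##__VA_ARGS__)

/* Formats the open-failure message from the patch into 'buf'. */
int format_open_failure(char *buf, size_t size, long rc)
{
	return model_pr_err(buf, size, "VAS window open failed rc=%ld\n", rc);
}
```

Because the prefix comes from pr_fmt(), individual call sites no longer
need to pass "%s()" and __func__ to identify where a message came from.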

Fixes: 37e6764895ef ("powerpc/pseries/vas: Add VAS migration handler")
Signed-off-by: Haren Myneni 
---
 arch/powerpc/platforms/book3s/vas-api.c | 34 -
 arch/powerpc/platforms/pseries/vas.c|  4 +--
 2 files changed, 18 insertions(+), 20 deletions(-)

diff --git a/arch/powerpc/platforms/book3s/vas-api.c 
b/arch/powerpc/platforms/book3s/vas-api.c
index 77ea9335fd04..52dabbe52da1 100644
--- a/arch/powerpc/platforms/book3s/vas-api.c
+++ b/arch/powerpc/platforms/book3s/vas-api.c
@@ -4,6 +4,8 @@
  * Copyright (C) 2019 Haren Myneni, IBM Corp
  */
 
+#define pr_fmt(fmt) "VAS-API: " fmt
+
 #include 
 #include 
 #include 
@@ -78,7 +80,7 @@ int get_vas_user_win_ref(struct vas_user_win_ref *task_ref)
task_ref->mm = get_task_mm(current);
if (!task_ref->mm) {
put_pid(task_ref->pid);
-   pr_err("VAS: pid(%d): mm_struct is not found\n",
+   pr_err("pid(%d): mm_struct is not found\n",
current->pid);
return -EPERM;
}
@@ -235,8 +237,7 @@ void vas_update_csb(struct coprocessor_request_block *crb,
rc = kill_pid_info(SIGSEGV, , pid);
rcu_read_unlock();
 
-   pr_devel("%s(): pid %d kill_proc_info() rc %d\n", __func__,
-   pid_vnr(pid), rc);
+   pr_devel("pid %d kill_proc_info() rc %d\n", pid_vnr(pid), rc);
 }
 
 void vas_dump_crb(struct coprocessor_request_block *crb)
@@ -294,7 +295,7 @@ static int coproc_ioc_tx_win_open(struct file *fp, unsigned 
long arg)
 
rc = copy_from_user(, uptr, sizeof(uattr));
if (rc) {
-   pr_err("%s(): copy_from_user() returns %d\n", __func__, rc);
+   pr_err("copy_from_user() returns %d\n", rc);
return -EFAULT;
}
 
@@ -311,7 +312,7 @@ static int coproc_ioc_tx_win_open(struct file *fp, unsigned 
long arg)
txwin = cp_inst->coproc->vops->open_win(uattr.vas_id, uattr.flags,
cp_inst->coproc->cop_type);
if (IS_ERR(txwin)) {
-   pr_err("%s() VAS window open failed, %ld\n", __func__,
+   pr_err_ratelimited("VAS window open failed rc=%ld\n",
PTR_ERR(txwin));
return PTR_ERR(txwin);
}
@@ -405,8 +406,7 @@ static vm_fault_t vas_mmap_fault(struct vm_fault *vmf)
 * window is not opened. Shouldn't expect this error.
 */
if (!cp_inst || !cp_inst->txwin) {
-   pr_err("%s(): Unexpected fault on paste address with TX window closed\n",
-   __func__);
+   pr_err("Unexpected fault on paste address with TX window closed\n");
return VM_FAULT_SIGBUS;
}
 
@@ -421,8 +421,7 @@ static vm_fault_t vas_mmap_fault(struct vm_fault *vmf)
 * issue NX request.
 */
if (txwin->task_ref.vma != vmf->vma) {
-   pr_err("%s(): No previous mapping with paste address\n",
-   __func__);
+   pr_err("No previous mapping with paste address\n");
return VM_FAULT_SIGBUS;
}
 
@@ -481,19 +480,19 @@ static int coproc_mmap(struct file *fp, struct 
vm_area_struct *vma)
txwin = cp_inst->txwin;
 
if ((vma->vm_end - vma->vm_start) > PAGE_SIZE) {
-   pr_debug("%s(): size 0x%zx, PAGE_SIZE 0x%zx\n", __func__,
+   pr_debug("size 0x%zx, PAGE_SIZE 0x%zx\n",
(vma->vm_end - vma->vm_start), PAGE_SIZE);
return -EINVAL;
}
 
/* Ensure instance has an open send window */
if (!txwin) {
-   pr_err("%s(): No send window open?\n", __func__);
+   pr_err("No send window open?\n");
return -EINVAL;
}
 
if (!cp_inst->coproc->vops || !cp_inst->coproc->vops->paste_addr) {
-   pr_err("%s(): VAS API is not registered\n", __func__);
+   pr_err("VAS API is not registered\n"

Re: [PATCH v2] powerpc/pseries/vas: Migration suspend waits for no in-progress open windows

2023-10-19 Thread Haren Myneni




On 10/18/23 2:10 PM, Nathan Lynch wrote:

Haren Myneni  writes:

The hypervisor returns migration failure unless all VAS windows are
closed. During the pre-migration stage, vas_migration_handler() sets
the migration_in_progress flag and closes all windows on the list.
The allocate VAS window routine checks the migration flag, sets up
the window and then adds it to the list. So the migration handler
can miss a window that is still in the process of being set up.

t1: Allocate and open VAS       t2: Migration event
    window

lock vas_pseries_mutex
If migration_in_progress set
   unlock vas_pseries_mutex
   return
open window HCALL
unlock vas_pseries_mutex
Modify window HCALL             lock vas_pseries_mutex
setup window                    migration_in_progress=true
                                Closes all windows from
                                the list
                                unlock vas_pseries_mutex
lock vas_pseries_mutex          return
if nr_closed_windows == 0
   // No DLPAR CPU or migration
   add to the list
   unlock vas_pseries_mutex
   return
unlock vas_pseries_mutex
Close VAS window
// due to DLPAR CPU or migration
return -EBUSY

This patch resolves the issue with the following steps:
- Define migration_in_progress as atomic so that the migration
   handler sets this flag without holding mutex.


This part of the commit message is no longer accurate...


Correct. My mistake




- Introduce the nr_open_wins_progress counter in the VAS
   capabilities struct.
- This counter tracks the number of window opens that are still in
   progress.
- The allocate setup window thread closes the window if the
   migration flag is set, and decrements the nr_open_wins_progress
   counter.
- The migration handler waits until there are no in-progress open
   windows.

Fixes: 37e6764895ef ("powerpc/pseries/vas: Add VAS migration handler")
Signed-off-by: Haren Myneni 

---
Changes from v1:
- Do not define the migration_in_progress flag as atomic as
   suggested by Nathan
---
  arch/powerpc/platforms/pseries/vas.c | 45 +++-
  arch/powerpc/platforms/pseries/vas.h |  2 ++
  2 files changed, 40 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/vas.c 
b/arch/powerpc/platforms/pseries/vas.c
index 15d958e38eca..b86f0db08e98 100644
--- a/arch/powerpc/platforms/pseries/vas.c
+++ b/arch/powerpc/platforms/pseries/vas.c
@@ -32,6 +32,7 @@ static struct hv_vas_cop_feat_caps hv_cop_caps;
  static struct vas_caps vascaps[VAS_MAX_FEAT_TYPE];
  static DEFINE_MUTEX(vas_pseries_mutex);
  static bool migration_in_progress;
+static DECLARE_WAIT_QUEUE_HEAD(open_win_progress_wq);
  
  static long hcall_return_busy_check(long rc)

  {
@@ -384,11 +385,15 @@ static struct vas_window *vas_allocate_window(int vas_id, 
u64 flags,
 * same fault IRQ is not freed by the OS before.
 */
mutex_lock(_pseries_mutex);
-   if (migration_in_progress)
+   if (migration_in_progress) {
rc = -EBUSY;
-   else
+   } else {
rc = allocate_setup_window(txwin, (u64 *)[0],
   cop_feat_caps->win_type);
+   if (!rc)
+   atomic_inc(>nr_open_wins_progress);
+   }
+
mutex_unlock(_pseries_mutex);
if (rc)
goto out;
@@ -403,8 +408,17 @@ static struct vas_window *vas_allocate_window(int vas_id, 
u64 flags,
goto out_free;
  
  	txwin->win_type = cop_feat_caps->win_type;

-   mutex_lock(_pseries_mutex);
+
/*
+* The migration SUSPEND thread sets migration_in_progress and
+* closes all open windows from the list. But the window is
+* added to the list after open and modify HCALLs. So possible
+* that migration_in_progress is set before modify HCALL which
+* may cause some windows are still open when the hypervisor
+* initiates the migration.
+* So checks the migration_in_progress flag again and close all
+* open windows.
+*
 * Possible to lose the acquired credit with DLPAR core
 * removal after the window is opened. So if there are any
 * closed windows (means with lost credits), do not give new
@@ -412,9 +426,11 @@ static struct vas_window *vas_allocate_window(int vas_id, 
u64 flags,
 * after the existing windows are reopened when credits are
 * available.
 */
-   if (!caps->nr_close_wins) {
+   mutex_lock(_pseries_mutex);
+   if (!caps->nr_close_wins && !migration_in_progress) {
list_add(>win_list, >list);
caps->nr_open_windows++;
+   atomic_dec(>nr_open_wins_progress);


Should there not be a test and wakeup here

if (atomic_dec_return(>nr_open_wins_progress) == 0)
wake_up(_win_progress_wq);


We do not need this. This section will be running only when the 
migration_in_progress is not set. So the migration threa

[PATCH] powerpc/vas: Limit open window failure messages in log buffer

2023-10-17 Thread Haren Myneni
The VAS open window call prints an error message and returns -EBUSY
from the time the migration suspend event is initiated until the
resume event completes on the destination system. This can fill the
log buffer with these error messages if user space issues open
window calls continuously. The same applies to a DLPAR CPU remove
event: no credits are available until credits are freed or another
DLPAR CPU add event occurs.

So change the code to use pr_err_ratelimited() instead of pr_err()
to display the open window failure and the not-available credits
error messages.

Signed-off-by: Haren Myneni 
---
 arch/powerpc/platforms/book3s/vas-api.c | 4 ++--
 arch/powerpc/platforms/pseries/vas.c| 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/platforms/book3s/vas-api.c 
b/arch/powerpc/platforms/book3s/vas-api.c
index 77ea9335fd04..203cfc2fb8ff 100644
--- a/arch/powerpc/platforms/book3s/vas-api.c
+++ b/arch/powerpc/platforms/book3s/vas-api.c
@@ -311,8 +311,8 @@ static int coproc_ioc_tx_win_open(struct file *fp, unsigned 
long arg)
txwin = cp_inst->coproc->vops->open_win(uattr.vas_id, uattr.flags,
cp_inst->coproc->cop_type);
if (IS_ERR(txwin)) {
-   pr_err("%s() VAS window open failed, %ld\n", __func__,
-   PTR_ERR(txwin));
+   pr_err_ratelimited("%s() VAS window open failed, %ld\n",
+   __func__, PTR_ERR(txwin));
return PTR_ERR(txwin);
}
 
diff --git a/arch/powerpc/platforms/pseries/vas.c 
b/arch/powerpc/platforms/pseries/vas.c
index b86f0db08e98..7259e6676503 100644
--- a/arch/powerpc/platforms/pseries/vas.c
+++ b/arch/powerpc/platforms/pseries/vas.c
@@ -341,7 +341,7 @@ static struct vas_window *vas_allocate_window(int vas_id, 
u64 flags,
 
if (atomic_inc_return(_feat_caps->nr_used_credits) >
atomic_read(_feat_caps->nr_total_credits)) {
-   pr_err("Credits are not available to allocate window\n");
+   pr_err_ratelimited("Credits are not available to allocate window\n");
rc = -EINVAL;
goto out;
}
@@ -439,7 +439,7 @@ static struct vas_window *vas_allocate_window(int vas_id, 
u64 flags,
 
put_vas_user_win_ref(>vas_win.task_ref);
rc = -EBUSY;
-   pr_err("No credit is available to allocate window\n");
+   pr_err_ratelimited("No credit is available to allocate window\n");
 
 out_free:
/*
-- 
2.26.3



[PATCH v2] powerpc/pseries/vas: Migration suspend waits for no in-progress open windows

2023-10-17 Thread Haren Myneni
The hypervisor returns migration failure unless all VAS windows are
closed. During the pre-migration stage, vas_migration_handler() sets
the migration_in_progress flag and closes all windows on the list.
The allocate VAS window routine checks the migration flag, sets up
the window and then adds it to the list. So the migration handler
can miss a window that is still in the process of being set up.

t1: Allocate and open VAS       t2: Migration event
    window

lock vas_pseries_mutex
If migration_in_progress set
  unlock vas_pseries_mutex
  return
open window HCALL
unlock vas_pseries_mutex
Modify window HCALL             lock vas_pseries_mutex
setup window                    migration_in_progress=true
                                Closes all windows from
                                the list
                                unlock vas_pseries_mutex
lock vas_pseries_mutex          return
if nr_closed_windows == 0
  // No DLPAR CPU or migration
  add to the list
  unlock vas_pseries_mutex
  return
unlock vas_pseries_mutex
Close VAS window
// due to DLPAR CPU or migration
return -EBUSY

This patch resolves the issue with the following steps:
- Define migration_in_progress as atomic so that the migration
  handler sets this flag without holding the mutex.
- Introduce the nr_open_wins_progress counter in the VAS
  capabilities struct.
- This counter tracks the number of window opens that are still in
  progress.
- The allocate setup window thread closes the window if the
  migration flag is set, and decrements the nr_open_wins_progress
  counter.
- The migration handler waits until there are no in-progress open
  windows.

Fixes: 37e6764895ef ("powerpc/pseries/vas: Add VAS migration handler")
Signed-off-by: Haren Myneni 

---
Changes from v1:
- Do not define the migration_in_progress flag as atomic as
  suggested by Nathan
---
 arch/powerpc/platforms/pseries/vas.c | 45 +++-
 arch/powerpc/platforms/pseries/vas.h |  2 ++
 2 files changed, 40 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/vas.c 
b/arch/powerpc/platforms/pseries/vas.c
index 15d958e38eca..b86f0db08e98 100644
--- a/arch/powerpc/platforms/pseries/vas.c
+++ b/arch/powerpc/platforms/pseries/vas.c
@@ -32,6 +32,7 @@ static struct hv_vas_cop_feat_caps hv_cop_caps;
 static struct vas_caps vascaps[VAS_MAX_FEAT_TYPE];
 static DEFINE_MUTEX(vas_pseries_mutex);
 static bool migration_in_progress;
+static DECLARE_WAIT_QUEUE_HEAD(open_win_progress_wq);
 
 static long hcall_return_busy_check(long rc)
 {
@@ -384,11 +385,15 @@ static struct vas_window *vas_allocate_window(int vas_id, 
u64 flags,
 * same fault IRQ is not freed by the OS before.
 */
mutex_lock(_pseries_mutex);
-   if (migration_in_progress)
+   if (migration_in_progress) {
rc = -EBUSY;
-   else
+   } else {
rc = allocate_setup_window(txwin, (u64 *)[0],
   cop_feat_caps->win_type);
+   if (!rc)
+   atomic_inc(>nr_open_wins_progress);
+   }
+
mutex_unlock(_pseries_mutex);
if (rc)
goto out;
@@ -403,8 +408,17 @@ static struct vas_window *vas_allocate_window(int vas_id, 
u64 flags,
goto out_free;
 
txwin->win_type = cop_feat_caps->win_type;
-   mutex_lock(_pseries_mutex);
+
/*
+* The migration SUSPEND thread sets migration_in_progress and
+* closes all open windows from the list. But the window is
+* added to the list after open and modify HCALLs. So possible
+* that migration_in_progress is set before modify HCALL which
+* may cause some windows are still open when the hypervisor
+* initiates the migration.
+* So checks the migration_in_progress flag again and close all
+* open windows.
+*
 * Possible to lose the acquired credit with DLPAR core
 * removal after the window is opened. So if there are any
 * closed windows (means with lost credits), do not give new
@@ -412,9 +426,11 @@ static struct vas_window *vas_allocate_window(int vas_id, 
u64 flags,
 * after the existing windows are reopened when credits are
 * available.
 */
-   if (!caps->nr_close_wins) {
+   mutex_lock(_pseries_mutex);
+   if (!caps->nr_close_wins && !migration_in_progress) {
list_add(>win_list, >list);
caps->nr_open_windows++;
+   atomic_dec(>nr_open_wins_progress);
mutex_unlock(_pseries_mutex);
vas_user_win_add_mm_context(>vas_win.task_ref);
return >vas_win;
@@ -432,6 +448,8 @@ static struct vas_window *vas_allocate_window(int vas_id, 
u64 flags,
 */
free_irq_setup(txwin);
h_deallocate_vas_window(txwin->vas_win.winid);
+   atomic_dec(>nr_open_wins_progress);
+   wake_up(_win_progress_wq);
 out:
  

Re: [PATCH] powerpc/pseries/vas: Migration suspend waits for no in-progress open windows

2023-10-16 Thread Haren Myneni




On 10/16/23 1:30 PM, Nathan Lynch wrote:

Nathan Lynch  writes:

Haren Myneni  writes:

Haren Myneni  writes:

The hypervisor returns migration failure unless all VAS windows are
closed. During the pre-migration stage, vas_migration_handler() sets
the migration_in_progress flag and closes all windows on the list.
The allocate VAS window routine checks the migration flag, sets up
the window and then adds it to the list. So the migration handler
can miss a window that is still in the process of being set up.

t1: Allocate and open VAS       t2: Migration event
    window

lock vas_pseries_mutex
If migration_in_progress set
  unlock vas_pseries_mutex
  return
open window HCALL
unlock vas_pseries_mutex
Modify window HCALL             lock vas_pseries_mutex
setup window                    migration_in_progress=true
                                Closes all windows from
                                the list
                                unlock vas_pseries_mutex
lock vas_pseries_mutex          return
if nr_closed_windows == 0
  // No DLPAR CPU or migration
  add to the list
  unlock vas_pseries_mutex
  return
unlock vas_pseries_mutex
Close VAS window
// due to DLPAR CPU or migration
return -EBUSY


Could the path t1 takes simply hold the mutex for the duration of
its execution instead of dropping and reacquiring it in the middle?

Here's the relevant code from vas_allocate_window():

mutex_lock(_pseries_mutex);
if (migration_in_progress)
rc = -EBUSY;
else
rc = allocate_setup_window(txwin, (u64 *)[0],
   cop_feat_caps->win_type);
mutex_unlock(_pseries_mutex);
if (rc)
goto out;

rc = h_modify_vas_window(txwin);
if (!rc)
rc = get_vas_user_win_ref(>vas_win.task_ref);
if (rc)
goto out_free;

txwin->win_type = cop_feat_caps->win_type;
mutex_lock(_pseries_mutex);
if (!caps->nr_close_wins) {
list_add(>win_list, >list);
caps->nr_open_windows++;
mutex_unlock(_pseries_mutex);
vas_user_win_add_mm_context(>vas_win.task_ref);
return >vas_win;
}
mutex_unlock(_pseries_mutex);

Is there something about h_modify_vas_window() or get_vas_user_win_ref()
that requires temporarily dropping the lock?



Thanks Nathan for your comments.

vas_pseries_mutex protects window ID and IRQ allocation between the
alloc and free window HCALLs, and the window list. We generally try
not to hold a mutex across HCALLs, but we need this mutex with these
HCALLs.

We could call h_modify_vas_window() or get_vas_user_win_ref() within the
mutex context, but it is not needed.


Hmm. I contend that it would fix your bug in a simpler way that
eliminates the race instead of coping with it by adding more state and
complicating the locking model. With your change, readers of the
migration_in_progress flag check it under the mutex, but the writer
updates it outside of the mutex, which seems strange and unlikely to be
correct.
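
For comparison, the simpler alternative argued for here (hold the mutex
across the whole open path, and update the flag under the same mutex) can
be sketched in userspace with pthreads. The struct and function names are
hypothetical, the three HCALL steps are collapsed into one counter
increment, and the sketch says nothing about whether holding a mutex
across those HCALLs is acceptable in the real driver.

```c
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical model: one lock covers the flag and the window list. */
struct vas_locked_model {
	pthread_mutex_t lock;
	bool migration_in_progress;
	int nr_open_windows;		/* windows on the list */
};

/* Open path: check, allocate, modify and list in one critical section. */
int model_open_window_locked(struct vas_locked_model *m)
{
	int rc = 0;

	pthread_mutex_lock(&m->lock);
	if (m->migration_in_progress)
		rc = -EBUSY;
	else
		m->nr_open_windows++;	/* stands for open, modify, list_add */
	pthread_mutex_unlock(&m->lock);
	return rc;
}

/* Migration path: writer and readers of the flag use the same lock. */
void model_migration_suspend_locked(struct vas_locked_model *m)
{
	pthread_mutex_lock(&m->lock);
	m->migration_in_progress = true;
	m->nr_open_windows = 0;		/* close every listed window */
	pthread_mutex_unlock(&m->lock);
}
```

In this shape a window is either fully listed before the migration
handler takes the lock, or its open fails with -EBUSY afterwards, so no
extra in-progress counter is needed.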


Expanding on this, with your change, migration_in_progress becomes a
boolean atomic_t flag accessed only with atomic_set() and
atomic_read(). These are non-RMW operations. Documentation/atomic_t.txt
says:

   Non-RMW ops:

   The non-RMW ops are (typically) regular LOADs and STOREs and are
   canonically implemented using READ_ONCE(), WRITE_ONCE(),
   smp_load_acquire() and smp_store_release() respectively. Therefore, if
   you find yourself only using the Non-RMW operations of atomic_t, you
   do not in fact need atomic_t at all and are doing it wrong.

So making migration_in_progress an atomic_t does not confer any
advantageous properties to it that it lacks as a plain boolean.
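
The distinction can be illustrated with C11 atomics in userspace. This is
an analogy only, with hypothetical function names: relaxed store and load
stand in for atomic_set() and atomic_read(), and atomic_fetch_sub() stands
in for a value-returning RMW operation such as atomic_dec_return().

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Non-RMW: a store with no ordering guarantees, like a plain STORE. */
void model_flag_set(atomic_int *flag)
{
	atomic_store_explicit(flag, 1, memory_order_relaxed);
}

/* Non-RMW: a load with no ordering guarantees, like a plain LOAD. */
int model_flag_read(atomic_int *flag)
{
	return atomic_load_explicit(flag, memory_order_relaxed);
}

/*
 * RMW with a return value: exactly one caller observes the transition
 * to zero. atomic_fetch_sub() returns the value before the decrement.
 */
bool model_dec_and_test(atomic_int *counter)
{
	return atomic_fetch_sub(counter, 1) == 1;
}
```

The first two helpers add nothing over a plain variable as far as
ordering is concerned, which is the point of the quoted documentation;
only the third gives a caller something it could not get from a plain
load and store.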

Considering also (from the same document):

  - non-RMW operations are unordered;

  - RMW operations that have no return value are unordered;

I am concerned that threads executing these segments of code will not
always observe each others' effects in the intended order:

// vas_allocate_window()

 mutex_lock(_pseries_mutex);
 if (!caps->nr_close_wins && !atomic_read(_in_progress)) {
 list_add(>win_list, >list);
 caps->nr_open_windows++;
 atomic_dec(>nr_open_wins_progress);
 mutex_unlock(_pseries_mutex);
 vas_user_win_add_mm_context(>vas_win.task_ref);
 return >vas_win;
 }
 mutex_unlock(_pseries_mutex);
 ...
 atomic_dec(>nr_open_wins_progress);
 wake_up(_win_progress_wq);

// vas_migration_handler()

 atomic_set(_in_progress, 1);
 ...
 mutex_lock(_pseries_mutex);
 rc = reconfig_close_windows(vcaps, 
vcaps->nr_open_windows,

Re: [PATCH] powerpc/pseries/vas: Migration suspend waits for no in-progress open windows

2023-10-13 Thread Haren Myneni




On 10/11/23 1:36 PM, Nathan Lynch wrote:

Haren Myneni  writes:

Haren Myneni  writes:

The hypervisor returns migration failure unless all VAS windows are
closed. During the pre-migration stage, vas_migration_handler() sets
the migration_in_progress flag and closes all windows on the list.
The allocate VAS window routine checks the migration flag, sets up
the window and then adds it to the list. So the migration handler
can miss a window that is still in the process of being set up.

t1: Allocate and open VAS       t2: Migration event
    window

lock vas_pseries_mutex
If migration_in_progress set
  unlock vas_pseries_mutex
  return
open window HCALL
unlock vas_pseries_mutex
Modify window HCALL             lock vas_pseries_mutex
setup window                    migration_in_progress=true
                                Closes all windows from
                                the list
                                unlock vas_pseries_mutex
lock vas_pseries_mutex          return
if nr_closed_windows == 0
  // No DLPAR CPU or migration
  add to the list
  unlock vas_pseries_mutex
  return
unlock vas_pseries_mutex
Close VAS window
// due to DLPAR CPU or migration
return -EBUSY


Could the path t1 takes simply hold the mutex for the duration of
its execution instead of dropping and reacquiring it in the middle?

Here's the relevant code from vas_allocate_window():

	mutex_lock(&vas_pseries_mutex);
	if (migration_in_progress)
		rc = -EBUSY;
	else
		rc = allocate_setup_window(txwin, (u64 *)&domain[0],
					   cop_feat_caps->win_type);
	mutex_unlock(&vas_pseries_mutex);
	if (rc)
		goto out;

	rc = h_modify_vas_window(txwin);
	if (!rc)
		rc = get_vas_user_win_ref(&txwin->vas_win.task_ref);
	if (rc)
		goto out_free;

	txwin->win_type = cop_feat_caps->win_type;
	mutex_lock(&vas_pseries_mutex);
	if (!caps->nr_close_wins) {
		list_add(&txwin->win_list, &caps->list);
		caps->nr_open_windows++;
		mutex_unlock(&vas_pseries_mutex);
		vas_user_win_add_mm_context(&txwin->vas_win.task_ref);
		return &txwin->vas_win;
	}
	mutex_unlock(&vas_pseries_mutex);

Is there something about h_modify_vas_window() or get_vas_user_win_ref()
that requires temporarily dropping the lock?



Thanks Nathan for your comments.

vas_pseries_mutex protects window ID and IRQ allocation between the
alloc and free window HCALLs, and also the window list. We generally
try not to hold a mutex across HCALLs, but these HCALLs need it.

We could call h_modify_vas_window() and get_vas_user_win_ref() within
the mutex context, but that is not needed.


Hmm. I contend that it would fix your bug in a simpler way that
eliminates the race instead of coping with it by adding more state and
complicating the locking model. With your change, readers of the
migration_in_progress flag check it under the mutex, but the writer
updates it outside of the mutex, which seems strange and unlikely to be
correct.
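
As a rough illustration of the simpler model suggested here (one mutex
guarding both the flag and the window list), the following user-space
sketch uses a pthread mutex in place of vas_pseries_mutex. All lk_*
names are hypothetical stand-ins, not kernel code, and the HCALLs are
reduced to a counter update.

```c
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-ins for vas_pseries_mutex and the state it guards. */
static pthread_mutex_t lk_mutex = PTHREAD_MUTEX_INITIALIZER;
static bool lk_migration_in_progress;	/* written and read under lk_mutex */
static int lk_nr_open_windows;		/* written and read under lk_mutex */

/* Open path: check the flag and publish the window in one critical section. */
int lk_open_window(void)
{
	int rc = 0;

	pthread_mutex_lock(&lk_mutex);
	if (lk_migration_in_progress)
		rc = -EBUSY;
	else
		lk_nr_open_windows++;	/* stands in for open, modify, list_add */
	pthread_mutex_unlock(&lk_mutex);
	return rc;
}

/* Migration path: set the flag and close every listed window under the lock. */
int lk_migration_suspend(void)
{
	int closed;

	pthread_mutex_lock(&lk_mutex);
	lk_migration_in_progress = true;
	closed = lk_nr_open_windows;
	lk_nr_open_windows = 0;
	pthread_mutex_unlock(&lk_mutex);
	return closed;
}
```

With both paths serialized by the same lock, an open either completes
before the migration path closes the list, or observes the flag and
returns -EBUSY. The cost is the longer hold time discussed in this
thread.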


The migration thread is the only writer of the migration_in_progress
flag. This patch moves the setting of the flag outside of the mutex.
The window open path is the only reader of the flag, and it reads it
within the mutex.


Reason for moving the setting outside of the mutex:

Suppose many threads have called open window and are waiting on the
mutex, and the migration thread starts later. In this case the
migration thread has to wait on the mutex until all of those threads
complete their open window HCALLs in the hypervisor, and then it has
to close all of these windows immediately. If the flag is set outside
of the mutex, the later open window threads can exit from this
function quickly without opening windows.


Reason for keeping the migration_in_progress check inside the mutex
section with the above change (setting outside of the mutex):


If a reader thread checks this flag before taking the mutex and then
waits on the mutex, it ends up opening a window that the migration
thread will close anyway. Also, the later open threads can return
-EBUSY quickly.






Also, the free HCALL can take longer depending
on pending NX requests, since the hypervisor waits for all requests to
complete before closing the window.

Applications can issue window open / free calls at the same time,
which can cause mutex contention, especially on large systems where
100's of credits are available. Another example: the migration event
can wait longer (or time out) to get this mutex if many threads issue
open/free window calls. Hence h_modify_vas_window() (modify HCALL) and
get_vas_user_win_ref() are called outside of the mutex.


OK. I believe you're referring to this code, which can run under the lock:

static long hcall_return_busy_check(long rc)
{
/* Check if we are stalled for some time */
if (H_IS_LONG_BUSY(rc)) {
  

Re: [PATCH] powerpc/pseries/vas: Migration suspend waits for no in-progress open windows

2023-10-10 Thread Haren Myneni




On 10/9/23 1:09 PM, Nathan Lynch wrote:

Hi Haren,

Haren Myneni  writes:

The hypervisor returns migration failure if all VAS windows are not
closed. During pre-migration stage, vas_migration_handler() sets
migration_in_progress flag and closes all windows from the list.
The allocate VAS window routine checks the migration flag, setup
the window and then add it to the list. So there is possibility of
the migration handler missing the window that is still in the
process of setup.

t1: Allocate and open VAS       t2: Migration event
    window

lock vas_pseries_mutex
If migration_in_progress set
  unlock vas_pseries_mutex
  return
open window HCALL
unlock vas_pseries_mutex
Modify window HCALL             lock vas_pseries_mutex
setup window                    migration_in_progress=true
                                Closes all windows from
                                the list
                                unlock vas_pseries_mutex
lock vas_pseries_mutex          return
if nr_closed_windows == 0
  // No DLPAR CPU or migration
  add to the list
  unlock vas_pseries_mutex
  return
unlock vas_pseries_mutex
Close VAS window
// due to DLPAR CPU or migration
return -EBUSY


Could the path t1 takes simply hold the mutex for the duration of
its execution instead of dropping and reacquiring it in the middle?

Here's the relevant code from vas_allocate_window():

	mutex_lock(&vas_pseries_mutex);
	if (migration_in_progress)
		rc = -EBUSY;
	else
		rc = allocate_setup_window(txwin, (u64 *)&domain[0],
					   cop_feat_caps->win_type);
	mutex_unlock(&vas_pseries_mutex);
	if (rc)
		goto out;

	rc = h_modify_vas_window(txwin);
	if (!rc)
		rc = get_vas_user_win_ref(&txwin->vas_win.task_ref);
	if (rc)
		goto out_free;

	txwin->win_type = cop_feat_caps->win_type;
	mutex_lock(&vas_pseries_mutex);
	if (!caps->nr_close_wins) {
		list_add(&txwin->win_list, &caps->list);
		caps->nr_open_windows++;
		mutex_unlock(&vas_pseries_mutex);
		vas_user_win_add_mm_context(&txwin->vas_win.task_ref);
		return &txwin->vas_win;
	}
	mutex_unlock(&vas_pseries_mutex);

Is there something about h_modify_vas_window() or get_vas_user_win_ref()
that requires temporarily dropping the lock?



Thanks Nathan for your comments.

vas_pseries_mutex protects window ID and IRQ allocation between the
alloc and free window HCALLs, and also the window list. We generally
try not to hold a mutex across HCALLs, but these HCALLs need it.


We could call h_modify_vas_window() and get_vas_user_win_ref() within
the mutex context, but that is not needed. Also, the free HCALL can
take longer depending on pending NX requests, since the hypervisor
waits for all requests to complete before closing the window.


Applications can issue window open / free calls at the same time,
which can cause mutex contention, especially on large systems where
100's of credits are available. Another example: the migration event
can wait longer (or time out) to get this mutex if many threads issue
open/free window calls. Hence h_modify_vas_window() (modify HCALL) and
get_vas_user_win_ref() are called outside of the mutex.


Thanks
Haren












[PATCH] powerpc/pseries/vas: Migration suspend waits for no in-progress open windows

2023-09-26 Thread Haren Myneni
The hypervisor returns migration failure if all VAS windows are not
closed. During pre-migration stage, vas_migration_handler() sets
migration_in_progress flag and closes all windows from the list.
The allocate VAS window routine checks the migration flag, setup
the window and then add it to the list. So there is possibility of
the migration handler missing the window that is still in the
process of setup.

t1: Allocate and open VAS       t2: Migration event
    window

lock vas_pseries_mutex
If migration_in_progress set
  unlock vas_pseries_mutex
  return
open window HCALL
unlock vas_pseries_mutex
Modify window HCALL             lock vas_pseries_mutex
setup window                    migration_in_progress=true
                                Closes all windows from
                                the list
                                unlock vas_pseries_mutex
lock vas_pseries_mutex          return
if nr_closed_windows == 0
  // No DLPAR CPU or migration
  add to the list
  unlock vas_pseries_mutex
  return
unlock vas_pseries_mutex
Close VAS window
// due to DLPAR CPU or migration
return -EBUSY

This patch resolves the issue with the following steps:
- Define migration_in_progress as atomic so that the migration
  handler sets this flag without holding the mutex.
- Introduce the nr_open_wins_progress counter in the VAS capabilities
  struct.
- This counter tracks the number of window opens that are still in
  progress.
- The allocate setup window thread closes the window if the migration
  flag is set and decrements the nr_open_wins_progress counter.
- The migration handler waits until there are no in-progress open
  windows.

Fixes: 37e6764895ef ("powerpc/pseries/vas: Add VAS migration handler")
Signed-off-by: Haren Myneni 
---
 arch/powerpc/platforms/pseries/vas.c | 51 ++--
 arch/powerpc/platforms/pseries/vas.h |  2 ++
 2 files changed, 43 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/vas.c b/arch/powerpc/platforms/pseries/vas.c
index 15d958e38eca..efdaf12ffe49 100644
--- a/arch/powerpc/platforms/pseries/vas.c
+++ b/arch/powerpc/platforms/pseries/vas.c
@@ -31,7 +31,8 @@ static struct hv_vas_cop_feat_caps hv_cop_caps;
 
 static struct vas_caps vascaps[VAS_MAX_FEAT_TYPE];
 static DEFINE_MUTEX(vas_pseries_mutex);
-static bool migration_in_progress;
+static atomic_t migration_in_progress = ATOMIC_INIT(0);
+static DECLARE_WAIT_QUEUE_HEAD(open_win_progress_wq);
 
 static long hcall_return_busy_check(long rc)
 {
@@ -384,11 +385,15 @@ static struct vas_window *vas_allocate_window(int vas_id, u64 flags,
 	 * same fault IRQ is not freed by the OS before.
 	 */
 	mutex_lock(&vas_pseries_mutex);
-	if (migration_in_progress)
+	if (atomic_read(&migration_in_progress)) {
 		rc = -EBUSY;
-	else
+	} else {
 		rc = allocate_setup_window(txwin, (u64 *)&domain[0],
 				   cop_feat_caps->win_type);
+		if (!rc)
+			atomic_inc(&caps->nr_open_wins_progress);
+	}
+
 	mutex_unlock(&vas_pseries_mutex);
 	if (rc)
 		goto out;
@@ -403,8 +408,17 @@ static struct vas_window *vas_allocate_window(int vas_id, u64 flags,
 		goto out_free;
 
 	txwin->win_type = cop_feat_caps->win_type;
-	mutex_lock(&vas_pseries_mutex);
+
 	/*
+	 * The migration SUSPEND thread sets migration_in_progress and
+	 * closes all open windows from the list. But the window is
+	 * added to the list after open and modify HCALLs. So possible
+	 * that migration_in_progress is set before modify HCALL which
+	 * may cause some windows are still open when the hypervisor
+	 * initiates the migration.
+	 * So checks the migration_in_progress flag again and close all
+	 * open windows.
+	 *
 	 * Possible to lose the acquired credit with DLPAR core
 	 * removal after the window is opened. So if there are any
 	 * closed windows (means with lost credits), do not give new
@@ -412,9 +426,11 @@ static struct vas_window *vas_allocate_window(int vas_id, u64 flags,
 	 * after the existing windows are reopened when credits are
 	 * available.
 	 */
-	if (!caps->nr_close_wins) {
+	mutex_lock(&vas_pseries_mutex);
+	if (!caps->nr_close_wins && !atomic_read(&migration_in_progress)) {
 		list_add(&txwin->win_list, &caps->list);
 		caps->nr_open_windows++;
+		atomic_dec(&caps->nr_open_wins_progress);
 		mutex_unlock(&vas_pseries_mutex);
 		vas_user_win_add_mm_context(&txwin->vas_win.task_ref);
 		return &txwin->vas_win;
@@ -432,6 +448,8 @@ static struct vas_window *vas_allocate_window(int vas_id, u64 flags,
 	 */
 	free_irq_setup(txwin);
 	h_deallocate_vas_window(txwin->vas_win.winid);
+	atomic_dec(&caps->nr_open_wins_progress);
+	wake_up(&open_win_progress_wq);
 out:
 	atomic_dec(&cop_feat_caps->nr
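
The in-progress counter idea from this patch can be sketched in user
space as follows. This is only an approximation under stated
assumptions: a pthread condition variable stands in for the kernel
wait queue, the flag and counter are kept under one lock rather than
being atomic_t, and every ip_* name is hypothetical.

```c
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t ip_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t ip_progress_cv = PTHREAD_COND_INITIALIZER;
static bool ip_migrating;	/* stand-in for migration_in_progress */
static int ip_in_progress;	/* opens that passed the first check */
static int ip_listed;		/* windows visible to the migration path */

/* Phase 1: refuse new opens once migration has started, else count the open. */
int ip_open_begin(void)
{
	int rc = 0;

	pthread_mutex_lock(&ip_lock);
	if (ip_migrating)
		rc = -EBUSY;
	else
		ip_in_progress++;
	pthread_mutex_unlock(&ip_lock);
	return rc;
}

/* Phase 2: after the unlocked setup work, list the window or back out. */
int ip_open_finish(void)
{
	int rc = 0;

	pthread_mutex_lock(&ip_lock);
	if (ip_migrating)
		rc = -EBUSY;	/* caller closes the window it just opened */
	else
		ip_listed++;
	ip_in_progress--;
	pthread_cond_broadcast(&ip_progress_cv);
	pthread_mutex_unlock(&ip_lock);
	return rc;
}

/* Migration: block new opens, wait for in-flight ones, then close the list. */
int ip_migration_suspend(void)
{
	int closed;

	pthread_mutex_lock(&ip_lock);
	ip_migrating = true;
	while (ip_in_progress > 0)
		pthread_cond_wait(&ip_progress_cv, &ip_lock);
	closed = ip_listed;
	ip_listed = 0;
	pthread_mutex_unlock(&ip_lock);
	return closed;
}
```

The point of the counter is that the migration path does not return
while any open sits between its two phases, so no window can be added
to the list after the list has been closed.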

[PATCH v2] powerpc/pseries/vas: Hold mmap_mutex after mmap lock during window close

2023-07-16 Thread Haren Myneni
Commit 8ef7b9e1765a ("powerpc/pseries/vas: Close windows with DLPAR
core removal") unmaps the window paste address and issues an HCALL to
close the window in the hypervisor for migration or DLPAR core removal
events. It holds mmap_mutex and then the mmap lock before unmapping
the paste address. But if user space issues mmap() on the paste
address at the same time as the migration event, coproc_mmap() is
called with the mmap lock already held, which can trigger a deadlock
when it tries to acquire mmap_mutex.

t1: mmap() call to mmap           t2: Migration event
    window paste address

do_mmap2()                        migration_store()
 ksys_mmap_pgoff()                 pseries_migrate_partition()
  vm_mmap_pgoff()                   vas_migration_handler()
   Acquire mmap lock                 reconfig_close_windows()
   do_mmap()                          lock mmap_mutex
    mmap_region()                     Acquire mmap lock
     call_mmap()                      //Wait for mmap lock
      coproc_mmap()                    unmap vma
       lock mmap_mutex                 update window status
       //wait for mmap_mutex          Release mmap lock
        mmap vma                      unlock mmap_mutex
        update window status
       unlock mmap_mutex
   ...
   Release mmap lock

Fix this deadlock issue by holding mmap lock first before mmap_mutex
in reconfig_close_windows().

Fixes: 8ef7b9e1765a ("powerpc/pseries/vas: Close windows with DLPAR core removal")
Signed-off-by: Haren Myneni 

---
Changes from v1:
- Update commit log with more description on deadlock traces
---
 arch/powerpc/platforms/pseries/vas.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/vas.c b/arch/powerpc/platforms/pseries/vas.c
index 513180467562..15d958e38eca 100644
--- a/arch/powerpc/platforms/pseries/vas.c
+++ b/arch/powerpc/platforms/pseries/vas.c
@@ -744,6 +744,12 @@ static int reconfig_close_windows(struct vas_caps *vcap, int excess_creds,
 		}
 
 		task_ref = &win->vas_win.task_ref;
+		/*
+		 * VAS mmap (coproc_mmap()) and its fault handler
+		 * (vas_mmap_fault()) are called after holding mmap lock.
+		 * So hold mmap mutex after mmap_lock to avoid deadlock.
+		 */
+		mmap_write_lock(task_ref->mm);
 		mutex_lock(&task_ref->mmap_mutex);
 		vma = task_ref->vma;
 		/*
@@ -752,7 +758,6 @@ static int reconfig_close_windows(struct vas_caps *vcap, int excess_creds,
 		 */
 		win->vas_win.status |= flag;
 
-		mmap_write_lock(task_ref->mm);
 		/*
 		 * vma is set in the original mapping. But this mapping
 		 * is done with mmap() after the window is opened with ioctl.
@@ -762,8 +767,8 @@ static int reconfig_close_windows(struct vas_caps *vcap, int excess_creds,
 		if (vma)
 			zap_vma_pages(vma);
 
-		mmap_write_unlock(task_ref->mm);
 		mutex_unlock(&task_ref->mmap_mutex);
+		mmap_write_unlock(task_ref->mm);
 		/*
 		 * Close VAS window in the hypervisor, but do not
 		 * free vas_window struct since it may be reused
-- 
2.26.3
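
The lock-ordering rule behind this fix can be sketched in user space
as below. It is a simplified illustration, assuming two plain pthread
mutexes: the real mmap lock is a read/write semaphore, and the ord_*
names are hypothetical. Both paths take the outer lock first and the
inner lock second, so neither can hold one lock while waiting for the
other in the opposite order.

```c
#include <pthread.h>

/* Outer lock: stand-in for the mm's mmap lock. */
static pthread_mutex_t ord_mmap_lock = PTHREAD_MUTEX_INITIALIZER;
/* Inner lock: stand-in for task_ref->mmap_mutex. */
static pthread_mutex_t ord_mmap_mutex = PTHREAD_MUTEX_INITIALIZER;
static int ord_window_status;

/* mmap path: the outer lock is already held when the driver takes the inner. */
void ord_mmap_path(void)
{
	pthread_mutex_lock(&ord_mmap_lock);
	pthread_mutex_lock(&ord_mmap_mutex);
	ord_window_status |= 1;		/* "mmap vma, update window status" */
	pthread_mutex_unlock(&ord_mmap_mutex);
	pthread_mutex_unlock(&ord_mmap_lock);
}

/* Close path after the fix: same order, outer lock first, inner second. */
void ord_close_path(void)
{
	pthread_mutex_lock(&ord_mmap_lock);
	pthread_mutex_lock(&ord_mmap_mutex);
	ord_window_status |= 2;		/* "unmap vma, update window status" */
	pthread_mutex_unlock(&ord_mmap_mutex);
	pthread_mutex_unlock(&ord_mmap_lock);
}
```

Before the fix, the close path took the inner lock first, which is
the inversion shown in the t1/t2 trace of the commit message.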



[PATCH] powerpc/pseries/vas: Hold mmap_mutex after mmap lock during window close

2023-06-22 Thread Haren Myneni
VAS mmap (coproc_mmap()) and its fault handler (vas_mmap_fault())
are called after holding mmap lock and acquire mmap_mutex to
update VAS window status. The migration / DLPAR window close can
hang while trying to acquire mmap lock if it is issued at the
same time with the user space ioctl mmap or VAS fault handler
execution.

So this patch adds changes to acquire mmap lock before holding
mmap_mutex.

Fixes: 8ef7b9e1765a ("powerpc/pseries/vas: Close windows with DLPAR core removal")
Signed-off-by: Haren Myneni 
---
 arch/powerpc/platforms/pseries/vas.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/vas.c b/arch/powerpc/platforms/pseries/vas.c
index 513180467562..15d958e38eca 100644
--- a/arch/powerpc/platforms/pseries/vas.c
+++ b/arch/powerpc/platforms/pseries/vas.c
@@ -744,6 +744,12 @@ static int reconfig_close_windows(struct vas_caps *vcap, int excess_creds,
 		}
 
 		task_ref = &win->vas_win.task_ref;
+		/*
+		 * VAS mmap (coproc_mmap()) and its fault handler
+		 * (vas_mmap_fault()) are called after holding mmap lock.
+		 * So hold mmap mutex after mmap_lock to avoid deadlock.
+		 */
+		mmap_write_lock(task_ref->mm);
 		mutex_lock(&task_ref->mmap_mutex);
 		vma = task_ref->vma;
 		/*
@@ -752,7 +758,6 @@ static int reconfig_close_windows(struct vas_caps *vcap, int excess_creds,
 		 */
 		win->vas_win.status |= flag;
 
-		mmap_write_lock(task_ref->mm);
 		/*
 		 * vma is set in the original mapping. But this mapping
 		 * is done with mmap() after the window is opened with ioctl.
@@ -762,8 +767,8 @@ static int reconfig_close_windows(struct vas_caps *vcap, int excess_creds,
 		if (vma)
 			zap_vma_pages(vma);
 
-		mmap_write_unlock(task_ref->mm);
 		mutex_unlock(&task_ref->mmap_mutex);
+		mmap_write_unlock(task_ref->mm);
 		/*
 		 * Close VAS window in the hypervisor, but do not
 		 * free vas_window struct since it may be reused
-- 
2.26.3



[PATCH v2] powerpc/pseries/vas: Ignore VAS update for DLPAR if copy/paste is not enabled

2023-03-20 Thread Haren Myneni


The hypervisor supports user-mode NX from Power10. pseries_vas_dlpar_cpu()
is called from lparcfg_write() to update VAS windows for DLPAR event in
shared processor mode and the kernel gets -ENOTSUPP for HCALLs if the
user-mode NX is not supported. The current VAS implementation also
supports only Radix page tables. Whereas in dedicated processor
mode, pseries_vas_notifier() is registered only if the copy/paste
feature is enabled. So instead of displaying HCALL error messages,
update VAS capabilities if the copy/paste feature is available.

This patch ignores updating VAS capabilities in pseries_vas_dlpar_cpu()
and returns success if the copy/paste feature is not enabled.
Then lparcfg_write() completes the processor DLPAR operations
without any failures.

Fixes: 2147783d6bf0 ("powerpc/pseries: Use lparcfg to reconfig VAS windows for DLPAR CPU")
Signed-off-by: Haren Myneni 
---
v2: Use bool for copypaste_feat and expand commit message on handling failures

 arch/powerpc/platforms/pseries/vas.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/arch/powerpc/platforms/pseries/vas.c b/arch/powerpc/platforms/pseries/vas.c
index 559112312810..513180467562 100644
--- a/arch/powerpc/platforms/pseries/vas.c
+++ b/arch/powerpc/platforms/pseries/vas.c
@@ -856,6 +856,13 @@ int pseries_vas_dlpar_cpu(void)
 {
 	int new_nr_creds, rc;
 
+	/*
+	 * NX-GZIP is not enabled. Nothing to do for DLPAR event
+	 */
+	if (!copypaste_feat)
+		return 0;
+
+
 	rc = h_query_vas_capabilities(H_QUERY_VAS_CAPABILITIES,
 				      vascaps[VAS_GZIP_DEF_FEAT_TYPE].feat,
 				      (u64)virt_to_phys(&hv_cop_caps));
@@ -1012,6 +1019,7 @@ static int __init pseries_vas_init(void)
 	 * Linux supports user space COPY/PASTE only with Radix
 	 */
 	if (!radix_enabled()) {
+		copypaste_feat = false;
 		pr_err("API is supported only with radix page tables\n");
 		return -ENOTSUPP;
 	}
-- 
2.26.3
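
The early-return guard added by this patch follows a common pattern,
sketched here in plain C. The cp_* names are hypothetical stand-ins
(cp_feat_enabled for copypaste_feat, cp_query_capabilities() for the
capability query HCALL); nothing here is the kernel implementation.

```c
#include <stdbool.h>

static bool cp_feat_enabled;	/* hypothetical stand-in for copypaste_feat */
static int cp_query_calls;	/* counts simulated capability queries */

/* Stand-in for the hypervisor query; always succeeds in this sketch. */
static int cp_query_capabilities(void)
{
	cp_query_calls++;
	return 0;
}

/* DLPAR hook: do nothing and report success when the feature is off. */
int cp_dlpar_cpu(void)
{
	if (!cp_feat_enabled)
		return 0;

	return cp_query_capabilities();
}
```

Returning 0 instead of an error means the caller treats the DLPAR
event as handled, and the query that would fail is never issued.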




Re: [PATCH] powerpc/pseries/vas: Ignore VAS update for DLPAR if copy/paste is not enabled

2023-03-20 Thread Haren Myneni
On Tue, 2023-03-07 at 20:55 -0600, Nathan Lynch wrote:
> Haren Myneni  writes:
> > The hypervisor supports user-mode NX from Power10. pseries_vas_dlpar_cpu()
> > is called from lparcfg_write() to update VAS windows for DLPAR CPU event
> > and the kernel gets -ENOTSUPP for HCALLs if the user-mode NX is not
> > supported.
> 
> The commit text would be improved by more explanation about the higher
> level failure mode here. Does lparcfg_write() fail when it shouldn't? If
> so, does that cause a processor DLPAR operation to spuriously fail?

Thanks for your suggestions, I will add more explanation in the
description.

If copy/paste is not enabled, pseries_vas_dlpar_cpu() just returns 0,
which lets lparcfg_write() return success, so the DLPAR operation is
not affected.

	if (pseries_vas_dlpar_cpu() != 0)
		retval = H_HARDWARE;
> 
> pseries_vas_dlpar_cpu() is also called from pseries_vas_notifier() in
> dedicated processor mode. Does this problem affect that scenario
> also?

It should not affect dedicated processor mode. pseries_vas_notifier()
is registered only if copy/paste is enabled.

	if (!rc && copypaste_feat) {
		if (firmware_has_feature(FW_FEATURE_LPAR))
			of_reconfig_notifier_register(&pseries_vas_nb);

> 
> > This patch ignores updating VAS capabilities and returns success if
> > the
> > copy/paste feature is not enabled.
> > 
> > Fixes: 2147783d6bf0 ("powerpc/pseries: Use lparcfg to reconfig VAS
> > windows for DLPAR CPU")
> > Signed-off-by: Haren Myneni 
> > ---
> >  arch/powerpc/platforms/pseries/vas.c | 8 
> >  1 file changed, 8 insertions(+)
> > 
> > diff --git a/arch/powerpc/platforms/pseries/vas.c
> > b/arch/powerpc/platforms/pseries/vas.c
> > index 559112312810..dc003849d2c5 100644
> > --- a/arch/powerpc/platforms/pseries/vas.c
> > +++ b/arch/powerpc/platforms/pseries/vas.c
> > @@ -856,6 +856,13 @@ int pseries_vas_dlpar_cpu(void)
> >  {
> > int new_nr_creds, rc;
> >  
> > +   /*
> > +* NX-GZIP is not enabled. Nothing to do for DLPAR event
> > +*/
> > +   if (!copypaste_feat)
> > +   return 0;
> > +
> > +
> >  	rc = h_query_vas_capabilities(H_QUERY_VAS_CAPABILITIES,
> >  				      vascaps[VAS_GZIP_DEF_FEAT_TYPE].feat,
> >  				      (u64)virt_to_phys(&hv_cop_caps));
> > @@ -1012,6 +1019,7 @@ static int __init pseries_vas_init(void)
> >  * Linux supports user space COPY/PASTE only with Radix
> >  */
> > if (!radix_enabled()) {
> > +   copypaste_feat = 0;
> 
> copypaste_feat is a bool, so use false, not 0. But otherwise I think
> this looks correct and consistent with the rest of the code in vas.c.

Correct my mistake.

Thanks
Haren



[PATCH] powerpc/pseries/vas: Ignore VAS update for DLPAR if copy/paste is not enabled

2023-03-06 Thread Haren Myneni


The hypervisor supports user-mode NX from Power10. pseries_vas_dlpar_cpu()
is called from lparcfg_write() to update VAS windows for DLPAR CPU event
and the kernel gets -ENOTSUPP for HCALLs if the user-mode NX is not
supported.

This patch ignores updating VAS capabilities and returns success if the
copy/paste feature is not enabled.

Fixes: 2147783d6bf0 ("powerpc/pseries: Use lparcfg to reconfig VAS windows for DLPAR CPU")
Signed-off-by: Haren Myneni 
---
 arch/powerpc/platforms/pseries/vas.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/arch/powerpc/platforms/pseries/vas.c b/arch/powerpc/platforms/pseries/vas.c
index 559112312810..dc003849d2c5 100644
--- a/arch/powerpc/platforms/pseries/vas.c
+++ b/arch/powerpc/platforms/pseries/vas.c
@@ -856,6 +856,13 @@ int pseries_vas_dlpar_cpu(void)
 {
 	int new_nr_creds, rc;
 
+	/*
+	 * NX-GZIP is not enabled. Nothing to do for DLPAR event
+	 */
+	if (!copypaste_feat)
+		return 0;
+
+
 	rc = h_query_vas_capabilities(H_QUERY_VAS_CAPABILITIES,
 				      vascaps[VAS_GZIP_DEF_FEAT_TYPE].feat,
 				      (u64)virt_to_phys(&hv_cop_caps));
@@ -1012,6 +1019,7 @@ static int __init pseries_vas_init(void)
 	 * Linux supports user space COPY/PASTE only with Radix
 	 */
 	if (!radix_enabled()) {
+		copypaste_feat = 0;
 		pr_err("API is supported only with radix page tables\n");
 		return -ENOTSUPP;
 	}
-- 
2.26.3




[PATCH v3] powerpc/pseries/vas: Add VAS IRQ primary handler

2022-10-09 Thread Haren Myneni


irq_default_primary_handler() can be used only with the IRQF_ONESHOT
flag, but that flag disables the IRQ before executing the thread
handler and enables it after the interrupt is handled. This IRQ
disable sets the VAS IRQ to the OFF state in the hypervisor. If NX
faults during this window, the hypervisor will not deliver the fault
interrupt to the partition, and user space may keep waiting for the
CSB update. So use a VAS specific IRQ handler instead of calling the
default primary handler.

Increment the pending_faults counter in the IRQ handler; the bottom
thread handler processes all faults based on this counter. If another
interrupt is received while the thread is running, it is processed
using this counter. The top and bottom handlers are synchronized with
the IRQTF_RUNTHREAD flag, and the bottom half is re-entered if this
flag is set.

Signed-off-by: Haren Myneni 
---
v3: Update pending_faults usage in changelog
v2: Use the pending_faults counter for the second interrupt and
process it with the previous interrupt handling if its thread
handler is executing.

 arch/powerpc/platforms/pseries/vas.c | 40 +++-
 arch/powerpc/platforms/pseries/vas.h |  1 +
 2 files changed, 34 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/vas.c b/arch/powerpc/platforms/pseries/vas.c
index 1a2cbc156e8f..70f26efcc35a 100644
--- a/arch/powerpc/platforms/pseries/vas.c
+++ b/arch/powerpc/platforms/pseries/vas.c
@@ -200,16 +200,41 @@ static irqreturn_t pseries_vas_fault_thread_fn(int irq, void *data)
 	struct vas_user_win_ref *tsk_ref;
 	int rc;
 
-	rc = h_get_nx_fault(txwin->vas_win.winid, (u64)virt_to_phys(&crb));
-	if (!rc) {
-		tsk_ref = &txwin->vas_win.task_ref;
-		vas_dump_crb(&crb);
-		vas_update_csb(&crb, tsk_ref);
+	while (atomic_read(&txwin->pending_faults)) {
+		rc = h_get_nx_fault(txwin->vas_win.winid, (u64)virt_to_phys(&crb));
+		if (!rc) {
+			tsk_ref = &txwin->vas_win.task_ref;
+			vas_dump_crb(&crb);
+			vas_update_csb(&crb, tsk_ref);
+		}
+		atomic_dec(&txwin->pending_faults);
 	}
 
 	return IRQ_HANDLED;
 }
 
+/*
+ * irq_default_primary_handler() can be used only with IRQF_ONESHOT
+ * which disables IRQ before executing the thread handler and enables
+ * it after. But this disabling interrupt sets the VAS IRQ OFF
+ * state in the hypervisor. If the NX generates fault interrupt
+ * during this window, the hypervisor will not deliver this
+ * interrupt to the LPAR. So use VAS specific IRQ handler instead
+ * of calling the default primary handler.
+ */
+static irqreturn_t pseries_vas_irq_handler(int irq, void *data)
+{
+	struct pseries_vas_window *txwin = data;
+
+	/*
+	 * The thread handler will process this interrupt if it is
+	 * already running.
+	 */
+	atomic_inc(&txwin->pending_faults);
+
+	return IRQ_WAKE_THREAD;
+}
+
 /*
  * Allocate window and setup IRQ mapping.
  */
@@ -240,8 +265,9 @@ static int allocate_setup_window(struct pseries_vas_window *txwin,
 		goto out_irq;
 	}
 
-	rc = request_threaded_irq(txwin->fault_virq, NULL,
-				  pseries_vas_fault_thread_fn, IRQF_ONESHOT,
+	rc = request_threaded_irq(txwin->fault_virq,
+				  pseries_vas_irq_handler,
+				  pseries_vas_fault_thread_fn, 0,
 				  txwin->name, txwin);
 	if (rc) {
 		pr_err("VAS-Window[%d]: Request IRQ(%u) failed with %d\n",
diff --git a/arch/powerpc/platforms/pseries/vas.h b/arch/powerpc/platforms/pseries/vas.h
index 333ffa2f9f42..a2cb12a31c17 100644
--- a/arch/powerpc/platforms/pseries/vas.h
+++ b/arch/powerpc/platforms/pseries/vas.h
@@ -132,6 +132,7 @@ struct pseries_vas_window {
 	u64 flags;
 	char *name;
 	int fault_virq;
+	atomic_t pending_faults; /* Number of pending faults */
 };
 
 int sysfs_add_vas_caps(struct vas_cop_feat_caps *caps);
-- 
2.26.3
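
The counter handoff between the primary handler and the thread
handler can be sketched with C11 atomics as below. This is a
user-space approximation only: the pf_* names are hypothetical, the
return value 1 merely stands in for IRQ_WAKE_THREAD, and the fault
processing is reduced to incrementing a counter.

```c
#include <stdatomic.h>

static atomic_int pf_pending;	/* faults signalled but not yet handled */
static int pf_handled;		/* faults processed by the thread handler */

/* Primary handler: record the fault and ask for the thread to run. */
int pf_irq_handler(void)
{
	atomic_fetch_add(&pf_pending, 1);
	return 1;	/* stands in for IRQ_WAKE_THREAD */
}

/* Thread handler: drain every fault recorded so far, return how many. */
int pf_thread_fn(void)
{
	int n = 0;

	while (atomic_load(&pf_pending) > 0) {
		pf_handled++;	/* stands in for get fault + CSB update */
		atomic_fetch_sub(&pf_pending, 1);
		n++;
	}
	return n;
}
```

Because the primary handler only increments the counter, an interrupt
that arrives while the thread is in its loop is picked up by the same
loop rather than being lost.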




[PATCH] powerpc/pseries: Use lparcfg to reconfig VAS windows for DLPAR CPU

2022-10-06 Thread Haren Myneni


The hypervisor assigns VAS (Virtual Accelerator Switchboard)
windows depending on the cores configured in the LPAR. The kernel
uses the OF reconfig notifier to reconfig VAS windows for DLPAR CPU
events. In a shared CPU mode partition, the hypervisor assigns VAS
windows depending on the CPU entitled capacity, not on vcpus.
When the user changes the CPU entitled capacity for the partition,
drmgr uses the /proc/ppc64/lparcfg interface to notify the kernel.

This patch adds the following changes to update VAS resources
for shared mode:
- Call vas reconfig windows from lparcfg_write()
- Ignore reconfig changes in the VAS notifier

Signed-off-by: Haren Myneni 
---
 arch/powerpc/platforms/pseries/lparcfg.c | 15 
 arch/powerpc/platforms/pseries/vas.c | 44 
 arch/powerpc/platforms/pseries/vas.h |  5 +++
 3 files changed, 50 insertions(+), 14 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/lparcfg.c 
b/arch/powerpc/platforms/pseries/lparcfg.c
index 507dc0b5987d..23f9c96e9abc 100644
--- a/arch/powerpc/platforms/pseries/lparcfg.c
+++ b/arch/powerpc/platforms/pseries/lparcfg.c
@@ -35,6 +35,7 @@
 #include 
 
 #include "pseries.h"
+#include "vas.h"   /* pseries_vas_dlpar_cpu() */
 
 /*
  * This isn't a module but we expose that to userspace
@@ -748,6 +749,20 @@ static ssize_t lparcfg_write(struct file *file, const char 
__user * buf,
return -EINVAL;
 
retval = update_ppp(new_entitled_ptr, NULL);
+
+   if (retval == H_SUCCESS || retval == H_CONSTRAINED) {
+   /*
+* The hypervisor assigns VAS resources based
+* on entitled capacity for shared mode.
+* Reconfig VAS windows based on DLPAR CPU events.
+* Returns success, -EIO, -EINVAL, or -ENOMEM
+*/
+   retval = pseries_vas_dlpar_cpu();
+   if (!retval)
+   retval = count;
+
+   return retval;
+   }
} else if (!strcmp(kbuf, "capacity_weight")) {
char *endp;
*new_weight_ptr = (u8) simple_strtoul(tmp, , 10);
diff --git a/arch/powerpc/platforms/pseries/vas.c 
b/arch/powerpc/platforms/pseries/vas.c
index 93f87ac126df..bf15586f2af7 100644
--- a/arch/powerpc/platforms/pseries/vas.c
+++ b/arch/powerpc/platforms/pseries/vas.c
@@ -857,6 +857,26 @@ int vas_reconfig_capabilties(u8 type, int new_nr_creds)
mutex_unlock(&vas_pseries_mutex);
return rc;
 }
+
+int pseries_vas_dlpar_cpu(void)
+{
+   int new_nr_creds, rc = 0;
+
+   rc = h_query_vas_capabilities(H_QUERY_VAS_CAPABILITIES,
+   vascaps[VAS_GZIP_DEF_FEAT_TYPE].feat,
+   (u64)virt_to_phys(&hv_cop_caps));
+   if (!rc) {
+   new_nr_creds = be16_to_cpu(hv_cop_caps.target_lpar_creds);
+   rc = vas_reconfig_capabilties(VAS_GZIP_DEF_FEAT_TYPE,
+   new_nr_creds);
+   }
+
+   if (rc)
+   pr_err("Failed reconfig VAS capabilities with DLPAR\n");
+
+   return rc;
+}
+
 /*
  * Total number of default credits available (target_credits)
  * in LPAR depends on number of cores configured. It varies based on
@@ -871,7 +891,15 @@ static int pseries_vas_notifier(struct notifier_block *nb,
struct of_reconfig_data *rd = data;
struct device_node *dn = rd->dn;
const __be32 *intserv = NULL;
-   int new_nr_creds, len, rc = 0;
+   int len;
+
+   /*
+* For shared CPU partition, the hypervisor assigns total credits
+* based on entitled core capacity. So updating VAS windows will
+* be called from lparcfg_write().
+*/
+   if (is_shared_processor())
+   return NOTIFY_OK;
 
if ((action == OF_RECONFIG_ATTACH_NODE) ||
(action == OF_RECONFIG_DETACH_NODE))
@@ -883,19 +911,7 @@ static int pseries_vas_notifier(struct notifier_block *nb,
if (!intserv)
return NOTIFY_OK;
 
-   rc = h_query_vas_capabilities(H_QUERY_VAS_CAPABILITIES,
-   vascaps[VAS_GZIP_DEF_FEAT_TYPE].feat,
-   (u64)virt_to_phys(&hv_cop_caps));
-   if (!rc) {
-   new_nr_creds = be16_to_cpu(hv_cop_caps.target_lpar_creds);
-   rc = vas_reconfig_capabilties(VAS_GZIP_DEF_FEAT_TYPE,
-   new_nr_creds);
-   }
-
-   if (rc)
-   pr_err("Failed reconfig VAS capabilities with DLPAR\n");
-
-   return rc;
+   return pseries_vas_dlpar_cpu();
 }
 
 static struct notifier_block pseries_vas_nb = {
diff --git a/arch/powerpc/platforms/pseries/vas.h 
b/arch/powerpc/platforms/pseries/vas.h
index a2cb12a31c17..7115043ec488 10

[PATCH v2] powerpc/pseries/vas: Add VAS IRQ primary handler

2022-10-05 Thread Haren Myneni


irq_default_primary_handler() can be used only with the IRQF_ONESHOT
flag, which disables the IRQ before executing the thread handler
and enables it after the interrupt is handled. But disabling the IRQ
sets the VAS IRQ OFF state in the hypervisor. If NX faults
during this window, the hypervisor will not deliver the fault
interrupt to the partition and the user space may wait indefinitely
for the CSB update. So use a VAS specific IRQ handler instead of
calling the default primary handler.

Signed-off-by: Haren Myneni 
---
v2: Use the pending_faults counter for the second interrupt and
process it with the previous interrupt handling if its thread
handler is executing.
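
The counter scheme from the v2 note can be sketched outside the kernel with
C11 atomics. This is only a model under stated assumptions: struct
window_model, primary_handler() and thread_handler() are hypothetical names,
and the return value 1 merely stands in for IRQ_WAKE_THREAD. It shows that a
fault recorded while the thread handler is already running is picked up by
the same drain loop.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical model of one window's fault state (not the kernel struct). */
struct window_model {
	atomic_int pending_faults; /* incremented by the primary handler */
	int        handled;        /* faults processed by the thread handler */
};

/* Primary handler: record the fault and ask for the thread to run. */
static int primary_handler(struct window_model *w)
{
	assert(w != NULL);
	atomic_fetch_add(&w->pending_faults, 1);
	return 1; /* stands in for IRQ_WAKE_THREAD */
}

/* Thread handler: drain every fault recorded so far, then return. */
static int thread_handler(struct window_model *w)
{
	int drained = 0;

	assert(w != NULL);
	while (atomic_load(&w->pending_faults) > 0) {
		w->handled++;   /* stands in for fetching and reporting the fault */
		drained++;
		atomic_fetch_sub(&w->pending_faults, 1);
	}
	return drained;
}
```

Two primary_handler() calls followed by one thread_handler() call drain both
faults in a single pass, which is the behaviour the v2 change is after.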

 arch/powerpc/platforms/pseries/vas.c | 41 +++-
 arch/powerpc/platforms/pseries/vas.h |  1 +
 2 files changed, 35 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/vas.c 
b/arch/powerpc/platforms/pseries/vas.c
index 1a2cbc156e8f..93f87ac126df 100644
--- a/arch/powerpc/platforms/pseries/vas.c
+++ b/arch/powerpc/platforms/pseries/vas.c
@@ -200,16 +200,42 @@ static irqreturn_t pseries_vas_fault_thread_fn(int irq, 
void *data)
struct vas_user_win_ref *tsk_ref;
int rc;
 
-   rc = h_get_nx_fault(txwin->vas_win.winid, (u64)virt_to_phys(&crb));
-   if (!rc) {
-   tsk_ref = &txwin->vas_win.task_ref;
-   vas_dump_crb(&crb);
-   vas_update_csb(&crb, tsk_ref);
+   while (atomic_read(&txwin->pending_faults)) {
+   rc = h_get_nx_fault(txwin->vas_win.winid,
+   (u64)virt_to_phys(&crb));
+   if (!rc) {
+   tsk_ref = &txwin->vas_win.task_ref;
+   vas_dump_crb(&crb);
+   vas_update_csb(&crb, tsk_ref);
+   }
+   atomic_dec(&txwin->pending_faults);
}
 
return IRQ_HANDLED;
 }
 
+/*
+ * irq_default_primary_handler() can be used only with IRQF_ONESHOT
+ * which disables IRQ before executing the thread handler and enables
+ * it after. But this disabling interrupt sets the VAS IRQ OFF
+ * state in the hypervisor. If the NX generates fault interrupt
+ * during this window, the hypervisor will not deliver this
+ * interrupt to the LPAR. So use VAS specific IRQ handler instead
+ * of calling the default primary handler.
+ */
+static irqreturn_t pseries_vas_irq_handler(int irq, void *data)
+{
+   struct pseries_vas_window *txwin = data;
+
+   /*
+* The thread hanlder can process this interrupt if it is
+* already running.
+*/
+   atomic_inc(&txwin->pending_faults);
+
+   return IRQ_WAKE_THREAD;
+}
+
 /*
  * Allocate window and setup IRQ mapping.
  */
@@ -240,8 +266,9 @@ static int allocate_setup_window(struct pseries_vas_window 
*txwin,
goto out_irq;
}
 
-   rc = request_threaded_irq(txwin->fault_virq, NULL,
- pseries_vas_fault_thread_fn, IRQF_ONESHOT,
+   rc = request_threaded_irq(txwin->fault_virq,
+ pseries_vas_irq_handler,
+ pseries_vas_fault_thread_fn, 0,
  txwin->name, txwin);
if (rc) {
pr_err("VAS-Window[%d]: Request IRQ(%u) failed with %d\n",
diff --git a/arch/powerpc/platforms/pseries/vas.h 
b/arch/powerpc/platforms/pseries/vas.h
index 333ffa2f9f42..a2cb12a31c17 100644
--- a/arch/powerpc/platforms/pseries/vas.h
+++ b/arch/powerpc/platforms/pseries/vas.h
@@ -132,6 +132,7 @@ struct pseries_vas_window {
u64 flags;
char *name;
int fault_virq;
+   atomic_t pending_faults; /* Number of pending faults */
 };
 
 int sysfs_add_vas_caps(struct vas_cop_feat_caps *caps);
-- 
2.26.3




[PATCH] powerpc/pseries/vas: Pass hw_cpu_id to node associativity HCALL

2022-09-28 Thread Haren Myneni


Generally the hypervisor decides on which VAS instance to allocate
a window. But if the user space wishes to allocate on the
current VAS instance where the process is executing, the kernel has
to pass the associativity domain IDs to the allocate VAS window HCALL.
To determine the associativity domain IDs for the current CPU, the
kernel passes smp_processor_id() to the node associativity HCALL,
which may return an H_P2 (-55) error during a DLPAR CPU event.

This patch fixes this issue by passing hard_smp_processor_id() with
VPHN_FLAG_VCPU flag (PAPR 14.11.6.1 H_HOME_NODE_ASSOCIATIVITY).
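
Why the logical CPU number can be rejected is easier to see with a toy
mapping. The table below is invented for illustration (real logical to
hardware ID relationships depend on the platform and on DLPAR history), and
model_hard_id() and model_home_node_hcall() are hypothetical stand-ins, not
kernel or hypervisor interfaces. Only the -55 value for H_P2 comes from the
commit message.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_H_SUCCESS 0
#define MODEL_H_P2      (-55) /* value quoted in the commit message */

/* Hypothetical table: index = logical CPU number, value = hardware ID. */
static const int hw_id_of_logical[] = { 0, 8, 16, 24 };
#define NR_LOGICAL (sizeof(hw_id_of_logical) / sizeof(hw_id_of_logical[0]))

/* Stand-in for hard_smp_processor_id() applied to a given logical CPU. */
static int model_hard_id(size_t logical_cpu)
{
	assert(logical_cpu < NR_LOGICAL);
	return hw_id_of_logical[logical_cpu];
}

/* Stand-in for the hypervisor: accepts only IDs it actually assigned. */
static int model_home_node_hcall(int vcpu_id)
{
	for (size_t i = 0; i < NR_LOGICAL; i++)
		if (hw_id_of_logical[i] == vcpu_id)
			return MODEL_H_SUCCESS;
	return MODEL_H_P2;
}
```

In this model, passing logical CPU 2 directly fails because no hardware ID 2
exists, while passing model_hard_id(2) succeeds. Logical CPU 0 happens to
work either way, which is how such a bug can stay hidden until CPUs are
added or removed.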

Signed-off-by: Haren Myneni 
---
 arch/powerpc/platforms/pseries/vas.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/platforms/pseries/vas.c 
b/arch/powerpc/platforms/pseries/vas.c
index fe33bdb620d5..533026fd1f40 100644
--- a/arch/powerpc/platforms/pseries/vas.c
+++ b/arch/powerpc/platforms/pseries/vas.c
@@ -348,7 +348,7 @@ static struct vas_window *vas_allocate_window(int vas_id, 
u64 flags,
 * So no unpacking needs to be done.
 */
rc = plpar_hcall9(H_HOME_NODE_ASSOCIATIVITY, domain,
- VPHN_FLAG_VCPU, smp_processor_id());
+ VPHN_FLAG_VCPU, hard_smp_processor_id());
if (rc != H_SUCCESS) {
pr_err("H_HOME_NODE_ASSOCIATIVITY error: %d\n", rc);
goto out;
-- 
2.26.3




[PATCH v3] powerpc: Ignore DSI error caused by the copy/paste instruction

2022-09-27 Thread Haren Myneni


The data storage interrupt (DSI) error will be generated when the
paste operation is issued on the suspended Nest Accelerator (NX)
window due to NX state changes. The hypervisor expects the
partition to ignore this error during page fault handling.
To differentiate DSI caused by an actual HW configuration or by
the NX window, a new “ibm,pi-features” type value is defined.
Byte 0, bit 3 of pi-attribute-specifier-type is now defined to
indicate this DSI error. If this error is not ignored, the user
space can get SIGBUS when the NX request is issued.

This patch adds changes to read the ibm,pi-features property and ignore
the DSI error during page fault handling if MMU_FTR_NX_DSI is defined.
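
The masking idea can be tried in isolation. In this sketch the bit values
are placeholders and do not reflect the kernel's DSISR layout, and
model_page_fault_is_bad() is a hypothetical user-space function in which a
plain bool replaces the MMU_FTR_NX_DSI feature test.

```c
#include <assert.h>
#include <stdbool.h>

/* Placeholder bit values for illustration; not the kernel's DSISR layout. */
#define MODEL_BAD_COPYPASTE  0x08UL
#define MODEL_BAD_OTHER      0x30UL
#define MODEL_BAD_FAULT_MASK (MODEL_BAD_COPYPASTE | MODEL_BAD_OTHER)

static_assert((MODEL_BAD_COPYPASTE & MODEL_BAD_OTHER) == 0,
	      "placeholder bits must not overlap");

/*
 * Returns the subset of error bits that should be treated as fatal.
 * nx_dsi_feature models the firmware-advertised feature bit.
 */
static unsigned long model_page_fault_is_bad(unsigned long err,
					     bool nx_dsi_feature)
{
	unsigned long mask = MODEL_BAD_FAULT_MASK;

	/* With the feature present, a copy/paste DSI is no longer fatal. */
	if (nx_dsi_feature)
		mask &= ~MODEL_BAD_COPYPASTE;

	return err & mask;
}
```

With the feature off, the copy/paste bit alone makes the fault bad. With it
on, that bit is dropped from the mask while the other bad bits still count.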

Signed-off-by: Haren Myneni 
---
v2: Code cleanup as suggested by Christophe Leroy
v3: Make NX DSI as MMU feature instead of CPU feature 

 arch/powerpc/include/asm/mmu.h |  6 +-
 arch/powerpc/kernel/prom.c | 36 --
 arch/powerpc/mm/fault.c| 17 +++-
 3 files changed, 47 insertions(+), 12 deletions(-)

diff --git a/arch/powerpc/include/asm/mmu.h b/arch/powerpc/include/asm/mmu.h
index 860d0290ca4d..a0c10465b94a 100644
--- a/arch/powerpc/include/asm/mmu.h
+++ b/arch/powerpc/include/asm/mmu.h
@@ -120,6 +120,10 @@
  */
 #define MMU_FTR_1T_SEGMENT ASM_CONST(0x4000)
 
+/* NX paste RMA reject in DSI
+ */
+#define MMU_FTR_NX_DSI ASM_CONST(0x8000)
+
 /* MMU feature bit sets for various CPUs */
 #define MMU_FTRS_DEFAULT_HPTE_ARCH_V2  (MMU_FTR_HPTE_TABLE | MMU_FTR_TLBIEL | 
MMU_FTR_16M_PAGE)
 #define MMU_FTRS_POWER MMU_FTRS_DEFAULT_HPTE_ARCH_V2
@@ -181,7 +185,7 @@ enum {
 #endif
 #ifdef CONFIG_PPC_RADIX_MMU
MMU_FTR_TYPE_RADIX |
-   MMU_FTR_GTSE |
+   MMU_FTR_GTSE | MMU_FTR_NX_DSI |
 #endif /* CONFIG_PPC_RADIX_MMU */
 #endif
 #ifdef CONFIG_PPC_KUAP
diff --git a/arch/powerpc/kernel/prom.c b/arch/powerpc/kernel/prom.c
index a730b951b64b..2e7a04dab2f7 100644
--- a/arch/powerpc/kernel/prom.c
+++ b/arch/powerpc/kernel/prom.c
@@ -137,7 +137,7 @@ static void __init move_device_tree(void)
 }
 
 /*
- * ibm,pa-features is a per-cpu property that contains a string of
+ * ibm,pa/pi-features is a per-cpu property that contains a string of
  * attribute descriptors, each of which has a 2 byte header plus up
  * to 254 bytes worth of processor attribute bits.  First header
  * byte specifies the number of bytes following the header.
@@ -149,15 +149,17 @@ static void __init move_device_tree(void)
  * is supported/not supported.  Note that the bit numbers are
  * big-endian to match the definition in PAPR.
  */
-static struct ibm_pa_feature {
+struct ibm_feature {
unsigned long   cpu_features;   /* CPU_FTR_xxx bit */
unsigned long   mmu_features;   /* MMU_FTR_xxx bit */
unsigned intcpu_user_ftrs;  /* PPC_FEATURE_xxx bit */
unsigned intcpu_user_ftrs2; /* PPC_FEATURE2_xxx bit */
-   unsigned char   pabyte; /* byte number in ibm,pa-features */
+   unsigned char   pabyte; /* byte number in ibm,pa/pi-features */
unsigned char   pabit;  /* bit number (big-endian) */
unsigned char   invert; /* if 1, pa bit set => clear feature */
-} ibm_pa_features[] __initdata = {
+};
+
+static struct ibm_feature ibm_pa_features[] __initdata = {
{ .pabyte = 0,  .pabit = 0, .cpu_user_ftrs = PPC_FEATURE_HAS_MMU },
{ .pabyte = 0,  .pabit = 1, .cpu_user_ftrs = PPC_FEATURE_HAS_FPU },
{ .pabyte = 0,  .pabit = 3, .cpu_features  = CPU_FTR_CTRL },
@@ -179,9 +181,19 @@ static struct ibm_pa_feature {
{ .pabyte = 64, .pabit = 0, .cpu_features = CPU_FTR_DAWR1 },
 };
 
+/*
+ * ibm,pi-features property provides the support of processor specific
+ * options not described in ibm,pa-features. Right now use byte 0, bit 3
+ * which indicates the occurrence of DSI interrupt when the paste operation
+ * on the suspended NX window.
+ */
+static struct ibm_feature ibm_pi_features[] __initdata = {
+   { .pabyte = 0, .pabit = 3, .mmu_features  = MMU_FTR_NX_DSI },
+};
+
 static void __init scan_features(unsigned long node, const unsigned char *ftrs,
 unsigned long tablelen,
-struct ibm_pa_feature *fp,
+struct ibm_feature *fp,
 unsigned long ft_size)
 {
unsigned long i, len, bit;
@@ -218,17 +230,18 @@ static void __init scan_features(unsigned long node, 
const unsigned char *ftrs,
}
 }
 
-static void __init check_cpu_pa_features(unsigned long node)
+static void __init check_cpu_features(unsigned long node, char *name,
+ struct ibm_feature *fp,
+ unsigned long size)
 {
const unsigned char *pa_ftrs;
int tablelen;
 
-   pa_ftrs = of_get_flat_dt_prop(node, "ibm,pa-features", &tablelen);
+ 

Re: [PATCH v2] powerpc: Ignore DSI error caused by the copy/paste instruction

2022-09-26 Thread Haren Myneni
On Mon, 2022-09-26 at 05:55 +, Christophe Leroy wrote:
> 
> Le 25/09/2022 à 22:26, Haren Myneni a écrit :
> > DSI error will be generated when the paste operation is issued on
> > the suspended NX window due to NX state changes. The hypervisor
> > expects the partition to ignore this error during page fault
> > handling. To differentiate DSI caused by an actual HW configuration
> > or by the NX window, a new “ibm,pi-features” type value is defined.
> > Byte 0, bit 3 of pi-attribute-specifier-type is now defined to
> > indicate this DSI error. If this error is not ignored, the user
> > space can get SIGBUS when the NX request is issued.
> 
> Would be nice to mention at least one time in the message that NX
> stands for nest accelerator.
> 
> Otherwise, that's confusing with for example:
> Commit 2e602847d9c2 ("KVM: PPC: Don't flush PTEs on NX/RO hit")
> Commit c49643319715 ("powerpc/32s: Only leave NX unset on segments
> used 
> for modules")

Thanks. I did not realize that, since the VAS/NX code was added earlier.
I will add the description as you suggested.

> 
> 
> > This patch adds changes to read ibm,pi-features property and ignore
> > DSI error in the page fault handling if CPU_FTR_NX_DSI is defined.
> > 
> > Signed-off-by: Haren Myneni 
> > ---
> > v2: Code cleanup as suggested by Christophe Leroy
> > 
> >   arch/powerpc/include/asm/cputable.h |  5 ++--
> >   arch/powerpc/kernel/prom.c  | 36 +----
> >   arch/powerpc/mm/fault.c | 17 +-
> >   3 files changed, 45 insertions(+), 13 deletions(-)
> > 
> > diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
> > index 014005428687..cb949f12baa9 100644
> > --- a/arch/powerpc/mm/fault.c
> > +++ b/arch/powerpc/mm/fault.c
> > @@ -367,7 +367,22 @@ static void sanity_check_fault(bool is_write,
> > bool is_user,
> >   #elif defined(CONFIG_PPC_8xx)
> >   #define page_fault_is_bad(__err)  ((__err) & DSISR_NOEXEC_OR_G)
> >   #elif defined(CONFIG_PPC64)
> > -#define page_fault_is_bad(__err)   ((__err) & DSISR_BAD_FAULT_64S)
> > +static int page_fault_is_bad(unsigned long err)
> > +{
> > +   unsigned long flag = DSISR_BAD_FAULT_64S;
> > +
> > +   /*
> > +* PAPR 14.15.3.4.1
> > +* If byte 0, bit 3 of pi-attribute-specifier-type in
> > +* ibm,pi-features property is defined, ignore the DSI error
> > +* which is caused by the paste instruction on the
> > +* suspended NX window.
> > +*/
> > +   if (cpu_has_feature(CPU_FTR_NX_DSI))
> > +   flag &= ~DSISR_BAD_COPYPASTE;
> > +
> > +   return (err & flag);
> 
> You don't need parenthesis ( )
> 
> > +}
> >   #else
> >   #define page_fault_is_bad(__err)  ((__err) & DSISR_BAD_FAULT_32S)
> >   #endif



[PATCH v2] powerpc: Ignore DSI error caused by the copy/paste instruction

2022-09-25 Thread Haren Myneni


DSI error will be generated when the paste operation is issued on
the suspended NX window due to NX state changes. The hypervisor
expects the partition to ignore this error during page fault
handling. To differentiate DSI caused by an actual HW configuration
or by the NX window, a new “ibm,pi-features” type value is defined.
Byte 0, bit 3 of pi-attribute-specifier-type is now defined to
indicate this DSI error. If this error is not ignored, the user
space can get SIGBUS when the NX request is issued.

This patch adds changes to read ibm,pi-features property and ignore
DSI error in the page fault handling if CPU_FTR_NX_DSI is defined.
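
Reading "byte 0, bit 3" out of such a property depends on the descriptor
layout described in the prom.c comment of this patch: a 2 byte header whose
first byte gives the number of bytes that follow, then the attribute bytes,
with big-endian bit numbering. A minimal sketch of that lookup follows.
model_feature_bit() is a hypothetical helper operating on a single
descriptor, and the second header byte is not interpreted here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Test one attribute bit in a single descriptor.
 * desc[0]   = number of attribute bytes following the 2-byte header
 * desc[1]   = second header byte (not interpreted in this sketch)
 * desc[2..] = attribute bytes
 * Bit 0 is the most significant bit of a byte (big-endian numbering).
 */
static bool model_feature_bit(const unsigned char *desc, size_t desc_len,
			      unsigned int byte, unsigned int bit)
{
	assert(bit < 8);

	if (desc_len < 2)
		return false;          /* no room for the header */
	if (byte >= desc[0])
		return false;          /* byte not present in this descriptor */
	if ((size_t)byte + 2 >= desc_len)
		return false;          /* buffer shorter than the header claims */

	return (desc[2 + byte] >> (7 - bit)) & 1;
}
```

With this numbering, bit 3 of a byte corresponds to the value 0x10, so a
descriptor whose first attribute byte is 0x10 reports byte 0, bit 3 as set.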

Signed-off-by: Haren Myneni 
---
v2: Code cleanup as suggested by Christophe Leroy 

 arch/powerpc/include/asm/cputable.h |  5 ++--
 arch/powerpc/kernel/prom.c  | 36 +
 arch/powerpc/mm/fault.c | 17 +-
 3 files changed, 45 insertions(+), 13 deletions(-)

diff --git a/arch/powerpc/include/asm/cputable.h 
b/arch/powerpc/include/asm/cputable.h
index ae8c3e13cfce..8dc9949b6365 100644
--- a/arch/powerpc/include/asm/cputable.h
+++ b/arch/powerpc/include/asm/cputable.h
@@ -192,6 +192,7 @@ static inline void cpu_feature_keys_init(void) { }
 #define CPU_FTR_P9_RADIX_PREFETCH_BUG  LONG_ASM_CONST(0x0002)
 #define CPU_FTR_ARCH_31  LONG_ASM_CONST(0x0004)
 #define CPU_FTR_DAWR1  LONG_ASM_CONST(0x0008)
+#define CPU_FTR_NX_DSI LONG_ASM_CONST(0x0010)
 
 #ifndef __ASSEMBLY__
 
@@ -429,7 +430,7 @@ static inline void cpu_feature_keys_init(void) { }
CPU_FTR_CFAR | CPU_FTR_HVMODE | CPU_FTR_VMX_COPY | \
CPU_FTR_DBELL | CPU_FTR_HAS_PPR | CPU_FTR_ARCH_207S | \
CPU_FTR_TM_COMP | CPU_FTR_ARCH_300 | CPU_FTR_P9_TLBIE_STQ_BUG | \
-   CPU_FTR_P9_TLBIE_ERAT_BUG | CPU_FTR_P9_TIDR)
+   CPU_FTR_P9_TLBIE_ERAT_BUG | CPU_FTR_P9_TIDR | CPU_FTR_NX_DSI)
 #define CPU_FTRS_POWER9_DD2_0 (CPU_FTRS_POWER9 | CPU_FTR_P9_RADIX_PREFETCH_BUG)
 #define CPU_FTRS_POWER9_DD2_1 (CPU_FTRS_POWER9 | \
   CPU_FTR_P9_RADIX_PREFETCH_BUG | \
@@ -451,7 +452,7 @@ static inline void cpu_feature_keys_init(void) { }
CPU_FTR_CFAR | CPU_FTR_HVMODE | CPU_FTR_VMX_COPY | \
CPU_FTR_DBELL | CPU_FTR_HAS_PPR | CPU_FTR_ARCH_207S | \
CPU_FTR_ARCH_300 | CPU_FTR_ARCH_31 | \
-   CPU_FTR_DAWR | CPU_FTR_DAWR1)
+   CPU_FTR_DAWR | CPU_FTR_DAWR1 | CPU_FTR_NX_DSI)
 #define CPU_FTRS_CELL  (CPU_FTR_LWSYNC | \
CPU_FTR_PPCAS_ARCH_V2 | CPU_FTR_CTRL | \
CPU_FTR_ALTIVEC_COMP | CPU_FTR_MMCRA | CPU_FTR_SMT | \
diff --git a/arch/powerpc/kernel/prom.c b/arch/powerpc/kernel/prom.c
index a730b951b64b..19047c582e9f 100644
--- a/arch/powerpc/kernel/prom.c
+++ b/arch/powerpc/kernel/prom.c
@@ -137,7 +137,7 @@ static void __init move_device_tree(void)
 }
 
 /*
- * ibm,pa-features is a per-cpu property that contains a string of
+ * ibm,pa/pi-features is a per-cpu property that contains a string of
  * attribute descriptors, each of which has a 2 byte header plus up
  * to 254 bytes worth of processor attribute bits.  First header
  * byte specifies the number of bytes following the header.
@@ -149,15 +149,17 @@ static void __init move_device_tree(void)
  * is supported/not supported.  Note that the bit numbers are
  * big-endian to match the definition in PAPR.
  */
-static struct ibm_pa_feature {
+struct ibm_feature {
unsigned long   cpu_features;   /* CPU_FTR_xxx bit */
unsigned long   mmu_features;   /* MMU_FTR_xxx bit */
unsigned intcpu_user_ftrs;  /* PPC_FEATURE_xxx bit */
unsigned intcpu_user_ftrs2; /* PPC_FEATURE2_xxx bit */
-   unsigned char   pabyte; /* byte number in ibm,pa-features */
+   unsigned char   pabyte; /* byte number in ibm,pa/pi-features */
unsigned char   pabit;  /* bit number (big-endian) */
unsigned char   invert; /* if 1, pa bit set => clear feature */
-} ibm_pa_features[] __initdata = {
+};
+
+static struct ibm_feature ibm_pa_features[] __initdata = {
{ .pabyte = 0,  .pabit = 0, .cpu_user_ftrs = PPC_FEATURE_HAS_MMU },
{ .pabyte = 0,  .pabit = 1, .cpu_user_ftrs = PPC_FEATURE_HAS_FPU },
{ .pabyte = 0,  .pabit = 3, .cpu_features  = CPU_FTR_CTRL },
@@ -179,9 +181,19 @@ static struct ibm_pa_feature {
{ .pabyte = 64, .pabit = 0, .cpu_features = CPU_FTR_DAWR1 },
 };
 
+/*
+ * ibm,pi-features property provides the support of processor specific
+ * options not described in ibm,pa-features. Right now use byte 0, bit 3
+ * which indicates the occurrence of DSI interrupt when the paste operation
+ * on the suspended NX window.
+ */
+static struct ibm_feature ibm_pi_features[] __initdata = {
+   { .pabyte = 0, .pabit = 3, .cpu_features  = CPU_FTR_NX_DSI },
+};
+
 static void __init scan_features(unsigned long node, co

Re: [PATCH] powerpc/pseries: Move vas_migration_handler early during migration

2022-09-23 Thread Haren Myneni
On Thu, 2022-09-22 at 07:14 -0500, Nathan Lynch wrote:
> Haren Myneni  writes:
> > When the migration is initiated, the hypervisor changes VAS
> > mappings as part of pre-migration event. Then the OS gets the
> > migration event which closes all VAS windows before the migration
> > starts. NX generates continuous faults until windows are closed
> > and the user space can not differentiate these NX faults coming
> > from the actual migration. So to reduce this time window, close
> > VAS windows first in pseries_migrate_partition().
> 
> I'm concerned that this is only narrowing a window of time where
> undesirable faults occur, and that it may not be sufficient for all
> configurations. Migrations can be in progress for minutes or hours,
> while the time that we wait for the VASI state transition is usually
> seconds or minutes. So I worry that this works around a problem in
> limited cases but doesn't cover them all.
> 
> Maybe I don't understand the problem well enough. How does user space
> respond to the NX faults?

The user space resends the request to NX whenever the request is
returned with an NX fault. So the process should be the same even for
faults caused by the pre-migration.

Whereas the paste will be returned with failure when the window is
closed (the paste address is unmapped), and it can be considered as NX
busy. It is up to the user space whether to send the request again after
some delay, or to fall back to SW compression and send the request again
later.
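
One possible user space policy along these lines is sketched below. It is an
assumption made for illustration, not what any particular library does: the
enum names and nx_next_action() are hypothetical, and the retry limit is a
free parameter.

```c
#include <assert.h>

/* Hypothetical outcome of one request submission. */
enum nx_result { NX_DONE, NX_FAULT, NX_PASTE_FAILED };

/* Hypothetical next step for the caller. */
enum nx_action {
	ACT_FINISH,             /* request completed */
	ACT_RESEND_NOW,         /* NX fault: resend the request */
	ACT_RESEND_AFTER_DELAY, /* paste failed: treat as busy, retry later */
	ACT_USE_SOFTWARE        /* give up on NX for now, compress in SW */
};

static enum nx_action nx_next_action(enum nx_result r,
				     unsigned int busy_retries,
				     unsigned int max_busy_retries)
{
	switch (r) {
	case NX_DONE:
		return ACT_FINISH;
	case NX_FAULT:
		return ACT_RESEND_NOW;
	case NX_PASTE_FAILED:
		return busy_retries < max_busy_retries ?
			ACT_RESEND_AFTER_DELAY : ACT_USE_SOFTWARE;
	}
	assert(!"unknown nx_result");
	return ACT_USE_SOFTWARE;
}
```

The point of keeping the two failure kinds separate is that an NX fault is
retried immediately, while a failed paste is handled like a busy engine.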

For the migration, the pre-migration event is notified to the hypervisor
and then the OS will receive the migration event (SUSPEND). So this patch
closes windows early, before VASI, which removes NX fault handling
during the time taken for the VASI state transition.

Thanks
Haren



Re: [PATCH] powerpc: Ignore DSI error caused by the copy/paste instruction

2022-09-22 Thread Haren Myneni
On Thu, 2022-09-22 at 09:04 +, Christophe Leroy wrote:
> 
> Le 22/09/2022 à 10:29, Haren Myneni a écrit :
> > DSI error will be generated when the paste operation is issued on
> > the suspended NX window due to NX state changes. The hypervisor
> > expects the partition to ignore this error during page fault
> > handling. To differentiate DSI caused by an actual HW configuration
> > or by the NX window, a new “ibm,pi-features” type value is defined.
> > Byte 0, bit 3 of pi-attribute-specifier-type is now defined to
> > indicate this DSI error.
> 
> What is NX ? No eXec ? That's what it is usually. But in that case
> it 
> would be the ISI, not DSI.

NX is the nest accelerator, which supports several functions such as
compression and encryption. It is the DSI error mentioned in PAPR ("DSI
Caused by User Mode NX").

> 
> > This patch adds changes to read ibm,pi-features property and ignore
> > DSI error in the page fault handling if CPU_FTR_NX_DSI is defined.
> > 
> > Signed-off-by: Haren Myneni 
> > ---
> >   arch/powerpc/include/asm/cputable.h |  5 ++--
> >   arch/powerpc/kernel/prom.c  | 36 +----
> >   arch/powerpc/mm/fault.c | 17 +-
> >   3 files changed, 45 insertions(+), 13 deletions(-)
> > 
> > diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
> > index 014005428687..154cc1e85770 100644
> > --- a/arch/powerpc/mm/fault.c
> > +++ b/arch/powerpc/mm/fault.c
> > @@ -367,7 +367,22 @@ static void sanity_check_fault(bool is_write,
> > bool is_user,
> >   #elif defined(CONFIG_PPC_8xx)
> >   #define page_fault_is_bad(__err)  ((__err) & DSISR_NOEXEC_OR_G)
> >   #elif defined(CONFIG_PPC64)
> > -#define page_fault_is_bad(__err)   ((__err) & DSISR_BAD_FAULT_64S)
> > +static inline int page_fault_is_bad(unsigned long __err)
> 
> The name was __err because it was a macro and there was a risk of 
> collision with a 'err' variable in the caller.
> 
> But as it is now a function, you can just call it 'err'.
> 
> And no need of the 'inline' keyword, GCC will inline it anyway.

Thanks for your comments. I will repost the patch with these changes.

> 
> > +{
> > +   unsigned long flag = DSISR_BAD_FAULT_64S;
> > +
> > +   /*
> > +* PAPR 14.15.3.4.1
> > +* If byte 0, bit 3 of pi-attribute-specifier-type in
> > +* ibm,pi-features property is defined, ignore the DSI error
> > +* which is caused by the paste instruction on the
> > +* suspended NX window.
> > +*/
> > +   if (cpu_has_feature(CPU_FTR_NX_DSI))
> > +   flag &= ~DSISR_BAD_COPYPASTE;
> > +
> > +   return ((__err) & flag);
> 
> The () around __err was because it was a macro parameter. It is 
> pointless now. And same for the overall ones. Now it can be :
> 
>   return err & flags;
> 
> > +}
> >   #else
> >   #define page_fault_is_bad(__err)  ((__err) & DSISR_BAD_FAULT_32S)
> >   #endif
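
The review point about parentheses can be shown with a small example. The
mask value and all names here are made up for illustration. IS_BAD_UNSAFE is
included only to show what goes wrong with an unparenthesized macro
parameter and is deliberately not expanded anywhere.

```c
#include <assert.h>

#define MODEL_MASK 0x0FUL /* placeholder mask, not a kernel constant */

/*
 * Unsafe: IS_BAD_UNSAFE(a | b) expands to (a | b & MODEL_MASK), which
 * parses as (a | (b & MODEL_MASK)) because & binds tighter than |.
 */
#define IS_BAD_UNSAFE(err) (err & MODEL_MASK)

/* Safe macro form: the parameter is parenthesized. */
#define IS_BAD_MACRO(err)  ((err) & MODEL_MASK)

/* Function form: the argument is evaluated first, so no parentheses needed. */
static unsigned long is_bad_fn(unsigned long err)
{
	return err & MODEL_MASK;
}

static_assert(IS_BAD_MACRO(0xF0UL | 0x01UL) == 0x01UL,
	      "parenthesized macro masks the whole expression");
```

Once the check is a function, the argument is a value rather than a token
sequence, which is why the extra parentheses and the __err name stop being
necessary.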



[PATCH] powerpc: Ignore DSI error caused by the copy/paste instruction

2022-09-22 Thread Haren Myneni


DSI error will be generated when the paste operation is issued on
the suspended NX window due to NX state changes. The hypervisor
expects the partition to ignore this error during page fault
handling. To differentiate DSI caused by an actual HW configuration
or by the NX window, a new “ibm,pi-features” type value is defined.
Byte 0, bit 3 of pi-attribute-specifier-type is now defined to
indicate this DSI error.

This patch adds changes to read ibm,pi-features property and ignore
DSI error in the page fault handling if CPU_FTR_NX_DSI is defined.

Signed-off-by: Haren Myneni 
---
 arch/powerpc/include/asm/cputable.h |  5 ++--
 arch/powerpc/kernel/prom.c  | 36 +
 arch/powerpc/mm/fault.c | 17 +-
 3 files changed, 45 insertions(+), 13 deletions(-)

diff --git a/arch/powerpc/include/asm/cputable.h 
b/arch/powerpc/include/asm/cputable.h
index ae8c3e13cfce..8dc9949b6365 100644
--- a/arch/powerpc/include/asm/cputable.h
+++ b/arch/powerpc/include/asm/cputable.h
@@ -192,6 +192,7 @@ static inline void cpu_feature_keys_init(void) { }
 #define CPU_FTR_P9_RADIX_PREFETCH_BUG  LONG_ASM_CONST(0x0002)
 #define CPU_FTR_ARCH_31  LONG_ASM_CONST(0x0004)
 #define CPU_FTR_DAWR1  LONG_ASM_CONST(0x0008)
+#define CPU_FTR_NX_DSI LONG_ASM_CONST(0x0010)
 
 #ifndef __ASSEMBLY__
 
@@ -429,7 +430,7 @@ static inline void cpu_feature_keys_init(void) { }
CPU_FTR_CFAR | CPU_FTR_HVMODE | CPU_FTR_VMX_COPY | \
CPU_FTR_DBELL | CPU_FTR_HAS_PPR | CPU_FTR_ARCH_207S | \
CPU_FTR_TM_COMP | CPU_FTR_ARCH_300 | CPU_FTR_P9_TLBIE_STQ_BUG | \
-   CPU_FTR_P9_TLBIE_ERAT_BUG | CPU_FTR_P9_TIDR)
+   CPU_FTR_P9_TLBIE_ERAT_BUG | CPU_FTR_P9_TIDR | CPU_FTR_NX_DSI)
 #define CPU_FTRS_POWER9_DD2_0 (CPU_FTRS_POWER9 | CPU_FTR_P9_RADIX_PREFETCH_BUG)
 #define CPU_FTRS_POWER9_DD2_1 (CPU_FTRS_POWER9 | \
   CPU_FTR_P9_RADIX_PREFETCH_BUG | \
@@ -451,7 +452,7 @@ static inline void cpu_feature_keys_init(void) { }
CPU_FTR_CFAR | CPU_FTR_HVMODE | CPU_FTR_VMX_COPY | \
CPU_FTR_DBELL | CPU_FTR_HAS_PPR | CPU_FTR_ARCH_207S | \
CPU_FTR_ARCH_300 | CPU_FTR_ARCH_31 | \
-   CPU_FTR_DAWR | CPU_FTR_DAWR1)
+   CPU_FTR_DAWR | CPU_FTR_DAWR1 | CPU_FTR_NX_DSI)
 #define CPU_FTRS_CELL  (CPU_FTR_LWSYNC | \
CPU_FTR_PPCAS_ARCH_V2 | CPU_FTR_CTRL | \
CPU_FTR_ALTIVEC_COMP | CPU_FTR_MMCRA | CPU_FTR_SMT | \
diff --git a/arch/powerpc/kernel/prom.c b/arch/powerpc/kernel/prom.c
index a730b951b64b..19047c582e9f 100644
--- a/arch/powerpc/kernel/prom.c
+++ b/arch/powerpc/kernel/prom.c
@@ -137,7 +137,7 @@ static void __init move_device_tree(void)
 }
 
 /*
- * ibm,pa-features is a per-cpu property that contains a string of
+ * ibm,pa/pi-features is a per-cpu property that contains a string of
  * attribute descriptors, each of which has a 2 byte header plus up
  * to 254 bytes worth of processor attribute bits.  First header
  * byte specifies the number of bytes following the header.
@@ -149,15 +149,17 @@ static void __init move_device_tree(void)
  * is supported/not supported.  Note that the bit numbers are
  * big-endian to match the definition in PAPR.
  */
-static struct ibm_pa_feature {
+struct ibm_feature {
unsigned long   cpu_features;   /* CPU_FTR_xxx bit */
unsigned long   mmu_features;   /* MMU_FTR_xxx bit */
unsigned intcpu_user_ftrs;  /* PPC_FEATURE_xxx bit */
unsigned intcpu_user_ftrs2; /* PPC_FEATURE2_xxx bit */
-   unsigned char   pabyte; /* byte number in ibm,pa-features */
+   unsigned char   pabyte; /* byte number in ibm,pa/pi-features */
unsigned char   pabit;  /* bit number (big-endian) */
unsigned char   invert; /* if 1, pa bit set => clear feature */
-} ibm_pa_features[] __initdata = {
+};
+
+static struct ibm_feature ibm_pa_features[] __initdata = {
{ .pabyte = 0,  .pabit = 0, .cpu_user_ftrs = PPC_FEATURE_HAS_MMU },
{ .pabyte = 0,  .pabit = 1, .cpu_user_ftrs = PPC_FEATURE_HAS_FPU },
{ .pabyte = 0,  .pabit = 3, .cpu_features  = CPU_FTR_CTRL },
@@ -179,9 +181,19 @@ static struct ibm_pa_feature {
{ .pabyte = 64, .pabit = 0, .cpu_features = CPU_FTR_DAWR1 },
 };
 
+/*
+ * ibm,pi-features property provides the support of processor specific
+ * options not described in ibm,pa-features. Right now use byte 0, bit 3
+ * which indicates the occurrence of DSI interrupt when the paste operation
+ * on the suspended NX window.
+ */
+static struct ibm_feature ibm_pi_features[] __initdata = {
+   { .pabyte = 0, .pabit = 3, .cpu_features  = CPU_FTR_NX_DSI },
+};
+
 static void __init scan_features(unsigned long node, const unsigned char *ftrs,
 unsigned long tablelen,
-struct ibm_pa_feature 

[PATCH] powerpc/pseries/vas: Add VAS IRQ primary handler

2022-09-22 Thread Haren Myneni


irq_default_primary_handler() can be used only with the IRQF_ONESHOT
flag, which disables the IRQ before executing the thread handler
and enables it after the interrupt is handled. But disabling the IRQ
sets the VAS IRQ OFF state in the hypervisor. If NX faults
during this window, the hypervisor will not deliver the fault
interrupt to the partition and the user space may wait indefinitely
for the CSB update. So use a VAS specific IRQ handler instead of
calling the default primary handler.

Signed-off-by: Haren Myneni 
---
 arch/powerpc/platforms/pseries/vas.c | 19 +--
 1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/vas.c 
b/arch/powerpc/platforms/pseries/vas.c
index 7e6e6dd2e33e..fe33bdb620d5 100644
--- a/arch/powerpc/platforms/pseries/vas.c
+++ b/arch/powerpc/platforms/pseries/vas.c
@@ -210,6 +210,20 @@ static irqreturn_t pseries_vas_fault_thread_fn(int irq, 
void *data)
return IRQ_HANDLED;
 }
 
+/*
+ * irq_default_primary_handler() can be used only with IRQF_ONESHOT
+ * which disables IRQ before executing the thread handler and enables
+ * it after. But this disabling interrupt sets the VAS IRQ OFF
+ * state in the hypervisor. If the NX generates fault interrupt
+ * during this window, the hypervisor will not deliver this
+ * interrupt to the LPAR. So use VAS specific IRQ handler instead
+ * of calling the default primary handler.
+ */
+static irqreturn_t pseries_vas_irq_handler(int irq, void *data)
+{
+   return IRQ_WAKE_THREAD;
+}
+
 /*
  * Allocate window and setup IRQ mapping.
  */
@@ -240,8 +254,9 @@ static int allocate_setup_window(struct pseries_vas_window 
*txwin,
goto out_irq;
}
 
-   rc = request_threaded_irq(txwin->fault_virq, NULL,
- pseries_vas_fault_thread_fn, IRQF_ONESHOT,
+   rc = request_threaded_irq(txwin->fault_virq,
+ pseries_vas_irq_handler,
+ pseries_vas_fault_thread_fn, 0,
  txwin->name, txwin);
if (rc) {
pr_err("VAS-Window[%d]: Request IRQ(%u) failed with %d\n",
-- 
2.26.3




[PATCH] powerpc/pseries: Move vas_migration_handler early during migration

2022-09-22 Thread Haren Myneni


When the migration is initiated, the hypervisor changes VAS
mappings as part of pre-migration event. Then the OS gets the
migration event which closes all VAS windows before the migration
starts. NX generates continuous faults until windows are closed
and the user space can not differentiate these NX faults coming
from the actual migration. So to reduce this time window, close
VAS windows first in pseries_migrate_partition().
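
Moving the suspend call ahead of the VASI wait has one consequence worth
spelling out: an early failure must still resume the windows. The model
below is a user-space sketch with hypothetical names (struct mig_model,
model_migrate()), and the two rc parameters stand in for the real wait and
migration steps.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bookkeeping for one partition. */
struct mig_model {
	int suspends;   /* times VAS windows were closed */
	int resumes;    /* times VAS windows were reopened */
	int attempts;   /* times the migration step itself ran */
};

static int model_migrate(struct mig_model *m, int vasi_wait_rc, int migrate_rc)
{
	int ret;

	assert(m != NULL);

	m->suspends++;            /* close windows before waiting on VASI */

	ret = vasi_wait_rc;       /* stands in for the VASI wait */
	if (ret)
		goto out;         /* windows are closed, so resume is still needed */

	m->attempts++;
	ret = migrate_rc;         /* stands in for the migration itself */
out:
	m->resumes++;             /* every suspend is paired with a resume */
	return ret;
}
```

A failing wait returns the error with suspends and resumes both at 1 and no
migration attempt, which is the pairing the "goto out" in the diff keeps.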

Signed-off-by: Haren Myneni 
---
 arch/powerpc/platforms/pseries/mobility.c | 15 ---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/mobility.c 
b/arch/powerpc/platforms/pseries/mobility.c
index 3d36a8955eaf..884595b7c51f 100644
--- a/arch/powerpc/platforms/pseries/mobility.c
+++ b/arch/powerpc/platforms/pseries/mobility.c
@@ -740,11 +740,19 @@ static int pseries_migrate_partition(u64 handle)
 #ifdef CONFIG_PPC_WATCHDOG
factor = nmi_wd_lpm_factor;
 #endif
+   /*
+* When the migration is initiated, the hypervisor changes VAS
+* mappings to prepare before OS gets the notification and
+* closes all VAS windows. NX generates continuous faults during
+* this time and the user space can not differentiate these
+* faults from the migration event. So reduce this time window
+* by closing VAS windows at the beginning of this function.
+*/
+   vas_migration_handler(VAS_SUSPEND);
+
ret = wait_for_vasi_session_suspending(handle);
if (ret)
-   return ret;
-
-   vas_migration_handler(VAS_SUSPEND);
+   goto out;
 
if (factor)
watchdog_nmi_set_timeout_pct(factor);
@@ -765,6 +773,7 @@ static int pseries_migrate_partition(u64 handle)
if (factor)
watchdog_nmi_set_timeout_pct(0);
 
+out:
vas_migration_handler(VAS_RESUME);
 
return ret;
-- 
2.26.3
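
The reordering in the patch above can be sketched in user-space C. This
is only an illustration: the helpers below are hypothetical stand-ins
for vas_migration_handler(), wait_for_vasi_session_suspending() and the
suspend work, and they merely record the order in which they are called.
The point is that once the windows are closed first, every exit path,
including an early failure, has to go through the resume call.

```c
/* Hypothetical stand-ins for the kernel symbols named in the patch. */
enum { VAS_SUSPEND = 1, VAS_RESUME = 2 };

/* Records the order of calls so the control flow can be inspected. */
static char call_log[16];
static int call_len;

static void log_call(char c)
{
	if (call_len < (int)sizeof(call_log) - 1) {
		call_log[call_len++] = c;
		call_log[call_len] = '\0';
	}
}

static void reset_log(void)
{
	call_len = 0;
	call_log[0] = '\0';
}

/* Stand-in for vas_migration_handler(): 'S' = suspend, 'R' = resume. */
static void vas_migration_handler(int action)
{
	log_call(action == VAS_SUSPEND ? 'S' : 'R');
}

/* Stand-in for wait_for_vasi_session_suspending(); returns wait_rc. */
static int wait_for_suspending(int wait_rc)
{
	log_call('W');
	return wait_rc;
}

/* Stand-in for the suspend and post-migration work. */
static int do_migration(void)
{
	log_call('M');
	return 0;
}

static int migrate_partition(int wait_rc)
{
	int ret;

	/* Close VAS windows before waiting on the hypervisor. */
	vas_migration_handler(VAS_SUSPEND);

	ret = wait_for_suspending(wait_rc);
	if (ret)
		goto out;	/* windows are closed, so they must be reopened */

	ret = do_migration();
out:
	vas_migration_handler(VAS_RESUME);
	return ret;
}
```

With this shape, a failed wait still reopens the windows, which is why
the early "return ret" became "goto out" in the diff.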




[PATCH] powerpc/pseries/vas: sysfs comments with the correct entries

2022-04-09 Thread Haren Myneni


The VAS entry is created as a misc device, so the sysfs comments
should list the correct entries.

Reported-by: Matheus Castanho 
Signed-off-by: Haren Myneni 
---
 arch/powerpc/platforms/pseries/vas-sysfs.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/vas-sysfs.c b/arch/powerpc/platforms/pseries/vas-sysfs.c
index f3c58c309cff..e05d2bac8824 100644
--- a/arch/powerpc/platforms/pseries/vas-sysfs.c
+++ b/arch/powerpc/platforms/pseries/vas-sysfs.c
@@ -74,26 +74,26 @@ struct vas_sysfs_entry {
 
 /*
  * Create sysfs interface:
- * /sys/devices/vas/vas0/gzip/default_capabilities
+ * /sys/devices/virtual/misc/vas/vas0/gzip/default_capabilities
  * This directory contains the following VAS GZIP capabilities
  * for the defaule credit type.
- * /sys/devices/vas/vas0/gzip/default_capabilities/nr_total_credits
+ * /sys/devices/virtual/misc/vas/vas0/gzip/default_capabilities/nr_total_credits
  * Total number of default credits assigned to the LPAR which
  * can be changed with DLPAR operation.
- * /sys/devices/vas/vas0/gzip/default_capabilities/nr_used_credits
+ * /sys/devices/virtual/misc/vas/vas0/gzip/default_capabilities/nr_used_credits
  * Number of credits used by the user space. One credit will
  * be assigned for each window open.
  *
- * /sys/devices/vas/vas0/gzip/qos_capabilities
+ * /sys/devices/virtual/misc/vas/vas0/gzip/qos_capabilities
  * This directory contains the following VAS GZIP capabilities
  * for the Quality of Service (QoS) credit type.
- * /sys/devices/vas/vas0/gzip/qos_capabilities/nr_total_credits
+ * /sys/devices/virtual/misc/vas/vas0/gzip/qos_capabilities/nr_total_credits
  * Total number of QoS credits assigned to the LPAR. The user
  * has to define this value using HMC interface. It can be
  * changed dynamically by the user.
- * /sys/devices/vas/vas0/gzip/qos_capabilities/nr_used_credits
+ * /sys/devices/virtual/misc/vas/vas0/gzip/qos_capabilities/nr_used_credits
  * Number of credits used by the user space.
- * /sys/devices/vas/vas0/gzip/qos_capabilities/update_total_credits
+ * /sys/devices/virtual/misc/vas/vas0/gzip/qos_capabilities/update_total_credits
  * Update total QoS credits dynamically
  */
 
-- 
2.27.0
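
For user space, the corrected paths can be consumed roughly as follows.
This is a minimal sketch, not part of the patch: vas_caps_path() and
vas_read_credits() are hypothetical helper names, and the read simply
fails on systems where the VAS misc device is not present.

```c
#include <stdio.h>

/* Base directory as documented in the corrected comment. */
#define VAS_GZIP_SYSFS "/sys/devices/virtual/misc/vas/vas0/gzip"

/*
 * Build the path of one capability attribute, e.g.
 * caps_dir = "qos_capabilities", attr = "nr_used_credits".
 * Returns 0 on success, -1 if the buffer is too small.
 */
static int vas_caps_path(char *buf, size_t len, const char *caps_dir,
			 const char *attr)
{
	int n = snprintf(buf, len, "%s/%s/%s", VAS_GZIP_SYSFS, caps_dir, attr);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Read an integer attribute; returns -1 if it is missing or unreadable. */
static int vas_read_credits(const char *caps_dir, const char *attr, int *val)
{
	char path[256];
	FILE *fp;
	int ok;

	if (vas_caps_path(path, sizeof(path), caps_dir, attr))
		return -1;

	fp = fopen(path, "r");
	if (!fp)
		return -1;

	ok = fscanf(fp, "%d", val) == 1;
	fclose(fp);
	return ok ? 0 : -1;
}
```

A caller would pass "default_capabilities" or "qos_capabilities" as
caps_dir and one of the attribute names listed in the comment.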




[PATCH] powerpc/powernv/vas: Assign real address to rx_fifo in vas_rx_win_attr

2022-04-09 Thread Haren Myneni
init_winctx_regs() calls __pa() on winctx->rx_fifo, and this
function initializes registers for both receive and fault
windows. But winctx->rx_fifo holds a real address for receive
windows and a virtual address for fault windows, which causes
errors with DEBUG_VIRTUAL enabled. Fix this issue by assigning
only the real address to rx_fifo in the vas_rx_win_attr struct
for both receive and fault windows.

Reported-by: Michael Ellerman 
Signed-off-by: Haren Myneni 
---
 arch/powerpc/include/asm/vas.h  | 2 +-
 arch/powerpc/platforms/powernv/vas-fault.c  | 2 +-
 arch/powerpc/platforms/powernv/vas-window.c | 4 ++--
 arch/powerpc/platforms/powernv/vas.h| 2 +-
 drivers/crypto/nx/nx-common-powernv.c   | 2 +-
 5 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/include/asm/vas.h b/arch/powerpc/include/asm/vas.h
index 83afcb6c194b..c36f71e01c0f 100644
--- a/arch/powerpc/include/asm/vas.h
+++ b/arch/powerpc/include/asm/vas.h
@@ -126,7 +126,7 @@ static inline void vas_user_win_add_mm_context(struct vas_user_win_ref *ref)
  * Receive window attributes specified by the (in-kernel) owner of window.
  */
 struct vas_rx_win_attr {
-   void *rx_fifo;
+   u64 rx_fifo;
int rx_fifo_size;
int wcreds_max;
 
diff --git a/arch/powerpc/platforms/powernv/vas-fault.c b/arch/powerpc/platforms/powernv/vas-fault.c
index a7aabc18039e..c1bfad56447d 100644
--- a/arch/powerpc/platforms/powernv/vas-fault.c
+++ b/arch/powerpc/platforms/powernv/vas-fault.c
@@ -216,7 +216,7 @@ int vas_setup_fault_window(struct vas_instance *vinst)
vas_init_rx_win_attr(&attr, VAS_COP_TYPE_FAULT);
 
attr.rx_fifo_size = vinst->fault_fifo_size;
-   attr.rx_fifo = vinst->fault_fifo;
+   attr.rx_fifo = __pa(vinst->fault_fifo);
 
/*
 * Max creds is based on number of CRBs can fit in the FIFO.
diff --git a/arch/powerpc/platforms/powernv/vas-window.c b/arch/powerpc/platforms/powernv/vas-window.c
index 0f8d39fbf2b2..0072682531d8 100644
--- a/arch/powerpc/platforms/powernv/vas-window.c
+++ b/arch/powerpc/platforms/powernv/vas-window.c
@@ -404,7 +404,7 @@ static void init_winctx_regs(struct pnv_vas_window *window,
 *
 * See also: Design note in function header.
 */
-   val = __pa(winctx->rx_fifo);
+   val = winctx->rx_fifo;
val = SET_FIELD(VAS_PAGE_MIGRATION_SELECT, val, 0);
write_hvwc_reg(window, VREG(LFIFO_BAR), val);
 
@@ -739,7 +739,7 @@ static void init_winctx_for_rxwin(struct pnv_vas_window *rxwin,
 */
winctx->fifo_disable = true;
winctx->intr_disable = true;
-   winctx->rx_fifo = NULL;
+   winctx->rx_fifo = 0;
}
 
winctx->lnotify_lpid = rxattr->lnotify_lpid;
diff --git a/arch/powerpc/platforms/powernv/vas.h b/arch/powerpc/platforms/powernv/vas.h
index 8bb08e395de0..08d9d3d5a22b 100644
--- a/arch/powerpc/platforms/powernv/vas.h
+++ b/arch/powerpc/platforms/powernv/vas.h
@@ -376,7 +376,7 @@ struct pnv_vas_window {
  * is a container for the register fields in the window context.
  */
 struct vas_winctx {
-   void *rx_fifo;
+   u64 rx_fifo;
int rx_fifo_size;
int wcreds_max;
int rsvd_txbuf_count;
diff --git a/drivers/crypto/nx/nx-common-powernv.c b/drivers/crypto/nx/nx-common-powernv.c
index 32a036ada5d0..f418817c0f43 100644
--- a/drivers/crypto/nx/nx-common-powernv.c
+++ b/drivers/crypto/nx/nx-common-powernv.c
@@ -827,7 +827,7 @@ static int __init vas_cfg_coproc_info(struct device_node *dn, int chip_id,
goto err_out;
 
vas_init_rx_win_attr(&rxattr, coproc->ct);
-   rxattr.rx_fifo = (void *)rx_fifo;
+   rxattr.rx_fifo = rx_fifo;
rxattr.rx_fifo_size = fifo_size;
rxattr.lnotify_lpid = lpid;
rxattr.lnotify_pid = pid;
-- 
2.27.0
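
The convention this patch settles on (rx_fifo always carries a real
address, converted exactly once by whoever owns a virtual address) can
be sketched as below. The names are hypothetical: demo_pa() is a
simplified stand-in for __pa(), and DEMO_PAGE_OFFSET is an assumed
linear-map offset chosen only for illustration.

```c
#include <stdint.h>

/* Assumed linear-map offset; the real value is platform specific. */
#define DEMO_PAGE_OFFSET 0xc000000000000000ULL

/* Simplified stand-in for __pa(): virtual to real address. */
static uint64_t demo_pa(uint64_t vaddr)
{
	return vaddr - DEMO_PAGE_OFFSET;
}

struct demo_rx_win_attr {
	uint64_t rx_fifo;	/* always a real address */
	int rx_fifo_size;
};

/* Fault window: the caller owns a virtual address and converts it once. */
static void setup_fault_attr(struct demo_rx_win_attr *attr,
			     uint64_t fifo_vaddr, int size)
{
	attr->rx_fifo = demo_pa(fifo_vaddr);
	attr->rx_fifo_size = size;
}

/* Receive window: the caller already holds a real address. */
static void setup_rx_attr(struct demo_rx_win_attr *attr,
			  uint64_t fifo_raddr, int size)
{
	attr->rx_fifo = fifo_raddr;
	attr->rx_fifo_size = size;
}

/* The register value is taken as-is; there is no second conversion. */
static uint64_t lfifo_bar_value(const struct demo_rx_win_attr *attr)
{
	return attr->rx_fifo;
}
```

Changing the field type from a pointer to an integer also makes the
convention visible in the type, since a u64 can no longer be mistaken
for a dereferenceable kernel pointer.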




[PATCH] powerpc/pseries/vas: Use QoS credits from the userspace

2022-03-19 Thread Haren Myneni


The user can change the QoS credits dynamically with the
management console interface, which notifies the OS through sysfs.
After the OS interface returns successfully, the management
console updates the hypervisor. Since the VAS capabilities in
the hypervisor are not yet updated when the OS gets the update,
the kernel uses the old total credits value from the
hypervisor. Fix this issue by using the new QoS credits
from user space instead of depending on the VAS capabilities
from the hypervisor.

Signed-off-by: Haren Myneni 
---
 arch/powerpc/platforms/pseries/vas-sysfs.c | 19 +-
 arch/powerpc/platforms/pseries/vas.c   | 23 +++---
 arch/powerpc/platforms/pseries/vas.h   |  2 +-
 3 files changed, 27 insertions(+), 17 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/vas-sysfs.c b/arch/powerpc/platforms/pseries/vas-sysfs.c
index 4a7fcde5afc0..f3c58c309cff 100644
--- a/arch/powerpc/platforms/pseries/vas-sysfs.c
+++ b/arch/powerpc/platforms/pseries/vas-sysfs.c
@@ -27,22 +27,31 @@ struct vas_caps_entry {
 
 /*
  * This function is used to get the notification from the drmgr when
- * QoS credits are changed. Though receiving the target total QoS
- * credits here, get the official QoS capabilities from the hypervisor.
+ * QoS credits are changed.
  */
-static ssize_t update_total_credits_trigger(struct vas_cop_feat_caps *caps,
+static ssize_t update_total_credits_store(struct vas_cop_feat_caps *caps,
const char *buf, size_t count)
 {
int err;
u16 creds;
 
err = kstrtou16(buf, 0, &creds);
+   /*
+* The user space interface from the management console
+* notifies OS with the new QoS credits and then the
+* hypervisor. So OS has to use this new credits value
+* and reconfigure VAS windows (close or reopen depends
+* on the credits available) instead of depending on VAS
+* QoS capabilities from the hypervisor.
+*/
if (!err)
-   err = vas_reconfig_capabilties(caps->win_type);
+   err = vas_reconfig_capabilties(caps->win_type, creds);
 
if (err)
return -EINVAL;
 
+   pr_info("Set QoS total credits %u\n", creds);
+
return count;
 }
 
@@ -92,7 +101,7 @@ VAS_ATTR_RO(nr_total_credits);
 VAS_ATTR_RO(nr_used_credits);
 
 static struct vas_sysfs_entry update_total_credits_attribute =
-   __ATTR(update_total_credits, 0200, NULL, update_total_credits_trigger);
+   __ATTR(update_total_credits, 0200, NULL, update_total_credits_store);
 
 static struct attribute *vas_def_capab_attrs[] = {
&nr_total_credits_attribute.attr,
diff --git a/arch/powerpc/platforms/pseries/vas.c b/arch/powerpc/platforms/pseries/vas.c
index 1f59d78c77a1..ec643bbdb67f 100644
--- a/arch/powerpc/platforms/pseries/vas.c
+++ b/arch/powerpc/platforms/pseries/vas.c
@@ -779,10 +779,10 @@ static int reconfig_close_windows(struct vas_caps *vcap, int excess_creds,
  * changes. Reconfig window configurations based on the credits
  * availability from this new capabilities.
  */
-int vas_reconfig_capabilties(u8 type)
+int vas_reconfig_capabilties(u8 type, int new_nr_creds)
 {
struct vas_cop_feat_caps *caps;
-   int old_nr_creds, new_nr_creds;
+   int old_nr_creds;
struct vas_caps *vcaps;
int rc = 0, nr_active_wins;
 
@@ -795,12 +795,6 @@ int vas_reconfig_capabilties(u8 type)
caps = &vcaps->caps;
 
mutex_lock(&vas_pseries_mutex);
-   rc = h_query_vas_capabilities(H_QUERY_VAS_CAPABILITIES, vcaps->feat,
- (u64)virt_to_phys(&hv_cop_caps));
-   if (rc)
-   goto out;
-
-   new_nr_creds = be16_to_cpu(hv_cop_caps.target_lpar_creds);
 
old_nr_creds = atomic_read(&caps->nr_total_credits);
 
@@ -832,7 +826,6 @@ int vas_reconfig_capabilties(u8 type)
false);
}
 
-out:
mutex_unlock(&vas_pseries_mutex);
return rc;
 }
@@ -850,7 +843,7 @@ static int pseries_vas_notifier(struct notifier_block *nb,
struct of_reconfig_data *rd = data;
struct device_node *dn = rd->dn;
const __be32 *intserv = NULL;
-   int len, rc = 0;
+   int new_nr_creds, len, rc = 0;
 
if ((action == OF_RECONFIG_ATTACH_NODE) ||
(action == OF_RECONFIG_DETACH_NODE))
@@ -862,7 +855,15 @@ static int pseries_vas_notifier(struct notifier_block *nb,
if (!intserv)
return NOTIFY_OK;
 
-   rc = vas_reconfig_capabilties(VAS_GZIP_DEF_FEAT_TYPE);
+   rc = h_query_vas_capabilities(H_QUERY_VAS_CAPABILITIES,
+   vascaps[VAS_GZIP_DEF_FEAT_TYPE].feat,
+   (u64)virt_to_phys(&hv_cop_caps));
+   if (!rc) {
+   new_nr_creds = be16_to_cpu(hv_cop_caps.target_lpar_creds);
+   rc = vas_reconfig_capabi
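
The store-handler flow described in this patch (parse the value written
by drmgr, then reconfigure with that value rather than querying the
hypervisor) can be approximated in user space as below. This is a sketch
under stated assumptions: parse_u16() only imitates kstrtou16() with
strtoul(), and reconfig() is a hypothetical stand-in for
vas_reconfig_capabilties(type, new_nr_creds).

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/*
 * User-space approximation of kstrtou16(buf, 0, &val): the base is
 * auto-detected, one trailing newline is allowed, anything else is
 * rejected.
 */
static int parse_u16(const char *buf, unsigned short *val)
{
	char *end;
	unsigned long v;

	if (!isdigit((unsigned char)*buf) && *buf != '+')
		return -EINVAL;

	errno = 0;
	v = strtoul(buf, &end, 0);
	if (end == buf)
		return -EINVAL;
	if (*end == '\n')
		end++;
	if (*end != '\0')
		return -EINVAL;
	if (errno == ERANGE || v > 0xffff)
		return -ERANGE;

	*val = (unsigned short)v;
	return 0;
}

static int total_credits;	/* stand-in for caps->nr_total_credits */

/* Hypothetical stand-in for vas_reconfig_capabilties(type, creds). */
static int reconfig(int new_nr_creds)
{
	total_credits = new_nr_creds;
	return 0;
}

/* Shape of the sysfs store handler: use the written value directly. */
static long update_total_credits_store(const char *buf, size_t count)
{
	unsigned short creds;
	int err = parse_u16(buf, &creds);

	if (!err)
		err = reconfig(creds);
	if (err)
		return -EINVAL;

	return (long)count;
}
```

As in the patch, any parse or reconfiguration error is reported to the
writer as -EINVAL, and a successful write returns the byte count.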

[PATCH v4 3/3] powerpc/pseries/vas: Add VAS migration handler

2022-03-02 Thread Haren Myneni
[Update: Included the build fix reported by kernel test robot ]

Since the VAS windows belong to the VAS hardware resource, the
hypervisor expects the partition to close them on the source
machine and reopen them on the destination machine after the
partition has migrated.

This handler is called before pseries_suspend() to close these
windows and invoked again after migration. All active windows
of both default and QoS types are closed and marked inactive,
then reopened after migration by this handler.
During the migration, user space receives a paste instruction
failure if it issues copy/paste on these inactive windows.

The current migration implementation does not freeze the user
space and applications can continue to open VAS windows while
migration is in progress. So when the migration_in_progress flag
is set, VAS open window API returns -EBUSY.

Signed-off-by: Haren Myneni 
---
 arch/powerpc/platforms/pseries/mobility.c |  5 ++
 arch/powerpc/platforms/pseries/vas.c  | 98 ++-
 arch/powerpc/platforms/pseries/vas.h  | 14 
 3 files changed, 116 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/platforms/pseries/mobility.c b/arch/powerpc/platforms/pseries/mobility.c
index 85033f392c78..70004243e25e 100644
--- a/arch/powerpc/platforms/pseries/mobility.c
+++ b/arch/powerpc/platforms/pseries/mobility.c
@@ -26,6 +26,7 @@
 #include 
 #include 
 #include "pseries.h"
+#include "vas.h"   /* vas_migration_handler() */
 #include "../../kernel/cacheinfo.h"
 
 static struct kobject *mobility_kobj;
@@ -669,12 +670,16 @@ static int pseries_migrate_partition(u64 handle)
if (ret)
return ret;
 
+   vas_migration_handler(VAS_SUSPEND);
+
ret = pseries_suspend(handle);
if (ret == 0)
post_mobility_fixup();
else
pseries_cancel_migration(handle, ret);
 
+   vas_migration_handler(VAS_RESUME);
+
return ret;
 }
 
diff --git a/arch/powerpc/platforms/pseries/vas.c b/arch/powerpc/platforms/pseries/vas.c
index fbcf311da0ec..1f59d78c77a1 100644
--- a/arch/powerpc/platforms/pseries/vas.c
+++ b/arch/powerpc/platforms/pseries/vas.c
@@ -30,6 +30,7 @@ static struct hv_vas_cop_feat_caps hv_cop_caps;
 
 static struct vas_caps vascaps[VAS_MAX_FEAT_TYPE];
 static DEFINE_MUTEX(vas_pseries_mutex);
+static bool migration_in_progress;
 
 static long hcall_return_busy_check(long rc)
 {
@@ -356,7 +357,10 @@ static struct vas_window *vas_allocate_window(int vas_id, u64 flags,
 * same fault IRQ is not freed by the OS before.
 */
mutex_lock(&vas_pseries_mutex);
-   rc = allocate_setup_window(txwin, (u64 *)&domain[0],
+   if (migration_in_progress)
+   rc = -EBUSY;
+   else
+   rc = allocate_setup_window(txwin, (u64 *)&domain[0],
    cop_feat_caps->win_type);
mutex_unlock(&vas_pseries_mutex);
if (rc)
@@ -869,6 +873,98 @@ static struct notifier_block pseries_vas_nb = {
.notifier_call = pseries_vas_notifier,
 };
 
+/*
+ * For LPM, all windows have to be closed on the source partition
+ * before migration and reopen them on the destination partition
+ * after migration. So closing windows during suspend and
+ * reopen them during resume.
+ */
+int vas_migration_handler(int action)
+{
+   struct vas_cop_feat_caps *caps;
+   int old_nr_creds, new_nr_creds = 0;
+   struct vas_caps *vcaps;
+   int i, rc = 0;
+
+   /*
+* NX-GZIP is not enabled. Nothing to do for migration.
+*/
+   if (!copypaste_feat)
+   return rc;
+
+   mutex_lock(&vas_pseries_mutex);
+
+   if (action == VAS_SUSPEND)
+   migration_in_progress = true;
+   else
+   migration_in_progress = false;
+
+   for (i = 0; i < VAS_MAX_FEAT_TYPE; i++) {
+   vcaps = &vascaps[i];
+   caps = &vcaps->caps;
+   old_nr_creds = atomic_read(&caps->nr_total_credits);
+
+   rc = h_query_vas_capabilities(H_QUERY_VAS_CAPABILITIES,
+ vcaps->feat,
+ (u64)virt_to_phys(&hv_cop_caps));
+   if (!rc) {
+   new_nr_creds = be16_to_cpu(hv_cop_caps.target_lpar_creds);
+   /*
+* Should not happen. But incase print messages, close
+* all windows in the list during suspend and reopen
+* windows based on new lpar_creds on the destination
+* system.
+*/
+   if (old_nr_creds != new_nr_creds) {
+   pr_err("Target credits mismatch with the hypervisor\n");
+   pr_err("state(%d): lpar creds: %d HV lpar creds: %d\n",
+   action, old_nr_creds, new_nr_creds);
+  
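
The interaction between the migration_in_progress flag and the open
path can be sketched as follows. This is a simplified, single-threaded
illustration with hypothetical names; in the patch both the flag update
and the open path run under vas_pseries_mutex, which is omitted here.

```c
#include <errno.h>
#include <stdbool.h>

enum { DEMO_VAS_SUSPEND = 1, DEMO_VAS_RESUME = 2 };

#define DEMO_MAX_WINS 8

static bool migration_in_progress;	/* mutex-protected in the patch */
static bool win_used[DEMO_MAX_WINS];
static bool win_active[DEMO_MAX_WINS];

/* Returns a window id, -EBUSY during migration, or -ENOSPC when full. */
static int demo_open_window(void)
{
	int i;

	if (migration_in_progress)
		return -EBUSY;

	for (i = 0; i < DEMO_MAX_WINS; i++) {
		if (!win_used[i]) {
			win_used[i] = win_active[i] = true;
			return i;
		}
	}
	return -ENOSPC;
}

/* A paste succeeds only on an active window. */
static int demo_paste(int id)
{
	return win_active[id] ? 0 : -1;
}

/* Suspend marks every open window inactive; resume reactivates them. */
static void demo_migration_handler(int action)
{
	int i;

	migration_in_progress = (action == DEMO_VAS_SUSPEND);
	for (i = 0; i < DEMO_MAX_WINS; i++)
		if (win_used[i])
			win_active[i] = !migration_in_progress;
}
```

Between the suspend and resume calls, new opens fail with -EBUSY and
pastes on existing windows fail, matching the behaviour the commit
message describes for user space.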

[PATCH v6 9/9] powerpc/pseries/vas: Add 'update_total_credits' entry for QoS capabilities

2022-02-28 Thread Haren Myneni


pseries supports two types of credits - Default (uses normal priority
FIFO) and Quality of Service (QoS uses high priority FIFO). The user
decides the number of QoS credits and sets this value with the HMC
interface. The total credits for QoS capabilities can be changed
dynamically with the HMC interface, which invokes drmgr to communicate
with the kernel.

This patch creates the 'update_total_credits' entry for QoS capabilities
so that the drmgr command can write the new target QoS credits in sysfs.
Instead of using this value, the kernel gets the new QoS capabilities
from the hypervisor whenever update_total_credits is updated, to make
sure it stays in sync with the QoS target credits in the hypervisor.

Signed-off-by: Haren Myneni 
---
 arch/powerpc/platforms/pseries/vas-sysfs.c | 54 +++---
 arch/powerpc/platforms/pseries/vas.c   |  2 +-
 arch/powerpc/platforms/pseries/vas.h   |  1 +
 3 files changed, 50 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/vas-sysfs.c b/arch/powerpc/platforms/pseries/vas-sysfs.c
index e24d3edb3021..4a7fcde5afc0 100644
--- a/arch/powerpc/platforms/pseries/vas-sysfs.c
+++ b/arch/powerpc/platforms/pseries/vas-sysfs.c
@@ -25,6 +25,27 @@ struct vas_caps_entry {
 
 #define to_caps_entry(entry) container_of(entry, struct vas_caps_entry, kobj)
 
+/*
+ * This function is used to get the notification from the drmgr when
+ * QoS credits are changed. Though receiving the target total QoS
+ * credits here, get the official QoS capabilities from the hypervisor.
+ */
+static ssize_t update_total_credits_trigger(struct vas_cop_feat_caps *caps,
+   const char *buf, size_t count)
+{
+   int err;
+   u16 creds;
+
+   err = kstrtou16(buf, 0, &creds);
+   if (!err)
+   err = vas_reconfig_capabilties(caps->win_type);
+
+   if (err)
+   return -EINVAL;
+
+   return count;
+}
+
 #define sysfs_caps_entry_read(_name)   \
 static ssize_t _name##_show(struct vas_cop_feat_caps *caps, char *buf) \
 {  \
@@ -63,17 +84,29 @@ struct vas_sysfs_entry {
  * changed dynamically by the user.
  * /sys/devices/vas/vas0/gzip/qos_capabilities/nr_used_credits
  * Number of credits used by the user space.
+ * /sys/devices/vas/vas0/gzip/qos_capabilities/update_total_credits
+ * Update total QoS credits dynamically
  */
 
 VAS_ATTR_RO(nr_total_credits);
 VAS_ATTR_RO(nr_used_credits);
 
-static struct attribute *vas_capab_attrs[] = {
+static struct vas_sysfs_entry update_total_credits_attribute =
+   __ATTR(update_total_credits, 0200, NULL, update_total_credits_trigger);
+
+static struct attribute *vas_def_capab_attrs[] = {
&nr_total_credits_attribute.attr,
&nr_used_credits_attribute.attr,
NULL,
 };
 
+static struct attribute *vas_qos_capab_attrs[] = {
+   &nr_total_credits_attribute.attr,
+   &nr_used_credits_attribute.attr,
+   &update_total_credits_attribute.attr,
+   NULL,
+};
+
 static ssize_t vas_type_show(struct kobject *kobj, struct attribute *attr,
 char *buf)
 {
@@ -118,19 +151,29 @@ static const struct sysfs_ops vas_sysfs_ops = {
.store  =   vas_type_store,
 };
 
-static struct kobj_type vas_attr_type = {
+static struct kobj_type vas_def_attr_type = {
.release=   vas_type_release,
.sysfs_ops  =   &vas_sysfs_ops,
-   .default_attrs  =   vas_capab_attrs,
+   .default_attrs  =   vas_def_capab_attrs,
 };
 
-static char *vas_caps_kobj_name(struct vas_cop_feat_caps *caps,
+static struct kobj_type vas_qos_attr_type = {
+   .release=   vas_type_release,
+   .sysfs_ops  =   &vas_sysfs_ops,
+   .default_attrs  =   vas_qos_capab_attrs,
+};
+
+static char *vas_caps_kobj_name(struct vas_caps_entry *centry,
struct kobject **kobj)
 {
+   struct vas_cop_feat_caps *caps = centry->caps;
+
if (caps->descriptor == VAS_GZIP_QOS_CAPABILITIES) {
+   kobject_init(&centry->kobj, &vas_qos_attr_type);
*kobj = gzip_caps_kobj;
return "qos_capabilities";
} else if (caps->descriptor == VAS_GZIP_DEFAULT_CAPABILITIES) {
+   kobject_init(&centry->kobj, &vas_def_attr_type);
*kobj = gzip_caps_kobj;
return "default_capabilities";
} else
@@ -152,9 +195,8 @@ int sysfs_add_vas_caps(struct vas_cop_feat_caps *caps)
if (!centry)
return -ENOMEM;
 
-   kobject_init(&centry->kobj, &vas_attr_type);
centry->caps = caps;
-   name  = vas_caps_kobj_name(caps, &kobj);
+   name  = vas_caps_kobj_name(centry, &kobj);
 
if (kobj) {
ret = kobject_add(&centry->kobj, kobj, "%s", name);
diff --git a/arch/powerpc/platforms/ps

[PATCH v6 8/9] powerpc/pseries/vas: sysfs interface to export capabilities

2022-02-28 Thread Haren Myneni


The hypervisor provides the available VAS GZIP capabilities such
as default or QoS window type and the target available credits in
each type. This patch creates sysfs entries and exports the target,
used and the available credits for each feature.

This interface can be used by the user space to determine the credits
usage or to set the target credits in the case of QoS type (for DLPAR).

/sys/devices/vas/vas0/gzip/default_capabilities (default GZIP capabilities)
nr_total_credits /* Total credits available. Can be
 /* changed with DLPAR operation */
nr_used_credits  /* Used credits */

/sys/devices/vas/vas0/gzip/qos_capabilities (QoS GZIP capabilities)
nr_total_credits
nr_used_credits

Signed-off-by: Haren Myneni 
Reviewed-by: Nicholas Piggin 
---
 arch/powerpc/platforms/pseries/Makefile|   2 +-
 arch/powerpc/platforms/pseries/vas-sysfs.c | 226 +
 arch/powerpc/platforms/pseries/vas.c   |   6 +
 arch/powerpc/platforms/pseries/vas.h   |   6 +
 4 files changed, 239 insertions(+), 1 deletion(-)
 create mode 100644 arch/powerpc/platforms/pseries/vas-sysfs.c

diff --git a/arch/powerpc/platforms/pseries/Makefile b/arch/powerpc/platforms/pseries/Makefile
index ee60b59024b4..29b522d2c755 100644
--- a/arch/powerpc/platforms/pseries/Makefile
+++ b/arch/powerpc/platforms/pseries/Makefile
@@ -29,6 +29,6 @@ obj-$(CONFIG_PPC_SVM) += svm.o
 obj-$(CONFIG_FA_DUMP)  += rtas-fadump.o
 
 obj-$(CONFIG_SUSPEND)  += suspend.o
-obj-$(CONFIG_PPC_VAS)  += vas.o
+obj-$(CONFIG_PPC_VAS)  += vas.o vas-sysfs.o
 
 obj-$(CONFIG_ARCH_HAS_CC_PLATFORM) += cc_platform.o
diff --git a/arch/powerpc/platforms/pseries/vas-sysfs.c b/arch/powerpc/platforms/pseries/vas-sysfs.c
new file mode 100644
index 000000000000..e24d3edb3021
--- /dev/null
+++ b/arch/powerpc/platforms/pseries/vas-sysfs.c
@@ -0,0 +1,226 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Copyright 2022-23 IBM Corp.
+ */
+
+#define pr_fmt(fmt) "vas: " fmt
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "vas.h"
+
+#ifdef CONFIG_SYSFS
+static struct kobject *pseries_vas_kobj;
+static struct kobject *gzip_caps_kobj;
+
+struct vas_caps_entry {
+   struct kobject kobj;
+   struct vas_cop_feat_caps *caps;
+};
+
+#define to_caps_entry(entry) container_of(entry, struct vas_caps_entry, kobj)
+
+#define sysfs_caps_entry_read(_name)   \
+static ssize_t _name##_show(struct vas_cop_feat_caps *caps, char *buf) \
+{  \
+   return sprintf(buf, "%d\n", atomic_read(&caps->_name)); \
+}
+
+struct vas_sysfs_entry {
+   struct attribute attr;
+   ssize_t (*show)(struct vas_cop_feat_caps *, char *);
+   ssize_t (*store)(struct vas_cop_feat_caps *, const char *, size_t);
+};
+
+#define VAS_ATTR_RO(_name) \
+   sysfs_caps_entry_read(_name);   \
+   static struct vas_sysfs_entry _name##_attribute = __ATTR(_name, \
+   0444, _name##_show, NULL);
+
+/*
+ * Create sysfs interface:
+ * /sys/devices/vas/vas0/gzip/default_capabilities
+ * This directory contains the following VAS GZIP capabilities
+ * for the defaule credit type.
+ * /sys/devices/vas/vas0/gzip/default_capabilities/nr_total_credits
+ * Total number of default credits assigned to the LPAR which
+ * can be changed with DLPAR operation.
+ * /sys/devices/vas/vas0/gzip/default_capabilities/nr_used_credits
+ * Number of credits used by the user space. One credit will
+ * be assigned for each window open.
+ *
+ * /sys/devices/vas/vas0/gzip/qos_capabilities
+ * This directory contains the following VAS GZIP capabilities
+ * for the Quality of Service (QoS) credit type.
+ * /sys/devices/vas/vas0/gzip/qos_capabilities/nr_total_credits
+ * Total number of QoS credits assigned to the LPAR. The user
+ * has to define this value using HMC interface. It can be
+ * changed dynamically by the user.
+ * /sys/devices/vas/vas0/gzip/qos_capabilities/nr_used_credits
+ * Number of credits used by the user space.
+ */
+
+VAS_ATTR_RO(nr_total_credits);
+VAS_ATTR_RO(nr_used_credits);
+
+static struct attribute *vas_capab_attrs[] = {
+   &nr_total_credits_attribute.attr,
+   &nr_used_credits_attribute.attr,
+   NULL,
+};
+
+static ssize_t vas_type_show(struct kobject *kobj, struct attribute *attr,
+char *buf)
+{
+   struct vas_caps_entry *centry;
+   struct vas_cop_feat_caps *caps;
+   struct vas_sysfs_entry *entry;
+
+   centry = to_caps_entry(kobj);
+   caps = centry->caps;
+   entry = container_of(attr, struct vas_sysfs_entry, attr);
+
+   if (!entry->show)
+   return -EIO;
+
+   return entry->show(caps, buf);
+}
+
+static ssize_t vas_type_store(st

[PATCH v6 7/9] powerpc/pseries/vas: Reopen windows with DLPAR core add

2022-02-28 Thread Haren Myneni


VAS windows can be closed in the hypervisor due to lost credits
when a core is removed. If NX requests are issued on these
inactive windows, the OS gets page faults and the paste failure
is returned to user space. If the lost credits become available
later with a core add, reopen these windows and set them active.
Later, when the OS sees page faults on these active windows, it
creates the mapping on the new paste address. User space can
then continue to use these windows and send HW compression
requests to NX successfully.

Signed-off-by: Haren Myneni 
---
 arch/powerpc/platforms/pseries/vas.c | 91 +++-
 1 file changed, 90 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/platforms/pseries/vas.c b/arch/powerpc/platforms/pseries/vas.c
index a297720bcdae..96178dd58adf 100644
--- a/arch/powerpc/platforms/pseries/vas.c
+++ b/arch/powerpc/platforms/pseries/vas.c
@@ -565,6 +565,88 @@ static int __init get_vas_capabilities(u8 feat, enum vas_cop_feat_type type,
return 0;
 }
 
+/*
+ * VAS windows can be closed due to lost credits when the core is
+ * removed. So reopen them if credits are available due to DLPAR
+ * core add and set the window active status. When NX sees the page
+ * fault on the unmapped paste address, the kernel handles the fault
+ * by setting the remapping to new paste address if the window is
+ * active.
+ */
+static int reconfig_open_windows(struct vas_caps *vcaps, int creds)
+{
+   long domain[PLPAR_HCALL9_BUFSIZE] = {VAS_DEFAULT_DOMAIN_ID};
+   struct vas_cop_feat_caps *caps = &vcaps->caps;
+   struct pseries_vas_window *win = NULL, *tmp;
+   int rc, mv_ents = 0;
+
+   /*
+* Nothing to do if there are no closed windows.
+*/
+   if (!vcaps->nr_close_wins)
+   return 0;
+
+   /*
+* For the core removal, the hypervisor reduces the credits
+* assigned to the LPAR and the kernel closes VAS windows
+* in the hypervisor depends on reduced credits. The kernel
+* uses LIFO (the last windows that are opened will be closed
+* first) and expects to open in the same order when credits
+* are available.
+* For example, 40 windows are closed when the LPAR lost 2 cores
+* (dedicated). If 1 core is added, this LPAR can have 20 more
+* credits. It means the kernel can reopen 20 windows. So move
+* 20 entries in the VAS windows lost and reopen next 20 windows.
+*/
+   if (vcaps->nr_close_wins > creds)
+   mv_ents = vcaps->nr_close_wins - creds;
+
+   list_for_each_entry_safe(win, tmp, &vcaps->list, win_list) {
+   if (!mv_ents)
+   break;
+
+   mv_ents--;
+   }
+
+   list_for_each_entry_safe_from(win, tmp, &vcaps->list, win_list) {
+   /*
+* Nothing to do on this window if it is not closed
+* with VAS_WIN_NO_CRED_CLOSE
+*/
+   if (!(win->vas_win.status & VAS_WIN_NO_CRED_CLOSE))
+   continue;
+
+   rc = allocate_setup_window(win, (u64 *)&domain[0],
+  caps->win_type);
+   if (rc)
+   return rc;
+
+   rc = h_modify_vas_window(win);
+   if (rc)
+   goto out;
+
+   mutex_lock(&win->vas_win.task_ref.mmap_mutex);
+   /*
+* Set window status to active
+*/
+   win->vas_win.status &= ~VAS_WIN_NO_CRED_CLOSE;
+   mutex_unlock(&win->vas_win.task_ref.mmap_mutex);
+   win->win_type = caps->win_type;
+   if (!--vcaps->nr_close_wins)
+   break;
+   }
+
+   return 0;
+out:
+   /*
+* Window modify HCALL failed. So close the window to the
+* hypervisor and return.
+*/
+   free_irq_setup(win);
+   h_deallocate_vas_window(win->vas_win.winid);
+   return rc;
+}
+
 /*
  * The hypervisor reduces the available credits if the LPAR lost core. It
  * means the excessive windows should not be active and the user space
@@ -673,7 +755,14 @@ static int vas_reconfig_capabilties(u8 type)
 * closed / reopened. Hold the vas_pseries_mutex so that the
 * the user space can not open new windows.
 */
-   if (old_nr_creds >  new_nr_creds) {
+   if (old_nr_creds <  new_nr_creds) {
+   /*
+* If the existing target credits is less than the new
+* target, reopen windows if they are closed due to
+* the previous DLPAR (core removal).
+*/
+   rc = reconfig_open_windows(vcaps, new_nr_creds - old_nr_creds);
+   } else {
/*
 * # active windows is more than new LPAR availa
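
The skip-then-reopen arithmetic in reconfig_open_windows() can be
checked with a small array model. This is a sketch, not the kernel
code: the window list is modelled as an int array in list order, where
a nonzero entry means the window was closed for lack of credit, and
both function names are hypothetical.

```c
/*
 * Number of list entries to skip before reopening, given the count of
 * closed windows and the newly available credits (mv_ents in the patch).
 */
static int entries_to_skip(int nr_close_wins, int creds)
{
	return nr_close_wins > creds ? nr_close_wins - creds : 0;
}

/*
 * closed[i] is nonzero if window i (in list order) was closed due to
 * lost credits. Skips the first entries_to_skip() entries, then reopens
 * closed windows until the list ends or none remain closed.
 * Returns the number of windows reopened.
 */
static int reopen_windows(int *closed, int n, int *nr_close_wins, int creds)
{
	int i, reopened = 0;

	/* Nothing to do if there are no closed windows. */
	if (!*nr_close_wins)
		return 0;

	for (i = entries_to_skip(*nr_close_wins, creds); i < n; i++) {
		if (!closed[i])
			continue;	/* not closed for lack of credit */

		closed[i] = 0;		/* window is active again */
		reopened++;
		if (!--*nr_close_wins)
			break;
	}
	return reopened;
}
```

For the example in the code comment, 40 closed windows and 20 regained
credits give 20 skipped entries, 20 reopened windows, and 20 windows
that stay closed until more credits arrive.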

[PATCH v6 6/9] powerpc/pseries/vas: Close windows with DLPAR core removal

2022-02-28 Thread Haren Myneni


The hypervisor assigns vas credits (windows) for each LPAR based
on the number of cores configured in that system. The OS is
expected to release credits when cores are removed, and may
allocate more when cores are added. So there is a possibility of
using excessive credits (windows) in the LPAR and the hypervisor
expects the system to close the excessive windows so that NX load
can be equally distributed across all LPARs in the system.

When the OS closes the excessive windows in the hypervisor,
it sets the window status inactive and invalidates the window
virtual address mapping. User space receives a paste instruction
failure if any NX requests are issued on the inactive window.
User space can then use the remaining open windows or retry
NX requests until this window is active again.

This patch also adds the notifier for core removal/add to close
windows in the hypervisor if the system lost credits (core
removal) and reopen windows in the hypervisor when the previously
lost credits are available.

Signed-off-by: Haren Myneni 
---
 arch/powerpc/include/asm/vas.h   |   2 +
 arch/powerpc/platforms/pseries/vas.c | 207 +--
 arch/powerpc/platforms/pseries/vas.h |   3 +
 3 files changed, 204 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/include/asm/vas.h b/arch/powerpc/include/asm/vas.h
index 27251af18c65..6baf7b9ffed4 100644
--- a/arch/powerpc/include/asm/vas.h
+++ b/arch/powerpc/include/asm/vas.h
@@ -34,6 +34,8 @@
  */
 #define VAS_WIN_ACTIVE 0x0 /* Used in platform independent */
/* vas mmap() */
+/* Window is closed in the hypervisor due to lost credit */
+#define VAS_WIN_NO_CRED_CLOSE  0x0001
 
 /*
  * Get/Set bit fields
diff --git a/arch/powerpc/platforms/pseries/vas.c b/arch/powerpc/platforms/pseries/vas.c
index 1035446f985b..a297720bcdae 100644
--- a/arch/powerpc/platforms/pseries/vas.c
+++ b/arch/powerpc/platforms/pseries/vas.c
@@ -370,13 +370,28 @@ static struct vas_window *vas_allocate_window(int vas_id, u64 flags,
if (rc)
goto out_free;
 
-   vas_user_win_add_mm_context(&txwin->vas_win.task_ref);
txwin->win_type = cop_feat_caps->win_type;
mutex_lock(&vas_pseries_mutex);
-   list_add(&txwin->win_list, &caps->list);
+   /*
+* Possible to lose the acquired credit with DLPAR core
+* removal after the window is opened. So if there are any
+* closed windows (means with lost credits), do not give new
+* window to user space. New windows will be opened only
+* after the existing windows are reopened when credits are
+* available.
+*/
+   if (!caps->nr_close_wins) {
+   list_add(&txwin->win_list, &caps->list);
+   caps->nr_open_windows++;
+   mutex_unlock(&vas_pseries_mutex);
+   vas_user_win_add_mm_context(&txwin->vas_win.task_ref);
+   return &txwin->vas_win;
+   }
mutex_unlock(&vas_pseries_mutex);
 
-   return &txwin->vas_win;
+   put_vas_user_win_ref(&txwin->vas_win.task_ref);
+   rc = -EBUSY;
+   pr_err("No credit is available to allocate window\n");
 
 out_free:
/*
@@ -439,14 +454,24 @@ static int vas_deallocate_window(struct vas_window *vwin)
 
caps = &vascaps[win->win_type].caps;
mutex_lock(&vas_pseries_mutex);
-   rc = deallocate_free_window(win);
-   if (rc) {
-   mutex_unlock(&vas_pseries_mutex);
-   return rc;
-   }
+   /*
+* VAS window is already closed in the hypervisor when
+* lost the credit. So just remove the entry from
+* the list, remove task references and free vas_window
+* struct.
+*/
+   if (!(win->vas_win.status & VAS_WIN_NO_CRED_CLOSE)) {
+   rc = deallocate_free_window(win);
+   if (rc) {
+   mutex_unlock(&vas_pseries_mutex);
+   return rc;
+   }
+   } else
+   vascaps[win->win_type].nr_close_wins--;
 
list_del(&win->win_list);
atomic_dec(&caps->nr_used_credits);
+   vascaps[win->win_type].nr_open_windows--;
mutex_unlock(&vas_pseries_mutex);
 
put_vas_user_win_ref(&vwin->task_ref);
@@ -501,6 +526,7 @@ static int __init get_vas_capabilities(u8 feat, enum vas_cop_feat_type type,
memset(vcaps, 0, sizeof(*vcaps));
INIT_LIST_HEAD(&vcaps->list);
 
+   vcaps->feat = feat;
caps = &vcaps->caps;
 
rc = h_query_vas_capabilities(H_QUERY_VAS_CAPABILITIES, feat,
@@ -539,6 +565,168 @@ static int __init get_vas_capabilities(u8 feat, enum 
vas_cop_feat_type type,
return 0;
 }
 
+/*
+ * The hypervisor reduces the available credits if the LPAR lost core. It
+ * means the excessive windows should not be active and the user space
+ * should not be using these windows to send compression requests to NX.
+ * So the kernel closes the excessive windows and unmap the pas

[PATCH v6 5/9] powerpc/vas: Map paste address only if window is active

2022-02-28 Thread Haren Myneni


The paste address mapping is done with mmap() after the window is
opened with ioctl. The partition has to close VAS windows in the
hypervisor if it lost credits due to DLPAR core removal, and the
kernel marks these windows inactive until the lost credits become
available again. If the window becomes inactive due to DLPAR after
this mmap(), the paste instruction returns failure until the OS
reopens the window.

A DLPAR core removal event can also occur before the user space
issues mmap(), which makes the corresponding window inactive. So
if the window is not active, fail mmap() with -EACCES and expect
the user space to reissue mmap() when the window is active again,
or to open a new window when the credit is available.
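
As a rough user-space illustration of the check described above (not the
kernel code itself), the sketch below models a window whose status is
tested under a mutex before a page frame number is handed out. The
struct, the function name and the 64K page shift are assumptions made
only for this example; the error codes follow the commit message.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdint.h>

#define MODEL_WIN_ACTIVE 0x0u   /* mirrors VAS_WIN_ACTIVE in the series */
#define MODEL_PAGE_SHIFT 16     /* assumption: 64K pages */

/* Hypothetical stand-in for the window state touched by mmap(). */
struct model_window {
	pthread_mutex_t mmap_mutex; /* serializes mmap() against close/reopen */
	uint32_t status;            /* 0 means the window is active */
	uint64_t paste_addr;        /* 0 means no paste address available */
};

/* Returns 0 and stores the page frame number, or a negative errno. */
static int model_coproc_mmap(struct model_window *win, uint64_t *pfn)
{
	int rc = 0;

	assert(win != NULL && pfn != NULL);

	pthread_mutex_lock(&win->mmap_mutex);
	if (win->status != MODEL_WIN_ACTIVE) {
		rc = -EACCES;   /* window was closed by a DLPAR event */
		goto out;
	}
	if (!win->paste_addr) {
		rc = -EINVAL;   /* no paste address to map */
		goto out;
	}
	*pfn = win->paste_addr >> MODEL_PAGE_SHIFT;
out:
	pthread_mutex_unlock(&win->mmap_mutex);
	return rc;
}
```

A caller that sees -EACCES would retry later or open a new window, as
the commit message suggests.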

Signed-off-by: Haren Myneni 
---
 arch/powerpc/platforms/book3s/vas-api.c | 23 ++-
 1 file changed, 22 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/platforms/book3s/vas-api.c 
b/arch/powerpc/platforms/book3s/vas-api.c
index 82f32781c5d2..f9a1615b74da 100644
--- a/arch/powerpc/platforms/book3s/vas-api.c
+++ b/arch/powerpc/platforms/book3s/vas-api.c
@@ -497,10 +497,29 @@ static int coproc_mmap(struct file *fp, struct 
vm_area_struct *vma)
return -EACCES;
}
 
+   /*
+* The initial mmap is done after the window is opened
+* with ioctl. But before mmap(), this window can be closed in
+* the hypervisor due to lost credit (core removal on pseries).
+* So if the window is not active, return mmap() failure with
+* -EACCES and expects the user space reissue mmap() when it
+* is active again or open new window when the credit is available.
+* mmap_mutex protects the paste address mmap() with DLPAR
+* close/open event and allows mmap() only when the window is
+* active.
+*/
+   mutex_lock(&txwin->task_ref.mmap_mutex);
+   if (txwin->status != VAS_WIN_ACTIVE) {
+   pr_err("%s(): Window is not active\n", __func__);
+   rc = -EACCES;
+   goto out;
+   }
+
paste_addr = cp_inst->coproc->vops->paste_addr(txwin);
if (!paste_addr) {
pr_err("%s(): Window paste address failed\n", __func__);
-   return -EINVAL;
+   rc = -EINVAL;
+   goto out;
}
 
pfn = paste_addr >> PAGE_SHIFT;
@@ -520,6 +539,8 @@ static int coproc_mmap(struct file *fp, struct 
vm_area_struct *vma)
txwin->task_ref.vma = vma;
vma->vm_ops = &vas_vm_ops;
 
+out:
+   mutex_unlock(&txwin->task_ref.mmap_mutex);
return rc;
 }
 
-- 
2.27.0




[PATCH v6 4/9] powerpc/vas: Return paste instruction failure if no active window

2022-02-28 Thread Haren Myneni


The VAS window may not be active if the system loses credits, and
the NX generates a page fault when it receives a request on the
unmapped paste address.

The kernel handles the fault by remapping the new paste address if
the window is active again. Otherwise it returns the paste
instruction failure if the executed instruction that caused the
fault was a paste.

Signed-off-by: Nicholas Piggin 
Signed-off-by: Haren Myneni 
---
 arch/powerpc/include/asm/ppc-opcode.h   |  2 +
 arch/powerpc/platforms/book3s/vas-api.c | 54 +
 2 files changed, 56 insertions(+)

diff --git a/arch/powerpc/include/asm/ppc-opcode.h 
b/arch/powerpc/include/asm/ppc-opcode.h
index 9675303b724e..82f1f0041c6f 100644
--- a/arch/powerpc/include/asm/ppc-opcode.h
+++ b/arch/powerpc/include/asm/ppc-opcode.h
@@ -262,6 +262,8 @@
 #define PPC_INST_MFSPR_PVR 0x7c1f42a6
 #define PPC_INST_MFSPR_PVR_MASK0xfc1e
 #define PPC_INST_MTMSRD0x7c000164
+#define PPC_INST_PASTE 0x7c20070d
+#define PPC_INST_PASTE_MASK0xfc2007ff
 #define PPC_INST_POPCNTB   0x7cf4
 #define PPC_INST_POPCNTB_MASK  0xfc0007fe
 #define PPC_INST_RFEBB 0x4c000124
diff --git a/arch/powerpc/platforms/book3s/vas-api.c 
b/arch/powerpc/platforms/book3s/vas-api.c
index 217b4a624d09..82f32781c5d2 100644
--- a/arch/powerpc/platforms/book3s/vas-api.c
+++ b/arch/powerpc/platforms/book3s/vas-api.c
@@ -351,6 +351,41 @@ static int coproc_release(struct inode *inode, struct file 
*fp)
return 0;
 }
 
+/*
+ * If the executed instruction that caused the fault was a paste, then
+ * clear regs CR0[EQ], advance NIP, and return 0. Else return error code.
+ */
+static int do_fail_paste(void)
+{
+   struct pt_regs *regs = current->thread.regs;
+   u32 instword;
+
+   if (WARN_ON_ONCE(!regs))
+   return -EINVAL;
+
+   if (WARN_ON_ONCE(!user_mode(regs)))
+   return -EINVAL;
+
+   /*
+* If we couldn't translate the instruction, the driver should
+* return success without handling the fault, it will be retried
+* or the instruction fetch will fault.
+*/
+   if (get_user(instword, (u32 __user *)(regs->nip)))
+   return -EAGAIN;
+
+   /*
+* Not a paste instruction, driver may fail the fault.
+*/
+   if ((instword & PPC_INST_PASTE_MASK) != PPC_INST_PASTE)
+   return -ENOENT;
+
+   regs->ccr &= ~0xe000;   /* Clear CR0[0-2] to fail paste */
+   regs_add_return_ip(regs, 4);/* Emulate the paste */
+
+   return 0;
+}
+
 /*
  * This fault handler is invoked when the core generates page fault on
  * the paste address. Happens if the kernel closes window in hypervisor
@@ -364,6 +399,7 @@ static vm_fault_t vas_mmap_fault(struct vm_fault *vmf)
struct vas_window *txwin;
vm_fault_t fault;
u64 paste_addr;
+   int ret;
 
/*
 * window is not opened. Shouldn't expect this error.
@@ -408,6 +444,24 @@ static vm_fault_t vas_mmap_fault(struct vm_fault *vmf)
}
mutex_unlock(&txwin->task_ref.mmap_mutex);
 
+   /*
+* Received this fault due to closing the actual window.
+* It can happen during migration or lost credits.
+* Since no mapping, return the paste instruction failure
+* to the user space.
+*/
+   ret = do_fail_paste();
+   /*
+* The user space can retry several times until success (needed
+* for migration) or should fallback to SW compression or
+* manage with the existing open windows if available.
+* Looking at sysfs interface, it can determine whether these
+* failures are coming during migration or core removal:
+* nr_used_credits > nr_total_credits when lost credits
+*/
+   if (!ret || (ret == -EAGAIN))
+   return VM_FAULT_NOPAGE;
+
return VM_FAULT_SIGBUS;
 }
 
-- 
2.27.0




[PATCH v6 3/9] powerpc/vas: Add paste address mmap fault handler

2022-02-28 Thread Haren Myneni


The user space opens VAS windows and issues NX requests by pasting
CRB on the corresponding paste address mmap. When the system loses
credits due to core removal, the kernel has to close the window in
the hypervisor and make the window inactive by unmapping this paste
address. The OS also has to handle NX request page faults if the
user space issues NX requests on such a window.

This handler maps the new paste address with the same VMA when the
window is active again (due to core add with DLPAR). Otherwise it
returns paste failure.

Signed-off-by: Haren Myneni 
Reviewed-by: Nicholas Piggin 
---
 arch/powerpc/include/asm/vas.h  | 10 
 arch/powerpc/platforms/book3s/vas-api.c | 68 +
 2 files changed, 78 insertions(+)

diff --git a/arch/powerpc/include/asm/vas.h b/arch/powerpc/include/asm/vas.h
index 57573d9c1e09..27251af18c65 100644
--- a/arch/powerpc/include/asm/vas.h
+++ b/arch/powerpc/include/asm/vas.h
@@ -29,6 +29,12 @@
 #define VAS_THRESH_FIFO_GT_QTR_FULL2
 #define VAS_THRESH_FIFO_GT_EIGHTH_FULL 3
 
+/*
+ * VAS window Linux status bits
+ */
+#define VAS_WIN_ACTIVE 0x0 /* Used in platform independent */
+   /* vas mmap() */
+
 /*
  * Get/Set bit fields
  */
@@ -59,6 +65,9 @@ struct vas_user_win_ref {
struct pid *pid;/* PID of owner */
struct pid *tgid;   /* Thread group ID of owner */
struct mm_struct *mm;   /* Linux process mm_struct */
+   struct mutex mmap_mutex;/* protects paste address mmap() */
+   /* with DLPAR close/open windows */
+   struct vm_area_struct *vma; /* Save VMA and used in DLPAR ops */
 };
 
 /*
@@ -67,6 +76,7 @@ struct vas_user_win_ref {
 struct vas_window {
u32 winid;
u32 wcreds_max; /* Window credits */
+   u32 status; /* Window status used in OS */
enum vas_cop_type cop;
struct vas_user_win_ref task_ref;
char *dbgname;
diff --git a/arch/powerpc/platforms/book3s/vas-api.c 
b/arch/powerpc/platforms/book3s/vas-api.c
index 4d82c92ddd52..217b4a624d09 100644
--- a/arch/powerpc/platforms/book3s/vas-api.c
+++ b/arch/powerpc/platforms/book3s/vas-api.c
@@ -316,6 +316,7 @@ static int coproc_ioc_tx_win_open(struct file *fp, unsigned 
long arg)
return PTR_ERR(txwin);
}
 
+   mutex_init(&txwin->task_ref.mmap_mutex);
cp_inst->txwin = txwin;
 
return 0;
@@ -350,6 +351,70 @@ static int coproc_release(struct inode *inode, struct file 
*fp)
return 0;
 }
 
+/*
+ * This fault handler is invoked when the core generates page fault on
+ * the paste address. Happens if the kernel closes window in hypervisor
+ * (on pseries) due to lost credit or the paste address is not mapped.
+ */
+static vm_fault_t vas_mmap_fault(struct vm_fault *vmf)
+{
+   struct vm_area_struct *vma = vmf->vma;
+   struct file *fp = vma->vm_file;
+   struct coproc_instance *cp_inst = fp->private_data;
+   struct vas_window *txwin;
+   vm_fault_t fault;
+   u64 paste_addr;
+
+   /*
+* window is not opened. Shouldn't expect this error.
+*/
+   if (!cp_inst || !cp_inst->txwin) {
+   pr_err("%s(): Unexpected fault on paste address with TX window closed\n",
+   __func__);
+   return VM_FAULT_SIGBUS;
+   }
+
+   txwin = cp_inst->txwin;
+   /*
+* When the LPAR lost credits due to core removal or during
+* migration, invalidate the existing mapping for the current
+* paste addresses and set windows in-active (zap_page_range in
+* reconfig_close_windows()).
+* New mapping will be done later after migration or new credits
+* available. So continue to receive faults if the user space
+* issue NX request.
+*/
+   if (txwin->task_ref.vma != vmf->vma) {
+   pr_err("%s(): No previous mapping with paste address\n",
+   __func__);
+   return VM_FAULT_SIGBUS;
+   }
+
+   mutex_lock(&txwin->task_ref.mmap_mutex);
+   /*
+* The window may be inactive due to lost credit (Ex: core
+* removal with DLPAR). If the window is active again when
+* the credit is available, map the new paste address at
+* the window virtual address.
+*/
+   if (txwin->status == VAS_WIN_ACTIVE) {
+   paste_addr = cp_inst->coproc->vops->paste_addr(txwin);
+   if (paste_addr) {
+   fault = vmf_insert_pfn(vma, vma->vm_start,
+   (paste_addr >> PAGE_SHIFT));
+   mutex_unlock(&txwin->task_ref.mmap_mutex);
+   return fault;
+   }
+   }
+   mutex_unlock(&txwin->task_ref.mmap_mutex);
+
+   return VM_FAULT_SIGBUS;
+}
+
+static const struct vm_operations_struct 

[PATCH v6 2/9] powerpc/pseries/vas: Save PID in pseries_vas_window struct

2022-02-28 Thread Haren Myneni


The kernel sets the VAS window with PID when it is opened in
the hypervisor. During DLPAR operation, windows can be closed and
reopened in the hypervisor when the credit is available. So save
this PID in pseries_vas_window struct when the window is opened
initially and reuse it later during DLPAR operation.

Signed-off-by: Haren Myneni 
Reviewed-by: Nicholas Piggin 
---
 arch/powerpc/platforms/pseries/vas.c | 9 +
 arch/powerpc/platforms/pseries/vas.h | 1 +
 2 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/vas.c 
b/arch/powerpc/platforms/pseries/vas.c
index 18aae037ffe9..1035446f985b 100644
--- a/arch/powerpc/platforms/pseries/vas.c
+++ b/arch/powerpc/platforms/pseries/vas.c
@@ -107,7 +107,6 @@ static int h_deallocate_vas_window(u64 winid)
 static int h_modify_vas_window(struct pseries_vas_window *win)
 {
long rc;
-   u32 lpid = mfspr(SPRN_PID);
 
/*
 * AMR value is not supported in Linux VAS implementation.
@@ -115,7 +114,7 @@ static int h_modify_vas_window(struct pseries_vas_window 
*win)
 */
do {
rc = plpar_hcall_norets(H_MODIFY_VAS_WINDOW,
-   win->vas_win.winid, lpid, 0,
+   win->vas_win.winid, win->pid, 0,
VAS_MOD_WIN_FLAGS, 0);
 
rc = hcall_return_busy_check(rc);
@@ -124,8 +123,8 @@ static int h_modify_vas_window(struct pseries_vas_window 
*win)
if (rc == H_SUCCESS)
return 0;
 
-   pr_err("H_MODIFY_VAS_WINDOW error: %ld, winid %u lpid %u\n",
-   rc, win->vas_win.winid, lpid);
+   pr_err("H_MODIFY_VAS_WINDOW error: %ld, winid %u pid %u\n",
+   rc, win->vas_win.winid, win->pid);
return -EIO;
 }
 
@@ -338,6 +337,8 @@ static struct vas_window *vas_allocate_window(int vas_id, 
u64 flags,
}
}
 
+   txwin->pid = mfspr(SPRN_PID);
+
/*
 * Allocate / Deallocate window hcalls and setup / free IRQs
 * have to be protected with mutex.
diff --git a/arch/powerpc/platforms/pseries/vas.h 
b/arch/powerpc/platforms/pseries/vas.h
index d6ea8ab8b07a..2872532ed72a 100644
--- a/arch/powerpc/platforms/pseries/vas.h
+++ b/arch/powerpc/platforms/pseries/vas.h
@@ -114,6 +114,7 @@ struct pseries_vas_window {
u64 domain[6];  /* Associativity domain Ids */
/* this window is allocated */
u64 util;
+   u32 pid;/* PID associated with this window */
 
/* List of windows opened which is used for LPM */
struct list_head win_list;
-- 
2.27.0




[PATCH v6 1/9] powerpc/pseries/vas: Use common names in VAS capability structure

2022-02-28 Thread Haren Myneni


nr_total/nr_used_credits provide credit usage to user space
via sysfs, and the same interface can be used on PowerNV in
future. Rename the fields so that the names are applicable on
both pseries and PowerNV.

Signed-off-by: Haren Myneni 
Reviewed-by: Nicholas Piggin 
---
 arch/powerpc/platforms/pseries/vas.c | 10 +-
 arch/powerpc/platforms/pseries/vas.h |  5 ++---
 2 files changed, 7 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/vas.c 
b/arch/powerpc/platforms/pseries/vas.c
index d243ddc58827..18aae037ffe9 100644
--- a/arch/powerpc/platforms/pseries/vas.c
+++ b/arch/powerpc/platforms/pseries/vas.c
@@ -310,8 +310,8 @@ static struct vas_window *vas_allocate_window(int vas_id, 
u64 flags,
 
cop_feat_caps = >caps;
 
-   if (atomic_inc_return(&cop_feat_caps->used_lpar_creds) >
-   atomic_read(&cop_feat_caps->target_lpar_creds)) {
+   if (atomic_inc_return(&cop_feat_caps->nr_used_credits) >
+   atomic_read(&cop_feat_caps->nr_total_credits)) {
pr_err("Credits are not available to allocate window\n");
rc = -EINVAL;
goto out;
@@ -385,7 +385,7 @@ static struct vas_window *vas_allocate_window(int vas_id, 
u64 flags,
free_irq_setup(txwin);
h_deallocate_vas_window(txwin->vas_win.winid);
 out:
-   atomic_dec(&cop_feat_caps->used_lpar_creds);
+   atomic_dec(&cop_feat_caps->nr_used_credits);
kfree(txwin);
return ERR_PTR(rc);
 }
@@ -445,7 +445,7 @@ static int vas_deallocate_window(struct vas_window *vwin)
}
 
list_del(>win_list);
-   atomic_dec(>used_lpar_creds);
+   atomic_dec(>nr_used_credits);
mutex_unlock(_pseries_mutex);
 
put_vas_user_win_ref(>task_ref);
@@ -521,7 +521,7 @@ static int __init get_vas_capabilities(u8 feat, enum 
vas_cop_feat_type type,
}
caps->max_lpar_creds = be16_to_cpu(hv_caps->max_lpar_creds);
caps->max_win_creds = be16_to_cpu(hv_caps->max_win_creds);
-   atomic_set(&caps->target_lpar_creds,
+   atomic_set(&caps->nr_total_credits,
   be16_to_cpu(hv_caps->target_lpar_creds));
if (feat == VAS_GZIP_DEF_FEAT) {
caps->def_lpar_creds = be16_to_cpu(hv_caps->def_lpar_creds);
diff --git a/arch/powerpc/platforms/pseries/vas.h 
b/arch/powerpc/platforms/pseries/vas.h
index 4ecb3fcabd10..d6ea8ab8b07a 100644
--- a/arch/powerpc/platforms/pseries/vas.h
+++ b/arch/powerpc/platforms/pseries/vas.h
@@ -72,9 +72,8 @@ struct vas_cop_feat_caps {
};
/* Total LPAR available credits. Can be different from max LPAR */
/* credits due to DLPAR operation */
-   atomic_t target_lpar_creds;
-   atomic_t used_lpar_creds; /* Used credits so far */
-   u16 avail_lpar_creds; /* Remaining available credits */
+   atomic_t nr_total_credits;   /* Total credits assigned to LPAR */
+   atomic_t nr_used_credits;    /* Used credits so far */
 };
 
 /*
-- 
2.27.0




[PATCH v6 0/9] powerpc/pseries/vas: NXGZIP support with DLPAR

2022-02-28 Thread Haren Myneni


PowerPC provides HW compression with NX coprocessor. This feature
is available on both PowerNV and PowerVM and included in Linux.
Since each powerpc chip has one NX coprocessor, the VAS introduces
the concept of windows / credits to manage access to this hardware
resource. On PowerVM, these limited resources should be available
across all LPARs. So the hypervisor assigns specific credits
to each LPAR based on processor entitlement so that one LPAR does
not overload NX. The hypervisor can reject the window open request
from a partition if it exceeds its credit limit (1 credit per window).

The total number of target credits in a partition can change
if the core configuration is modified. The hypervisor expects the
partition to adjust its window usage depending on the new target
credits. For example, if the partition uses more credits than the
new target credits, it should close the excess windows so that
the NX resource will be available to other partitions.

This patch series enables OS to support this dynamic credit
management with DLPAR core removal/add.

Core removal operation:
- Get new VAS capabilities from the hypervisor when the DLPAR
  notifier is received. These capabilities provide the new target
  credits based on the new processor entitlement. In the case of QoS
  credit changes, the notification will be issued by updating
  the target_creds via sysfs.
- If the partition already uses more than the new target credits,
  the kernel selects windows, unmaps the current paste address and
  closes them in the hypervisor. It uses LIFO to identify these
  windows - the last windows that are opened are the first ones to
  be closed.
- When the user space issues requests on these windows, NX generates
  a page fault on the unmapped paste address. The kernel handles the
  fault by returning the paste instruction failure if the window is
  not active (meaning the paste address is unmapped). It is then up
  to the library / user space to fall back to SW compression or
  manage with the current windows.
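
The window selection in the core removal steps above can be sketched
as follows. This is a simplified model over an array ordered oldest
first, with hypothetical names; the kernel patches walk a linked list
instead and also unmap the paste address, which is omitted here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical window record: wins[0] is the oldest window. */
struct model_win {
	int id;
	int closed; /* non-zero once closed for lack of credits */
};

/*
 * Close the most recently opened active windows until no more than
 * new_target windows remain active. Returns how many were closed.
 */
static size_t model_close_excess(struct model_win *wins, size_t nr_wins,
				 size_t new_target)
{
	size_t active = 0, closed = 0, excess;

	assert(wins != NULL || nr_wins == 0);

	for (size_t i = 0; i < nr_wins; i++)
		if (!wins[i].closed)
			active++;

	if (active <= new_target)
		return 0;
	excess = active - new_target;

	/* Walk from the newest window backwards (last opened, first closed). */
	for (size_t i = nr_wins; i > 0 && closed < excess; i--) {
		if (!wins[i - 1].closed) {
			wins[i - 1].closed = 1;
			closed++;
		}
	}
	return closed;
}
```

With five open windows and a new target of three, the two newest
windows are the ones marked closed.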

Core add operation:
- The kernel can see increased target credits from the new VAS
  capabilities.
- Scans the window list for the windows closed in the hypervisor
  due to lost credits and selects windows in the same LIFO order.
- Makes the corresponding windows active and remaps the same VMA
  to the new paste address in the fault handler.
- Then the user space should expect paste to succeed later.

Patch 1: Define common names for sysfs target/used/avail_creds so
 that same sysfs entries can be used even on PowerNV later.
Patch 2: Save PID in the vas window struct during initial window
 open and use it when the window is reopened later.
Patch 3: Add new mmap fault handler which handles the page fault
 from NX on paste address.
Patch 4: Return the paste instruction failure if the window is not
 active.
Patch 5: If the window is closed in the hypervisor before the user
 space issue the initial mmap(), return -EACCES failure.
Patch 6: Close windows in the hypervisor when the partition uses
 more credits than the new target credits.
Patch 7: When credits are available, reopen windows that are closed
 before with core removal.
Patch 8 & 9: The user space determines the credit usage with sysfs
 nr_total/nr_used_credits interfaces. drmgr uses
 update_total_credits to notify OS for QoS credit changes.

Thanks to Nicholas Piggin and Aneesh Kumar for the valuable suggestions
on the NXGZIP design to support DLPAR operations.

Changes in v2:
- Rebase on 5.16-rc5
- Use list safe functions to iterate windows list
- Changes to show the actual value in sysfs used_credits even though
  some windows are inactive with core removal. Reflects -ve value in
  sysfs avail_creds to let userspace know that it opened more windows
  than the current maximum LPAR credits.

Changes in v3:
- Rebase on 5.16
- Reconfigure VAS windows only for CPU hotplug events.

Changes in v4:
- Rebase on 5.17-rc4
- Changes based on comments from Nicholas Piggin
- Included VAS DLPAR notifier code in 'Close windows with DLPAR'
  patch instead of as a separate patch
- Patches reordering and other changes

Changes in v5:
- Rebase on 5.17-rc5
- Add update_total_credits sysfs entry to update QoS target credits
  and other commit descriptions as suggested by Nicholas Piggin

Changes in v6:
- Build fix in "Add paste address mmap fault handler" patch
  as reported by kernel test robot 

Haren Myneni (9):
  powerpc/pseries/vas: Use common names in VAS capability structure
  powerpc/pseries/vas: Save PID in pseries_vas_window struct
  powerpc/vas: Add paste address mmap fault handler
  powerpc/vas: Return paste instruction failure if no active window
  powerpc/vas: Map paste address only if window is active
  powerpc/pseries/vas: Close windows with DLPAR core removal
  powerpc/pseries/vas: Reopen windows with DLPAR core add
  powerpc/pseries/vas: sysfs interface to export capabilities
  powerpc/ps

[PATCH v5 7/9] powerpc/pseries/vas: Reopen windows with DLPAR core add

2022-02-27 Thread Haren Myneni


VAS windows can be closed in the hypervisor due to lost credits
when the core is removed. If NX requests are issued on these
inactive windows, the OS gets page faults and the paste failure
is returned to the user space. If the lost credits are available
later with core add, reopen these windows and set them active.
Later when the OS sees page faults on these active windows, it
creates the mapping on the new paste address. Then the user space
can continue to use these windows and send HW compression requests
to NX successfully.
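
The diff below decides how many closed-window entries to skip before
reopening, based on the number of closed windows and the newly added
credits. A minimal sketch of just that arithmetic follows; the struct
and function names are hypothetical and the list walking is left out.

```c
#include <assert.h>

/* Hypothetical result: entries to skip, then windows to reopen. */
struct model_reopen_plan {
	int skip;
	int reopen;
};

static struct model_reopen_plan model_plan_reopen(int nr_close_wins, int creds)
{
	struct model_reopen_plan plan = { 0, 0 };

	assert(nr_close_wins >= 0 && creds >= 0);

	/* More closed windows than new credits: some must stay closed. */
	if (nr_close_wins > creds)
		plan.skip = nr_close_wins - creds;
	plan.reopen = nr_close_wins - plan.skip;
	return plan;
}
```

For the case used in the code comment (40 windows closed, 20 credits
gained by adding one core) this gives 20 entries skipped and 20 windows
reopened.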

Signed-off-by: Haren Myneni 
---
 arch/powerpc/platforms/pseries/vas.c | 91 +++-
 1 file changed, 90 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/platforms/pseries/vas.c 
b/arch/powerpc/platforms/pseries/vas.c
index a297720bcdae..96178dd58adf 100644
--- a/arch/powerpc/platforms/pseries/vas.c
+++ b/arch/powerpc/platforms/pseries/vas.c
@@ -565,6 +565,88 @@ static int __init get_vas_capabilities(u8 feat, enum 
vas_cop_feat_type type,
return 0;
 }
 
+/*
+ * VAS windows can be closed due to lost credits when the core is
+ * removed. So reopen them if credits are available due to DLPAR
+ * core add and set the window active status. When NX sees the page
+ * fault on the unmapped paste address, the kernel handles the fault
+ * by setting the remapping to new paste address if the window is
+ * active.
+ */
+static int reconfig_open_windows(struct vas_caps *vcaps, int creds)
+{
+   long domain[PLPAR_HCALL9_BUFSIZE] = {VAS_DEFAULT_DOMAIN_ID};
+   struct vas_cop_feat_caps *caps = &vcaps->caps;
+   struct pseries_vas_window *win = NULL, *tmp;
+   int rc, mv_ents = 0;
+
+   /*
+* Nothing to do if there are no closed windows.
+*/
+   if (!vcaps->nr_close_wins)
+   return 0;
+
+   /*
+* For the core removal, the hypervisor reduces the credits
+* assigned to the LPAR and the kernel closes VAS windows
+* in the hypervisor depends on reduced credits. The kernel
+* uses LIFO (the last windows that are opened will be closed
+* first) and expects to open in the same order when credits
+* are available.
+* For example, 40 windows are closed when the LPAR lost 2 cores
+* (dedicated). If 1 core is added, this LPAR can have 20 more
+* credits. It means the kernel can reopen 20 windows. So move
+* 20 entries in the VAS windows list and reopen next 20 windows.
+*/
+   if (vcaps->nr_close_wins > creds)
+   mv_ents = vcaps->nr_close_wins - creds;
+
+   list_for_each_entry_safe(win, tmp, &vcaps->list, win_list) {
+   if (!mv_ents)
+   break;
+
+   mv_ents--;
+   }
+
+   list_for_each_entry_safe_from(win, tmp, &vcaps->list, win_list) {
+   /*
+* Nothing to do on this window if it is not closed
+* with VAS_WIN_NO_CRED_CLOSE
+*/
+   if (!(win->vas_win.status & VAS_WIN_NO_CRED_CLOSE))
+   continue;
+
+   rc = allocate_setup_window(win, (u64 *)&domain[0],
+  caps->win_type);
+   if (rc)
+   return rc;
+
+   rc = h_modify_vas_window(win);
+   if (rc)
+   goto out;
+
+   mutex_lock(&win->vas_win.task_ref.mmap_mutex);
+   /*
+* Set window status to active
+*/
+   win->vas_win.status &= ~VAS_WIN_NO_CRED_CLOSE;
+   mutex_unlock(&win->vas_win.task_ref.mmap_mutex);
+   win->win_type = caps->win_type;
+   if (!--vcaps->nr_close_wins)
+   break;
+   }
+
+   return 0;
+out:
+   /*
+* Window modify HCALL failed. So close the window to the
+* hypervisor and return.
+*/
+   free_irq_setup(win);
+   h_deallocate_vas_window(win->vas_win.winid);
+   return rc;
+}
+
 /*
  * The hypervisor reduces the available credits if the LPAR lost core. It
  * means the excessive windows should not be active and the user space
@@ -673,7 +755,14 @@ static int vas_reconfig_capabilties(u8 type)
 * closed / reopened. Hold the vas_pseries_mutex so that the
 * the user space can not open new windows.
 */
-   if (old_nr_creds >  new_nr_creds) {
+   if (old_nr_creds <  new_nr_creds) {
+   /*
+* If the existing target credits is less than the new
+* target, reopen windows if they are closed due to
+* the previous DLPAR (core removal).
+*/
+   rc = reconfig_open_windows(vcaps, new_nr_creds - old_nr_creds);
+   } else {
/*
 * # active windows is more than new LPAR availa

[PATCH v5 4/9] powerpc/vas: Return paste instruction failure if no active window

2022-02-27 Thread Haren Myneni


The VAS window may not be active if the system loses credits, and
the NX generates a page fault when it receives a request on the
unmapped paste address.

The kernel handles the fault by remapping the new paste address if
the window is active again. Otherwise it returns the paste
instruction failure if the executed instruction that caused the
fault was a paste.

Signed-off-by: Nicholas Piggin 
Signed-off-by: Haren Myneni 
---
 arch/powerpc/include/asm/ppc-opcode.h   |  2 +
 arch/powerpc/platforms/book3s/vas-api.c | 55 -
 2 files changed, 56 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/ppc-opcode.h 
b/arch/powerpc/include/asm/ppc-opcode.h
index 9675303b724e..82f1f0041c6f 100644
--- a/arch/powerpc/include/asm/ppc-opcode.h
+++ b/arch/powerpc/include/asm/ppc-opcode.h
@@ -262,6 +262,8 @@
 #define PPC_INST_MFSPR_PVR 0x7c1f42a6
 #define PPC_INST_MFSPR_PVR_MASK0xfc1e
 #define PPC_INST_MTMSRD0x7c000164
+#define PPC_INST_PASTE 0x7c20070d
+#define PPC_INST_PASTE_MASK0xfc2007ff
 #define PPC_INST_POPCNTB   0x7cf4
 #define PPC_INST_POPCNTB_MASK  0xfc0007fe
 #define PPC_INST_RFEBB 0x4c000124
diff --git a/arch/powerpc/platforms/book3s/vas-api.c 
b/arch/powerpc/platforms/book3s/vas-api.c
index f359e7b2bf90..f3e421511ea6 100644
--- a/arch/powerpc/platforms/book3s/vas-api.c
+++ b/arch/powerpc/platforms/book3s/vas-api.c
@@ -351,6 +351,41 @@ static int coproc_release(struct inode *inode, struct file 
*fp)
return 0;
 }
 
+/*
+ * If the executed instruction that caused the fault was a paste, then
+ * clear regs CR0[EQ], advance NIP, and return 0. Else return error code.
+ */
+static int do_fail_paste(void)
+{
+   struct pt_regs *regs = current->thread.regs;
+   u32 instword;
+
+   if (WARN_ON_ONCE(!regs))
+   return -EINVAL;
+
+   if (WARN_ON_ONCE(!user_mode(regs)))
+   return -EINVAL;
+
+   /*
+* If we couldn't translate the instruction, the driver should
+* return success without handling the fault, it will be retried
+* or the instruction fetch will fault.
+*/
+   if (get_user(instword, (u32 __user *)(regs->nip)))
+   return -EAGAIN;
+
+   /*
+* Not a paste instruction, driver may fail the fault.
+*/
+   if ((instword & PPC_INST_PASTE_MASK) != PPC_INST_PASTE)
+   return -ENOENT;
+
+   regs->ccr &= ~0xe000;   /* Clear CR0[0-2] to fail paste */
+   regs_add_return_ip(regs, 4);/* Emulate the paste */
+
+   return 0;
+}
+
 /*
  * This fault handler is invoked when the core generates page fault on
  * the paste address. Happens if the kernel closes window in hypervisor
@@ -408,9 +443,27 @@ static vm_fault_t vas_mmap_fault(struct vm_fault *vmf)
}
mutex_unlock(&txwin->task_ref.mmap_mutex);
 
-   return VM_FAULT_SIGBUS;
+   /*
+* Received this fault due to closing the actual window.
+* It can happen during migration or lost credits.
+* Since no mapping, return the paste instruction failure
+* to the user space.
+*/
+   ret = do_fail_paste();
+   /*
+* The user space can retry several times until success (needed
+* for migration) or should fallback to SW compression or
+* manage with the existing open windows if available.
+* Looking at sysfs interface, it can determine whether these
+* failures are coming during migration or core removal:
+* nr_used_credits > nr_total_credits when lost credits
+*/
+   if (!ret || (ret == -EAGAIN))
+   return VM_FAULT_NOPAGE;
 
+   return VM_FAULT_SIGBUS;
 }
+
 static const struct vm_operations_struct vas_vm_ops = {
.fault = vas_mmap_fault,
 };
-- 
2.27.0




[PATCH v4 3/3] powerpc/pseries/vas: Add VAS migration handler

2022-02-27 Thread Haren Myneni


Since the VAS windows belong to the VAS hardware resource, the
hypervisor expects the partition to close them on the source
partition and reopen them on the destination machine after the
partition has migrated.

This handler is called before pseries_suspend() to close these
windows and is invoked again after migration. All active windows
for both default and QoS types are closed and marked inactive
before migration, and reopened after migration with this handler.
During the migration, the user space receives paste instruction
failure if it issues copy/paste on these inactive windows.

The current migration implementation does not freeze the user
space and applications can continue to open VAS windows while
migration is in progress. So when the migration_in_progress flag
is set, VAS open window API returns -EBUSY.
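
The interaction between the migration flag and window open can be
pictured with a small user-space model. It is a sketch under stated
assumptions: all names are hypothetical, a pthread mutex stands in for
vas_pseries_mutex, and closing or reopening existing windows is not
modelled.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

enum model_action { MODEL_SUSPEND, MODEL_RESUME };

static pthread_mutex_t model_mutex = PTHREAD_MUTEX_INITIALIZER;
static bool model_migration_in_progress;
static int model_nr_open; /* windows opened successfully so far */

/* Called once before suspend and once after resume. */
static void model_migration_handler(enum model_action action)
{
	assert(action == MODEL_SUSPEND || action == MODEL_RESUME);

	pthread_mutex_lock(&model_mutex);
	model_migration_in_progress = (action == MODEL_SUSPEND);
	pthread_mutex_unlock(&model_mutex);
}

/* Returns 0 on success or -EBUSY while a migration is in progress. */
static int model_open_window(void)
{
	int rc = 0;

	pthread_mutex_lock(&model_mutex);
	if (model_migration_in_progress)
		rc = -EBUSY;
	else
		model_nr_open++;
	pthread_mutex_unlock(&model_mutex);
	return rc;
}
```

Because the flag is read and written under the same mutex, an open
request either completes before the suspend step or is refused with
-EBUSY.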

Signed-off-by: Haren Myneni 
---
 arch/powerpc/platforms/pseries/mobility.c |  5 ++
 arch/powerpc/platforms/pseries/vas.c  | 98 ++-
 arch/powerpc/platforms/pseries/vas.h  |  6 ++
 3 files changed, 108 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/platforms/pseries/mobility.c 
b/arch/powerpc/platforms/pseries/mobility.c
index 85033f392c78..70004243e25e 100644
--- a/arch/powerpc/platforms/pseries/mobility.c
+++ b/arch/powerpc/platforms/pseries/mobility.c
@@ -26,6 +26,7 @@
 #include 
 #include 
 #include "pseries.h"
+#include "vas.h"   /* vas_migration_handler() */
 #include "../../kernel/cacheinfo.h"
 
 static struct kobject *mobility_kobj;
@@ -669,12 +670,16 @@ static int pseries_migrate_partition(u64 handle)
if (ret)
return ret;
 
+   vas_migration_handler(VAS_SUSPEND);
+
ret = pseries_suspend(handle);
if (ret == 0)
post_mobility_fixup();
else
pseries_cancel_migration(handle, ret);
 
+   vas_migration_handler(VAS_RESUME);
+
return ret;
 }
 
diff --git a/arch/powerpc/platforms/pseries/vas.c 
b/arch/powerpc/platforms/pseries/vas.c
index fbcf311da0ec..1f59d78c77a1 100644
--- a/arch/powerpc/platforms/pseries/vas.c
+++ b/arch/powerpc/platforms/pseries/vas.c
@@ -30,6 +30,7 @@ static struct hv_vas_cop_feat_caps hv_cop_caps;
 
 static struct vas_caps vascaps[VAS_MAX_FEAT_TYPE];
 static DEFINE_MUTEX(vas_pseries_mutex);
+static bool migration_in_progress;
 
 static long hcall_return_busy_check(long rc)
 {
@@ -356,7 +357,10 @@ static struct vas_window *vas_allocate_window(int vas_id, 
u64 flags,
 * same fault IRQ is not freed by the OS before.
 */
mutex_lock(&vas_pseries_mutex);
-   rc = allocate_setup_window(txwin, (u64 *)[0],
+   if (migration_in_progress)
+   rc = -EBUSY;
+   else
+   rc = allocate_setup_window(txwin, (u64 *)[0],
   cop_feat_caps->win_type);
mutex_unlock(&vas_pseries_mutex);
if (rc)
@@ -869,6 +873,98 @@ static struct notifier_block pseries_vas_nb = {
.notifier_call = pseries_vas_notifier,
 };
 
+/*
+ * For LPM, all windows have to be closed on the source partition
+ * before migration and reopen them on the destination partition
+ * after migration. So closing windows during suspend and
+ * reopen them during resume.
+ */
+int vas_migration_handler(int action)
+{
+   struct vas_cop_feat_caps *caps;
+   int old_nr_creds, new_nr_creds = 0;
+   struct vas_caps *vcaps;
+   int i, rc = 0;
+
+   /*
+* NX-GZIP is not enabled. Nothing to do for migration.
+*/
+   if (!copypaste_feat)
+   return rc;
+
+   mutex_lock(_pseries_mutex);
+
+   if (action == VAS_SUSPEND)
+   migration_in_progress = true;
+   else
+   migration_in_progress = false;
+
+   for (i = 0; i < VAS_MAX_FEAT_TYPE; i++) {
+   vcaps = [i];
+   caps = >caps;
+   old_nr_creds = atomic_read(>nr_total_credits);
+
+   rc = h_query_vas_capabilities(H_QUERY_VAS_CAPABILITIES,
+ vcaps->feat,
+ (u64)virt_to_phys(_cop_caps));
+   if (!rc) {
+   new_nr_creds = 
be16_to_cpu(hv_cop_caps.target_lpar_creds);
+   /*
+* Should not happen. But incase print messages, close
+* all windows in the list during suspend and reopen
+* windows based on new lpar_creds on the destination
+* system.
+*/
+   if (old_nr_creds != new_nr_creds) {
+   pr_err("Target credits mismatch with the 
hypervisor\n");
+   pr_err("state(%d): lpar creds: %d HV lpar 
creds: %d\n",
+   action, old_nr_creds, new_nr_creds);
+   pr_err("Used creds: %d, Ac

[PATCH v4 2/3] powerpc/pseries/vas: Modify reconfig open/close functions for migration

2022-02-27 Thread Haren Myneni


VAS is a hardware engine that resides on the chip. So when the
partition migrates, all VAS windows on the source system have to
be closed and then reopened on the destination after migration.

The kernel has to consider both DLPAR CPU and migration events to
take action on VAS windows. So it uses the VAS_WIN_NO_CRED_CLOSE
and VAS_WIN_MIGRATE_CLOSE status bits, and a window is reopened
only after both status bits are cleared.

This patch changes the current reconfig_open/close_windows
functions to support migration:
- Set VAS_WIN_MIGRATE_CLOSE in the window status when closing
  windows for migration, and reopen the windows that have this
  status during resume.
- Continue to close all windows even if the deallocate HCALL
  fails (should not happen), since there is no way to stop
  migration with the current LPM implementation.
- If a DLPAR CPU event happens while migration is in progress,
  set VAS_WIN_NO_CRED_CLOSE in the window status. The window is
  closed on the first event (migration or DLPAR) and reopened
  only on the last event (migration or DLPAR).
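
The two-bit rule above can be sketched as a small user-space model.
This is only an illustration, not the kernel code: struct win_model,
model_close() and model_reopen() are hypothetical names, and only the
two status bit values are taken from this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Status values as defined in asm/vas.h by this series. */
#define VAS_WIN_ACTIVE         0x0
#define VAS_WIN_NO_CRED_CLOSE  0x0001
#define VAS_WIN_MIGRATE_CLOSE  0x0002

/* Hypothetical stand-in for a window; not the kernel struct. */
struct win_model {
	uint32_t status;     /* OR of the close reasons still pending */
	bool open_in_hv;     /* true while the hypervisor window exists */
};

/* Close event: the first event closes the window, later ones only add a bit. */
static void model_close(struct win_model *w, uint32_t flag)
{
	assert(flag == VAS_WIN_NO_CRED_CLOSE || flag == VAS_WIN_MIGRATE_CLOSE);
	if (w->status == VAS_WIN_ACTIVE)
		w->open_in_hv = false;
	w->status |= flag;
}

/* Reopen event: clear the bit, reopen only when no close reason remains. */
static void model_reopen(struct win_model *w, uint32_t flag)
{
	assert(flag == VAS_WIN_NO_CRED_CLOSE || flag == VAS_WIN_MIGRATE_CLOSE);
	w->status &= ~flag;
	if (w->status == VAS_WIN_ACTIVE)
		w->open_in_hv = true;
}
```

In this model a window closed by both migration and DLPAR stays
closed after the first reopen event and becomes active only after
the second one.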

Signed-off-by: Haren Myneni 
---
 arch/powerpc/include/asm/vas.h   |  2 +
 arch/powerpc/platforms/pseries/vas.c | 88 ++--
 2 files changed, 73 insertions(+), 17 deletions(-)

diff --git a/arch/powerpc/include/asm/vas.h b/arch/powerpc/include/asm/vas.h
index 6baf7b9ffed4..83afcb6c194b 100644
--- a/arch/powerpc/include/asm/vas.h
+++ b/arch/powerpc/include/asm/vas.h
@@ -36,6 +36,8 @@
/* vas mmap() */
 /* Window is closed in the hypervisor due to lost credit */
 #define VAS_WIN_NO_CRED_CLOSE  0x0001
+/* Window is closed due to migration */
+#define VAS_WIN_MIGRATE_CLOSE  0x0002
 
 /*
  * Get/Set bit fields
diff --git a/arch/powerpc/platforms/pseries/vas.c 
b/arch/powerpc/platforms/pseries/vas.c
index 3bb219f54806..fbcf311da0ec 100644
--- a/arch/powerpc/platforms/pseries/vas.c
+++ b/arch/powerpc/platforms/pseries/vas.c
@@ -457,11 +457,12 @@ static int vas_deallocate_window(struct vas_window *vwin)
mutex_lock(_pseries_mutex);
/*
 * VAS window is already closed in the hypervisor when
-* lost the credit. So just remove the entry from
-* the list, remove task references and free vas_window
+* lost the credit or with migration. So just remove the entry
+* from the list, remove task references and free vas_window
 * struct.
 */
-   if (win->vas_win.status & VAS_WIN_NO_CRED_CLOSE) {
+   if (!(win->vas_win.status & VAS_WIN_NO_CRED_CLOSE) &&
+   !(win->vas_win.status & VAS_WIN_MIGRATE_CLOSE)) {
rc = deallocate_free_window(win);
if (rc) {
mutex_unlock(_pseries_mutex);
@@ -578,12 +579,14 @@ static int __init get_vas_capabilities(u8 feat, enum 
vas_cop_feat_type type,
  * by setting the remapping to new paste address if the window is
  * active.
  */
-static int reconfig_open_windows(struct vas_caps *vcaps, int creds)
+static int reconfig_open_windows(struct vas_caps *vcaps, int creds,
+bool migrate)
 {
long domain[PLPAR_HCALL9_BUFSIZE] = {VAS_DEFAULT_DOMAIN_ID};
struct vas_cop_feat_caps *caps = >caps;
struct pseries_vas_window *win = NULL, *tmp;
int rc, mv_ents = 0;
+   int flag;
 
/*
 * Nothing to do if there are no closed windows.
@@ -602,8 +605,10 @@ static int reconfig_open_windows(struct vas_caps *vcaps, 
int creds)
 * (dedicated). If 1 core is added, this LPAR can have 20 more
 * credits. It means the kernel can reopen 20 windows. So move
 * 20 entries in the VAS windows lost and reopen next 20 windows.
+* For partition migration, reopen all windows that are closed
+* during resume.
 */
-   if (vcaps->nr_close_wins > creds)
+   if ((vcaps->nr_close_wins > creds) && !migrate)
mv_ents = vcaps->nr_close_wins - creds;
 
list_for_each_entry_safe(win, tmp, >list, win_list) {
@@ -613,12 +618,35 @@ static int reconfig_open_windows(struct vas_caps *vcaps, 
int creds)
mv_ents--;
}
 
+   /*
+* Open windows if they are closed only with migration or
+* DLPAR (lost credit) before.
+*/
+   if (migrate)
+   flag = VAS_WIN_MIGRATE_CLOSE;
+   else
+   flag = VAS_WIN_NO_CRED_CLOSE;
+
list_for_each_entry_safe_from(win, tmp, >list, win_list) {
+   /*
+* This window is closed with DLPAR and migration events.
+* So reopen the window with the last event.
+* The user space is not suspended with the current
+* migration notifier. So the user space can issue DLPAR
+* CPU hotplug while migration in progress. In this case
+* this window will be opened with

[PATCH v4 1/3] powerpc/pseries/vas: Define global hv_cop_caps struct

2022-02-27 Thread Haren Myneni


The coprocessor capabilities struct is used to get default and
QoS capabilities from the hypervisor during init, DLPAR events
and migration. So instead of allocating this struct for each
event, define a global struct and reuse it, which allows the
migration code to avoid adding an error path.

Also disable the copy/paste feature flag if any capabilities
HCALL fails.

Signed-off-by: Haren Myneni 
Acked-by: Nathan Lynch 
Reviewed-by: Nicholas Piggin 
---
 arch/powerpc/platforms/pseries/vas.c | 47 
 1 file changed, 20 insertions(+), 27 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/vas.c 
b/arch/powerpc/platforms/pseries/vas.c
index 591c7597db5a..3bb219f54806 100644
--- a/arch/powerpc/platforms/pseries/vas.c
+++ b/arch/powerpc/platforms/pseries/vas.c
@@ -26,6 +26,7 @@
 
 static struct vas_all_caps caps_all;
 static bool copypaste_feat;
+static struct hv_vas_cop_feat_caps hv_cop_caps;
 
 static struct vas_caps vascaps[VAS_MAX_FEAT_TYPE];
 static DEFINE_MUTEX(vas_pseries_mutex);
@@ -724,7 +725,6 @@ static int reconfig_close_windows(struct vas_caps *vcap, 
int excess_creds)
  */
 int vas_reconfig_capabilties(u8 type)
 {
-   struct hv_vas_cop_feat_caps *hv_caps;
struct vas_cop_feat_caps *caps;
int old_nr_creds, new_nr_creds;
struct vas_caps *vcaps;
@@ -738,17 +738,13 @@ int vas_reconfig_capabilties(u8 type)
vcaps = &vascaps[type];
caps = &vcaps->caps;
 
-   hv_caps = kmalloc(sizeof(*hv_caps), GFP_KERNEL);
-   if (!hv_caps)
-   return -ENOMEM;
-
mutex_lock(&vas_pseries_mutex);
rc = h_query_vas_capabilities(H_QUERY_VAS_CAPABILITIES, vcaps->feat,
- (u64)virt_to_phys(hv_caps));
+ (u64)virt_to_phys(&hv_cop_caps));
if (rc)
goto out;
 
-   new_nr_creds = be16_to_cpu(hv_caps->target_lpar_creds);
+   new_nr_creds = be16_to_cpu(hv_cop_caps.target_lpar_creds);
 
old_nr_creds = atomic_read(&caps->nr_total_credits);
 
@@ -780,7 +776,6 @@ int vas_reconfig_capabilties(u8 type)
 
 out:
mutex_unlock(&vas_pseries_mutex);
-   kfree(hv_caps);
return rc;
 }
 /*
@@ -822,9 +817,8 @@ static struct notifier_block pseries_vas_nb = {
 
 static int __init pseries_vas_init(void)
 {
-   struct hv_vas_cop_feat_caps *hv_cop_caps;
struct hv_vas_all_caps *hv_caps;
-   int rc;
+   int rc = 0;
 
/*
 * Linux supports user space COPY/PASTE only with Radix
@@ -850,38 +844,37 @@ static int __init pseries_vas_init(void)
 
sysfs_pseries_vas_init(&caps_all);
 
-   hv_cop_caps = kmalloc(sizeof(*hv_cop_caps), GFP_KERNEL);
-   if (!hv_cop_caps) {
-   rc = -ENOMEM;
-   goto out;
-   }
/*
 * QOS capabilities available
 */
if (caps_all.feat_type & VAS_GZIP_QOS_FEAT_BIT) {
rc = get_vas_capabilities(VAS_GZIP_QOS_FEAT,
- VAS_GZIP_QOS_FEAT_TYPE, hv_cop_caps);
+ VAS_GZIP_QOS_FEAT_TYPE, &hv_cop_caps);
 
if (rc)
-   goto out_cop;
+   goto out;
}
/*
 * Default capabilities available
 */
-   if (caps_all.feat_type & VAS_GZIP_DEF_FEAT_BIT) {
+   if (caps_all.feat_type & VAS_GZIP_DEF_FEAT_BIT)
rc = get_vas_capabilities(VAS_GZIP_DEF_FEAT,
- VAS_GZIP_DEF_FEAT_TYPE, hv_cop_caps);
-   if (rc)
-   goto out_cop;
-   }
+ VAS_GZIP_DEF_FEAT_TYPE, &hv_cop_caps);
 
-   if (copypaste_feat && firmware_has_feature(FW_FEATURE_LPAR))
-   of_reconfig_notifier_register(&pseries_vas_nb);
+   if (!rc && copypaste_feat) {
+   if (firmware_has_feature(FW_FEATURE_LPAR))
+   of_reconfig_notifier_register(&pseries_vas_nb);
 
-   pr_info("GZIP feature is available\n");
+   pr_info("GZIP feature is available\n");
+   } else {
+   /*
+* Should not happen, but only when get default
+* capabilities HCALL failed. So disable copy paste
+* feature.
+*/
+   copypaste_feat = false;
+   }
 
-out_cop:
-   kfree(hv_cop_caps);
 out:
kfree(hv_caps);
return rc;
-- 
2.27.0




[PATCH v5 7/9] powerpc/pseries/vas: Reopen windows with DLPAR core add

2022-02-27 Thread Haren Myneni


VAS windows can be closed in the hypervisor due to lost credits
when a core is removed. If NX requests are issued on these
inactive windows, the OS gets page faults and the paste failure
is returned to the user space. If the lost credits become
available later with core add, reopen these windows and set them
active. Later, when the OS sees page faults on these active
windows, it creates the mapping on the new paste address. Then
the user space can continue to use these windows and send HW
compression requests to NX successfully.
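
The comment in reconfig_open_windows() in this patch gives an example:
40 windows are closed, 20 credits come back, so 20 entries are skipped
and the next 20 are reopened. A minimal sketch of that arithmetic
follows. The helper names entries_to_skip() and windows_to_reopen()
are hypothetical and do not exist in the kernel. Only the comparison
itself mirrors the mv_ents calculation in the diff.

```c
#include <assert.h>

/*
 * Number of closed-window entries to skip before reopening, so that
 * only as many windows are reopened as credits were returned.
 */
static int entries_to_skip(int nr_close_wins, int creds)
{
	assert(nr_close_wins >= 0 && creds >= 0);
	return nr_close_wins > creds ? nr_close_wins - creds : 0;
}

/* Number of windows that can be reopened with the returned credits. */
static int windows_to_reopen(int nr_close_wins, int creds)
{
	return nr_close_wins - entries_to_skip(nr_close_wins, creds);
}
```

When more credits are returned than there are closed windows, nothing
is skipped and every closed window is a candidate for reopening.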

Signed-off-by: Haren Myneni 
---
 arch/powerpc/platforms/pseries/vas.c | 91 +++-
 1 file changed, 90 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/platforms/pseries/vas.c 
b/arch/powerpc/platforms/pseries/vas.c
index a297720bcdae..96178dd58adf 100644
--- a/arch/powerpc/platforms/pseries/vas.c
+++ b/arch/powerpc/platforms/pseries/vas.c
@@ -565,6 +565,88 @@ static int __init get_vas_capabilities(u8 feat, enum 
vas_cop_feat_type type,
return 0;
 }
 
+/*
+ * VAS windows can be closed due to lost credits when the core is
+ * removed. So reopen them if credits are available due to DLPAR
+ * core add and set the window active status. When NX sees the page
+ * fault on the unmapped paste address, the kernel handles the fault
+ * by setting the remapping to new paste address if the window is
+ * active.
+ */
+static int reconfig_open_windows(struct vas_caps *vcaps, int creds)
+{
+   long domain[PLPAR_HCALL9_BUFSIZE] = {VAS_DEFAULT_DOMAIN_ID};
+   struct vas_cop_feat_caps *caps = >caps;
+   struct pseries_vas_window *win = NULL, *tmp;
+   int rc, mv_ents = 0;
+
+   /*
+* Nothing to do if there are no closed windows.
+*/
+   if (!vcaps->nr_close_wins)
+   return 0;
+
+   /*
+* For the core removal, the hypervisor reduces the credits
+* assigned to the LPAR and the kernel closes VAS windows
+* in the hypervisor depends on reduced credits. The kernel
+* uses LIFO (the last windows that are opened will be closed
+* first) and expects to open in the same order when credits
+* are available.
+* For example, 40 windows are closed when the LPAR lost 2 cores
+* (dedicated). If 1 core is added, this LPAR can have 20 more
+* credits. It means the kernel can reopen 20 windows. So move
+* 20 entries in the VAS windows lost and reopen next 20 windows.
+*/
+   if (vcaps->nr_close_wins > creds)
+   mv_ents = vcaps->nr_close_wins - creds;
+
+   list_for_each_entry_safe(win, tmp, >list, win_list) {
+   if (!mv_ents)
+   break;
+
+   mv_ents--;
+   }
+
+   list_for_each_entry_safe_from(win, tmp, >list, win_list) {
+   /*
+* Nothing to do on this window if it is not closed
+* with VAS_WIN_NO_CRED_CLOSE
+*/
+   if (!(win->vas_win.status & VAS_WIN_NO_CRED_CLOSE))
+   continue;
+
+   rc = allocate_setup_window(win, (u64 *)[0],
+  caps->win_type);
+   if (rc)
+   return rc;
+
+   rc = h_modify_vas_window(win);
+   if (rc)
+   goto out;
+
+   mutex_lock(>vas_win.task_ref.mmap_mutex);
+   /*
+* Set window status to active
+*/
+   win->vas_win.status &= ~VAS_WIN_NO_CRED_CLOSE;
+   mutex_unlock(>vas_win.task_ref.mmap_mutex);
+   win->win_type = caps->win_type;
+   if (!--vcaps->nr_close_wins)
+   break;
+   }
+
+   return 0;
+out:
+   /*
+* Window modify HCALL failed. So close the window to the
+* hypervisor and return.
+*/
+   free_irq_setup(win);
+   h_deallocate_vas_window(win->vas_win.winid);
+   return rc;
+}
+
 /*
  * The hypervisor reduces the available credits if the LPAR lost core. It
  * means the excessive windows should not be active and the user space
@@ -673,7 +755,14 @@ static int vas_reconfig_capabilties(u8 type)
 * closed / reopened. Hold the vas_pseries_mutex so that the
 * the user space can not open new windows.
 */
-   if (old_nr_creds >  new_nr_creds) {
+   if (old_nr_creds <  new_nr_creds) {
+   /*
+* If the existing target credits is less than the new
+* target, reopen windows if they are closed due to
+* the previous DLPAR (core removal).
+*/
+   rc = reconfig_open_windows(vcaps, new_nr_creds - old_nr_creds);
+   } else {
/*
 * # active windows is more than new LPAR availa

[PATCH v4 0/3] powerpc/pseries/vas: VAS/NXGZIP support with LPM

2022-02-27 Thread Haren Myneni


Virtual Accelerator Switchboard (VAS) is an engine that resides
on the chip, so all windows opened on a specific engine belong
to the VAS on that chip. The hypervisor expects the partition to
close all active windows on the source system and reopen them
after migration on the destination machine.

This patch series adds VAS support for partition migration.
When the migration is initiated, the VAS migration handler is
invoked before pseries_suspend() to close all active windows and
mark them inactive with the VAS_WIN_MIGRATE_CLOSE status. The
same handler is called after migration to reopen all windows
that have the VAS_WIN_MIGRATE_CLOSE status and make them active
again. The user space gets paste instruction failures when it
sends requests on these inactive windows.

These patches depend on the VAS/DLPAR support patch series.

Changes in v2:
- Added new patch "Define global hv_cop_caps struct" to eliminate
  memory allocation failure during migration (suggestion by
  Nathan Lynch)

Changes in v3:
- Rebase on 5.17-rc4
- Naming changes for VAS capability struct elements based on the V4 DLPAR
  support patch series.

Changes in v4:
- Rebase on 5.17-rc5
- Include migration_in_progress enable patch in "VAS migration handler"
  and other changes as suggested by Nicholas Piggin 

Haren Myneni (3):
  powerpc/pseries/vas: Define global hv_cop_caps struct
  powerpc/pseries/vas: Modify reconfig open/close functions for
migration
  powerpc/pseries/vas: Add VAS migration handler

 arch/powerpc/include/asm/vas.h|   2 +
 arch/powerpc/platforms/pseries/mobility.c |   5 +
 arch/powerpc/platforms/pseries/vas.c  | 233 +-
 arch/powerpc/platforms/pseries/vas.h  |   6 +
 4 files changed, 201 insertions(+), 45 deletions(-)

-- 
2.27.0




[PATCH v5 9/9] powerpc/pseries/vas: Add 'update_total_credits' entry for QoS capabilities

2022-02-27 Thread Haren Myneni


pseries supports two types of credits - Default (uses the normal
priority FIFO) and Quality of Service (QoS, uses the high priority
FIFO). The user decides the number of QoS credits and sets this
value with the HMC interface. The total credits for QoS
capabilities can be changed dynamically with the HMC interface,
which invokes drmgr to communicate with the kernel.

This patch creates the 'update_total_credits' entry for QoS
capabilities so that the drmgr command can write the new target
QoS credits in sysfs. Instead of using this value, the kernel
gets the new QoS capabilities from the hypervisor whenever
update_total_credits is written, to make sure it stays in sync
with the QoS target credits in the hypervisor.
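
The "write is only a notification" behaviour can be sketched in user
space as below. This is a model under stated assumptions, not the
kernel code: model_store(), model_reconfig() and the two variables
are hypothetical, and strtoul() stands in for the kstrtou16() call
used in the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the value the hypervisor would report. */
static int fake_hv_qos_credits = 32;
/* Hypothetical stand-in for the kernel's cached total QoS credits. */
static int model_total_credits;

/* Models re-reading the capabilities from the hypervisor. */
static int model_reconfig(void)
{
	model_total_credits = fake_hv_qos_credits;
	return 0;
}

/* Models the sysfs store: validate the input, then ignore its value. */
static long model_store(const char *buf, size_t count)
{
	unsigned long val;
	char *end;

	assert(buf != NULL);
	errno = 0;
	val = strtoul(buf, &end, 0);
	if (errno || end == buf || val > UINT16_MAX)
		return -EINVAL;
	if (*end == '\n')	/* allow the newline a shell echo adds */
		end++;
	if (*end != '\0')
		return -EINVAL;

	/* The written value is not used; the hypervisor is authoritative. */
	if (model_reconfig())
		return -EINVAL;

	return (long)count;
}
```

In this model, writing "16" still leaves the cached total at whatever
the hypervisor stand-in reports, which is the point of the design.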

Signed-off-by: Haren Myneni 
---
 arch/powerpc/platforms/pseries/vas-sysfs.c | 54 +++---
 arch/powerpc/platforms/pseries/vas.c   |  2 +-
 arch/powerpc/platforms/pseries/vas.h   |  1 +
 3 files changed, 50 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/vas-sysfs.c 
b/arch/powerpc/platforms/pseries/vas-sysfs.c
index e24d3edb3021..4a7fcde5afc0 100644
--- a/arch/powerpc/platforms/pseries/vas-sysfs.c
+++ b/arch/powerpc/platforms/pseries/vas-sysfs.c
@@ -25,6 +25,27 @@ struct vas_caps_entry {
 
 #define to_caps_entry(entry) container_of(entry, struct vas_caps_entry, kobj)
 
+/*
+ * This function is used to get the notification from the drmgr when
+ * QoS credits are changed. Though receiving the target total QoS
+ * credits here, get the official QoS capabilities from the hypervisor.
+ */
+static ssize_t update_total_credits_trigger(struct vas_cop_feat_caps *caps,
+   const char *buf, size_t count)
+{
+   int err;
+   u16 creds;
+
+   err = kstrtou16(buf, 0, &creds);
+   if (!err)
+   err = vas_reconfig_capabilties(caps->win_type);
+
+   if (err)
+   return -EINVAL;
+
+   return count;
+}
+
 #define sysfs_caps_entry_read(_name)   \
 static ssize_t _name##_show(struct vas_cop_feat_caps *caps, char *buf) 
\
 {  \
@@ -63,17 +84,29 @@ struct vas_sysfs_entry {
  * changed dynamically by the user.
  * /sys/devices/vas/vas0/gzip/qos_capabilities/nr_used_credits
  * Number of credits used by the user space.
+ * /sys/devices/vas/vas0/gzip/qos_capabilities/update_total_credits
+ * Update total QoS credits dynamically
  */
 
 VAS_ATTR_RO(nr_total_credits);
 VAS_ATTR_RO(nr_used_credits);
 
-static struct attribute *vas_capab_attrs[] = {
+static struct vas_sysfs_entry update_total_credits_attribute =
+   __ATTR(update_total_credits, 0200, NULL, update_total_credits_trigger);
+
+static struct attribute *vas_def_capab_attrs[] = {
_total_credits_attribute.attr,
_used_credits_attribute.attr,
NULL,
 };
 
+static struct attribute *vas_qos_capab_attrs[] = {
+   _total_credits_attribute.attr,
+   _used_credits_attribute.attr,
+   _total_credits_attribute.attr,
+   NULL,
+};
+
 static ssize_t vas_type_show(struct kobject *kobj, struct attribute *attr,
 char *buf)
 {
@@ -118,19 +151,29 @@ static const struct sysfs_ops vas_sysfs_ops = {
.store  =   vas_type_store,
 };
 
-static struct kobj_type vas_attr_type = {
+static struct kobj_type vas_def_attr_type = {
.release=   vas_type_release,
.sysfs_ops  =   _sysfs_ops,
-   .default_attrs  =   vas_capab_attrs,
+   .default_attrs  =   vas_def_capab_attrs,
 };
 
-static char *vas_caps_kobj_name(struct vas_cop_feat_caps *caps,
+static struct kobj_type vas_qos_attr_type = {
+   .release=   vas_type_release,
+   .sysfs_ops  =   _sysfs_ops,
+   .default_attrs  =   vas_qos_capab_attrs,
+};
+
+static char *vas_caps_kobj_name(struct vas_caps_entry *centry,
struct kobject **kobj)
 {
+   struct vas_cop_feat_caps *caps = centry->caps;
+
if (caps->descriptor == VAS_GZIP_QOS_CAPABILITIES) {
+   kobject_init(>kobj, _qos_attr_type);
*kobj = gzip_caps_kobj;
return "qos_capabilities";
} else if (caps->descriptor == VAS_GZIP_DEFAULT_CAPABILITIES) {
+   kobject_init(>kobj, _def_attr_type);
*kobj = gzip_caps_kobj;
return "default_capabilities";
} else
@@ -152,9 +195,8 @@ int sysfs_add_vas_caps(struct vas_cop_feat_caps *caps)
if (!centry)
return -ENOMEM;
 
-   kobject_init(>kobj, _attr_type);
centry->caps = caps;
-   name  = vas_caps_kobj_name(caps, );
+   name  = vas_caps_kobj_name(centry, );
 
if (kobj) {
ret = kobject_add(>kobj, kobj, "%s", name);
diff --git a/arch/powerpc/platforms/ps

[PATCH v5 8/9] powerpc/pseries/vas: sysfs interface to export capabilities

2022-02-27 Thread Haren Myneni


The hypervisor provides the available VAS GZIP capabilities such
as default or QoS window type and the target available credits in
each type. This patch creates sysfs entries and exports the target,
used and the available credits for each feature.

This interface can be used by the user space to determine the credits
usage or to set the target credits in the case of QoS type (for DLPAR).

/sys/devices/vas/vas0/gzip/default_capabilities (default GZIP capabilities)
nr_total_credits /* Total credits available. Can be
 /* changed with DLPAR operation */
nr_used_credits  /* Used credits */

/sys/devices/vas/vas0/gzip/qos_capabilities (QoS GZIP capabilities)
nr_total_credits
nr_used_credits

Signed-off-by: Haren Myneni 
Reviewed-by: Nicholas Piggin 
---
 arch/powerpc/platforms/pseries/Makefile|   2 +-
 arch/powerpc/platforms/pseries/vas-sysfs.c | 226 +
 arch/powerpc/platforms/pseries/vas.c   |   6 +
 arch/powerpc/platforms/pseries/vas.h   |   6 +
 4 files changed, 239 insertions(+), 1 deletion(-)
 create mode 100644 arch/powerpc/platforms/pseries/vas-sysfs.c

diff --git a/arch/powerpc/platforms/pseries/Makefile 
b/arch/powerpc/platforms/pseries/Makefile
index ee60b59024b4..29b522d2c755 100644
--- a/arch/powerpc/platforms/pseries/Makefile
+++ b/arch/powerpc/platforms/pseries/Makefile
@@ -29,6 +29,6 @@ obj-$(CONFIG_PPC_SVM) += svm.o
 obj-$(CONFIG_FA_DUMP)  += rtas-fadump.o
 
 obj-$(CONFIG_SUSPEND)  += suspend.o
-obj-$(CONFIG_PPC_VAS)  += vas.o
+obj-$(CONFIG_PPC_VAS)  += vas.o vas-sysfs.o
 
 obj-$(CONFIG_ARCH_HAS_CC_PLATFORM) += cc_platform.o
diff --git a/arch/powerpc/platforms/pseries/vas-sysfs.c 
b/arch/powerpc/platforms/pseries/vas-sysfs.c
new file mode 100644
index ..e24d3edb3021
--- /dev/null
+++ b/arch/powerpc/platforms/pseries/vas-sysfs.c
@@ -0,0 +1,226 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Copyright 2022-23 IBM Corp.
+ */
+
+#define pr_fmt(fmt) "vas: " fmt
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "vas.h"
+
+#ifdef CONFIG_SYSFS
+static struct kobject *pseries_vas_kobj;
+static struct kobject *gzip_caps_kobj;
+
+struct vas_caps_entry {
+   struct kobject kobj;
+   struct vas_cop_feat_caps *caps;
+};
+
+#define to_caps_entry(entry) container_of(entry, struct vas_caps_entry, kobj)
+
+#define sysfs_caps_entry_read(_name)   \
+static ssize_t _name##_show(struct vas_cop_feat_caps *caps, char *buf) 
\
+{  \
+   return sprintf(buf, "%d\n", atomic_read(>_name)); \
+}
+
+struct vas_sysfs_entry {
+   struct attribute attr;
+   ssize_t (*show)(struct vas_cop_feat_caps *, char *);
+   ssize_t (*store)(struct vas_cop_feat_caps *, const char *, size_t);
+};
+
+#define VAS_ATTR_RO(_name) \
+   sysfs_caps_entry_read(_name);   \
+   static struct vas_sysfs_entry _name##_attribute = __ATTR(_name, \
+   0444, _name##_show, NULL);
+
+/*
+ * Create sysfs interface:
+ * /sys/devices/vas/vas0/gzip/default_capabilities
+ * This directory contains the following VAS GZIP capabilities
+ * for the defaule credit type.
+ * /sys/devices/vas/vas0/gzip/default_capabilities/nr_total_credits
+ * Total number of default credits assigned to the LPAR which
+ * can be changed with DLPAR operation.
+ * /sys/devices/vas/vas0/gzip/default_capabilities/nr_used_credits
+ * Number of credits used by the user space. One credit will
+ * be assigned for each window open.
+ *
+ * /sys/devices/vas/vas0/gzip/qos_capabilities
+ * This directory contains the following VAS GZIP capabilities
+ * for the Quality of Service (QoS) credit type.
+ * /sys/devices/vas/vas0/gzip/qos_capabilities/nr_total_credits
+ * Total number of QoS credits assigned to the LPAR. The user
+ * has to define this value using HMC interface. It can be
+ * changed dynamically by the user.
+ * /sys/devices/vas/vas0/gzip/qos_capabilities/nr_used_credits
+ * Number of credits used by the user space.
+ */
+
+VAS_ATTR_RO(nr_total_credits);
+VAS_ATTR_RO(nr_used_credits);
+
+static struct attribute *vas_capab_attrs[] = {
+   _total_credits_attribute.attr,
+   _used_credits_attribute.attr,
+   NULL,
+};
+
+static ssize_t vas_type_show(struct kobject *kobj, struct attribute *attr,
+char *buf)
+{
+   struct vas_caps_entry *centry;
+   struct vas_cop_feat_caps *caps;
+   struct vas_sysfs_entry *entry;
+
+   centry = to_caps_entry(kobj);
+   caps = centry->caps;
+   entry = container_of(attr, struct vas_sysfs_entry, attr);
+
+   if (!entry->show)
+   return -EIO;
+
+   return entry->show(caps, buf);
+}
+
+static ssize_t vas_type_store(st

[PATCH v5 6/9] powerpc/pseries/vas: Close windows with DLPAR core removal

2022-02-27 Thread Haren Myneni


The hypervisor assigns VAS credits (windows) to each LPAR based
on the number of cores configured in that system. The OS is
expected to release credits when cores are removed, and may
allocate more when cores are added. So the LPAR may end up using
excessive credits (windows), and the hypervisor expects the
system to close the excessive windows so that the NX load can be
equally distributed across all LPARs in the system.

When the OS closes the excessive windows in the hypervisor,
it sets the window status inactive and invalidates the window
virtual address mapping. The user space receives paste instruction
failure if any NX requests are issued on the inactive window.
Then the user space can use the remaining open windows or
retry NX requests until this window is active again.

This patch also adds the notifier for core removal/add to close
windows in the hypervisor if the system lost credits (core
removal) and reopen windows in the hypervisor when the previously
lost credits are available.
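
A rough sketch of the two decisions described above follows. It is
an illustration only: excess_windows() and may_open_new_window() are
hypothetical helpers, and the "active minus new target" formula is an
assumption about how the excess is counted. The gating rule mirrors
the nr_close_wins check added to vas_allocate_window() in this patch.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Assumption: the excess is the number of active windows beyond the
 * new credit target reported by the hypervisor after core removal.
 */
static int excess_windows(int nr_active, int new_nr_creds)
{
	assert(nr_active >= 0 && new_nr_creds >= 0);
	return nr_active > new_nr_creds ? nr_active - new_nr_creds : 0;
}

/*
 * New windows are handed out only when no window is waiting to be
 * reopened, so that previously closed windows get credits first.
 */
static bool may_open_new_window(int nr_close_wins)
{
	return nr_close_wins == 0;
}
```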

Signed-off-by: Haren Myneni 
---
 arch/powerpc/include/asm/vas.h   |   2 +
 arch/powerpc/platforms/pseries/vas.c | 207 +--
 arch/powerpc/platforms/pseries/vas.h |   3 +
 3 files changed, 204 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/include/asm/vas.h b/arch/powerpc/include/asm/vas.h
index 27251af18c65..6baf7b9ffed4 100644
--- a/arch/powerpc/include/asm/vas.h
+++ b/arch/powerpc/include/asm/vas.h
@@ -34,6 +34,8 @@
  */
 #define VAS_WIN_ACTIVE 0x0 /* Used in platform independent */
/* vas mmap() */
+/* Window is closed in the hypervisor due to lost credit */
+#define VAS_WIN_NO_CRED_CLOSE  0x0001
 
 /*
  * Get/Set bit fields
diff --git a/arch/powerpc/platforms/pseries/vas.c 
b/arch/powerpc/platforms/pseries/vas.c
index 1035446f985b..a297720bcdae 100644
--- a/arch/powerpc/platforms/pseries/vas.c
+++ b/arch/powerpc/platforms/pseries/vas.c
@@ -370,13 +370,28 @@ static struct vas_window *vas_allocate_window(int vas_id, 
u64 flags,
if (rc)
goto out_free;
 
-   vas_user_win_add_mm_context(>vas_win.task_ref);
txwin->win_type = cop_feat_caps->win_type;
mutex_lock(_pseries_mutex);
-   list_add(>win_list, >list);
+   /*
+* Possible to lose the acquired credit with DLPAR core
+* removal after the window is opened. So if there are any
+* closed windows (means with lost credits), do not give new
+* window to user space. New windows will be opened only
+* after the existing windows are reopened when credits are
+* available.
+*/
+   if (!caps->nr_close_wins) {
+   list_add(>win_list, >list);
+   caps->nr_open_windows++;
+   mutex_unlock(_pseries_mutex);
+   vas_user_win_add_mm_context(>vas_win.task_ref);
+   return >vas_win;
+   }
mutex_unlock(_pseries_mutex);
 
-   return >vas_win;
+   put_vas_user_win_ref(>vas_win.task_ref);
+   rc = -EBUSY;
+   pr_err("No credit is available to allocate window\n");
 
 out_free:
/*
@@ -439,14 +454,24 @@ static int vas_deallocate_window(struct vas_window *vwin)
 
caps = [win->win_type].caps;
mutex_lock(_pseries_mutex);
-   rc = deallocate_free_window(win);
-   if (rc) {
-   mutex_unlock(_pseries_mutex);
-   return rc;
-   }
+   /*
+* VAS window is already closed in the hypervisor when
+* lost the credit. So just remove the entry from
+* the list, remove task references and free vas_window
+* struct.
+*/
+   if (win->vas_win.status & VAS_WIN_NO_CRED_CLOSE) {
+   rc = deallocate_free_window(win);
+   if (rc) {
+   mutex_unlock(_pseries_mutex);
+   return rc;
+   }
+   } else
+   vascaps[win->win_type].nr_close_wins--;
 
list_del(>win_list);
atomic_dec(>nr_used_credits);
+   vascaps[win->win_type].nr_open_windows--;
mutex_unlock(_pseries_mutex);
 
put_vas_user_win_ref(>task_ref);
@@ -501,6 +526,7 @@ static int __init get_vas_capabilities(u8 feat, enum 
vas_cop_feat_type type,
memset(vcaps, 0, sizeof(*vcaps));
INIT_LIST_HEAD(>list);
 
+   vcaps->feat = feat;
caps = >caps;
 
rc = h_query_vas_capabilities(H_QUERY_VAS_CAPABILITIES, feat,
@@ -539,6 +565,168 @@ static int __init get_vas_capabilities(u8 feat, enum 
vas_cop_feat_type type,
return 0;
 }
 
+/*
+ * The hypervisor reduces the available credits if the LPAR lost core. It
+ * means the excessive windows should not be active and the user space
+ * should not be using these windows to send compression requests to NX.
+ * So the kernel closes the excessive windows and unmap the pas

[PATCH v5 5/9] powerpc/vas: Map paste address only if window is active

2022-02-27 Thread Haren Myneni


The paste address mapping is done with mmap() after the window is
opened with ioctl. The partition has to close VAS windows in the
hypervisor if it lost credits due to DLPAR core removal, and the
kernel marks these windows inactive until the previously lost
credits are available again. If the window becomes inactive due
to DLPAR after this mmap(), the paste instruction returns failure
until the OS reopens this window.

A DLPAR core removal event can also happen before the user space
issues mmap(), which makes the corresponding window inactive. So
if the window is not active, fail mmap() with -EACCES and expect
the user space to reissue mmap() when the window is active, or
to open a new window when the credit is available.

Signed-off-by: Haren Myneni 
---
 arch/powerpc/platforms/book3s/vas-api.c | 23 ++-
 1 file changed, 22 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/platforms/book3s/vas-api.c 
b/arch/powerpc/platforms/book3s/vas-api.c
index f3e421511ea6..5372dbc2e37f 100644
--- a/arch/powerpc/platforms/book3s/vas-api.c
+++ b/arch/powerpc/platforms/book3s/vas-api.c
@@ -496,10 +496,29 @@ static int coproc_mmap(struct file *fp, struct 
vm_area_struct *vma)
return -EACCES;
}
 
+   /*
+* The initial mmap is done after the window is opened
+* with ioctl. But before mmap(), this window can be closed in
+* the hypervisor due to lost credit (core removal on pseries).
+* So if the window is not active, return mmap() failure with
+* -EACCES and expects the user space reissue mmap() when it
+* is active again or open new window when the credit is available.
+* mmap_mutex protects the paste address mmap() with DLPAR
+* close/open event and allows mmap() only when the window is
+* active.
+*/
+   mutex_lock(&txwin->task_ref.mmap_mutex);
+   if (txwin->status != VAS_WIN_ACTIVE) {
+   pr_err("%s(): Window is not active\n", __func__);
+   rc = -EACCES;
+   goto out;
+   }
+
paste_addr = cp_inst->coproc->vops->paste_addr(txwin);
if (!paste_addr) {
pr_err("%s(): Window paste address failed\n", __func__);
-   return -EINVAL;
+   rc = -EINVAL;
+   goto out;
}
 
pfn = paste_addr >> PAGE_SHIFT;
@@ -519,6 +538,8 @@ static int coproc_mmap(struct file *fp, struct 
vm_area_struct *vma)
txwin->task_ref.vma = vma;
vma->vm_ops = _vm_ops;
 
+out:
+   mutex_unlock(&txwin->task_ref.mmap_mutex);
return rc;
 }
 
-- 
2.27.0




[PATCH v5 4/9] powerpc/vas: Return paste instruction failure if no active window

2022-02-27 Thread Haren Myneni


The VAS window may not be active if the system loses credits, and
the NX generates a page fault when it receives a request on an
unmapped paste address.

The kernel handles the fault by remapping the new paste address
if the window is active again. Otherwise it returns the paste
instruction failure if the executed instruction that caused the
fault was a paste.

Signed-off-by: Nicholas Piggin 
Signed-off-by: Haren Myneni 
---
 arch/powerpc/include/asm/ppc-opcode.h   |  2 +
 arch/powerpc/platforms/book3s/vas-api.c | 55 -
 2 files changed, 56 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/ppc-opcode.h 
b/arch/powerpc/include/asm/ppc-opcode.h
index 9675303b724e..82f1f0041c6f 100644
--- a/arch/powerpc/include/asm/ppc-opcode.h
+++ b/arch/powerpc/include/asm/ppc-opcode.h
@@ -262,6 +262,8 @@
 #define PPC_INST_MFSPR_PVR 0x7c1f42a6
 #define PPC_INST_MFSPR_PVR_MASK0xfc1e
 #define PPC_INST_MTMSRD0x7c000164
+#define PPC_INST_PASTE 0x7c20070d
+#define PPC_INST_PASTE_MASK 0xfc2007ff
 #define PPC_INST_POPCNTB   0x7cf4
 #define PPC_INST_POPCNTB_MASK  0xfc0007fe
 #define PPC_INST_RFEBB 0x4c000124
diff --git a/arch/powerpc/platforms/book3s/vas-api.c 
b/arch/powerpc/platforms/book3s/vas-api.c
index f359e7b2bf90..f3e421511ea6 100644
--- a/arch/powerpc/platforms/book3s/vas-api.c
+++ b/arch/powerpc/platforms/book3s/vas-api.c
@@ -351,6 +351,41 @@ static int coproc_release(struct inode *inode, struct file 
*fp)
return 0;
 }
 
+/*
+ * If the executed instruction that caused the fault was a paste, then
+ * clear regs CR0[EQ], advance NIP, and return 0. Else return error code.
+ */
+static int do_fail_paste(void)
+{
+   struct pt_regs *regs = current->thread.regs;
+   u32 instword;
+
+   if (WARN_ON_ONCE(!regs))
+   return -EINVAL;
+
+   if (WARN_ON_ONCE(!user_mode(regs)))
+   return -EINVAL;
+
+   /*
+* If we couldn't translate the instruction, the driver should
+* return success without handling the fault, it will be retried
+* or the instruction fetch will fault.
+*/
+   if (get_user(instword, (u32 __user *)(regs->nip)))
+   return -EAGAIN;
+
+   /*
+* Not a paste instruction, driver may fail the fault.
+*/
+   if ((instword & PPC_INST_PASTE_MASK) != PPC_INST_PASTE)
+   return -ENOENT;
+
+   regs->ccr &= ~0xe0000000;   /* Clear CR0[0-2] to fail paste */
+   regs_add_return_ip(regs, 4);    /* Emulate the paste */
+
+   return 0;
+}
+
 /*
  * This fault handler is invoked when the core generates page fault on
  * the paste address. Happens if the kernel closes window in hypervisor
@@ -408,9 +443,27 @@ static vm_fault_t vas_mmap_fault(struct vm_fault *vmf)
}
mutex_unlock(&txwin->task_ref.mmap_mutex);
 
-   return VM_FAULT_SIGBUS;
+   /*
+* Received this fault due to closing the actual window.
+* It can happen during migration or lost credits.
+* Since no mapping, return the paste instruction failure
+* to the user space.
+*/
+   ret = do_fail_paste();
+   /*
+* The user space can retry several times until success (needed
+* for migration) or should fallback to SW compression or
+* manage with the existing open windows if available.
+* Looking at sysfs interface, it can determine whether these
+* failures are coming during migration or core removal:
+* nr_used_credits > nr_total_credits when lost credits
+*/
+   if (!ret || (ret == -EAGAIN))
+   return VM_FAULT_NOPAGE;
 
+   return VM_FAULT_SIGBUS;
 }
+
 static const struct vm_operations_struct vas_vm_ops = {
.fault = vas_mmap_fault,
 };
-- 
2.27.0




[PATCH v5 3/9] powerpc/vas: Add paste address mmap fault handler

2022-02-27 Thread Haren Myneni


User space opens VAS windows and issues NX requests by pasting a
CRB on the corresponding paste address mmap. When the system loses
credits due to core removal, the kernel has to close the window in
the hypervisor and make the window inactive by unmapping this paste
address. The OS also has to handle NX request page faults if user
space issues NX requests on such a window.

This handler maps the new paste address with the same VMA when the
window is active again (due to core add with DLPAR). Otherwise it
returns paste failure.
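
The decision made by the handler can be sketched as a small user-space
model. This is only an illustration under stated assumptions: the
struct win_model type, the fault_action enum and fault_decision() are
hypothetical stand-ins, and the real handler returns vm_fault_t values
and calls vmf_insert_pfn() to create the mapping:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define VAS_WIN_ACTIVE 0x0	/* from the vas.h hunk in this patch */

/* Hypothetical outcomes; the real handler returns vm_fault_t values. */
enum fault_action {
	FAULT_MAP_PASTE_ADDR,	/* remap the new paste address */
	FAULT_SIGBUS,		/* no usable mapping */
};

/* Hypothetical, simplified view of the state the handler inspects. */
struct win_model {
	bool opened;		/* a TX window exists for this file */
	bool vma_matches;	/* faulting VMA is the saved VMA */
	uint32_t status;	/* VAS_WIN_ACTIVE or a closed status */
	uint64_t paste_addr;	/* 0 if no paste address is available */
};

static enum fault_action fault_decision(const struct win_model *w)
{
	/* Unexpected fault: window not opened or no previous mapping. */
	if (!w->opened || !w->vma_matches)
		return FAULT_SIGBUS;

	/* Window is active again: the paste address can be mapped. */
	if (w->status == VAS_WIN_ACTIVE && w->paste_addr)
		return FAULT_MAP_PASTE_ADDR;

	return FAULT_SIGBUS;
}
```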

Signed-off-by: Haren Myneni 
Reviewed-by: Nicholas Piggin 
---
 arch/powerpc/include/asm/vas.h  | 10 
 arch/powerpc/platforms/book3s/vas-api.c | 68 +
 2 files changed, 78 insertions(+)

diff --git a/arch/powerpc/include/asm/vas.h b/arch/powerpc/include/asm/vas.h
index 57573d9c1e09..27251af18c65 100644
--- a/arch/powerpc/include/asm/vas.h
+++ b/arch/powerpc/include/asm/vas.h
@@ -29,6 +29,12 @@
 #define VAS_THRESH_FIFO_GT_QTR_FULL2
 #define VAS_THRESH_FIFO_GT_EIGHTH_FULL 3
 
+/*
+ * VAS window Linux status bits
+ */
+#define VAS_WIN_ACTIVE 0x0 /* Used in platform independent */
+   /* vas mmap() */
+
 /*
  * Get/Set bit fields
  */
@@ -59,6 +65,9 @@ struct vas_user_win_ref {
struct pid *pid;/* PID of owner */
struct pid *tgid;   /* Thread group ID of owner */
struct mm_struct *mm;   /* Linux process mm_struct */
+   struct mutex mmap_mutex;/* protects paste address mmap() */
+   /* with DLPAR close/open windows */
+   struct vm_area_struct *vma; /* Save VMA and used in DLPAR ops */
 };
 
 /*
@@ -67,6 +76,7 @@ struct vas_user_win_ref {
 struct vas_window {
u32 winid;
u32 wcreds_max; /* Window credits */
+   u32 status; /* Window status used in OS */
enum vas_cop_type cop;
struct vas_user_win_ref task_ref;
char *dbgname;
diff --git a/arch/powerpc/platforms/book3s/vas-api.c 
b/arch/powerpc/platforms/book3s/vas-api.c
index 4d82c92ddd52..f359e7b2bf90 100644
--- a/arch/powerpc/platforms/book3s/vas-api.c
+++ b/arch/powerpc/platforms/book3s/vas-api.c
@@ -316,6 +316,7 @@ static int coproc_ioc_tx_win_open(struct file *fp, unsigned 
long arg)
return PTR_ERR(txwin);
}
 
+   mutex_init(&txwin->task_ref.mmap_mutex);
cp_inst->txwin = txwin;
 
return 0;
@@ -350,6 +351,70 @@ static int coproc_release(struct inode *inode, struct file 
*fp)
return 0;
 }
 
+/*
+ * This fault handler is invoked when the core generates page fault on
+ * the paste address. Happens if the kernel closes window in hypervisor
+ * (on pseries) due to lost credit or the paste address is not mapped.
+ */
+static vm_fault_t vas_mmap_fault(struct vm_fault *vmf)
+{
+   struct vm_area_struct *vma = vmf->vma;
+   struct file *fp = vma->vm_file;
+   struct coproc_instance *cp_inst = fp->private_data;
+   struct vas_window *txwin;
+   u64 paste_addr;
+   int ret;
+
+   /*
+* window is not opened. Shouldn't expect this error.
+*/
+   if (!cp_inst || !cp_inst->txwin) {
+   pr_err("%s(): Unexpected fault on paste address with TX window closed\n",
+   __func__);
+   return VM_FAULT_SIGBUS;
+   }
+
+   txwin = cp_inst->txwin;
+   /*
+* When the LPAR lost credits due to core removal or during
+* migration, invalidate the existing mapping for the current
+* paste addresses and set windows in-active (zap_page_range in
+* reconfig_close_windows()).
+* New mapping will be done later after migration or new credits
+* available. So continue to receive faults if the user space
+* issue NX request.
+*/
+   if (txwin->task_ref.vma != vmf->vma) {
+   pr_err("%s(): No previous mapping with paste address\n",
+   __func__);
+   return VM_FAULT_SIGBUS;
+   }
+
+   mutex_lock(&txwin->task_ref.mmap_mutex);
+   /*
+* The window may be inactive due to lost credit (Ex: core
+* removal with DLPAR). If the window is active again when
+* the credit is available, map the new paste address at the
+* the window virtual address.
+*/
+   if (txwin->status == VAS_WIN_ACTIVE) {
+   paste_addr = cp_inst->coproc->vops->paste_addr(txwin);
+   if (paste_addr) {
+   ret = vmf_insert_pfn(vma, vma->vm_start,
+   (paste_addr >> PAGE_SHIFT));
+   mutex_unlock(&txwin->task_ref.mmap_mutex);
+   return ret;
+   }
+   }
+   mutex_unlock(&txwin->task_ref.mmap_mutex);
+
+   return VM_FAULT_SIGBUS;
+
+}
+static const struct vm_operations_struct vas_vm_ops = {
+

[PATCH v5 2/9] powerpc/pseries/vas: Save PID in pseries_vas_window struct

2022-02-27 Thread Haren Myneni


The kernel sets the VAS window with the PID when it is opened in
the hypervisor. During a DLPAR operation, windows can be closed and
reopened in the hypervisor when the credit is available. So save
this PID in the pseries_vas_window struct when the window is opened
initially and reuse it later during the DLPAR operation.

Signed-off-by: Haren Myneni 
Reviewed-by: Nicholas Piggin 
---
 arch/powerpc/platforms/pseries/vas.c | 9 +
 arch/powerpc/platforms/pseries/vas.h | 1 +
 2 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/vas.c 
b/arch/powerpc/platforms/pseries/vas.c
index 18aae037ffe9..1035446f985b 100644
--- a/arch/powerpc/platforms/pseries/vas.c
+++ b/arch/powerpc/platforms/pseries/vas.c
@@ -107,7 +107,6 @@ static int h_deallocate_vas_window(u64 winid)
 static int h_modify_vas_window(struct pseries_vas_window *win)
 {
long rc;
-   u32 lpid = mfspr(SPRN_PID);
 
/*
 * AMR value is not supported in Linux VAS implementation.
@@ -115,7 +114,7 @@ static int h_modify_vas_window(struct pseries_vas_window 
*win)
 */
do {
rc = plpar_hcall_norets(H_MODIFY_VAS_WINDOW,
-   win->vas_win.winid, lpid, 0,
+   win->vas_win.winid, win->pid, 0,
VAS_MOD_WIN_FLAGS, 0);
 
rc = hcall_return_busy_check(rc);
@@ -124,8 +123,8 @@ static int h_modify_vas_window(struct pseries_vas_window 
*win)
if (rc == H_SUCCESS)
return 0;
 
-   pr_err("H_MODIFY_VAS_WINDOW error: %ld, winid %u lpid %u\n",
-   rc, win->vas_win.winid, lpid);
+   pr_err("H_MODIFY_VAS_WINDOW error: %ld, winid %u pid %u\n",
+   rc, win->vas_win.winid, win->pid);
return -EIO;
 }
 
@@ -338,6 +337,8 @@ static struct vas_window *vas_allocate_window(int vas_id, 
u64 flags,
}
}
 
+   txwin->pid = mfspr(SPRN_PID);
+
/*
 * Allocate / Deallocate window hcalls and setup / free IRQs
 * have to be protected with mutex.
diff --git a/arch/powerpc/platforms/pseries/vas.h 
b/arch/powerpc/platforms/pseries/vas.h
index d6ea8ab8b07a..2872532ed72a 100644
--- a/arch/powerpc/platforms/pseries/vas.h
+++ b/arch/powerpc/platforms/pseries/vas.h
@@ -114,6 +114,7 @@ struct pseries_vas_window {
u64 domain[6];  /* Associativity domain Ids */
/* this window is allocated */
u64 util;
+   u32 pid;/* PID associated with this window */
 
/* List of windows opened which is used for LPM */
struct list_head win_list;
-- 
2.27.0




[PATCH v5 1/9] powerpc/pseries/vas: Use common names in VAS capability structure

2022-02-27 Thread Haren Myneni


nr_total_credits/nr_used_credits provide credit usage to user space
via sysfs, and the same interface can be used on PowerNV in the
future. Rename the fields so that they are applicable to both
pseries and PowerNV.

Signed-off-by: Haren Myneni 
Reviewed-by: Nicholas Piggin 
---
 arch/powerpc/platforms/pseries/vas.c | 10 +-
 arch/powerpc/platforms/pseries/vas.h |  5 ++---
 2 files changed, 7 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/vas.c 
b/arch/powerpc/platforms/pseries/vas.c
index d243ddc58827..18aae037ffe9 100644
--- a/arch/powerpc/platforms/pseries/vas.c
+++ b/arch/powerpc/platforms/pseries/vas.c
@@ -310,8 +310,8 @@ static struct vas_window *vas_allocate_window(int vas_id, 
u64 flags,
 
cop_feat_caps = >caps;
 
-   if (atomic_inc_return(&cop_feat_caps->used_lpar_creds) >
-   atomic_read(&cop_feat_caps->target_lpar_creds)) {
+   if (atomic_inc_return(&cop_feat_caps->nr_used_credits) >
+   atomic_read(&cop_feat_caps->nr_total_credits)) {
pr_err("Credits are not available to allocate window\n");
rc = -EINVAL;
goto out;
@@ -385,7 +385,7 @@ static struct vas_window *vas_allocate_window(int vas_id, 
u64 flags,
free_irq_setup(txwin);
h_deallocate_vas_window(txwin->vas_win.winid);
 out:
-   atomic_dec(&cop_feat_caps->used_lpar_creds);
+   atomic_dec(&cop_feat_caps->nr_used_credits);
kfree(txwin);
return ERR_PTR(rc);
 }
@@ -445,7 +445,7 @@ static int vas_deallocate_window(struct vas_window *vwin)
}
 
list_del(>win_list);
-   atomic_dec(>used_lpar_creds);
+   atomic_dec(>nr_used_credits);
mutex_unlock(_pseries_mutex);
 
put_vas_user_win_ref(>task_ref);
@@ -521,7 +521,7 @@ static int __init get_vas_capabilities(u8 feat, enum 
vas_cop_feat_type type,
}
caps->max_lpar_creds = be16_to_cpu(hv_caps->max_lpar_creds);
caps->max_win_creds = be16_to_cpu(hv_caps->max_win_creds);
-   atomic_set(>target_lpar_creds,
+   atomic_set(>nr_total_credits,
   be16_to_cpu(hv_caps->target_lpar_creds));
if (feat == VAS_GZIP_DEF_FEAT) {
caps->def_lpar_creds = be16_to_cpu(hv_caps->def_lpar_creds);
diff --git a/arch/powerpc/platforms/pseries/vas.h 
b/arch/powerpc/platforms/pseries/vas.h
index 4ecb3fcabd10..d6ea8ab8b07a 100644
--- a/arch/powerpc/platforms/pseries/vas.h
+++ b/arch/powerpc/platforms/pseries/vas.h
@@ -72,9 +72,8 @@ struct vas_cop_feat_caps {
};
/* Total LPAR available credits. Can be different from max LPAR */
/* credits due to DLPAR operation */
-   atomic_t target_lpar_creds;
-   atomic_t used_lpar_creds; /* Used credits so far */
-   u16 avail_lpar_creds; /* Remaining available credits */
+   atomic_t nr_total_credits;   /* Total credits assigned to LPAR */
+   atomic_t nr_used_credits;    /* Used credits so far */
 };
 
 /*
-- 
2.27.0




[PATCH v5 0/9] powerpc/pseries/vas: NXGZIP support with DLPAR

2022-02-27 Thread Haren Myneni


PowerPC provides HW compression with the NX coprocessor. This feature
is available on both PowerNV and PowerVM and is included in Linux.
Since each powerpc chip has one NX coprocessor, the VAS introduces
the concept of windows / credits to manage access to this hardware
resource. On PowerVM, these limited resources should be available
across all LPARs. So the hypervisor assigns specific credits
to each LPAR based on processor entitlement so that one LPAR does
not overload NX. The hypervisor can reject a window open request
from a partition if it exceeds its credit limit (1 credit per window).

The total number of target credits in a partition can therefore change
if the core configuration is modified. The hypervisor expects the
partition to adjust its window usage depending on the new target
credits. For example, if the partition uses more credits than the
new target credits, it should close the excess windows so that
the NX resource will be available to other partitions.

This patch series enables OS to support this dynamic credit
management with DLPAR core removal/add.

Core removal operation:
- Get new VAS capabilities from the hypervisor when the DLPAR
  notifier is received. These capabilities provide the new target
  credits based on the new processor entitlement. In the case of QoS
  credit changes, the notification will be issued by updating
  the target_creds via sysfs.
- If the partition already uses more than the new target credits,
  the kernel selects windows, unmaps the current paste address and
  closes them in the hypervisor. It uses FIFO to identify these
  windows - the last windows that are opened are the first ones to be
  closed.
- When user space issues requests on these windows, NX generates a
  page fault on the unmapped paste address. The kernel handles the
  fault by returning the paste instruction failure if the window is
  not active (meaning the paste address is unmapped). It is then up
  to the library / user space to fall back to SW compression or
  manage with the current windows.

Core add operation:
- The kernel can see increased target credits from the new VAS
  capabilities.
- It scans the window list for windows that were closed in the
  hypervisor due to lost credits and selects windows based on the
  same FIFO.
- It makes the corresponding windows active and remaps the same VMA
  to the new paste address in the fault handler.
- User space should then expect paste to succeed later.
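
The credit arithmetic behind the two operations above can be sketched
as follows, assuming one credit per window as stated earlier. The
helper names windows_to_close() and windows_to_reopen() are hypothetical
and do not appear in the series:

```c
#include <assert.h>

/*
 * Hypothetical helper: number of windows the kernel has to close
 * after core removal, given the credits in use and the new target.
 */
static int windows_to_close(int nr_used_credits, int nr_total_credits)
{
	int excess = nr_used_credits - nr_total_credits;

	return excess > 0 ? excess : 0;
}

/*
 * Hypothetical helper: number of previously closed windows that can
 * be reopened after core add, limited by the credits that are free.
 */
static int windows_to_reopen(int nr_closed, int nr_active,
			     int nr_total_credits)
{
	int free_credits = nr_total_credits - nr_active;

	if (free_credits <= 0)
		return 0;

	return free_credits < nr_closed ? free_credits : nr_closed;
}
```

For example, a partition with 35 open windows whose target drops from
40 to 20 credits has 15 windows to close, and all 15 can be reopened
once the target returns to 40.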

Patch 1: Define common names for sysfs target/used/avail_creds so
 that same sysfs entries can be used even on PowerNV later.
Patch 2: Save PID in the vas window struct during initial window
 open and use it when reopening later.
Patch 3: Add new mmap fault handler which handles the page fault
 from NX on paste address.
Patch 4: Return the paste instruction failure if the window is not
 active.
Patch 5: If the window is closed in the hypervisor before user
 space issues the initial mmap(), return -EACCES failure.
Patch 6: Close windows in the hypervisor when the partition's usage
 exceeds the new target credits.
Patch 7: When credits are available, reopen windows that were closed
 earlier by core removal.
Patch 8 & 9: The user space determines the credit usage with sysfs
 target/avail/used_creds interfaces. drmgr uses target_creds
 to notify OS for QoS credit changes.

Thanks to Nicholas Piggin and Aneesh Kumar for the valuable suggestions
on the NXGZIP design to support DLPAR operations.

Changes in v2:
- Rebase on 5.16-rc5
- Use list safe functions to iterate windows list
- Changes to show the actual value in sysfs used_credits even though
  some windows are inactive with core removal. Reflects -ve value in
  sysfs avail_creds to let userspace know that it opened more windows
  than the current maximum LPAR credits.

Changes in v3:
- Rebase on 5.16
- Reconfigure VAS windows only for CPU hotplug events.

Changes in v4:
- Rebase on 5.17-rc4
- Changes based on comments from Nicholas Piggin
- Included VAS DLPAR notifier code in 'Close windows with DLPAR'
  patch instead of as a separate patch
- Patches reordering and other changes

Changes in v5:
- Rebase on 5.17-rc5
- Add update_total_credits sysfs entry to update QoS target credits
  and other commit descriptions as suggested by Nicholas Piggin

Haren Myneni (9):
  powerpc/pseries/vas: Use common names in VAS capability structure
  powerpc/pseries/vas: Save PID in pseries_vas_window struct
  powerpc/vas: Add paste address mmap fault handler
  powerpc/vas: Return paste instruction failure if no active window
  powerpc/vas: Map paste address only if window is active
  powerpc/pseries/vas: Close windows with DLPAR core removal
  powerpc/pseries/vas: Reopen windows with DLPAR core add
  powerpc/pseries/vas: sysfs interface to export capabilities
  powerpc/pseries/vas: Add 'update_total_credits' entry for QoS
capabilities

 arch/powerpc/include/asm/ppc-opcode.h  |   2 +
 arch/pow

Re: [PATCH v3 3/4] powerpc/pseries/vas: Add VAS migration handler

2022-02-24 Thread Haren Myneni
On Wed, 2022-02-23 at 20:03 +1000, Nicholas Piggin wrote:
> Excerpts from Haren Myneni's message of February 20, 2022 6:06 am:
> > Since the VAS windows belong to the VAS hardware resource, the
> > hypervisor expects the partition to close them on source partition
> > and reopen them after the partition migrated on the destination
> > machine.
> > 
> > This handler is called before pseries_suspend() to close these
> > windows and again invoked after migration. All active windows
> > for both default and QoS types will be closed and mark them
> > in-active and reopened after migration with this handler.
> > During the migration, the user space receives paste instruction
> > failure if it issues copy/paste on these in-active windows.
> > 
> > Signed-off-by: Haren Myneni 
> > ---
> >  arch/powerpc/platforms/pseries/mobility.c |  5 ++
> >  arch/powerpc/platforms/pseries/vas.c  | 86
> > +++
> >  arch/powerpc/platforms/pseries/vas.h  |  6 ++
> >  3 files changed, 97 insertions(+)
> > 
> > diff --git a/arch/powerpc/platforms/pseries/mobility.c
> > b/arch/powerpc/platforms/pseries/mobility.c
> > index 85033f392c78..70004243e25e 100644
> > --- a/arch/powerpc/platforms/pseries/mobility.c
> > +++ b/arch/powerpc/platforms/pseries/mobility.c
> > @@ -26,6 +26,7 @@
> >  #include 
> >  #include 
> >  #include "pseries.h"
> > +#include "vas.h"   /* vas_migration_handler() */
> >  #include "../../kernel/cacheinfo.h"
> >  
> >  static struct kobject *mobility_kobj;
> > @@ -669,12 +670,16 @@ static int pseries_migrate_partition(u64
> > handle)
> > if (ret)
> > return ret;
> >  
> > +   vas_migration_handler(VAS_SUSPEND);
> 
> Not sure if there is much point having a "handler" like this that
> only
> takes two operations. vas_migration_begin()/vas_migration_end() is
> better isn't it?

The actual suspend / resume framework will be added later. So I am
using VAS_SUSPEND/VAS_RESUME right now, but they will be removed later
once the permanent fix is in place.

> 
> Other question is why can't the suspend handler return error and
> handle
> it here?

We can, but it has to call pseries_cancel_migration() if the VAS suspend
handler returns failure. We should expect this failure only from the
H_DEALLOCATE_VAS_WINDOW and H_QUERY_VAS_CAPABILITIES HCALLs, which should
not happen generally.

> 
> > +
> > ret = pseries_suspend(handle);
> > if (ret == 0)
> > post_mobility_fixup();
> > else
> > pseries_cancel_migration(handle, ret);
> >  
> > +   vas_migration_handler(VAS_RESUME);
> > +
> > return ret;
> >  }
> >  
> > diff --git a/arch/powerpc/platforms/pseries/vas.c
> > b/arch/powerpc/platforms/pseries/vas.c
> > index fbcf311da0ec..df22827969db 100644
> > --- a/arch/powerpc/platforms/pseries/vas.c
> > +++ b/arch/powerpc/platforms/pseries/vas.c
> > @@ -869,6 +869,92 @@ static struct notifier_block pseries_vas_nb =
> > {
> > .notifier_call = pseries_vas_notifier,
> >  };
> >  
> > +/*
> > + * For LPM, all windows have to be closed on the source partition
> > + * before migration and reopen them on the destination partition
> > + * after migration. So closing windows during suspend and
> > + * reopen them during resume.
> > + */
> > +int vas_migration_handler(int action)
> > +{
> > +   struct vas_cop_feat_caps *caps;
> > +   int old_nr_creds, new_nr_creds = 0;
> > +   struct vas_caps *vcaps;
> > +   int i, rc = 0;
> > +
> > +   /*
> > +* NX-GZIP is not enabled. Nothing to do for migration.
> > +*/
> > +   if (!copypaste_feat)
> > +   return rc;
> > +
> > +   mutex_lock(_pseries_mutex);
> > +
> > +   for (i = 0; i < VAS_MAX_FEAT_TYPE; i++) {
> > +   vcaps = [i];
> > +   caps = >caps;
> > +   old_nr_creds = atomic_read(>nr_total_credits);
> > +
> > +   rc = h_query_vas_capabilities(H_QUERY_VAS_CAPABILITIES,
> > + vcaps->feat,
> > + (u64)virt_to_phys(_cop
> > _caps));
> > +   if (!rc) {
> > +   new_nr_creds =
> > be16_to_cpu(hv_cop_caps.target_lpar_creds);
> > +   /*
> > +* Should not happen. But incase print
> > messages, close
> > +* all windows in the list during suspend and
> > reopen
> > + 

Re: [PATCH v3 2/4] powerpc/pseries/vas: Modify reconfig open/close functions for migration

2022-02-24 Thread Haren Myneni
On Wed, 2022-02-23 at 19:54 +1000, Nicholas Piggin wrote:
> Excerpts from Haren Myneni's message of February 20, 2022 6:05 am:
> > VAS is a hardware engine stays on the chip. So when the partition
> > migrates, all VAS windows on the source system have to be closed
> > and reopen them on the destination after migration.
> > 
> > This patch make changes to the current reconfig_open/close_windows
> > functions to support migration:
> > - Set VAS_WIN_MIGRATE_CLOSE to the window status when closes and
> >   reopen windows with the same status during resume.
> > - Continue to close all windows even if deallocate HCALL failed
> >   (should not happen) since no way to stop migration with the
> >   current LPM implementation.
> 
> Hmm.  pseries_migrate_partition *can* fail?

Yes, it can fail. If pseries_suspend() fails, all VAS windows will be
reopened again without migration. vas_migration_handler(VAS_RESUME) is
called whether pseries_suspend() returns 0 or not.

> 
> > - If the DLPAR CPU event happens while migration is in progress,
> >   set VAS_WIN_NO_CRED_CLOSE to the window status. Close window
> >   happens with the first event (migration or DLPAR) and Reopen
> >   window happens only with the last event (migration or DLPAR).
> 
> Can DLPAR happen while migration is in progress? Couldn't
> this cause your source and destination credits to go out of
> whack?

It should not. If the DLPAR event happens while migration is in
progress, windows will be closed in the hypervisor (and marked inactive
with the migration status bit in the OS) for migration. For the DLPAR
event, the DLPAR_CLOSED status bit is set on the necessary windows. Then
after the migration, we open in the hypervisor, and set active in the
OS, only the windows that have just the migration status. The other
remaining windows are opened only after another DLPAR core add event.

Regarding the target credits on the destination, we get the new
capabilities after migration and use the new value for reopen.

Ex: Used the following test case -
- Configured 2 dedicated cores (40 credits) and executed the test case
which opened 35 credits / windows
- Removed 1 core, meaning 20 credits are available. So closed 15 windows
and set them with DLPAR closed status
- Migration start: Closed the remaining 20 windows and set migration
status on all windows (meaning 35)
- After migration, opened windows that have only migration status - 20
windows, and also cleared migration status for the remaining 15 windows
- Added a core, which gives the system 20 more credits. So opened the
remaining 15 windows, which have only DLPAR closed status.

> 
> Why do you need two close window types, what if you finish
> LPM and just open as many as possible regardless how they
> are closed?

Adding 2 different status bits to support DLPAR and LPM closed status.
As I mentioned above, windows will be active only after both bits are
cleared.
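
A minimal user-space sketch of the two status bits, replaying the test
case described earlier in this reply. The bit values are the ones
defined in vas.h by this series. The helpers mark_closed(),
mark_reopened() and count_active() are hypothetical and only model the
bookkeeping, not the HCALLs:

```c
#include <assert.h>
#include <stdint.h>

/* Status values as defined in arch/powerpc/include/asm/vas.h. */
#define VAS_WIN_ACTIVE		0x0
#define VAS_WIN_NO_CRED_CLOSE	0x0001	/* closed due to lost credit */
#define VAS_WIN_MIGRATE_CLOSE	0x0002	/* closed due to migration */

/* Hypothetical helper: set a close bit on windows [first, first + n). */
static void mark_closed(uint32_t *status, int first, int n, uint32_t bit)
{
	for (int i = first; i < first + n; i++)
		status[i] |= bit;
}

/* Hypothetical helper: clear a close bit on windows [first, first + n). */
static void mark_reopened(uint32_t *status, int first, int n, uint32_t bit)
{
	for (int i = first; i < first + n; i++)
		status[i] &= ~bit;
}

/* A window is active only when no close bit remains set. */
static int count_active(const uint32_t *status, int n)
{
	int active = 0;

	for (int i = 0; i < n; i++)
		if (status[i] == VAS_WIN_ACTIVE)
			active++;

	return active;
}
```

With 35 windows, closing 15 for core removal leaves 20 active, starting
migration leaves none active, finishing migration restores 20, and the
later core add restores all 35.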

Thanks
Haren

> 
> Thanks,
> Nick
> 
> > Signed-off-by: Haren Myneni 
> > ---
> >  arch/powerpc/include/asm/vas.h   |  2 +
> >  arch/powerpc/platforms/pseries/vas.c | 88 ++
> > --
> >  2 files changed, 73 insertions(+), 17 deletions(-)
> > 
> > diff --git a/arch/powerpc/include/asm/vas.h
> > b/arch/powerpc/include/asm/vas.h
> > index 6baf7b9ffed4..83afcb6c194b 100644
> > --- a/arch/powerpc/include/asm/vas.h
> > +++ b/arch/powerpc/include/asm/vas.h
> > @@ -36,6 +36,8 @@
> > /* vas mmap() */
> >  /* Window is closed in the hypervisor due to lost credit */
> >  #define VAS_WIN_NO_CRED_CLOSE  0x0001
> > +/* Window is closed due to migration */
> > +#define VAS_WIN_MIGRATE_CLOSE  0x0002
> >  
> >  /*
> >   * Get/Set bit fields
> > diff --git a/arch/powerpc/platforms/pseries/vas.c
> > b/arch/powerpc/platforms/pseries/vas.c
> > index 3bb219f54806..fbcf311da0ec 100644
> > --- a/arch/powerpc/platforms/pseries/vas.c
> > +++ b/arch/powerpc/platforms/pseries/vas.c
> > @@ -457,11 +457,12 @@ static int vas_deallocate_window(struct
> > vas_window *vwin)
> > mutex_lock(_pseries_mutex);
> > /*
> >  * VAS window is already closed in the hypervisor when
> > -* lost the credit. So just remove the entry from
> > -* the list, remove task references and free vas_window
> > +* lost the credit or with migration. So just remove the entry
> > +* from the list, remove task references and free vas_window
> >  * struct.
> >  */
> > -   if (win->vas_win.status & VAS_WIN_NO_CRED_CLOSE) {
> > +   if (!(win->vas_win.status & VAS_WIN_NO_CRED_CLOSE) &&
> > +   !(win->vas_win.status & VAS_WIN_MIGRATE_C

Re: [PATCH v3 0/4] powerpc/pseries/vas: VAS/NXGZIP support with LPM

2022-02-23 Thread Haren Myneni
On Wed, 2022-02-23 at 19:38 +1000, Nicholas Piggin wrote:
> Excerpts from Haren Myneni's message of February 20, 2022 6:04 am:
> > Virtual Accelerator Switchboard (VAS) is an engine stays on the
> > chip. So all windows opened on a specific engine belongs to VAS
> > the chip.
> 
> The problem is more that PAPR does not virtualise the VAS windows,
> right? That's a whole other gripe but nothing you can do about it
> here.

Yes, there is no virtualization with VAS windows and they are specific
to the chip.

> 
> Thanks,
> Nick
> 
> > The hypervisor expects the partition to close all
> > active windows on the sources system and reopen them after
> > migration on the destination machine.
> > 
> > This patch series adds VAS support with the partition migration.
> > When the migration initiates, the VAS migration handler will be
> > invoked before pseries_suspend() to close all active windows and
> > mark them in-active with VAS_WIN_MIGRATE_CLOSE status. Whereas
> > this migration handler is called after migration to reopen all
> > windows which has VAS_WIN_MIGRATE_CLOSE status and make them
> > active again. The user space gets paste instruction failure
> > when it sends requests on these in-active windows.
> > 
> > These patches depend on VAS/DLPAR support patch series
> > 
> > Changes in v2:
> > - Added new patch "Define global hv_cop_caps struct" to eliminate
> >   memory allocation failure during migration (suggestion by
> >   Nathan Lynch)
> > 
> > Changes in v3:
> > - Rebase on 5.17-rc4
> > - Naming changes for VAS capability struct elemets based on the V4
> > DLPAR
> >   support patch series.
> > 
> > Haren Myneni (4):
> >   powerpc/pseries/vas: Define global hv_cop_caps struct
> >   powerpc/pseries/vas: Modify reconfig open/close functions for
> > migration
> >   powerpc/pseries/vas: Add VAS migration handler
> >   powerpc/pseries/vas: Disable window open during migration
> > 
> >  arch/powerpc/include/asm/vas.h|   2 +
> >  arch/powerpc/platforms/pseries/mobility.c |   5 +
> >  arch/powerpc/platforms/pseries/vas.c  | 234 +-
> > 
> >  arch/powerpc/platforms/pseries/vas.h  |   6 +
> >  4 files changed, 201 insertions(+), 46 deletions(-)
> > 
> > -- 
> > 2.27.0
> > 
> > 
> > 



Re: [PATCH v4 9/9] powerpc/pseries/vas: Write 'nr_total_credits' for QoS credits change

2022-02-23 Thread Haren Myneni
On Wed, 2022-02-23 at 17:33 +1000, Nicholas Piggin wrote:
> Excerpts from Haren Myneni's message of February 20, 2022 6:03 am:
> > pseries supports two types of credits - Default (uses normal
> > priority
> > FIFO) and Qality of service (QoS uses high priority FIFO). The user
> > decides the number of QoS credits and sets this value with HMC
> > interface. With the core add/removal, this value can be changed in
> > HMC
> > which invokes drmgr to communicate to the kernel.
> > 
> > This patch adds an interface so that drmgr command can write the
> > new
> > target QoS credits in sysfs. But the kernel gets the new QoS
> > capabilities from the hypervisor whenever nr_total_credits is
> > updated
> > to make sure sync with the values in the hypervisor.
> > 
> > Signed-off-by: Haren Myneni 
> > ---
> >  arch/powerpc/platforms/pseries/vas-sysfs.c | 33
> > +-
> >  arch/powerpc/platforms/pseries/vas.c   |  2 +-
> >  arch/powerpc/platforms/pseries/vas.h   |  1 +
> >  3 files changed, 34 insertions(+), 2 deletions(-)
> > 
> > diff --git a/arch/powerpc/platforms/pseries/vas-sysfs.c
> > b/arch/powerpc/platforms/pseries/vas-sysfs.c
> > index e24d3edb3021..20745cd75f27 100644
> > --- a/arch/powerpc/platforms/pseries/vas-sysfs.c
> > +++ b/arch/powerpc/platforms/pseries/vas-sysfs.c
> > @@ -25,6 +25,33 @@ struct vas_caps_entry {
> >  
> >  #define to_caps_entry(entry) container_of(entry, struct
> > vas_caps_entry, kobj)
> >  
> > +/*
> > + * This function is used to get the notification from the drmgr
> > when
> > + * QoS credits are changed. Though receiving the target total QoS
> > + * credits here, get the official QoS capabilities from the
> > hypervisor.
> > + */
> > +static ssize_t nr_total_credits_store(struct vas_cop_feat_caps
> > *caps,
> > +  const char *buf, size_t count)
> > +{
> > +   int err;
> > +   u16 creds;
> > +
> > +   /*
> > +* Nothing to do for default credit type.
> > +*/
> > +   if (caps->win_type == VAS_GZIP_DEF_FEAT_TYPE)
> > +   return -EOPNOTSUPP;
> > +
> > +   err = kstrtou16(buf, 0, );
> > +   if (!err)
> > +   err = vas_reconfig_capabilties(caps->win_type);
> 
> So what's happening here? The creds value is ignored? Can it just
> be a write-only file which is named appropriately to indicate it
> can be written-to to trigger an update?

Yes, the new credits value is ignored. When the user changes QoS credits
with the HMC interface, it should be reflected in the QoS capability in
the hypervisor. So the credit value is ignored here and the capability
value is obtained from the hypervisor.

This file should be read/write - user space should be able to read
the currently configured value for both credit types - default and QoS.

Can I say nr_total_credits_update?

Thanks
Haren

> 
> Thanks,
> Nick



Re: [PATCH v4 7/9] powerpc/pseries/vas: Reopen windows with DLPAR core add

2022-02-23 Thread Haren Myneni
On Wed, 2022-02-23 at 17:28 +1000, Nicholas Piggin wrote:
> Excerpts from Haren Myneni's message of February 20, 2022 6:01 am:
> > VAS windows can be closed in the hypervisor due to lost credits
> > when the core is removed and the kernel gets fault for NX
> > requests on these in-active windows. If these credits are
> > available later for core add, reopen these windows and set them
> > active. When the OS sees page faults on these active windows,
> > it creates mapping on the new paste address. Then the user space
> > can continue to use these windows and send HW compression
> > requests to NX successfully.
> 
> Just for my own ignorance, what happens if userspace does not get
> another page fault on that window? Presumably when it gets a page
> fault it changes to an available window and doesn't just keep
> re-trying. So in what situation does it attempt to re-access a
> faulting window?

We should expect faults only when the user space issues NX requests on
these inactive windows.
Example:
- The window is closed in the hypervisor due to the DLPAR core removal,
  which means it is inactive in the OS.
- The window is reopened later when the previous credit is available,
  which means it is active in the OS.
- If the user space issues copy/paste on this window and faults on the
  paste address:
  - it gets a paste failure if the window is inactive
  - the kernel maps the new paste address and returns if the window is
    active

So if the user space does not retry, the new mapping will not be
done. From the OS point of view, this window is not used by the user
space.
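
As a rough illustration of the cases above (this is not code from the
patch), the decision taken on a paste-address fault can be modeled in
plain C. The enum and the function name are made up for this example;
only the two VAS_WIN_* values are taken from this series.

```c
/* Status values as defined in asm/vas.h by this series. */
#define VAS_WIN_ACTIVE		0x0
#define VAS_WIN_NO_CRED_CLOSE	0x0001

/* Hypothetical outcome names, used only in this sketch. */
enum paste_fault_outcome {
	PASTE_FAULT_REMAPPED,	/* window active: map the new paste address */
	PASTE_FAULT_FAILED,	/* window inactive: the paste instruction fails */
};

/* Model of what a fault on the paste address leads to. */
static enum paste_fault_outcome paste_fault(unsigned int win_status)
{
	if (win_status == VAS_WIN_ACTIVE)
		return PASTE_FAULT_REMAPPED;

	return PASTE_FAULT_FAILED;
}
```

If the user space never touches the window again, this function is
simply never reached, which is the "no retry, no new mapping" case.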

Thanks
Haren

> 
> Thanks,
> Nick
> 
> > Signed-off-by: Haren Myneni 
> > ---
> >  arch/powerpc/platforms/pseries/vas.c | 91
> > +++-
> >  1 file changed, 90 insertions(+), 1 deletion(-)
> > 
> > diff --git a/arch/powerpc/platforms/pseries/vas.c
> > b/arch/powerpc/platforms/pseries/vas.c
> > index a297720bcdae..96178dd58adf 100644
> > --- a/arch/powerpc/platforms/pseries/vas.c
> > +++ b/arch/powerpc/platforms/pseries/vas.c
> > @@ -565,6 +565,88 @@ static int __init get_vas_capabilities(u8
> > feat, enum vas_cop_feat_type type,
> > return 0;
> >  }
> >  
> > +/*
> > + * VAS windows can be closed due to lost credits when the core is
> > + * removed. So reopen them if credits are available due to DLPAR
> > + * core add and set the window active status. When NX sees the
> > page
> > + * fault on the unmapped paste address, the kernel handles the
> > fault
> > + * by setting the remapping to new paste address if the window is
> > + * active.
> > + */
> > +static int reconfig_open_windows(struct vas_caps *vcaps, int
> > creds)
> > +{
> > +   long domain[PLPAR_HCALL9_BUFSIZE] = {VAS_DEFAULT_DOMAIN_ID};
> > +   struct vas_cop_feat_caps *caps = >caps;
> > +   struct pseries_vas_window *win = NULL, *tmp;
> > +   int rc, mv_ents = 0;
> > +
> > +   /*
> > +* Nothing to do if there are no closed windows.
> > +*/
> > +   if (!vcaps->nr_close_wins)
> > +   return 0;
> > +
> > +   /*
> > +* For the core removal, the hypervisor reduces the credits
> > +* assigned to the LPAR and the kernel closes VAS windows
> > +* in the hypervisor depends on reduced credits. The kernel
> > +* uses LIFO (the last windows that are opened will be closed
> > +* first) and expects to open in the same order when credits
> > +* are available.
> > +* For example, 40 windows are closed when the LPAR lost 2
> > cores
> > +* (dedicated). If 1 core is added, this LPAR can have 20 more
> > +* credits. It means the kernel can reopen 20 windows. So move
> > +* 20 entries in the VAS windows lost and reopen next 20
> > windows.
> > +*/
> > +   if (vcaps->nr_close_wins > creds)
> > +   mv_ents = vcaps->nr_close_wins - creds;
> > +
> > +   list_for_each_entry_safe(win, tmp, >list, win_list) {
> > +   if (!mv_ents)
> > +   break;
> > +
> > +   mv_ents--;
> > +   }
> > +
> > +   list_for_each_entry_safe_from(win, tmp, >list, win_list)
> > {
> > +   /*
> > +* Nothing to do on this window if it is not closed
> > +* with VAS_WIN_NO_CRED_CLOSE
> > +*/
> > +   if (!(win->vas_win.status & VAS_WIN_NO_CRED_CLOSE))
> > +   continue;
> > +
> > +   rc = allocate_setup_window(win, (u64 *)[0],
> > +  caps->win_type);
> > +   if (rc)
> > +  

Re: [PATCH v4 6/9] powerpc/pseries/vas: Close windows with DLPAR core removal

2022-02-23 Thread Haren Myneni
On Wed, 2022-02-23 at 17:23 +1000, Nicholas Piggin wrote:
> Excerpts from Haren Myneni's message of February 20, 2022 6:00 am:
> > The hypervisor assigns vas credits (windows) for each LPAR based
> > on the number of cores configured in that system. The OS is
> > expected to release credits when cores are removed, and may
> > allocate more when cores are added. So there is a possibility of
> > using excessive credits (windows) in the LPAR and the hypervisor
> > expects the system to close the excessive windows so that NX load
> > can be equally distributed across all LPARs in the system.
> > 
> > When the OS closes the excessive windows in the hypervisor,
> > it sets the window status in-active and invalidates window
> > virtual address mapping. The user space receives paste instruction
> > failure if any NX requests are issued on the in-active window.
> 
> Thanks for adding this paragraph. Then presumably userspace can
> update their windows and be able to re-try with an available open
> window?
 
Yes, the user space should be able to manage with the available open
windows or fall back to SW compression if it can. I added this comment
in the fault handler patch.
> 
> in-active can be one word, not hyphenated.
> 
> 
> > This patch also adds the notifier for core removal/add to close
> > windows in the hypervisor if the system lost credits (core
> > removal) and reopen windows in the hypervisor when the previously
> > lost credits are available.
> > 
> > Signed-off-by: Haren Myneni 
> > ---
> >  arch/powerpc/include/asm/vas.h   |   2 +
> >  arch/powerpc/platforms/pseries/vas.c | 207
> > +--
> >  arch/powerpc/platforms/pseries/vas.h |   3 +
> >  3 files changed, 204 insertions(+), 8 deletions(-)
> > 
> > diff --git a/arch/powerpc/include/asm/vas.h
> > b/arch/powerpc/include/asm/vas.h
> > index 27251af18c65..6baf7b9ffed4 100644
> > --- a/arch/powerpc/include/asm/vas.h
> > +++ b/arch/powerpc/include/asm/vas.h
> > @@ -34,6 +34,8 @@
> >   */
> >  #define VAS_WIN_ACTIVE 0x0 /* Used in platform
> > independent */
> > /* vas mmap() */
> > +/* Window is closed in the hypervisor due to lost credit */
> > +#define VAS_WIN_NO_CRED_CLOSE  0x0001
> 
> I thought we were getting a different status for software
> status vs status returned by hypervisor?

vas_window->status is only for Linux status bits which are used for
active, DLPAR close or migration. We do not need the status returned
from the hypervisor right now. If it is needed in the future, hv_status
will be added to pseries_vas_window.

> 
> > diff --git a/arch/powerpc/platforms/pseries/vas.h
> > b/arch/powerpc/platforms/pseries/vas.h
> > index 2872532ed72a..701363cfd7c1 100644
> > --- a/arch/powerpc/platforms/pseries/vas.h
> > +++ b/arch/powerpc/platforms/pseries/vas.h
> > @@ -83,6 +83,9 @@ struct vas_cop_feat_caps {
> >  struct vas_caps {
> > struct vas_cop_feat_caps caps;
> > struct list_head list;  /* List of open windows */
> > +   int nr_close_wins;  /* closed windows in the hypervisor for
> > DLPAR */
> > +   int nr_open_windows;/* Number of successful open
> > windows */
> > +   u8 feat;/* Feature type */
> >  };
> 
> Still not entirely sold on the idea that nr_open_windows is a feature
> or capability, but if the code works out easier this way, sometimes
> these little hacks are reasonable.

nr_close_wins / nr_open_windows are not a capability or feature, but
they are used to track active windows and are needed for DLPAR /
migration. They hold the total number of open windows and the actual
number of windows closed in the hypervisor. Hence I did not add these
elements to the vas_cop_feat_caps struct.
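
To make the bookkeeping concrete, here is a small stand-alone sketch
(not the kernel code). The struct is a simplified stand-in for the two
counters in struct vas_caps, and the helper names are invented for this
example. The arithmetic follows the series: active windows are
nr_open_windows - nr_close_wins, and on core add the reopen path skips
nr_close_wins - creds closed entries when fewer credits come back than
windows were closed.

```c
/* Simplified stand-in for the two counters kept in struct vas_caps. */
struct win_counts {
	int nr_open_windows;	/* windows opened successfully */
	int nr_close_wins;	/* of those, closed in the hypervisor */
};

/* Windows that are still usable by the user space. */
static int active_windows(const struct win_counts *c)
{
	return c->nr_open_windows - c->nr_close_wins;
}

/*
 * Closed entries to skip before reopening when only 'creds' credits
 * came back (the mv_ents value in reconfig_open_windows()).
 */
static int entries_to_skip(int nr_close_wins, int creds)
{
	if (nr_close_wins > creds)
		return nr_close_wins - creds;

	return 0;
}
```

With the numbers from the code comment, 40 closed windows and 20
returned credits give entries_to_skip(40, 20) == 20.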

Thanks
Haren

> 
> Thanks,
> Nick



Re: [PATCH v4 5/9] powerpc/vas: Map paste address only if window is active

2022-02-23 Thread Haren Myneni
On Wed, 2022-02-23 at 17:11 +1000, Nicholas Piggin wrote:
> Excerpts from Haren Myneni's message of February 20, 2022 5:59 am:
> > The paste address mapping is done with mmap() after the window is
> > opened with ioctl. If the window is closed by OS in the hypervisor
> > due to DLPAR after this mmap(), the paste instruction returns
> 
> I don't think the changelog was improved here.
> 
> The window is closed by the OS in response to a DLPAR operation
> by the hypervisor? The OS can't be in the hypervisor.

From the user space point of view, this window is inactive. But from
the hypervisor point of view, this window is closed. So my point is the
window is closed by the OS in the hypervisor for the DLPAR event.

> 
> 
> > failure until the OS reopens this window again. But before mmap(),
> > DLPAR core removal can happen which causes the corresponding
> > window in-active. So if the window is not active, return mmap()
> > failure with -EACCES and expects the user space reissue mmap()
> > when the window is active or open a new window when the credit
> > is available.
> > 
> > Signed-off-by: Haren Myneni 
> > ---
> >  arch/powerpc/platforms/book3s/vas-api.c | 20 +++-
> >  1 file changed, 19 insertions(+), 1 deletion(-)
> > 
> > diff --git a/arch/powerpc/platforms/book3s/vas-api.c
> > b/arch/powerpc/platforms/book3s/vas-api.c
> > index f3e421511ea6..eb4489b2b46b 100644
> > --- a/arch/powerpc/platforms/book3s/vas-api.c
> > +++ b/arch/powerpc/platforms/book3s/vas-api.c
> > @@ -496,10 +496,26 @@ static int coproc_mmap(struct file *fp,
> > struct vm_area_struct *vma)
> > return -EACCES;
> > }
> >  
> > +   /*
> > +* The initial mmap is done after the window is opened
> > +* with ioctl. But before mmap(), this window can be closed in
> > +* the hypervisor due to lost credit (core removal on pseries).
> > +* So if the window is not active, return mmap() failure with
> > +* -EACCES and expects the user space reissue mmap() when it
> > +* is active again or open new window when the credit is
> > available.
> > +*/
> > +   mutex_lock(>task_ref.mmap_mutex);
> > +   if (txwin->status != VAS_WIN_ACTIVE) {
> > +   pr_err("%s(): Window is not active\n", __func__);
> > +   rc = -EACCES;
> > +   goto out;
> > +   }
> > +
> > paste_addr = cp_inst->coproc->vops->paste_addr(txwin);
> > if (!paste_addr) {
> > pr_err("%s(): Window paste address failed\n",
> > __func__);
> > -   return -EINVAL;
> > +   rc = -EINVAL;
> > +   goto out;
> > }
> >  
> > pfn = paste_addr >> PAGE_SHIFT;
> > @@ -519,6 +535,8 @@ static int coproc_mmap(struct file *fp, struct
> > vm_area_struct *vma)
> > txwin->task_ref.vma = vma;
> > vma->vm_ops = _vm_ops;
> >  
> > +out:
> > +   mutex_unlock(>task_ref.mmap_mutex);
> 
> Did we have an explanation or what mmap_mutex is protecting? Sorry
> if 
> you explained it and I forgot -- would be good to have a small
> comment
> (what is it protecting against).

The comment should be in the struct definition. I will add some comment
in this patch.

Thanks
Haren
> 
> Thanks,
> Nick



Re: [PATCH v4 4/9] powerpc/vas: Return paste instruction failure if no active window

2022-02-22 Thread Haren Myneni
On Wed, 2022-02-23 at 17:05 +1000, Nicholas Piggin wrote:
> Excerpts from Haren Myneni's message of February 20, 2022 5:58 am:
> > The VAS window may not be active if the system looses credits and
> > the NX generates page fault when it receives request on unmap
> > paste address.
> > 
> > The kernel handles the fault by remap new paste address if the
> > window is active again, Otherwise return the paste instruction
> > failure if the executed instruction that caused the fault was
> > a paste.
> 
> Looks good, thanks for fixin the SIGBUS thing, was that my
> fault? I vaguely remember writing some of this patch :P

Thanks for your reviews on all patches. 

No, it was my fault for not handling the -EAGAIN error.
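
For reference, the mapping from the do_fail_paste() return value to the
fault result can be sketched in user-space C as below. The enum values
are stand-ins for VM_FAULT_NOPAGE and VM_FAULT_SIGBUS and the function
name is made up; the condition itself mirrors the one in the patch.

```c
#include <errno.h>

/* Stand-ins for VM_FAULT_NOPAGE / VM_FAULT_SIGBUS in this sketch. */
enum fault_result {
	FAULT_NOPAGE,	/* return to user space, no signal */
	FAULT_SIGBUS,	/* deliver SIGBUS */
};

/*
 * 0       : paste was failed for the task (CR0 cleared, NIP advanced)
 * -EAGAIN : instruction could not be read, let the access be retried
 * other   : not a paste instruction (or bad state), fail the fault
 */
static enum fault_result fault_result_for(int ret)
{
	if (!ret || ret == -EAGAIN)
		return FAULT_NOPAGE;

	return FAULT_SIGBUS;
}
```

Treating -EAGAIN like success is the part that was missing before and
caused the SIGBUS.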

> 
> Thanks,
> Nick
> 
> > Signed-off-by: Nicholas Piggin 
> > Signed-off-by: Haren Myneni 
> > ---
> >  arch/powerpc/include/asm/ppc-opcode.h   |  2 +
> >  arch/powerpc/platforms/book3s/vas-api.c | 55
> > -
> >  2 files changed, 56 insertions(+), 1 deletion(-)
> > 
> > diff --git a/arch/powerpc/include/asm/ppc-opcode.h
> > b/arch/powerpc/include/asm/ppc-opcode.h
> > index 9675303b724e..82f1f0041c6f 100644
> > --- a/arch/powerpc/include/asm/ppc-opcode.h
> > +++ b/arch/powerpc/include/asm/ppc-opcode.h
> > @@ -262,6 +262,8 @@
> >  #define PPC_INST_MFSPR_PVR 0x7c1f42a6
> >  #define PPC_INST_MFSPR_PVR_MASK0xfc1e
> >  #define PPC_INST_MTMSRD0x7c000164
> > +#define PPC_INST_PASTE 0x7c20070d
> > +#define PPC_INST_PASTE_MASK0xfc2007ff
> >  #define PPC_INST_POPCNTB   0x7cf4
> >  #define PPC_INST_POPCNTB_MASK  0xfc0007fe
> >  #define PPC_INST_RFEBB 0x4c000124
> > diff --git a/arch/powerpc/platforms/book3s/vas-api.c
> > b/arch/powerpc/platforms/book3s/vas-api.c
> > index f359e7b2bf90..f3e421511ea6 100644
> > --- a/arch/powerpc/platforms/book3s/vas-api.c
> > +++ b/arch/powerpc/platforms/book3s/vas-api.c
> > @@ -351,6 +351,41 @@ static int coproc_release(struct inode *inode,
> > struct file *fp)
> > return 0;
> >  }
> >  
> > +/*
> > + * If the executed instruction that caused the fault was a paste,
> > then
> > + * clear regs CR0[EQ], advance NIP, and return 0. Else return
> > error code.
> > + */
> > +static int do_fail_paste(void)
> > +{
> > +   struct pt_regs *regs = current->thread.regs;
> > +   u32 instword;
> > +
> > +   if (WARN_ON_ONCE(!regs))
> > +   return -EINVAL;
> > +
> > +   if (WARN_ON_ONCE(!user_mode(regs)))
> > +   return -EINVAL;
> > +
> > +   /*
> > +* If we couldn't translate the instruction, the driver should
> > +* return success without handling the fault, it will be
> > retried
> > +* or the instruction fetch will fault.
> > +*/
> > +   if (get_user(instword, (u32 __user *)(regs->nip)))
> > +   return -EAGAIN;
> > +
> > +   /*
> > +* Not a paste instruction, driver may fail the fault.
> > +*/
> > +   if ((instword & PPC_INST_PASTE_MASK) != PPC_INST_PASTE)
> > +   return -ENOENT;
> > +
> > +   regs->ccr &= ~0xe000;   /* Clear CR0[0-2] to fail paste */
> > +   regs_add_return_ip(regs, 4);/* Emulate the paste */
> > +
> > +   return 0;
> > +}
> > +
> >  /*
> >   * This fault handler is invoked when the core generates page
> > fault on
> >   * the paste address. Happens if the kernel closes window in
> > hypervisor
> > @@ -408,9 +443,27 @@ static vm_fault_t vas_mmap_fault(struct
> > vm_fault *vmf)
> > }
> > mutex_unlock(>task_ref.mmap_mutex);
> >  
> > -   return VM_FAULT_SIGBUS;
> > +   /*
> > +* Received this fault due to closing the actual window.
> > +* It can happen during migration or lost credits.
> > +* Since no mapping, return the paste instruction failure
> > +* to the user space.
> > +*/
> > +   ret = do_fail_paste();
> > +   /*
> > +* The user space can retry several times until success (needed
> > +* for migration) or should fallback to SW compression or
> > +* manage with the existing open windows if available.
> > +* Looking at sysfs interface, it can determine whether these
> > +* failures are coming during migration or core removal:
> > +* nr_used_credits > nr_total_credits when lost credits
> > +*/
> > +   if (!ret || (ret == -EAGAIN))
> > +   return VM_FAULT_NOPAGE;
> >  
> > +   return VM_FAULT_SIGBUS;
> >  }
> > +
> >  static const struct vm_operations_struct vas_vm_ops = {
> > .fault = vas_mmap_fault,
> >  };
> > -- 
> > 2.27.0
> > 
> > 
> > 



[PATCH v3 4/4] powerpc/pseries/vas: Disable window open during migration

2022-02-19 Thread Haren Myneni


The current partition migration implementation does not freeze the
user space, so the user space can continue to open VAS windows. So
when the migration_in_progress flag is set, the VAS open window
API returns -EBUSY.

Signed-off-by: Haren Myneni 
---
 arch/powerpc/platforms/pseries/vas.c | 13 +++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/vas.c 
b/arch/powerpc/platforms/pseries/vas.c
index df22827969db..4be80112b05e 100644
--- a/arch/powerpc/platforms/pseries/vas.c
+++ b/arch/powerpc/platforms/pseries/vas.c
@@ -30,6 +30,7 @@ static struct hv_vas_cop_feat_caps hv_cop_caps;
 
 static struct vas_caps vascaps[VAS_MAX_FEAT_TYPE];
 static DEFINE_MUTEX(vas_pseries_mutex);
+static bool migration_in_progress;
 
 static long hcall_return_busy_check(long rc)
 {
@@ -356,8 +357,11 @@ static struct vas_window *vas_allocate_window(int vas_id, 
u64 flags,
 * same fault IRQ is not freed by the OS before.
 */
mutex_lock(_pseries_mutex);
-   rc = allocate_setup_window(txwin, (u64 *)[0],
-  cop_feat_caps->win_type);
+   if (migration_in_progress)
+   rc = -EBUSY;
+   else
+   rc = allocate_setup_window(txwin, (u64 *)[0],
+  cop_feat_caps->win_type);
mutex_unlock(_pseries_mutex);
if (rc)
goto out;
@@ -890,6 +894,11 @@ int vas_migration_handler(int action)
 
mutex_lock(_pseries_mutex);
 
+   if (action == VAS_SUSPEND)
+   migration_in_progress = true;
+   else
+   migration_in_progress = false;
+
for (i = 0; i < VAS_MAX_FEAT_TYPE; i++) {
vcaps = [i];
caps = >caps;
-- 
2.27.0
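
The gating added by this patch can be modeled with a few lines of
stand-alone C. This is only a sketch: the mutex is left out, the
VAS_SUSPEND / VAS_RESUME values are placeholders (the real definitions
live in pseries/vas.h), and open_window() stands in for the path
through vas_allocate_window().

```c
#include <errno.h>
#include <stdbool.h>

/* Placeholder action values for this sketch only. */
enum { VAS_SUSPEND, VAS_RESUME };

static bool migration_in_progress;

/* Model of the flag update done in vas_migration_handler(). */
static void migration_handler(int action)
{
	migration_in_progress = (action == VAS_SUSPEND);
}

/* Model of the window open path: refuse while migration is running. */
static int open_window(void)
{
	if (migration_in_progress)
		return -EBUSY;

	return 0;	/* the real code issues the allocate HCALL here */
}
```

In the kernel both the flag update and the check happen under
vas_pseries_mutex, so an open request sees either the state before
suspend or the state after resume.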




[PATCH v3 3/4] powerpc/pseries/vas: Add VAS migration handler

2022-02-19 Thread Haren Myneni


Since the VAS windows belong to the VAS hardware resource, the
hypervisor expects the partition to close them on the source partition
and reopen them on the destination machine after the partition has
migrated.

This handler is called before pseries_suspend() to close these
windows and is invoked again after migration. All active windows
for both default and QoS types are closed and marked inactive, and
are reopened after migration with this handler.
During the migration, the user space receives paste instruction
failure if it issues copy/paste on these inactive windows.

Signed-off-by: Haren Myneni 
---
 arch/powerpc/platforms/pseries/mobility.c |  5 ++
 arch/powerpc/platforms/pseries/vas.c  | 86 +++
 arch/powerpc/platforms/pseries/vas.h  |  6 ++
 3 files changed, 97 insertions(+)

diff --git a/arch/powerpc/platforms/pseries/mobility.c 
b/arch/powerpc/platforms/pseries/mobility.c
index 85033f392c78..70004243e25e 100644
--- a/arch/powerpc/platforms/pseries/mobility.c
+++ b/arch/powerpc/platforms/pseries/mobility.c
@@ -26,6 +26,7 @@
 #include 
 #include 
 #include "pseries.h"
+#include "vas.h"   /* vas_migration_handler() */
 #include "../../kernel/cacheinfo.h"
 
 static struct kobject *mobility_kobj;
@@ -669,12 +670,16 @@ static int pseries_migrate_partition(u64 handle)
if (ret)
return ret;
 
+   vas_migration_handler(VAS_SUSPEND);
+
ret = pseries_suspend(handle);
if (ret == 0)
post_mobility_fixup();
else
pseries_cancel_migration(handle, ret);
 
+   vas_migration_handler(VAS_RESUME);
+
return ret;
 }
 
diff --git a/arch/powerpc/platforms/pseries/vas.c 
b/arch/powerpc/platforms/pseries/vas.c
index fbcf311da0ec..df22827969db 100644
--- a/arch/powerpc/platforms/pseries/vas.c
+++ b/arch/powerpc/platforms/pseries/vas.c
@@ -869,6 +869,92 @@ static struct notifier_block pseries_vas_nb = {
.notifier_call = pseries_vas_notifier,
 };
 
+/*
+ * For LPM, all windows have to be closed on the source partition
+ * before migration and reopen them on the destination partition
+ * after migration. So closing windows during suspend and
+ * reopen them during resume.
+ */
+int vas_migration_handler(int action)
+{
+   struct vas_cop_feat_caps *caps;
+   int old_nr_creds, new_nr_creds = 0;
+   struct vas_caps *vcaps;
+   int i, rc = 0;
+
+   /*
+* NX-GZIP is not enabled. Nothing to do for migration.
+*/
+   if (!copypaste_feat)
+   return rc;
+
+   mutex_lock(_pseries_mutex);
+
+   for (i = 0; i < VAS_MAX_FEAT_TYPE; i++) {
+   vcaps = [i];
+   caps = >caps;
+   old_nr_creds = atomic_read(>nr_total_credits);
+
+   rc = h_query_vas_capabilities(H_QUERY_VAS_CAPABILITIES,
+ vcaps->feat,
+ (u64)virt_to_phys(_cop_caps));
+   if (!rc) {
+   new_nr_creds = 
be16_to_cpu(hv_cop_caps.target_lpar_creds);
+   /*
+* Should not happen. But incase print messages, close
+* all windows in the list during suspend and reopen
+* windows based on new lpar_creds on the destination
+* system.
+*/
+   if (old_nr_creds != new_nr_creds) {
+   pr_err("state(%d): lpar creds: %d HV lpar 
creds: %d\n",
+   action, old_nr_creds, new_nr_creds);
+   pr_err("Used creds: %d, Active creds: %d\n",
+   atomic_read(>nr_used_credits),
+   vcaps->nr_open_windows - 
vcaps->nr_close_wins);
+   }
+   } else {
+   pr_err("state(%d): Get VAS capabilities failed with 
%d\n",
+   action, rc);
+   /*
+* We can not stop migration with the current lpm
+* implementation. So continue closing all windows in
+* the list (during suspend) and return without
+* opening windows (during resume) if VAS capabilities
+* HCALL failed.
+*/
+   if (action == VAS_RESUME)
+   goto out;
+   }
+
+   switch (action) {
+   case VAS_SUSPEND:
+   rc = reconfig_close_windows(vcaps, 
vcaps->nr_open_windows,
+   true);
+   break;
+   case VAS_RESUME:
+   atomic_set(>nr_total_credits, new_

[PATCH v3 2/4] powerpc/pseries/vas: Modify reconfig open/close functions for migration

2022-02-19 Thread Haren Myneni


VAS is a hardware engine that stays on the chip. So when the partition
migrates, all VAS windows on the source system have to be closed
and reopened on the destination after migration.

This patch makes changes to the current reconfig_open/close_windows
functions to support migration:
- Set VAS_WIN_MIGRATE_CLOSE in the window status when closing windows
  and reopen windows with the same status during resume.
- Continue to close all windows even if the deallocate HCALL fails
  (should not happen) since there is no way to stop migration with the
  current LPM implementation.
- If the DLPAR CPU event happens while migration is in progress,
  set VAS_WIN_NO_CRED_CLOSE in the window status. The window is closed
  with the first event (migration or DLPAR) and reopened
  only with the last event (migration or DLPAR).

Signed-off-by: Haren Myneni 
---
 arch/powerpc/include/asm/vas.h   |  2 +
 arch/powerpc/platforms/pseries/vas.c | 88 ++--
 2 files changed, 73 insertions(+), 17 deletions(-)

diff --git a/arch/powerpc/include/asm/vas.h b/arch/powerpc/include/asm/vas.h
index 6baf7b9ffed4..83afcb6c194b 100644
--- a/arch/powerpc/include/asm/vas.h
+++ b/arch/powerpc/include/asm/vas.h
@@ -36,6 +36,8 @@
/* vas mmap() */
 /* Window is closed in the hypervisor due to lost credit */
 #define VAS_WIN_NO_CRED_CLOSE  0x0001
+/* Window is closed due to migration */
+#define VAS_WIN_MIGRATE_CLOSE  0x0002
 
 /*
  * Get/Set bit fields
diff --git a/arch/powerpc/platforms/pseries/vas.c 
b/arch/powerpc/platforms/pseries/vas.c
index 3bb219f54806..fbcf311da0ec 100644
--- a/arch/powerpc/platforms/pseries/vas.c
+++ b/arch/powerpc/platforms/pseries/vas.c
@@ -457,11 +457,12 @@ static int vas_deallocate_window(struct vas_window *vwin)
mutex_lock(_pseries_mutex);
/*
 * VAS window is already closed in the hypervisor when
-* lost the credit. So just remove the entry from
-* the list, remove task references and free vas_window
+* lost the credit or with migration. So just remove the entry
+* from the list, remove task references and free vas_window
 * struct.
 */
-   if (win->vas_win.status & VAS_WIN_NO_CRED_CLOSE) {
+   if (!(win->vas_win.status & VAS_WIN_NO_CRED_CLOSE) &&
+   !(win->vas_win.status & VAS_WIN_MIGRATE_CLOSE)) {
rc = deallocate_free_window(win);
if (rc) {
mutex_unlock(_pseries_mutex);
@@ -578,12 +579,14 @@ static int __init get_vas_capabilities(u8 feat, enum 
vas_cop_feat_type type,
  * by setting the remapping to new paste address if the window is
  * active.
  */
-static int reconfig_open_windows(struct vas_caps *vcaps, int creds)
+static int reconfig_open_windows(struct vas_caps *vcaps, int creds,
+bool migrate)
 {
long domain[PLPAR_HCALL9_BUFSIZE] = {VAS_DEFAULT_DOMAIN_ID};
struct vas_cop_feat_caps *caps = >caps;
struct pseries_vas_window *win = NULL, *tmp;
int rc, mv_ents = 0;
+   int flag;
 
/*
 * Nothing to do if there are no closed windows.
@@ -602,8 +605,10 @@ static int reconfig_open_windows(struct vas_caps *vcaps, 
int creds)
 * (dedicated). If 1 core is added, this LPAR can have 20 more
 * credits. It means the kernel can reopen 20 windows. So move
 * 20 entries in the VAS windows lost and reopen next 20 windows.
+* For partition migration, reopen all windows that are closed
+* during resume.
 */
-   if (vcaps->nr_close_wins > creds)
+   if ((vcaps->nr_close_wins > creds) && !migrate)
mv_ents = vcaps->nr_close_wins - creds;
 
list_for_each_entry_safe(win, tmp, >list, win_list) {
@@ -613,12 +618,35 @@ static int reconfig_open_windows(struct vas_caps *vcaps, 
int creds)
mv_ents--;
}
 
+   /*
+* Open windows if they are closed only with migration or
+* DLPAR (lost credit) before.
+*/
+   if (migrate)
+   flag = VAS_WIN_MIGRATE_CLOSE;
+   else
+   flag = VAS_WIN_NO_CRED_CLOSE;
+
list_for_each_entry_safe_from(win, tmp, >list, win_list) {
+   /*
+* This window is closed with DLPAR and migration events.
+* So reopen the window with the last event.
+* The user space is not suspended with the current
+* migration notifier. So the user space can issue DLPAR
+* CPU hotplug while migration in progress. In this case
+* this window will be opened with the last event.
+*/
+   if ((win->vas_win.status & VAS_WIN_NO_CRED_CLOSE) &&
+   (win->vas_win.status & VAS_WIN_MIGRATE_CLOSE)) {
+  
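
The "close with the first event, reopen with the last event" rule from
the commit message can be illustrated with the two status bits alone.
This is a sketch written for this note, not the code from the patch;
mark_closed() and mark_reopened() are hypothetical helper names, and
only the VAS_WIN_* values come from asm/vas.h.

```c
#include <stdbool.h>

#define VAS_WIN_ACTIVE		0x0
#define VAS_WIN_NO_CRED_CLOSE	0x0001
#define VAS_WIN_MIGRATE_CLOSE	0x0002

/*
 * Record a close event. Returns true only for the first event, which
 * is when the window really has to be closed in the hypervisor.
 */
static bool mark_closed(unsigned int *status, unsigned int flag)
{
	bool first_event = (*status == VAS_WIN_ACTIVE);

	*status |= flag;
	return first_event;
}

/*
 * Record a reopen event. Returns true only when no close reason is
 * left, which is when the window can be reopened in the hypervisor.
 */
static bool mark_reopened(unsigned int *status, unsigned int flag)
{
	*status &= ~flag;
	return *status == VAS_WIN_ACTIVE;
}
```

For a window closed by migration and then hit by a DLPAR core removal,
the resume event only clears VAS_WIN_MIGRATE_CLOSE; the window is
reopened later when the credit comes back.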

[PATCH v3 1/4] powerpc/pseries/vas: Define global hv_cop_caps struct

2022-02-19 Thread Haren Myneni


The coprocessor capabilities struct is used to get default and
QoS capabilities from the hypervisor during init, DLPAR events and
migration. So instead of allocating this struct for each event,
define a global struct and reuse it, which allows the migration code
to avoid adding an error path.

Also disable the copy/paste feature flag if any capabilities HCALL
fails.

Signed-off-by: Haren Myneni 
Acked-by: Nathan Lynch 
---
 arch/powerpc/platforms/pseries/vas.c | 47 
 1 file changed, 20 insertions(+), 27 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/vas.c 
b/arch/powerpc/platforms/pseries/vas.c
index 591c7597db5a..3bb219f54806 100644
--- a/arch/powerpc/platforms/pseries/vas.c
+++ b/arch/powerpc/platforms/pseries/vas.c
@@ -26,6 +26,7 @@
 
 static struct vas_all_caps caps_all;
 static bool copypaste_feat;
+static struct hv_vas_cop_feat_caps hv_cop_caps;
 
 static struct vas_caps vascaps[VAS_MAX_FEAT_TYPE];
 static DEFINE_MUTEX(vas_pseries_mutex);
@@ -724,7 +725,6 @@ static int reconfig_close_windows(struct vas_caps *vcap, 
int excess_creds)
  */
 int vas_reconfig_capabilties(u8 type)
 {
-   struct hv_vas_cop_feat_caps *hv_caps;
struct vas_cop_feat_caps *caps;
int old_nr_creds, new_nr_creds;
struct vas_caps *vcaps;
@@ -738,17 +738,13 @@ int vas_reconfig_capabilties(u8 type)
vcaps = [type];
caps = >caps;
 
-   hv_caps = kmalloc(sizeof(*hv_caps), GFP_KERNEL);
-   if (!hv_caps)
-   return -ENOMEM;
-
mutex_lock(_pseries_mutex);
rc = h_query_vas_capabilities(H_QUERY_VAS_CAPABILITIES, vcaps->feat,
- (u64)virt_to_phys(hv_caps));
+ (u64)virt_to_phys(_cop_caps));
if (rc)
goto out;
 
-   new_nr_creds = be16_to_cpu(hv_caps->target_lpar_creds);
+   new_nr_creds = be16_to_cpu(hv_cop_caps.target_lpar_creds);
 
old_nr_creds = atomic_read(>nr_total_credits);
 
@@ -780,7 +776,6 @@ int vas_reconfig_capabilties(u8 type)
 
 out:
mutex_unlock(_pseries_mutex);
-   kfree(hv_caps);
return rc;
 }
 /*
@@ -822,9 +817,8 @@ static struct notifier_block pseries_vas_nb = {
 
 static int __init pseries_vas_init(void)
 {
-   struct hv_vas_cop_feat_caps *hv_cop_caps;
struct hv_vas_all_caps *hv_caps;
-   int rc;
+   int rc = 0;
 
/*
 * Linux supports user space COPY/PASTE only with Radix
@@ -850,38 +844,37 @@ static int __init pseries_vas_init(void)
 
sysfs_pseries_vas_init(_all);
 
-   hv_cop_caps = kmalloc(sizeof(*hv_cop_caps), GFP_KERNEL);
-   if (!hv_cop_caps) {
-   rc = -ENOMEM;
-   goto out;
-   }
/*
 * QOS capabilities available
 */
if (caps_all.feat_type & VAS_GZIP_QOS_FEAT_BIT) {
rc = get_vas_capabilities(VAS_GZIP_QOS_FEAT,
- VAS_GZIP_QOS_FEAT_TYPE, hv_cop_caps);
+ VAS_GZIP_QOS_FEAT_TYPE, _cop_caps);
 
if (rc)
-   goto out_cop;
+   goto out;
}
/*
 * Default capabilities available
 */
-   if (caps_all.feat_type & VAS_GZIP_DEF_FEAT_BIT) {
+   if (caps_all.feat_type & VAS_GZIP_DEF_FEAT_BIT)
rc = get_vas_capabilities(VAS_GZIP_DEF_FEAT,
- VAS_GZIP_DEF_FEAT_TYPE, hv_cop_caps);
-   if (rc)
-   goto out_cop;
-   }
+ VAS_GZIP_DEF_FEAT_TYPE, _cop_caps);
 
-   if (copypaste_feat && firmware_has_feature(FW_FEATURE_LPAR))
-   of_reconfig_notifier_register(_vas_nb);
+   if (!rc && copypaste_feat) {
+   if (firmware_has_feature(FW_FEATURE_LPAR))
+   of_reconfig_notifier_register(_vas_nb);
 
-   pr_info("GZIP feature is available\n");
+   pr_info("GZIP feature is available\n");
+   } else {
+   /*
+* Should not happen, but only when get default
+* capabilities HCALL failed. So disable copy paste
+* feature.
+*/
+   copypaste_feat = false;
+   }
 
-out_cop:
-   kfree(hv_cop_caps);
 out:
kfree(hv_caps);
return rc;
-- 
2.27.0




[PATCH v3 0/4] powerpc/pseries/vas: VAS/NXGZIP support with LPM

2022-02-19 Thread Haren Myneni


Virtual Accelerator Switchboard (VAS) is an engine that stays on the
chip. So all windows opened on a specific engine belong to the VAS on
that chip. The hypervisor expects the partition to close all
active windows on the source system and reopen them after
migration on the destination machine.

This patch series adds VAS support for partition migration.
When the migration starts, the VAS migration handler is
invoked before pseries_suspend() to close all active windows and
mark them inactive with VAS_WIN_MIGRATE_CLOSE status. The same
migration handler is called after migration to reopen all
windows which have VAS_WIN_MIGRATE_CLOSE status and make them
active again. The user space gets paste instruction failure
when it sends requests on these inactive windows.

These patches depend on VAS/DLPAR support patch series

Changes in v2:
- Added new patch "Define global hv_cop_caps struct" to eliminate
  memory allocation failure during migration (suggestion by
  Nathan Lynch)

Changes in v3:
- Rebase on 5.17-rc4
- Naming changes for VAS capability struct elements based on the V4 DLPAR
  support patch series.

Haren Myneni (4):
  powerpc/pseries/vas: Define global hv_cop_caps struct
  powerpc/pseries/vas: Modify reconfig open/close functions for
migration
  powerpc/pseries/vas: Add VAS migration handler
  powerpc/pseries/vas: Disable window open during migration

 arch/powerpc/include/asm/vas.h|   2 +
 arch/powerpc/platforms/pseries/mobility.c |   5 +
 arch/powerpc/platforms/pseries/vas.c  | 234 +-
 arch/powerpc/platforms/pseries/vas.h  |   6 +
 4 files changed, 201 insertions(+), 46 deletions(-)

-- 
2.27.0




[PATCH v4 9/9] powerpc/pseries/vas: Write 'nr_total_credits' for QoS credits change

2022-02-19 Thread Haren Myneni


pseries supports two types of credits - Default (uses normal priority
FIFO) and Quality of Service (QoS, uses high priority FIFO). The user
decides the number of QoS credits and sets this value with the HMC
interface. With core add/removal, this value can be changed in the HMC,
which invokes drmgr to communicate with the kernel.

This patch adds an interface so that the drmgr command can write the new
target QoS credits in sysfs. But the kernel gets the new QoS
capabilities from the hypervisor whenever nr_total_credits is updated
to make sure it stays in sync with the values in the hypervisor.

Signed-off-by: Haren Myneni 
---
 arch/powerpc/platforms/pseries/vas-sysfs.c | 33 +-
 arch/powerpc/platforms/pseries/vas.c   |  2 +-
 arch/powerpc/platforms/pseries/vas.h   |  1 +
 3 files changed, 34 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/vas-sysfs.c 
b/arch/powerpc/platforms/pseries/vas-sysfs.c
index e24d3edb3021..20745cd75f27 100644
--- a/arch/powerpc/platforms/pseries/vas-sysfs.c
+++ b/arch/powerpc/platforms/pseries/vas-sysfs.c
@@ -25,6 +25,33 @@ struct vas_caps_entry {
 
 #define to_caps_entry(entry) container_of(entry, struct vas_caps_entry, kobj)
 
+/*
+ * This function is used to get the notification from the drmgr when
+ * QoS credits are changed. Though receiving the target total QoS
+ * credits here, get the official QoS capabilities from the hypervisor.
+ */
+static ssize_t nr_total_credits_store(struct vas_cop_feat_caps *caps,
+  const char *buf, size_t count)
+{
+   int err;
+   u16 creds;
+
+   /*
+* Nothing to do for default credit type.
+*/
+   if (caps->win_type == VAS_GZIP_DEF_FEAT_TYPE)
+   return -EOPNOTSUPP;
+
+   err = kstrtou16(buf, 0, &creds);
+   if (!err)
+   err = vas_reconfig_capabilties(caps->win_type);
+
+   if (err)
+   return -EINVAL;
+
+   return count;
+}
+
 #define sysfs_caps_entry_read(_name)   \
 static ssize_t _name##_show(struct vas_cop_feat_caps *caps, char *buf) 
\
 {  \
@@ -41,6 +68,10 @@ struct vas_sysfs_entry {
sysfs_caps_entry_read(_name);   \
static struct vas_sysfs_entry _name##_attribute = __ATTR(_name, \
0444, _name##_show, NULL);
+#define VAS_ATTR(_name)
\
+   sysfs_caps_entry_read(_name);   \
+   static struct vas_sysfs_entry _name##_attribute = __ATTR(_name, \
+   0644, _name##_show, _name##_store)
 
 /*
  * Create sysfs interface:
@@ -65,7 +96,7 @@ struct vas_sysfs_entry {
  * Number of credits used by the user space.
  */
 
-VAS_ATTR_RO(nr_total_credits);
+VAS_ATTR(nr_total_credits);
 VAS_ATTR_RO(nr_used_credits);
 
 static struct attribute *vas_capab_attrs[] = {
diff --git a/arch/powerpc/platforms/pseries/vas.c b/arch/powerpc/platforms/pseries/vas.c
index ca0ad191229d..591c7597db5a 100644
--- a/arch/powerpc/platforms/pseries/vas.c
+++ b/arch/powerpc/platforms/pseries/vas.c
@@ -722,7 +722,7 @@ static int reconfig_close_windows(struct vas_caps *vcap, int excess_creds)
  * changes. Reconfig window configurations based on the credits
  * availability from this new capabilities.
  */
-static int vas_reconfig_capabilties(u8 type)
+int vas_reconfig_capabilties(u8 type)
 {
struct hv_vas_cop_feat_caps *hv_caps;
struct vas_cop_feat_caps *caps;
diff --git a/arch/powerpc/platforms/pseries/vas.h b/arch/powerpc/platforms/pseries/vas.h
index f1bdb776021e..4ddb1001a0aa 100644
--- a/arch/powerpc/platforms/pseries/vas.h
+++ b/arch/powerpc/platforms/pseries/vas.h
@@ -130,5 +130,6 @@ struct pseries_vas_window {
 };
 
 int sysfs_add_vas_caps(struct vas_cop_feat_caps *caps);
+int vas_reconfig_capabilties(u8 type);
 int __init sysfs_pseries_vas_init(struct vas_all_caps *vas_caps);
 #endif /* _VAS_H */
-- 
2.27.0




[PATCH v4 8/9] powerpc/pseries/vas: sysfs interface to export capabilities

2022-02-19 Thread Haren Myneni


The hypervisor provides the available VAS GZIP capabilities, such as
the default or QoS window type and the target credits available for
each type. This patch creates sysfs entries that export the total
(target) and used credits for each feature.

User space can use this interface to determine the credit usage or to
set the target credits for the QoS type (for DLPAR).

/sys/devices/vas/vas0/gzip/default_capabilities (default GZIP capabilities)
	nr_total_credits /* Total credits available. Can be */
			 /* changed with DLPAR operation */
	nr_used_credits  /* Used credits */

/sys/devices/vas/vas0/gzip/qos_capabilities (QoS GZIP capabilities)
nr_total_credits
nr_used_credits
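
A user-space consumer of these entries might look roughly like the
sketch below. The helper names read_vas_credits() and vas_credits_lost()
are hypothetical; only the attribute names and the directory layout come
from this patch. The comparison reflects the series' statement that
nr_used_credits exceeds nr_total_credits after credits are lost.

```c
#include <stdio.h>

/*
 * Hypothetical helper: read one integer attribute (for example
 * nr_total_credits or nr_used_credits) from a capabilities directory.
 * Returns 0 on success, -1 on failure.
 */
int read_vas_credits(const char *dir, const char *attr, int *val)
{
	char path[256];
	FILE *fp;
	int n, ok;

	n = snprintf(path, sizeof(path), "%s/%s", dir, attr);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;

	fp = fopen(path, "r");
	if (!fp)
		return -1;

	ok = (fscanf(fp, "%d", val) == 1);
	fclose(fp);
	return ok ? 0 : -1;
}

/*
 * Returns 1 if more credits are in use than the current total, which
 * the series describes as the sign of credits lost to core removal.
 */
int vas_credits_lost(int total, int used)
{
	return used > total;
}
```

A caller would pass a directory such as
/sys/devices/vas/vas0/gzip/qos_capabilities as dir.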

Signed-off-by: Haren Myneni 
---
 arch/powerpc/platforms/pseries/Makefile|   2 +-
 arch/powerpc/platforms/pseries/vas-sysfs.c | 226 +
 arch/powerpc/platforms/pseries/vas.c   |   6 +
 arch/powerpc/platforms/pseries/vas.h   |   6 +
 4 files changed, 239 insertions(+), 1 deletion(-)
 create mode 100644 arch/powerpc/platforms/pseries/vas-sysfs.c

diff --git a/arch/powerpc/platforms/pseries/Makefile b/arch/powerpc/platforms/pseries/Makefile
index ee60b59024b4..29b522d2c755 100644
--- a/arch/powerpc/platforms/pseries/Makefile
+++ b/arch/powerpc/platforms/pseries/Makefile
@@ -29,6 +29,6 @@ obj-$(CONFIG_PPC_SVM) += svm.o
 obj-$(CONFIG_FA_DUMP)  += rtas-fadump.o
 
 obj-$(CONFIG_SUSPEND)  += suspend.o
-obj-$(CONFIG_PPC_VAS)  += vas.o
+obj-$(CONFIG_PPC_VAS)  += vas.o vas-sysfs.o
 
 obj-$(CONFIG_ARCH_HAS_CC_PLATFORM) += cc_platform.o
diff --git a/arch/powerpc/platforms/pseries/vas-sysfs.c b/arch/powerpc/platforms/pseries/vas-sysfs.c
new file mode 100644
index 000000000000..e24d3edb3021
--- /dev/null
+++ b/arch/powerpc/platforms/pseries/vas-sysfs.c
@@ -0,0 +1,226 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Copyright 2022-23 IBM Corp.
+ */
+
+#define pr_fmt(fmt) "vas: " fmt
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "vas.h"
+
+#ifdef CONFIG_SYSFS
+static struct kobject *pseries_vas_kobj;
+static struct kobject *gzip_caps_kobj;
+
+struct vas_caps_entry {
+   struct kobject kobj;
+   struct vas_cop_feat_caps *caps;
+};
+
+#define to_caps_entry(entry) container_of(entry, struct vas_caps_entry, kobj)
+
+#define sysfs_caps_entry_read(_name)   \
+static ssize_t _name##_show(struct vas_cop_feat_caps *caps, char *buf) \
+{  \
+   return sprintf(buf, "%d\n", atomic_read(&caps->_name)); \
+}
+
+struct vas_sysfs_entry {
+   struct attribute attr;
+   ssize_t (*show)(struct vas_cop_feat_caps *, char *);
+   ssize_t (*store)(struct vas_cop_feat_caps *, const char *, size_t);
+};
+
+#define VAS_ATTR_RO(_name) \
+   sysfs_caps_entry_read(_name);   \
+   static struct vas_sysfs_entry _name##_attribute = __ATTR(_name, \
+   0444, _name##_show, NULL);
+
+/*
+ * Create sysfs interface:
+ * /sys/devices/vas/vas0/gzip/default_capabilities
+ * This directory contains the following VAS GZIP capabilities
+ * for the default credit type.
+ * /sys/devices/vas/vas0/gzip/default_capabilities/nr_total_credits
+ * Total number of default credits assigned to the LPAR which
+ * can be changed with DLPAR operation.
+ * /sys/devices/vas/vas0/gzip/default_capabilities/nr_used_credits
+ * Number of credits used by the user space. One credit will
+ * be assigned for each window open.
+ *
+ * /sys/devices/vas/vas0/gzip/qos_capabilities
+ * This directory contains the following VAS GZIP capabilities
+ * for the Quality of Service (QoS) credit type.
+ * /sys/devices/vas/vas0/gzip/qos_capabilities/nr_total_credits
+ * Total number of QoS credits assigned to the LPAR. The user
+ * has to define this value using HMC interface. It can be
+ * changed dynamically by the user.
+ * /sys/devices/vas/vas0/gzip/qos_capabilities/nr_used_credits
+ * Number of credits used by the user space.
+ */
+
+VAS_ATTR_RO(nr_total_credits);
+VAS_ATTR_RO(nr_used_credits);
+
+static struct attribute *vas_capab_attrs[] = {
+   &nr_total_credits_attribute.attr,
+   &nr_used_credits_attribute.attr,
+   NULL,
+};
+
+static ssize_t vas_type_show(struct kobject *kobj, struct attribute *attr,
+char *buf)
+{
+   struct vas_caps_entry *centry;
+   struct vas_cop_feat_caps *caps;
+   struct vas_sysfs_entry *entry;
+
+   centry = to_caps_entry(kobj);
+   caps = centry->caps;
+   entry = container_of(attr, struct vas_sysfs_entry, attr);
+
+   if (!entry->show)
+   return -EIO;
+
+   return entry->show(caps, buf);
+}
+
+static ssize_t vas_type_store(struct kobject *kobj, struct attribute *attr

[PATCH v4 7/9] powerpc/pseries/vas: Reopen windows with DLPAR core add

2022-02-19 Thread Haren Myneni


VAS windows can be closed in the hypervisor due to lost credits
when a core is removed, and the kernel then gets faults for NX
requests on these inactive windows. If the credits become
available again after a core add, reopen these windows and set them
active. When the OS sees page faults on these active windows,
it creates a mapping on the new paste address. The user space
can then continue to use these windows and send HW compression
requests to NX successfully.

Signed-off-by: Haren Myneni 
---
 arch/powerpc/platforms/pseries/vas.c | 91 +++-
 1 file changed, 90 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/platforms/pseries/vas.c b/arch/powerpc/platforms/pseries/vas.c
index a297720bcdae..96178dd58adf 100644
--- a/arch/powerpc/platforms/pseries/vas.c
+++ b/arch/powerpc/platforms/pseries/vas.c
@@ -565,6 +565,88 @@ static int __init get_vas_capabilities(u8 feat, enum vas_cop_feat_type type,
return 0;
 }
 
+/*
+ * VAS windows can be closed due to lost credits when the core is
+ * removed. So reopen them if credits are available due to DLPAR
+ * core add and set the window active status. When NX sees the page
+ * fault on the unmapped paste address, the kernel handles the fault
+ * by setting the remapping to new paste address if the window is
+ * active.
+ */
+static int reconfig_open_windows(struct vas_caps *vcaps, int creds)
+{
+   long domain[PLPAR_HCALL9_BUFSIZE] = {VAS_DEFAULT_DOMAIN_ID};
+   struct vas_cop_feat_caps *caps = &vcaps->caps;
+   struct pseries_vas_window *win = NULL, *tmp;
+   int rc, mv_ents = 0;
+
+   /*
+* Nothing to do if there are no closed windows.
+*/
+   if (!vcaps->nr_close_wins)
+   return 0;
+
+   /*
+* For the core removal, the hypervisor reduces the credits
+* assigned to the LPAR and the kernel closes VAS windows
+* in the hypervisor depending on the reduced credits. The kernel
+* uses LIFO (the last windows that are opened will be closed
+* first) and expects to open in the same order when credits
+* are available.
+* For example, 40 windows are closed when the LPAR lost 2 cores
+* (dedicated). If 1 core is added, this LPAR can have 20 more
+* credits. It means the kernel can reopen 20 windows. So move
+* 20 entries in the VAS windows list and reopen next 20 windows.
+*/
+   if (vcaps->nr_close_wins > creds)
+   mv_ents = vcaps->nr_close_wins - creds;
+
+   list_for_each_entry_safe(win, tmp, &vcaps->list, win_list) {
+   if (!mv_ents)
+   break;
+
+   mv_ents--;
+   }
+
+   list_for_each_entry_safe_from(win, tmp, &vcaps->list, win_list) {
+   /*
+* Nothing to do on this window if it is not closed
+* with VAS_WIN_NO_CRED_CLOSE
+*/
+   if (!(win->vas_win.status & VAS_WIN_NO_CRED_CLOSE))
+   continue;
+
+   rc = allocate_setup_window(win, (u64 *)&domain[0],
+  caps->win_type);
+   if (rc)
+   return rc;
+
+   rc = h_modify_vas_window(win);
+   if (rc)
+   goto out;
+
+   mutex_lock(&win->vas_win.task_ref.mmap_mutex);
+   /*
+* Set window status to active
+*/
+   win->vas_win.status &= ~VAS_WIN_NO_CRED_CLOSE;
+   mutex_unlock(&win->vas_win.task_ref.mmap_mutex);
+   win->win_type = caps->win_type;
+   if (!--vcaps->nr_close_wins)
+   break;
+   }
+
+   return 0;
+out:
+   /*
+* Window modify HCALL failed. So close the window to the
+* hypervisor and return.
+*/
+   free_irq_setup(win);
+   h_deallocate_vas_window(win->vas_win.winid);
+   return rc;
+}
+
 /*
  * The hypervisor reduces the available credits if the LPAR lost core. It
  * means the excessive windows should not be active and the user space
@@ -673,7 +755,14 @@ static int vas_reconfig_capabilties(u8 type)
 * closed / reopened. Hold the vas_pseries_mutex so that the
 * the user space can not open new windows.
 */
-   if (old_nr_creds >  new_nr_creds) {
+   if (old_nr_creds <  new_nr_creds) {
+   /*
+* If the existing target credits is less than the new
+* target, reopen windows if they are closed due to
+* the previous DLPAR (core removal).
+*/
+   rc = reconfig_open_windows(vcaps, new_nr_creds - old_nr_creds);
+   } else {
/*
 * # active windows is more than new LPAR available
 * credits. So close the excessive windows.
-- 
2.27.0




[PATCH v4 6/9] powerpc/pseries/vas: Close windows with DLPAR core removal

2022-02-19 Thread Haren Myneni


The hypervisor assigns VAS credits (windows) to each LPAR based on
the number of cores configured in that system. The OS is expected
to release credits when cores are removed, and may allocate more
when cores are added. So the LPAR may end up using excess credits
(windows), and the hypervisor expects the system to close the
excess windows so that the NX load can be equally distributed
across all LPARs in the system.

When the OS closes the excess windows in the hypervisor, it sets
the window status to inactive and invalidates the window's virtual
address mapping. The user space receives paste instruction failure
if any NX requests are issued on an inactive window.

This patch also adds the notifier for core removal/add, which
closes windows in the hypervisor if the system lost credits (core
removal) and reopens them when the previously lost credits become
available again.

Signed-off-by: Haren Myneni 
---
 arch/powerpc/include/asm/vas.h   |   2 +
 arch/powerpc/platforms/pseries/vas.c | 207 +--
 arch/powerpc/platforms/pseries/vas.h |   3 +
 3 files changed, 204 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/include/asm/vas.h b/arch/powerpc/include/asm/vas.h
index 27251af18c65..6baf7b9ffed4 100644
--- a/arch/powerpc/include/asm/vas.h
+++ b/arch/powerpc/include/asm/vas.h
@@ -34,6 +34,8 @@
  */
 #define VAS_WIN_ACTIVE 0x0 /* Used in platform independent */
/* vas mmap() */
+/* Window is closed in the hypervisor due to lost credit */
+#define VAS_WIN_NO_CRED_CLOSE  0x0001
 
 /*
  * Get/Set bit fields
diff --git a/arch/powerpc/platforms/pseries/vas.c b/arch/powerpc/platforms/pseries/vas.c
index 1035446f985b..a297720bcdae 100644
--- a/arch/powerpc/platforms/pseries/vas.c
+++ b/arch/powerpc/platforms/pseries/vas.c
@@ -370,13 +370,28 @@ static struct vas_window *vas_allocate_window(int vas_id, u64 flags,
if (rc)
goto out_free;
 
-   vas_user_win_add_mm_context(&txwin->vas_win.task_ref);
txwin->win_type = cop_feat_caps->win_type;
mutex_lock(&vas_pseries_mutex);
-   list_add(&txwin->win_list, &caps->list);
+   /*
+* Possible to lose the acquired credit with DLPAR core
+* removal after the window is opened. So if there are any
+* closed windows (means with lost credits), do not give new
+* window to user space. New windows will be opened only
+* after the existing windows are reopened when credits are
+* available.
+*/
+   if (!caps->nr_close_wins) {
+   list_add(&txwin->win_list, &caps->list);
+   caps->nr_open_windows++;
+   mutex_unlock(&vas_pseries_mutex);
+   vas_user_win_add_mm_context(&txwin->vas_win.task_ref);
+   return &txwin->vas_win;
+   }
mutex_unlock(&vas_pseries_mutex);
 
-   return &txwin->vas_win;
+   put_vas_user_win_ref(&txwin->vas_win.task_ref);
+   rc = -EBUSY;
+   pr_err("No credit is available to allocate window\n");
 
 out_free:
/*
@@ -439,14 +454,24 @@ static int vas_deallocate_window(struct vas_window *vwin)
 
caps = &vascaps[win->win_type].caps;
mutex_lock(&vas_pseries_mutex);
-   rc = deallocate_free_window(win);
-   if (rc) {
-   mutex_unlock(&vas_pseries_mutex);
-   return rc;
-   }
+   /*
+* VAS window is already closed in the hypervisor when
+* lost the credit. So just remove the entry from
+* the list, remove task references and free vas_window
+* struct.
+*/
+   if (!(win->vas_win.status & VAS_WIN_NO_CRED_CLOSE)) {
+   rc = deallocate_free_window(win);
+   if (rc) {
+   mutex_unlock(&vas_pseries_mutex);
+   return rc;
+   }
+   } else
+   vascaps[win->win_type].nr_close_wins--;
 
list_del(&win->win_list);
atomic_dec(&caps->nr_used_credits);
+   vascaps[win->win_type].nr_open_windows--;
mutex_unlock(&vas_pseries_mutex);
 
put_vas_user_win_ref(&vwin->task_ref);
@@ -501,6 +526,7 @@ static int __init get_vas_capabilities(u8 feat, enum vas_cop_feat_type type,
memset(vcaps, 0, sizeof(*vcaps));
INIT_LIST_HEAD(&vcaps->list);
 
+   vcaps->feat = feat;
caps = &vcaps->caps;
 
rc = h_query_vas_capabilities(H_QUERY_VAS_CAPABILITIES, feat,
@@ -539,6 +565,168 @@ static int __init get_vas_capabilities(u8 feat, enum vas_cop_feat_type type,
return 0;
 }
 
+/*
+ * The hypervisor reduces the available credits if the LPAR lost core. It
+ * means the excessive windows should not be active and the user space
+ * should not be using these windows to send compression requests to NX.
+ * So the kernel closes the excessive windows and unmap the paste address
+ * such that the user space receives paste instruction failure. Then up to
+ * the user space to fal

[PATCH v4 5/9] powerpc/vas: Map paste address only if window is active

2022-02-19 Thread Haren Myneni


The paste address mapping is done with mmap() after the window is
opened with ioctl. If the OS closes the window in the hypervisor
due to DLPAR after this mmap(), the paste instruction returns
failure until the OS reopens the window. But a DLPAR core removal
can also happen before mmap(), which makes the corresponding
window inactive. So if the window is not active, fail mmap() with
-EACCES and expect the user space to reissue mmap() when the
window is active again, or to open a new window when a credit is
available.
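
On the user-space side, the expected retry could be sketched as
follows. This is only an illustration under assumptions: the helper
name map_paste_address(), the retry limit and the 10 ms back-off are
all hypothetical, and EACCES from mmap() can also have causes other
than an inactive window.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <time.h>

/*
 * Hypothetical helper: map the paste address of an already opened
 * window, retrying while mmap() fails with EACCES. Returns the
 * mapping, or NULL with errno set.
 */
void *map_paste_address(int fd, size_t len, int max_tries)
{
	const struct timespec delay = { .tv_sec = 0, .tv_nsec = 10 * 1000 * 1000 };
	void *addr;
	int i;

	for (i = 0; i < max_tries; i++) {
		addr = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
		if (addr != MAP_FAILED)
			return addr;
		if (errno != EACCES)
			return NULL;	/* not the inactive-window case */
		nanosleep(&delay, NULL);	/* arbitrary 10 ms back-off */
	}

	/* Still inactive after max_tries; the caller may open a new window. */
	errno = EACCES;
	return NULL;
}
```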

Signed-off-by: Haren Myneni 
---
 arch/powerpc/platforms/book3s/vas-api.c | 20 +++-
 1 file changed, 19 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/platforms/book3s/vas-api.c b/arch/powerpc/platforms/book3s/vas-api.c
index f3e421511ea6..eb4489b2b46b 100644
--- a/arch/powerpc/platforms/book3s/vas-api.c
+++ b/arch/powerpc/platforms/book3s/vas-api.c
@@ -496,10 +496,26 @@ static int coproc_mmap(struct file *fp, struct vm_area_struct *vma)
return -EACCES;
}
 
+   /*
+* The initial mmap is done after the window is opened
+* with ioctl. But before mmap(), this window can be closed in
+* the hypervisor due to lost credit (core removal on pseries).
+* So if the window is not active, return mmap() failure with
+* -EACCES and expects the user space reissue mmap() when it
+* is active again or open new window when the credit is available.
+*/
+   mutex_lock(&txwin->task_ref.mmap_mutex);
+   if (txwin->status != VAS_WIN_ACTIVE) {
+   pr_err("%s(): Window is not active\n", __func__);
+   rc = -EACCES;
+   goto out;
+   }
+
paste_addr = cp_inst->coproc->vops->paste_addr(txwin);
if (!paste_addr) {
pr_err("%s(): Window paste address failed\n", __func__);
-   return -EINVAL;
+   rc = -EINVAL;
+   goto out;
}
 
pfn = paste_addr >> PAGE_SHIFT;
@@ -519,6 +535,8 @@ static int coproc_mmap(struct file *fp, struct vm_area_struct *vma)
txwin->task_ref.vma = vma;
vma->vm_ops = &vas_vm_ops;
 
+out:
+   mutex_unlock(&txwin->task_ref.mmap_mutex);
return rc;
 }
 
-- 
2.27.0




[PATCH v4 4/9] powerpc/vas: Return paste instruction failure if no active window

2022-02-19 Thread Haren Myneni


The VAS window may not be active if the system loses credits, and
NX generates a page fault when it receives a request on the
unmapped paste address.

The kernel handles the fault by remapping the new paste address if
the window is active again. Otherwise it returns paste instruction
failure if the instruction that caused the fault was a paste.
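
The instruction check and the emulated failure can be sketched in
plain C as below. The opcode and mask are the values this patch adds to
ppc-opcode.h. Everything else is an assumption for illustration: struct
fake_regs stands in for the two pt_regs fields involved, and the CR0
mask assumes CR0 occupies the top four bits of the 32-bit condition
register.

```c
#include <stdint.h>

/* Opcode and mask as added by this patch in ppc-opcode.h. */
#define PPC_INST_PASTE		0x7c20070dU
#define PPC_INST_PASTE_MASK	0xfc2007ffU

/* Assumption: CR0 is the top nibble of CR; bits 0-2 are LT/GT/EQ. */
#define CR0_LT_GT_EQ		0xe0000000U

/* Simplified stand-in for the pt_regs fields the emulation touches. */
struct fake_regs {
	uint64_t nip;	/* next instruction pointer */
	uint32_t ccr;	/* condition register */
};

/* The mask ignores the RA/RB register fields of the instruction. */
int is_paste(uint32_t instword)
{
	return (instword & PPC_INST_PASTE_MASK) == PPC_INST_PASTE;
}

/*
 * Returns 0 and emulates a failed paste if instword is a paste
 * instruction; returns -1 and leaves regs untouched otherwise.
 */
int fail_paste(struct fake_regs *regs, uint32_t instword)
{
	if (!is_paste(instword))
		return -1;

	regs->ccr &= ~CR0_LT_GT_EQ;	/* CR0[EQ] clear: paste failed */
	regs->nip += 4;			/* skip the 4-byte instruction */
	return 0;
}
```

User space that checks CR0[EQ] after a paste then sees an ordinary
failure and can retry or fall back to software compression.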

Signed-off-by: Nicholas Piggin 
Signed-off-by: Haren Myneni 
---
 arch/powerpc/include/asm/ppc-opcode.h   |  2 +
 arch/powerpc/platforms/book3s/vas-api.c | 55 -
 2 files changed, 56 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/ppc-opcode.h b/arch/powerpc/include/asm/ppc-opcode.h
index 9675303b724e..82f1f0041c6f 100644
--- a/arch/powerpc/include/asm/ppc-opcode.h
+++ b/arch/powerpc/include/asm/ppc-opcode.h
@@ -262,6 +262,8 @@
 #define PPC_INST_MFSPR_PVR         0x7c1f42a6
 #define PPC_INST_MFSPR_PVR_MASK    0xfc1ffffe
 #define PPC_INST_MTMSRD            0x7c000164
+#define PPC_INST_PASTE             0x7c20070d
+#define PPC_INST_PASTE_MASK        0xfc2007ff
 #define PPC_INST_POPCNTB           0x7c0000f4
 #define PPC_INST_POPCNTB_MASK      0xfc0007fe
 #define PPC_INST_RFEBB             0x4c000124
diff --git a/arch/powerpc/platforms/book3s/vas-api.c b/arch/powerpc/platforms/book3s/vas-api.c
index f359e7b2bf90..f3e421511ea6 100644
--- a/arch/powerpc/platforms/book3s/vas-api.c
+++ b/arch/powerpc/platforms/book3s/vas-api.c
@@ -351,6 +351,41 @@ static int coproc_release(struct inode *inode, struct file *fp)
return 0;
 }
 
+/*
+ * If the executed instruction that caused the fault was a paste, then
+ * clear regs CR0[EQ], advance NIP, and return 0. Else return error code.
+ */
+static int do_fail_paste(void)
+{
+   struct pt_regs *regs = current->thread.regs;
+   u32 instword;
+
+   if (WARN_ON_ONCE(!regs))
+   return -EINVAL;
+
+   if (WARN_ON_ONCE(!user_mode(regs)))
+   return -EINVAL;
+
+   /*
+* If we couldn't translate the instruction, the driver should
+* return success without handling the fault, it will be retried
+* or the instruction fetch will fault.
+*/
+   if (get_user(instword, (u32 __user *)(regs->nip)))
+   return -EAGAIN;
+
+   /*
+* Not a paste instruction, driver may fail the fault.
+*/
+   if ((instword & PPC_INST_PASTE_MASK) != PPC_INST_PASTE)
+   return -ENOENT;
+
+   regs->ccr &= ~0xe0000000;   /* Clear CR0[0-2] to fail paste */
+   regs_add_return_ip(regs, 4);    /* Emulate the paste */
+
+   return 0;
+}
+
 /*
  * This fault handler is invoked when the core generates page fault on
  * the paste address. Happens if the kernel closes window in hypervisor
@@ -408,9 +443,27 @@ static vm_fault_t vas_mmap_fault(struct vm_fault *vmf)
}
mutex_unlock(&txwin->task_ref.mmap_mutex);
 
-   return VM_FAULT_SIGBUS;
+   /*
+* Received this fault due to closing the actual window.
+* It can happen during migration or lost credits.
+* Since no mapping, return the paste instruction failure
+* to the user space.
+*/
+   ret = do_fail_paste();
+   /*
+* The user space can retry several times until success (needed
+* for migration) or should fallback to SW compression or
+* manage with the existing open windows if available.
+* Looking at sysfs interface, it can determine whether these
+* failures are coming during migration or core removal:
+* nr_used_credits > nr_total_credits when lost credits
+*/
+   if (!ret || (ret == -EAGAIN))
+   return VM_FAULT_NOPAGE;
 
+   return VM_FAULT_SIGBUS;
 }
+
 static const struct vm_operations_struct vas_vm_ops = {
.fault = vas_mmap_fault,
 };
-- 
2.27.0




[PATCH v4 3/9] powerpc/vas: Add paste address mmap fault handler

2022-02-19 Thread Haren Myneni


The user space opens VAS windows and issues NX requests by pasting
a CRB on the corresponding paste address mmap. When the system loses
credits due to core removal, the kernel has to close the window in
the hypervisor and make the window inactive by unmapping this paste
address. The OS also has to handle NX request page faults if the
user space issues NX requests on such a window.

This handler maps the new paste address with the same VMA when the
window is active again (due to core add with DLPAR). Otherwise it
returns paste failure.

Signed-off-by: Haren Myneni 
---
 arch/powerpc/include/asm/vas.h  | 10 
 arch/powerpc/platforms/book3s/vas-api.c | 68 +
 2 files changed, 78 insertions(+)

diff --git a/arch/powerpc/include/asm/vas.h b/arch/powerpc/include/asm/vas.h
index 57573d9c1e09..27251af18c65 100644
--- a/arch/powerpc/include/asm/vas.h
+++ b/arch/powerpc/include/asm/vas.h
@@ -29,6 +29,12 @@
 #define VAS_THRESH_FIFO_GT_QTR_FULL 2
 #define VAS_THRESH_FIFO_GT_EIGHTH_FULL 3
 
+/*
+ * VAS window Linux status bits
+ */
+#define VAS_WIN_ACTIVE 0x0 /* Used in platform independent */
+   /* vas mmap() */
+
 /*
  * Get/Set bit fields
  */
@@ -59,6 +65,9 @@ struct vas_user_win_ref {
struct pid *pid;/* PID of owner */
struct pid *tgid;   /* Thread group ID of owner */
struct mm_struct *mm;   /* Linux process mm_struct */
+   struct mutex mmap_mutex;/* protects paste address mmap() */
+   /* with DLPAR close/open windows */
+   struct vm_area_struct *vma; /* Save VMA and used in DLPAR ops */
 };
 
 /*
@@ -67,6 +76,7 @@ struct vas_user_win_ref {
 struct vas_window {
u32 winid;
u32 wcreds_max; /* Window credits */
+   u32 status; /* Window status used in OS */
enum vas_cop_type cop;
struct vas_user_win_ref task_ref;
char *dbgname;
diff --git a/arch/powerpc/platforms/book3s/vas-api.c b/arch/powerpc/platforms/book3s/vas-api.c
index 4d82c92ddd52..f359e7b2bf90 100644
--- a/arch/powerpc/platforms/book3s/vas-api.c
+++ b/arch/powerpc/platforms/book3s/vas-api.c
@@ -316,6 +316,7 @@ static int coproc_ioc_tx_win_open(struct file *fp, unsigned long arg)
return PTR_ERR(txwin);
}
 
+   mutex_init(&txwin->task_ref.mmap_mutex);
cp_inst->txwin = txwin;
 
return 0;
@@ -350,6 +351,70 @@ static int coproc_release(struct inode *inode, struct file *fp)
return 0;
 }
 
+/*
+ * This fault handler is invoked when the core generates page fault on
+ * the paste address. Happens if the kernel closes window in hypervisor
+ * (on pseries) due to lost credit or the paste address is not mapped.
+ */
+static vm_fault_t vas_mmap_fault(struct vm_fault *vmf)
+{
+   struct vm_area_struct *vma = vmf->vma;
+   struct file *fp = vma->vm_file;
+   struct coproc_instance *cp_inst = fp->private_data;
+   struct vas_window *txwin;
+   u64 paste_addr;
+   int ret;
+
+   /*
+* window is not opened. Shouldn't expect this error.
+*/
+   if (!cp_inst || !cp_inst->txwin) {
+   pr_err("%s(): Unexpected fault on paste address with TX window closed\n",
+   __func__);
+   return VM_FAULT_SIGBUS;
+   }
+
+   txwin = cp_inst->txwin;
+   /*
+* When the LPAR lost credits due to core removal or during
+* migration, invalidate the existing mapping for the current
+* paste addresses and set windows in-active (zap_page_range in
+* reconfig_close_windows()).
+* New mapping will be done later after migration or new credits
+* available. So continue to receive faults if the user space
+* issue NX request.
+*/
+   if (txwin->task_ref.vma != vmf->vma) {
+   pr_err("%s(): No previous mapping with paste address\n",
+   __func__);
+   return VM_FAULT_SIGBUS;
+   }
+
+   mutex_lock(&txwin->task_ref.mmap_mutex);
+   /*
+* The window may be inactive due to lost credit (Ex: core
+* removal with DLPAR). If the window is active again when
+* the credit is available, map the new paste address at the
+* the window virtual address.
+*/
+   if (txwin->status == VAS_WIN_ACTIVE) {
+   paste_addr = cp_inst->coproc->vops->paste_addr(txwin);
+   if (paste_addr) {
+   ret = vmf_insert_pfn(vma, vma->vm_start,
+   (paste_addr >> PAGE_SHIFT));
+   mutex_unlock(&txwin->task_ref.mmap_mutex);
+   return ret;
+   }
+   }
+   mutex_unlock(&txwin->task_ref.mmap_mutex);
+
+   return VM_FAULT_SIGBUS;
+
+}
+static const struct vm_operations_struct vas_vm_ops = {
+   .fault = vas_mmap_fau

[PATCH v4 2/9] powerpc/pseries/vas: Save PID in pseries_vas_window struct

2022-02-19 Thread Haren Myneni


The kernel sets the PID on the VAS window when it is opened in
the hypervisor. During DLPAR operations, windows can be closed and
reopened in the hypervisor when credits become available. So save
this PID in the pseries_vas_window struct when the window is opened
initially and reuse it later during DLPAR operations.

Signed-off-by: Haren Myneni 
---
 arch/powerpc/platforms/pseries/vas.c | 9 +
 arch/powerpc/platforms/pseries/vas.h | 1 +
 2 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/vas.c b/arch/powerpc/platforms/pseries/vas.c
index 18aae037ffe9..1035446f985b 100644
--- a/arch/powerpc/platforms/pseries/vas.c
+++ b/arch/powerpc/platforms/pseries/vas.c
@@ -107,7 +107,6 @@ static int h_deallocate_vas_window(u64 winid)
 static int h_modify_vas_window(struct pseries_vas_window *win)
 {
long rc;
-   u32 lpid = mfspr(SPRN_PID);
 
/*
 * AMR value is not supported in Linux VAS implementation.
@@ -115,7 +114,7 @@ static int h_modify_vas_window(struct pseries_vas_window *win)
 */
do {
rc = plpar_hcall_norets(H_MODIFY_VAS_WINDOW,
-   win->vas_win.winid, lpid, 0,
+   win->vas_win.winid, win->pid, 0,
VAS_MOD_WIN_FLAGS, 0);
 
rc = hcall_return_busy_check(rc);
@@ -124,8 +123,8 @@ static int h_modify_vas_window(struct pseries_vas_window *win)
if (rc == H_SUCCESS)
return 0;
 
-   pr_err("H_MODIFY_VAS_WINDOW error: %ld, winid %u lpid %u\n",
-   rc, win->vas_win.winid, lpid);
+   pr_err("H_MODIFY_VAS_WINDOW error: %ld, winid %u pid %u\n",
+   rc, win->vas_win.winid, win->pid);
return -EIO;
 }
 
@@ -338,6 +337,8 @@ static struct vas_window *vas_allocate_window(int vas_id, u64 flags,
}
}
 
+   txwin->pid = mfspr(SPRN_PID);
+
/*
 * Allocate / Deallocate window hcalls and setup / free IRQs
 * have to be protected with mutex.
diff --git a/arch/powerpc/platforms/pseries/vas.h b/arch/powerpc/platforms/pseries/vas.h
index d6ea8ab8b07a..2872532ed72a 100644
--- a/arch/powerpc/platforms/pseries/vas.h
+++ b/arch/powerpc/platforms/pseries/vas.h
@@ -114,6 +114,7 @@ struct pseries_vas_window {
u64 domain[6];  /* Associativity domain Ids */
/* this window is allocated */
u64 util;
+   u32 pid;/* PID associated with this window */
 
/* List of windows opened which is used for LPM */
struct list_head win_list;
-- 
2.27.0




[PATCH v4 1/9] powerpc/pseries/vas: Use common names in VAS capability structure

2022-02-19 Thread Haren Myneni


nr_total_credits/nr_used_credits provide the credit usage to user
space via sysfs, and the same interface can be used on PowerNV in
the future. Rename the fields so that the names are applicable to
both pseries and PowerNV.

Signed-off-by: Haren Myneni 
Reviewed-by: Nicholas Piggin 
---
 arch/powerpc/platforms/pseries/vas.c | 10 +-
 arch/powerpc/platforms/pseries/vas.h |  5 ++---
 2 files changed, 7 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/vas.c b/arch/powerpc/platforms/pseries/vas.c
index d243ddc58827..18aae037ffe9 100644
--- a/arch/powerpc/platforms/pseries/vas.c
+++ b/arch/powerpc/platforms/pseries/vas.c
@@ -310,8 +310,8 @@ static struct vas_window *vas_allocate_window(int vas_id, u64 flags,
 
cop_feat_caps = &caps->caps;
 
-   if (atomic_inc_return(&cop_feat_caps->used_lpar_creds) >
-   atomic_read(&cop_feat_caps->target_lpar_creds)) {
+   if (atomic_inc_return(&cop_feat_caps->nr_used_credits) >
+   atomic_read(&cop_feat_caps->nr_total_credits)) {
pr_err("Credits are not available to allocate window\n");
rc = -EINVAL;
goto out;
@@ -385,7 +385,7 @@ static struct vas_window *vas_allocate_window(int vas_id, u64 flags,
free_irq_setup(txwin);
h_deallocate_vas_window(txwin->vas_win.winid);
 out:
-   atomic_dec(&cop_feat_caps->used_lpar_creds);
+   atomic_dec(&cop_feat_caps->nr_used_credits);
kfree(txwin);
return ERR_PTR(rc);
 }
@@ -445,7 +445,7 @@ static int vas_deallocate_window(struct vas_window *vwin)
}
 
list_del(&win->win_list);
-   atomic_dec(&caps->used_lpar_creds);
+   atomic_dec(&caps->nr_used_credits);
mutex_unlock(&vas_pseries_mutex);
 
put_vas_user_win_ref(&vwin->task_ref);
@@ -521,7 +521,7 @@ static int __init get_vas_capabilities(u8 feat, enum vas_cop_feat_type type,
}
caps->max_lpar_creds = be16_to_cpu(hv_caps->max_lpar_creds);
caps->max_win_creds = be16_to_cpu(hv_caps->max_win_creds);
-   atomic_set(&caps->target_lpar_creds,
+   atomic_set(&caps->nr_total_credits,
   be16_to_cpu(hv_caps->target_lpar_creds));
if (feat == VAS_GZIP_DEF_FEAT) {
caps->def_lpar_creds = be16_to_cpu(hv_caps->def_lpar_creds);
diff --git a/arch/powerpc/platforms/pseries/vas.h b/arch/powerpc/platforms/pseries/vas.h
index 4ecb3fcabd10..d6ea8ab8b07a 100644
--- a/arch/powerpc/platforms/pseries/vas.h
+++ b/arch/powerpc/platforms/pseries/vas.h
@@ -72,9 +72,8 @@ struct vas_cop_feat_caps {
};
/* Total LPAR available credits. Can be different from max LPAR */
/* credits due to DLPAR operation */
-   atomic_t target_lpar_creds;
-   atomic_t used_lpar_creds; /* Used credits so far */
-   u16 avail_lpar_creds; /* Remaining available credits */
+   atomic_t nr_total_credits;   /* Total credits assigned to LPAR */
+   atomic_t nr_used_credits;    /* Used credits so far */
 };
 
 /*
-- 
2.27.0




[PATCH v4 0/9] powerpc/pseries/vas: NXGZIP support with DLPAR

2022-02-19 Thread Haren Myneni


PowerPC provides HW compression with the NX coprocessor. This feature
is available on both PowerNV and PowerVM and is supported in Linux.
Since each powerpc chip has one NX coprocessor, VAS introduces the
concept of windows / credits to manage access to this hardware
resource. On PowerVM, these limited resources should be available
across all LPARs. So the hypervisor assigns a specific number of
credits to each LPAR based on its processor entitlement so that one
LPAR does not overload NX. The hypervisor can reject a window open
request from a partition if it exceeds its credit limit (1 credit per
window).
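
The one-credit-per-window accounting can be modelled in user-space C
as below. This is a sketch only: it uses C11 atomics in place of the
kernel's atomic_t, struct credit_caps and both function names are
hypothetical, and the -EINVAL result mirrors the check in
vas_allocate_window() shown in patch 1.

```c
#include <errno.h>
#include <stdatomic.h>

/* Simplified stand-in for the per-feature capability counters. */
struct credit_caps {
	atomic_int nr_total_credits;	/* target credits from the hypervisor */
	atomic_int nr_used_credits;	/* one per open window */
};

/* Take one credit for a new window; -EINVAL if the limit is exceeded. */
int take_credit(struct credit_caps *caps)
{
	/* atomic_fetch_add() returns the old value, so add 1 for the new one. */
	int used = atomic_fetch_add(&caps->nr_used_credits, 1) + 1;

	if (used > atomic_load(&caps->nr_total_credits)) {
		/* Undo the increment, as the kernel's error path does. */
		atomic_fetch_sub(&caps->nr_used_credits, 1);
		return -EINVAL;
	}
	return 0;
}

/* Give the credit back when the window is closed. */
void release_credit(struct credit_caps *caps)
{
	atomic_fetch_sub(&caps->nr_used_credits, 1);
}
```

Incrementing first and undoing on failure keeps the check race-free
without holding a lock around the comparison.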

The total number of target credits in a partition can therefore change
when the core configuration is modified. The hypervisor expects the
partition to adjust its window usage to the new target credits. For
example, if the partition uses more credits than the new target, it
should close the excess windows so that the NX resource is available
to other partitions.

This patch series enables the OS to support this dynamic credit
management with DLPAR core removal/add.

Core removal operation:
- Get the new VAS capabilities from the hypervisor when the DLPAR
  notifier is received. These capabilities provide the new target
  credits based on the new processor entitlement. In the case of QoS
  credit changes, the notification is issued by updating
  nr_total_credits via sysfs.
- If the partition already uses more than the new target credits,
  the kernel selects windows, unmaps their current paste addresses
  and closes them in the hypervisor. It uses LIFO to identify these
  windows: the last windows that were opened are the first ones to
  be closed.
- When the user space issues requests on these windows, NX generates
  a page fault on the unmapped paste address. The kernel handles the
  fault by returning paste instruction failure if the window is not
  active (meaning the paste address is unmapped). It is then up to
  the library / user space to fall back to SW compression or manage
  with the current windows.
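
The window selection on core removal can be sketched as follows. This
is a simplified model, not the kernel code: it uses an array in open
order instead of the kernel's linked list, and struct sketch_win, the
WIN_* constants and close_excess_windows() are hypothetical names.

```c
#include <stddef.h>

#define WIN_ACTIVE		0x0U
#define WIN_NO_CRED_CLOSE	0x1U

/* Simplified window: only the status bit matters for this sketch. */
struct sketch_win {
	unsigned int status;
};

/*
 * wins[] is in open order (wins[0] was opened first). Mark the most
 * recently opened active windows closed until only new_target remain
 * active. Returns the number of windows closed by this call.
 */
size_t close_excess_windows(struct sketch_win *wins, size_t nr_wins,
			    size_t new_target)
{
	size_t active = 0, closed = 0, i;

	for (i = 0; i < nr_wins; i++)
		if (wins[i].status == WIN_ACTIVE)
			active++;

	/* Walk from the newest window backwards: last opened, first closed. */
	for (i = nr_wins; i > 0 && active > new_target; i--) {
		if (wins[i - 1].status != WIN_ACTIVE)
			continue;
		wins[i - 1].status |= WIN_NO_CRED_CLOSE;
		active--;
		closed++;
	}
	return closed;
}
```

In the kernel, closing a window additionally unmaps its paste address
and deallocates the window in the hypervisor, which this model omits.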

Core add operation:
- The kernel sees the increased target credits from the new VAS
  capabilities.
- It scans the window list for windows that were closed in the
  hypervisor due to lost credits and selects windows based on the
  same FIFO.
- It makes the corresponding windows active and remaps the new
  paste address within the same VMA in the fault handler.
- User space should then expect paste to succeed.
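
The core add steps can be modelled in the same spirit. In this sketch
struct lost_win and reopen_windows() are hypothetical names, and the
array scan order is an assumption made for illustration; the cover
letter only says the same FIFO is used.

```c
#include <stddef.h>

/* Hypothetical, simplified window record; not the kernel's struct. */
struct lost_win {
	int id;
	int active;	/* 0 = closed earlier because a credit was lost */
};

/*
 * Reopen closed windows while spare credits (target - active) remain.
 * Returns the number of windows reopened.
 */
static int reopen_windows(struct lost_win *wins, size_t n,
			  int active, int target)
{
	int spare = target - active;
	int reopened = 0;

	for (size_t i = 0; i < n && reopened < spare; i++) {
		if (!wins[i].active) {
			wins[i].active = 1;	/* stands in for reopen HCALL */
			reopened++;
		}
	}
	return reopened;
}
```

The paste address is not remapped here; per the steps above that
happens later in the fault handler, when user space next touches the
window.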

Patch 1: Define common names for sysfs target/used/avail_creds so
 that the same sysfs entries can be used on PowerNV later.
Patch 2: Save the PID in the vas window struct during the initial
 window open and use it when reopening later.
Patch 3: Add a new mmap fault handler which handles the page fault
 from NX on the paste address.
Patch 4: Return paste instruction failure if the window is not
 active.
Patch 5: If the window is closed in the hypervisor before user
 space issues the initial mmap(), return -EACCES failure.
Patch 6: Close windows in the hypervisor when the partition's
 usage exceeds the new target credits.
Patch 7: When credits are available, reopen windows that were
 closed earlier with core removal.
Patch 8 & 9: User space determines the credit usage with the sysfs
 target/avail/used_creds interfaces. drmgr uses target_creds
 to notify the OS of QoS credit changes.

Thanks to Nicholas Piggin and Aneesh Kumar for the valuable suggestions
on the NXGZIP design to support DLPAR operations.

Changes in v2:
- Rebase on 5.16-rc5
- Use list safe functions to iterate windows list
- Changes to show the actual value in sysfs used_credits even though
  some windows are inactive with core removal. Reflects a negative
  value in sysfs avail_creds to let userspace know that it opened
  more windows than the current maximum LPAR credits.
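
The negative avail_creds value mentioned in the v2 changes is plain
signed arithmetic. A minimal sketch, assuming the value is computed as
target minus used (the helper name is hypothetical):

```c
/*
 * Signed on purpose: after core removal, used can exceed target, and
 * the negative result tells user space how many windows are over the
 * limit.
 */
static int avail_creds(int target_creds, int used_creds)
{
	return target_creds - used_creds;
}
```

For example, a partition with 24 windows open whose target drops to 20
would report -4.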

Changes in v3:
- Rebase on 5.16
- Reconfigure VAS windows only for CPU hotplug events.

Changes in v4:
- Rebase on 5.17-rc4
- Changes based on comments from Nicholas Piggin
- Included VAS DLPAR notifier code in 'Close windows with DLPAR'
  patch instead of as a separate patch
- Patches reordering and other changes

Haren Myneni (9):
  powerpc/pseries/vas: Use common names in VAS capability structure
  powerpc/pseries/vas: Save PID in pseries_vas_window struct
  powerpc/vas: Add paste address mmap fault handler
  powerpc/vas: Return paste instruction failure if no active window
  powerpc/vas: Map paste address only if window is active
  powerpc/pseries/vas: Close windows with DLPAR core removal
  powerpc/pseries/vas: Reopen windows with DLPAR core add
  powerpc/pseries/vas: sysfs interface to export capabilities
  powerpc/pseries/vas: Write 'nr_total_credits' for QoS credits change

 arch/powerpc/include/asm/ppc-opcode.h  |   2 +
 arch/powerpc/include/asm/vas.h |  12 +
 arch/powerpc/platforms/book3s/vas-api.c| 141 -
 arch/powerpc/platforms/pseries/Makefile|   2 +-
 arch/powerpc/platforms/pse

Re: [PATCH v3 07/10] powerpc/vas: Add paste address mmap fault handler

2022-02-15 Thread Haren Myneni
On Mon, 2022-02-14 at 13:37 +1000, Nicholas Piggin wrote:
> Excerpts from Haren Myneni's message of January 22, 2022 5:59 am:
> > The user space opens VAS windows and issues NX requests by pasting
> > CRB on the corresponding paste address mmap. When the system looses
> 
> s/loose/lose/g throughout the series.
> 
> > credits due to core removal, the kernel has to close the window in
> > the hypervisor
> 
> By the way what if the kernel does not close the window and we try
> to access the memory? The hypervisor will inject faults?

Requests on already opened windows will succeed even if the LPAR
has lost credits (due to core removal). But the hypervisor expects the
LPAR to behave like a good citizen and give up resources with core
removal. So we do not see any issue with the current upstream code for
DLPAR removal.

But we will have an issue with migration. The hypervisor knows the
actual number of credits assigned to the source LPAR before migration,
so it assigns the same number on the destination.

> 
> > and make the window inactive by unmapping this paste
> > address. Also the OS has to handle NX request page faults if the
> > user
> > space issue NX requests.
> > 
> > This handler remap the new paste address with the same VMA when the
> > window is active again (due to core add with DLPAR). Otherwise
> > returns paste failure.
> 
> This patch should come before (or combined with) the patch that zaps 
> PTEs. Putting it afterwards is logically backward. Even if you don't
> really expect the series to half work in a half bisected state, it
> just makes the changes easier to follow.
> 
> Thanks,
> Nick
> 
> > Signed-off-by: Haren Myneni 
> > ---
> >  arch/powerpc/platforms/book3s/vas-api.c | 60
> > +
> >  1 file changed, 60 insertions(+)
> > 
> > diff --git a/arch/powerpc/platforms/book3s/vas-api.c
> > b/arch/powerpc/platforms/book3s/vas-api.c
> > index 2d06bd1b1935..5ceba75c13eb 100644
> > --- a/arch/powerpc/platforms/book3s/vas-api.c
> > +++ b/arch/powerpc/platforms/book3s/vas-api.c
> > @@ -351,6 +351,65 @@ static int coproc_release(struct inode *inode,
> > struct file *fp)
> > return 0;
> >  }
> >  
> > +/*
> > + * This fault handler is invoked when the VAS/NX generates page
> > fault on
> > + * the paste address.
> 
> The core generates the page fault here, right? paste destination is 
> translated by the core MMU (the instruction is executed in the core,
> after all).

Correct. Will update.
> 
> > Happens if the kernel closes window in hypervisor
> > + * (on PowerVM) due to lost credit or the paste address is not
> > mapped.
> 
> Call it pseries everywhere if you're talking about the API and Linux
> code, rather than some specific quirk or issue of the PowerVM
> implementation.
> 
> > + */
> > +static vm_fault_t vas_mmap_fault(struct vm_fault *vmf)
> > +{
> > +   struct vm_area_struct *vma = vmf->vma;
> > +   struct file *fp = vma->vm_file;
> > +   struct coproc_instance *cp_inst = fp->private_data;
> > +   struct vas_window *txwin;
> > +   u64 paste_addr;
> > +   int ret;
> > +
> > +   /*
> > +* window is not opened. Shouldn't expect this error.
> > +*/
> > +   if (!cp_inst || !cp_inst->txwin) {
> > +   pr_err("%s(): No send window open?\n", __func__);
> 
> Probably don't put PR_ERROR logs with question marks in them. The
> administrator knows less than you to answer the question.
> 
> "Unexpected fault on paste address with TX window closed" etc.
> 
> Then you don't need the comment either because the message explains
> it.
> 
> > +   return VM_FAULT_SIGBUS;
> > +   }
> > +
> > +   txwin = cp_inst->txwin;
> > +   /*
> > +* Fault is coming due to missing from the original mmap.
> 
> Rather than a vague comment like this (which we already know a fault 
> comes from a missing or insufficient PTE), you could point to exactly
> the code which zaps the PTEs.
> 
> > +* Can happen only when the window is closed due to lost
> > +* credit before mmap() or the user space issued NX request
> > +* without mapping.
> > +*/
> > +   if (txwin->task_ref.vma != vmf->vma) {
> > +   pr_err("%s(): No previous mapping with paste
> > address\n",
> > +   __func__);
> > +   return VM_FAULT_SIGBUS;
> > +   }
> > +
> > +   mutex_lock(&txwin->task_ref.mmap_mutex);
> > +   /*
> > +* The window may be inactive due to lost 

Re: [PATCH v3 04/10] powerpc/pseries/vas: Reopen windows with DLPAR core add

2022-02-15 Thread Haren Myneni
On Mon, 2022-02-14 at 13:08 +1000, Nicholas Piggin wrote:
> Excerpts from Haren Myneni's message of January 22, 2022 5:56 am:
> > VAS windows can be closed in the hypervisor due to lost credits
> > when the core is removed. If these credits are available later
> > for core add, reopen these windows and set them active. When the
> > kernel sees page fault on the paste address, it creates new mapping
> > on the new paste address. Then the user space can continue to use
> > these windows and send HW compression requests to NX successfully.
> 
> Any reason to put this before the close windows patch? It would be
> more logical to put it afterwards AFAIKS.

reconfig_open_windows() just reopens windows and sets the status flag
when windows have been closed. I thought adding the handler first,
before closing / unmapping, would help during git bisect.

I can change it.

> 
> > Signed-off-by: Haren Myneni 
> > ---
> >  arch/powerpc/include/asm/vas.h  |  16 +++
> >  arch/powerpc/platforms/book3s/vas-api.c |   1 +
> >  arch/powerpc/platforms/pseries/vas.c| 144
> > 
> >  arch/powerpc/platforms/pseries/vas.h|   8 +-
> >  4 files changed, 163 insertions(+), 6 deletions(-)
> > 
> > diff --git a/arch/powerpc/include/asm/vas.h
> > b/arch/powerpc/include/asm/vas.h
> > index 57573d9c1e09..f1efe86563cc 100644
> > --- a/arch/powerpc/include/asm/vas.h
> > +++ b/arch/powerpc/include/asm/vas.h
> > @@ -29,6 +29,19 @@
> >  #define VAS_THRESH_FIFO_GT_QTR_FULL2
> >  #define VAS_THRESH_FIFO_GT_EIGHTH_FULL 3
> >  
> > +/*
> > + * VAS window status
> > + */
> > +#define VAS_WIN_ACTIVE 0x0 /* Used in platform
> > independent */
> > +   /* vas mmap() */
> > +/* The hypervisor returns these values */
> > +#define VAS_WIN_CLOSED 0x0001
> > +#define VAS_WIN_INACTIVE   0x0002 /* Inactive due to HW
> > failure */
> > +#define VAS_WIN_MOD_IN_PROCESS 0x0003 /* Process of being
> > modified, */
> 
> While you're moving these and adding a comment, it would be good to 
> list what hcalls they are relevant to. H_QUERY_VAS_WINDOW (which is
> not
> used anywhere yet?) These are also a 1-byte field, so '0x00', '0x01'
> etc
> would be more appropriate.

Yes, these status values are assigned by the hypervisor and we are not
using / supporting them right now. I just listed them as defined in the
PAPR NX HCALLs.

For example: the OS can modify an existing window with the modify HCALL
while the window is in use. During this time, the hypervisor returns
this status for the window query HCALL.
 
> 
> > +  /* deallocated, or quiesced
> > */
> > +/* Linux status bits */
> > +#define VAS_WIN_NO_CRED_CLOSE  0x0004 /* Window is closed
> > due to */
> > +  /* lost credit */
> 
> This is mixing a user defined bit field with hcall API value field.
> You
> also AFAIKS as yet don't fill in the hypervisor status anywhere.
> 
> I would make this it's own field entirely. A boolean would be nice,
> if
> possible.

Yes, HV status bits are not used here.

In case the window status is reported through sysfs in the future, I
thought that it would be simpler to have one status flag.

I can add 'hv_status' for the hypervisor status flag in the
pseries_vas_window struct and 'status' for Linux in the vas_window struct.

We also need one more status for migration, so a boolean may not work.

> 
> >  /*
> >   * Get/Set bit fields
> >   */
> > @@ -59,6 +72,8 @@ struct vas_user_win_ref {
> > struct pid *pid;/* PID of owner */
> > struct pid *tgid;   /* Thread group ID of owner */
> > struct mm_struct *mm;   /* Linux process mm_struct */
> > +   struct mutex mmap_mutex;/* protects paste address mmap() */
> > +   /* with DLPAR close/open
> > windows */
> >  };
> >  
> >  /*
> > @@ -67,6 +82,7 @@ struct vas_user_win_ref {
> >  struct vas_window {
> > u32 winid;
> > u32 wcreds_max; /* Window credits */
> > +   u32 status;
> > enum vas_cop_type cop;
> > struct vas_user_win_ref task_ref;
> > char *dbgname;
> > diff --git a/arch/powerpc/platforms/book3s/vas-api.c
> > b/arch/powerpc/platforms/book3s/vas-api.c
> > index 4d82c92ddd52..2b0ced611f32 100644
> > --- a/arch/powerpc/platforms/book3s/vas-api.c
> > +++ b/arch/powerpc/platforms/book3s/vas-api.c
> > @@ -316,6 +316,7 @@ static int coproc_ioc_tx_win_open(struct file
> > *fp, unsigned long arg)
> > return P

Re: [PATCH v3 06/10] powerpc/vas: Map paste address only if window is active

2022-02-15 Thread Haren Myneni
On Mon, 2022-02-14 at 13:20 +1000, Nicholas Piggin wrote:
> Excerpts from Haren Myneni's message of January 22, 2022 5:58 am:
> > The paste address mapping is done with mmap() after the window is
> > opened with ioctl. But the window can be closed due to lost credit
> > due to core removal before mmap(). So if the window is not active,
> > return mmap() failure with -EACCES and expects the user space
> > reissue
> > mmap() when the window is active or open new window when the credit
> > is available.
> > 
> > Signed-off-by: Haren Myneni 
> > ---
> >  arch/powerpc/platforms/book3s/vas-api.c | 21 -
> >  1 file changed, 20 insertions(+), 1 deletion(-)
> > 
> > diff --git a/arch/powerpc/platforms/book3s/vas-api.c
> > b/arch/powerpc/platforms/book3s/vas-api.c
> > index a63fd48e34a7..2d06bd1b1935 100644
> > --- a/arch/powerpc/platforms/book3s/vas-api.c
> > +++ b/arch/powerpc/platforms/book3s/vas-api.c
> > @@ -379,10 +379,27 @@ static int coproc_mmap(struct file *fp,
> > struct vm_area_struct *vma)
> > return -EACCES;
> > }
> >  
> > +   /*
> > +* The initial mapping is done after the window is opened
> > +* with ioctl. But this window might have been closed
> > +* due to lost credit (core removal on PowerVM) before mmap().
> 
> What does "initial mapping" mean?
> 
> mapping ~= mmap, in kernel speak.

yes, the initial mapping is done with the actual mmap() call. 
> 
> You will have to differentiate the concepts.
> 
> > +* So if the window is not active, return mmap() failure
> > +* with -EACCES and expects the user space reconfigure (mmap)
> > +* window when it is active again or open new window when
> > +* the credit is available.
> > +*/
> > +   mutex_lock(&txwin->task_ref.mmap_mutex);
> > +   if (txwin->status != VAS_WIN_ACTIVE) {
> > +   pr_err("%s(): Window is not active\n", __func__);
> > +   rc = -EACCES;
> > +   goto out;
> > +   }
> > +
> > paste_addr = cp_inst->coproc->vops->paste_addr(txwin);
> > if (!paste_addr) {
> > pr_err("%s(): Window paste address failed\n",
> > __func__);
> > -   return -EINVAL;
> > +   rc = -EINVAL;
> > +   goto out;
> > }
> >  
> > pfn = paste_addr >> PAGE_SHIFT;
> > @@ -401,6 +418,8 @@ static int coproc_mmap(struct file *fp, struct
> > vm_area_struct *vma)
> >  
> > txwin->task_ref.vma = vma;
> >  
> > +out:
> > +   mutex_unlock(&txwin->task_ref.mmap_mutex);
> 
> If the hypervisor can revoke a window at any point with DLPAR, it's
> not 
> clear *why* this is needed. The hypervisor could cause your window
> to 
> close right after this mmap() returns, right? So an explanation for 
> exactly what this patch is needed for beyond that would help.

Yes, the window can be closed by the OS due to DLPAR after mmap()
returns successfully, which is the normal case - paste instruction
failure until the window is reopened.

But this patch is mainly for the case where user space opens the window
and DLPAR happens before user space issues mmap().

I will add more description to the commit log.
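
The mmap() ordering case can be summarized with a small user-space
model of the check in the quoted patch. The names here (struct tx_win,
paste_mmap_check()) are hypothetical, and the mmap_mutex that
serializes this check against DLPAR close/open is deliberately left
out of the sketch.

```c
#include <errno.h>

/* Hypothetical mirror of the two window states relevant to mmap(). */
enum win_status {
	WIN_ACTIVE = 0,
	WIN_NO_CRED_CLOSE = 4,	/* closed because the credit was lost */
};

struct tx_win {
	enum win_status status;
	unsigned long paste_addr;	/* 0 if no paste address is available */
};

/*
 * Decide whether the paste address may be mapped.
 * Returns 0 if so, -EACCES if the window was closed before mmap(),
 * or -EINVAL if the window has no paste address.
 */
static int paste_mmap_check(const struct tx_win *win)
{
	if (win->status != WIN_ACTIVE)
		return -EACCES;
	if (!win->paste_addr)
		return -EINVAL;
	return 0;
}
```

On -EACCES, user space is expected to reissue mmap() once the window
is active again, or to open a new window when a credit is available.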

Thanks
Haren

> 
> Thanks,
> Nick



Re: [PATCH v3 05/10] powerpc/pseries/vas: Close windows with DLPAR core removal

2022-02-15 Thread Haren Myneni
On Mon, 2022-02-14 at 13:17 +1000, Nicholas Piggin wrote:
> Excerpts from Haren Myneni's message of January 22, 2022 5:57 am:
> > The hypervisor reduces the available credits if the core is removed
> > from the LPAR. So there is possibility of using excessive credits
> > (windows) in the LPAR and the hypervisor expects the system to
> > close
> > the excessive windows. Even though the user space can continue to
> > use
> > these windows to send compression requests to NX, the hypervisor
> > expects
> > the LPAR to reduce these windows usage so that NX load can be
> > equally
> > distributed across all LPARs in the system.
> > 
> > When the DLPAR notifier is received, get the new VAS capabilities
> > from
> > the hypervisor and close the excessive windows in the hypervisor.
> > Also
> > the kernel unmaps the paste address so that the user space receives
> > paste
> > failure until these windows are active with the later DLPAR (core
> > add).
> 
> The changelog needs work. Unmapping the window and the ramifications
> of
> that needs more description here or in comments.

Thanks, will change.
> 
> > Signed-off-by: Haren Myneni 
> > ---
> >  arch/powerpc/include/asm/vas.h  |   1 +
> >  arch/powerpc/platforms/book3s/vas-api.c |   2 +
> >  arch/powerpc/platforms/pseries/vas.c| 117
> > ++--
> >  arch/powerpc/platforms/pseries/vas.h|   1 +
> >  4 files changed, 112 insertions(+), 9 deletions(-)
> > 
> > diff --git a/arch/powerpc/include/asm/vas.h
> > b/arch/powerpc/include/asm/vas.h
> > index f1efe86563cc..ddc05a8fc2e3 100644
> > --- a/arch/powerpc/include/asm/vas.h
> > +++ b/arch/powerpc/include/asm/vas.h
> > @@ -74,6 +74,7 @@ struct vas_user_win_ref {
> > struct mm_struct *mm;   /* Linux process mm_struct */
> > struct mutex mmap_mutex;/* protects paste address mmap() */
> > /* with DLPAR close/open
> > windows */
> > +   struct vm_area_struct *vma; /* Save VMA and used in DLPAR ops
> > */
> >  };
> >  
> >  /*
> > diff --git a/arch/powerpc/platforms/book3s/vas-api.c
> > b/arch/powerpc/platforms/book3s/vas-api.c
> > index 2b0ced611f32..a63fd48e34a7 100644
> > --- a/arch/powerpc/platforms/book3s/vas-api.c
> > +++ b/arch/powerpc/platforms/book3s/vas-api.c
> > @@ -399,6 +399,8 @@ static int coproc_mmap(struct file *fp, struct
> > vm_area_struct *vma)
> > pr_devel("%s(): paste addr %llx at %lx, rc %d\n", __func__,
> > paste_addr, vma->vm_start, rc);
> >  
> > +   txwin->task_ref.vma = vma;
> > +
> > return rc;
> >  }
> >  
> > diff --git a/arch/powerpc/platforms/pseries/vas.c
> > b/arch/powerpc/platforms/pseries/vas.c
> > index d9ff73d7704d..75ccd0a599ec 100644
> > --- a/arch/powerpc/platforms/pseries/vas.c
> > +++ b/arch/powerpc/platforms/pseries/vas.c
> > @@ -370,13 +370,28 @@ static struct vas_window
> > *vas_allocate_window(int vas_id, u64 flags,
> > if (rc)
> > goto out_free;
> >  
> > -   vas_user_win_add_mm_context(&txwin->vas_win.task_ref);
> > txwin->win_type = cop_feat_caps->win_type;
> > mutex_lock(&vas_pseries_mutex);
> > -   list_add(&txwin->win_list, &caps->list);
> > +   /*
> > +* Possible to loose the acquired credit with DLPAR core
> 
> s/loose/lose/g
> 
> > +* removal after the window is opened. So if there are any
> > +* closed windows (means with lost credits), do not give new
> > +* window to user space. New windows will be opened only
> > +* after the existing windows are reopened when credits are
> > +* available.
> > +*/
> > +   if (!caps->close_wins) {
> > +   list_add(&txwin->win_list, &caps->list);
> > +   caps->num_wins++;
> > +   mutex_unlock(&vas_pseries_mutex);
> > +   vas_user_win_add_mm_context(&txwin->vas_win.task_ref);
> > +   return &txwin->vas_win;
> > +   }
> > mutex_unlock(&vas_pseries_mutex);
> >  
> > -   return &txwin->vas_win;
> > +   put_vas_user_win_ref(&txwin->vas_win.task_ref);
> > +   rc = -EBUSY;
> > +   pr_err("No credit is available to allocate window\n");
> >  
> >  out_free:
> > /*
> > @@ -439,14 +454,24 @@ static int vas_deallocate_window(struct
> > vas_window *vwin)
> >  
> > caps = &vascaps[win->win_type].caps;
> > mutex_lock(&vas_pseries_mutex);
> > -   rc = deallocate_free_window(win);
> > -   if (rc) {
> >

Re: [PATCH v3 03/10] powerpc/pseries/vas: Save LPID in pseries_vas_window struct

2022-02-15 Thread Haren Myneni
On Mon, 2022-02-14 at 12:41 +1000, Nicholas Piggin wrote:
> Excerpts from Haren Myneni's message of January 22, 2022 5:55 am:
> > The kernel sets the VAS window with partition PID when is opened in
> > the hypervisor. During DLPAR operation, windows can be closed and
> > reopened in the hypervisor when the credit is available. So saves
> > this PID in pseries_vas_window struct when the window is opened
> > initially and reuse it later during DLPAR operation.
> 
> This probably shouldn't be called lpid, while you're changing it.
> "partition PID" and "LPAR PID" is also confusing. I know the name
> somewhat comes from the specification, but pid/PID would be fine,
> it's clear we are talking about "this LPAR" when in pseries code.
> 
> > Signed-off-by: Haren Myneni 
> > ---
> >  arch/powerpc/platforms/pseries/vas.c | 7 ---
> >  arch/powerpc/platforms/pseries/vas.h | 1 +
> >  2 files changed, 5 insertions(+), 3 deletions(-)
> > 
> > diff --git a/arch/powerpc/platforms/pseries/vas.c
> > b/arch/powerpc/platforms/pseries/vas.c
> > index d2c8292bfb33..2ef56157634f 100644
> > --- a/arch/powerpc/platforms/pseries/vas.c
> > +++ b/arch/powerpc/platforms/pseries/vas.c
> > @@ -107,7 +107,6 @@ static int h_deallocate_vas_window(u64 winid)
> >  static int h_modify_vas_window(struct pseries_vas_window *win)
> >  {
> > long rc;
> > -   u32 lpid = mfspr(SPRN_PID);
> >  
> > /*
> >  * AMR value is not supported in Linux VAS implementation.
> > @@ -115,7 +114,7 @@ static int h_modify_vas_window(struct
> > pseries_vas_window *win)
> >  */
> > do {
> > rc = plpar_hcall_norets(H_MODIFY_VAS_WINDOW,
> > -   win->vas_win.winid, lpid, 0,
> > +   win->vas_win.winid, win->lpid,
> > 0,
> > VAS_MOD_WIN_FLAGS, 0);
> >  
> > rc = hcall_return_busy_check(rc);
> > @@ -125,7 +124,7 @@ static int h_modify_vas_window(struct
> > pseries_vas_window *win)
> > return 0;
> >  
> > pr_err("H_MODIFY_VAS_WINDOW error: %ld, winid %u lpid %u\n",
> > -   rc, win->vas_win.winid, lpid);
> > +   rc, win->vas_win.winid, win->lpid);
> > return -EIO;
> >  }
> >  
> > @@ -338,6 +337,8 @@ static struct vas_window
> > *vas_allocate_window(int
> > vas_id, u64 flags,
> > }
> > }
> >  
> > +   txwin->lpid = mfspr(SPRN_PID);
> > +
> > /*
> >  * Allocate / Deallocate window hcalls and setup / free IRQs
> >  * have to be protected with mutex.
> > diff --git a/arch/powerpc/platforms/pseries/vas.h
> > b/arch/powerpc/platforms/pseries/vas.h
> > index fa7ce74f1e49..0538760d13be 100644
> > --- a/arch/powerpc/platforms/pseries/vas.h
> > +++ b/arch/powerpc/platforms/pseries/vas.h
> > @@ -115,6 +115,7 @@ struct pseries_vas_window {
> > u64 domain[6];  /* Associativity domain Ids */
> > /* this window is allocated */
> > u64 util;
> > +   u32 lpid;
> 
> Comment could be "PID associated with this window".   

yes, will add this comment.

> 
> BTW, is the TID parameter deprecated? Doesn't seem that we use that.

Right, tpid is deprecated on p10 and we are not using it.

Thanks
Haren

> 
> Thanks,
> Nick



Re: [PATCH v3 02/10] powerpc/pseries/vas: Add notifier for DLPAR core removal/add

2022-02-15 Thread Haren Myneni
On Mon, 2022-02-14 at 12:27 +1000, Nicholas Piggin wrote:
> Excerpts from Haren Myneni's message of January 22, 2022 5:54 am:
> > The hypervisor assigns credits for each LPAR based on number of
> > cores configured in that system. So expects to release credits
> > (means windows) when the core is removed. This patch adds notifier
> > for core removal/add so that the OS closes windows if the system
> > looses credits due to core removal and reopen windows when the
> > credits available later.
> 
> This could be improved. As far as I can tell,
> 
>  The hypervisor assigns vas credits (windows) for each LPAR based on
> the 
>  number of cores configured in that system. The OS is expected to 
>  release credits when cores are removed, and may allocate more when 
>  cores are added.
> 
> Or can you only re-use credits that you previously lost?

yes, reopen windows / re-use credits when the previously lost credits
are available. 
> 
> > Signed-off-by: Haren Myneni 
> > ---
> >  arch/powerpc/platforms/pseries/vas.c | 37
> > 
> >  1 file changed, 37 insertions(+)
> > 
> > diff --git a/arch/powerpc/platforms/pseries/vas.c
> > b/arch/powerpc/platforms/pseries/vas.c
> > index c0737379cc7b..d2c8292bfb33 100644
> > --- a/arch/powerpc/platforms/pseries/vas.c
> > +++ b/arch/powerpc/platforms/pseries/vas.c
> > @@ -538,6 +538,39 @@ static int __init get_vas_capabilities(u8
> > feat, enum vas_cop_feat_type type,
> > return 0;
> >  }
> >  
> > +/*
> > + * Total number of default credits available (target_credits)
> > + * in LPAR depends on number of cores configured. It varies based
> > on
> > + * whether processors are in shared mode or dedicated mode.
> > + * Get the notifier when CPU configuration is changed with DLPAR
> > + * operation so that get the new target_credits (vas default
> > capabilities)
> > + * and then update the existing windows usage if needed.
> > + */
> > +static int pseries_vas_notifier(struct notifier_block *nb,
> > +   unsigned long action, void *data)
> > +{
> > +   struct of_reconfig_data *rd = data;
> > +   struct device_node *dn = rd->dn;
> > +   const __be32 *intserv = NULL;
> > +   int len, rc = 0;
> > +
> > +   if ((action == OF_RECONFIG_ATTACH_NODE) ||
> > +   (action == OF_RECONFIG_DETACH_NODE))
> 
> I suppose the OF notifier is the way to do it (cc Nathan).

Yes, using the notifier here, registered with
of_reconfig_notifier_register() as in other places (for example
pseries_smp_notifier() in hotplug-cpu.c).

> 
> Could this patch be folded in with where it actually does something?
> It 
> makes it easier to review and understand how the notifier is used.

I added this notifier as a separate patch to keep it smaller. Sure, I can
fold it into the 'Add reconfig_close/open_windows()' patch.
> 
> 
> > +   intserv = of_get_property(dn, "ibm,ppc-interrupt-server#s",
> > + &len);
> > +   /*
> > +* Processor config is not changed
> > +*/
> > +   if (!intserv)
> > +   return NOTIFY_OK;
> > +
> > +   return rc;
> > +}
> > +
> > +static struct notifier_block pseries_vas_nb = {
> > +   .notifier_call = pseries_vas_notifier,
> > +};
> > +
> >  static int __init pseries_vas_init(void)
> >  {
> > struct hv_vas_cop_feat_caps *hv_cop_caps;
> > @@ -591,6 +624,10 @@ static int __init pseries_vas_init(void)
> > goto out_cop;
> > }
> >  
> > +   /* Processors can be added/removed only on LPAR */
> 
> What does this comment mean? DLPAR?

I will remove it. It was basically trying to say that this notifier is
called when a core is removed / added.

Thanks
haren

> 
> Thanks,
> Nick
> 
> > +   if (copypaste_feat && firmware_has_feature(FW_FEATURE_LPAR))
> > +   of_reconfig_notifier_register(&pseries_vas_nb);
> > +
> > pr_info("GZIP feature is available\n");
> >  
> >  out_cop:
> > -- 
> > 2.27.0
> > 
> > 
> > 



Re: [PATCH v3 01/10] powerpc/pseries/vas: Use common names in VAS capability structure

2022-02-15 Thread Haren Myneni
On Mon, 2022-02-14 at 12:14 +1000, Nicholas Piggin wrote:
> Excerpts from Haren Myneni's message of January 22, 2022 5:54 am:
> > target/used/avail_creds provides credits usage to user space via
> > sysfs and the same interface can be used on PowerNV in future.
> > Remove "lpar" from these names so that applicable on both PowerVM
> > and PowerNV.
> 
> But not in this series? This is just to save you having to do more
> renaming later?

Thanks for your review.
Yes, removing _lpar_ from the struct element names makes them clearer
and lets the sysfs patch add them easily.

> 
> Reviewed-by: Nicholas Piggin 
> 
> > Signed-off-by: Haren Myneni 
> > ---
> >  arch/powerpc/platforms/pseries/vas.c | 10 +-
> >  arch/powerpc/platforms/pseries/vas.h |  6 +++---
> >  2 files changed, 8 insertions(+), 8 deletions(-)
> > 
> > diff --git a/arch/powerpc/platforms/pseries/vas.c
> > b/arch/powerpc/platforms/pseries/vas.c
> > index d243ddc58827..c0737379cc7b 100644
> > --- a/arch/powerpc/platforms/pseries/vas.c
> > +++ b/arch/powerpc/platforms/pseries/vas.c
> > @@ -310,8 +310,8 @@ static struct vas_window
> > *vas_allocate_window(int vas_id, u64 flags,
> >  
> > cop_feat_caps = &caps->caps;
> >  
> > -   if (atomic_inc_return(&cop_feat_caps->used_lpar_creds) >
> > -   atomic_read(&cop_feat_caps->target_lpar_creds)) {
> > +   if (atomic_inc_return(&cop_feat_caps->used_creds) >
> > +   atomic_read(&cop_feat_caps->target_creds)) {
> > pr_err("Credits are not available to allocate
> > window\n");
> > rc = -EINVAL;
> > goto out;
> > @@ -385,7 +385,7 @@ static struct vas_window
> > *vas_allocate_window(int vas_id, u64 flags,
> > free_irq_setup(txwin);
> > h_deallocate_vas_window(txwin->vas_win.winid);
> >  out:
> > -   atomic_dec(&cop_feat_caps->used_lpar_creds);
> > +   atomic_dec(&cop_feat_caps->used_creds);
> > kfree(txwin);
> > return ERR_PTR(rc);
> >  }
> > @@ -445,7 +445,7 @@ static int vas_deallocate_window(struct
> > vas_window *vwin)
> > }
> >  
> > list_del(&win->win_list);
> > -   atomic_dec(&caps->used_lpar_creds);
> > +   atomic_dec(&caps->used_creds);
> > mutex_unlock(&vas_pseries_mutex);
> >  
> > put_vas_user_win_ref(&vwin->task_ref);
> > @@ -521,7 +521,7 @@ static int __init get_vas_capabilities(u8 feat,
> > enum vas_cop_feat_type type,
> > }
> > caps->max_lpar_creds = be16_to_cpu(hv_caps->max_lpar_creds);
> > caps->max_win_creds = be16_to_cpu(hv_caps->max_win_creds);
> > -   atomic_set(&caps->target_lpar_creds,
> > +   atomic_set(&caps->target_creds,
> >be16_to_cpu(hv_caps->target_lpar_creds));
> > if (feat == VAS_GZIP_DEF_FEAT) {
> > caps->def_lpar_creds = be16_to_cpu(hv_caps-
> > >def_lpar_creds);
> > diff --git a/arch/powerpc/platforms/pseries/vas.h
> > b/arch/powerpc/platforms/pseries/vas.h
> > index 4ecb3fcabd10..fa7ce74f1e49 100644
> > --- a/arch/powerpc/platforms/pseries/vas.h
> > +++ b/arch/powerpc/platforms/pseries/vas.h
> > @@ -72,9 +72,9 @@ struct vas_cop_feat_caps {
> > };
> > /* Total LPAR available credits. Can be different from max LPAR
> > */
> > /* credits due to DLPAR operation */
> > -   atomic_t target_lpar_creds;
> > -   atomic_t used_lpar_creds; /* Used credits so far */
> > -   u16 avail_lpar_creds; /* Remaining available credits */
> > +   atomic_t target_creds;
> > +   atomic_t used_creds; /* Used credits so far */
> > +   u16 avail_creds; /* Remaining available credits */
> >  };
> >  
> >  /*
> > -- 
> > 2.27.0
> > 
> > 
> > 


