Re: [PATCH v2] cxl: Prevent adapter reset if an active context exists

2016-10-12 Thread Andrew Donnellan

On 12/10/16 15:17, Vaibhav Jain wrote:

This patch prevents resetting the cxl adapter via sysfs in presence of
one or more active cxl_context on it. This protects against an
unrecoverable error caused by PSL owning a dirty cache line even after
reset and host tries to touch the same cache line. In case a force reset
of the card is required irrespective of any active contexts, the int
value -1 can be stored in the 'reset' sysfs attribute of the card.

The patch introduces a new atomic_t member named contexts_num inside
struct cxl that holds the number of active context attached to the card
, which is checked against '0' before proceeding with the reset. To
prevent against a race condition where a context is activated just after
reset check is performed, the contexts_num is atomically set to '-1'
after reset-check to indicate that no more contexts can be activated on
the card anymore.

Before activating a context we atomically test if contexts_num is
non-negative and if so, increment its value by one. In case the value of
contexts_num is negative then it indicates that the card is about to be
reset and context activation is error-ed out at that point.

Signed-off-by: Vaibhav Jain 


All the changes look good to me.

Reviewed-by: Andrew Donnellan 


diff --git a/Documentation/ABI/testing/sysfs-class-cxl 
b/Documentation/ABI/testing/sysfs-class-cxl
index 4ba0a2a..dae2b38 100644
--- a/Documentation/ABI/testing/sysfs-class-cxl
+++ b/Documentation/ABI/testing/sysfs-class-cxl
@@ -220,8 +220,11 @@ What:   /sys/class/cxl//reset
 Date:   October 2014
 Contact:linuxppc-dev@lists.ozlabs.org
 Description:write only
-Writing 1 will issue a PERST to card which may cause the card
-to reload the FPGA depending on load_image_on_perst.
+Writing 1 will issue a PERST to card provided there are no
+   contexts active on any one of the card AFUs. This may cause
+   the card to reload the FPGA depending on load_image_on_perst.
+   Writing -1 will do a force PERST irrespective of any active
+   contexts on the card AFUs.


Ugh, spaces vs tabs bites again :(

--
Andrew Donnellan  OzLabs, ADL Canberra
andrew.donnel...@au1.ibm.com  IBM Australia Limited



[PATCH v2] cxl: Prevent adapter reset if an active context exists

2016-10-11 Thread Vaibhav Jain
This patch prevents resetting the cxl adapter via sysfs in presence of
one or more active cxl_context on it. This protects against an
unrecoverable error caused by PSL owning a dirty cache line even after
reset and host tries to touch the same cache line. In case a force reset
of the card is required irrespective of any active contexts, the int
value -1 can be stored in the 'reset' sysfs attribute of the card.

The patch introduces a new atomic_t member named contexts_num inside
struct cxl that holds the number of active context attached to the card
, which is checked against '0' before proceeding with the reset. To
prevent against a race condition where a context is activated just after
reset check is performed, the contexts_num is atomically set to '-1'
after reset-check to indicate that no more contexts can be activated on
the card anymore.

Before activating a context we atomically test if contexts_num is
non-negative and if so, increment its value by one. In case the value of
contexts_num is negative then it indicates that the card is about to be
reset and context activation is error-ed out at that point.

Signed-off-by: Vaibhav Jain 
---
Changelog:
v2..v1 ->   * Addressed following review comments from Frederic Barrat:
- Spell error changing 'Incase' to 'In case'.
- Changed the comment description for context_num member
  to use a slightly more universal notation for integers.
- Added cleanup code for context irqs in case context lock
  is taken.
- Added a new function called cxl_adapter_context_unlock
  that sets context_num to '0' (forcibly if needed).
- cxl adapter struct when allocated is initialized with
  context lock taken and released when the card config is
  complete.
- Simplified code flow in function reset_adapter_store.

---
 Documentation/ABI/testing/sysfs-class-cxl |  7 --
 drivers/misc/cxl/api.c|  9 +++
 drivers/misc/cxl/context.c|  3 +++
 drivers/misc/cxl/cxl.h| 24 ++
 drivers/misc/cxl/file.c   | 11 
 drivers/misc/cxl/guest.c  |  3 +++
 drivers/misc/cxl/main.c   | 42 ++-
 drivers/misc/cxl/pci.c|  2 ++
 drivers/misc/cxl/sysfs.c  | 27 +---
 9 files changed, 121 insertions(+), 7 deletions(-)

diff --git a/Documentation/ABI/testing/sysfs-class-cxl 
b/Documentation/ABI/testing/sysfs-class-cxl
index 4ba0a2a..dae2b38 100644
--- a/Documentation/ABI/testing/sysfs-class-cxl
+++ b/Documentation/ABI/testing/sysfs-class-cxl
@@ -220,8 +220,11 @@ What:   /sys/class/cxl//reset
 Date:   October 2014
 Contact:linuxppc-dev@lists.ozlabs.org
 Description:write only
-Writing 1 will issue a PERST to card which may cause the card
-to reload the FPGA depending on load_image_on_perst.
+Writing 1 will issue a PERST to card provided there are no
+   contexts active on any one of the card AFUs. This may cause
+   the card to reload the FPGA depending on load_image_on_perst.
+   Writing -1 will do a force PERST irrespective of any active
+   contexts on the card AFUs.
 Users: https://github.com/ibm-capi/libcxl
 
 What:  /sys/class/cxl//perst_reloads_same_image (not in a guest)
diff --git a/drivers/misc/cxl/api.c b/drivers/misc/cxl/api.c
index f3d34b9..85bb2d9 100644
--- a/drivers/misc/cxl/api.c
+++ b/drivers/misc/cxl/api.c
@@ -236,10 +236,19 @@ int cxl_start_context(struct cxl_context *ctx, u64 wed,
ctx->real_mode = false;
}
 
+   /*
+* Increment the mapped context count for adapter. This also checks
+* if adapter_context_lock is taken.
+*/
+   rc = cxl_adapter_context_get(ctx->afu->adapter);
+   if (rc)
+   goto out;
+
cxl_ctx_get();
 
if ((rc = cxl_ops->attach_process(ctx, kernel, wed, 0))) {
put_pid(ctx->pid);
+   cxl_adapter_context_put(ctx->afu->adapter);
cxl_ctx_put();
goto out;
}
diff --git a/drivers/misc/cxl/context.c b/drivers/misc/cxl/context.c
index c466ee2..5e506c1 100644
--- a/drivers/misc/cxl/context.c
+++ b/drivers/misc/cxl/context.c
@@ -238,6 +238,9 @@ int __detach_context(struct cxl_context *ctx)
put_pid(ctx->glpid);
 
cxl_ctx_put();
+
+   /* Decrease the attached context count on the adapter */
+   cxl_adapter_context_put(ctx->afu->adapter);
return 0;
 }
 
diff --git a/drivers/misc/cxl/cxl.h b/drivers/misc/cxl/cxl.h
index 01d372a..a144073 100644
--- a/drivers/misc/cxl/cxl.h
+++ b/drivers/misc/cxl/cxl.h
@@ -618,6 +618,14 @@ struct cxl {
bool