The PAPR standard provides mechanisms to query the health and
performance stats of an NVDIMM via various hcalls, as described in
Ref[2]. Until now these stats were neither fetched in the papr_scm
module nor exposed to user-space tools like 'ndctl'. This is partly
because the PAPR platform does not support ACPI and NFIT. Hence 'ndctl'
is unable to query and report the dimm health status, and a user has no
way to determine the current health status of an NVDIMM.

To overcome this limitation, this RFC patch-set proposes a new set of
Dimm-Specific-Methods(DSM) for querying and fetching health and stat
information for dimms that support PAPR, and provides an in-kernel
implementation of these DSMs in the papr_scm module.

These changes, coupled with the draft/proposed ndctl changes located at
Ref[4], should provide a way for the user to retrieve NVDIMM health
status using ndctl. Below is a sample output using the proposed kernel +
ndctl for a PAPR NVDIMM in an emulation environment:

 # ndctl list -DH
 [
   {
     "dev":"nmem0",
     "health":{
       "health_state":"fatal",
       "life_used_percentage":60,
       "shutdown_state":"dirty"
     }
   }
 ]

PAPR Dimm-Specific-Methods(DSM)
===============================

As the name suggests, DSMs are used by vendor-specific code in libndctl
to execute certain operations or fetch certain information for NVDIMMs.
DSMs can be sent to the papr_scm module via libndctl (userspace) and
libnvdimm (kernel) using the ND_CMD_CALL ioctl, which is handled in the
dimm control function papr_scm_ndctl(). For PAPR this RFC proposes two
DSMs that map directly to hcalls provided by PHYP to query NVDIMM health
and stats. These DSMs are:

* DSM_PAPR_SCM_HEALTH: Maps to hcall H_SCM_HEALTH and returns dimm
  health.

* DSM_PAPR_SCM_STATS: Maps to hcall H_SCM_PERFORMANCE_STATS and returns
  dimm performance stats.

The ioctl ND_CMD_CALL can also transfer data between user-space and
kernel via 'envelopes'. The envelope is part of a 'struct nd_cmd_pkg',
which in turn is wrapped in a user-defined struct, in our case called
'struct nd_pkg_papr_scm'. This struct is defined as:

struct nd_pkg_papr_scm {
	struct nd_cmd_pkg hdr; /* Package header containing sub-cmd */
	uint32_t cmd_status;   /* Out: Sub-cmd status returned back */
	uint8_t payload[];     /* Out: Sub-cmd data buffer */
};

The 'payload' field of the package holds the libnvdimm-defined
'envelope', which is used to send/receive data to and from the userspace
buffer (libndctl).

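To make the package layout described above concrete, here is a minimal
userspace sketch of allocating and filling such a package. It is only an
illustration under stated assumptions, not code from these patches:
'struct pkg_hdr' is a local stand-in for the fixed fields of
'struct nd_cmd_pkg', the EXAMPLE_* values are placeholders rather than
the real family/DSM numbers, pkg_alloc() is a hypothetical helper, and
counting 'cmd_status' as part of nd_size_out is an assumption about how
the output size would be accounted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Local stand-in for the fixed fields of 'struct nd_cmd_pkg' from
 * <linux/ndctl.h>. The trailing nd_payload[] member is left out so the
 * header can be embedded in another struct in portable C.
 */
struct pkg_hdr {
	uint64_t nd_family;       /* DSM family */
	uint64_t nd_command;      /* sub-command (the DSM) */
	uint32_t nd_size_in;      /* bytes of input after the header */
	uint32_t nd_size_out;     /* bytes of output after the header */
	uint32_t nd_reserved2[9]; /* must be zero */
	uint32_t nd_fw_size;      /* out: size the handler wanted to return */
};

/* Mirror of 'struct nd_pkg_papr_scm' as described in the text. */
struct pkg_papr_scm {
	struct pkg_hdr hdr;
	uint32_t cmd_status;
	uint8_t payload[];
};

/* cmd_status is the first thing the kernel sees after the header. */
static_assert(offsetof(struct pkg_papr_scm, cmd_status) ==
	      sizeof(struct pkg_hdr),
	      "cmd_status must directly follow the header");

/* Placeholder numbers for illustration only, NOT from the patches. */
enum {
	EXAMPLE_FAMILY_PAPR = 5,
	EXAMPLE_DSM_HEALTH  = 1,
	EXAMPLE_DSM_STATS   = 2,
};

/*
 * Hypothetical helper: allocate a zeroed package for 'dsm' with room
 * for 'payload_size' bytes of hcall output. Returns NULL on allocation
 * failure; the caller frees the package with free().
 */
static struct pkg_papr_scm *pkg_alloc(uint64_t dsm, uint32_t payload_size)
{
	struct pkg_papr_scm *pkg = calloc(1, sizeof(*pkg) + payload_size);

	if (!pkg)
		return NULL;

	pkg->hdr.nd_family = EXAMPLE_FAMILY_PAPR;
	pkg->hdr.nd_command = dsm;
	pkg->hdr.nd_size_in = 0; /* these DSMs take no input */
	/* Assumption: output covers cmd_status plus the payload buffer. */
	pkg->hdr.nd_size_out = (uint32_t)sizeof(pkg->cmd_status) + payload_size;
	return pkg;
}
```

A caller would hand the returned package to the ND_CMD_CALL ioctl on the
dimm's control device and, on return, check 'cmd_status' before reading
'payload'.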
This RFC uses the 'payload' field to copy the results of hcalls that are
executed in response to the DSM commands. Please note that, with few
exceptions, the papr_scm module does not interpret the results of the
hcalls. Instead they are copied directly to the 'payload' field and sent
to userspace (libndctl) for interpretation. This essentially means that
the papr_scm module simply acts as a conduit for libndctl to issue
hcalls and fetch their output. This should make parsing and interpreting
the output buffers of hcalls easier, as it can be performed in
userspace.

References:
[1]: "Power Architecture Platform Reference"
     https://en.wikipedia.org/wiki/Power_Architecture_Platform_Reference
[2]: "[DOC,v2] powerpc: Provide initial documentation for PAPR hcalls"
     https://patchwork.ozlabs.org/patch/1154292/
[3]: "Linux on Power Architecture Platform Reference"
     https://members.openpowerfoundation.org/document/dl/469
[4]: https://github.com/vaibhav92/ndctl/tree/papr_scm_health

Vaibhav Jain (6):
  powerpc/papr_scm: Provide support for fetching dimm health information
  powerpc/papr_scm: Fetch dimm performance stats from PHYP
  UAPI: ndctl: Introduce NVDIMM_FAMILY_PAPR as a new NVDIMM DSM family
  powerpc/papr_scm: Add support for handling PAPR DSM commands
  powerpc/papr_scm: Implement support for DSM_PAPR_SCM_HEALTH
  powerpc/papr_scm: Implement support for DSM_PAPR_SCM_STATS

 arch/powerpc/platforms/pseries/papr_scm.c | 314 +++++++++++++++++++++-
 include/uapi/linux/ndctl.h                |   1 +
 2 files changed, 306 insertions(+), 9 deletions(-)

-- 
2.24.1