On Fri, Jun 9, 2017 at 9:34 AM, Linda Knippers <[email protected]> wrote:
> This patch adds a new interface to provide additional Health Status Detail.
> This information is reported as part of the Smart Health with the HPE1
> DSM family so the function for the other families is NULL by default. If
> the field is available, the ndctl --health option will decode
> the bits that make up the field. If the DSM family doesn't support
> this function, no additional information is provided.
>
> With this change a healthy NVDIMM-N that supports this information
> would report something like this:
>
> {
>   "dev":"nmem6",
>   "id":"802c-0f-1612-122eb278",
>   "health":{
>     "health_state":"ok",
>     "temperature_celsius":25.000000,
>     "spares_percentage":99,
>     "alarm_temperature":false,
>     "alarm_spares":false,
>     "temperature_threshold":50.000000,
>     "spares_threshold":20,
>     "life_used_percentage":0,
>     "shutdown_state":"clean",
>     "health_status_detail":[
>       "ok"
>     ]
>   }
> }
>
> An ailing NVDIMM-N could report one or more health status
> conditions, something like this:
>
> {
>   "dev":"nmem6",
>   "id":"802c-0f-1612-122eb278",
>   "health":{
>     "health_state":"ok",
>     "temperature_celsius":25.000000,
>     "spares_percentage":99,
>     "alarm_temperature":false,
>     "alarm_spares":false,
>     "temperature_threshold":50.000000,
>     "spares_threshold":20,
>     "life_used_percentage":0,
>     "shutdown_state":"clean",
>     "health_status_detail":[
>       "energy_source_error",
>       "arm_error"
>     ]
>   }
> }
This format for health_status_detail makes the json harder for
upper-layer applications to consume. Every field should be associated
with a key, even if it's just a flag. I'd like to keep the json
topology flatter. We already report the NFIT health state flags in
this form:
{
  "provider":"nfit_test.1",
  "dev":"ndbus3",
  "dimms":[
    {
      "dev":"nmem5",
      "id":"cdab-0a-07e0-fffffeff",
      "flag_failed_save":true,
      "flag_failed_arm":true,
      "flag_failed_restore":true,
      "flag_smart_event":true,
      "flag_failed_flush":true
    }
  ]
}
So, let's just add additional flag names and omit the ones that are
duplicates of the NFIT-level health state flags.
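
To make the suggestion concrete, here is a rough sketch (Python, purely
illustrative, not ndctl code) of the transformation I have in mind on the
consumer-visible json. The function name, the "flag_" + detail-name naming
scheme, the placement of the new flags at the dimm level, and the table of
which details duplicate an NFIT flag are all assumptions made up for this
example; the real names would be settled in the patch.

```python
# Assumption: detail names that merely restate an NFIT-level flag.
# This table is illustrative only and is not taken from ndctl.
NFIT_DUPLICATES = {"arm_error": "flag_failed_arm"}


def flatten_health_detail(dimm):
    """Return a copy of a dimm object with the health_status_detail
    array replaced by individual boolean flag_* keys."""
    flat = {k: v for k, v in dimm.items() if k != "health"}
    health = dict(dimm.get("health", {}))
    # Remove the array form; each entry becomes its own key below.
    details = health.pop("health_status_detail", [])
    flat["health"] = health
    for name in details:
        # "ok" carries no information as a flag, and duplicates are
        # already covered by the NFIT-level flags.
        if name == "ok" or name in NFIT_DUPLICATES:
            continue
        flat["flag_" + name] = True
    return flat
```

With the ailing-DIMM example above, this would yield a single extra key,
"flag_energy_source_error":true, with "arm_error" left to the existing
flag_failed_arm (again, assuming those two really are equivalent).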