On Fri, Jun 9, 2017 at 9:34 AM, Linda Knippers <[email protected]> wrote:
> This patch adds a new interface to provide additional Health Status Detail.
> This information is reported as part of the Smart Health with the HPE1
> DSM family so the function for the other families is NULL by default.  If
> the field is available, the ndctl --health option will decode
> the bits that make up the field.  If the DSM family doesn't support
> this function, no additional information is provided.
>
> With this change a healthy NVDIMM-N that supports this information
> would report something like this:
>
> {
>     "dev":"nmem6",
>     "id":"802c-0f-1612-122eb278",
>     "health":{
>       "health_state":"ok",
>       "temperature_celsius":25.000000,
>       "spares_percentage":99,
>       "alarm_temperature":false,
>       "alarm_spares":false,
>       "temperature_threshold":50.000000,
>       "spares_threshold":20,
>       "life_used_percentage":0,
>       "shutdown_state":"clean",
>       "health_status_detail":[
>         "ok"
>       ]
>     }
>   }
>
> An ailing NVDIMM-N could report one or more health status
> conditions, sometime like this:
>
> {
>     "dev":"nmem6",
>     "id":"802c-0f-1612-122eb278",
>     "health":{
>       "health_state":"ok",
>       "temperature_celsius":25.000000,
>       "spares_percentage":99,
>       "alarm_temperature":false,
>       "alarm_spares":false,
>       "temperature_threshold":50.000000,
>       "spares_threshold":20,
>       "life_used_percentage":0,
>       "shutdown_state":"clean",
>       "health_status_detail":[
>         "energy_source_error",
>         "arm_error",
>       ]
>     }
>   }

This format for health_detail makes the json harder to consume for
upper-layer applications. Every field should be associated with a
key-id even if it's just a flag. I think I'd like to keep the json
topology flatter. We already have support for the NFIT health state
flags in this form:

        {
          "provider":"nfit_test.1",
          "dev":"ndbus3",
          "dimms":[
            {
              "dev":"nmem5",
              "id":"cdab-0a-07e0-fffffeff",
              "flag_failed_save":true,
              "flag_failed_arm":true,
              "flag_failed_restore":true,
              "flag_smart_event":true,
              "flag_failed_flush":true
            }
          ]
        }

So, let's just add additional flag names and omit the ones that are
duplicates of the NFIT-level health state flags.
_______________________________________________
Linux-nvdimm mailing list
[email protected]
https://lists.01.org/mailman/listinfo/linux-nvdimm

Reply via email to