Hi Dan and Michal,

I have posted an RFC patch to implement the kernel-side interface for
this in libnvdimm, with an implementation in the papr-scm driver module,
at [1]. Can you please take a look at the patch series and provide your
inputs?

[1] https://lore.kernel.org/linux-nvdimm/[email protected]/

Thanks,
~ Vaibhav

Dan Williams <[email protected]> writes:

> On Fri, Oct 23, 2020 at 10:28 AM Michal Suchánek <[email protected]> wrote:
>>
>> Hello,
>>
>> On Thu, May 28, 2020 at 11:59 AM Vaibhav Jain <[email protected]> wrote:
>> >
>> > Thanks for this taking time to look into this Dan,
>> >
>> > Agree with the points you have made earlier that I am summarizing below:
>> >
>> > * This is better done in ndctl rather than ipmctl.
>> > * Should only expose general performance metrics and not performance
>> >   counters. Performance counter should be exposed via perf
>> > * Vendor specific metrics to be separated from generic performance
>> >   metrics.
>> >
>> > One way to split generic and vendor specific metrics might be to report
>> > generic performance metrics together with dimm health metrics such as
>> > "temprature_celsius" or "spares_percentage" that are already reported in
>> > by dimm health output.
>> >
>> > Vendor specific performance metrics can be reported as a seperate object
>> > in the json output. Something similar to output below:
>> >
>> > # ndctl list -DH --stats --vendor-stats
>> > [
>> >   {
>> >     "dev":"nmem0",
>> >     "health":{
>> >       "health_state":"ok",
>> >       "shutdown_state":"clean",
>> >       "temperature_celsius":48.00,
>> >       "spares_percentage":10,
>> >
>> >       /* Generic performance metrics/stats */
>> >       "TotalMediaReads": 18929,
>> >       "TotalMediaWrites": 0,
>> >       ....
>> >     }
>> >
>> >     /* Vendor specific stats for the dimm */
>> >     "vendor-stats": {
>> >       "Controller Reset Count":10
>> >       "Controller Reset Elapsed Time": 3600
>> >       "Power-on Seconds": 3600
>>
>> How do you tell generic from vendor-specific stats, though?
>>
>> Controller reset count and power-on time may not be reported by some
>> controllers but sound pretty generic.
>>
>> Even if you declare that the stats reported by all controllers
>> available at this moment are generic a later one may not report some of
>> these 'generic' statistics, or report them in different way/units, or
>> may simply not report anything at all for some technical reason.
>>
>> Kernels that do not have this feature will not report anything at all
>> either.
>
> My expectation is that for a given json attribute name any vendor
> backend that supports it must convey it in a compatible way. If a
> given attribute does not make sense for a given vendor, or is not yet
> implemented then leaving it unpopulated is indeed the expectation.
>
> The goal is to both minimize vendor specific logic in infrastructure
> that consumes the ndctl json while at the same time balance vendor
> needs. In other words avoid "needless" differentiation as much as
> possible with small amount of compat work across vendors.
_______________________________________________
Linux-nvdimm mailing list -- [email protected]
To unsubscribe send an email to [email protected]
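
To illustrate the expectation discussed above (an attribute, when present,
means the same thing for every vendor; an attribute that is absent is simply
unpopulated), here is a minimal sketch of how a consumer of the json might
handle it. It is not taken from ndctl. The sample is shaped like the proposed
output quoted in the thread, made syntactically valid; the second dimm
"nmem1" and the helper names generic_stat() and summarize() are hypothetical
and exist only for this illustration.

```python
import json

# Sample shaped like the *proposed* output quoted in the thread (not a
# confirmed ndctl format). "nmem1" is a hypothetical dimm whose backend
# does not populate any of the generic performance stats.
SAMPLE = """
[
  {
    "dev": "nmem0",
    "health": {
      "health_state": "ok",
      "temperature_celsius": 48.0,
      "spares_percentage": 10,
      "TotalMediaReads": 18929,
      "TotalMediaWrites": 0
    },
    "vendor-stats": {
      "Controller Reset Count": 10,
      "Power-on Seconds": 3600
    }
  },
  {
    "dev": "nmem1",
    "health": {
      "health_state": "ok"
    }
  }
]
"""


def generic_stat(dimm, name):
    """Return a generic stat, or None when the backend left it unpopulated."""
    return dimm.get("health", {}).get(name)


def summarize(dimms, names):
    """Map each device name to the requested stats (None where missing)."""
    return {d["dev"]: {n: generic_stat(d, n) for n in names} for d in dimms}


dimms = json.loads(SAMPLE)
summary = summarize(dimms, ["TotalMediaReads", "TotalMediaWrites"])
for dev, stats in summary.items():
    print(dev, stats)
```

Returning None rather than 0 for a missing attribute keeps "not reported"
distinct from a genuine zero count such as "TotalMediaWrites": 0, so the
consumer needs no vendor-specific logic to tell the two apart.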
