Re: [PATCH 1/3] Allow all supported HPE DSM functions to be called

Linda Knippers Thu, 16 Feb 2017 14:35:24 -0800


On 2/16/2017 3:48 PM, Dan Williams wrote:

On Thu, Feb 16, 2017 at 12:13 PM, Linda Knippers <[email protected]> wrote:

On 02/16/2017 02:35 PM, Dan Williams wrote:

On Thu, Feb 16, 2017 at 10:51 AM, Linda Knippers <[email protected]> wrote:

Right, and NVDIMM_FAMILY_HPE2 has one too, and we have the kernel
option to disable all vendor specific passthroughs if an environment
only wants to run commands with publicly documented effects.



Ok, I can add some to the HPE1 family too.  :-)


Sure, and then the pushback would be "no, just use HPE2 or INTEL". We
have two kernel-supported methods for sending an opaque
vendor-specific command. Why would the kernel need to support a third?


I can't use Intel's because I have an NVDIMM-N and your spec is not
for NVDIMM-N.


I don't understand this comment in the context of a vendor specific
tunnel. There's nothing NVDIMM-N specific about the DSM payload
format. I'm saying override the default family temporarily just to use
the vendor-specific tunnel, not implement the full "INTEL" definition.
This can be used as a stop-gap when you find you need to support a new
command on a legacy kernel.


That presumes that the platform FW supports the INTEL DSM.  Today's
platforms that support NVDIMM-N do not.  It is quite likely that
even when platforms do support NVDIMMs with RFIC 0x201, FW wouldn't
expose the DSM family if the hardware isn't actually installed.

There's also nothing on the kernel side stopping any vendor from
implementing the "HPE2" or "INTEL" vendor specific interfaces in their
BIOS, the interfaces are not exclusive.



I'm sure that many vendors will implement the INTEL interfaces when
there are devices using the 0x0201 RFIC.  However, those aren't sufficient
for NVDIMM-N devices.

As for HPE2, it may be short lived.  I'm not sure it's in the wild and
if it's not, perhaps we can remove it.  Still checking.


This is an example of why the kernel does not add kernel enabling for
future "maybe" cases. The fact that this patch wants to allow the
kernel to maybe support a command that doesn't exist yet is a clear
reason to say no.


The kernel often has code for devices that aren't available yet,
like NVDIMMs with 0x201. Sometimes things change or eventually get deprecated.
The Intel DSM spec has changed over time as well. But as I said, I'm still 
checking.


I'm not sure that's a good example because block-apertures are
described by ACPI, and the kernel has no specific enabling around the
format-interface-code. Whereas this patch to expand the dsm_mask is
planning ahead for DSMs that may not ever be defined.
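
For readers following along, the dsm_mask acts as a bitmap: bit N set means
DSM function N may be called. A minimal sketch of that check in shell follows.
The mask value 0x76 and the helper name dsm_allowed are illustrative
assumptions, not the kernel's actual values or code.

```shell
#!/bin/sh
# Sketch only: test whether a DSM function number is permitted by a mask.
# The mask passed in below is a made-up example, not a real family's mask.

# usage: dsm_allowed <mask> <function-number>
# Returns 0 (success) if bit <function-number> is set in <mask>.
dsm_allowed() {
    [ $(( ($1 >> $2) & 1 )) -eq 1 ]
}

# Example: with mask 0x76, bits 1, 2, 4, 5 and 6 are set.
if dsm_allowed 0x76 4; then
    echo "function 4 allowed"
fi
if ! dsm_allowed 0x76 3; then
    echo "function 3 blocked"
fi
```

Expanding the mask to cover function numbers that are not yet defined is the
step being debated here: those bits would pass the check before any command
is documented for them.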


My point was that things change.  They sometimes change before hardware is
shipping.  That's why your spec is up to version 1.3 and still changing.
I'm just trying to acknowledge that change happens and plan for it.

BTW, we do have HPE2 in the wild so never mind about removing it.


[..]

Module options are specified at module load time so you don't
necessarily need a reboot.


Reloading nfit has been a problem for me with dependencies between
other modules but perhaps I've just not gotten the sequence right.
Does it work for you?


Yes:

    ndctl disable-region all
    modprobe -r nfit


Thanks!

It's best to unmount any filesystems mounted on nvdimm namespaces
before disabling regions. There's an ongoing effort to make it safe
[1], but not all filesystems are prepared for their backing device to
disappear.

[1]: https://lists.01.org/pipermail/linux-nvdimm/2016-January/003797.html
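
Putting the steps above together, the reload might be sketched as below. This
is a hedged sketch: the function names, the DRY_RUN switch, and the mount
point are assumptions for illustration, and any module option passed through
is a placeholder rather than a documented nfit parameter.

```shell
#!/bin/sh
# Sketch: reload the nfit module with new options, without a reboot.
# Set DRY_RUN=1 to print the commands instead of running them.

run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# usage: reload_nfit <mountpoint-or-empty-string> [module options...]
reload_nfit() {
    mnt=$1
    shift
    # Unmount first: not all filesystems tolerate losing their backing device.
    if [ -n "$mnt" ]; then
        run umount "$mnt" || return 1
    fi
    run ndctl disable-region all || return 1
    run modprobe -r nfit || return 1
    # Remaining arguments are passed through as module options.
    run modprobe nfit "$@" || return 1
}
```

For example, `DRY_RUN=1 reload_nfit /mnt/pmem0 example_option=1` prints the
four commands in order without touching the system; `example_option` is a
hypothetical name standing in for whatever option is being tested.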

However, I'm now of the opinion that
allowing the family to be changed is a more complete solution to both
problems.

I hope the need for the vendor-specific tunnels goes away over time
and Linux can generically support the most common management tasks
with generic infrastructure.



We totally agree.  In case you're wondering, we don't have a new
DSM queued up just waiting to be exposed by this patch.  However, cases
have come up where we have considered the need for a new DSM function
outside of pure test environments.  So far we have come up with other
approaches but at some point, the right approach will be to add a new DSM
function and I'd really like to not have to spin the kernel and all that
goes along with that when it happens. It's not good for customers.


I don't understand why a module option is unacceptable when that is
the proposed solution to the kernel picking the wrong "default"
family.



The difference is that picking the default family is primarily for
testing while the other case is not.  Testers do all sorts of things
that customers won't like.

In fact, thinking about it further, I'd be more open to a patch that
allows overriding a DIMM's family via sysfs.



Sorry if this is a dumb question but will that work at boot time?  Today
there are DSMs that are called during initialization.


The only DIMM-level DSMs that are called during initialization are
label management, those aren't the DSMs we are talking about here.

There could also
be DSMs called based on whether NFIT device flag bit 4 is set, although
we don't currently do that.


Sorry, I'm not sure to which NFIT bit this is referring? I'm not sure
it matters if the kernel is never going to initiate and consume the
result of the DSM like it does label data.


There's a state flag that says there was a health status change prior
to the OSPM handoff.  We could eventually want to call that to know more
about whether the NVDIMM is viable for use as there is more information there
than in the basic state flags.  We don't do that today though, just
thinking about the future.


The kernel passes that status and any health-event notification to
userspace. The kernel doesn't consume it directly because the health
status payload is not standardized across vendors, and it keeps the
kernel focused on the mechanism while the policy of what to do with a
health event is a userspace concern. This is the same scheme the
kernel has for handling health status alerts from disks.

That way we can
potentially have cross usage of these different interfaces and it's
less awkward for tooling to use than a module option.



For testing purposes, being able to switch DSM families without a reboot
would be really nice.  You could even expose the GUID and allow it, the
family, and the dsm mask to be overridden. Would be helpful when testing
that new standard DSM we all aspire to.

This approach would touch a lot more code and require a lot more testing
than my relatively simple patches because today these things are configured
at boot time rather than run time.  I wonder about ordering and consistency
checking.  For example, if someone changed the family would you
automatically change the mask or is that two operations?  I assume you'd
check that the family is actually supported for the device?  With sysfs,
different devices could have different families, where with the module
parameter option there's one default that's applied to all devices for
which that family is supported.

At this point I'd probably prefer the simplicity of the module parameter
because I know I can do it and test it.  If the only way you'll take the
patch is to add a second parameter then I'll do that, but I still don't
see the point.


Let's do the work to allow the family to be switched so that tooling
can override the kernel default, because it does seem valid for a DIMM
to support multiple DSM-family types and that gives more than one path
to use a vendor specific passthrough.


I am fairly certain that for production use we wouldn't expect management
tools to use more than one DSM family at a time. Tools using DSMs
from one family wouldn't switch to using a new family for one operation
and then switch back for another operation.


If the kernel makes it safe I don't see why this is a blocker,
especially if the goal is to reduce interface proliferation over time.


See my earlier comments.  It would require more than the kernel
making it safe.  It gets even more complicated when a platform supports
an 0x201 NVDIMM and an NVDIMM-N at the same time.

We all share the goal of reducing interface proliferation.  We disagree
on the enforcement while we define more complete standards.

We haven't talked about the 4th DSM family and what their plans are but
I won't be shocked if they create new functions.  That family is very
hardware specific and hardware specs also evolve.


Yes, and we can add support for new functions when and if they define
them, not in advance.


The problem is timing.  New FW and tools can be in customers' hands much
more quickly than an OS release.  If a vendor documents a function today
and you pull in a patch adding the bit tomorrow, it could still be a year
or more before that patch makes it into the distro that the customer is running.
That is the fundamental issue that I'm trying to address.

--ljk
_______________________________________________
Linux-nvdimm mailing list
[email protected]
https://lists.01.org/mailman/listinfo/linux-nvdimm
