Re: [RFC v4 00/17] kunit: introduce KUnit, the Linux kernel unit testing framework

2019-03-20 Thread Knut Omang
Hi Logan,

On Wed, 2019-03-20 at 19:07 -0600, Logan Gunthorpe wrote:
> Hi,
> 
> On 2019-02-14 2:37 p.m., Brendan Higgins wrote:
> > This patch set proposes KUnit, a lightweight unit testing and mocking
> > framework for the Linux kernel.
> 
> I haven't followed the entire conversation but I saw the KUnit write-up
> on LWN and ended up, as an exercise, giving it a try.
> 
> I really like the idea of having a fast unit testing infrastructure in
> the kernel. Occasionally, I write userspace tests for tricky functions
> that I essentially write by copying the code over to a throw away C file
> and exercise them as I need. I think it would be great to be able to
> keep these tests around in a way that they can be run by anyone who
> wants to touch the code.
> 
> I was just dealing with some functions that required some mocked up
> tests so I thought I'd give KUnit a try. I found writing the code very
> easy and the infrastructure I was testing was quite simple to mock out
> the hardware.
> 
> However, I got a bit hung up by one issue: I was writing unit tests for
> code in the NTB tree which itself depends on CONFIG_PCI which cannot be
> enabled in UML (for what should be obvious reasons). I managed to work
> around this because, as luck would have it, all the functions I cared
> about testing were actually static inline functions in headers. So I
> placed my test code in the kunit folder (so it would compile) and hacked
> around a couple of functions I didn't care about that would not be
> compiled.
> 
> In the end I got it to work acceptably, but I get the impression that
> KUnit will not be usable for wide swaths of kernel code that can't be
> compiled in UML. Has there been any discussion or ideas on how to work
> around this so it can be more generally useful? Or will this feature be
> restricted roughly to non-drivers and functions in headers that don't
> have #ifdefs around them?

Testing drivers, hardware and firmware within production kernels was the use
case that inspired KTF (Kernel Test Framework). Currently KTF is available as a
standalone git repository. That's been the most efficient form for us so far, 
as we typically want tests to be developed once but deployed on many different
kernel versions simultaneously, as part of continuous integration.

We're also working towards a suitable proposal for how it can be
smoothly integrated into the kernel, while still keeping the benefits
and tools that allow cross-kernel development of tests. As part of this,
I have a master's student who has been looking at converting some of
the existing kernel test suites to KTF, and we have more examples coming
from our internal usage as we get more mileage and more users.
See for instance this recent entry on the Oracle kernel development
blog, which uses skbuff tests as an example:

https://blogs.oracle.com/linux/writing-kernel-tests-with-the-new-kernel-test-framework-ktf
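
For a taste of the test API, a minimal KTF test module looks roughly
like this (a sketch from memory of the documentation examples -- see
the links below for the authoritative version):

    #include <linux/module.h>
    #include "ktf.h"

    MODULE_LICENSE("GPL");

    KTF_INIT();

    /* a trivially passing test in the "examples" suite */
    TEST(examples, hello_ok)
    {
            EXPECT_TRUE(true);
    }

    static int __init hello_init(void)
    {
            ADD_TEST(hello_ok);
            return 0;
    }

    static void __exit hello_exit(void)
    {
            KTF_CLEANUP();
    }

    module_init(hello_init);
    module_exit(hello_exit);

The tests are built as ordinary kernel modules and driven from user
space through KTF's googletest-based frontend.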

Other relevant links:

Git repo: https://github.com/oracle/ktf
Formatted docs: http://heim.ifi.uio.no/~knuto/ktf/
LWN mention from my presentation at LPC'17: https://lwn.net/Articles/735034/
Earlier Oracle blog post: 
https://blogs.oracle.com/linux/oracles-new-kernel-test-framework-for-linux-v2
OSS'18 presentation slides: 
https://events.linuxfoundation.org/wp-content/uploads/2017/12/Test-Driven-Kernel-Development-Knut-Omang-Oracle.pdf

> If you're interested in seeing the unit tests I ended up writing you can
> find the commits here[1].

It would certainly be interesting to see how the use cases you struggled with
would work with KTF ;-)

Thanks,
Knut

>
> Thanks,
> 
> Logan
> 
> [1] https://github.com/sbates130272/linux-p2pmem/ ntb_kunit



Re: [PATCH 2/2] mm/dax: Don't enable huge dax mapping by default

2019-03-20 Thread Oliver
On Thu, Mar 21, 2019 at 7:57 AM Dan Williams  wrote:
>
> On Wed, Mar 20, 2019 at 8:34 AM Dan Williams  wrote:
> >
> > On Wed, Mar 20, 2019 at 1:09 AM Aneesh Kumar K.V
> >  wrote:
> > >
> > > Aneesh Kumar K.V  writes:
> > >
> > > > Dan Williams  writes:
> > > >
> > > >>
> > > >>> Now what will be page size used for mapping vmemmap?
> > > >>
> > > >> That's up to the architecture's vmemmap_populate() implementation.
> > > >>
> > > >>> Architectures
> > > >>> possibly will use PMD_SIZE mapping if supported for vmemmap. Now a
> > > >>> device-dax with struct page in the device will have pfn reserve area 
> > > >>> aligned
> > > >>> to PAGE_SIZE with the above example? We can't map that using
> > > >>> PMD_SIZE page size?
> > > >>
> > > >> IIUC, that's a different alignment. Currently that's handled by
> > > >> padding the reservation area up to a section (128MB on x86) boundary,
> > > >> but I'm working on patches to allow sub-section sized ranges to be
> > > >> mapped.
> > > >
> > > > I am missing something w.r.t. the code. The code below aligns that
> > > > using nd_pfn->align:
> > > >
> > > >   if (nd_pfn->mode == PFN_MODE_PMEM) {
> > > >           unsigned long memmap_size;
> > > >
> > > >           /*
> > > >            * vmemmap_populate_hugepages() allocates the memmap
> > > >            * array in HPAGE_SIZE chunks.
> > > >            */
> > > >           memmap_size = ALIGN(64 * npfns, HPAGE_SIZE);
> > > >           offset = ALIGN(start + SZ_8K + memmap_size + dax_label_reserve,
> > > >                           nd_pfn->align) - start;
> > > >   }
> > > >
> > > > IIUC that is finding the offset where to put vmemmap start. And that has
> > > > to be aligned to the page size with which we may end up mapping vmemmap
> > > > area right?
> >
> > Right, that's the physical offset of where the vmemmap ends, and the
> > memory to be mapped begins.
> >
> > > > Yes we find the npfns by aligning up using PAGES_PER_SECTION. But that
> > > > is to compute how many pfns we should map for this pfn dev, right?
> > > >
> > >
> > > Also I guess those 4K assumptions there are wrong?
> >
> > Yes, I think to support non-4K-PAGE_SIZE systems the 'pfn' metadata
> > needs to be revved and the PAGE_SIZE needs to be recorded in the
> > info-block.
>
> How often does a system change page size? Is it fixed, or do
> environments change it from one boot to the next? I'm thinking through
> the behavior of what to do when the recorded PAGE_SIZE in the info-block
> does not match the current system page size. The simplest option is to
> just fail the device and require it to be reconfigured. Is that
> acceptable?

The kernel page size is set at build time and as far as I know every
distro configures their ppc64(le) kernel for 64K. I've used 4K kernels
a few times in the past to debug PAGE_SIZE dependent problems, but I'd
be surprised if anyone is using 4K in production.

Anyway, my view is that using 4K here isn't really a problem, since
it's just the accounting unit of the pfn superblock format. The kernel
reading from it should understand that and scale it to whatever
accounting unit it wants to use internally. Currently we don't, so that
should probably be fixed, but it doesn't seem to cause any real issues:
as far as I can tell the only user of npfns is in __nvdimm_setup_pfn(),
which prints the "number of pfns truncated" message.
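
Concretely, scaling on read could be something like this (untested
sketch):

    /* scale the info-block's 4K-based npfns to the kernel's page size */
    u64 npfns = le64_to_cpu(pfn_sb->npfns) * SZ_4K / PAGE_SIZE;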

Am I missing something?



Re: [ndctl PATCH v2 4/4] ndctl, monitor: support NVDIMM_FAMILY_HYPERV

2019-03-20 Thread Verma, Vishal L


On Wed, 2019-02-20 at 05:11 +, Dexuan Cui wrote:
> Currently "ndctl monitor" fails for NVDIMM_FAMILY_HYPERV due to
> "no smart support".
> 
> NVDIMM_FAMILY_HYPERV doesn't use ND_CMD_SMART to get the health info.
> Instead, it uses ND_CMD_CALL, so the checking here can't apply, and it
> doesn't support threshold alarms -- actually it only supports one event:
> ND_EVENT_HEALTH_STATE.
> 
> See http://www.uefi.org/RFIC_LIST ("Virtual NVDIMM 0x1901").
> 
> Let's skip the unnecessary checking for NVDIMM_FAMILY_HYPERV, and make
> sure we only monitor the "dimm-health-state" event and ignore the others.
> 
> With the patch, when an error happens, we log it with such a message:
> 
> {"timestamp":"1550547497.431731497","pid":1571,"event":
> {"dimm-health-state":true},"dimm":{"dev":"nmem1",
> "id":"04d5-01-1701-0100","handle":1,"phys_id":0,
> "health":{"health_state":"fatal","shutdown_count":8}}}
> 
> Here the meaningful info is:
> "health":{"health_state":"fatal","shutdown_count":8}
> 
> Signed-off-by: Dexuan Cui 
> ---
>  ndctl/monitor.c | 42 +++---
>  1 file changed, 35 insertions(+), 7 deletions(-)
> 
> diff --git a/ndctl/monitor.c b/ndctl/monitor.c
> index 43b2abe..43beb06 100644
> --- a/ndctl/monitor.c
> +++ b/ndctl/monitor.c
> @@ -265,31 +265,59 @@ static bool filter_region(struct ndctl_region *region,
>   return true;
>  }
>  
> -static void filter_dimm(struct ndctl_dimm *dimm, struct util_filter_ctx *fctx)
> +static bool ndctl_dimm_test_and_enable_notification(struct ndctl_dimm *dimm)
>  {
> - struct monitor_dimm *mdimm;
> - struct monitor_filter_arg *mfa = fctx->monitor;
>   const char *name = ndctl_dimm_get_devname(dimm);
>  
> + /*
> +  * Hyper-V Virtual NVDIMM doesn't use ND_CMD_SMART to get the health
> +  * info. Instead, it uses ND_CMD_CALL, so the checking here can't
> +  * apply, and it doesn't support threshold alarms -- actually it only
> +  * supports one event: ND_EVENT_HEALTH_STATE.
> +  */
> + if (ndctl_dimm_get_cmd_family(dimm) == NVDIMM_FAMILY_HYPERV) {
> + if (monitor.event_flags != ND_EVENT_HEALTH_STATE) {
> + monitor.event_flags = ND_EVENT_HEALTH_STATE;
> +
> + notice(,
> + "%s: only dimm-health-state can be monitored\n",
> + name);
> + }
> + return true;
> + }
> +

I'm not sure about the family-specific special casing -- it leaks
family-specific details into the common implementation, something that's
been avoidable thus far.

Is there any reason why ndctl_dimm_is_cmd_supported() can't be made
to fail appropriately for anything that is not supported, by adding the
necessary functions to dimm-ops?

That way if the user enters any of the unsupported options, they will
just fail normally, and the user will be expected to provide the right
options for the environment they know they're running in.

Indeed, I think that patch 3 of this series should never be required :)
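
E.g. something along these lines (hypothetical sketch -- the op name
and wiring are assumed, not existing code):

    /* hyperv.c: answer support queries per family, so common code
     * needs no NVDIMM_FAMILY_HYPERV special casing */
    static int hyperv_cmd_is_supported(struct ndctl_dimm *dimm, int cmd)
    {
            /* health is the only thing emulated, via ND_CMD_CALL */
            return cmd == ND_CMD_CALL;
    }

with ndctl_dimm_is_cmd_supported() deferring to such a dimm-op whenever
the family registers one.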

>   if (!ndctl_dimm_is_cmd_supported(dimm, ND_CMD_SMART)) {
>   err(, "%s: no smart support\n", name);
> - return;
> + return false;
>   }
>   if (!ndctl_dimm_is_cmd_supported(dimm, ND_CMD_SMART_THRESHOLD)) {
>   err(, "%s: no smart threshold support\n", name);
> - return;
> + return false;
>   }
>  
>   if (!ndctl_dimm_is_flag_supported(dimm, ND_SMART_ALARM_VALID)) {
>   err(, "%s: smart alarm invalid\n", name);
> - return;
> + return false;
>   }
>  
>   if (enable_dimm_supported_threshold_alarms(dimm)) {
>   err(, "%s: enable supported threshold alarms failed\n", name);
> - return;
> + return false;
>   }
>  
> + return true;
> +}
> +
> +static void filter_dimm(struct ndctl_dimm *dimm, struct util_filter_ctx *fctx)
> +{
> + struct monitor_dimm *mdimm;
> + struct monitor_filter_arg *mfa = fctx->monitor;
> + const char *name = ndctl_dimm_get_devname(dimm);
> +
> +
> + if (!ndctl_dimm_test_and_enable_notification(dimm))
> + return;
> +
>   mdimm = calloc(1, sizeof(struct monitor_dimm));
>   if (!mdimm) {
>   err(, "%s: calloc for monitor dimm failed\n", name);



Re: [ndctl PATCH v2 3/4] ndctl, lib: implement ndctl_dimm_get_cmd_family()

2019-03-20 Thread Verma, Vishal L


On Wed, 2019-02-20 at 05:11 +, Dexuan Cui wrote:
> Let's export the family info so we can do some family-specific
> handling in ndctl/monitor.c for Hyper-V NVDIMM.

s/Let's//

> 
> Signed-off-by: Dexuan Cui 
> ---
>  ndctl/lib/libndctl.c   | 5 +
>  ndctl/lib/libndctl.sym | 1 +
>  ndctl/libndctl.h   | 1 +
>  3 files changed, 7 insertions(+)
> 
> diff --git a/ndctl/lib/libndctl.c b/ndctl/lib/libndctl.c
> index 48bdb27..1186579 100644
> --- a/ndctl/lib/libndctl.c
> +++ b/ndctl/lib/libndctl.c
> @@ -1550,6 +1550,11 @@ NDCTL_EXPORT struct ndctl_dimm *ndctl_dimm_get_next(struct ndctl_dimm *dimm)
>   return list_next(>dimms, dimm, list);
>  }
>  
> +NDCTL_EXPORT unsigned long ndctl_dimm_get_cmd_family(struct ndctl_dimm *dimm)
> +{
> + return dimm->cmd_family;
> +}
> +
>  NDCTL_EXPORT unsigned int ndctl_dimm_get_handle(struct ndctl_dimm *dimm)
>  {
>   return dimm->handle;
> diff --git a/ndctl/lib/libndctl.sym b/ndctl/lib/libndctl.sym
> index cb9f769..470e895 100644
> --- a/ndctl/lib/libndctl.sym
> +++ b/ndctl/lib/libndctl.sym
> @@ -38,6 +38,7 @@ global:
>   ndctl_bus_wait_probe;
>   ndctl_dimm_get_first;
>   ndctl_dimm_get_next;
> + ndctl_dimm_get_cmd_family;

Any new APIs need to go in a new LIBNDCTL_XX section (for this release
that would be LIBNDCTL_20).
If you rebase to the current 'pending' branch on github, the section has
already been created at the bottom, and you can just add to that.
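
I.e. roughly:

    LIBNDCTL_20 {
    global:
            ndctl_dimm_get_cmd_family;
    } LIBNDCTL_19;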

>   ndctl_dimm_get_handle;
>   ndctl_dimm_get_phys_id;
>   ndctl_dimm_get_vendor;
> diff --git a/ndctl/libndctl.h b/ndctl/libndctl.h
> index 0debdb6..cb5a8fc 100644
> --- a/ndctl/libndctl.h
> +++ b/ndctl/libndctl.h
> @@ -145,6 +145,7 @@ struct ndctl_dimm *ndctl_dimm_get_next(struct ndctl_dimm *dimm);
>  for (dimm = ndctl_dimm_get_first(bus); \
>   dimm != NULL; \
>   dimm = ndctl_dimm_get_next(dimm))
> +unsigned long ndctl_dimm_get_cmd_family(struct ndctl_dimm *dimm);
>  unsigned int ndctl_dimm_get_handle(struct ndctl_dimm *dimm);
>  unsigned short ndctl_dimm_get_phys_id(struct ndctl_dimm *dimm);
>  unsigned short ndctl_dimm_get_vendor(struct ndctl_dimm *dimm);



Re: [ndctl PATCH v2 2/4] libndctl: NVDIMM_FAMILY_HYPERV: add .smart_get_shutdown_count (Function 2)

2019-03-20 Thread Verma, Vishal L


On Wed, 2019-02-20 at 05:10 +, Dexuan Cui wrote:
> With the patch, "ndctl list --dimms --health --idle" can show
> "shutdown_count" now, e.g.
> 
> {
> "dev":"nmem0",
> "id":"04d5-01-1701-",
> "handle":0,
> "phys_id":0,
> "health":{
>   "health_state":"ok",
>   "shutdown_count":2
> }
> }
> 
> The patch has to directly call ndctl_cmd_submit() in
> hyperv_cmd_smart_get_flags() and hyperv_cmd_smart_get_shutdown_count() to
> get the needed info, because util_dimm_health_to_json() only submits *one*
> command, and unluckily for Hyper-V Virtual NVDIMM we need to call both
> Function 1 and 2 to get the needed info.
> 
> My feeling is that it's not very good to directly call ndctl_cmd_submit(),
> but by doing this we don't need to make any change to the common code, and
> I'm unsure if it's good to change the common code just for Hyper-V.

I thought about this, and the approach seems reasonable to me. I agree
that there is no precedent for submitting a command from the various
family libraries, but given the way dimm-ops are structured, this seems
warranted. The way I see it, a dimm-op means "get X information for
this family", and however that information needs to be obtained is just
a detail specific to that DSM family. For Intel it happens to be in the
smart information, but if it happens to be a separate DSM, that is also
fine.
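
In other words, the op is free to do something like the following
(rough sketch; names and error handling assumed):

    static unsigned int hyperv_cmd_smart_get_shutdown_count(struct ndctl_cmd *cmd)
    {
            struct ndctl_cmd *scmd;
            unsigned int count = UINT_MAX;

            /* fetch Function 2 on demand from within the dimm-op */
            scmd = hyperv_dimm_cmd_new_cmd(cmd->dimm,
                            ND_HYPERV_CMD_GET_SHUTDOWN_INFO);
            if (scmd && ndctl_cmd_submit(scmd) == 0 && scmd->status == 0)
                    count = CMD_HYPERV(scmd)->u.shutdown_info.count;
            if (scmd)
                    ndctl_cmd_unref(scmd);
            return count;
    }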

I do think this commentary in the changelog can be reworded. Consider
changing the first-person narrative to a more 'informational' or
'imperative' one. For example: "The 'smart_get_shutdown_count' dimm-op
is expected to return a shutdown count, but for the HYPERV family, this
is only available from a separate DSM. Perform a command submission in
the 'hyperv_cmd_smart_get_shutdown_count' dimm op to retrieve this, so
that a smart health listing can display this information"


Apart from the commit message comments in the previous patch, also
consider wrapping commit messages to a 72 column width [1].

[1]: https://tbaggery.com/2008/04/19/a-note-about-git-commit-messages.html

> 
> Signed-off-by: Dexuan Cui 
> ---
>  ndctl/lib/hyperv.c | 62 --
>  ndctl/lib/hyperv.h |  7 ++
>  2 files changed, 62 insertions(+), 7 deletions(-)
> 
> diff --git a/ndctl/lib/hyperv.c b/ndctl/lib/hyperv.c
> index b303d50..e8ec142 100644
> --- a/ndctl/lib/hyperv.c
> +++ b/ndctl/lib/hyperv.c
> @@ -22,7 +22,8 @@
>  #define CMD_HYPERV_STATUS(_c) (CMD_HYPERV(_c)->u.status)
>  #define CMD_HYPERV_SMART_DATA(_c) (CMD_HYPERV(_c)->u.smart.data)
>  
> -static struct ndctl_cmd *hyperv_dimm_cmd_new_smart(struct ndctl_dimm *dimm)
> +static struct ndctl_cmd *hyperv_dimm_cmd_new_cmd(struct ndctl_dimm *dimm,
> +  unsigned int command)

The 'new' in the function name is a bit confusing, as 'new' functions
are also used for cmd allocation. I'd suggest following the intel.c
convention and calling it 'alloc_hyperv_cmd'.

Maybe consider calling it this right from the start in patch 1, and also
having the wrapper for the new smart command in patch 1 - this way there
is less unrelated thrash in this patch, which makes it easier to review.

>  {
>   struct ndctl_bus *bus = ndctl_dimm_get_bus(dimm);
>   struct ndctl_ctx *ctx = ndctl_bus_get_ctx(bus);
> @@ -35,8 +36,7 @@ static struct ndctl_cmd *hyperv_dimm_cmd_new_smart(struct ndctl_dimm *dimm)
>   return NULL;
>   }
>  
> - if (test_dimm_dsm(dimm, ND_HYPERV_CMD_GET_HEALTH_INFO) ==
> -   DIMM_DSM_UNSUPPORTED) {
> + if (test_dimm_dsm(dimm, command) ==  DIMM_DSM_UNSUPPORTED) {
>   dbg(ctx, "unsupported function\n");
>   return NULL;
>   }
> @@ -54,7 +54,7 @@ static struct ndctl_cmd *hyperv_dimm_cmd_new_smart(struct ndctl_dimm *dimm)
>  
>   hyperv = CMD_HYPERV(cmd);
>   hyperv->gen.nd_family = NVDIMM_FAMILY_HYPERV;
> - hyperv->gen.nd_command = ND_HYPERV_CMD_GET_HEALTH_INFO;
> + hyperv->gen.nd_command = command;
>   hyperv->gen.nd_fw_size = 0;
>   hyperv->gen.nd_size_in = offsetof(struct nd_hyperv_smart, status);
>   hyperv->gen.nd_size_out = sizeof(hyperv->u.smart);
> @@ -65,34 +65,74 @@ static struct ndctl_cmd *hyperv_dimm_cmd_new_smart(struct ndctl_dimm *dimm)
>   return cmd;
>  }
>  
> -static int hyperv_smart_valid(struct ndctl_cmd *cmd)
> +static struct ndctl_cmd *hyperv_dimm_cmd_new_smart(struct ndctl_dimm *dimm)
> +{
> + return hyperv_dimm_cmd_new_cmd(dimm, ND_HYPERV_CMD_GET_HEALTH_INFO);
> +}
> +
> +static int hyperv_cmd_valid(struct ndctl_cmd *cmd, unsigned int command)
>  {
>   if (cmd->type != ND_CMD_CALL ||
>   cmd->size != sizeof(*cmd) + sizeof(struct nd_pkg_hyperv) ||
>   CMD_HYPERV(cmd)->gen.nd_family != NVDIMM_FAMILY_HYPERV ||
> - CMD_HYPERV(cmd)->gen.nd_command != ND_HYPERV_CMD_GET_HEALTH_INFO ||
> + CMD_HYPERV(cmd)->gen.nd_command != command ||
>   cmd->status != 0 ||
>   

Re: [ndctl PATCH v2 1/4] libndctl: add support for NVDIMM_FAMILY_HYPERV's _DSM Function 1

2019-03-20 Thread Verma, Vishal L
Hi Dexuan,

Thanks for these patches - I had a few comments below.

Also on a more general note, the patches in this series don't appear to
be correctly threaded. Normally, patch emails in a series are replies to
the first patch (either 1/N or the cover letter), and this allows for
easier review as all related emails can be found in a single top-level
thread. It would be nice if you can fix this for the future - git
send-email should do this correctly and automatically, if you use it
for sending the patches.


On Wed, 2019-02-20 at 05:10 +, Dexuan Cui wrote:
> This patch retrieves the health info by Hyper-V _DSM method Function 1:
> 

We should never use "This patch.." in a commit message. See 4.c in [1].

[1]: https://www.ozlabs.org/~akpm/stuff/tpp.txt


> Get Health Information (Function Index 1)
> See http://www.uefi.org/RFIC_LIST ("Virtual NVDIMM 0x1901").
> 
> Now "ndctl list --dimms --health --idle" can show a line "health_state":"ok",
> e.g.
> 
>   {
> "dev":"nmem0",
> "id":"04d5-01-1701-",
> "handle":0,
> "phys_id":0,
> "health":{
>   "health_state":"ok"
> }
>   }
> 
> If there is an error with the NVDIMM, the "ok" will be replaced with 
> "unknown",
> "fatal", "critical", or "non-critical".
> 
> Signed-off-by: Dexuan Cui 
> ---
>  ndctl/lib/Makefile.am |   1 +
>  ndctl/lib/hyperv.c| 129 ++
>  ndctl/lib/hyperv.h|  51 +
>  ndctl/lib/libndctl.c  |   2 +
>  ndctl/lib/private.h   |   3 +
>  ndctl/ndctl.h |   1 +
>  6 files changed, 187 insertions(+)
>  create mode 100644 ndctl/lib/hyperv.c
>  create mode 100644 ndctl/lib/hyperv.h
> 
> diff --git a/ndctl/lib/Makefile.am b/ndctl/lib/Makefile.am
> index 7797039..fb75fda 100644
> --- a/ndctl/lib/Makefile.am
> +++ b/ndctl/lib/Makefile.am
> @@ -20,6 +20,7 @@ libndctl_la_SOURCES =\
>   intel.c \
>   hpe1.c \
>   msft.c \
> + hyperv.c \
>   ars.c \
>   firmware.c \
>   libndctl.c
> diff --git a/ndctl/lib/hyperv.c b/ndctl/lib/hyperv.c
> new file mode 100644
> index 000..b303d50
> --- /dev/null
> +++ b/ndctl/lib/hyperv.c
> @@ -0,0 +1,129 @@
> +/*
> + * Copyright (c) 2019, Microsoft Corporation.
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms and conditions of the GNU Lesser General Public License,
> + * version 2.1, as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope it will be useful, but WITHOUT ANY
> + * WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
> + * FOR A PARTICULAR PURPOSE.  See the GNU Lesser General Public License for
> + * more details.
> + */

For new files, use SPDX license identifiers; for an example, see:
https://github.com/pmem/ndctl/blob/master/ndctl/load-keys.c#L1
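
I.e. the whole license block above collapses to something like:

    // SPDX-License-Identifier: LGPL-2.1
    // Copyright (c) 2019, Microsoft Corporation.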

> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include "private.h"
> +#include "hyperv.h"
> +
> +#define CMD_HYPERV(_c) ((_c)->hyperv)

I'm not sure this macro improves readability; in fact I think it rather
detracts from it in many cases - see further below.
Additionally, there's no need for the preceding underscore in the macro
arguments - the rest of the code base doesn't do this, and I'm not sure
what value it provides.
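
For example, the open-coded form arguably reads just as well at the
call sites:

    hyperv = CMD_HYPERV(cmd);   /* with the macro */
    hyperv = cmd->hyperv;       /* open-coded */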

> +#define CMD_HYPERV_STATUS(_c) (CMD_HYPERV(_c)->u.status)
> +#define CMD_HYPERV_SMART_DATA(_c) (CMD_HYPERV(_c)->u.smart.data)

> +
> +static struct ndctl_cmd *hyperv_dimm_cmd_new_smart(struct ndctl_dimm *dimm)
> +{
> + struct ndctl_bus *bus = ndctl_dimm_get_bus(dimm);
> + struct ndctl_ctx *ctx = ndctl_bus_get_ctx(bus);
> + struct ndctl_cmd *cmd;
> + size_t size;
> + struct nd_pkg_hyperv *hyperv;
> +
> + if (!ndctl_dimm_is_cmd_supported(dimm, ND_CMD_CALL)) {
> + dbg(ctx, "unsupported cmd\n");
> + return NULL;
> + }
> +
> + if (test_dimm_dsm(dimm, ND_HYPERV_CMD_GET_HEALTH_INFO) ==
> +   DIMM_DSM_UNSUPPORTED) {
> + dbg(ctx, "unsupported function\n");
> + return NULL;
> + }
> +
> + size = sizeof(*cmd) + sizeof(struct nd_pkg_hyperv);
> + cmd = calloc(1, size);
> + if (!cmd)
> + return NULL;
> +
> + cmd->dimm = dimm;
> + ndctl_cmd_ref(cmd);
> + cmd->type = ND_CMD_CALL;
> + cmd->size = size;
> + cmd->status = 1;
> +
> + hyperv = CMD_HYPERV(cmd);
> + hyperv->gen.nd_family = NVDIMM_FAMILY_HYPERV;
> + hyperv->gen.nd_command = ND_HYPERV_CMD_GET_HEALTH_INFO;
> + hyperv->gen.nd_fw_size = 0;
> + hyperv->gen.nd_size_in = offsetof(struct nd_hyperv_smart, status);
> + hyperv->gen.nd_size_out = sizeof(hyperv->u.smart);
> + hyperv->u.smart.status = 0;

calloc() zeroes the newly allocated memory - no need to set any of the
fields in the struct to '0' manually.

> +
> + cmd->firmware_status = >u.smart.status;
> +
> + return cmd;
> +}
> +
> +static int hyperv_smart_valid(struct ndctl_cmd *cmd)
> +{
> + if 

Re: [PATCH 2/2] mm/dax: Don't enable huge dax mapping by default

2019-03-20 Thread Dan Williams
On Wed, Mar 20, 2019 at 8:34 AM Dan Williams  wrote:
>
> On Wed, Mar 20, 2019 at 1:09 AM Aneesh Kumar K.V
>  wrote:
> >
> > Aneesh Kumar K.V  writes:
> >
> > > Dan Williams  writes:
> > >
> > >>
> > >>> Now what will be page size used for mapping vmemmap?
> > >>
> > >> That's up to the architecture's vmemmap_populate() implementation.
> > >>
> > >>> Architectures
> > >>> possibly will use PMD_SIZE mapping if supported for vmemmap. Now a
> > >>> device-dax with struct page in the device will have pfn reserve area 
> > >>> aligned
> > >>> to PAGE_SIZE with the above example? We can't map that using
> > >>> PMD_SIZE page size?
> > >>
> > >> IIUC, that's a different alignment. Currently that's handled by
> > >> padding the reservation area up to a section (128MB on x86) boundary,
> > >> but I'm working on patches to allow sub-section sized ranges to be
> > >> mapped.
> > >
> > > I am missing something w.r.t. the code. The code below aligns that
> > > using nd_pfn->align:
> > >
> > >   if (nd_pfn->mode == PFN_MODE_PMEM) {
> > >           unsigned long memmap_size;
> > >
> > >           /*
> > >            * vmemmap_populate_hugepages() allocates the memmap
> > >            * array in HPAGE_SIZE chunks.
> > >            */
> > >           memmap_size = ALIGN(64 * npfns, HPAGE_SIZE);
> > >           offset = ALIGN(start + SZ_8K + memmap_size + dax_label_reserve,
> > >                           nd_pfn->align) - start;
> > >   }
> > >
> > > IIUC that is finding the offset where to put vmemmap start. And that has
> > > to be aligned to the page size with which we may end up mapping vmemmap
> > > area right?
>
> Right, that's the physical offset of where the vmemmap ends, and the
> memory to be mapped begins.
>
> > > Yes we find the npfns by aligning up using PAGES_PER_SECTION. But that
> > > is to compute how many pfns we should map for this pfn dev, right?
> > >
> >
> > Also I guess those 4K assumptions there are wrong?
>
> Yes, I think to support non-4K-PAGE_SIZE systems the 'pfn' metadata
> needs to be revved and the PAGE_SIZE needs to be recorded in the
> info-block.

How often does a system change page size? Is it fixed, or do
environments change it from one boot to the next? I'm thinking through
the behavior of what to do when the recorded PAGE_SIZE in the info-block
does not match the current system page size. The simplest option is to
just fail the device and require it to be reconfigured. Is that
acceptable?
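
A sketch of that option (assuming a new page_size field is added to the
info-block):

    if (le32_to_cpu(pfn_sb->page_size) != PAGE_SIZE) {
            dev_err(&nd_pfn->dev,
                    "init failed, info-block page size %u != %lu\n",
                    le32_to_cpu(pfn_sb->page_size), PAGE_SIZE);
            return -EOPNOTSUPP;
    }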


Re: [PATCH 2/2] mm/dax: Don't enable huge dax mapping by default

2019-03-20 Thread Dan Williams
On Wed, Mar 20, 2019 at 1:09 AM Aneesh Kumar K.V
 wrote:
>
> Aneesh Kumar K.V  writes:
>
> > Dan Williams  writes:
> >
> >>
> >>> Now what will be page size used for mapping vmemmap?
> >>
> >> That's up to the architecture's vmemmap_populate() implementation.
> >>
> >>> Architectures
> >>> possibly will use PMD_SIZE mapping if supported for vmemmap. Now a
> >>> device-dax with struct page in the device will have pfn reserve area 
> >>> aligned
> >>> to PAGE_SIZE with the above example? We can't map that using
> >>> PMD_SIZE page size?
> >>
> >> IIUC, that's a different alignment. Currently that's handled by
> >> padding the reservation area up to a section (128MB on x86) boundary,
> >> but I'm working on patches to allow sub-section sized ranges to be
> >> mapped.
> >
> > I am missing something w.r.t. the code. The code below aligns that
> > using nd_pfn->align:
> >
> >   if (nd_pfn->mode == PFN_MODE_PMEM) {
> >           unsigned long memmap_size;
> >
> >           /*
> >            * vmemmap_populate_hugepages() allocates the memmap
> >            * array in HPAGE_SIZE chunks.
> >            */
> >           memmap_size = ALIGN(64 * npfns, HPAGE_SIZE);
> >           offset = ALIGN(start + SZ_8K + memmap_size + dax_label_reserve,
> >                           nd_pfn->align) - start;
> >   }
> >
> > IIUC that is finding the offset where to put vmemmap start. And that has
> > to be aligned to the page size with which we may end up mapping vmemmap
> > area right?

Right, that's the physical offset of where the vmemmap ends, and the
memory to be mapped begins.

> > Yes we find the npfns by aligning up using PAGES_PER_SECTION. But that
> > is to compute how many pfns we should map for this pfn dev, right?
> >
>
> Also I guess those 4K assumptions there are wrong?

Yes, I think to support non-4K-PAGE_SIZE systems the 'pfn' metadata
needs to be revved and the PAGE_SIZE needs to be recorded in the
info-block.


Re: [PATCH 2/6] security/keys/encrypted: Clean up request_trusted_key()

2019-03-20 Thread Dan Williams
On Wed, Mar 20, 2019 at 5:07 AM Mimi Zohar  wrote:
>
> On Tue, 2019-03-19 at 22:48 -0700, Dan Williams wrote:
> > On Tue, Mar 19, 2019 at 7:36 PM Mimi Zohar  wrote:
> > >
> > > On Tue, 2019-03-19 at 17:20 -0700, Dan Williams wrote:
> > > > On Tue, Mar 19, 2019 at 5:07 PM Mimi Zohar  wrote:
> > > > >
> > > > > On Mon, 2019-03-18 at 23:06 -0700, Dan Williams wrote:
> > > > >
> > > > > < snip >
> > > > >
> > > > > > +/*
> > > > > > + * request_trusted_key - request the trusted key
> > > > > > + *
> > > > > > + * Trusted keys are sealed to PCRs and other metadata. Although
> > > > > > + * userspace manages both trusted/encrypted key-types, like the
> > > > > > + * encrypted key type data, trusted key type data is not visible
> > > > > > + * decrypted from userspace.
> > > > > > + */
> > > > > > +static struct key *request_trusted_key(const char *trusted_desc,
> > > > > > +	const u8 **master_key, size_t *master_keylen)
> > > > > > +{
> > > > > > + struct trusted_key_payload *tpayload;
> > > > > > + struct key_type *type;
> > > > > > + struct key *tkey;
> > > > > > +
> > > > > > + type = key_type_lookup("trusted");
> > > > >
> > > > > The associated key_type_put() will need to be called.
> > > >
> > > > Yes.
> > >
> > > I don't know if defining a key_type_lookup() wrapper, perhaps named
> > > is_key_type_available(), would help.  Both key_type_lookup() and
> > > key_type_put() would be called.  The existing code could then remain
> > > the same.
> > >
> >
> > Maybe, but something still needs to pin the hosting module. I think
> > this means that the first call to key_type->instantiate() pins the
> > hosting module, and the ->destroy() of the last key for the key_type
> > unpins the module. It does mean that the ->destroy() method is no
> > longer optional.
>
> This sounds like it isn't a new problem.  Both issues need to be
> addressed, but I think we should differentiate between them and
> address them separately.
>
> In terms of the original nvdimm encrypted/trusted key problem, the
> above suggestion requires the least amount of change.  For v5.2, I
> would replace it with the full updated patch set.

I believe smallest amount of change is this single patch:

https://patchwork.kernel.org/patch/10858649/


Re: [PATCH 2/6] security/keys/encrypted: Clean up request_trusted_key()

2019-03-20 Thread Mimi Zohar
On Tue, 2019-03-19 at 22:48 -0700, Dan Williams wrote:
> On Tue, Mar 19, 2019 at 7:36 PM Mimi Zohar  wrote:
> >
> > On Tue, 2019-03-19 at 17:20 -0700, Dan Williams wrote:
> > > On Tue, Mar 19, 2019 at 5:07 PM Mimi Zohar  wrote:
> > > >
> > > > On Mon, 2019-03-18 at 23:06 -0700, Dan Williams wrote:
> > > >
> > > > < snip >
> > > >
> > > > > +/*
> > > > > + * request_trusted_key - request the trusted key
> > > > > + *
> > > > > + * Trusted keys are sealed to PCRs and other metadata. Although
> > > > > + * userspace manages both trusted/encrypted key-types, like the
> > > > > + * encrypted key type data, trusted key type data is not visible
> > > > > + * decrypted from userspace.
> > > > > + */
> > > > > +static struct key *request_trusted_key(const char *trusted_desc,
> > > > > +	const u8 **master_key, size_t *master_keylen)
> > > > > +{
> > > > > + struct trusted_key_payload *tpayload;
> > > > > + struct key_type *type;
> > > > > + struct key *tkey;
> > > > > +
> > > > > + type = key_type_lookup("trusted");
> > > >
> > > > The associated key_type_put() will need to be called.
> > >
> > > Yes.
> >
> > I don't know if defining a key_type_lookup() wrapper, perhaps named
> > is_key_type_available(), would help.  Both key_type_lookup() and
> > key_type_put() would be called.  The existing code could then remain
> > the same.
> >
> 
> Maybe, but something still needs to pin the hosting module. I think
> this means that the first call to key_type->instantiate() pins the
> hosting module, and the ->destroy() of the last key for the key_type
> unpins the module. It does mean that the ->destroy() method is no
> longer optional.

This sounds like it isn't a new problem.  Both issues need to be
addressed, but I think we should differentiate between them and
address them separately.

In terms of the original nvdimm encrypted/trusted key problem, the
above suggestion requires the least amount of change.  For v5.2, I
would replace it with the full updated patch set.
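
For reference, the wrapper I have in mind is along these lines
(untested sketch):

    static bool is_key_type_available(const char *type_name)
    {
            struct key_type *type = key_type_lookup(type_name);

            if (IS_ERR(type))
                    return false;
            key_type_put(type);
            return true;
    }

As you say, this alone does not pin the hosting module, so that part
still needs to be addressed separately.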

Mimi





Re: [PATCH 2/2] mm/dax: Don't enable huge dax mapping by default

2019-03-20 Thread Aneesh Kumar K.V
Aneesh Kumar K.V  writes:

> Dan Williams  writes:
>
>>
>>> Now what will be page size used for mapping vmemmap?
>>
>> That's up to the architecture's vmemmap_populate() implementation.
>>
>>> Architectures
>>> possibly will use PMD_SIZE mapping if supported for vmemmap. Now a
>>> device-dax with struct page in the device will have pfn reserve area aligned
>>> to PAGE_SIZE with the above example? We can't map that using
>>> PMD_SIZE page size?
>>
>> IIUC, that's a different alignment. Currently that's handled by
>> padding the reservation area up to a section (128MB on x86) boundary,
>> but I'm working on patches to allow sub-section sized ranges to be
>> mapped.
>
> I am missing something w.r.t. the code. The code below aligns that
> using nd_pfn->align:
>
>   if (nd_pfn->mode == PFN_MODE_PMEM) {
>   unsigned long memmap_size;
>
>   /*
>* vmemmap_populate_hugepages() allocates the memmap array in
>* HPAGE_SIZE chunks.
>*/
>   memmap_size = ALIGN(64 * npfns, HPAGE_SIZE);
>   offset = ALIGN(start + SZ_8K + memmap_size + dax_label_reserve,
>   nd_pfn->align) - start;
>   }
>
> IIUC that is finding the offset where to put vmemmap start. And that has
> to be aligned to the page size with which we may end up mapping vmemmap
> area right?
>
> Yes we find the npfns by aligning up using PAGES_PER_SECTION. But that
> is to compute how many pfns we should map for this pfn dev, right?
>   

Also I guess those 4K assumptions there are wrong?

modified   drivers/nvdimm/pfn_devs.c
@@ -783,7 +783,7 @@ static int nd_pfn_init(struct nd_pfn *nd_pfn)
return -ENXIO;
}
 
-   npfns = (size - offset - start_pad - end_trunc) / SZ_4K;
+   npfns = (size - offset - start_pad - end_trunc) / PAGE_SIZE;
pfn_sb->mode = cpu_to_le32(nd_pfn->mode);
pfn_sb->dataoff = cpu_to_le64(offset);
pfn_sb->npfns = cpu_to_le64(npfns);


-aneesh



Re: [PATCH 2/2] mm/dax: Don't enable huge dax mapping by default

2019-03-20 Thread Aneesh Kumar K.V
Dan Williams  writes:

>
>> Now what will be page size used for mapping vmemmap?
>
> That's up to the architecture's vmemmap_populate() implementation.
>
>> Architectures
>> possibly will use PMD_SIZE mapping if supported for vmemmap. Now a
>> device-dax with struct page in the device will have pfn reserve area aligned
>> to PAGE_SIZE with the above example? We can't map that using
>> PMD_SIZE page size?
>
> IIUC, that's a different alignment. Currently that's handled by
> padding the reservation area up to a section (128MB on x86) boundary,
> but I'm working on patches to allow sub-section sized ranges to be
> mapped.

I am missing something w.r.t. the code. The code below aligns that using nd_pfn->align:

if (nd_pfn->mode == PFN_MODE_PMEM) {
unsigned long memmap_size;

/*
 * vmemmap_populate_hugepages() allocates the memmap array in
 * HPAGE_SIZE chunks.
 */
memmap_size = ALIGN(64 * npfns, HPAGE_SIZE);
offset = ALIGN(start + SZ_8K + memmap_size + dax_label_reserve,
nd_pfn->align) - start;
  }

IIUC that is finding the offset where to put the vmemmap start. And
that has to be aligned to the page size with which we may end up
mapping the vmemmap area, right?
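
For a concrete feel (assuming x86 defaults and nd_pfn->align = 2M), a
16G PFN_MODE_PMEM namespace with 4K pages gives:

    npfns       = 16G / 4K = 4M
    memmap_size = ALIGN(64 * 4M, 2M) = 256M
    offset      = ALIGN(start + 8K + 256M + dax_label_reserve, 2M) - start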

Yes, we find the npfns by aligning up using PAGES_PER_SECTION. But that
is to compute how many pfns we should map for this pfn dev, right?

-aneesh
