On Sat, Apr 22, 2017 at 12:04 AM, Oliver O'Halloran <[email protected]> wrote:
> On Sat, Apr 22, 2017 at 3:19 AM, Dan Williams <[email protected]> wrote:
>> On Fri, Apr 21, 2017 at 12:12 AM, Oliver O'Halloran <[email protected]> wrote:
>>> Read the default alignment from the hpage_pmd_size in sysfs. On PPC the
>>> PMD size depends on the MMU being used. When the traditional hash MMU is
>>> used (P9 and earlier) the PMD size is 16MB while the newer radix MMU
>>> uses a 2MB PMD size. The choice of MMU is done at runtime depending on
>>> what the hardware supports so we need to detect this at runtime rather
>>> than hardcoding it.
>>>
>>> Signed-off-by: Oliver O'Halloran <[email protected]>
>>> ---
>>>  ndctl/Makefile.am                 |  3 ++-
>>>  ndctl/builtin-xaction-namespace.c | 41 +++++++++++++++++++++++++++++----------
>>>  2 files changed, 33 insertions(+), 11 deletions(-)
>>>
>>> diff --git a/ndctl/Makefile.am b/ndctl/Makefile.am
>>> index c563e9411cc3..6d565c643efd 100644
>>> --- a/ndctl/Makefile.am
>>> +++ b/ndctl/Makefile.am
>>> @@ -10,7 +10,8 @@ ndctl_SOURCES = ndctl.c \
>>>  	../util/log.c \
>>>  	builtin-list.c \
>>>  	builtin-test.c \
>>> -	../util/json.c
>>> +	../util/json.c \
>>> +	../util/sysfs.c
>>>
>>>  if ENABLE_SMART
>>>  ndctl_SOURCES += util/json-smart.c
>>> diff --git a/ndctl/builtin-xaction-namespace.c b/ndctl/builtin-xaction-namespace.c
>>> index d6c38dc15984..713a95987d91 100644
>>> --- a/ndctl/builtin-xaction-namespace.c
>>> +++ b/ndctl/builtin-xaction-namespace.c
>>> @@ -22,6 +22,7 @@
>>>  #include <sys/types.h>
>>>  #include <util/size.h>
>>>  #include <util/json.h>
>>> +#include <util/sysfs.h>
>>>  #include <json-c/json.h>
>>>  #include <util/filter.h>
>>>  #include <ndctl/libndctl.h>
>>> @@ -54,6 +55,8 @@ static struct parameters {
>>>  	const char *align;
>>>  } param;
>>>
>>> +char default_align_buf[SYSFS_ATTR_SIZE];
>>> +
>>>  void builtin_xaction_namespace_reset(void)
>>>  {
>>>  	/*
>>> @@ -137,7 +140,24 @@ enum namespace_action {
>>>  	ACTION_DESTROY,
>>>  };
>>>
>>> -static int set_defaults(enum namespace_action mode)
>>> +const char *sysfs_read_default_align(struct ndctl_ctx *ctx, const char *def,
>>> +		const char *path)
>>> +{
>>> +	/*
>>> +	 * HACK: The command handlers aren't supposed to write into
>>> +	 * the ndctl command context, but we want the debug
>>> +	 * output to go somewhere sensible.
>>> +	 */
>>> +	if (__sysfs_read_attr((struct log_ctx *)ctx, path, default_align_buf))
>>> +		return strdup(def);
>>> +
>>> +	if (!strlen(default_align_buf))
>>> +		return def;
>>> +
>>> +	return default_align_buf;
>>
>> I chatted with Dave Hansen about this and we're thinking we should go
>> ahead and add a new attribute to the device-dax sysfs with the list of
>> supported alignments, similar to what we have in the btt case for
>> supported sector sizes.
>>
>> The reason is that the sensitivity to page sizes is a device-dax
>> internal requirement. Theoretically device-dax could support any
>> alignment and handle it with a mix of page sizes. However, since
>> device-dax wants to be strict and predictable about the tlb size
>> backing a given device-dax mapping then it should list the possible
>> options.
>
> Sounds good to me. Using the thp sysfs entries was always going to be
> a bit of hack job, but I couldn't find anything better to use as a
> sane default. I'll post a patch Monday (unless you already wrote one?)
>
>> Looking at the transparent_hugepage sysfs is a bit of a layering
>> violation. There is no strict guarantee that device-dax is tied to thp
>> in the longterm. The thp sysfs is also awkward because it does not
>> tell us the pud page size.
>
> Are there any concrete plans on moving device-dax away from THP? THP
> under hash has some horrible quirks and although the hardware supports
> up to 16G pages they're impossible to use outside of hugetlbfs. It
> would be nice to make use of the larger page sizes with dax devs, but
> I'm not sure I want to jump down that rabbit hole...

Yes, we have done this for the 1GB / pud case on x86_64. We added the
base THP primitives for pud support [1] even though the rest of the
THP implementation only understands pmds.

[1]: commit a00cc7d9dd93 mm, x86: add support for PUD-sized transparent hugepages
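
For illustration only, here is a minimal standalone sketch of the runtime
detection discussed in this thread: read a byte count from a sysfs-style
attribute and fall back to a caller-supplied default when the file is
missing or its contents are unusable. The helper names parse_align() and
read_align_attr() are hypothetical and are not part of ndctl or of the
quoted patch. The power-of-two check is an added assumption about what a
sane alignment looks like.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse a decimal byte count such as "2097152\n".
 * Returns def if the text is empty, non-numeric, out of range, or not
 * a power of two (assumed here to be required for an alignment).
 */
static unsigned long long parse_align(const char *buf, unsigned long long def)
{
	char *end;
	unsigned long long val;

	errno = 0;
	val = strtoull(buf, &end, 10);
	if (errno || end == buf || val == 0)
		return def;
	if (val & (val - 1))	/* more than one bit set */
		return def;
	return val;
}

/*
 * Hypothetical helper: read the first line of the attribute at path and
 * parse it. Returns def if the file cannot be opened or read.
 */
static unsigned long long read_align_attr(const char *path,
		unsigned long long def)
{
	char buf[64];
	FILE *f = fopen(path, "r");

	if (!f)
		return def;
	if (!fgets(buf, sizeof(buf), f)) {
		fclose(f);
		return def;
	}
	fclose(f);
	return parse_align(buf, def);
}
```

A caller could pass /sys/kernel/mm/transparent_hugepage/hpage_pmd_size
as the path with a 2MB default, which is roughly what the quoted patch
does with strings. If device-dax grows its own list of supported
alignments as proposed above, only the path and the parsing would need
to change.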
