On Wed, Oct 16, 2019 at 7:43 PM Aneesh Kumar K.V <[email protected]> wrote:
>
> On 10/17/19 12:35 AM, Dan Williams wrote:
> > On Wed, Oct 16, 2019 at 9:59 AM Aneesh Kumar K.V
> > <[email protected]> wrote:
> >>
> >> On 10/16/19 9:34 PM, Dan Williams wrote:
> >>> On Tue, Oct 15, 2019 at 10:29 PM Aneesh Kumar K.V
> >>> <[email protected]> wrote:
> >>>>
> >>>> On 10/16/19 3:32 AM, Dan Williams wrote:
> >>>>> On Tue, Oct 15, 2019 at 8:33 AM Aneesh Kumar K.V
> >>>>> <[email protected]> wrote:
> >>>>>>
> >>>>>> nvdimm core currently maps the full namespace to an ioremap range
> >>>>>> while probing the namespace mode. This can result in probe failures
> >>>>>> on architectures that have limited ioremap space.
> >>>>>
> >>>>> Is there a #define for that limit?
> >>>>>
> >>>>
> >>>> Arch specific #define. For example. ppc64 have different limits based on
> >>>> platform and translation mode. Hash translation with 4k PAGE_SIZE limit
> >>>> ioremap range to 8TB.
> >>>>
> >>>>>> nvdimm core can avoid this failure by only mapping the reserver
> >>>>>> block area to check for pfn superblock type and map the full
> >>>>>> namespace resource only before using the namespace. nvdimm core
> >>>>>> use ioremap range only for the raw and btt namespace and we can
> >>>>>> limit the max namespace size for these two modes. For both fsdax
> >>>>>> and devdax this change enables nvdimm to map namespace larger
> >>>>>> that ioremap limit.
> >>>>>
> >>>>> If the direct map has more space I think it would be better to add a
> >>>>> way to use that to map for all namespaces rather than introduce
> >>>>> arbitrary failures based on the mode.
> >>>>>
> >>>>> I would buy a performance argument to avoid overmapping, but for
> >>>>> namespace access compatibility where an alternate mapping method would
> >>>>> succeed I think we should aim for that to be used instead. Thoughts?
> >>>>>
> >>>>
> >>>> That would require to have struct page allocated for these range and
> >>>> both raw and btt don't need a struct page backing?
> >>>>
> >>>
> >>> I was thinking a new mapping interface that just consumed direct-map
> >>> space, but did not allocate pages.
> >>>
> >>
> >> Not sure how easy that would be. We are looking at having part of
> >> direct-map address not managed by any zone and then possibly archs need
> >> to be taught to handle these ? (for example for ppc64 we "bolt" direct
> >> map range where as we allow taking low level hash fault for I/O remap
> >> range)
> >>
> >> Even though you don't consider the patch as complete, considering the
> >> approach you outlined would require larger changes, do you think this
> >> patch can be accepted as a bug fix? Right now we can fail namespace
> >> initialization during boot or ndctl enable-namespace all.
> >>
> >> For example with ppc64 and I/O remap range limit of 8TB, we can
> >> individually create a 6TB namespace. We also allow to create multiple
> >> such namespaces. But if we try to enable them all together using ndctl
> >> enable-namespace all, that will fail with error
> >>
> >> [ 54.259910] vmap allocation for size x failed: use vmalloc=<size> to increase size
> >>
> >> because we probe these namespaces in parallel.
> >
> > The patch is incomplete, right?
>
> Incomplete with respect to the detail that we still don't allow large
> raw and btt namespaces.
>
> > It does not fix the raw mode namespace
> > case, and that error message seems to indicate to the user how to fix
> > the problem. I was of the impression it was a fixed range in the
> > address map. Could you instead try to autodetect the potential pmem
> > usage and auto increase the vmap space?
> >
>
> The error is printed by generic code and the failures are due to fixed
> size. We can't workaround that using vmalloc=<size> option.

Darn. Ok, explain to me again how this patch helps. This just seems to
delay the inevitable failure a bit, but the end result is that the
user needs to pick and choose which namespaces to enable after the
kernel has tried to auto-probe namespaces.
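
To make the arithmetic in this exchange concrete, here is a rough sketch
of the 8TB / 6TB example from the quoted text. It is a toy model and not
kernel code: it treats the ioremap range as a simple byte budget and
ignores fragmentation and any other users of that range. The names
fits(), probe_demand() and enable_demand() are hypothetical helpers, and
PROBE_BYTES is an assumed size for the reserved block area that the
patch description says gets mapped during probe.

```python
TB = 1 << 40
IOREMAP_LIMIT = 8 * TB   # ppc64 hash translation with 4k pages, per the thread
PROBE_BYTES = 8 * 1024   # ASSUMPTION: size of the reserved block area mapped at probe


def fits(requests, limit=IOREMAP_LIMIT):
    """True if all concurrent mapping requests fit in the ioremap budget."""
    return sum(requests) <= limit


def probe_demand(sizes, map_full):
    """ioremap bytes needed while probing namespaces in parallel."""
    return [size if map_full else min(size, PROBE_BYTES) for size in sizes]


def enable_demand(namespaces):
    """ioremap bytes needed once namespaces are in use.

    Per the patch description, only raw and btt use the ioremap range;
    fsdax and devdax do not.
    """
    return [size for mode, size in namespaces if mode in ("raw", "btt")]


sizes = [6 * TB, 6 * TB]

# Current behaviour: parallel probe maps both namespaces in full.
print(fits(probe_demand(sizes, map_full=True)))

# With the patch: probe maps only the small reserved area.
print(fits(probe_demand(sizes, map_full=False)))

# After probe, two fsdax namespaces need no ioremap space ...
print(fits(enable_demand([("fsdax", 6 * TB), ("fsdax", 6 * TB)])))

# ... but two raw namespaces still exceed the limit.
print(fits(enable_demand([("raw", 6 * TB), ("raw", 6 * TB)])))
```

Under these assumptions the probe step stops failing, fsdax and devdax
namespaces come up regardless of size, and the raw/btt case still runs
out of ioremap space once the full mappings are requested, which is the
remaining failure being questioned above.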
