On Mon, Sep 9, 2019 at 11:29 PM Aneesh Kumar K.V <aneesh.ku...@linux.ibm.com> wrote: > > With PFN_MODE_PMEM namespace, the memmap area is allocated from the device > area. Some architectures map the memmap area with large page size. On > architectures like ppc64, 16MB page for memap mapping can map 262144 pfns. > This maps a namespace size of 16G. > > When populating memmap region with 16MB page from the device area, > make sure the allocated space is not used to map resources outside this > namespace. Such usage of device area will prevent a namespace destroy. > > Add resource end pnf in altmap and use that to check if the memmap area > allocation can map pfn outside the namespace. On ppc64 in such case we > fallback > to allocation from memory. > > This fix kernel crash reported below: > > [ 132.034989] WARNING: CPU: 13 PID: 13719 at mm/memremap.c:133 > devm_memremap_pages_release+0x2d8/0x2e0 > [ 133.464754] BUG: Unable to handle kernel data access at 0xc00c00010b204000 > [ 133.464760] Faulting instruction address: 0xc00000000007580c > [ 133.464766] Oops: Kernel access of bad area, sig: 11 [#1] > [ 133.464771] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries > ..... > [ 133.464901] NIP [c00000000007580c] vmemmap_free+0x2ac/0x3d0 > [ 133.464906] LR [c0000000000757f8] vmemmap_free+0x298/0x3d0 > [ 133.464910] Call Trace: > [ 133.464914] [c000007cbfd0f7b0] [c0000000000757f8] vmemmap_free+0x298/0x3d0 > (unreliable) > [ 133.464921] [c000007cbfd0f8d0] [c000000000370a44] > section_deactivate+0x1a4/0x240 > [ 133.464928] [c000007cbfd0f980] [c000000000386270] > __remove_pages+0x3a0/0x590 > [ 133.464935] [c000007cbfd0fa50] [c000000000074158] > arch_remove_memory+0x88/0x160 > [ 133.464942] [c000007cbfd0fae0] [c0000000003be8c0] > devm_memremap_pages_release+0x150/0x2e0 > [ 133.464949] [c000007cbfd0fb70] [c000000000738ea0] > devm_action_release+0x30/0x50 > [ 133.464955] [c000007cbfd0fb90] [c00000000073a5a4] release_nodes+0x344/0x400 > [ 133.464961] [c000007cbfd0fc40] [c00000000073378c] > device_release_driver_internal+0x15c/0x250 > [ 133.464968] [c000007cbfd0fc80] [c00000000072fd14] unbind_store+0x104/0x110 > [ 133.464973] [c000007cbfd0fcd0] [c00000000072ee24] drv_attr_store+0x44/0x70 > [ 133.464981] [c000007cbfd0fcf0] [c0000000004a32bc] sysfs_kf_write+0x6c/0xa0 > [ 133.464987] [c000007cbfd0fd10] [c0000000004a1dfc] > kernfs_fop_write+0x17c/0x250 > [ 133.464993] [c000007cbfd0fd60] [c0000000003c348c] __vfs_write+0x3c/0x70 > [ 133.464999] [c000007cbfd0fd80] [c0000000003c75d0] vfs_write+0xd0/0x250
Question, does this crash only happen when the namespace is not 16MB aligned? In other words was this bug exposed by the subsection-hotplug changes and should it contain Fixes: tag for those commits?