Re: [PATCH v6 4/8] mm,memory_hotplug: Allocate memmap from the added memory range
On 09.04.21 07:10, Oscar Salvador wrote: On Thu, Apr 08, 2021 at 10:05:18PM -0700, Andrew Morton wrote: Yes please. I just sent v7 with that yesterday [1] Hope David/Michal finds some time to review patch#4 as that is the only missing piece atm. I will have a look next week. Still have to wrap my head around the present_pages stuff and if that couldn't be handled in a cleaner way. -- Thanks, David / dhildenb
Re: [PATCH v6 4/8] mm,memory_hotplug: Allocate memmap from the added memory range
On Thu, Apr 08, 2021 at 10:05:18PM -0700, Andrew Morton wrote: > Yes please. I just sent v7 with that yesterday [1] Hope David/Michal finds some time to review patch#4 as that is the only missing piece atm. [1] https://lkml.org/lkml/2021/4/8/546 -- Oscar Salvador SUSE L3
Re: [PATCH v6 4/8] mm,memory_hotplug: Allocate memmap from the added memory range
On Wed, 07 Apr 2021 22:38:37 +0200 Oscar Salvador wrote: > On 2021-04-06 22:28, Oscar Salvador wrote: > > Heh, it seems I spaced out today. > > > > We need a few things on top: > > > Yes please.
Re: [PATCH v6 4/8] mm,memory_hotplug: Allocate memmap from the added memory range
On 2021-04-06 22:28, Oscar Salvador wrote: Heh, it seems I spaced out today. We need a few things on top: Should I send a new version with the fixup included? I think it would ease the review but I do not want to spam. -- Oscar Salvador SUSE L3
Re: [PATCH v6 4/8] mm,memory_hotplug: Allocate memmap from the added memory range
On Tue, Apr 06, 2021 at 01:11:11PM +0200, Oscar Salvador wrote: > Physical memory hotadd has to allocate a memmap (struct page array) for > the newly added memory section. Currently, alloc_pages_node() is used > for those allocations. > > This has some disadvantages: > a) an existing memory is consumed for that purpose > (eg: ~2MB per 128MB memory section on x86_64) > b) if the whole node is movable then we have off-node struct pages > which has performance drawbacks. > c) It might be there are no PMD_ALIGNED chunks so memmap array gets > populated with base pages. > > This can be improved when CONFIG_SPARSEMEM_VMEMMAP is enabled. > > Vmemap page tables can map arbitrary memory. > That means that we can simply use the beginning of each memory section and > map struct pages there. > struct pages which back the allocated space then just need to be treated > carefully. > > Implementation wise we will reuse vmem_altmap infrastructure to override > the default allocator used by __populate_section_memmap. > Part of the implementation also relies on memory_block structure gaining > a new field which specifies the number of vmemmap_pages at the beginning. > This patch also introduces the following functions: > > - vmemmap_init_space: Initializes vmemmap pages by calling > move_pfn_range_to_zone(), > calls kasan_add_zero_shadow() or the vmemmap range and > marks > online as many sections as vmemmap pages fully span. > - vmemmap_adjust_pages: Accounts/substract vmemmap_pages to node and zone >present_pages > - vmemmap_deinit_space: Undoes what vmemmap_init_space does. > > The new function memory_block_online() calls vmemmap_init_space() before > doing the actual online_pages(). Should online_pages() fail, we clean up > by calling vmemmap_adjust_pages() and vmemmap_deinit_space(). > > On offline, memory_block_offline() calls vmemmap_adjust_pages() prior to > calling > offline_pages(), because offline_pages() performs the tearing-down of kthreads > and the rebuilding of the zonelists if the node/zone become empty. > If offline_pages() fails, we account back vmemmap pages by > vmemmap_adjust_pages(). > If it succeeds, we call vmemmap_deinit_space(). > > Hot-remove: > > We need to be careful when removing memory, as adding and > removing memory needs to be done with the same granularity. > To check that this assumption is not violated, we check the > memory range we want to remove and if a) any memory block has > vmemmap pages and b) the range spans more than a single memory > block, we scream out loud and refuse to proceed. > > If all is good and the range was using memmap on memory (aka vmemmap pages), > we construct an altmap structure so free_hugepage_table does the right > thing and calls vmem_altmap_free instead of free_pagetable. > > Signed-off-by: Oscar Salvador Heh, it seems I spaced out today. We need a few things on top: - In case !CONFIG_MEMORY_HOTREMOVE, we still need vmemmap_deinit_space as we call it from memory_block_online() too in case online_pages() fails. So we need to move it out of #CONFIG_MEMORY_HOTREMOVE, with the others. - If vmemmap pages fully spans sections, we need to mark those sections as online, since online_pages() will not do it for us. In vmemmap_deinit_space, we need to mark them back offline. Since vmemmap_deinit_space might get called from memory_block_online(), we need to move the offline_mem_sections() out of #CONFIG_MEMORY_HOTREMOVE code. So, we would need the following on top: diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h index cc3dbc0abf02..c7669d2accfd 100644 --- a/include/linux/memory_hotplug.h +++ b/include/linux/memory_hotplug.h @@ -111,6 +111,7 @@ extern int add_one_highpage(struct page *page, int pfn, int bad_ppro); extern void vmemmap_adjust_pages(unsigned long pfn, long nr_pages); extern int vmemmap_init_space(unsigned long pfn, unsigned long nr_pages, int nid, int online_type); +extern void vmemmap_deinit_space(unsigned long pfn, unsigned long nr_pages); extern int online_pages(unsigned long pfn, unsigned long nr_pages, int online_type, int nid); extern struct zone *test_pages_in_a_zone(unsigned long start_pfn, @@ -317,7 +318,6 @@ static inline void pgdat_resize_init(struct pglist_data *pgdat) {} #ifdef CONFIG_MEMORY_HOTREMOVE -extern void vmemmap_deinit_space(unsigned long pfn, unsigned long nr_pages); extern void try_offline_node(int nid); extern int offline_pages(unsigned long start_pfn, unsigned long nr_pages); extern int remove_memory(int nid, u64 start, u64 size); diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h index 747e1c21d8e2..76f4ca5ed230 100644 --- a/include/linux/mmzone.h +++ b/include/linux/mmzone.h @@ -1383,10 +1383,8 @@ static inline int online_section_nr(unsigned long nr) #ifdef CONFIG_MEMORY_HOTPLUG void
Re: [PATCH v6 4/8] mm,memory_hotplug: Allocate memmap from the added memory range
Hi Oscar, Thank you for the patch! Yet something to improve: [auto build test ERROR on driver-core/driver-core-testing] [also build test ERROR on pm/linux-next tip/x86/core arm64/for-next/core linus/master v5.12-rc6] [cannot apply to hnaz-linux-mm/master next-20210406] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch] url: https://github.com/0day-ci/linux/commits/Oscar-Salvador/Allocate-memmap-from-hotadded-memory-per-device/20210406-191333 base: https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core.git 6e11b376fd74356e32d842be588e12dc9bf6e197 config: x86_64-randconfig-a015-20210406 (attached as .config) compiler: clang version 13.0.0 (https://github.com/llvm/llvm-project a46f59a747a7273cc439efaf3b4f98d8b63d2f20) reproduce (this is a W=1 build): wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # install x86_64 cross compiling tool for clang build # apt-get install binutils-x86-64-linux-gnu # https://github.com/0day-ci/linux/commit/4d4265dd4e598c7b0971d57894685136229f5d07 git remote add linux-review https://github.com/0day-ci/linux git fetch --no-tags linux-review Oscar-Salvador/Allocate-memmap-from-hotadded-memory-per-device/20210406-191333 git checkout 4d4265dd4e598c7b0971d57894685136229f5d07 # save the attached .config to linux build tree COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=x86_64 If you fix the issue, kindly add following tag as appropriate Reported-by: kernel test robot All errors (new ones prefixed by >>): >> drivers/base/memory.c:201:3: error: implicit declaration of function >> 'vmemmap_deinit_space' [-Werror,-Wimplicit-function-declaration] vmemmap_deinit_space(start_pfn, nr_vmemmap_pages); ^ drivers/base/memory.c:201:3: note: did you mean 'vmemmap_init_space'? include/linux/memory_hotplug.h:112:12: note: 'vmemmap_init_space' declared here extern int vmemmap_init_space(unsigned long pfn, unsigned long nr_pages, ^ drivers/base/memory.c:231:4: error: implicit declaration of function 'vmemmap_deinit_space' [-Werror,-Wimplicit-function-declaration] vmemmap_deinit_space(start_pfn, nr_pages); ^ 2 errors generated. vim +/vmemmap_deinit_space +201 drivers/base/memory.c 171 172 static int memory_block_online(struct memory_block *mem) 173 { 174 unsigned long start_pfn = section_nr_to_pfn(mem->start_section_nr); 175 unsigned long nr_pages = PAGES_PER_SECTION * sections_per_block; 176 unsigned long nr_vmemmap_pages = mem->nr_vmemmap_pages; 177 int ret; 178 179 /* 180 * Although vmemmap pages have a different lifecycle than the pages 181 * they describe (they remain until the memory is unplugged), doing 182 * its initialization and accounting at hot-{online,offline} stage 183 * simplifies things a lot 184 */ 185 if (nr_vmemmap_pages) { 186 ret = vmemmap_init_space(start_pfn, nr_vmemmap_pages, mem->nid, 187 mem->online_type); 188 if (ret) 189 return ret; 190 } 191 192 ret = online_pages(start_pfn + nr_vmemmap_pages, 193 nr_pages - nr_vmemmap_pages, mem->online_type, 194 mem->nid); 195 196 /* 197 * Undo the work if online_pages() fails. 198 */ 199 if (ret && nr_vmemmap_pages) { 200 vmemmap_adjust_pages(start_pfn, -nr_vmemmap_pages); > 201 vmemmap_deinit_space(start_pfn, nr_vmemmap_pages); 202 } 203 204 return ret; 205 } 206 --- 0-DAY CI Kernel Test Service, Intel Corporation https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org .config.gz Description: application/gzip