On 3/25/21 11:09 PM, Joao Martins wrote:
[...]
> Patch 11: Optimize grabbing page refcount changes given that we
> are working with compound pages i.e. we do 1 increment to the head
> page for a given set of N subpages, as opposed to N individual writes.
> {get,pin}_user_pages_fast() for zone_device with compound pagemap consequently
> improves considerably with DRAM stored struct pages. It also *greatly*
> improves pinning with altmap. Results with gup_test:
>
>                                                  before     after
> (16G get_user_pages_fast 2M page size)           ~59 ms  -> ~6.1 ms
> (16G pin_user_pages_fast 2M page size)           ~87 ms  -> ~6.2 ms
> (16G get_user_pages_fast altmap 2M page size)    ~494 ms -> ~9 ms
> (16G pin_user_pages_fast altmap 2M page size)    ~494 ms -> ~10 ms
>
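To illustrate the "1 increment instead of N writes" point, here is a minimal Python sketch. It is a simplified model and not kernel code: the refcount is a plain dict entry, the function names are hypothetical, and it assumes 4K base pages so that a 2M compound page has 512 subpages.

```python
# Simplified model, not kernel code: refcounts live in a plain dict and
# every update to the head page's counter is tallied as one "write".
SUBPAGES_PER_2M = (2 * 1024 * 1024) // 4096  # 512, assuming 4K base pages


def grab_per_subpage(refcounts, head, nr_subpages):
    """Old behaviour in this model: one +1 write per subpage."""
    writes = 0
    for _ in range(nr_subpages):
        refcounts[head] = refcounts.get(head, 0) + 1
        writes += 1
    return writes


def grab_per_head(refcounts, head, nr_subpages):
    """New behaviour in this model: a single +N write on the head page."""
    refcounts[head] = refcounts.get(head, 0) + nr_subpages
    return 1


if __name__ == "__main__":
    old, new = {}, {}
    old_writes = grab_per_subpage(old, "head0", SUBPAGES_PER_2M)
    new_writes = grab_per_head(new, "head0", SUBPAGES_PER_2M)
    # Both schemes end with the same reference count; only the number
    # of writes to the head page differs.
    print(old["head0"], new["head0"], old_writes, new_writes)
```

Both variants leave the head page with the same reference count, so the saving comes purely from doing one write per compound page rather than one per subpage.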
> altmap performance gets especially interesting when pinning a pmem dimm:
>
>                                                  before      after
> (128G get_user_pages_fast 2M page size)          ~492 ms  -> ~49 ms
> (128G pin_user_pages_fast 2M page size)          ~493 ms  -> ~50 ms
> (128G get_user_pages_fast altmap 2M page size)   ~3.91 ms -> ~70 ms
> (128G pin_user_pages_fast altmap 2M page size)   ~3.97 ms -> ~74 ms
>
Quick correction: the last two values in the left column, 3.91 and 3.97, are in
*seconds*, not milliseconds. I added an extra 'm' by mistake. Sorry about that.
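With the units corrected, the 128G altmap cases improve by roughly 54x to 56x, versus about 10x for the non-altmap cases. A quick Python check of that arithmetic, using only the figures quoted above (the `speedup` helper and the dict layout are just for illustration):

```python
def speedup(before_ms, after_ms):
    """Return how many times faster 'after' is compared to 'before'."""
    return before_ms / after_ms


# (before, after) in milliseconds. The altmap "before" values are the
# corrected 3.91 s and 3.97 s, converted to milliseconds.
PMEM_128G = {
    "get_user_pages_fast": (492, 49),
    "pin_user_pages_fast": (493, 50),
    "get_user_pages_fast altmap": (3910, 70),
    "pin_user_pages_fast altmap": (3970, 74),
}

if __name__ == "__main__":
    for name, (before, after) in PMEM_128G.items():
        print(f"{name}: {speedup(before, after):.1f}x")
```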
> The unpinning improvement patches are in mmotm/linux-next so removed from this
> series.
>
> I have deferred the __get_user_pages() patch to outside this series
> (https://lore.kernel.org/linux-mm/[email protected]/),
> as I found a simpler way to address it and that is also applicable to
> THP. But will submit that as a follow up of this.
>
> Patches apply on top of linux-next tag next-20210325 (commit b4f20b70784a).
>
> Comments and suggestions very much appreciated!
>
> Changelog,
>
> RFC -> v1:
> (New patches 1-3, 5-8 but the diffstat isn't that different)
> * Fix hwpoisoning of devmap pages reported by Jane (Patch 1 is new in v1)
> * Fix/Massage commit messages to be more clear and remove the 'we'
> occurrences (Dan, John, Matthew)
I just noticed that I haven't fully removed the 'we' occurrences. Patches 7 and 8
had their commit messages rewritten and I mistakenly re-introduced remnants of
'we'. I will have it fixed for v2, although I'll still wait for comments on this
series before following up.
> * Use pfn_align to be clear it's nr of pages for @align value (John, Dan)
> * Add two helpers pgmap_align() and pgmap_pfn_align() as accessors of
> pgmap->align;
> * Remove the gup_device_compound_huge special path and have the same code
> work both ways while special casing when devmap page is compound (Jason,
> John)
> * Avoid usage of vmemmap_populate_basepages() and introduce a first class
> loop that doesn't care about passing an altmap for memmap reuse. (Dan)
> * Completely rework the vmemmap_populate_compound() to avoid the
> sparse_add_section hack of passing a block across sparse_add_section calls.
> It's a lot easier to follow and more explicit in what it does.
> * Replace the vmemmap refactoring with adding a @pgmap argument and moving
> parts of the vmemmap_populate_basepages(). (Patch 5 and 6 are new as a
> result)
> * Add PMD tail page vmemmap area reuse for 1GB pages. (Patch 8 is new)
> * Improve memmap_init_zone_device() to initialize compound pages when
> struct pages are cache warm. That led to an even further speedup over the
> RFC series, from 190ms -> 80-120ms. Patches 2 and 3 are the new ones
> as a result (Dan)
> * Remove PGMAP_COMPOUND and use @align as the property to detect whether
> or not to reuse vmemmap areas (Dan)
>
> Thanks,
> Joao
>
> Joao Martins (11):
> memory-failure: fetch compound_head after pgmap_pfn_valid()
> mm/page_alloc: split prep_compound_page into head and tail subparts
> mm/page_alloc: refactor memmap_init_zone_device() page init
> mm/memremap: add ZONE_DEVICE support for compound pages
> mm/sparse-vmemmap: add a pgmap argument to section activation
> mm/sparse-vmemmap: refactor vmemmap_populate_basepages()
> mm/sparse-vmemmap: populate compound pagemaps
> mm/sparse-vmemmap: use hugepages for PUD compound pagemaps
> mm/page_alloc: reuse tail struct pages for compound pagemaps
> device-dax: compound pagemap support
> mm/gup: grab head page refcount once for group of subpages
>
> drivers/dax/device.c | 58 +++++++--
> include/linux/memory_hotplug.h | 5 +-
> include/linux/memremap.h | 13 ++
> include/linux/mm.h | 8 +-
> mm/gup.c | 52 +++++---
> mm/memory-failure.c | 2 +
> mm/memory_hotplug.c | 3 +-
> mm/memremap.c | 9 +-
> mm/page_alloc.c | 126 +++++++++++++------
> mm/sparse-vmemmap.c | 221 +++++++++++++++++++++++++++++----
> mm/sparse.c | 24 ++--
> 11 files changed, 406 insertions(+), 115 deletions(-)
>
_______________________________________________
Linux-nvdimm mailing list -- [email protected]