Re: [PATCH] mm: Remove double faults once write a device pfn

2024-01-24 Thread Alistair Popple
"Zhou, Xianrong" writes: > [AMD Official Use Only - General] > >> > The vmf_insert_pfn_prot could cause unnecessary double faults on a >> > device pfn. Because currently the vmf_insert_pfn_prot does not >> > make the pfn writable so the pte entry is normally read-only or >> >

Re: [RFC PATCH 0/6] Supporting GMEM (generalized memory management) for external memory devices

2023-12-03 Thread Alistair Popple
Christian König writes: > Am 01.12.23 um 06:48 schrieb Zeng, Oak: >> [SNIP] >> Besides memory eviction/oversubscription, there are a few other pain points >> when I use hmm: >> >> 1) hmm doesn't support file-back memory, so it is hard to share > memory b/t process in a gpu environment. You

Re: [RFC PATCH 0/6] Supporting GMEM (generalized memory management) for external memory devices

2023-11-30 Thread Alistair Popple
"Zeng, Oak" writes: > See inline comments > >> -Original Message- >> From: dri-devel On Behalf Of >> zhuweixi >> Sent: Thursday, November 30, 2023 5:48 AM >> To: Christian König ; Zeng, Oak >> ; Christian König ; linux- >> m...@kvack.org; linux-ker...@vger.kernel.org;

Re: [RFC PATCH 0/6] Supporting GMEM (generalized memory management) for external memory devices

2023-11-30 Thread Alistair Popple
zhuweixi writes: > Glad to know that there is a common demand for a new syscall like > hmadvise(). I expect it would also be useful for homogeneous NUMA > cases. Credits to cudaMemAdvise() API which brought this idea to > GMEM's design. It's not clear to me that this would need to be a new

Re: [RFC v1 1/3] mm/mmu_notifier: Add a new notifier for mapping updates (new pages)

2023-08-22 Thread Alistair Popple
"Kasireddy, Vivek" writes: > Hi Alistair, > >> >> > > > No, adding HMM_PFN_REQ_WRITE still doesn't help in fixing the >> issue. >> >> > > > Although, I do not have THP enabled (or built-in), shmem does not >> evict >> >> > > > the pages after hole punch as noted in the comment in >> >>

Re: [RFC v1 1/3] mm/mmu_notifier: Add a new notifier for mapping updates (new pages)

2023-08-21 Thread Alistair Popple
"Kasireddy, Vivek" writes: > Hi Jason, > >> > > >> > > > No, adding HMM_PFN_REQ_WRITE still doesn't help in fixing the issue. >> > > > Although, I do not have THP enabled (or built-in), shmem does not evict >> > > > the pages after hole punch as noted in the comment in >> shmem_fallocate(): >>

Re: [RFC v1 1/3] mm/mmu_notifier: Add a new notifier for mapping updates (new pages)

2023-08-03 Thread Alistair Popple
David Hildenbrand writes: > On 03.08.23 14:14, Jason Gunthorpe wrote: >> On Thu, Aug 03, 2023 at 07:35:51AM +, Kasireddy, Vivek wrote: >>> Hi Jason, >>> >> Right, the "the zero pages are changed into writable pages" in your >> above comment just might not apply, because there won't

Re: [RFC v1 1/3] mm/mmu_notifier: Add a new notifier for mapping updates (new pages)

2023-07-24 Thread Alistair Popple
"Kasireddy, Vivek" writes: > Hi Alistair, Hi Vivek, >> I wonder if we actually need the flag? IIUC it is already used for more >> than just KSM. For example it can be called as part of fault handling by >> set_pte_at_notify() in wp_page_copy(). > Yes, I noticed that but what I really

Re: [RFC v1 1/3] mm/mmu_notifier: Add a new notifier for mapping updates (new pages)

2023-07-24 Thread Alistair Popple
Jason Gunthorpe writes: > On Mon, Jul 24, 2023 at 07:54:38AM +, Kasireddy, Vivek wrote: >> And replace mmu_notifier_update_mapping(vma->vm_mm, address, pte_pfn(*ptep)) >> in the current patch with >> mmu_notifier_change_pte(vma->vm_mm, address, ptep, false)); > > It isn't very useful

Re: [RFC v1 1/3] mm/mmu_notifier: Add a new notifier for mapping updates (new pages)

2023-07-24 Thread Alistair Popple
"Kasireddy, Vivek" writes: > Hi Alistair, > >> >> >> "Kasireddy, Vivek" writes: >> >> Yes, although obviously as I think you point out below you wouldn't be >> able to take any sleeping locks in mmu_notifier_update_mapping(). > Yes, I understand that, but I am not sure how we can prevent

Re: [RFC v1 1/3] mm/mmu_notifier: Add a new notifier for mapping updates (new pages)

2023-07-20 Thread Alistair Popple
"Kasireddy, Vivek" writes: > Hi Alistair, > >> >> > diff --git a/mm/hugetlb.c b/mm/hugetlb.c >> > index 64a3239b6407..1f2f0209101a 100644 >> > --- a/mm/hugetlb.c >> > +++ b/mm/hugetlb.c >> > @@ -6096,8 +6096,12 @@ vm_fault_t hugetlb_fault(struct mm_struct >> *mm, struct vm_area_struct *vma,

Re: [RFC v1 1/3] mm/mmu_notifier: Add a new notifier for mapping updates (new pages)

2023-07-18 Thread Alistair Popple
Vivek Kasireddy writes: > diff --git a/mm/hugetlb.c b/mm/hugetlb.c > index 64a3239b6407..1f2f0209101a 100644 > --- a/mm/hugetlb.c > +++ b/mm/hugetlb.c > @@ -6096,8 +6096,12 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct > vm_area_struct *vma, >* hugetlb_no_page will

Re: [PATCH] mm/migrate_device: Return number of migrating pages in args->cpages

2022-11-16 Thread Alistair Popple
:51, Alistair Popple wrote: >> migrate_vma->cpages originally contained a count of the number of >> pages migrating including non-present pages which can be poluated > > "populated" > >> directly on the target. >> >> Commit 2

[PATCH] mm/memory: Return vm_fault_t result from migrate_to_ram() callback

2022-11-14 Thread Alistair Popple
t variable, restoring the previous behaviour on migrate_to_ram() failure. Fixes: 16ce101db85d ("mm/memory.c: fix race when faulting a device private page") Signed-off-by: Alistair Popple Cc: Ralph Campbell Cc: John Hubbard Cc: Alex Sierra Cc: Ben Skeggs Cc: Felix Kuehling Cc: Lyude Paul

[PATCH] mm/migrate_device: Return number of migrating pages in args->cpages

2022-11-10 Thread Alistair Popple
refactor migrate_vma and migrate_deivce_coherent_page()") Signed-off-by: Alistair Popple Reported-by: Ralph Campbell Cc: John Hubbard Cc: Alex Sierra Cc: Ben Skeggs Cc: Felix Kuehling Cc: Lyude Paul Cc: Jason Gunthorpe Cc: Michael Ellerman --- Hi Andrew, hoping you can merge this small fix which Ralph reported to

Re: [PATCH v2 0/8] Fix several device private page reference counting issues

2022-10-25 Thread Alistair Popple
"Vlastimil Babka (SUSE)" writes: > On 9/28/22 14:01, Alistair Popple wrote: >> This series aims to fix a number of page reference counting issues in >> drivers dealing with device private ZONE_DEVICE pages. These result in >> use-after-free type bugs, either fro

Re: [PATCH] mm/memremap: Introduce pgmap_request_folio() using pgmap offsets

2022-10-23 Thread Alistair Popple
and switch to using >>>> gen_pool_alloc() to track which offsets of a pgmap are allocated. That's an interesting idea. I might take a look at converting hmm-tests to do this (and probably by extension Nouveau as the allocator is basically the same). Feel free to also add: Reviewed-by: A

Re: [PATCH v2 1/8] mm/memory.c: Fix race when faulting a device private page

2022-10-02 Thread Alistair Popple
Felix Kuehling writes: > On 2022-09-28 08:01, Alistair Popple wrote: >> When the CPU tries to access a device private page the migrate_to_ram() >> callback associated with the pgmap for the page is called. However no >> reference is taken on the faulting page. T

Re: [PATCH 2/7] mm: Free device private pages have zero refcount

2022-09-29 Thread Alistair Popple
Dan Williams writes: > Alistair Popple wrote: >> >> Jason Gunthorpe writes: >> >> > On Mon, Sep 26, 2022 at 04:03:06PM +1000, Alistair Popple wrote: >> >> Since 27674ef6c73f ("mm: remove the extra ZONE_DEVICE struct page >> >>

Re: [PATCH v2 8/8] hmm-tests: Add test for migrate_device_range()

2022-09-29 Thread Alistair Popple
Andrew Morton writes: > On Wed, 28 Sep 2022 22:01:22 +1000 Alistair Popple wrote: > >> @@ -1401,22 +1494,7 @@ static int dmirror_device_init(struct dmirror_device >> *mdevice, int id) >> >> static void dmirror_device_remove(struct dmirror_device *mdevice

Re: [PATCH 1/7] mm/memory.c: Fix race when faulting a device private page

2022-09-28 Thread Alistair Popple
Michael Ellerman writes: > Alistair Popple writes: >> When the CPU tries to access a device private page the migrate_to_ram() >> callback associated with the pgmap for the page is called. However no >> reference is taken on the faulting page. Therefore a concurrent >>

[PATCH v2 8/8] hmm-tests: Add test for migrate_device_range()

2022-09-28 Thread Alistair Popple
Signed-off-by: Alistair Popple Cc: Jason Gunthorpe Cc: Ralph Campbell Cc: John Hubbard Cc: Alex Sierra Cc: Felix Kuehling --- lib/test_hmm.c | 120 +- lib/test_hmm_uapi.h| 1 +- tools/testing/selftests/vm/hmm-tests.c

[PATCH v2 7/8] nouveau/dmem: Evict device private memory during release

2022-09-28 Thread Alistair Popple
device pages have been freed which may never happen. Fix this by migrating device mappings back to normal CPU memory prior to freeing the GPU memory chunks and associated device private pages. Signed-off-by: Alistair Popple Cc: Lyude Paul Cc: Ben Skeggs Cc: Ralph Campbell Cc: John Hubbard

[PATCH v2 6/8] nouveau/dmem: Refactor nouveau_dmem_fault_copy_one()

2022-09-28 Thread Alistair Popple
. Refactor out the core functionality so that it is not specific to fault handling. Signed-off-by: Alistair Popple Reviewed-by: Lyude Paul Cc: Ben Skeggs Cc: Ralph Campbell Cc: John Hubbard --- drivers/gpu/drm/nouveau/nouveau_dmem.c | 58 +-- 1 file changed, 28 insertions

[PATCH v2 5/8] mm/migrate_device.c: Add migrate_device_range()

2022-09-28 Thread Alistair Popple
n free up device memory. To allow that this patch introduces the migrate_device family of functions which are functionally similar to migrate_vma but which skips the initial lookup based on mapping. Signed-off-by: Alistair Popple Cc: "Huang, Ying" Cc: Zi Yan Cc: Matthew Wilcox Cc:

[PATCH v2 4/8] mm/migrate_device.c: Refactor migrate_vma and migrate_deivce_coherent_page()

2022-09-28 Thread Alistair Popple
this isn't true for device private memory, and a future change requires similar functionality for device private memory. So refactor the code into something more sensible for migrating device memory without a vma. Signed-off-by: Alistair Popple Cc: "Huang, Ying" Cc: Zi Yan Cc: Matthew

[PATCH v2 2/8] mm: Free device private pages have zero refcount

2022-09-28 Thread Alistair Popple
functions such as get_page_unless_zero(). Signed-off-by: Alistair Popple Cc: Jason Gunthorpe Cc: Michael Ellerman Cc: Felix Kuehling Cc: Alex Deucher Cc: Christian König Cc: Ben Skeggs Cc: Lyude Paul Cc: Ralph Campbell Cc: Alex Sierra Cc: John Hubbard Cc: Dan Williams --- This will conflict with Dan's ser

[PATCH v2 1/8] mm/memory.c: Fix race when faulting a device private page

2022-09-28 Thread Alistair Popple
if it's expected or not. Signed-off-by: Alistair Popple Cc: Jason Gunthorpe Cc: John Hubbard Cc: Ralph Campbell Cc: Michael Ellerman Cc: Felix Kuehling Cc: Lyude Paul --- arch/powerpc/kvm/book3s_hv_uvmem.c | 15 ++- drivers/gpu/drm/amd/amdkfd/kfd_migrate.c | 17

[PATCH v2 3/8] mm/memremap.c: Take a pgmap reference on page allocation

2022-09-28 Thread Alistair Popple
ough pages are still mapped by the kernel which can lead to kernel crashes, particularly if a driver frees the pagemap. To fix this drivers should take a pagemap reference when allocating the page. This reference can then be returned when the page is freed. Signed-off-by: Alistair Popple Fixes: 27

[PATCH v2 0/8] Fix several device private page reference counting issues

2022-09-28 Thread Alistair Popple
-...@lists.freedesktop.org Cc: nouv...@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org Alistair Popple (8): mm/memory.c: Fix race when faulting a device private page mm: Free device private pages have zero refcount mm/memremap.c: Take a pgmap reference on page allocation mm

Re: [PATCH 5/7] nouveau/dmem: Refactor nouveau_dmem_fault_copy_one()

2022-09-28 Thread Alistair Popple
Lyude Paul writes: > On Mon, 2022-09-26 at 16:03 +1000, Alistair Popple wrote: >> nouveau_dmem_fault_copy_one() is used during handling of CPU faults via >> the migrate_to_ram() callback and is used to copy data from GPU to CPU >> memory. It is currently specific to fa

Re: [PATCH 2/7] mm: Free device private pages have zero refcount

2022-09-26 Thread Alistair Popple
Jason Gunthorpe writes: > On Mon, Sep 26, 2022 at 04:03:06PM +1000, Alistair Popple wrote: >> Since 27674ef6c73f ("mm: remove the extra ZONE_DEVICE struct page >> refcount") device private pages have no longer had an extra reference >> count when the page is in u

Re: [PATCH 6/7] nouveau/dmem: Evict device private memory during release

2022-09-26 Thread Alistair Popple
Felix Kuehling writes: > On 2022-09-26 17:35, Lyude Paul wrote: >> On Mon, 2022-09-26 at 16:03 +1000, Alistair Popple wrote: >>> When the module is unloaded or a GPU is unbound from the module it is >>> possible for device private pages to be left mapped in curr

Re: [PATCH 6/7] nouveau/dmem: Evict device private memory during release

2022-09-26 Thread Alistair Popple
John Hubbard writes: > On 9/26/22 14:35, Lyude Paul wrote: >>> + for (i = 0; i < npages; i++) { >>> + if (src_pfns[i] & MIGRATE_PFN_MIGRATE) { >>> + struct page *dpage; >>> + >>> + /* >>> +* _GFP_NOFAIL because the GPU is

[PATCH 6/7] nouveau/dmem: Evict device private memory during release

2022-09-26 Thread Alistair Popple
and callbacks have all been freed. Fix this by migrating any mappings back to normal CPU memory prior to freeing the GPU memory chunks and associated device private pages. Signed-off-by: Alistair Popple --- I assume the AMD driver might have a similar issue. However I can't see where device private

[PATCH 7/7] hmm-tests: Add test for migrate_device_range()

2022-09-26 Thread Alistair Popple
Signed-off-by: Alistair Popple --- lib/test_hmm.c | 119 +- lib/test_hmm_uapi.h| 1 +- tools/testing/selftests/vm/hmm-tests.c | 49 +++- 3 files changed, 148 insertions(+), 21 deletions(-) diff --git a/lib/test_hmm.c

[PATCH 5/7] nouveau/dmem: Refactor nouveau_dmem_fault_copy_one()

2022-09-26 Thread Alistair Popple
. Refactor out the core functionality so that it is not specific to fault handling. Signed-off-by: Alistair Popple --- drivers/gpu/drm/nouveau/nouveau_dmem.c | 59 +-- 1 file changed, 29 insertions(+), 30 deletions(-) diff --git a/drivers/gpu/drm/nouveau/nouveau_dmem.c b

[PATCH 4/7] mm/migrate_device.c: Add migrate_device_range()

2022-09-26 Thread Alistair Popple
n free up device memory. To allow that this patch introduces the migrate_device family of functions which are functionally similar to migrate_vma but which skips the initial lookup based on mapping. Signed-off-by: Alistair Popple --- include/linux/migrate.h | 7 +++- mm/migrate_device.c

[PATCH 3/7] mm/migrate_device.c: Refactor migrate_vma and migrate_deivce_coherent_page()

2022-09-26 Thread Alistair Popple
this isn't true for device private memory, and a future change requires similar functionality for device private memory. So refactor the code into something more sensible for migrating device memory without a vma. Signed-off-by: Alistair Popple --- mm/migrate_device.c | 150

[PATCH 2/7] mm: Free device private pages have zero refcount

2022-09-26 Thread Alistair Popple
functions such as get_page_unless_zero(). Signed-off-by: Alistair Popple --- arch/powerpc/kvm/book3s_hv_uvmem.c | 1 + drivers/gpu/drm/amd/amdkfd/kfd_migrate.c | 1 + drivers/gpu/drm/nouveau/nouveau_dmem.c | 1 + lib/test_hmm.c | 1 + mm/memremap.c| 5

[PATCH 1/7] mm/memory.c: Fix race when faulting a device private page

2022-09-26 Thread Alistair Popple
if it's expected or not. Signed-off-by: Alistair Popple --- arch/powerpc/kvm/book3s_hv_uvmem.c | 15 ++- drivers/gpu/drm/amd/amdkfd/kfd_migrate.c | 17 +++-- drivers/gpu/drm/amd/amdkfd/kfd_migrate.h | 2 +- drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 11 +--- include

[PATCH 0/7] Fix several device private page reference counting issues

2022-09-26 Thread Alistair Popple
. Unfortunately I lack the hardware to test on either of these so would appreciate it if someone with access could test those. Alistair Popple (7): mm/memory.c: Fix race when faulting a device private page mm: Free device private pages have zero refcount mm/migrate_device.c: Refactor

[PATCH] mm/gup.c: Fix formating in check_and_migrate_movable_page()

2022-07-20 Thread Alistair Popple
Commit b05a79d4377f ("mm/gup: migrate device coherent pages when pinning instead of failing") added a badly formatted if statement. Fix it. Signed-off-by: Alistair Popple Reported-by: David Hildenbrand --- Apologies Andrew for missing this. Hopefully this fixes things. mm/gup.c |

[PATCH] nouveau/svm: Fix to migrate all requested pages

2022-07-20 Thread Alistair Popple
SG_MAX_SINGLE_ALLOC. However a typo in updating the starting address means that only the first chunk will get migrated. Fix the calculation so that the entire range will get migrated if possible. Signed-off-by: Alistair Popple Fixes: e3d8b0890469 ("drm/nouveau/svm: map pages after migr

[PATCH] mm/gup: migrate device coherent pages when pinning instead of failing

2022-07-14 Thread Alistair Popple
-by: Alistair Popple Acked-by: Felix Kuehling Signed-off-by: Christoph Hellwig --- This patch hopefully addresses all of David's comments. It replaces both my "mm: remove the vma check in migrate_vma_setup()" and "mm/gup: migrate device coherent pages when pinning instead of failing" p

Re: [PATCH v8 07/15] mm/gup: migrate device coherent pages when pinning instead of failing

2022-07-14 Thread Alistair Popple
David Hildenbrand writes: > On 07.07.22 21:03, Alex Sierra wrote: >> From: Alistair Popple >> >> Currently any attempts to pin a device coherent page will fail. This is >> because device coherent pages need to be managed by a device driver, and >> pinning

Re: [PATCH v8 06/15] mm: remove the vma check in migrate_vma_setup()

2022-07-13 Thread Alistair Popple
David Hildenbrand writes: > On 07.07.22 21:03, Alex Sierra wrote: >> From: Alistair Popple >> >> migrate_vma_setup() checks that a valid vma is passed so that the page >> tables can be walked to find the pfns associated with a given address >> range. However i

Re: [PATCH v7 04/14] mm: add device coherent vma selection for memory migration

2022-06-30 Thread Alistair Popple
David Hildenbrand writes: > On 29.06.22 05:54, Alex Sierra wrote: >> This case is used to migrate pages from device memory, back to system >> memory. Device coherent type memory is cache coherent from device and CPU >> point of view. >> >> Signed-off-by: Alex Sierra >> Acked-by: Felix

Re: [PATCH v5 01/13] mm: add zone device coherent type memory support

2022-06-21 Thread Alistair Popple
David Hildenbrand writes: > On 21.06.22 18:08, Sierra Guiza, Alejandro (Alex) wrote: >> >> On 6/21/2022 7:25 AM, David Hildenbrand wrote: >>> On 21.06.22 13:55, Alistair Popple wrote: >>>> David Hildenbrand writes: >>>> >>>>> On 2

Re: [PATCH v5 01/13] mm: add zone device coherent type memory support

2022-06-21 Thread Alistair Popple
of view. >>>>>>>> This is used on platforms that have an advanced system bus (like CAPI >>>>>>>> or CXL). Any page of a process can be migrated to such memory. However, >>>>>>>> no one should be allowed to pin such memory so th

Re: [PATCH v5 01/13] mm: add zone device coherent type memory support

2022-06-20 Thread Alistair Popple
Oded Gabbay writes: > On Mon, Jun 20, 2022 at 3:33 AM Alistair Popple wrote: >> >> >> Oded Gabbay writes: >> >> > On Fri, Jun 17, 2022 at 8:20 PM Sierra Guiza, Alejandro (Alex) >> > wrote: >> >> >> >> >> >>

Re: [PATCH v5 01/13] mm: add zone device coherent type memory support

2022-06-19 Thread Alistair Popple
> >> evicted. >> >> >> >> Signed-off-by: Alex Sierra >> >> Acked-by: Felix Kuehling >> >> Reviewed-by: Alistair Popple >> >> [hch: rebased ontop of the refcount changes, >> >>removed is_dev_private_or_coherent_page]

Re: [PATCH v5 02/13] mm: handling Non-LRU pages returned by vm_normal_pages

2022-06-08 Thread Alistair Popple
I can't see any issues with this now so: Reviewed-by: Alistair Popple Alex Sierra writes: > With DEVICE_COHERENT, we'll soon have vm_normal_pages() return > device-managed anonymous pages that are not LRU pages. Although they > behave like normal pages for purposes of mapping in

Re: [PATCH v3 02/13] mm: handling Non-LRU pages returned by vm_normal_pages

2022-05-27 Thread Alistair Popple
Felix Kuehling writes: > Am 2022-05-25 um 00:11 schrieb Alistair Popple: >> Alex Sierra writes: >> >>> With DEVICE_COHERENT, we'll soon have vm_normal_pages() return >>> device-managed anonymous pages that are not LRU pages. Although they >>> behave

Re: [PATCH v3 02/13] mm: handling Non-LRU pages returned by vm_normal_pages

2022-05-26 Thread Alistair Popple
"Sierra Guiza, Alejandro (Alex)" writes: > On 5/24/2022 11:11 PM, Alistair Popple wrote: >> Alex Sierra writes: >> >>> With DEVICE_COHERENT, we'll soon have vm_normal_pages() return >>> device-managed anonymous pages that are not LRU pages.

Re: [PATCH v3 02/13] mm: handling Non-LRU pages returned by vm_normal_pages

2022-05-25 Thread Alistair Popple
Alex Sierra writes: > With DEVICE_COHERENT, we'll soon have vm_normal_pages() return > device-managed anonymous pages that are not LRU pages. Although they > behave like normal pages for purposes of mapping in CPU page, and for > COW. They do not support LRU lists, NUMA migration or THP. > >

Re: [PATCH v2 11/13] mm: handling Non-LRU pages returned by vm_normal_pages

2022-05-23 Thread Alistair Popple
Technically I think this patch should be earlier in the series. As I understand it patch 1 allows DEVICE_COHERENT pages to be inserted in the page tables and therefore makes it possible for page table walkers to see non-LRU pages. Some more comments below: Alex Sierra writes: > With

Re: [PATCH v1 14/15] tools: add hmm gup tests for device coherent type

2022-05-16 Thread Alistair Popple
(variant->device_number)) { > + ASSERT_EQ(HMM_DMIRROR_PROT_DEV_COHERENT_LOCAL | > HMM_DMIRROR_PROT_WRITE, m[0]); > + ASSERT_EQ(HMM_DMIRROR_PROT_DEV_COHERENT_LOCAL | > HMM_DMIRROR_PROT_WRITE, m[1]); > + } else { > + ASSERT_EQ(HMM_DMIRROR_PROT_W

Re: [PATCH v1 01/15] mm: add zone device coherent type memory support

2022-05-11 Thread Alistair Popple
Alex Sierra writes: [...] > diff --git a/mm/rmap.c b/mm/rmap.c > index fedb82371efe..d57102cd4b43 100644 > --- a/mm/rmap.c > +++ b/mm/rmap.c > @@ -1995,7 +1995,8 @@ void try_to_migrate(struct folio *folio, enum ttu_flags > flags) > TTU_SYNC))) >

Re: [PATCH v1 04/15] mm: add device coherent checker to remove migration pte

2022-05-11 Thread Alistair Popple
"Sierra Guiza, Alejandro (Alex)" writes: > @apop...@nvidia.com Could you please check this patch? It's somehow related > to migrate_device_page() for long term device coherent pages. > > Regards, > Alex Sierra >> -Original Message- >> From: amd-gfx On Behalf Of Alex >> Sierra >>

Re: [PATCH v1 04/15] mm: add device coherent checker to remove migration pte

2022-05-05 Thread Alistair Popple
"Sierra Guiza, Alejandro (Alex)" writes: > @apop...@nvidia.com Could you please check this patch? It's somehow related to > migrate_device_page() for long term device coherent pages. Sure thing. This whole series is in my queue of things to review once I make it home from LSF/MM. - Alistair

Re: [PATCH v1 1/3] mm: split vm_normal_pages for LRU and non-LRU handling

2022-03-16 Thread Alistair Popple
Felix Kuehling writes: > On 2022-03-11 04:16, David Hildenbrand wrote: >> On 10.03.22 18:26, Alex Sierra wrote: >>> DEVICE_COHERENT pages introduce a subtle distinction in the way >>> "normal" pages can be used by various callers throughout the kernel. >>> They behave like normal pages for

Re: [PATCH v1 1/3] mm: split vm_normal_pages for LRU and non-LRU handling

2022-03-16 Thread Alistair Popple
Felix Kuehling writes: > Am 2022-03-10 um 14:25 schrieb Matthew Wilcox: >> On Thu, Mar 10, 2022 at 11:26:31AM -0600, Alex Sierra wrote: >>> @@ -606,7 +606,7 @@ static void print_bad_pte(struct vm_area_struct *vma, >>> unsigned long addr, >>>* PFNMAP mappings in order to support COWable

Re: [PATCH v6 01/10] mm: add zone device coherent type memory support

2022-02-17 Thread Alistair Popple
Felix Kuehling writes: > Am 2022-02-16 um 07:26 schrieb Jason Gunthorpe: >> The other place that needs careful audit is all the callers using >> vm_normal_page() - they must all be able to accept a ZONE_DEVICE page >> if we don't set pte_devmap. > > How much code are we talking about here? A

Re: [PATCH v6 01/10] mm: add zone device coherent type memory support

2022-02-16 Thread Alistair Popple
Jason Gunthorpe writes: > On Wed, Feb 16, 2022 at 09:31:03AM +0100, David Hildenbrand wrote: >> On 16.02.22 03:36, Alistair Popple wrote: >> > On Wednesday, 16 February 2022 1:03:57 PM AEDT Jason Gunthorpe wrote: >> >> On Wed, Feb 16, 2022 at 12:23:44P

Re: [PATCH v6 01/10] mm: add zone device coherent type memory support

2022-02-15 Thread Alistair Popple
On Wednesday, 16 February 2022 1:03:57 PM AEDT Jason Gunthorpe wrote: > On Wed, Feb 16, 2022 at 12:23:44PM +1100, Alistair Popple wrote: > > > Device private and device coherent pages are not marked with pte_devmap and > > they > > are backed by a struct page. The on

Re: [PATCH v6 01/10] mm: add zone device coherent type memory support

2022-02-15 Thread Alistair Popple
Jason Gunthorpe writes: > On Tue, Feb 15, 2022 at 04:35:56PM -0500, Felix Kuehling wrote: >> >> On 2022-02-15 14:41, Jason Gunthorpe wrote: >> > On Tue, Feb 15, 2022 at 07:32:09PM +0100, Christoph Hellwig wrote: >> > > On Tue, Feb 15, 2022 at 10:45:24AM -0400, Jason Gunthorpe wrote: >> > > > >

Re: [PATCH v6 01/10] mm: add zone device coherent type memory support

2022-02-13 Thread Alistair Popple
CAPI >>> or CXL). Any page of a process can be migrated to such memory. However, >>> no one should be allowed to pin such memory so that it can always be >>> evicted. >>> >>> Signed-off-by: Alex Sierra >>> Acked-by: Felix Kuehling >>> Reviewed

Re: [PATCH v2 2/3] mm/gup.c: Migrate device coherent pages when pinning instead of failing

2022-02-13 Thread Alistair Popple
John Hubbard writes: > On 2/11/22 18:51, Alistair Popple wrote: […] >>> See below… >>> >>>> + } >>>> + >>>> + pages[i] = migrate_device_page(head, gup_flags); >> migrate_device_page() will ret

Re: [PATCH v2 2/3] mm/gup.c: Migrate device coherent pages when pinning instead of failing

2022-02-11 Thread Alistair Popple
On Saturday, 12 February 2022 1:10:29 PM AEDT John Hubbard wrote: > On 2/6/22 20:26, Alistair Popple wrote: > > Currently any attempts to pin a device coherent page will fail. This is > > because device coherent pages need to be managed by a device driver, and > > pinni

Re: [PATCH v2 2/3] mm/gup.c: Migrate device coherent pages when pinning instead of failing

2022-02-10 Thread Alistair Popple
On Thursday, 10 February 2022 10:47:35 PM AEDT David Hildenbrand wrote: > On 10.02.22 12:39, Alistair Popple wrote: > > On Thursday, 10 February 2022 9:53:38 PM AEDT David Hildenbrand wrote: > >> On 07.02.22 05:26, Alistair Popple wrote: > >>> Currently any attempts

Re: [PATCH v2 2/3] mm/gup.c: Migrate device coherent pages when pinning instead of failing

2022-02-10 Thread Alistair Popple
On Thursday, 10 February 2022 9:53:38 PM AEDT David Hildenbrand wrote: > On 07.02.22 05:26, Alistair Popple wrote: > > Currently any attempts to pin a device coherent page will fail. This is > > because device coherent pages need to be managed by a device driver, and > > pinni

Re: start sorting out the ZONE_DEVICE refcount mess v2

2022-02-10 Thread Alistair Popple
On Thursday, 10 February 2022 6:28:01 PM AEDT Christoph Hellwig wrote: [...] > Changes since v1: > - add a missing memremap.h include in memcontrol.c > - include rebased versions of the device coherent support and >device coherent migration support series as well as additional >cleanup

Re: [PATCH 11/27] mm: refactor the ZONE_DEVICE handling in migrate_vma_insert_page

2022-02-10 Thread Alistair Popple
Reviewed-by: Alistair Popple On Thursday, 10 February 2022 6:28:12 PM AEDT Christoph Hellwig wrote: > Make the flow a little more clear and prepare for adding a new > ZONE_DEVICE memory type. > > Signed-off-by: Christoph Hellwig > --- > mm/migrate.c | 31 +++---

Re: [PATCH 12/27] mm: refactor the ZONE_DEVICE handling in migrate_vma_pages

2022-02-10 Thread Alistair Popple
Reviewed-by: Alistair Popple On Thursday, 10 February 2022 6:28:13 PM AEDT Christoph Hellwig wrote: > Make the flow a little more clear and prepare for adding a new > ZONE_DEVICE memory type. > > Signed-off-by: Christoph Hellwig > --- > mm/migrate.c | 27 ---

Re: [PATCH 14/27] mm: build migrate_vma_* for all configs with ZONE_DEVICE support

2022-02-10 Thread Alistair Popple
Thanks, it's also better than more stubbed functions. Reviewed-by: Alistair Popple On Thursday, 10 February 2022 6:28:15 PM AEDT Christoph Hellwig wrote: > This code will be used for device coherent memory as well in a bit, > so relax the ifdef a bit. > > Signed-off-by: Chris

Re: [PATCH 13/27] mm: move the migrate_vma_* device migration code into it's own file

2022-02-10 Thread Alistair Popple
I got the following build error: /data/source/linux/mm/migrate_device.c: In function ‘migrate_vma_collect_pmd’: /data/source/linux/mm/migrate_device.c:242:3: error: implicit declaration of function ‘flush_tlb_range’; did you mean ‘flush_pmd_tlb_range’? [-Werror=implicit-function-declaration]

Re: [PATCH 6/8] mm: don't include <linux/memremap.h> in <linux/mm.h>

2022-02-09 Thread Alistair Popple
On Thursday, 10 February 2022 4:48:36 AM AEDT Christoph Hellwig wrote: > On Mon, Feb 07, 2022 at 04:19:29PM -0500, Felix Kuehling wrote: > > > > Am 2022-02-07 um 01:32 schrieb Christoph Hellwig: > >> Move the check for the actual pgmap types that need the free at refcount > >> one behavior into

[PATCH v2 3/3] tools: add hmm gup test for long term pinned device pages

2022-02-06 Thread Alistair Popple
From: Alex Sierra The intention is to test device coherent type pages that have been called through get user pages with PIN_LONGTERM flag set. These pages should get migrated back to normal system memory. Signed-off-by: Alex Sierra Signed-off-by: Alistair Popple Reviewed-by: Felix Kuehling

[PATCH v2 2/3] mm/gup.c: Migrate device coherent pages when pinning instead of failing

2022-02-06 Thread Alistair Popple
and accessible from the CPU so can be migrated just like pinning ZONE_MOVABLE pages. So instead of failing all attempts to pin them first try migrating them out of ZONE_DEVICE. Signed-off-by: Alistair Popple Acked-by: Felix Kuehling --- Changes for v2: - Added Felix's Acked-by - Fixed missing

[PATCH v2 1/3] migrate.c: Remove vma check in migrate_vma_setup()

2022-02-06 Thread Alistair Popple
required. Signed-off-by: Alistair Popple Acked-by: Felix Kuehling --- Changes for v2: - Added Felix's Acked-by mm/migrate.c | 34 +- 1 file changed, 17 insertions(+), 17 deletions(-) diff --git a/mm/migrate.c b/mm/migrate.c index a9aed12..0d6570d 100644 --- a/mm

[PATCH v2 0/3] Migrate device coherent pages on get_user_pages()

2022-02-06 Thread Alistair Popple
- Rebased on to linux-next-20220204 Alex Sierra (1): tools: add hmm gup test for long term pinned device pages Alistair Popple (2): migrate.c: Remove vma check in migrate_vma_setup() mm/gup.c: Migrate device coherent pages when pinning instead of failing mm/gup.c

Re: [PATCH 2/3] mm/gup.c: Migrate device coherent pages when pinning instead of failing

2022-02-06 Thread Alistair Popple
On Wednesday, 2 February 2022 2:03:01 AM AEDT Felix Kuehling wrote: > > Am 2022-02-01 um 02:05 schrieb Alistair Popple: > > Currently any attempts to pin a device coherent page will fail. This is > > because device coherent pages need to be managed by a device driver, and >

[PATCH 3/3] tools: add hmm gup test for long term pinned device pages

2022-01-31 Thread Alistair Popple
From: Alex Sierra The intention is to test device coherent type pages that have been called through get user pages with PIN_LONGTERM flag set. These pages should get migrated back to normal system memory. Signed-off-by: Alex Sierra Signed-off-by: Alistair Popple --- tools/testing/selftests

[PATCH 2/3] mm/gup.c: Migrate device coherent pages when pinning instead of failing

2022-01-31 Thread Alistair Popple
and accessible from the CPU so can be migrated just like pinning ZONE_MOVABLE pages. So instead of failing all attempts to pin them first try migrating them out of ZONE_DEVICE. Signed-off-by: Alistair Popple --- mm/gup.c | 105 ++-- 1 file changed

[PATCH 1/3] migrate.c: Remove vma check in migrate_vma_setup()

2022-01-31 Thread Alistair Popple
required. Signed-off-by: Alistair Popple --- mm/migrate.c | 34 +- 1 file changed, 17 insertions(+), 17 deletions(-) diff --git a/mm/migrate.c b/mm/migrate.c index d3cc358..31ba8ca 100644 --- a/mm/migrate.c +++ b/mm/migrate.c @@ -2581,24 +2581,24 @@ int

[PATCH 0/3] Migrate device coherent pages on get_user_pages()

2022-01-31 Thread Alistair Popple
erm pinned device pages Alistair Popple (2): migrate.c: Remove vma check in migrate_vma_setup() mm/gup.c: Migrate device coherent pages when pinning instead of failing mm/gup.c | 105 +++--- mm/migrate.c | 34 -

Re: [PATCH v5 09/10] tools: update hmm-test to support device coherent type

2022-01-31 Thread Alistair Popple
Oh sorry, I had looked at this but forgotten to add my reviewed by: Reviewed-by: Alistair Popple On Tuesday, 1 February 2022 10:27:25 AM AEDT Sierra Guiza, Alejandro (Alex) wrote: > Hi Alistair, > This is the last patch to be reviewed from this series. It already has > the changes fr

Re: [PATCH] mm: add device coherent vma selection for memory migration

2022-01-31 Thread Alistair Popple
Thanks for fixing. I'm guessing Andrew will want you to resend this as part of a new v6 series, but please add: Reviewed-by: Alistair Popple On Tuesday, 1 February 2022 6:48:13 AM AEDT Alex Sierra wrote: > This case is used to migrate pages from device memory, back to system > memory.

Re: [PATCH v5 01/10] mm: add zone device coherent type memory support

2022-01-30 Thread Alistair Popple
Looks good, feel free to add: Reviewed-by: Alistair Popple On Saturday, 29 January 2022 7:08:16 AM AEDT Alex Sierra wrote: > Device memory that is cache coherent from device and CPU point of view. > This is used on platforms that have an advanced system bus (like CAPI > or CXL).

Re: [PATCH v5 02/10] mm: add device coherent vma selection for memory migration

2022-01-30 Thread Alistair Popple
On Saturday, 29 January 2022 7:08:17 AM AEDT Alex Sierra wrote: [...] > struct migrate_vma { > diff --git a/mm/migrate.c b/mm/migrate.c > index cd137aedcfe5..d3cc3589e1e8 100644 > --- a/mm/migrate.c > +++ b/mm/migrate.c > @@ -2264,7 +2264,8 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp, >

Re: [PATCH v4 08/10] lib: add support for device coherent type in test_hmm

2022-01-27 Thread Alistair Popple
I haven't tested the change which checks that pages migrated back to sysmem, but it looks ok so: Reviewed-by: Alistair Popple On Thursday, 27 January 2022 2:09:47 PM AEDT Alex Sierra wrote: > Device Coherent type uses device memory that is coherently accessible by > the CPU. This could be

Re: [PATCH v4 07/10] lib: test_hmm add module param for zone device type

2022-01-27 Thread Alistair Popple
Thanks for the updates, looks good now. Reviewed-by: Alistair Popple On Thursday, 27 January 2022 2:09:46 PM AEDT Alex Sierra wrote: > In order to configure device coherent in test_hmm, two module parameters > should be passed, which correspond to the SP start address of each >

Re: [PATCH v4 06/10] lib: test_hmm add ioctl to get zone device type

2022-01-27 Thread Alistair Popple
Reviewed-by: Alistair Popple On Thursday, 27 January 2022 2:09:45 PM AEDT Alex Sierra wrote: > new ioctl cmd added to query zone device type. This will be > used once the test_hmm adds zone device coherent type. > > Signed-off-by: Alex Sierra > --- > lib/te

Re: [PATCH v4 04/10] drm/amdkfd: add SPM support for SVM

2022-01-27 Thread Alistair Popple
On Thursday, 27 January 2022 2:09:43 PM AEDT Alex Sierra wrote: [...] > @@ -984,3 +990,4 @@ int svm_migrate_init(struct amdgpu_device *adev) > > return 0; > } > + > git-am complained about this when I applied the series. Given you have to rebase anyway it would be worth fixing this.

Re: [PATCH v4 03/10] mm/gup: fail get_user_pages for LONGTERM dev coherent type

2022-01-27 Thread Alistair Popple
On Thursday, 27 January 2022 2:09:42 PM AEDT Alex Sierra wrote: > Avoid long term pinning for Coherent device type pages. This could > interfere with their own device memory manager. For now, we are just > returning error for PIN_LONGTERM Coherent device type pages. Eventually, > these types of

Re: [PATCH v4 02/10] mm: add device coherent vma selection for memory migration

2022-01-27 Thread Alistair Popple
On Thursday, 27 January 2022 2:09:41 PM AEDT Alex Sierra wrote: [...] > diff --git a/mm/migrate.c b/mm/migrate.c > index 277562cd4cf5..2b3375e165b1 100644 > --- a/mm/migrate.c > +++ b/mm/migrate.c > @@ -2340,8 +2340,6 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp, > if

Re: [PATCH v4 01/10] mm: add zone device coherent type memory support

2022-01-27 Thread Alistair Popple
On Thursday, 27 January 2022 2:09:40 PM AEDT Alex Sierra wrote: [...] > diff --git a/mm/migrate.c b/mm/migrate.c > index 1852d787e6ab..277562cd4cf5 100644 > --- a/mm/migrate.c > +++ b/mm/migrate.c > @@ -362,7 +362,7 @@ static int expected_page_refs(struct address_space > *mapping, struct page

Re: [PATCH v3 03/10] mm/gup: fail get_user_pages for LONGTERM dev coherent type

2022-01-20 Thread Alistair Popple
On Thursday, 20 January 2022 11:36:21 PM AEDT Joao Martins wrote: > On 1/10/22 22:31, Alex Sierra wrote: > > Avoid long term pinning for Coherent device type pages. This could > > interfere with their own device memory manager. For now, we are just > > returning error for PIN_LONGTERM Coherent
