Re: [HMM 14/16] mm/hmm/devmem: device memory hotplug using ZONE_DEVICE
On Fri, 2017-04-07 at 12:26 -0400, Jerome Glisse wrote:
> On Thu, Apr 06, 2017 at 10:02:55PM -0400, Jerome Glisse wrote:
> > On Fri, Apr 07, 2017 at 11:37:34AM +1000, Balbir Singh wrote:
> > > On Wed, 2017-04-05 at 16:40 -0400, Jérôme Glisse wrote:
> > > > This introduces a simple struct and associated helpers for device drivers
> > > > to use when hotplugging un-addressable device memory as ZONE_DEVICE. It
> > > > will find an unused physical address range and trigger memory hotplug for
> > > > it, which allocates and initializes struct page for the device memory.
> > > >
> > > > Signed-off-by: Jérôme Glisse
> > > > Signed-off-by: Evgeny Baskakov
> > > > Signed-off-by: John Hubbard
> > > > Signed-off-by: Mark Hairgrove
> > > > Signed-off-by: Sherry Cheung
> > > > Signed-off-by: Subhash Gutti
> > > > ---
> > > >  include/linux/hmm.h | 114 +++
> > > >  mm/Kconfig          |   9 ++
> > > >  mm/hmm.c            | 398
> > > >  3 files changed, 521 insertions(+)
> > > >
> > > > +/*
> > > > + * To add (hotplug) device memory, HMM assumes that there is no real resource
> > > > + * that reserves a range in the physical address space (this is intended to be
> > > > + * used by unaddressable device memory). It will reserve a physical range big
> > > > + * enough and allocate struct page for it.
> > >
> > > I've found that the implementation of this is quite non-portable, in that
> > > starting from iomem_resource.end+1-size (which is effectively -size) on
> > > my platform (powerpc) does not give expected results. It could be that
> > > additional changes are needed to arch_add_memory() to support this
> > > use case.
> >
> > The CDM version does not use that part. That being said, isn't -size a valid
> > value? We care only about unsigned here. What is the end value on powerpc?
> > In any case this sounds more like an unsigned/signed arithmetic issue; I will
> > look into it.

Thanks!

> > > > +
> > > > +	size = ALIGN(size, SECTION_SIZE);
> > > > +	addr = (iomem_resource.end + 1ULL) - size;
> > >
> > > Why don't we allocate_resource() with the right constraints and get a new
> > > unused region?
> >
> > The issue with allocate_resource() is that it scans the resource tree
> > from lower addresses to higher ones. I was told that a hotplug conflict is
> > less likely if I pick the highest physical address for the device memory,
> > which is why I do my own scan from the end toward the start.
> >
> > Again, none of this function applies to PPC; it can be hidden behind an
> > x86 config if you prefer.
>
> OK, so I have looked into it and there is no arithmetic bug in my code; the
> issue is simpler than that. It seems only x86 clamps iomem_resource.end to
> MAX_PHYSMEM_BITS, so using allocate_resource() would just hide the issue.
>
> It is fine not to clamp if you know that you won't get a resource with a
> funky physical address, but in the UNADDRESSABLE case I do not get any
> physical address, so I have to pick one, and I want to pick one that is
> unlikely to cause trouble later on when someone hotplugs memory.
>
> If we care about the UNADDRESSABLE case on powerpc, I see two ways to fix
> this: clamp iomem_resource.end to MAX_PHYSMEM_BITS, or restrict my scan
> in hmm to MIN(iomem_resource.end, 1UL << MAX_PHYSMEM_BITS). The latter
> is probably safer and more bulletproof with respect to other archs getting
> interested in this.

We do care about UNADDRESSABLE for certain platforms on powerpc. I think
MAX_PHYSMEM_BITS sounds good, or we can make it an arch hook. I spoke to
Michael Ellerman and he recommended we do either. We can't clamp down
iomem_resource.end in the arch as we have other things beyond
MAX_PHYSMEM_BITS, but doing the walk in HMM from the end of
MAX_PHYSMEM_BITS is a good idea to begin with.

Balbir Singh.
Re: [HMM 14/16] mm/hmm/devmem: device memory hotplug using ZONE_DEVICE
On Thu, Apr 06, 2017 at 10:02:55PM -0400, Jerome Glisse wrote:
> On Fri, Apr 07, 2017 at 11:37:34AM +1000, Balbir Singh wrote:
> > On Wed, 2017-04-05 at 16:40 -0400, Jérôme Glisse wrote:
> > > This introduces a simple struct and associated helpers for device drivers
> > > to use when hotplugging un-addressable device memory as ZONE_DEVICE. It
> > > will find an unused physical address range and trigger memory hotplug for
> > > it, which allocates and initializes struct page for the device memory.
> > >
> > > Signed-off-by: Jérôme Glisse
> > > Signed-off-by: Evgeny Baskakov
> > > Signed-off-by: John Hubbard
> > > Signed-off-by: Mark Hairgrove
> > > Signed-off-by: Sherry Cheung
> > > Signed-off-by: Subhash Gutti
> > > ---
> > >  include/linux/hmm.h | 114 +++
> > >  mm/Kconfig          |   9 ++
> > >  mm/hmm.c            | 398
> > >  3 files changed, 521 insertions(+)
> > >
> > > +/*
> > > + * To add (hotplug) device memory, HMM assumes that there is no real resource
> > > + * that reserves a range in the physical address space (this is intended to be
> > > + * used by unaddressable device memory). It will reserve a physical range big
> > > + * enough and allocate struct page for it.
> >
> > I've found that the implementation of this is quite non-portable, in that
> > starting from iomem_resource.end+1-size (which is effectively -size) on
> > my platform (powerpc) does not give expected results. It could be that
> > additional changes are needed to arch_add_memory() to support this
> > use case.
>
> The CDM version does not use that part. That being said, isn't -size a valid
> value? We care only about unsigned here. What is the end value on powerpc?
> In any case this sounds more like an unsigned/signed arithmetic issue; I will
> look into it.
>
> > > +
> > > +	size = ALIGN(size, SECTION_SIZE);
> > > +	addr = (iomem_resource.end + 1ULL) - size;
> >
> > Why don't we allocate_resource() with the right constraints and get a new
> > unused region?
>
> The issue with allocate_resource() is that it scans the resource tree
> from lower addresses to higher ones. I was told that a hotplug conflict is
> less likely if I pick the highest physical address for the device memory,
> which is why I do my own scan from the end toward the start.
>
> Again, none of this function applies to PPC; it can be hidden behind an
> x86 config if you prefer.

OK, so I have looked into it and there is no arithmetic bug in my code; the
issue is simpler than that. It seems only x86 clamps iomem_resource.end to
MAX_PHYSMEM_BITS, so using allocate_resource() would just hide the issue.

It is fine not to clamp if you know that you won't get a resource with a
funky physical address, but in the UNADDRESSABLE case I do not get any
physical address, so I have to pick one, and I want to pick one that is
unlikely to cause trouble later on when someone hotplugs memory.

If we care about the UNADDRESSABLE case on powerpc, I see two ways to fix
this: clamp iomem_resource.end to MAX_PHYSMEM_BITS, or restrict my scan
in hmm to MIN(iomem_resource.end, 1UL << MAX_PHYSMEM_BITS). The latter
is probably safer and more bulletproof with respect to other archs getting
interested in this.

Cheers,
Jérôme
Re: [HMM 14/16] mm/hmm/devmem: device memory hotplug using ZONE_DEVICE
On Fri, Apr 07, 2017 at 11:37:34AM +1000, Balbir Singh wrote:
> On Wed, 2017-04-05 at 16:40 -0400, Jérôme Glisse wrote:
> > This introduces a simple struct and associated helpers for device drivers
> > to use when hotplugging un-addressable device memory as ZONE_DEVICE. It
> > will find an unused physical address range and trigger memory hotplug for
> > it, which allocates and initializes struct page for the device memory.
> >
> > Signed-off-by: Jérôme Glisse
> > Signed-off-by: Evgeny Baskakov
> > Signed-off-by: John Hubbard
> > Signed-off-by: Mark Hairgrove
> > Signed-off-by: Sherry Cheung
> > Signed-off-by: Subhash Gutti
> > ---
> >  include/linux/hmm.h | 114 +++
> >  mm/Kconfig          |   9 ++
> >  mm/hmm.c            | 398
> >  3 files changed, 521 insertions(+)
> >
> > +/*
> > + * To add (hotplug) device memory, HMM assumes that there is no real resource
> > + * that reserves a range in the physical address space (this is intended to be
> > + * used by unaddressable device memory). It will reserve a physical range big
> > + * enough and allocate struct page for it.
>
> I've found that the implementation of this is quite non-portable, in that
> starting from iomem_resource.end+1-size (which is effectively -size) on
> my platform (powerpc) does not give expected results. It could be that
> additional changes are needed to arch_add_memory() to support this
> use case.

The CDM version does not use that part. That being said, isn't -size a valid
value? We care only about unsigned here. What is the end value on powerpc?
In any case this sounds more like an unsigned/signed arithmetic issue; I will
look into it.

> > +
> > +	size = ALIGN(size, SECTION_SIZE);
> > +	addr = (iomem_resource.end + 1ULL) - size;
>
> Why don't we allocate_resource() with the right constraints and get a new
> unused region?

The issue with allocate_resource() is that it scans the resource tree
from lower addresses to higher ones. I was told that a hotplug conflict is
less likely if I pick the highest physical address for the device memory,
which is why I do my own scan from the end toward the start.

Again, none of this function applies to PPC; it can be hidden behind an
x86 config if you prefer.

Cheers,
Jérôme
Re: [HMM 14/16] mm/hmm/devmem: device memory hotplug using ZONE_DEVICE
On Wed, 2017-04-05 at 16:40 -0400, Jérôme Glisse wrote:
> This introduces a simple struct and associated helpers for device drivers
> to use when hotplugging un-addressable device memory as ZONE_DEVICE. It
> will find an unused physical address range and trigger memory hotplug for
> it, which allocates and initializes struct page for the device memory.
>
> Signed-off-by: Jérôme Glisse
> Signed-off-by: Evgeny Baskakov
> Signed-off-by: John Hubbard
> Signed-off-by: Mark Hairgrove
> Signed-off-by: Sherry Cheung
> Signed-off-by: Subhash Gutti
> ---
>  include/linux/hmm.h | 114 +++
>  mm/Kconfig          |   9 ++
>  mm/hmm.c            | 398
>  3 files changed, 521 insertions(+)
>
> +/*
> + * To add (hotplug) device memory, HMM assumes that there is no real resource
> + * that reserves a range in the physical address space (this is intended to be
> + * used by unaddressable device memory). It will reserve a physical range big
> + * enough and allocate struct page for it.

I've found that the implementation of this is quite non-portable, in that
starting from iomem_resource.end+1-size (which is effectively -size) on
my platform (powerpc) does not give expected results. It could be that
additional changes are needed to arch_add_memory() to support this
use case.

> +
> +	size = ALIGN(size, SECTION_SIZE);
> +	addr = (iomem_resource.end + 1ULL) - size;

Why don't we allocate_resource() with the right constraints and get a new
unused region?

Thanks,
Balbir
Re: [HMM 14/16] mm/hmm/devmem: device memory hotplug using ZONE_DEVICE
So during the rebase on the latest mmotm, the logic of one if branch got
inverted. Attached is a fixup patch.

Cheers,
Jérôme

From 374bca39b19a88da1d1c6d38c0a4c49c1af31c18 Mon Sep 17 00:00:00 2001
From: Jérôme Glisse
Date: Thu, 6 Apr 2017 17:16:56 -0400
Subject: [PATCH] fixup! mm/hmm/devmem: device memory hotplug using ZONE_DEVICE

---
 mm/hmm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/hmm.c b/mm/hmm.c
index ff8ec59..f567a8b 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -898,7 +898,7 @@ static int hmm_devmem_pages_create(struct hmm_devmem *devmem)
 	ret = arch_add_memory(nid, align_start, align_size,
 			      devmem->pagemap.type);
 	mem_hotplug_done();
-	if (!ret)
+	if (ret)
 		goto error_add_memory;
 
 	for (pfn = devmem->pfn_first; pfn < devmem->pfn_last; pfn++) {
-- 
2.9.3
[HMM 14/16] mm/hmm/devmem: device memory hotplug using ZONE_DEVICE
This introduces a simple struct and associated helpers for device drivers
to use when hotplugging un-addressable device memory as ZONE_DEVICE. It
will find an unused physical address range and trigger memory hotplug for
it, which allocates and initializes struct page for the device memory.

Signed-off-by: Jérôme Glisse
Signed-off-by: Evgeny Baskakov
Signed-off-by: John Hubbard
Signed-off-by: Mark Hairgrove
Signed-off-by: Sherry Cheung
Signed-off-by: Subhash Gutti
---
 include/linux/hmm.h | 114 +++
 mm/Kconfig          |   9 ++
 mm/hmm.c            | 398
 3 files changed, 521 insertions(+)

diff --git a/include/linux/hmm.h b/include/linux/hmm.h
index d267989..50a1115 100644
--- a/include/linux/hmm.h
+++ b/include/linux/hmm.h
@@ -72,6 +72,11 @@
 
 #if IS_ENABLED(CONFIG_HMM)
 
+#include
+#include
+#include
+
+
 struct hmm;
 
 /*
@@ -322,6 +327,115 @@ int hmm_vma_fault(struct vm_area_struct *vma,
 #endif /* IS_ENABLED(CONFIG_HMM_MIRROR) */
 
 
+#if IS_ENABLED(CONFIG_HMM_DEVMEM)
+struct hmm_devmem;
+
+struct page *hmm_vma_alloc_locked_page(struct vm_area_struct *vma,
+				       unsigned long addr);
+
+/*
+ * struct hmm_devmem_ops - callbacks for ZONE_DEVICE memory events
+ *
+ * @free: called when the refcount on a page reaches 1 and the page is thus no
+ *        longer in use
+ * @fault: called when there is a page fault to unaddressable memory
+ */
+struct hmm_devmem_ops {
+	void (*free)(struct hmm_devmem *devmem, struct page *page);
+	int (*fault)(struct hmm_devmem *devmem,
+		     struct vm_area_struct *vma,
+		     unsigned long addr,
+		     struct page *page,
+		     unsigned int flags,
+		     pmd_t *pmdp);
+};
+
+/*
+ * struct hmm_devmem - track device memory
+ *
+ * @completion: completion object for device memory
+ * @pfn_first: first pfn for this resource (set by hmm_devmem_add())
+ * @pfn_last: last pfn for this resource (set by hmm_devmem_add())
+ * @resource: IO resource reserved for this chunk of memory
+ * @pagemap: device page map for that chunk
+ * @device: device to bind resource to
+ * @ops: memory operations callback
+ * @ref: per CPU refcount
+ *
+ * This is a helper structure for device drivers that do not wish to implement
+ * the gory details related to hotplugging new memory and allocating struct
+ * pages.
+ *
+ * Device drivers can directly use ZONE_DEVICE memory on their own if they
+ * wish to do so.
+ */
+struct hmm_devmem {
+	struct completion		completion;
+	unsigned long			pfn_first;
+	unsigned long			pfn_last;
+	struct resource			*resource;
+	struct device			*device;
+	struct dev_pagemap		pagemap;
+	const struct hmm_devmem_ops	*ops;
+	struct percpu_ref		ref;
+};
+
+/*
+ * To add (hotplug) device memory, HMM assumes that there is no real resource
+ * that reserves a range in the physical address space (this is intended to be
+ * used by unaddressable device memory). It will reserve a physical range big
+ * enough and allocate struct page for it.
+ *
+ * The device driver can wrap the hmm_devmem struct inside a private device
+ * driver struct. The device driver must call hmm_devmem_remove() before the
+ * device goes away and before freeing the hmm_devmem struct memory.
+ */
+struct hmm_devmem *hmm_devmem_add(const struct hmm_devmem_ops *ops,
+				  struct device *device,
+				  unsigned long size);
+void hmm_devmem_remove(struct hmm_devmem *devmem);
+
+int hmm_devmem_fault_range(struct hmm_devmem *devmem,
+			   struct vm_area_struct *vma,
+			   const struct migrate_vma_ops *ops,
+			   unsigned long *src,
+			   unsigned long *dst,
+			   unsigned long start,
+			   unsigned long addr,
+			   unsigned long end,
+			   void *private);
+
+/*
+ * hmm_devmem_page_set_drvdata - set per-page driver data field
+ *
+ * @page: pointer to struct page
+ * @data: driver data value to set
+ *
+ * Because the page cannot be on the LRU, we have an unsigned long that the
+ * driver can use to store a per-page field. This is just a simple helper to
+ * do that.
+ */
+static inline void hmm_devmem_page_set_drvdata(struct page *page,
+					       unsigned long data)
+{
+	unsigned long *drvdata = (unsigned long *)&page->pgmap;
+
+	drvdata[1] = data;
+}
+
+/*
+ * hmm_devmem_page_get_drvdata - get per-page driver data field
+ *
+ * @page: pointer to struct page
+ * Return: driver data value
+ */
+static inline unsigned long hmm_devmem_page_get_drvdata(struct page *page)
+{
+	unsigned long *drvdata = (unsigned long *)&page->pgmap;
+
+	return drvdata[1];
+}
+#endif /*