Re: [HMM 14/16] mm/hmm/devmem: device memory hotplug using ZONE_DEVICE

2017-04-09 Thread Balbir Singh
On Fri, 2017-04-07 at 12:26 -0400, Jerome Glisse wrote:
> On Thu, Apr 06, 2017 at 10:02:55PM -0400, Jerome Glisse wrote:
> > On Fri, Apr 07, 2017 at 11:37:34AM +1000, Balbir Singh wrote:
> > > On Wed, 2017-04-05 at 16:40 -0400, Jérôme Glisse wrote:
> > > > This introduces a simple struct and associated helpers for device drivers
> > > > to use when hotplugging un-addressable device memory as ZONE_DEVICE. It
> > > > will find an unused physical address range and trigger memory hotplug for
> > > > it, which allocates and initializes struct page for the device memory.
> > > > 
> > > > Signed-off-by: Jérôme Glisse 
> > > > Signed-off-by: Evgeny Baskakov 
> > > > Signed-off-by: John Hubbard 
> > > > Signed-off-by: Mark Hairgrove 
> > > > Signed-off-by: Sherry Cheung 
> > > > Signed-off-by: Subhash Gutti 
> > > > ---
> > > >  include/linux/hmm.h | 114 +++
> > > >  mm/Kconfig  |   9 ++
> > > >  mm/hmm.c| 398 
> > > > 
> > > >  3 files changed, 521 insertions(+)
> > > > 
> > > > +/*
> > > > + * To add (hotplug) device memory, HMM assumes that there is no real resource
> > > > + * that reserves a range in the physical address space (this is intended to be
> > > > + * used by unaddressable device memory). It will reserve a physical range big
> > > > + * enough and allocate struct page for it.
> > > 
> > > I've found that the implementation of this is quite non-portable, in that
> > > starting from iomem_resource.end+1-size (which is effectively -size) on
> > > my platform (powerpc) does not give expected results. It could be that
> > > additional changes are needed to arch_add_memory() to support this
> > > use case.
> > 
> > The CDM version does not use that part. That being said, isn't -size a valid
> > value, since we only care about unsigned here? What is the end value on powerpc?
> > In any case this sounds more like an unsigned/signed arithmetic issue; I will
> > look into it.
> > 

Thanks!

> > > 
> > > > +
> > > > +   size = ALIGN(size, SECTION_SIZE);
> > > > +   addr = (iomem_resource.end + 1ULL) - size;
> > > 
> > > 
> > > Why don't we allocate_resource() with the right constraints and get a new
> > > unused region?
> > 
> > The issue with allocate_resource() is that it scans the resource tree
> > from lower addresses to higher ones. I was told that hotplug conflicts are
> > less likely if I pick the highest physical address for the device memory,
> > which is why I do my own scan from the end toward the start.
> > 
> > Again, none of this function applies to PPC; it can be hidden behind an
> > x86 config if you prefer.
> 
> OK, so I have looked into it and there is no arithmetic bug in my code; the
> issue is simpler than that. It seems only x86 clamps iomem_resource.end to
> MAX_PHYSMEM_BITS, so using allocate_resource() would just hide the issue.
> 
> It is fine not to clamp if you know that you won't get a resource with a
> funky physical address, but in the UNADDRESSABLE case I do not get any
> physical address, so I have to pick one, and I want to pick one that is
> unlikely to cause trouble later on when someone hotplugs memory.
> 
> If we care about the UNADDRESSABLE case on powerpc, I see two ways to fix
> this: clamp iomem_resource.end to MAX_PHYSMEM_BITS, or restrict my scan
> in hmm to MIN(iomem_resource.end, 1UL << MAX_PHYSMEM_BITS). The latter
> is probably safer and more bulletproof with respect to other arches getting
> interested in this.
>

We do care about UNADDRESSABLE for certain platforms on powerpc.

I think MAX_PHYSMEM_BITS sounds good, or we can make it an arch hook. I spoke
to Michael Ellerman and he recommended we do either. We can't clamp down
iomem_resource.end in the arch, as we have other things beyond MAX_PHYSMEM_BITS,
but doing the walk in HMM from the end of MAX_PHYSMEM_BITS is a good idea to
begin with.

Balbir Singh.


Re: [HMM 14/16] mm/hmm/devmem: device memory hotplug using ZONE_DEVICE

2017-04-07 Thread Jerome Glisse
On Thu, Apr 06, 2017 at 10:02:55PM -0400, Jerome Glisse wrote:
> On Fri, Apr 07, 2017 at 11:37:34AM +1000, Balbir Singh wrote:
> > On Wed, 2017-04-05 at 16:40 -0400, Jérôme Glisse wrote:
> > > This introduces a simple struct and associated helpers for device drivers
> > > to use when hotplugging un-addressable device memory as ZONE_DEVICE. It
> > > will find an unused physical address range and trigger memory hotplug for
> > > it, which allocates and initializes struct page for the device memory.
> > > 
> > > Signed-off-by: Jérôme Glisse 
> > > Signed-off-by: Evgeny Baskakov 
> > > Signed-off-by: John Hubbard 
> > > Signed-off-by: Mark Hairgrove 
> > > Signed-off-by: Sherry Cheung 
> > > Signed-off-by: Subhash Gutti 
> > > ---
> > >  include/linux/hmm.h | 114 +++
> > >  mm/Kconfig  |   9 ++
> > >  mm/hmm.c| 398 
> > > 
> > >  3 files changed, 521 insertions(+)
> > > 
> > > +/*
> > > + * To add (hotplug) device memory, HMM assumes that there is no real resource
> > > + * that reserves a range in the physical address space (this is intended to be
> > > + * used by unaddressable device memory). It will reserve a physical range big
> > > + * enough and allocate struct page for it.
> > 
> > I've found that the implementation of this is quite non-portable, in that
> > starting from iomem_resource.end+1-size (which is effectively -size) on
> > my platform (powerpc) does not give expected results. It could be that
> > additional changes are needed to arch_add_memory() to support this
> > use case.
> 
> The CDM version does not use that part. That being said, isn't -size a valid
> value, since we only care about unsigned here? What is the end value on powerpc?
> In any case this sounds more like an unsigned/signed arithmetic issue; I will
> look into it.
> 
> > 
> > > +
> > > + size = ALIGN(size, SECTION_SIZE);
> > > + addr = (iomem_resource.end + 1ULL) - size;
> > 
> > 
> > Why don't we allocate_resource() with the right constraints and get a new
> > unused region?
> 
> The issue with allocate_resource() is that it scans the resource tree
> from lower addresses to higher ones. I was told that hotplug conflicts are
> less likely if I pick the highest physical address for the device memory,
> which is why I do my own scan from the end toward the start.
> 
> Again, none of this function applies to PPC; it can be hidden behind an
> x86 config if you prefer.

OK, so I have looked into it and there is no arithmetic bug in my code; the
issue is simpler than that. It seems only x86 clamps iomem_resource.end to
MAX_PHYSMEM_BITS, so using allocate_resource() would just hide the issue.

It is fine not to clamp if you know that you won't get a resource with a
funky physical address, but in the UNADDRESSABLE case I do not get any
physical address, so I have to pick one, and I want to pick one that is
unlikely to cause trouble later on when someone hotplugs memory.

If we care about the UNADDRESSABLE case on powerpc, I see two ways to fix
this: clamp iomem_resource.end to MAX_PHYSMEM_BITS, or restrict my scan
in hmm to MIN(iomem_resource.end, 1UL << MAX_PHYSMEM_BITS). The latter
is probably safer and more bulletproof with respect to other arches getting
interested in this.
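
As a rough illustration of the second option, a user-space sketch (not the
actual patch) could look like the following. MAX_PHYSMEM_BITS is given an
arbitrary value of 46 purely for the example, hmm_scan_end() is a
hypothetical helper name, and both bounds are treated as inclusive end
addresses:

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the kernel type on a 64-bit build (assumption). */
typedef uint64_t resource_size_t;

/* Illustrative value only; the real constant is defined per architecture. */
#define MAX_PHYSMEM_BITS 46

/*
 * Highest address (inclusive) the downward scan may use: the smaller of the
 * resource tree's end and the last address MAX_PHYSMEM_BITS can describe.
 */
static resource_size_t hmm_scan_end(resource_size_t res_end)
{
	resource_size_t phys_end = (1ULL << MAX_PHYSMEM_BITS) - 1;

	return res_end < phys_end ? res_end : phys_end;
}
```

With an unclamped end of all ones the scan would start just below
1 << MAX_PHYSMEM_BITS, while an arch that already clamps to something
smaller keeps its own limit.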

Cheers,
Jérôme


Re: [HMM 14/16] mm/hmm/devmem: device memory hotplug using ZONE_DEVICE

2017-04-06 Thread Jerome Glisse
On Fri, Apr 07, 2017 at 11:37:34AM +1000, Balbir Singh wrote:
> On Wed, 2017-04-05 at 16:40 -0400, Jérôme Glisse wrote:
> > This introduces a simple struct and associated helpers for device drivers
> > to use when hotplugging un-addressable device memory as ZONE_DEVICE. It
> > will find an unused physical address range and trigger memory hotplug for
> > it, which allocates and initializes struct page for the device memory.
> > 
> > Signed-off-by: Jérôme Glisse 
> > Signed-off-by: Evgeny Baskakov 
> > Signed-off-by: John Hubbard 
> > Signed-off-by: Mark Hairgrove 
> > Signed-off-by: Sherry Cheung 
> > Signed-off-by: Subhash Gutti 
> > ---
> >  include/linux/hmm.h | 114 +++
> >  mm/Kconfig  |   9 ++
> >  mm/hmm.c| 398 
> > 
> >  3 files changed, 521 insertions(+)
> > 
> > +/*
> > + * To add (hotplug) device memory, HMM assumes that there is no real resource
> > + * that reserves a range in the physical address space (this is intended to be
> > + * used by unaddressable device memory). It will reserve a physical range big
> > + * enough and allocate struct page for it.
> 
> I've found that the implementation of this is quite non-portable, in that
> starting from iomem_resource.end+1-size (which is effectively -size) on
> my platform (powerpc) does not give expected results. It could be that
> additional changes are needed to arch_add_memory() to support this
> use case.

The CDM version does not use that part. That being said, isn't -size a valid
value, since we only care about unsigned here? What is the end value on powerpc?
In any case this sounds more like an unsigned/signed arithmetic issue; I will
look into it.

> 
> > +
> > +   size = ALIGN(size, SECTION_SIZE);
> > +   addr = (iomem_resource.end + 1ULL) - size;
> 
> 
> Why don't we allocate_resource() with the right constraints and get a new
> unused region?

The issue with allocate_resource() is that it scans the resource tree
from lower addresses to higher ones. I was told that hotplug conflicts are
less likely if I pick the highest physical address for the device memory,
which is why I do my own scan from the end toward the start.
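
To sketch the idea of scanning from the end toward the start, here is a
small user-space example. It is not the code from the patch: a plain array
of busy ranges stands in for the resource tree, and struct range and
find_free_top_down() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct range {
	uint64_t start;	/* inclusive */
	uint64_t end;	/* inclusive */
};

/*
 * Step down from the top of [0, limit] in units of size and return the
 * first start address whose [addr, addr + size - 1] range overlaps no busy
 * entry. Returns UINT64_MAX when nothing fits.
 */
static uint64_t find_free_top_down(const struct range *busy, size_t n,
				   uint64_t limit, uint64_t size)
{
	uint64_t addr;

	if (size == 0 || size - 1 > limit)
		return UINT64_MAX;

	for (addr = limit - size + 1; ; addr -= size) {
		int conflict = 0;

		for (size_t i = 0; i < n; i++) {
			if (busy[i].start <= addr + size - 1 &&
			    busy[i].end >= addr) {
				conflict = 1;
				break;
			}
		}
		if (!conflict)
			return addr;
		if (addr < size)	/* next step would wrap below zero */
			break;
	}
	return UINT64_MAX;
}
```

Because the walk starts at the top, the chosen range ends up as far as
possible from the low addresses where ordinary memory is usually found.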

Again, none of this function applies to PPC; it can be hidden behind an
x86 config if you prefer.

Cheers,
Jérôme


Re: [HMM 14/16] mm/hmm/devmem: device memory hotplug using ZONE_DEVICE

2017-04-06 Thread Balbir Singh
On Wed, 2017-04-05 at 16:40 -0400, Jérôme Glisse wrote:
> This introduces a simple struct and associated helpers for device drivers
> to use when hotplugging un-addressable device memory as ZONE_DEVICE. It
> will find an unused physical address range and trigger memory hotplug for
> it, which allocates and initializes struct page for the device memory.
> 
> Signed-off-by: Jérôme Glisse 
> Signed-off-by: Evgeny Baskakov 
> Signed-off-by: John Hubbard 
> Signed-off-by: Mark Hairgrove 
> Signed-off-by: Sherry Cheung 
> Signed-off-by: Subhash Gutti 
> ---
>  include/linux/hmm.h | 114 +++
>  mm/Kconfig  |   9 ++
>  mm/hmm.c| 398 
> 
>  3 files changed, 521 insertions(+)
> 
> +/*
> + * To add (hotplug) device memory, HMM assumes that there is no real resource
> + * that reserves a range in the physical address space (this is intended to be
> + * used by unaddressable device memory). It will reserve a physical range big
> + * enough and allocate struct page for it.

I've found that the implementation of this is quite non-portable, in that
starting from iomem_resource.end+1-size (which is effectively -size) on
my platform (powerpc) does not give expected results. It could be that
additional changes are needed to arch_add_memory() to support this
use case.
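
For illustration, here is a small user-space sketch of that arithmetic. It
assumes a 64-bit resource_size_t and an iomem_resource.end of all ones (the
value implied by "effectively -size"); hmm_start_addr() is a made-up helper
name, not something in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the kernel type on a 64-bit build (assumption). */
typedef uint64_t resource_size_t;

/* Mirrors the patch's expression: addr = (iomem_resource.end + 1ULL) - size. */
static resource_size_t hmm_start_addr(resource_size_t res_end,
				      resource_size_t size)
{
	return (res_end + 1ULL) - size;
}
```

With end at all ones, end + 1 wraps to 0 and the result is 2^64 - size: an
address at the very top of the 64-bit space rather than near the top of the
usable physical range.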

> +
> + size = ALIGN(size, SECTION_SIZE);
> + addr = (iomem_resource.end + 1ULL) - size;


Why don't we allocate_resource() with the right constraints and get a new
unused region?

Thanks,
Balbir


Re: [HMM 14/16] mm/hmm/devmem: device memory hotplug using ZONE_DEVICE

2017-04-06 Thread Jerome Glisse

So during the rebase on the latest mmotm, the logic of one if branch got inverted.
Attached is a fixup patch.

Cheers,
Jérôme
From 374bca39b19a88da1d1c6d38c0a4c49c1af31c18 Mon Sep 17 00:00:00 2001
From: Jérôme Glisse 
Date: Thu, 6 Apr 2017 17:16:56 -0400
Subject: [PATCH] fixup! mm/hmm/devmem: device memory hotplug using ZONE_DEVICE

---
 mm/hmm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/hmm.c b/mm/hmm.c
index ff8ec59..f567a8b 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -898,7 +898,7 @@ static int hmm_devmem_pages_create(struct hmm_devmem *devmem)
    ret = arch_add_memory(nid, align_start, align_size,
                          devmem->pagemap.type);
    mem_hotplug_done();
-   if (!ret)
+   if (ret)
        goto error_add_memory;
 
    for (pfn = devmem->pfn_first; pfn < devmem->pfn_last; pfn++) {
-- 
2.9.3
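
To spell out why the inverted test matters: arch_add_memory() returns 0 on
success and a negative errno on failure, so the error path must be taken
when ret is non-zero. A minimal user-space sketch of that convention, with
fake_add_memory() and pages_create() as hypothetical stand-ins for the
kernel functions:

```c
#include <assert.h>
#include <errno.h>

/* Stand-in for arch_add_memory(): 0 on success, -errno on failure. */
static int fake_add_memory(int should_fail)
{
	return should_fail ? -ENOMEM : 0;
}

static int pages_create(int should_fail)
{
	int ret = fake_add_memory(should_fail);

	/* Non-zero means failure; "if (!ret)" would bail out on success. */
	if (ret)
		goto error_add_memory;

	/* ... struct page initialisation would go here ... */
	return 0;

error_add_memory:
	/* ... undo whatever was set up before the hotplug attempt ... */
	return ret;
}
```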



[HMM 14/16] mm/hmm/devmem: device memory hotplug using ZONE_DEVICE

2017-04-05 Thread Jérôme Glisse
This introduces a simple struct and associated helpers for device drivers
to use when hotplugging un-addressable device memory as ZONE_DEVICE. It
will find an unused physical address range and trigger memory hotplug for
it, which allocates and initializes struct page for the device memory.

Signed-off-by: Jérôme Glisse 
Signed-off-by: Evgeny Baskakov 
Signed-off-by: John Hubbard 
Signed-off-by: Mark Hairgrove 
Signed-off-by: Sherry Cheung 
Signed-off-by: Subhash Gutti 
---
 include/linux/hmm.h | 114 +++
 mm/Kconfig  |   9 ++
 mm/hmm.c| 398 
 3 files changed, 521 insertions(+)

diff --git a/include/linux/hmm.h b/include/linux/hmm.h
index d267989..50a1115 100644
--- a/include/linux/hmm.h
+++ b/include/linux/hmm.h
@@ -72,6 +72,11 @@
 
 #if IS_ENABLED(CONFIG_HMM)
 
+#include 
+#include 
+#include 
+
+
 struct hmm;
 
 /*
@@ -322,6 +327,115 @@ int hmm_vma_fault(struct vm_area_struct *vma,
 #endif /* IS_ENABLED(CONFIG_HMM_MIRROR) */
 
 
+#if IS_ENABLED(CONFIG_HMM_DEVMEM)
+struct hmm_devmem;
+
+struct page *hmm_vma_alloc_locked_page(struct vm_area_struct *vma,
+  unsigned long addr);
+
+/*
+ * struct hmm_devmem_ops - callback for ZONE_DEVICE memory events
+ *
+ * @free: called when a page's refcount reaches 1, i.e. it is no longer in use
+ * @fault: called when there is a page fault to unaddressable memory
+ */
+struct hmm_devmem_ops {
+   void (*free)(struct hmm_devmem *devmem, struct page *page);
+   int (*fault)(struct hmm_devmem *devmem,
+struct vm_area_struct *vma,
+unsigned long addr,
+struct page *page,
+unsigned int flags,
+pmd_t *pmdp);
+};
+
+/*
+ * struct hmm_devmem - track device memory
+ *
+ * @completion: completion object for device memory
+ * @pfn_first: first pfn for this resource (set by hmm_devmem_add())
+ * @pfn_last: last pfn for this resource (set by hmm_devmem_add())
+ * @resource: IO resource reserved for this chunk of memory
+ * @pagemap: device page map for that chunk
+ * @device: device to bind resource to
+ * @ops: memory operations callback
+ * @ref: per CPU refcount
+ *
+ * This is a helper structure for device drivers that do not wish to implement
+ * the gory details related to hotplugging new memory and allocating struct
+ * pages.
+ *
+ * Device drivers can directly use ZONE_DEVICE memory on their own if they
+ * wish to do so.
+ */
+struct hmm_devmem {
+   struct completion   completion;
+   unsigned long   pfn_first;
+   unsigned long   pfn_last;
+   struct resource *resource;
+   struct device   *device;
+   struct dev_pagemap  pagemap;
+   const struct hmm_devmem_ops *ops;
+   struct percpu_ref   ref;
+};
+
+/*
+ * To add (hotplug) device memory, HMM assumes that there is no real resource
+ * that reserves a range in the physical address space (this is intended to be
+ * used by unaddressable device memory). It will reserve a physical range big
+ * enough and allocate struct page for it.
+ *
+ * The device driver can wrap the hmm_devmem struct inside a private device
+ * driver struct. The device driver must call hmm_devmem_remove() before the
+ * device goes away and before freeing the hmm_devmem struct memory.
+ */
+struct hmm_devmem *hmm_devmem_add(const struct hmm_devmem_ops *ops,
+ struct device *device,
+ unsigned long size);
+void hmm_devmem_remove(struct hmm_devmem *devmem);
+
+int hmm_devmem_fault_range(struct hmm_devmem *devmem,
+  struct vm_area_struct *vma,
+  const struct migrate_vma_ops *ops,
+  unsigned long *src,
+  unsigned long *dst,
+  unsigned long start,
+  unsigned long addr,
+  unsigned long end,
+  void *private);
+
+/*
+ * hmm_devmem_page_set_drvdata - set per-page driver data field
+ *
+ * @page: pointer to struct page
+ * @data: driver data value to set
+ *
+ * Because the page cannot be on the LRU, we have an unsigned long that a driver
+ * can use to store a per-page field. This is just a simple helper to do that.
+ */
+static inline void hmm_devmem_page_set_drvdata(struct page *page,
+  unsigned long data)
+{
+   unsigned long *drvdata = (unsigned long *)&page->pgmap;
+
+   drvdata[1] = data;
+}
+
+/*
+ * hmm_devmem_page_get_drvdata - get per page driver data field
+ *
+ * @page: pointer to struct page
+ * Return: driver data value
+ */
+static inline unsigned long hmm_devmem_page_get_drvdata(struct page *page)
+{
+   unsigned long *drvdata = (unsigned long *)&page->pgmap;
+
+   return drvdata[1];
+}
+#endif /*
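
The per-page driver data helpers above rely on a ZONE_DEVICE page never
being on the LRU, so the storage next to pgmap is free for the driver. The
user-space sketch below models that with an explicit union. It is only an
illustration: struct mock_page is a simplified, hypothetical layout rather
than the kernel's struct page, and it assumes unsigned long is no wider than
a pointer.

```c
#include <assert.h>

struct mock_list_head {
	struct mock_list_head *next, *prev;
};

/* Simplified stand-in for struct page (assumption, not the real layout). */
struct mock_page {
	unsigned long flags;
	union {
		struct mock_list_head lru;	/* ordinary pages */
		struct {
			void *pgmap;		/* shares storage with lru.next */
			unsigned long drvdata;	/* shares storage with lru.prev */
		} dev;				/* device pages */
	};
};

_Static_assert(sizeof(unsigned long) <= sizeof(void *),
	       "drvdata must fit in the slot shared with lru.prev");

static void mock_page_set_drvdata(struct mock_page *page, unsigned long data)
{
	page->dev.drvdata = data;
}

static unsigned long mock_page_get_drvdata(const struct mock_page *page)
{
	return page->dev.drvdata;
}
```

Setting the driver data leaves the pgmap pointer untouched, which is the
property the drvdata[1] indexing in the patch depends on.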