[mm PATCH v7 3/4] mm: Implement new zone specific memblock iterator

2019-04-05 Thread Alexander Duyck
From: Alexander Duyck Introduce a new iterator for_each_free_mem_pfn_range_in_zone. This iterator will take care of making sure a given memory range provided is in fact contained within a zone. It takes care of all the bounds checking we were doing in deferred_grow_zone, and deferred_init_memmap

[mm PATCH v7 1/4] mm: Use mm_zero_struct_page from SPARC on all 64b architectures

2019-04-05 Thread Alexander Duyck
From: Alexander Duyck Use the same approach that was already in use on Sparc on all the architectures that support a 64b long. This is mostly motivated by the fact that 7 to 10 store/move instructions are likely always going to be faster than having to call into a function
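The idea in this snippet, clearing a small fixed-size structure with a handful of explicit stores rather than a function call, can be sketched in user-space C. This is only an illustration: `struct demo_page` and `demo_zero_page` are hypothetical names, the eight-word layout (64 bytes where `unsigned long` is 64 bits) is an assumption, and this is not the kernel's `mm_zero_struct_page` implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical eight-word stand-in for struct page; not the kernel's layout. */
struct demo_page {
    unsigned long words[8];
};

/* Clear the structure with explicit word-sized stores instead of calling memset(). */
static inline void demo_zero_page(struct demo_page *page)
{
    unsigned long *p = page->words;

    /* The unrolled stores below only cover the whole struct if it is exactly eight words. */
    assert(sizeof(*page) == 8 * sizeof(unsigned long));

    p[0] = 0;
    p[1] = 0;
    p[2] = 0;
    p[3] = 0;
    p[4] = 0;
    p[5] = 0;
    p[6] = 0;
    p[7] = 0;
}
```

Because the size is known at compile time, a compiler can typically emit the stores inline, which is the effect the patch description is after.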

[mm PATCH v7 0/4] Deferred page init improvements

2019-04-05 Thread Alexander Duyck
NR_PAGES patch to stop on section aligned boundaries Dropped patches 5-7 Will follow-up later with reserved bit rework before resubmitting --- Alexander Duyck (4): mm: Use mm_zero_struct_page from SPARC on all 64b architectures mm: Drop meminit_pfn_in_nid as it is redundant

[mm PATCH v7 2/4] mm: Drop meminit_pfn_in_nid as it is redundant

2019-04-05 Thread Alexander Duyck
From: Alexander Duyck As best as I can tell the meminit_pfn_in_nid call is completely redundant. The deferred memory initialization is already making use of for_each_free_mem_range which in turn will call into __next_mem_range which will only return a memory range if it matches the node ID

[mm PATCH v7 4/4] mm: Initialize MAX_ORDER_NR_PAGES at a time instead of doing larger sections

2019-04-05 Thread Alexander Duyck
From: Alexander Duyck Add yet another iterator, for_each_free_mem_range_in_zone_from, and then use it to support initializing and freeing pages in groups no larger than MAX_ORDER_NR_PAGES. By doing this we can greatly improve the cache locality of the pages while we do several loops over them
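A rough user-space sketch of the chunking described here: a PFN range is walked in pieces that never exceed a fixed size, and both passes (initialize, then free) run over one piece before moving on, so the piece is more likely to still be in cache. `DEMO_CHUNK_PAGES`, `struct demo_stats`, and `demo_init_range` are hypothetical names; the constant merely stands in for MAX_ORDER_NR_PAGES and the counters stand in for the real per-page work.

```c
#include <stddef.h>

/* Hypothetical stand-in for MAX_ORDER_NR_PAGES; must be a power of two. */
#define DEMO_CHUNK_PAGES 1024UL

struct demo_stats {
    unsigned long initialized;   /* pages touched by the first pass */
    unsigned long freed;         /* pages touched by the second pass */
    unsigned long chunks;        /* number of chunks processed */
    unsigned long largest_chunk; /* size of the biggest chunk seen */
};

/* Process the half-open PFN range [start_pfn, end_pfn) one aligned chunk at a time. */
static void demo_init_range(unsigned long start_pfn, unsigned long end_pfn,
                            struct demo_stats *st)
{
    unsigned long pfn = start_pfn;

    while (pfn < end_pfn) {
        /* End the chunk at the next aligned boundary or at the end of the range. */
        unsigned long chunk_end = (pfn | (DEMO_CHUNK_PAGES - 1)) + 1;
        unsigned long n;

        if (chunk_end > end_pfn)
            chunk_end = end_pfn;
        n = chunk_end - pfn;

        st->initialized += n; /* pass 1: stand-in for initializing the pages */
        st->freed += n;       /* pass 2: stand-in for freeing the same pages */
        st->chunks++;
        if (n > st->largest_chunk)
            st->largest_chunk = n;

        pfn = chunk_end;
    }
}
```

The sketch ignores overflow for PFNs near the top of the `unsigned long` range.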

Re: [mm PATCH v6 6/7] mm: Add reserved flag setting to set_page_links

2019-03-13 Thread Alexander Duyck
On Wed, 2019-03-13 at 09:33 -0700, Andrew Morton wrote: > On Tue, 12 Mar 2019 15:50:36 -0700 Alexander Duyck > wrote: > > > On Tue, 2019-03-12 at 15:07 -0700, Andrew Morton wrote: > > > On Wed, 5 Dec 2018 21:42:47 +0100 Michal Hocko wrote: > > > > >

Re: [mm PATCH v6 6/7] mm: Add reserved flag setting to set_page_links

2019-03-12 Thread Alexander Duyck
On Tue, 2019-03-12 at 15:07 -0700, Andrew Morton wrote: > On Wed, 5 Dec 2018 21:42:47 +0100 Michal Hocko wrote: > > > > I got your explanation. However Andrew had already applied the patches > > > and I had some outstanding issues in them that needed to be addressed. > > > So I thought it best

[driver-core PATCH v10 7/9] PM core: Use new async_schedule_dev command

2019-01-22 Thread Alexander Duyck
to the device. Reviewed-by: Dan Williams Reviewed-by: Bart Van Assche Reviewed-by: Rafael J. Wysocki Signed-off-by: Alexander Duyck --- drivers/base/power/main.c | 12 ++-- 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/drivers/base/power/main.c b/drivers/base/power/main.c

[driver-core PATCH v10 1/9] driver core: Establish order of operations for device_add and device_del via bitflag

2019-01-22 Thread Alexander Duyck
th it makes more sense to just consolidate the dev->dead and dev->driver checks together into one set of checks. Reviewed-by: Dan Williams Reviewed-by: Rafael J. Wysocki Signed-off-by: Alexander Duyck --- drivers/base/base.h |4 drivers/base/core.c | 11 +++ driv

[driver-core PATCH v10 4/9] workqueue: Provide queue_work_node to queue work near a given NUMA node

2019-01-22 Thread Alexander Duyck
Signed-off-by: Alexander Duyck --- include/linux/workqueue.h |2 + kernel/workqueue.c| 84 + 2 files changed, 86 insertions(+) diff --git a/include/linux/workqueue.h b/include/linux/workqueue.h index 60d673e15632..1f50c1e586e7 100644 --- a/inc

[driver-core PATCH v10 6/9] driver core: Attach devices on CPU local to device node

2019-01-22 Thread Alexander Duyck
Signed-off-by: Alexander Duyck --- drivers/base/dd.c |4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/base/dd.c b/drivers/base/dd.c index 627ad05064e0..aa6a9c613595 100644 --- a/drivers/base/dd.c +++ b/drivers/base/dd.c @@ -829,7 +829,7 @@ static int __device_attach

[driver-core PATCH v10 9/9] driver core: Rewrite test_async_driver_probe to cover serialization and NUMA affinity

2019-01-22 Thread Alexander Duyck
-by: Alexander Duyck --- drivers/base/test/test_async_driver_probe.c | 261 +-- 1 file changed, 201 insertions(+), 60 deletions(-) diff --git a/drivers/base/test/test_async_driver_probe.c b/drivers/base/test/test_async_driver_probe.c index e7f145d662f0..f4b1d8e54daf 100644

[driver-core PATCH v10 8/9] libnvdimm: Schedule device registration on node local to the device

2019-01-22 Thread Alexander Duyck
. By guaranteeing NUMA locality I see an improvement of as high as 25% for per-node init of a system with 12TB of persistent memory. Reviewed-by: Bart Van Assche Signed-off-by: Alexander Duyck --- drivers/nvdimm/bus.c | 11 --- 1 file changed, 8 insertions(+), 3 deletions(-) diff

[driver-core PATCH v10 5/9] async: Add support for queueing on specific NUMA node

2019-01-22 Thread Alexander Duyck
to the device. Reviewed-by: Bart Van Assche Reviewed-by: Dan Williams Signed-off-by: Alexander Duyck --- include/linux/async.h | 82 +++-- kernel/async.c| 53 +--- 2 files changed, 108 insertions(+), 27 deletions(-) diff

[driver-core PATCH v10 0/9] Add NUMA aware async_schedule calls

2019-01-22 Thread Alexander Duyck
is now patch 1, and added Reviewed-by from Dan Williams v9->v10: Moved "dead" from device struct to device_private struct Added Reviewed-by from Rafael to patch 1 Rebased on latest linux-next --- Alexander Duyck (9): driver core: Establish order of operations fo

[driver-core PATCH v10 2/9] device core: Consolidate locking and unlocking of parent and device

2019-01-22 Thread Alexander Duyck
-by: Luis Chamberlain Reviewed-by: Bart Van Assche Reviewed-by: Dan Williams Reviewed-by: Rafael J. Wysocki Signed-off-by: Alexander Duyck --- drivers/base/base.h |2 + drivers/base/bus.c | 23 ++-- drivers/base/dd.c | 95 --- 3

Re: [driver-core PATCH v9 1/9] driver core: Establish order of operations for device_add and device_del via bitflag

2019-01-18 Thread Alexander Duyck
On Fri, 2019-01-18 at 16:54 +0100, Greg KH wrote: > On Wed, Dec 12, 2018 at 04:44:58PM -0800, Alexander Duyck wrote: > > Add an additional bit flag to the device struct named "dead". > > > > This additional flag provides a guarantee that when a device_del is

Re: [driver-core PATCH v9 1/9] driver core: Establish order of operations for device_add and device_del via bitflag

2019-01-10 Thread Alexander Duyck
On Thu, 2018-12-20 at 16:28 +0100, Greg Kroah-Hartman wrote: > On Wed, Dec 19, 2018 at 03:27:48PM +0100, Rafael J. Wysocki wrote: > > On Thu, Dec 13, 2018 at 1:45 AM Alexander Duyck > > wrote: > > > > > > Add an additional bit flag to the device struct named

Re: [driver-core PATCH v9 2/9] device core: Consolidate locking and unlocking of parent and device

2018-12-17 Thread Alexander Duyck
On Fri, 2018-12-14 at 11:40 +0100, Rafael J. Wysocki wrote: > On Thu, Dec 13, 2018 at 1:45 AM Alexander Duyck > wrote: > > > > Try to consolidate all of the locking and unlocking of both the parent and > > device when attaching or removing a driver from a given devic

[driver-core PATCH v9 8/9] libnvdimm: Schedule device registration on node local to the device

2018-12-12 Thread Alexander Duyck
. By guaranteeing NUMA locality I see an improvement of as high as 25% for per-node init of a system with 12TB of persistent memory. Reviewed-by: Bart Van Assche Signed-off-by: Alexander Duyck --- drivers/nvdimm/bus.c | 11 --- 1 file changed, 8 insertions(+), 3 deletions(-) diff

[driver-core PATCH v9 9/9] driver core: Rewrite test_async_driver_probe to cover serialization and NUMA affinity

2018-12-12 Thread Alexander Duyck
-by: Alexander Duyck --- drivers/base/test/test_async_driver_probe.c | 261 +-- 1 file changed, 201 insertions(+), 60 deletions(-) diff --git a/drivers/base/test/test_async_driver_probe.c b/drivers/base/test/test_async_driver_probe.c index e7f145d662f0..f4b1d8e54daf 100644

[driver-core PATCH v9 7/9] PM core: Use new async_schedule_dev command

2018-12-12 Thread Alexander Duyck
to the device. Reviewed-by: Dan Williams Reviewed-by: Bart Van Assche Reviewed-by: Rafael J. Wysocki Signed-off-by: Alexander Duyck --- drivers/base/power/main.c | 12 ++-- 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/drivers/base/power/main.c b/drivers/base/power/main.c

[driver-core PATCH v9 5/9] async: Add support for queueing on specific NUMA node

2018-12-12 Thread Alexander Duyck
to the device. Reviewed-by: Bart Van Assche Reviewed-by: Dan Williams Signed-off-by: Alexander Duyck --- include/linux/async.h | 82 +++-- kernel/async.c| 53 +--- 2 files changed, 108 insertions(+), 27 deletions(-) diff

[driver-core PATCH v9 6/9] driver core: Attach devices on CPU local to device node

2018-12-12 Thread Alexander Duyck
Signed-off-by: Alexander Duyck --- drivers/base/dd.c |4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/base/dd.c b/drivers/base/dd.c index 3353f654861b..be9040db8321 100644 --- a/drivers/base/dd.c +++ b/drivers/base/dd.c @@ -829,7 +829,7 @@ static int __device_attach

[driver-core PATCH v9 0/9] Add NUMA aware async_schedule calls

2018-12-12 Thread Alexander Duyck
Dropped Reviewed-by from patch 2 due to significant changes Added Reviewed-by for patches reviewed by Luis Chamberlain v8->v9: Dropped patch 1 as it was applied, shifted remaining patches by 1 Added new patch 9 that adds test framework for NUMA and sequential init Tweaked what

[driver-core PATCH v9 2/9] device core: Consolidate locking and unlocking of parent and device

2018-12-12 Thread Alexander Duyck
-by: Luis Chamberlain Reviewed-by: Bart Van Assche Reviewed-by: Dan Williams Reviewed-by: Rafael J. Wysocki Signed-off-by: Alexander Duyck --- drivers/base/base.h |2 + drivers/base/bus.c | 23 ++-- drivers/base/dd.c | 95 --- 3

[driver-core PATCH v9 4/9] workqueue: Provide queue_work_node to queue work near a given NUMA node

2018-12-12 Thread Alexander Duyck
Signed-off-by: Alexander Duyck --- include/linux/workqueue.h |2 + kernel/workqueue.c| 84 + 2 files changed, 86 insertions(+) diff --git a/include/linux/workqueue.h b/include/linux/workqueue.h index 60d673e15632..1f50c1e586e7 100644 --- a/inc

[driver-core PATCH v9 1/9] driver core: Establish order of operations for device_add and device_del via bitflag

2018-12-12 Thread Alexander Duyck
makes more sense to just consolidate the dev->dead and dev->driver checks together into one set of checks. Reviewed-by: Dan Williams Signed-off-by: Alexander Duyck --- drivers/base/core.c| 11 +++ drivers/base/dd.c | 22 +++--- include/linux/device.h

[driver-core PATCH v9 3/9] driver core: Probe devices asynchronously instead of the driver

2018-12-12 Thread Alexander Duyck
. To achieve this it was necessary to add a new member "async_driver" to the device_private structure to store the driver pointer while we wait on the deferred probe call. Reviewed-by: Bart Van Assche Reviewed-by: Dan Williams Signed-off-by: Alexander Duyck --- drivers/base/base.h |2 ++ dr

Re: [driver-core PATCH v8 0/9] Add NUMA aware async_schedule calls

2018-12-10 Thread Alexander Duyck
On Mon, 2018-12-10 at 11:22 -0800, Luis Chamberlain wrote: > On Wed, Dec 05, 2018 at 09:25:13AM -0800, Alexander Duyck wrote: > > This patch set provides functionality that will help to improve the > > locality of the async_schedule calls used to provide deferred >

Re: [driver-core PATCH v8 2/9] driver core: Establish order of operations for device_add and device_del via bitflag

2018-12-10 Thread Alexander Duyck
On Mon, 2018-12-10 at 13:23 -0800, Dan Williams wrote: > On Mon, Dec 10, 2018 at 1:15 PM Dan Williams wrote: > > > > On Mon, Dec 10, 2018 at 12:58 PM Alexander Duyck > > wrote: > > [..] > > > Also the context for the two functions seems to b

Re: [driver-core PATCH v8 2/9] driver core: Establish order of operations for device_add and device_del via bitflag

2018-12-10 Thread Alexander Duyck
On Mon, 2018-12-10 at 11:43 -0800, Dan Williams wrote: > On Mon, Dec 10, 2018 at 11:35 AM Alexander Duyck > wrote: > > > > On Mon, 2018-12-10 at 10:58 -0800, Dan Williams wrote: > > > On Wed, Dec 5, 2018 at 9:25 AM Alexander Duyck > > > wrote: >

Re: [driver-core PATCH v8 2/9] driver core: Establish order of operations for device_add and device_del via bitflag

2018-12-10 Thread Alexander Duyck
On Mon, 2018-12-10 at 10:58 -0800, Dan Williams wrote: > On Wed, Dec 5, 2018 at 9:25 AM Alexander Duyck > wrote: > > > > Add an additional bit flag to the device struct named "dead". > > > > This additional flag provides a guarantee that when a device_

Re: [mm PATCH v6 6/7] mm: Add reserved flag setting to set_page_links

2018-12-05 Thread Alexander Duyck
On Wed, 2018-12-05 at 18:22 +0100, Michal Hocko wrote: > On Fri 30-11-18 13:53:18, Alexander Duyck wrote: > > Modify the set_page_links function to include the setting of the reserved > > flag via a simple AND and OR operation. The motivation for this is the fact > > that t

[driver-core PATCH v8 8/9] PM core: Use new async_schedule_dev command

2018-12-05 Thread Alexander Duyck
to the device. Reviewed-by: Dan Williams Reviewed-by: Bart Van Assche Reviewed-by: Rafael J. Wysocki Signed-off-by: Alexander Duyck --- drivers/base/power/main.c | 12 ++-- 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/drivers/base/power/main.c b/drivers/base/power/main.c

[driver-core PATCH v8 5/9] workqueue: Provide queue_work_node to queue work near a given NUMA node

2018-12-05 Thread Alexander Duyck
Signed-off-by: Alexander Duyck --- include/linux/workqueue.h |2 + kernel/workqueue.c| 84 + 2 files changed, 86 insertions(+) diff --git a/include/linux/workqueue.h b/include/linux/workqueue.h index 60d673e15632..1f50c1e586e7 100644 --- a/inc

[driver-core PATCH v8 9/9] libnvdimm: Schedule device registration on node local to the device

2018-12-05 Thread Alexander Duyck
. By guaranteeing NUMA locality I see an improvement of as high as 25% for per-node init of a system with 12TB of persistent memory. Reviewed-by: Bart Van Assche Signed-off-by: Alexander Duyck --- drivers/nvdimm/bus.c | 11 --- 1 file changed, 8 insertions(+), 3 deletions(-) diff

[driver-core PATCH v8 7/9] driver core: Attach devices on CPU local to device node

2018-12-05 Thread Alexander Duyck
Signed-off-by: Alexander Duyck --- drivers/base/dd.c |4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/base/dd.c b/drivers/base/dd.c index b731741059cb..49096adf96a1 100644 --- a/drivers/base/dd.c +++ b/drivers/base/dd.c @@ -833,7 +833,7 @@ static int __device_attach

[driver-core PATCH v8 6/9] async: Add support for queueing on specific NUMA node

2018-12-05 Thread Alexander Duyck
to the device. Reviewed-by: Bart Van Assche Reviewed-by: Dan Williams Signed-off-by: Alexander Duyck --- include/linux/async.h | 82 +++-- kernel/async.c| 53 +--- 2 files changed, 108 insertions(+), 27 deletions(-) diff

[driver-core PATCH v8 2/9] driver core: Establish order of operations for device_add and device_del via bitflag

2018-12-05 Thread Alexander Duyck
ot present and could result in the device_del call attempting to remove a driver from an interface only to have the async worker attempt to probe the driver later when it finally completes the asynchronous probe call. Signed-off-by: Alexander Duyck --- drivers/base/core.c| 11 +++ dr

[driver-core PATCH v8 1/9] driver core: Move async_synchronize_full call

2018-12-05 Thread Alexander Duyck
of any asynchronous events the driver we are detaching might have scheduled. Fixes: 765230b5f084 ("driver-core: add asynchronous probing support for drivers") Reviewed-by: Luis Chamberlain Reviewed-by: Bart Van Assche Reviewed-by: Dan Williams Signed-off-by: Alexander Duyck --- drivers

[driver-core PATCH v8 4/9] driver core: Probe devices asynchronously instead of the driver

2018-12-05 Thread Alexander Duyck
. To achieve this it was necessary to add a new member "async_driver" to the device_private structure to store the driver pointer while we wait on the deferred probe call. Reviewed-by: Bart Van Assche Reviewed-by: Dan Williams Signed-off-by: Alexander Duyck --- drivers/base/base.h |2 ++ dr

Re: [PATCH RFC 2/3] mm: Add support for exposing if dev_pagemap supports refcount pinning

2018-12-04 Thread Alexander Duyck
On Tue, 2018-12-04 at 18:24 -0500, Barret Rhoden wrote: > Hi - > > On 2018-12-04 at 14:51 Alexander Duyck > wrote: > > [snip] > > > > I think the confusion arises from the fact that there are a few MMIO > > > resources with a struct page

Re: [PATCH RFC 2/3] mm: Add support for exposing if dev_pagemap supports refcount pinning

2018-12-04 Thread Alexander Duyck
On Tue, 2018-12-04 at 11:08 -0800, Dan Williams wrote: > On Mon, Dec 3, 2018 at 1:50 PM Alexander Duyck > wrote: > > > > On Mon, 2018-12-03 at 13:05 -0800, Dan Williams wrote: > > > On Mon, Dec 3, 2018 at 12:53 PM Alexander Duyck > > > wrote: > > >

Re: [PATCH RFC 0/3] Fix KVM misinterpreting Reserved page as an MMIO page

2018-12-04 Thread Alexander Duyck
On Tue, 2018-12-04 at 14:59 +0800, Yi Zhang wrote: > On 2018-12-03 at 11:25:20 -0800, Alexander Duyck wrote: > > I have loosely based this patch series off of the following patch series > > from Zhang Yi: > > https://lore.kernel.org/lkml/cover.1536342881.git.yi.z.

Re: [PATCH RFC 2/3] mm: Add support for exposing if dev_pagemap supports refcount pinning

2018-12-03 Thread Alexander Duyck
On Mon, 2018-12-03 at 13:05 -0800, Dan Williams wrote: > On Mon, Dec 3, 2018 at 12:53 PM Alexander Duyck > wrote: > > > > On Mon, 2018-12-03 at 12:31 -0800, Dan Williams wrote: > > > On Mon, Dec 3, 2018 at 12:21 PM Alexander Duyck > > > wrote: > > >

Re: [PATCH RFC 2/3] mm: Add support for exposing if dev_pagemap supports refcount pinning

2018-12-03 Thread Alexander Duyck
On Mon, 2018-12-03 at 12:31 -0800, Dan Williams wrote: > On Mon, Dec 3, 2018 at 12:21 PM Alexander Duyck > wrote: > > > > On Mon, 2018-12-03 at 11:47 -0800, Dan Williams wrote: > > > On Mon, Dec 3, 2018 at 11:25 AM Alexander Duyck > > > wrote: >

Re: [PATCH RFC 2/3] mm: Add support for exposing if dev_pagemap supports refcount pinning

2018-12-03 Thread Alexander Duyck
On Mon, 2018-12-03 at 11:47 -0800, Dan Williams wrote: > On Mon, Dec 3, 2018 at 11:25 AM Alexander Duyck > wrote: > > > > Add a means of exposing if a pagemap supports refcount pinning. I am doing > > this to expose if a given pagemap has backing stru

[PATCH RFC 0/3] Fix KVM misinterpreting Reserved page as an MMIO page

2018-12-03 Thread Alexander Duyck
to determine support for page pinning to include a check of the pgmap to see if it supports pinning. --- Alexander Duyck (3): kvm: Split use cases for kvm_is_reserved_pfn to kvm_is_refcounted_pfn mm: Add support for exposing if dev_pagemap supports refcount pinning kvm: Add

[PATCH RFC 3/3] kvm: Add additional check to determine if a page is refcounted

2018-12-03 Thread Alexander Duyck
on the page. As such I have updated the check to add a special case for ZONE_DEVICE pages that have the new support_refcount_pinning flag set. Signed-off-by: Alexander Duyck --- virt/kvm/kvm_main.c | 16 ++-- 1 file changed, 14 insertions(+), 2 deletions(-) diff --git a/virt/kvm

[PATCH RFC 2/3] mm: Add support for exposing if dev_pagemap supports refcount pinning

2018-12-03 Thread Alexander Duyck
-by: Alexander Duyck --- drivers/nvdimm/pfn_devs.c |2 ++ include/linux/memremap.h |5 - include/linux/mm.h| 11 +++ 3 files changed, 17 insertions(+), 1 deletion(-) diff --git a/drivers/nvdimm/pfn_devs.c b/drivers/nvdimm/pfn_devs.c index 6f22272e8d80..7a4a85bcf7f4

[PATCH RFC 1/3] kvm: Split use cases for kvm_is_reserved_pfn to kvm_is_refcounted_pfn

2018-12-03 Thread Alexander Duyck
issues with attempted LRU accesses on the ZONE_DEVICE pages. Signed-off-by: Alexander Duyck --- arch/x86/kvm/mmu.c |6 +++--- include/linux/kvm_host.h |2 +- virt/kvm/kvm_main.c | 22 +- 3 files changed, 17 insertions(+), 13 deletions(-) diff --git a/arch/x86

Re: [PATCH v2 0/3] kvm: Use huge pages for DAX-backed files

2018-12-03 Thread Alexander Duyck
On Mon, 2018-12-03 at 12:40 -0500, Barret Rhoden wrote: > On 2018-11-14 at 16:55 Dan Williams wrote: > > [ add Alex who is looking into removing PageReserved for DAX pages. ] > > Thanks. I can keep my eye out for his patches and repost once that's > done. > > Alternatively, if you all want

Re: [driver-core PATCH v7 4/9] driver core: Probe devices asynchronously instead of the driver

2018-12-03 Thread Alexander Duyck
On Fri, 2018-11-30 at 18:48 -0800, Luis Chamberlain wrote: > On Wed, Nov 28, 2018 at 04:32:26PM -0800, Alexander Duyck wrote: > > Probe devices asynchronously instead of the driver. > > +static void __driver_attach_async_helper(void *_dev, async_cookie_t cookie) > > +{ >

[mm PATCH v6 7/7] mm: Use common iterator for deferred_init_pages and deferred_free_pages

2018-11-30 Thread Alexander Duyck
as a result of this patch. Reviewed-by: Pavel Tatashin Signed-off-by: Alexander Duyck --- mm/page_alloc.c | 146 +-- 1 file changed, 77 insertions(+), 69 deletions(-) diff --git a/mm/page_alloc.c b/mm/page_alloc.c index 61eb9945d805..48c6fc73a70d

[mm PATCH v6 5/7] mm: Move hot-plug specific memory init into separate functions and optimize

2018-11-30 Thread Alexander Duyck
such as the LRU pointers versus the pgmap. Reviewed-by: Pavel Tatashin Signed-off-by: Alexander Duyck --- mm/page_alloc.c | 208 --- 1 file changed, 135 insertions(+), 73 deletions(-) diff --git a/mm/page_alloc.c b/mm/page_alloc.c index

[mm PATCH v6 6/7] mm: Add reserved flag setting to set_page_links

2018-11-30 Thread Alexander Duyck
lue. Previously the generated value was written and then a btsq instruction was issued. On my x86_64 test system with 3TB of persistent memory per node I saw the persistent memory initialization time on average drop from 23.49s to 19.12s per node. Reviewed-by: Pavel Tatashin Signed-off-by: Alexan

[mm PATCH v6 2/7] mm: Drop meminit_pfn_in_nid as it is redundant

2018-11-30 Thread Alexander Duyck
system with 384GB of memory per node I saw a reduction in initialization time from 2.80s to 1.85s as a result of this patch. Reviewed-by: Pavel Tatashin Acked-by: Michal Hocko Signed-off-by: Alexander Duyck --- mm/page_alloc.c | 51 ++- 1 file

[mm PATCH v6 3/7] mm: Implement new zone specific memblock iterator

2018-11-30 Thread Alexander Duyck
help to speed up the search a bit by iterating until the end of a range is greater than the start of the zone pfn range, and will exit completely if the start is beyond the end of the zone. Reviewed-by: Pavel Tatashin Signed-off-by: Alexander Duyck --- include/linux/memblock.h | 25
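The search behaviour described in this snippet (skip ranges that end before the zone starts, stop entirely once a range starts past the end of the zone) can be sketched over a sorted array of ranges. Everything here is an assumption made for illustration: `struct demo_range` and `demo_next_range_in_zone` are hypothetical names, ranges are half-open PFN intervals, and the real iterator works on memblock regions rather than a plain array.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical half-open PFN range [start_pfn, end_pfn). */
struct demo_range {
    unsigned long start_pfn;
    unsigned long end_pfn;
};

/*
 * Return the next piece of the sorted ranges that lies inside
 * [zone_start, zone_end), clamped to the zone. *idx carries the
 * position between calls. Returns false when nothing is left.
 */
static bool demo_next_range_in_zone(const struct demo_range *ranges, size_t n,
                                    size_t *idx,
                                    unsigned long zone_start,
                                    unsigned long zone_end,
                                    unsigned long *out_start,
                                    unsigned long *out_end)
{
    while (*idx < n) {
        const struct demo_range *cur = &ranges[*idx];

        /* Sorted input: once a range starts past the zone, no later one can match. */
        if (cur->start_pfn >= zone_end) {
            *idx = n;
            return false;
        }

        (*idx)++;

        /* Skip ranges that end at or before the start of the zone. */
        if (cur->end_pfn <= zone_start)
            continue;

        /* Clamp the overlapping range to the zone bounds. */
        *out_start = cur->start_pfn > zone_start ? cur->start_pfn : zone_start;
        *out_end = cur->end_pfn < zone_end ? cur->end_pfn : zone_end;
        return true;
    }

    return false;
}
```

A caller would loop on the function until it returns false, handling each clamped `[out_start, out_end)` piece in turn.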

[mm PATCH v6 1/7] mm: Use mm_zero_struct_page from SPARC on all 64b architectures

2018-11-30 Thread Alexander Duyck
-byte struct page. Mike Rapoport ran a similar test on an OpenPower (S812LC 8348-21C) with Power8 processor and 128GB of RAM. His results per 64-byte struct page were 4.68ns before, and 4.59ns after this patch. Reviewed-by: Pavel Tatashin Acked-by: Michal Hocko Signed-off-by: Alexander Duyck

[mm PATCH v6 0/7] Deferred page init improvements

2018-11-30 Thread Alexander Duyck
Updated several Reviewed-by, and incorporated suggestions from Pavel Added __init_single_page_nolru to patch 5 to consolidate code Refactored iterator in patch 7 and fixed several issues --- Alexander Duyck (7): mm: Use mm_zero_struct_page from SPARC on all 64b architectures mm: Drop m

[mm PATCH v6 4/7] mm: Initialize MAX_ORDER_NR_PAGES at a time instead of doing larger sections

2018-11-30 Thread Alexander Duyck
Tatashin Signed-off-by: Alexander Duyck --- include/linux/memblock.h | 16 + mm/page_alloc.c | 160 +- 2 files changed, 132 insertions(+), 44 deletions(-) diff --git a/include/linux/memblock.h b/include/linux/memblock.h index 95d1aa

Re: [driver-core PATCH v7 2/9] driver core: Establish clear order of operations for deferred probe and remove

2018-11-29 Thread Alexander Duyck
On Thu, 2018-11-29 at 10:55 -0800, Dan Williams wrote: > On Thu, Nov 29, 2018 at 10:07 AM Alexander Duyck > wrote: > > > > On Wed, 2018-11-28 at 17:57 -0800, Dan Williams wrote: > > [..] > > > I think the flag should be named "cancel" and set i

Re: [driver-core PATCH v7 2/9] driver core: Establish clear order of operations for deferred probe and remove

2018-11-29 Thread Alexander Duyck
On Wed, 2018-11-28 at 17:57 -0800, Dan Williams wrote: > On Wed, Nov 28, 2018 at 4:32 PM Alexander Duyck > wrote: > > > > Add an additional bit flag to the device struct named async_probe. This > > additional flag allows us to guarantee ordering between probe a

[driver-core PATCH v7 3/9] device core: Consolidate locking and unlocking of parent and device

2018-11-28 Thread Alexander Duyck
specific to attaching and detaching the driver while acquiring these locks. By doing this I was able to reduce the number of spots where we touch need_parent_lock from 12 down to 4. Reviewed-by: Bart Van Assche Reviewed-by: Dan Williams Reviewed-by: Rafael J. Wysocki Signed-off-by: Alexander

[driver-core PATCH v7 1/9] driver core: Move async_synchronize_full call

2018-11-28 Thread Alexander Duyck
of any asynchronous events the driver we are detaching might have scheduled. Fixes: 765230b5f084 ("driver-core: add asynchronous probing support for drivers") Reviewed-by: Bart Van Assche Reviewed-by: Dan Williams Signed-off-by: Alexander Duyck --- drivers/base/dd.c |6 +++--- 1 file

[driver-core PATCH v7 4/9] driver core: Probe devices asynchronously instead of the driver

2018-11-28 Thread Alexander Duyck
. To achieve this it was necessary to add a new member "async_driver" to the device_private structure to store the driver pointer while we wait on the deferred probe call. Reviewed-by: Bart Van Assche Reviewed-by: Dan Williams Signed-off-by: Alexander Duyck --- drivers/base/base.h |2 + dr

[driver-core PATCH v7 8/9] PM core: Use new async_schedule_dev command

2018-11-28 Thread Alexander Duyck
to the device. Reviewed-by: Dan Williams Reviewed-by: Bart Van Assche Reviewed-by: Rafael J. Wysocki Signed-off-by: Alexander Duyck --- drivers/base/power/main.c | 12 ++-- 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/drivers/base/power/main.c b/drivers/base/power/main.c

[driver-core PATCH v7 7/9] driver core: Attach devices on CPU local to device node

2018-11-28 Thread Alexander Duyck
Signed-off-by: Alexander Duyck --- drivers/base/dd.c |4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/base/dd.c b/drivers/base/dd.c index 036c8ffa522f..b24a5473c968 100644 --- a/drivers/base/dd.c +++ b/drivers/base/dd.c @@ -853,7 +853,7 @@ static int __device_attach

[driver-core PATCH v7 0/9] Add NUMA aware async_schedule calls

2018-11-28 Thread Alexander Duyck
eviewed-by for several patches Updated cover page and patch descriptions throughout the set --- Alexander Duyck (9): driver core: Move async_synchronize_full call driver core: Establish clear order of operations for deferred probe and remove device core: Consolidate loc

[driver-core PATCH v7 5/9] workqueue: Provide queue_work_node to queue work near a given NUMA node

2018-11-28 Thread Alexander Duyck
Signed-off-by: Alexander Duyck --- include/linux/workqueue.h |2 + kernel/workqueue.c| 84 + 2 files changed, 86 insertions(+) diff --git a/include/linux/workqueue.h b/include/linux/workqueue.h index 60d673e15632..1f50c1e586e7 100644 --- a/inc

[driver-core PATCH v7 2/9] driver core: Establish clear order of operations for deferred probe and remove

2018-11-28 Thread Alexander Duyck
Signed-off-by: Alexander Duyck --- drivers/base/dd.c | 16 include/linux/device.h |3 +++ 2 files changed, 19 insertions(+) diff --git a/drivers/base/dd.c b/drivers/base/dd.c index 88713f182086..ef3f70a7cb5a 100644 --- a/drivers/base/dd.c +++ b/drivers/base/dd.c

Re: [driver-core PATCH v6 4/9] driver core: Move async_synchronize_full call

2018-11-27 Thread Alexander Duyck
On Tue, 2018-11-27 at 12:35 -0800, Dan Williams wrote: > On Tue, Nov 27, 2018 at 9:38 AM Alexander Duyck > wrote: > > > > On Mon, 2018-11-26 at 18:11 -0800, Dan Williams wrote: > > > On Thu, Nov 8, 2018 at 10:07 AM Alexander Duyck > > > wrote: > >

Re: [driver-core PATCH v6 6/9] driver core: Probe devices asynchronously instead of the driver

2018-11-27 Thread Alexander Duyck
On Mon, 2018-11-26 at 18:48 -0800, Dan Williams wrote: > On Thu, Nov 8, 2018 at 10:07 AM Alexander Duyck > wrote: > > > > Probe devices asynchronously instead of the driver. This results in us > > seeing the same behavior if the device is registered before the driver o

Re: [driver-core PATCH v6 9/9] libnvdimm: Schedule device registration on node local to the device

2018-11-27 Thread Alexander Duyck
On Mon, 2018-11-26 at 18:21 -0800, Dan Williams wrote: > On Thu, Nov 8, 2018 at 10:07 AM Alexander Duyck > wrote: > > > > Force the device registration for nvdimm devices to be closer to the actual > > device. This is achieved by using either the NUMA node ID of the regi

Re: [driver-core PATCH v6 4/9] driver core: Move async_synchronize_full call

2018-11-27 Thread Alexander Duyck
On Mon, 2018-11-26 at 18:11 -0800, Dan Williams wrote: > On Thu, Nov 8, 2018 at 10:07 AM Alexander Duyck > wrote: > > > > Move the async_synchronize_full call out of __device_release_driver and > > into driver_detach. > > > > The idea behind this is

Re: [driver-core PATCH v5 5/9] driver core: Establish clear order of operations for deferred probe and remove

2018-11-27 Thread Alexander Duyck
On Mon, 2018-11-26 at 18:35 -0800, Dan Williams wrote: > On Mon, Nov 5, 2018 at 1:12 PM Alexander Duyck > wrote: > > > > This patch adds an additional bit to the device struct named async_probe. > > This additional bit allows us to guarantee ordering between probe a

Re: [mm PATCH v5 4/7] mm: Initialize MAX_ORDER_NR_PAGES at a time instead of doing larger sections

2018-11-19 Thread Alexander Duyck
On Fri, 2018-11-09 at 20:02 -0500, Pavel Tatashin wrote: > On 18-11-05 13:19:45, Alexander Duyck wrote: > > } > > - first_init_pfn = max(zone->zone_start_pfn, first_init_pfn); > > + > > + /* If the zone is empty somebody else

Re: [mm PATCH v5 0/7] Deferred page init improvements

2018-11-15 Thread Alexander Duyck
On 11/15/2018 12:10 AM, Michal Hocko wrote: On Wed 14-11-18 16:50:23, Alexander Duyck wrote: On 11/14/2018 7:07 AM, Michal Hocko wrote: On Mon 05-11-18 13:19:25, Alexander Duyck wrote: This patchset is essentially a refactor of the page initialization logic that is meant to provide

Re: [mm PATCH v5 0/7] Deferred page init improvements

2018-11-14 Thread Alexander Duyck
On 11/14/2018 7:07 AM, Michal Hocko wrote: On Mon 05-11-18 13:19:25, Alexander Duyck wrote: This patchset is essentially a refactor of the page initialization logic that is meant to provide for better code reuse while providing a significant improvement in deferred page initialization

Re: [mm PATCH v5 0/7] Deferred page init improvements

2018-11-12 Thread Alexander Duyck
On Fri, Nov 9, 2018 at 5:17 PM Pavel Tatashin wrote: > > On 18-11-09 16:46:02, Alexander Duyck wrote: > > On Fri, 2018-11-09 at 19:00 -0500, Pavel Tatashin wrote: > > > On 18-11-09 15:14:35, Alexander Duyck wrote: > > > > On Fri, 2018-11-09 at 16:15 -0500, Pavel T

Re: [mm PATCH v5 7/7] mm: Use common iterator for deferred_init_pages and deferred_free_pages

2018-11-12 Thread Alexander Duyck
On 11/9/2018 8:13 PM, Pavel Tatashin wrote: On 18-11-05 13:20:01, Alexander Duyck wrote: +static unsigned long __next_pfn_valid_range(unsigned long *i, + unsigned long end_pfn) { - if (!pfn_valid_within(pfn)) - return false

Re: [driver-core PATCH v6 2/9] async: Add support for queueing on specific NUMA node

2018-11-11 Thread Alexander Duyck
On 11/11/2018 12:35 PM, Greg KH wrote: On Sun, Nov 11, 2018 at 11:53:20AM -0800, Dan Williams wrote: On Sun, Nov 11, 2018 at 11:32 AM Greg KH wrote: On Thu, Nov 08, 2018 at 10:06:50AM -0800, Alexander Duyck wrote: Introduce four new variants of the async_schedule_ functions that allow

Re: [mm PATCH v5 0/7] Deferred page init improvements

2018-11-09 Thread Alexander Duyck
On Fri, 2018-11-09 at 19:00 -0500, Pavel Tatashin wrote: > On 18-11-09 15:14:35, Alexander Duyck wrote: > > On Fri, 2018-11-09 at 16:15 -0500, Pavel Tatashin wrote: > > > On 18-11-05 13:19:25, Alexander Duyck wrote: > > > > This patchset is essentially a refactor of

Re: [mm PATCH v5 3/7] mm: Implement new zone specific memblock iterator

2018-11-09 Thread Alexander Duyck
On Fri, 2018-11-09 at 18:26 -0500, Pavel Tatashin wrote: > > +/** > > + * for_each_free_mem_range_in_zone - iterate through zone specific free > > + * memblock areas > > + * @i: u64 used as loop variable > > + * @zone: zone in which all of the memory blocks reside > > + * @p_start: ptr to

Re: [driver-core PATCH v6 3/9] device core: Consolidate locking and unlocking of parent and device

2018-11-08 Thread Alexander Duyck
On Thu, 2018-11-08 at 14:43 -0800, jane@oracle.com wrote: > Hi, Alex, > > > On 11/08/2018 10:06 AM, Alexander Duyck wrote: > > +/* > > + * __device_driver_lock - release locks needed to manipulate dev->drv > > You meant to say __device_driver_unl

[driver-core PATCH v6 9/9] libnvdimm: Schedule device registration on node local to the device

2018-11-08 Thread Alexander Duyck
. By guaranteeing NUMA locality I see an improvement of as high as 25% for per-node init of a system with 12TB of persistent memory. Reviewed-by: Bart Van Assche Signed-off-by: Alexander Duyck --- drivers/nvdimm/bus.c | 11 --- 1 file changed, 8 insertions(+), 3 deletions(-) diff

[driver-core PATCH v6 4/9] driver core: Move async_synchronize_full call

2018-11-08 Thread Alexander Duyck
of any asynchronous events the driver we are detaching might have scheduled. Reviewed-by: Bart Van Assche Signed-off-by: Alexander Duyck --- drivers/base/dd.c |6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/base/dd.c b/drivers/base/dd.c index 76c40fe69463

[driver-core PATCH v6 6/9] driver core: Probe devices asynchronously instead of the driver

2018-11-08 Thread Alexander Duyck
-off-by: Alexander Duyck --- drivers/base/bus.c | 23 ++-- drivers/base/dd.c | 68 include/linux/device.h | 10 ++- 3 files changed, 80 insertions(+), 21 deletions(-) diff --git a/drivers/base/bus.c b/drivers/base/bus.c

[driver-core PATCH v6 5/9] driver core: Establish clear order of operations for deferred probe and remove

2018-11-08 Thread Alexander Duyck
of "bool X:1" to define the bitfield to a "u8 X:1" setup in order to resolve some checkpatch warnings. Signed-off-by: Alexander Duyck --- drivers/base/dd.c | 104 +++- include/linux/device.h |3 + 2 files changed, 62 inserti

[driver-core PATCH v6 7/9] driver core: Attach devices on CPU local to device node

2018-11-08 Thread Alexander Duyck
memory for NVDIMM this can have a significant impact as initializing 3TB on a remote node can take up to 39 seconds while initializing it on a local node only takes 23 seconds. It is situations like this where we will see the biggest improvement. Reviewed-by: Bart Van Assche Signed-off-by: Alexander Duyck

[driver-core PATCH v6 8/9] PM core: Use new async_schedule_dev command

2018-11-08 Thread Alexander Duyck
to the device. Reviewed-by: Bart Van Assche Reviewed-by: Rafael J. Wysocki Signed-off-by: Alexander Duyck --- drivers/base/power/main.c | 12 ++-- 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/drivers/base/power/main.c b/drivers/base/power/main.c index a690fd400260

[driver-core PATCH v6 1/9] workqueue: Provide queue_work_node to queue work near a given NUMA node

2018-11-08 Thread Alexander Duyck
to a node anyway instead of being directly used. As we are only using the first CPU to represent the NUMA node for now I am limiting the scope of the function so that it can only be used with unbound workqueues. Acked-by: Tejun Heo Reviewed-by: Bart Van Assche Signed-off-by: Alexander Duyck --

[driver-core PATCH v6 2/9] async: Add support for queueing on specific NUMA node

2018-11-08 Thread Alexander Duyck
and that device has a NUMA node other than NUMA_NO_NODE. The main motivation behind this is to address the need to be able to schedule device specific init work on specific NUMA nodes in order to improve performance of memory initialization. Signed-off-by: Alexander Duyck --- include/linux/async.h

[driver-core PATCH v6 3/9] device core: Consolidate locking and unlocking of parent and device

2018-11-08 Thread Alexander Duyck
specific to attaching and detaching the driver while acquiring these locks. By doing this I was able to reduce the number of spots where we touch need_parent_lock from 12 down to 4. Reviewed-by: Bart Van Assche Reviewed-by: Rafael J. Wysocki Signed-off-by: Alexander Duyck --- drivers/base

Re: [driver-core PATCH v5 5/9] driver core: Establish clear order of operations for deferred probe and remove

2018-11-06 Thread Alexander Duyck
; > > > url: > > https://github.com/0day-ci/linux/commits/Alexander-Duyck/Add-NUMA-aware-async_schedule-calls/20181106-093800 > > reproduce: make htmldocs > > > > All warnings (new ones prefixed by >>): > > > >include/net/mac80211.h:1

Re: [driver-core PATCH v5 1/9] workqueue: Provide queue_work_node to queue work near a given NUMA node

2018-11-06 Thread Alexander Duyck
On Mon, 2018-11-05 at 16:42 -0800, Bart Van Assche wrote: > On Mon, 2018-11-05 at 13:11 -0800, Alexander Duyck wrote: > > +/** > > + * workqueue_select_cpu_near - Select a CPU based on NUMA node > > + * @node: NUMA node ID that we wa

Re: [driver-core PATCH v5 0/9] Add NUMA aware async_schedule calls

2018-11-06 Thread Alexander Duyck
On Mon, 2018-11-05 at 16:50 -0800, Bart Van Assche wrote: > On Mon, 2018-11-05 at 13:11 -0800, Alexander Duyck wrote: > > This patch set provides functionality that will help to improve the > > locality of the async_schedule calls used to provide deferred > > initializati

Re: [driver-core PATCH v5 4/9] driver core: Move async_synchronize_full call

2018-11-06 Thread Alexander Duyck
On Mon, 2018-11-05 at 17:04 -0800, Bart Van Assche wrote: > On Mon, 2018-11-05 at 13:11 -0800, Alexander Duyck wrote: > > This patch moves the async_synchronize_full call out of > > __device_release_driver and into driver_detach. > > > > The idea behind this is that

[mm PATCH v5 7/7] mm: Use common iterator for deferred_init_pages and deferred_free_pages

2018-11-05 Thread Alexander Duyck
to 1.06s as a result of this patch. Signed-off-by: Alexander Duyck --- mm/page_alloc.c | 134 +++ 1 file changed, 65 insertions(+), 69 deletions(-) diff --git a/mm/page_alloc.c b/mm/page_alloc.c index 9eb993a9be99..521b94eb02a0 100644 --- a/mm
