Re: [PATCH 2/6] mm/swapfile.c: Replace some #ifdef with IS_ENABLED()

2018-07-13 Thread Daniel Jordan
On Fri, Jul 13, 2018 at 07:36:32AM +0800, Huang, Ying wrote: > @@ -1260,7 +1257,6 @@ static void swapcache_free(swp_entry_t entry) > } > } > > -#ifdef CONFIG_THP_SWAP > static void swapcache_free_cluster(swp_entry_t entry) > { > unsigned long offset = swp_offset(entry); > @@ -1271,

Re: [PATCH 2/6] mm/swapfile.c: Replace some #ifdef with IS_ENABLED()

2018-07-13 Thread Daniel Jordan
On Fri, Jul 13, 2018 at 11:38:19AM -0700, Daniel Jordan wrote: > On Fri, Jul 13, 2018 at 07:36:32AM +0800, Huang, Ying wrote: > > @@ -1260,7 +1257,6 @@ static void swapcache_free(swp_entry_t entry) > > } > > } > > > > -#ifdef CONFIG_THP_SWAP > >

Re: [PATCH 3/6] swap: Unify normal/huge code path in swap_page_trans_huge_swapped()

2018-07-13 Thread Daniel Jordan
On Fri, Jul 13, 2018 at 07:36:33AM +0800, Huang, Ying wrote: > diff --git a/mm/swapfile.c b/mm/swapfile.c > index 75c84aa763a3..160f78072667 100644 > --- a/mm/swapfile.c > +++ b/mm/swapfile.c > @@ -270,7 +270,10 @@ static inline void cluster_set_null(struct > swap_cluster_info *info) > > static

Re: [PATCH 6/6] swap, put_swap_page: Share more between huge/normal code path

2018-07-13 Thread Daniel Jordan
On Fri, Jul 13, 2018 at 07:36:36AM +0800, Huang, Ying wrote: > From: Huang Ying > > In this patch, locking related code is shared between huge/normal code > path in put_swap_page() to reduce code duplication. And `free_entries > == 0` case is merged into more general `free_entries != > SWAPFILE_

Re: [RFC PATCH v2 0/4] Eliminate zone->lock contention for will-it-scale/page_fault1 and parallel free

2018-03-21 Thread Daniel Jordan
On 03/20/2018 04:54 AM, Aaron Lu wrote: ...snip... reduced zone->lock contention on free path from 35% to 1.1%. Also, it shows good result on parallel free(*) workload by reducing zone->lock contention from 90% to almost zero(lru lock increased from almost 0 to 90% though). Hi Aaron, I'm lookin

Re: [RFC PATCH v2 0/4] Eliminate zone->lock contention for will-it-scale/page_fault1 and parallel free

2018-03-22 Thread Daniel Jordan
On 03/21/2018 09:30 PM, Aaron Lu wrote: On Wed, Mar 21, 2018 at 01:44:25PM -0400, Daniel Jordan wrote: On 03/20/2018 04:54 AM, Aaron Lu wrote: ...snip... reduced zone->lock contention on free path from 35% to 1.1%. Also, it shows good result on parallel free(*) workload by reducing zone->

Re: [RFC PATCH v2 2/4] mm/__free_one_page: skip merge for order-0 page unless compaction failed

2018-03-22 Thread Daniel Jordan
On 03/22/2018 01:15 PM, Matthew Wilcox wrote: On Tue, Mar 20, 2018 at 10:11:01PM +0800, Aaron Lu wrote: A new document file called "struct_page_filed" is added to explain the newly reused field in "struct page". Sounds rather ad-hoc for a single field, I'd rather document it via comments.

Re: [PATCH -V5 RESEND 03/21] swap: Support PMD swap mapping in swap_duplicate()

2018-09-26 Thread Daniel Jordan
On Wed, Sep 26, 2018 at 08:55:59PM +0800, Huang, Ying wrote: > Daniel Jordan writes: > > On Tue, Sep 25, 2018 at 03:13:30PM +0800, Huang Ying wrote: > >> /* > >> * Increase reference count of swap entry by 1. > >> - * Returns 0 for success, or -EN

Re: [PATCH -V5 RESEND 03/21] swap: Support PMD swap mapping in swap_duplicate()

2018-09-27 Thread Daniel Jordan
On Thu, Sep 27, 2018 at 09:34:36AM +0800, Huang, Ying wrote: > Daniel Jordan writes: > > On Wed, Sep 26, 2018 at 08:55:59PM +0800, Huang, Ying wrote: > >> Daniel Jordan writes: > >> > On Tue, Sep 25, 2018 at 03:13:30PM +0800, Huang Ying wrote: > >> >>

Re: [PATCH -V5 RESEND 03/21] swap: Support PMD swap mapping in swap_duplicate()

2018-09-28 Thread Daniel Jordan
On Fri, Sep 28, 2018 at 04:19:03PM +0800, Huang, Ying wrote: > Daniel Jordan writes: > > One way is to change > > copy_one_pte's return to int so we can just pass the error code back to > > copy_pte_range so it knows whether to try adding the continuation. > >

Re: [RFC PATCH v4 11/13] mm: parallelize deferred struct page initialization within each node

2018-11-27 Thread Daniel Jordan
On Tue, Nov 27, 2018 at 12:12:28AM +, Elliott, Robert (Persistent Memory) wrote: > I ran a short test with: > * HPE ProLiant DL360 Gen9 system > * Intel Xeon E5-2699 CPU with 18 physical cores (0-17) and > 18 hyperthreaded cores (36-53) > * DDR4 NVDIMM-Ns (which run at regular DRAM DIMM spe

Re: [RFC PATCH v4 05/13] workqueue, ktask: renice helper threads to prevent starvation

2018-11-20 Thread Daniel Jordan
On Tue, Nov 20, 2018 at 08:33:19AM -0800, Tejun Heo wrote: > On Mon, Nov 19, 2018 at 08:45:54AM -0800, Daniel Jordan wrote: > > So instead of flush_work_at_nice, how about this?: > > > > void renice_work_sync(work_struct *work, long nice); > > Wouldn't renice_or_

Re: [RFC PATCH v4 01/13] ktask: add documentation

2018-11-28 Thread Daniel Jordan
On Tue, Nov 27, 2018 at 08:50:08PM +0100, Pavel Machek wrote: > Hi! Hi, Pavel. > > + > > +ktask: parallelize CPU-intensive kernel work > > + > > + > > +:Date: November,

Re: [RFC PATCH v4 11/13] mm: parallelize deferred struct page initialization within each node

2018-11-19 Thread Daniel Jordan
On Mon, Nov 12, 2018 at 10:15:46PM +, Elliott, Robert (Persistent Memory) wrote: > > > > -Original Message- > > From: Daniel Jordan > > Sent: Monday, November 12, 2018 11:54 AM > > To: Elliott, Robert (Persistent Memory) > > Cc: Danie

Re: [RFC PATCH v4 11/13] mm: parallelize deferred struct page initialization within each node

2018-11-19 Thread Daniel Jordan
On Mon, Nov 12, 2018 at 08:54:12AM -0800, Daniel Jordan wrote: > On Sat, Nov 10, 2018 at 03:48:14AM +, Elliott, Robert (Persistent Memory) > wrote: > > > -Original Message- > > > From: linux-kernel-ow...@vger.kernel.org > > ow...@vger.kernel.org> On B

Re: [RFC PATCH v4 05/13] workqueue, ktask: renice helper threads to prevent starvation

2018-11-19 Thread Daniel Jordan
On Tue, Nov 13, 2018 at 08:34:00AM -0800, Tejun Heo wrote: > Hello, Daniel. Hi Tejun, sorry for the delay. Plumbers... > On Mon, Nov 05, 2018 at 11:55:50AM -0500, Daniel Jordan wrote: > > static bool start_flush_work(struct work_struct *work, struct wq_barrie

Re: [RFC PATCH v4 11/13] mm: parallelize deferred struct page initialization within each node

2018-11-12 Thread Daniel Jordan
On Sat, Nov 10, 2018 at 03:48:14AM +, Elliott, Robert (Persistent Memory) wrote: > > -Original Message- > > From: linux-kernel-ow...@vger.kernel.org > ow...@vger.kernel.org> On Behalf Of Daniel Jordan > > Sent: Monday, November 05, 2018 10:56 AM > > Su

Re: [PATCH -V7 RESEND 08/21] swap: Support to read a huge swap cluster for swapin a THP

2018-11-30 Thread Daniel Jordan
Hi Ying, On Tue, Nov 20, 2018 at 04:54:36PM +0800, Huang Ying wrote: > diff --git a/mm/swap_state.c b/mm/swap_state.c > index 97831166994a..1eedbc0aede2 100644 > --- a/mm/swap_state.c > +++ b/mm/swap_state.c > @@ -387,14 +389,42 @@ struct page *__read_swap_cache_async(swp_entry_t entry, > gfp_t g

Re: [RFC PATCH v4 00/13] ktask: multithread CPU-intensive kernel work

2018-11-30 Thread Daniel Jordan
On Fri, Nov 30, 2018 at 11:18:19AM -0800, Tejun Heo wrote: > Hello, > > On Mon, Nov 05, 2018 at 11:55:45AM -0500, Daniel Jordan wrote: > > Michal, you mentioned that ktask should be sensitive to CPU utilization[1]. > > ktask threads now run at the lowest priority o

Re: [PATCH -V7 RESEND 08/21] swap: Support to read a huge swap cluster for swapin a THP

2018-12-03 Thread Daniel Jordan
On Sat, Dec 01, 2018 at 08:34:06AM +0800, Huang, Ying wrote: > Daniel Jordan writes: > > What do you think? > > I think that swapoff() which is the main user of try_to_unuse() isn't a > common operation in practical. So it's not necessary to make it more > comp

Re: [PATCH -V6 00/21] swap: Swapout/swapin THP in one piece

2018-11-08 Thread Daniel Jordan
v6 crc_ccitt autofs4 --8<--- /* * stress-usage-counts.c * * gcc -o stress-usage-counts stress-usage-counts.c -pthread * * Daniel Jordan */ #define _GNU_SOURCE #include #include #include #include #include #include #include #incl

Re: [PATCH -mm -v4 14/21] mm, cgroup, THP, swap: Support to move swap account for PMD swap mapping

2018-07-09 Thread Daniel Jordan
On Fri, Jun 22, 2018 at 11:51:44AM +0800, Huang, Ying wrote: > Because there is no way to prevent a huge swap cluster from being > split except when it has SWAP_HAS_CACHE flag set. What about making get_mctgt_type_thp take the cluster lock? That function would be the first lock_cluster user outsi

Re: [PATCH -mm -v4 14/21] mm, cgroup, THP, swap: Support to move swap account for PMD swap mapping

2018-07-10 Thread Daniel Jordan
On Tue, Jul 10, 2018 at 03:49:58PM +0800, Huang, Ying wrote: > Daniel Jordan writes: > > > On Fri, Jun 22, 2018 at 11:51:44AM +0800, Huang, Ying wrote: > >> Because there is no way to prevent a huge swap cluster from being > >> split except when it has SWAP_HAS_CACH

Re: [PATCH -mm -v4 08/21] mm, THP, swap: Support to read a huge swap cluster for swapin a THP

2018-07-03 Thread Daniel Jordan
On Fri, Jun 22, 2018 at 11:51:38AM +0800, Huang, Ying wrote: > @@ -411,14 +414,32 @@ struct page *__read_swap_cache_async(swp_entry_t entry, > gfp_t gfp_mask, ... > + if (thp_swap_supported() && huge_cluster) { > + gfp_t gfp = alloc_hugepage_direct_g

Re: [PATCH -mm -v4 05/21] mm, THP, swap: Support PMD swap mapping in free_swap_and_cache()/swap_free()

2018-07-05 Thread Daniel Jordan
On Fri, Jun 22, 2018 at 11:51:35AM +0800, Huang, Ying wrote: > +static unsigned char swap_free_cluster(struct swap_info_struct *si, > +swp_entry_t entry) ... > + /* Cluster has been split, free each swap entries in cluster */ > + if (!cluster_is_huge(ci))

Re: [PATCH -mm -V3 03/21] mm, THP, swap: Support PMD swap mapping in swap_duplicate()

2018-06-12 Thread Daniel Jordan
On Tue, Jun 12, 2018 at 09:23:19AM +0800, Huang, Ying wrote: > Daniel Jordan writes: > > #2: We've masked off SWAP_HAS_CACHE and COUNT_CONTINUED, and already > > checked > > for SWAP_MAP_BAD, so I think condition #2 always fails and can just be > > removed. >

Re: [PATCH -mm -V3 03/21] mm, THP, swap: Support PMD swap mapping in swap_duplicate()

2018-06-12 Thread Daniel Jordan
On Tue, Jun 12, 2018 at 11:15:28AM +0800, Huang, Ying wrote: > "Huang, Ying" writes: > >> On Wed, May 23, 2018 at 04:26:07PM +0800, Huang, Ying wrote: > >>> @@ -3516,11 +3512,39 @@ static int __swap_duplicate(swp_entry_t entry, > >>> unsigned char usage) > >> > >> Two comments about this part of

[PATCH] mm, swap: fix swap_count comment about nonexistent SWAP_HAS_CONT

2018-06-12 Thread Daniel Jordan
570a335b8e22 ("swap_info: swap count continuations") introduces COUNT_CONTINUED but refers to it incorrectly as SWAP_HAS_CONT in a comment in swap_count. Fix it. Fixes: 570a335b8e22 ("swap_info: swap count continuations") Signed-off-by: Daniel Jordan Cc: "Huang, Ying"

Re: [PATCH -mm -V3 03/21] mm, THP, swap: Support PMD swap mapping in swap_duplicate()

2018-06-12 Thread Daniel Jordan
On Tue, Jun 12, 2018 at 09:23:19AM +0800, Huang, Ying wrote: > Daniel Jordan writes: > >> +#else > >> +static inline int __swap_duplicate_cluster(swp_entry_t *entry, > > > > This doesn't need inline. > > Why not? This is just a one line stub. F

Re: [PATCH -mm -V3 03/21] mm, THP, swap: Support PMD swap mapping in swap_duplicate()

2018-06-13 Thread Daniel Jordan
On Wed, Jun 13, 2018 at 09:26:54AM +0800, Huang, Ying wrote: > Daniel Jordan writes: > > > On Tue, Jun 12, 2018 at 09:23:19AM +0800, Huang, Ying wrote: > >> Daniel Jordan writes: > >> >> +#else > >> >> +static inline int __swap_duplicate_clu

Re: [PATCH -V9 07/21] swap: Support PMD swap mapping when splitting huge PMD

2018-12-18 Thread Daniel Jordan
+Aneesh On Fri, Dec 14, 2018 at 02:27:40PM +0800, Huang Ying wrote: > diff --git a/mm/huge_memory.c b/mm/huge_memory.c > index bd2543e10938..49df3e7c96c7 100644 > --- a/mm/huge_memory.c > +++ b/mm/huge_memory.c > +int split_swap_cluster_map(swp_entry_t entry) ... > + VM_BUG_ON(!IS_ALI

Re: [PATCH -V9 10/21] swap: Swapin a THP in one piece

2018-12-18 Thread Daniel Jordan
On Fri, Dec 14, 2018 at 02:27:43PM +0800, Huang Ying wrote: > diff --git a/mm/huge_memory.c b/mm/huge_memory.c > index 1cec1eec340e..644cb5d6b056 100644 > --- a/mm/huge_memory.c > +++ b/mm/huge_memory.c > @@ -33,6 +33,8 @@ > #include > #include > #include > +#include > +#include swap.h is

Re: [v4 PATCH 1/2] mm: swap: check if swap backing device is congested or not

2019-01-02 Thread Daniel Jordan
On Sun, Dec 30, 2018 at 12:49:34PM +0800, Yang Shi wrote: > The test on my virtual machine with congested HDD shows long tail > latency is reduced significantly. > > Without the patch > page_fault1_thr-1490 [023] 129.311706: funcgraph_entry: #57377.796 > us | do_swap_page(); > page_fau

Re: [v4 PATCH 1/2] mm: swap: check if swap backing device is congested or not

2019-01-03 Thread Daniel Jordan
On Thu, Jan 03, 2019 at 09:10:13AM -0800, Yang Shi wrote: > How about the below description: > > The test with page_fault1 of will-it-scale (sometimes tracing may just show > runtest.py that is the wrapper script of page_fault1), which basically > launches NR_CPU threads to generate 128MB anonymou

Re: [RFC PATCH v4 00/13] ktask: multithread CPU-intensive kernel work

2018-11-05 Thread Daniel Jordan
On Mon, Nov 05, 2018 at 06:29:31PM +0100, Michal Hocko wrote: > On Mon 05-11-18 11:55:45, Daniel Jordan wrote: > > Michal, you mentioned that ktask should be sensitive to CPU utilization[1]. > > ktask threads now run at the lowest priority on the system to avoid > > disturbin

Re: [RFC PATCH v4 00/13] ktask: multithread CPU-intensive kernel work

2018-11-05 Thread Daniel Jordan
Hi Zi, On Mon, Nov 05, 2018 at 01:49:14PM -0500, Zi Yan wrote: > On 5 Nov 2018, at 11:55, Daniel Jordan wrote: > > Do you think if it makes sense to use ktask for huge page migration (the data > copy part)? It certainly could. > I did some experiments back in 2016[1], wh

Re: [RFC PATCH v4 02/13] ktask: multithread CPU-intensive kernel work

2018-11-05 Thread Daniel Jordan
On Mon, Nov 05, 2018 at 12:51:33PM -0800, Randy Dunlap wrote: > On 11/5/18 8:55 AM, Daniel Jordan wrote: > > diff --git a/init/Kconfig b/init/Kconfig > > index 41583f468cb4..ed82f76ed0b7 100644 > > --- a/init/Kconfig > > +++ b/init/Kconfig > > @@ -346,6 +346,17 @@

Re: [RFC PATCH v4 01/13] ktask: add documentation

2018-11-05 Thread Daniel Jordan
On Mon, Nov 05, 2018 at 01:19:50PM -0800, Randy Dunlap wrote: > On 11/5/18 8:55 AM, Daniel Jordan wrote: > > Hi, > > > +Resource Limits > > +=== > > + > > +ktask has resource limits on the number of w

Re: [RFC PATCH v4 06/13] vfio: parallelize vfio_pin_map_dma

2018-11-05 Thread Daniel Jordan
On Mon, Nov 05, 2018 at 02:51:41PM -0700, Alex Williamson wrote: > On Mon, 5 Nov 2018 11:55:51 -0500 > Daniel Jordan wrote: > > +static int vfio_pin_map_dma_chunk(unsigned long start_vaddr, > > + unsigned long end_vaddr, > > +

Re: [RFC PATCH v4 01/13] ktask: add documentation

2018-11-06 Thread Daniel Jordan
On Tue, Nov 06, 2018 at 09:49:11AM +0100, Peter Zijlstra wrote: > On Mon, Nov 05, 2018 at 11:55:46AM -0500, Daniel Jordan wrote: > > +Concept > > +=== > > + > > +ktask is built on unbound workqueues to take advantage of the thread > > management >

Re: [RFC PATCH v4 00/13] ktask: multithread CPU-intensive kernel work

2018-11-06 Thread Daniel Jordan
On Mon, Nov 05, 2018 at 09:48:56PM -0500, Zi Yan wrote: > On 5 Nov 2018, at 21:20, Daniel Jordan wrote: > > > Hi Zi, > > > > On Mon, Nov 05, 2018 at 01:49:14PM -0500, Zi Yan wrote: > >> On 5 Nov 2018, at 11:55, Daniel Jordan wrote: > >> > >> D

Re: [RFC PATCH v4 00/13] ktask: multithread CPU-intensive kernel work

2018-11-07 Thread Daniel Jordan
On Tue, Nov 06, 2018 at 10:21:45AM +0100, Michal Hocko wrote: > On Mon 05-11-18 17:29:55, Daniel Jordan wrote: > > On Mon, Nov 05, 2018 at 06:29:31PM +0100, Michal Hocko wrote: > > > On Mon 05-11-18 11:55:45, Daniel Jordan wrote: > > > > Michal, you mentioned that kt

Re: [RFC PATCH v4 01/13] ktask: add documentation

2018-11-07 Thread Daniel Jordan
On Wed, Nov 07, 2018 at 11:27:52AM +0100, Peter Zijlstra wrote: > On Tue, Nov 06, 2018 at 08:51:54PM +, Jason Gunthorpe wrote: > > On Tue, Nov 06, 2018 at 12:34:11PM -0800, Daniel Jordan wrote: > > > > > > What isn't clear is if this calling thread is wait

Re: [RFC PATCH v4 01/13] ktask: add documentation

2018-11-07 Thread Daniel Jordan
On Wed, Nov 07, 2018 at 11:35:54AM +0100, Peter Zijlstra wrote: > On Tue, Nov 06, 2018 at 12:34:11PM -0800, Daniel Jordan wrote: > > On Tue, Nov 06, 2018 at 09:49:11AM +0100, Peter Zijlstra wrote: > > > On Mon, Nov 05, 2018 at 11:55:46AM -0500, Daniel Jordan wrote

Re: [RFC PATCH v4 01/13] ktask: add documentation

2018-11-08 Thread Daniel Jordan
On Thu, Nov 08, 2018 at 10:26:38AM -0700, Jonathan Corbet wrote: > On Mon, 5 Nov 2018 11:55:46 -0500 > Daniel Jordan wrote: > > > Motivates and explains the ktask API for kernel clients. > > A couple of quick thoughts: > > - Agree with Peter on the use of "task

[RFC PATCH v4 01/13] ktask: add documentation

2018-11-05 Thread Daniel Jordan
Motivates and explains the ktask API for kernel clients. Signed-off-by: Daniel Jordan --- Documentation/core-api/index.rst | 1 + Documentation/core-api/ktask.rst | 213 +++ 2 files changed, 214 insertions(+) create mode 100644 Documentation/core-api/ktask.rst

[RFC PATCH v4 03/13] ktask: add undo support

2018-11-05 Thread Daniel Jordan
allback to fail. For simplicity and because it's a slow path, undoing is not multithreaded. Signed-off-by: Daniel Jordan --- include/linux/ktask.h | 36 +++- kernel/ktask.c | 125 +++--- 2 files changed, 138 insertions(+), 23 deletions(-

[RFC PATCH v4 06/13] vfio: parallelize vfio_pin_map_dma

2018-11-05 Thread Daniel Jordan
later in the series. Signed-off-by: Daniel Jordan --- drivers/vfio/vfio_iommu_type1.c | 106 +++- 1 file changed, 76 insertions(+), 30 deletions(-) diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c index d9fd3188615d..e7cfbf0c8071 100644

[RFC PATCH v4 12/13] mm: parallelize clear_gigantic_page

2018-11-05 Thread Daniel Jordan
emory from multiple nodes, so that speedups increase as the size increases. Signed-off-by: Daniel Jordan --- mm/memory.c | 32 1 file changed, 24 insertions(+), 8 deletions(-) diff --git a/mm/memory.c b/mm/memory.c index 15c417e8e31d..445d06537905 100644 --- a/m

[RFC PATCH v4 10/13] mm: enlarge type of offset argument in mem_map_offset and mem_map_next

2018-11-05 Thread Daniel Jordan
tion of 'offset' from unsigned long to int by the sole caller of mem_map_offset, follow_hugetlb_page. Signed-off-by: Daniel Jordan --- mm/internal.h | 7 --- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/mm/internal.h b/mm/internal.h index 3b1ec1412fd2..cc90de4d4e01 1

[RFC PATCH v4 00/13] ktask: multithread CPU-intensive kernel work

2018-11-05 Thread Daniel Jordan
cz [2] https://lkml.kernel.org/r/1458339291-4093-1-git-send-email-...@redhat.com [3] https://lkml.kernel.org/r/20180928153922.ga17...@ziepe.ca [4] https://lkml.kernel.org/r/1489568404-7817-1-git-send-email-aaron...@intel.com [5] https://www.redhat.com/archives/vfio-users/2018-April/msg00020.html Da

[RFC PATCH v4 02/13] ktask: multithread CPU-intensive kernel work

2018-11-05 Thread Daniel Jordan
level, starting these threads, and load balancing the work between them. The Documentation patch earlier in this series, from which the above was swiped, has more background. Inspired by work from Pavel Tatashin, Steve Sistare, and Jonathan Adams. Signed-off-by: Daniel Jordan Suggested-by: Pavel

[RFC PATCH v4 13/13] hugetlbfs: parallelize hugetlbfs_fallocate with ktask

2018-11-05 Thread Daniel Jordan
that 31% of the L1d misses were on hugetlb_fault_mutex_table[hash] in the 16-thread case. Signed-off-by: Daniel Jordan --- fs/hugetlbfs/inode.c | 114 +++ 1 file changed, 93 insertions(+), 21 deletions(-) diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs

[RFC PATCH v4 08/13] vfio: remove unnecessary mmap_sem writer acquisition around locked_vm

2018-11-05 Thread Daniel Jordan
Now that mmap_sem is no longer required for modifying locked_vm, remove it in the VFIO code. [XXX Can be sent separately, along with similar conversions in the other places mmap_sem was taken for locked_vm. While at it, could make similar changes to pinned_vm.] Signed-off-by: Daniel Jordan

[RFC PATCH v4 09/13] vfio: relieve mmap_sem reader cacheline bouncing by holding it longer

2018-11-05 Thread Daniel Jordan
Profiling shows significant time being spent on atomic ops in mmap_sem reader acquisition. mmap_sem is taken and dropped for every single base page during pinning, so this is not surprising. Reduce the number of times mmap_sem is taken by holding for longer, which relieves atomic cacheline bounci

[RFC PATCH v4 11/13] mm: parallelize deferred struct page initialization within each node

2018-11-05 Thread Daniel Jordan
ime per stdev node (ms) baseline (4.15-rc2) 1261 1.9 ktask 3.88x 325 5.0 Signed-off-by: Daniel Jordan Suggested-by: Pavel Tatashin --- mm/page_alloc.c | 91 ++--- 1 file changed, 78 in

[RFC PATCH v4 07/13] mm: change locked_vm's type from unsigned long to atomic_long_t

2018-11-05 Thread Daniel Jordan
run more often. Performance results appear later in the series. On powerpc, this was cross-compiled-tested only. [XXX Can send separately.] Signed-off-by: Daniel Jordan --- arch/powerpc/kvm/book3s_64_vio.c | 15 --- arch/powerpc/mm/mmu_context_iommu.c | 16 d

[RFC PATCH v4 04/13] ktask: run helper threads at MAX_NICE

2018-11-05 Thread Daniel Jordan
ge clearing, so I imagine the results would be similar for them. [1] lkml.kernel.org/r/20171206143509.gg7...@dhcp22.suse.cz Signed-off-by: Daniel Jordan --- kernel/ktask.c | 12 1 file changed, 12 insertions(+) diff --git a/kernel/ktask.c b/kernel/ktask.c index b91c62f14dcd..7229

[RFC PATCH v4 05/13] workqueue, ktask: renice helper threads to prevent starvation

2018-11-05 Thread Daniel Jordan
( 2.03) 16.24 ( 0.90) usemem 47.54 ( 0.89) 48.18 ( 0.77) 49.70 ( 1.20) These results are similar to case 1's, though the differences between times are not quite as pronounced because ktask_vfio ran shorter compared to usemem. Signed-off-by: Daniel

Re: [PATCH -V6 00/21] swap: Swapout/swapin THP in one piece

2018-10-23 Thread Daniel Jordan
dma_direct_supported", which was required to get them to build. ---8<--- /* * swap-verify.c - helper to verify contents of swapped out pages * * Daniel Jordan */ #define _GNU_SOURCE #include #include #include

Re: [PATCH -V6 00/21] swap: Swapout/swapin THP in one piece

2018-10-24 Thread Daniel Jordan
On Wed, Oct 24, 2018 at 11:31:42AM +0800, Huang, Ying wrote: > Hi, Daniel, > > Daniel Jordan writes: > > > On Wed, Oct 10, 2018 at 03:19:03PM +0800, Huang Ying wrote: > >> And for all, Any comment is welcome! > >> > >> This patchset is based on the

Re: [PATCH -V6 06/21] swap: Support PMD swap mapping when splitting huge PMD

2018-10-24 Thread Daniel Jordan
On Wed, Oct 10, 2018 at 03:19:09PM +0800, Huang Ying wrote: > +#ifdef CONFIG_THP_SWAP > +/* > + * The corresponding page table shouldn't be changed under us, that > + * is, the page table lock should be held. > + */ > +int split_swap_cluster_map(swp_entry_t entry) > +{ > + struct swap_info_stru

Re: [PATCH -V6 14/21] swap: Support to move swap account for PMD swap mapping

2018-10-24 Thread Daniel Jordan
On Wed, Oct 10, 2018 at 03:19:17PM +0800, Huang Ying wrote: > +static struct page *mc_handle_swap_pmd(struct vm_area_struct *vma, > + pmd_t pmd, swp_entry_t *entry) > +{ Got /home/dbbench/linux/mm/memcontrol.c:4719:21: warning: ‘mc_handle_swap_pmd’ defined but not used [-Wunus

Re: [PATCH -V6 06/21] swap: Support PMD swap mapping when splitting huge PMD

2018-10-25 Thread Daniel Jordan
On Thu, Oct 25, 2018 at 08:54:16AM +0800, Huang, Ying wrote: > Daniel Jordan writes: > > > On Wed, Oct 10, 2018 at 03:19:09PM +0800, Huang Ying wrote: > >> +#ifdef CONFIG_THP_SWAP > >> +/* > >> + * The corresponding page table shouldn't be changed und

Re: [RFC v4 PATCH 2/5] mm/__free_one_page: skip merge for order-0 page unless compaction failed

2018-10-19 Thread Daniel Jordan
On Fri, Oct 19, 2018 at 09:54:35AM +0100, Mel Gorman wrote: > On Fri, Oct 19, 2018 at 01:57:03PM +0800, Aaron Lu wrote: > > > > > > I don't think this is the right way of thinking about it because it's > > > possible to have the system split in such a way so that the migration > > > scanner only e

Re: [RFC PATCH v2 0/8] lru_lock scalability and SMP list functions

2018-10-19 Thread Daniel Jordan
On Fri, Oct 19, 2018 at 01:35:11PM +0200, Vlastimil Babka wrote: > On 9/11/18 2:42 AM, Daniel Jordan wrote: > > On large systems, lru_lock can become heavily contended in memory-intensive > > workloads such as decision support, applications that manage their memory > > manua

Re: [PATCH v2 0/7] swap: THP optimizing refactoring

2018-07-17 Thread Daniel Jordan
On Tue, Jul 17, 2018 at 08:55:49AM +0800, Huang, Ying wrote: > This patchset is based on 2018-07-13 head of mmotm tree. Looks good. Still think patch 7 would be easier to review if split into two logical changes. Either way, though. For the series, Reviewed-by: Daniel Jordan

Control dependency between prior load in while condition and later store?

2018-04-04 Thread Daniel Jordan
A question for memory-barriers.txt aficionados. Is there a control dependency between the prior load of 'a' and the later store of 'c'?: while (READ_ONCE(a)); WRITE_ONCE(c, 1); I have my doubts because memory-barriers.txt doesn't talk much about loops and because of what that document sa
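
For readers who want the shape of the question in compilable form, here is a minimal user-space sketch. The `READ_ONCE`/`WRITE_ONCE` macros below are simplified stand-ins (plain volatile accesses on `int`), not the kernel's definitions, and `wait_then_store()` and `demo_control_dep()` are names made up for illustration. The sketch only restates the code pattern being asked about; it makes no claim about whether the ordering is guaranteed.

```c
/* Simplified user-space stand-ins for the kernel macros (assumption:
 * plain volatile accesses, int-sized variables only). */
#define READ_ONCE(x)      (*(volatile int *)&(x))
#define WRITE_ONCE(x, v)  (*(volatile int *)&(x) = (v))

static int a;   /* flag that another CPU would clear in the real scenario */
static int c;

/* The pattern from the question: spin on a load, then do a store. */
static void wait_then_store(void)
{
	while (READ_ONCE(a))
		;                 /* prior load, in the loop condition */
	WRITE_ONCE(c, 1);         /* later store */
}

/* Single-threaded driver (hypothetical): 'a' is already 0, so the
 * loop exits immediately and the store runs. */
static int demo_control_dep(void)
{
	a = 0;
	c = 0;
	wait_then_store();
	return c;
}
```

In a real test the writer of `a` would run on another CPU; the single-threaded driver is only there so the fragment can be compiled and exercised.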

Re: Control dependency between prior load in while condition and later store?

2018-04-04 Thread Daniel Jordan
On 04/04/2018 04:35 PM, Alan Stern wrote: On Wed, 4 Apr 2018, Daniel Jordan wrote: A question for memory-barriers.txt aficionados. Is there a control dependency between the prior load of 'a' and the later store of 'c'?: while (READ_ONCE(a)); WRITE_ONCE(c, 1);

Re: [RFC PATCH v2 3/4] mm/rmqueue_bulk: alloc without touching individual page structure

2018-03-29 Thread Daniel Jordan
On 03/21/2018 11:01 AM, Aaron Lu wrote: I'm sorry, but I feel the added complexity here is simply too large to justify the change. Especially if the motivation seems to be just the microbenchmark. It would be better if this was motivated by a real workload where zone lock contention was identifie

Re: [RFC PATCH v2 0/4] Eliminate zone->lock contention for will-it-scale/page_fault1 and parallel free

2018-03-29 Thread Daniel Jordan
On 03/20/2018 04:54 AM, Aaron Lu wrote: This series is meant to improve zone->lock scalability for order 0 pages. With will-it-scale/page_fault1 workload, on a 2 sockets Intel Skylake server with 112 CPUs, CPU spend 80% of its time spinning on zone->lock. Perf profile shows the most time consumin

Re: [PATCH -mm -V3 00/21] mm, THP, swap: Swapout/swapin THP in one piece

2018-06-04 Thread Daniel Jordan
On Wed, May 23, 2018 at 04:26:04PM +0800, Huang, Ying wrote: > And for all, Any comment is welcome! > > This patchset is based on the 2018-05-18 head of mmotm/master. Trying to review this and it doesn't apply to mmotm-2018-05-18-16-44. git fails on patch 10: Applying: mm, THP, swap: Support to

Re: [PATCH -mm -V3 00/21] mm, THP, swap: Swapout/swapin THP in one piece

2018-06-05 Thread Daniel Jordan
On Tue, Jun 05, 2018 at 12:30:13PM +0800, Huang, Ying wrote: > Daniel Jordan writes: > > > On Wed, May 23, 2018 at 04:26:04PM +0800, Huang, Ying wrote: > >> And for all, Any comment is welcome! > >> > >> This patchset is based on the 2018-05-18 head of

[PATCH 3/6] vfio/spapr_tce: drop mmap_sem now that locked_vm is atomic

2019-04-02 Thread Daniel Jordan
With locked_vm now an atomic, there is no need to take mmap_sem as writer. Delete and refactor accordingly. Signed-off-by: Daniel Jordan Cc: Alexey Kardashevskiy Cc: Alex Williamson Cc: Andrew Morton Cc: Christoph Lameter Cc: Davidlohr Bueso Cc: Cc: Cc: --- drivers/vfio
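
As a rough illustration of why an atomic counter can be updated without an exclusive lock, here is a user-space sketch using C11 atomics. Everything in it is hypothetical: `struct mm_sketch`, `locked_vm_charge()`, and `locked_vm_uncharge()` are invented names, and the compare-and-swap loop is one possible way to enforce a limit, not necessarily what the patches do (the excerpt does not show that).

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the mm's locked_vm counter. */
struct mm_sketch {
	atomic_long locked_vm;   /* pages currently accounted as locked */
};

/* Try to charge npages against limit; returns true on success. */
static bool locked_vm_charge(struct mm_sketch *mm, long npages, long limit)
{
	long old = atomic_load(&mm->locked_vm);

	do {
		if (old + npages > limit)
			return false;   /* would exceed the limit: charge nothing */
		/* If the exchange fails, 'old' is refreshed and we retry. */
	} while (!atomic_compare_exchange_weak(&mm->locked_vm, &old,
					       old + npages));
	return true;
}

/* Give back npages; no limit check is needed on this side. */
static void locked_vm_uncharge(struct mm_sketch *mm, long npages)
{
	atomic_fetch_sub(&mm->locked_vm, npages);
}

/* Small driver: returns the final count, or -1 if a step misbehaved. */
static long locked_vm_demo(void)
{
	struct mm_sketch mm;
	bool ok1, ok2, ok3;

	atomic_init(&mm.locked_vm, 0);
	ok1 = locked_vm_charge(&mm, 6, 10);   /* fits: count becomes 6 */
	ok2 = locked_vm_charge(&mm, 6, 10);   /* 12 > 10: rejected */
	locked_vm_uncharge(&mm, 2);           /* count becomes 4 */
	ok3 = locked_vm_charge(&mm, 6, 10);   /* fits: count becomes 10 */

	return (ok1 && !ok2 && ok3) ? atomic_load(&mm.locked_vm) : -1;
}
```

The point of the sketch is that the check and the update happen in one atomic step, so concurrent callers cannot jointly overshoot the limit even though no lock is held.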

[PATCH 4/6] fpga/dlf/afu: drop mmap_sem now that locked_vm is atomic

2019-04-02 Thread Daniel Jordan
With locked_vm now an atomic, there is no need to take mmap_sem as writer. Delete and refactor accordingly. Signed-off-by: Daniel Jordan Cc: Alan Tull Cc: Andrew Morton Cc: Christoph Lameter Cc: Davidlohr Bueso Cc: Moritz Fischer Cc: Wu Hao Cc: Cc: Cc: --- drivers/fpga/dfl-afu-dma

[PATCH 2/6] vfio/type1: drop mmap_sem now that locked_vm is atomic

2019-04-02 Thread Daniel Jordan
With locked_vm now an atomic, there is no need to take mmap_sem as writer. Delete and refactor accordingly. Signed-off-by: Daniel Jordan Cc: Alex Williamson Cc: Andrew Morton Cc: Christoph Lameter Cc: Davidlohr Bueso Cc: Cc: Cc: --- drivers/vfio/vfio_iommu_type1.c | 27

Re: [PATCH -mm -V8] mm, swap: fix race between swapoff and some swap operations

2019-02-26 Thread Daniel Jordan
On Tue, Feb 26, 2019 at 02:49:05PM +0800, Huang, Ying wrote: > Do you have time to take a look at this patch? Hi Ying, is this handling all places where swapoff might cause a task to read invalid data? For example, why don't other reads of swap_map (for example swp_swapcount, page_swapcount, swap

Re: [PATCH v2] cpu/hotplug: wait for cpuset_hotplug_work to finish on cpu onlining

2021-02-12 Thread Daniel Jordan
Alexey Klimov writes: > int cpu_device_up(struct device *dev) Yeah, definitely better to do the wait here. > int cpuhp_smt_disable(enum cpuhp_smt_control ctrlval) > { > - int cpu, ret = 0; > + struct device *dev; > + cpumask_var_t mask; > + int cpu, ret; > + > + if (!zalloc

Re: [PATCH 3/3] vfio/type1: batch page pinning

2021-02-18 Thread Daniel Jordan
Alex Williamson writes: > This might not be the first batch we fill, I think this needs to unwind > rather than direct return. So it does, well spotted. And it's the same thing with the ENODEV case farther up. > Series looks good otherwise. Thanks for going through it!

[PATCH 0/3] vfio/type1: batch page pinning

2021-02-03 Thread Daniel Jordan
u_replay() -> vfio_pin_pages_remote() Each was run... - with varying sizes - with/without disable_hugepages=1 - with/without LOCKED_VM exceeded I didn't test vfio_pin_page_external() because there was no readily available hardware, but the changes there are pretty minimal. Series based on v5.

[PATCH 1/3] vfio/type1: change success value of vaddr_get_pfn()

2021-02-03 Thread Daniel Jordan
. Signed-off-by: Daniel Jordan --- drivers/vfio/vfio_iommu_type1.c | 21 ++--- 1 file changed, 14 insertions(+), 7 deletions(-) diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c index 0b4dedaa9128..4d608bc552a4 100644 --- a/drivers/vfio/vfio_iommu_type1.c

[PATCH 3/3] vfio/type1: batch page pinning

2021-02-03 Thread Daniel Jordan
0.48% -0.48% __get_user_pages_remote +0.39% slot_rmap_walk_next +0.32% vfio_pin_map_dma +0.26% kvm_handle_hva_range ... Suggested-by: Matthew Wilcox (Oracle) Signed-off-by: Daniel

[PATCH 2/3] vfio/type1: prepare for batched pinning with struct vfio_batch

2021-02-03 Thread Daniel Jordan
't allocate memory. vaddr_get_pfn() becomes vaddr_get_pfns() to prepare for handling multiple pages, though for now only one page is stored in the pages array. Signed-off-by: Daniel Jordan --- drivers/vfio/vfio_iommu_type1.c | 71 +++-- 1 file changed, 58 insertions(+

Re: [PATCH] cpu/hotplug: wait for cpuset_hotplug_work to finish on cpu onlining

2021-02-04 Thread Daniel Jordan
Peter Zijlstra writes: > On Thu, Feb 04, 2021 at 12:50:34PM +, Alexey Klimov wrote: >> On Thu, Feb 4, 2021 at 9:46 AM Peter Zijlstra wrote: >> > >> > On Thu, Feb 04, 2021 at 01:01:57AM +, Alexey Klimov wrote: >> > > @@ -1281,6 +1282,11 @@ static int cpu_up(unsigned int cpu, enum >> > >

Re: [PATCH] cpu/hotplug: wait for cpuset_hotplug_work to finish on cpu onlining

2021-02-04 Thread Daniel Jordan
Alexey Klimov writes: > When a CPU offlined and onlined via device_offline() and device_online() > the userspace gets uevent notification. If, after receiving "online" uevent, > userspace executes sched_setaffinity() on some task trying to move it > to a recently onlined CPU, then it often fails

Re: [PATCH] mm/vmstat: Add events for PMD based THP migration without split

2020-06-01 Thread Daniel Jordan
Hi Anshuman, On Fri, May 22, 2020 at 09:04:04AM +0530, Anshuman Khandual wrote: > This adds the following two new VM events which will help in validating PMD > based THP migration without split. Statistics reported through these events > will help in performance debugging. > > 1. THP_PMD_MIGRATIO

Re: [PATCH] swap: Add percpu cluster_next to reduce lock contention on swap cache

2020-05-18 Thread Daniel Jordan
On Mon, May 18, 2020 at 02:37:15PM +0800, Huang, Ying wrote: > Daniel Jordan writes: > > On Thu, May 14, 2020 at 03:04:24PM +0800, Huang Ying wrote: > >> And the pmbench score increases 15.9%. > > > > What metric is that, and how long did you run the benchmark for

Re: [PATCH 2/2] mm, util: account_locked_vm() does not hold mmap_lock

2020-07-30 Thread Daniel Jordan
On Wed, Jul 29, 2020 at 12:21:11PM -0700, Hugh Dickins wrote: > On Sun, 26 Jul 2020, Pengfei Li wrote: > > > Since mm->locked_vm is already an atomic counter, account_locked_vm() > > does not need to hold mmap_lock. > > I am worried that this patch, already added to mmotm, along with its > 1/2 ma

[PATCH v2 7/7] padata: document multithreaded jobs

2020-05-20 Thread Daniel Jordan
Add Documentation for multithreaded jobs. Signed-off-by: Daniel Jordan --- Documentation/core-api/padata.rst | 41 +++ 1 file changed, 31 insertions(+), 10 deletions(-) diff --git a/Documentation/core-api/padata.rst b/Documentation/core-api/padata.rst index

[PATCH v2 3/7] padata: allocate work structures for parallel jobs from a pool

2020-05-20 Thread Daniel Jordan
er makes sense, so remove it. If the global pool is exhausted, a parallel job is run in the current task instead to throttle a system trying to do too much in parallel. Signed-off-by: Daniel Jordan --- include/linux/padata.h | 8 +-- kernel/padata.c| 118 +++--
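The fallback described in this snippet (run the job in the current task when the global pool is exhausted) can be sketched in plain userspace C. This is only an illustration under stated assumptions, not the padata code: `POOL_SIZE`, `work_pool`, `work_alloc()`, `submit_job()`, `run_queued()` and `bump()` are all hypothetical names, and the "helper" is simulated by a later call to `run_queued()` rather than by real worker threads.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_SIZE 2	/* hypothetical and deliberately tiny */

/* One preallocated work structure; the names here are illustrative only. */
struct work_item {
	void (*fn)(void *arg);
	void *arg;
	int in_use;
};

static struct work_item work_pool[POOL_SIZE];

/* Take a free item from the fixed pool, or NULL if the pool is exhausted. */
static struct work_item *work_alloc(void)
{
	for (size_t i = 0; i < POOL_SIZE; i++) {
		if (!work_pool[i].in_use) {
			work_pool[i].in_use = 1;
			return &work_pool[i];
		}
	}
	return NULL;
}

/*
 * Queue fn(arg) for a helper if a work item is available. Otherwise run
 * it right away in the caller, which throttles a caller that asks for
 * more parallelism than the pool allows.
 * Returns 1 if the job was queued, 0 if it ran inline.
 */
static int submit_job(void (*fn)(void *arg), void *arg)
{
	struct work_item *w;

	assert(fn != NULL);

	w = work_alloc();
	if (!w) {
		fn(arg);	/* pool exhausted: run in the current task */
		return 0;
	}
	w->fn = fn;
	w->arg = arg;
	return 1;
}

/* Stand-in for the helpers: run every queued item and return it to the pool. */
static void run_queued(void)
{
	for (size_t i = 0; i < POOL_SIZE; i++) {
		if (work_pool[i].in_use) {
			work_pool[i].fn(work_pool[i].arg);
			work_pool[i].in_use = 0;
		}
	}
}

/* Example job: increment the int that arg points to. */
static void bump(void *arg)
{
	++*(int *)arg;
}
```

With a pool of two items, the first two `submit_job(bump, &n)` calls are queued and the third runs inline, so the submitter itself absorbs the excess work instead of allocating more.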

[PATCH v2 2/7] padata: initialize earlier

2020-05-20 Thread Daniel Jordan
padata will soon initialize the system's struct pages in parallel, so it needs to be ready by page_alloc_init_late(). The error return from padata_driver_init() triggers an initcall warning, so add a warning to padata_init() to avoid silent failure. Signed-off-by: Daniel Jordan --- in

[PATCH v2 4/7] padata: add basic support for multithreaded jobs

2020-05-20 Thread Daniel Jordan
tashin and Steve Sistare. Signed-off-by: Daniel Jordan --- include/linux/padata.h | 29 kernel/padata.c| 152 - 2 files changed, 178 insertions(+), 3 deletions(-) diff --git a/include/linux/padata.h b/include/linux/padata.h index 3bf

[PATCH v2 6/7] mm: make deferred init's max threads arch-specific

2020-05-20 Thread Daniel Jordan
Using padata during deferred init has only been tested on x86, so for now limit it to this architecture. If another arch wants this, it can find the max thread limit that's best for it and override deferred_page_init_max_threads(). Signed-off-by: Daniel Jordan --- arch/x86/mm/init_64.c

[PATCH v2 5/7] mm: parallelize deferred_init_memmap()

2020-05-20 Thread Daniel Jordan
6/VMM-fast-restart_kvmforum2019.pdf Signed-off-by: Daniel Jordan --- mm/Kconfig | 6 ++--- mm/page_alloc.c | 60 - 2 files changed, 58 insertions(+), 8 deletions(-) diff --git a/mm/Kconfig b/mm/Kconfig index c1acc34c1c358..04c1da3f9f44c 10064

[PATCH v2 1/7] padata: remove exit routine

2020-05-20 Thread Daniel Jordan
padata_driver_exit() is unnecessary because padata isn't built as a module and doesn't exit. padata's init routine will soon allocate memory, so getting rid of the exit function now avoids pointless code to free it. Signed-off-by: Daniel Jordan --- kernel/padata.c | 6 -- 1

[PATCH v2 0/7] padata: parallelize deferred page init

2020-05-20 Thread Daniel Jordan
ux-mm/1588812129-8596-1-git-send-email-anthony.yzn...@oracle.com/ [2] https://lore.kernel.org/linux-mm/20181105165558.11698-1-daniel.m.jor...@oracle.com/ Daniel Jordan (7): padata: remove exit routine padata: initialize earlier padata: allocate work structures for parallel jobs from a pool padat

Re: [PATCH -V2] swap: Reduce lock contention on swap cache from swap slots allocation

2020-05-21 Thread Daniel Jordan
On Wed, May 20, 2020 at 11:15:02AM +0800, Huang Ying wrote: > @@ -2827,6 +2865,11 @@ static struct swap_info_struct *alloc_swap_info(void) > p = kvzalloc(struct_size(p, avail_lists, nr_node_ids), GFP_KERNEL); > if (!p) > return ERR_PTR(-ENOMEM); > + p->cluster_next_cpu

Re: [PATCH v2 5/7] mm: parallelize deferred_init_memmap()

2020-05-21 Thread Daniel Jordan
On Wed, May 20, 2020 at 06:29:32PM -0700, Alexander Duyck wrote: > On Wed, May 20, 2020 at 11:27 AM Daniel Jordan > > @@ -1814,16 +1815,44 @@ deferred_init_maxorder(u64 *i, struct zone *zone, > > unsigned long *start_pfn, > > return nr_pages; > > }

Re: [PATCH v2 5/7] mm: parallelize deferred_init_memmap()

2020-05-21 Thread Daniel Jordan
On Thu, May 21, 2020 at 08:00:31AM -0700, Alexander Duyck wrote: > So I was thinking about my suggestion further and the loop at the end > isn't quite correct as I believe it could lead to gaps. The loop on > the end should probably be: > for_each_free_mem_pfn_range_in_zone_from(i,

Re: [PATCH v2 5/7] mm: parallelize deferred_init_memmap()

2020-05-21 Thread Daniel Jordan
On Thu, May 21, 2020 at 09:46:35AM -0700, Alexander Duyck wrote: > It is more about not bothering with the extra tracking. We don't > really need it and having it doesn't really add much in the way of > value. Yeah, it can probably go. > > > > @@ -1863,11 +1892,32 @@ static int __init deferred_in
