Re: [PATCH v2] mm, sysctl: make VM stats configurable

2017-09-24 Thread Huang, Ying
reference the '_numa_mem_' per cpu variable directly. > @@ -2743,6 +2746,17 @@ static inline void zone_statistics(struct zone > *preferred_zone, struct zone *z) > #ifdef CONFIG_NUMA > enum numa_stat_item local_stat = NUMA_LOCAL; > > + /* + *

[PATCH] mm, swap: Make VMA based swap readahead configurable

2017-09-20 Thread Huang, Ying
From: Huang Ying This patch adds a new Kconfig option VMA_SWAP_READAHEAD and wraps the VMA based swap readahead code inside #ifdef CONFIG_VMA_SWAP_READAHEAD/#endif. This is more friendly for tiny kernels. And as pointed out by Minchan Kim, give people who want to disable the swap readahead an

Re: [PATCH -mm -v4 3/5] mm, swap: VMA based swap readahead

2017-09-14 Thread Huang, Ying
Minchan Kim writes: > On Fri, Sep 15, 2017 at 11:15:08AM +0800, Huang, Ying wrote: >> Minchan Kim writes: >> >> > On Thu, Sep 14, 2017 at 08:01:30PM +0800, Huang, Ying wrote: >> >> Minchan Kim writes: >> >> >> >> > On Wed, Sep

Re: [PATCH -mm -v4 3/5] mm, swap: VMA based swap readahead

2017-09-14 Thread Huang, Ying
Minchan Kim writes: > On Thu, Sep 14, 2017 at 08:01:30PM +0800, Huang, Ying wrote: >> Minchan Kim writes: >> >> > On Wed, Sep 13, 2017 at 02:02:29PM -0700, Andrew Morton wrote: >> >> On Wed, 13 Sep 2017 10:40:19 +0900 Minchan Kim wrote: >> >>

Re: [PATCH -mm -v4 3/5] mm, swap: VMA based swap readahead

2017-09-14 Thread Huang, Ying
After all, yes, it would > be a minimum we should do. But it still breaks users don't/can't read/modify > alert and program. > > How about this? > > Can't we make vma-based readahead config option? > With that, users who no interest on readahead don't enable

Re: [PATCH -mm -v4 3/5] mm, swap: VMA based swap readahead

2017-09-13 Thread Huang, Ying
disabling people to the issue? This sounds good to me. Hi, Minchan, what do you think about this? I think for a low-end Android device, the end-user may have no opportunity to upgrade to the latest kernel, so the device vendor should care about this. For desktop users, the warning proposed by Andrew may help remind them of the new knob. Best Regards, Huang, Ying

Re: [PATCH 4/5] mm:swap: respect page_cluster for readahead

2017-09-12 Thread Huang, Ying
Minchan Kim writes: > On Tue, Sep 12, 2017 at 04:32:43PM +0800, Huang, Ying wrote: >> Minchan Kim writes: >> >> > On Tue, Sep 12, 2017 at 04:07:01PM +0800, Huang, Ying wrote: >> > < snip > >> >> >> > My concern is users have be

Re: [PATCH 4/5] mm:swap: respect page_cluster for readahead

2017-09-12 Thread Huang, Ying
Minchan Kim writes: > On Tue, Sep 12, 2017 at 04:07:01PM +0800, Huang, Ying wrote: > < snip > >> >> > My concern is users have been disabled swap readahead by page-cluster >> >> > would >> >> > be regressed. Please take care of them.

Re: [PATCH 4/5] mm:swap: respect page_cluster for readahead

2017-09-12 Thread Huang, Ying
Minchan Kim writes: > On Tue, Sep 12, 2017 at 03:29:45PM +0800, Huang, Ying wrote: >> Minchan Kim writes: >> >> > On Tue, Sep 12, 2017 at 02:44:36PM +0800, Huang, Ying wrote: >> >> Minchan Kim writes: >> >> >> >> > On Tue, Sep 12

Re: [PATCH 4/5] mm:swap: respect page_cluster for readahead

2017-09-12 Thread Huang, Ying
Minchan Kim writes: > On Tue, Sep 12, 2017 at 02:44:36PM +0800, Huang, Ying wrote: >> Minchan Kim writes: >> >> > On Tue, Sep 12, 2017 at 01:23:01PM +0800, Huang, Ying wrote: >> >> Minchan Kim writes: >> >> >> >> > page_cluster

Re: [PATCH 4/5] mm:swap: respect page_cluster for readahead

2017-09-11 Thread Huang, Ying
Minchan Kim writes: > On Tue, Sep 12, 2017 at 01:23:01PM +0800, Huang, Ying wrote: >> Minchan Kim writes: >> >> > page_cluster 0 means "we don't want readahead" so in the case, >> > let's skip the readahead detection logic. >>

Re: [PATCH 4/5] mm:swap: respect page_cluster for readahead

2017-09-11 Thread Huang, Ying
Minchan Kim writes: > page_cluster 0 means "we don't want readahead" so in the case, > let's skip the readahead detection logic. > > Cc: "Huang, Ying" > Signed-off-by: Minchan Kim > --- > include/linux/swap.h | 3 ++- > 1 file changed, 2

Re: [PATCH -v2] IRQ, cpu-hotplug: Fix a race between CPU hotplug and IRQ desc alloc/free

2017-09-05 Thread Huang, Ying
Thomas Gleixner writes: > On Tue, 5 Sep 2017, Huang, Ying wrote: > >> From: Huang Ying >> >> When developing code to bootup some APs (Application CPUs) >> asynchronously, the following kernel panic is encountered. After >> checking the code, it is found th

Re: [PATCH -v2] IRQ, cpu-hotplug: Fix a race between CPU hotplug and IRQ desc alloc/free

2017-09-05 Thread Huang, Ying
Thomas Gleixner writes: > On Tue, 5 Sep 2017, Huang, Ying wrote: > >> From: Huang Ying >> >> When developing code to bootup some APs (Application CPUs) >> asynchronously, the following kernel panic is encountered. After >> checking the code, it is found th

[PATCH -v2] IRQ, cpu-hotplug: Fix a race between CPU hotplug and IRQ desc alloc/free

2017-09-04 Thread Huang, Ying
From: Huang Ying When developing code to bootup some APs (Application CPUs) asynchronously, the following kernel panic is encountered. After checking the code, it is found that irq_to_desc() may return NULL during CPU hotplug, so a NULL pointer check is added to fix this.

Re: [PATCH] IRQ, cpu-hotplug: Fix a race between CPU hotplug and IRQ desc alloc/free

2017-09-04 Thread Huang, Ying
Thomas Gleixner writes: > On Mon, 4 Sep 2017, Huang, Ying wrote: >> diff --git a/kernel/irq/cpuhotplug.c b/kernel/irq/cpuhotplug.c >> index 638eb9c83d9f..af9029625271 100644 >> --- a/kernel/irq/cpuhotplug.c > ry> +++ b/kernel/irq/cpuhotplug.c &g

[PATCH] IRQ, cpu-hotplug: Fix a race between CPU hotplug and IRQ desc alloc/free

2017-09-04 Thread Huang, Ying
From: Huang Ying When developing code to bootup some APs (Application CPUs) asynchronously, the following kernel panic is encountered. After checking the code, it is found that the IRQ descriptor may be NULL during CPU hotplug. So I added corresponding NULL pointer checking to fix this. And

Re: [PATCH] mm: kvfree the swap cluster info if the swap file is unsatisfactory

2017-09-03 Thread Huang, Ying
/mm/swapfile.c > @@ -3053,6 +3053,7 @@ SYSCALL_DEFINE2(swapon, const char __user *, > specialfile, int, swap_flags) > spin_unlock(&swap_lock); > vfree(swap_map); > kvfree(cluster_info); > + kvfree(frontswap_map); > if (swap_file) { > if (inode && S_ISREG(inode->i_mode)) { > inode_unlock(inode); Yes. There is a memory leak. Reviewed-by: "Huang, Ying" Best Regards, Huang, Ying

Re: [PATCH] mm: kvfree the swap cluster info if the swap file is unsatisfactory

2017-08-31 Thread Huang, Ying
the vfree calls to use kvfree. > > Found by running generic/357 from xfstests. > > Signed-off-by: Darrick J. Wong Thanks for fixing! Reviewed-by: "Huang, Ying" Best Regards, Huang, Ying > --- > mm/swapfile.c |2 +- > 1 file changed, 1 insertion(+), 1 delet

[PATCH -mm] mm: Improve readability of clear_huge_page

2017-08-29 Thread Huang, Ying
From: Huang Ying The optimized clear_huge_page() isn't easy to read and understand. Michal Hocko suggested improving it. Suggested-by: Michal Hocko Signed-off-by: "Huang, Ying" --- mm/memory.c | 11 +++ 1 file changed, 7 insertions(+), 4 deletions(-)

Re: [LKP] [lkp-robot] [sched/cfs] 625ed2bf04: unixbench.score -7.4% regression

2017-08-27 Thread Huang, Ying
ributed more evenly, so I think the scheduler does a better job here. The problem is that the tasklist_lock isn't scalable. But considering this is only a micro-benchmark which specifically exercises the fork/exit/wait syscalls, this may not be a big problem in reality. So, all in all, I think we can ignore this regression. Best Regards, Huang, Ying

Re: [PATCH 3/3] IPI: Avoid to use 2 cache lines for one call_single_data

2017-08-27 Thread Huang, Ying
"Huang, Ying" writes: > Hi, Peter, > > "Huang, Ying" writes: > >> Peter Zijlstra writes: >> >>> On Sat, Aug 05, 2017 at 08:47:02AM +0800, Huang, Ying wrote: >>>> Yes. That looks good. So you will prepare the final patch?

Re: [PATCH -mm -v2] mm: Clear to access sub-page last when clearing huge page

2017-08-21 Thread Huang, Ying
Michal Hocko writes: > On Tue 15-08-17 09:46:18, Huang, Ying wrote: >> From: Huang Ying >> >> Huge page helps to reduce TLB miss rate, but it has higher cache >> footprint, sometimes this may cause some issue. For example, when >> clearing huge page on x86_64

[PATCH -mm -v2] mm: Clear to access sub-page last when clearing huge page

2017-08-14 Thread Huang, Ying
From: Huang Ying Huge page helps to reduce TLB miss rate, but it has a higher cache footprint, and sometimes this may cause some issues. For example, when clearing a huge page on an x86_64 platform, the cache footprint is 2M. But on a Xeon E5 v3 2699 CPU, there are 18 cores, 36 threads, and only 45M LLC

Re: [PATCH 3/3] IPI: Avoid to use 2 cache lines for one call_single_data

2017-08-13 Thread Huang, Ying
Hi, Peter, "Huang, Ying" writes: > Peter Zijlstra writes: > >> On Sat, Aug 05, 2017 at 08:47:02AM +0800, Huang, Ying wrote: >>> Yes. That looks good. So you will prepare the final patch? Or you >>> hope me to do that? >> >> I was hoping yo

Re: [PATCH -mm] mm: Clear to access sub-page last when clearing huge page

2017-08-09 Thread Huang, Ying
Hi, Andrew, Andrew Morton writes: > On Mon, 7 Aug 2017 15:21:31 +0800 "Huang, Ying" wrote: > >> From: Huang Ying >> >> Huge page helps to reduce TLB miss rate, but it has higher cache >> footprint, sometimes this may cause some issue. For examp

Re: [PATCH -mm -v4 1/5] mm, swap: Add swap readahead hit statistics

2017-08-09 Thread Huang, Ying
Andrew Morton writes: > On Mon, 7 Aug 2017 13:40:34 +0800 "Huang, Ying" wrote: > >> From: Huang Ying >> >> The statistics for total readahead pages and total readahead hits are >> recorded and exported via the following sysfs interface. >> &g

Re: [PATCH -mm] mm: Clear to access sub-page last when clearing huge page

2017-08-08 Thread Huang, Ying
Matthew Wilcox writes: > On Mon, Aug 07, 2017 at 03:21:31PM +0800, Huang, Ying wrote: >> @@ -2509,7 +2509,8 @@ enum mf_action_page_type { >> #if defined(CONFIG_TRANSPARENT_HUGEPAGE) || defined(CONFIG_HUGETLBFS) >> extern void clear_huge_p

Re: [PATCH -mm] mm: Clear to access sub-page last when clearing huge page

2017-08-08 Thread Huang, Ying
"Huang, Ying" writes: > "Kirill A. Shutemov" writes: > >> On Mon, Aug 07, 2017 at 03:21:31PM +0800, Huang, Ying wrote: >>> From: Huang Ying >>> >>> Huge page helps to reduce TLB miss rate, but it has higher cache >>>

Re: [PATCH -mm] mm: Clear to access sub-page last when clearing huge page

2017-08-07 Thread Huang, Ying
Christopher Lameter writes: > On Mon, 7 Aug 2017, Huang, Ying wrote: > >> --- a/mm/memory.c >> +++ b/mm/memory.c >> @@ -4374,9 +4374,31 @@ void clear_huge_page(struct page *page, >> } >> >> might_sleep(); >> -for (i = 0; i <

Re: [PATCH 3/3] IPI: Avoid to use 2 cache lines for one call_single_data

2017-08-07 Thread Huang, Ying
Peter Zijlstra writes: > On Sat, Aug 05, 2017 at 08:47:02AM +0800, Huang, Ying wrote: >> Yes. That looks good. So you will prepare the final patch? Or you >> hope me to do that? > > I was hoping you'd do it ;-) Thanks! Here is the updated patch Best Regards,

Re: [PATCH -mm] mm: Clear to access sub-page last when clearing huge page

2017-08-07 Thread Huang, Ying
Mike Kravetz writes: > On 08/07/2017 12:21 AM, Huang, Ying wrote: >> From: Huang Ying >> >> Huge page helps to reduce TLB miss rate, but it has higher cache >> footprint, sometimes this may cause some issue. For example, when >> clearing huge page on x86_64 pl

Re: [PATCH -mm] mm: Clear to access sub-page last when clearing huge page

2017-08-07 Thread Huang, Ying
Christopher Lameter writes: > On Mon, 7 Aug 2017, Huang, Ying wrote: > >> --- a/mm/memory.c >> +++ b/mm/memory.c >> @@ -4374,9 +4374,31 @@ void clear_huge_page(struct page *page, >> } >> >> might_sleep(); >> -for (i = 0; i <

Re: [PATCH -mm] mm: Clear to access sub-page last when clearing huge page

2017-08-07 Thread Huang, Ying
"Kirill A. Shutemov" writes: > On Mon, Aug 07, 2017 at 03:21:31PM +0800, Huang, Ying wrote: >> From: Huang Ying >> >> Huge page helps to reduce TLB miss rate, but it has higher cache >> footprint, sometimes this may cause some issue. For example, when >

Re: [PATCH -mm] mm: Clear to access sub-page last when clearing huge page

2017-08-07 Thread Huang, Ying
Jan Kara writes: > On Mon 07-08-17 15:21:31, Huang, Ying wrote: >> From: Huang Ying >> >> Huge page helps to reduce TLB miss rate, but it has higher cache >> footprint, sometimes this may cause some issue. For example, when >> clearing huge page on x86_64 pl

[PATCH -mm] mm: Clear to access sub-page last when clearing huge page

2017-08-07 Thread Huang, Ying
From: Huang Ying Huge page helps to reduce TLB miss rate, but it has a higher cache footprint, and sometimes this may cause some issues. For example, when clearing a huge page on an x86_64 platform, the cache footprint is 2M. But on a Xeon E5 v3 2699 CPU, there are 18 cores, 36 threads, and only 45M LLC

[PATCH -mm -v4 5/5] mm, swap: Don't use VMA based swap readahead if HDD is used as swap

2017-08-06 Thread Huang, Ying
From: Huang Ying VMA based swap readahead will readahead the virtual pages that are contiguous in the virtual address space, while the original swap readahead will readahead the swap slots that are contiguous in the swap device. Although VMA based swap readahead is more correct for the swap

[PATCH -mm -v4 4/5] mm, swap: Add sysfs interface for VMA based swap readahead

2017-08-06 Thread Huang, Ying
From: Huang Ying The sysfs interface to control the VMA based swap readahead is added as follows: /sys/kernel/mm/swap/vma_ra_enabled Enable the VMA based swap readahead algorithm, or use the original global swap readahead algorithm. /sys/kernel/mm/swap/vma_ra_max_order Set the max order of

[PATCH -mm -v4 1/5] mm, swap: Add swap readahead hit statistics

2017-08-06 Thread Huang, Ying
From: Huang Ying The statistics for total readahead pages and total readahead hits are recorded and exported via the following sysfs interface. /sys/kernel/mm/swap/ra_hits /sys/kernel/mm/swap/ra_total With them, the efficiency of the swap readahead could be measured, so that the swap readahead

[PATCH -mm -v4 3/5] mm, swap: VMA based swap readahead

2017-08-06 Thread Huang, Ying
From: Huang Ying The swap readahead is an important mechanism to reduce the swap in latency. Although pure sequential memory access pattern isn't very popular for anonymous memory, the space locality is still considered valid. In the original swap readahead implementation, the consec

[PATCH -mm -v4 2/5] mm, swap: Fix swap readahead marking

2017-08-06 Thread Huang, Ying
From: Huang Ying In the original implementation, it is possible that the existing pages in the swap cache (not newly readahead) could be marked as readahead pages. This will cause the statistics of swap readahead to be wrong and influence the swap readahead algorithm too. This is fixed via

[PATCH -mm -v4 0/5] mm, swap: VMA based swap readahead

2017-08-06 Thread Huang, Ying
swap readahead statistics, because that is the interface used by other similar statistics. - Add ABI document for newly added sysfs interface. v3: - Rebased on latest -mm tree - Use percpu_counter for swap readahead statistics per Dave Hansen's comment. Best Regards, Huang, Ying

Re: [PATCH 3/3] IPI: Avoid to use 2 cache lines for one call_single_data

2017-08-04 Thread Huang, Ying
Peter Zijlstra writes: > On Fri, Aug 04, 2017 at 10:05:55AM +0800, Huang, Ying wrote: >> "Huang, Ying" writes: >> > Peter Zijlstra writes: > >> >> +struct __call_single_data { >> >> struct llist_node llist; >> >> s

Re: [PATCH 3/3] IPI: Avoid to use 2 cache lines for one call_single_data

2017-08-03 Thread Huang, Ying
"Huang, Ying" writes: > Peter Zijlstra writes: > [snip] >> diff --git a/include/linux/smp.h b/include/linux/smp.h >> index 68123c1fe549..8d817cb80a38 100644 >> --- a/include/linux/smp.h >> +++ b/include/linux/smp.h >> @@ -14,13 +14,16 @@ >>

Re: [PATCH 3/3] IPI: Avoid to use 2 cache lines for one call_single_data

2017-08-03 Thread Huang, Ying
__call_single_data)); > + Another requirement of the alignment is that it should be a power of 2. Otherwise, for example, if someone adds a field to the struct so that the size becomes 40 on x86_64, the alignment should be 64 instead of 40. Best Regards, Huang, Ying > /* total number of cp

Re: [PATCH 3/3] IPI: Avoid to use 2 cache lines for one call_single_data

2017-08-03 Thread Huang, Ying
Eric Dumazet writes: > On Wed, 2017-08-02 at 16:52 +0800, Huang, Ying wrote: >> From: Huang Ying >> >> struct call_single_data is used in IPI to transfer information between >> CPUs. Its size is bigger than sizeof(unsigned long) and less than >> cache line si

Re: [PATCH 1/3] percpu: Add alloc_percpu_aligned()

2017-08-02 Thread Huang, Ying
Christopher Lameter writes: > On Wed, 2 Aug 2017, Huang, Ying wrote: > >> --- a/include/linux/percpu.h >> +++ b/include/linux/percpu.h >> @@ -129,5 +129,8 @@ extern phys_addr_t per_cpu_ptr_to_phys(void *addr); >

[PATCH 2/3] iova: Use alloc_percpu_aligned()

2017-08-02 Thread Huang, Ying
From: Huang Ying To use the newly introduced alloc_percpu_aligned(), which can allocate cache line size aligned percpu memory dynamically. Signed-off-by: "Huang, Ying" Cc: Joerg Roedel Cc: io...@lists.linux-foundation.org --- drivers/iommu/iova.c | 2 +- 1 file changed, 1 inser

[PATCH 1/3] percpu: Add alloc_percpu_aligned()

2017-08-02 Thread Huang, Ying
From: Huang Ying To allocate percpu memory that is aligned with cache line size dynamically. We can statically allocate percpu memory that is aligned with cache line size with DEFINE_PER_CPU_ALIGNED(), but we have no correspondent API for dynamic allocation. Signed-off-by: "Huang, Ying

[PATCH 0/3] IPI: Avoid to use 2 cache lines for one call_single_data

2017-08-02 Thread Huang, Ying
From: Huang Ying struct call_single_data is used in IPI to transfer information between CPUs. Its size is bigger than sizeof(unsigned long) and less than the cache line size. Now, it is allocated without any alignment requirement. This makes it possible for an allocated call_single_data to cross 2

[PATCH 3/3] IPI: Avoid to use 2 cache lines for one call_single_data

2017-08-02 Thread Huang, Ying
From: Huang Ying struct call_single_data is used in IPI to transfer information between CPUs. Its size is bigger than sizeof(unsigned long) and less than the cache line size. Now, it is allocated without any alignment requirement. This makes it possible for an allocated call_single_data to cross 2

Re: linux-next: build warning after merge of the akpm tree

2017-07-31 Thread Huang, Ying
looks like a false-positive warning; it is not reported by my compiler or the kbuild compiler (gcc-6). But anyway, we should silence it. Best Regards, Huang, Ying -->8-- >From 7a7ff76d7bcbd7affda169b29abcf3dafa38052e Mon Sep 17 00:00:00 2001 From: Huang Ying Date: Tue, 1 Aug 2017

Re: "BUG: unable to handle kernel NULL pointer dereference" in swapping shmem

2017-07-31 Thread Huang, Ying
2cd503b4980b0afc ]--- > [ 113.341281] Kernel panic - not syncing: Fatal exception > [ 113.347398] Kernel Offset: 0x700 from 0x8100 (relocation > range: 0x8000-0xbfff) Thanks for reporting! Did you test it on an HDD? I can reproduce this on a

Re: [PATCH -mm -v3 1/6] mm, swap: Add swap cache statistics sysfs interface

2017-07-25 Thread Huang, Ying
Hi, Rik, Rik van Riel writes: > On Tue, 2017-07-25 at 09:51 +0800, Huang, Ying wrote: >> From: Huang Ying >> >> The swap cache stats could be gotten only via sysrq, which isn't >> convenient in some situation.  So the sysfs interface of swap cache >> sta

Re: [PATCH -mm -v3 1/6] mm, swap: Add swap cache statistics sysfs interface

2017-07-25 Thread Huang, Ying
Andrew Morton writes: > On Tue, 25 Jul 2017 09:51:46 +0800 "Huang, Ying" wrote: > >> The swap cache stats could be gotten only via sysrq, which isn't >> convenient in some situation. So the sysfs interface of swap cache >> stats is added for that. T

Re: [PATCH -mm -v3 6/6] mm, swap: Don't use VMA based swap readahead if HDD is used as swap

2017-07-25 Thread Huang, Ying
Hi, Andrew, Andrew Morton writes: > On Tue, 25 Jul 2017 09:51:51 +0800 "Huang, Ying" wrote: > >> From: Huang Ying >> >> VMA based swap readahead will readahead the virtual pages that is >> continuous in the virtual address space. While the original s

[PATCH -mm -v3 4/6] mm, swap: VMA based swap readahead

2017-07-24 Thread Huang, Ying
From: Huang Ying The swap readahead is an important mechanism to reduce the swap in latency. Although pure sequential memory access pattern isn't very popular for anonymous memory, the space locality is still considered valid. In the original swap readahead implementation, the consec

[PATCH -mm -v3 6/6] mm, swap: Don't use VMA based swap readahead if HDD is used as swap

2017-07-24 Thread Huang, Ying
From: Huang Ying VMA based swap readahead will readahead the virtual pages that are contiguous in the virtual address space, while the original swap readahead will readahead the swap slots that are contiguous in the swap device. Although VMA based swap readahead is more correct for the swap

[PATCH -mm -v3 1/6] mm, swap: Add swap cache statistics sysfs interface

2017-07-24 Thread Huang, Ying
From: Huang Ying The swap cache stats could be gotten only via sysrq, which isn't convenient in some situations. So the sysfs interface for swap cache stats is added. The added sysfs directories/files are as follows: /sys/kernel/mm/swap /sys/kernel/mm/swap/cache_find_total /sys/k

[PATCH -mm -v3 5/6] mm, swap: Add sysfs interface for VMA based swap readahead

2017-07-24 Thread Huang, Ying
From: Huang Ying The sysfs interface to control the VMA based swap readahead is added as follows: /sys/kernel/mm/swap/vma_ra_enabled Enable the VMA based swap readahead algorithm, or use the original global swap readahead algorithm. /sys/kernel/mm/swap/vma_ra_max_order Set the max order of

[PATCH -mm -v3 2/6] mm, swap: Add swap readahead hit statistics

2017-07-24 Thread Huang, Ying
From: Huang Ying The statistics for total readahead pages and total readahead hits are recorded and exported via the following sysfs interface. /sys/kernel/mm/swap/ra_hits /sys/kernel/mm/swap/ra_total With them, the efficiency of the swap readahead could be measured, so that the swap readahead

[PATCH -mm -v3 3/6] mm, swap: Fix swap readahead marking

2017-07-24 Thread Huang, Ying
From: Huang Ying In the original implementation, it is possible that the existing pages in the swap cache (not newly readahead) could be marked as readahead pages. This will cause the statistics of swap readahead to be wrong and influence the swap readahead algorithm too. This is fixed via

[PATCH -mm -v3 0/6] mm, swap: VMA based swap readahead

2017-07-24 Thread Huang, Ying
dahead hit rate is high, shows that the space locality is still valid in some practical workloads. Changelogs: v3: - Rebased on latest -mm tree - Use percpu_counter for swap readahead statistics per Dave Hansen's comment. Best Regards, Huang, Ying

Re: Is it possible to use ftrace to measure secondary CPU bootup time

2017-07-24 Thread Huang, Ying
Steven Rostedt writes: > On Mon, 24 Jul 2017 13:46:07 +0800 > "Huang\, Ying" wrote: > >> Hi, Steven, >> >> We are working on parallelizing secondary CPU bootup. So we need to >> measure the bootup time of secondary CPU, that is, measure time spen

Is it possible to use ftrace to measure secondary CPU bootup time

2017-07-23 Thread Huang, Ying
early (before core_initcall()?). So, do you think it is possible to use ftrace to measure secondary CPU bootup time? Thanks, Huang, Ying

[PATCH -mm -v3 08/12] memcg, THP, swap: Support move mem cgroup charge for THP swapped out

2017-07-23 Thread Huang, Ying
From: Huang Ying PTE mapped THP (Transparent Huge Page) will be ignored when moving memory cgroup charge. But for a THP which is in the swap cache, the memory cgroup charge for the swap of a tail-page may be moved in the current implementation. That isn't correct, because the swap charge for al

[PATCH -mm -v3 11/12] mm, THP, swap: Delay splitting THP after swapped out

2017-07-23 Thread Huang, Ying
From: Huang Ying In this patch, splitting transparent huge page (THP) during swapping out is delayed from after adding the THP into the swap cache to after swapping out finishes. After the patch, more operations for the anonymous THP reclaiming, such as writing the THP to the swap device

[PATCH -mm -v3 10/12] memcg, THP, swap: Make mem_cgroup_swapout() support THP

2017-07-23 Thread Huang, Ying
From: Huang Ying This patch makes mem_cgroup_swapout() work for the transparent huge page (THP). It will move the memory cgroup charge from memory to swap for a THP. This will be used for the THP swap support, where a THP may be swapped out as a whole to a set of (HPAGE_PMD_NR) continuous

[PATCH -mm -v3 07/12] mm, THP, swap: Support to split THP for THP swapped out

2017-07-23 Thread Huang, Ying
From: Huang Ying After adding swapping out support for THP (Transparent Huge Page), it is possible that a THP in the swap cache (partly swapped out) needs to be split. To split such a THP, the swap cluster backing the THP needs to be split too, that is, the CLUSTER_FLAG_HUGE flag needs to be cleared

[PATCH -mm -v3 12/12] mm, THP, swap: Add THP swapping out fallback counting

2017-07-23 Thread Huang, Ying
From: Huang Ying When swapping out THP (Transparent Huge Page), instead of swapping out the THP as a whole, sometimes we have to fallback to split the THP into normal pages before swapping, because no free swap clusters are available, or cgroup limit is exceeded, etc. To count the number of the

[PATCH -mm -v3 09/12] memcg, THP, swap: Avoid to duplicated charge THP in swap cache

2017-07-23 Thread Huang, Ying
From: Huang Ying For a THP (Transparent Huge Page), tail_page->mem_cgroup is NULL. So to check whether the page is charged already, we need to check the head page. This was not an issue before, because it was impossible for a THP to be in the swap cache. But after we add delay

[PATCH -mm -v3 04/12] mm, THP, swap: Don't allocate huge cluster for file backed swap device

2017-07-23 Thread Huang, Ying
From: Huang Ying It's hard to write a whole transparent huge page (THP) to a file backed swap device during swapping out and the file backed swap device isn't very popular. So the huge cluster allocation for the file backed swap device is disabled. Signed-off-by: "Huang, Ying

[PATCH -mm -v3 06/12] Test code to write THP to swap device as a whole

2017-07-23 Thread Huang, Ying
From: Huang Ying To support delaying the splitting of a THP (Transparent Huge Page) until after it is swapped out, we need to enhance the swap writing code to write a THP as a whole. This will improve swap write IO performance. As Ming Lei pointed out, this should be based on multipage bvec support, which

[PATCH -mm -v3 05/12] block, THP: Make block_device_operations.rw_page support THP

2017-07-23 Thread Huang, Ying
From: Huang Ying The .rw_page in struct block_device_operations is used by the swap subsystem to read/write the page contents from/into the corresponding swap slot in the swap device. To support the THP (Transparent Huge Page) swap optimization, the .rw_page is enhanced to support to read/write

[PATCH -mm -v3 03/12] mm, THP, swap: Make reuse_swap_page() works for THP swapped out

2017-07-23 Thread Huang, Ying
From: Huang Ying After adding support for delaying THP (Transparent Huge Page) splitting until after swapout, it is possible that some page table mappings of the THP are turned into swap entries. So reuse_swap_page() needs to check the swap count in addition to the map count as before. This patch done

[PATCH -mm -v3 00/12] mm, THP, swap: Delay splitting THP after swapped out

2017-07-23 Thread Huang, Ying
From: Huang Ying Hi, Andrew, could you help me to check whether the overall design is reasonable? Hi, Johannes and Minchan, Thanks a lot for your review to the first step of the THP swap optimization! Could you help me to review the second step in this patchset? Hi, Hugh, Shaohua, Minchan and

[PATCH -mm -v3 02/12] mm, THP, swap: Support to reclaim swap space for THP swapped out

2017-07-23 Thread Huang, Ying
From: Huang Ying The normal swap slot reclaiming can be done when the swap count reaches SWAP_HAS_CACHE. But for the swap slot which is backing a THP, all swap slots backing one THP must be reclaimed together, because the swap slot may be used again when the THP is swapped out again later. So

[PATCH -mm -v3 01/12] mm, THP, swap: Support to clear swap cache flag for THP swapped out

2017-07-23 Thread Huang, Ying
From: Huang Ying Previously, swapcache_free_cluster() was used only in the error path of shrink_page_list(), to free the swap cluster just allocated if the THP (Transparent Huge Page) fails to be split. In this patch, it is enhanced to clear the swap cache flag (SWAP_HAS_CACHE) for the swap

Re: [PATCH 2/2] mm/swap: Remove lock_initialized flag from swap_slots_cache

2017-07-23 Thread Huang, Ying
be onlined alloc_swap_slot_cache() mutex_lock(cache[B]->alloc_lock) mutex_init(cache[B]->alloc_lock) !!! The cache[B]->alloc_lock will be reinitialized while it is still held. Best Regards, Huang, Ying > Reported-by: Wenwei Tao > Sign

Re: [PATCH -mm -v2 00/12] mm, THP, swap: Delay splitting THP after swapped out

2017-07-23 Thread Huang, Ying
Andrew Morton writes: > On Fri, 23 Jun 2017 15:12:51 +0800 "Huang, Ying" wrote: > >> From: Huang Ying >> >> Hi, Andrew, could you help me to check whether the overall design is >> reasonable? >> >> Hi, Johannes and Minchan, Thanks a lot

Re: [PATCH -mm -v2 2/6] mm, swap: Add swap readahead hit statistics

2017-07-11 Thread Huang, Ying
Dave Hansen writes: > On 06/29/2017 06:44 PM, Huang, Ying wrote: >> >> static atomic_t swapin_readahead_hits = ATOMIC_INIT(4); >> +static atomic_long_t swapin_readahead_hits_total = ATOMIC_INIT(0); >> +static atomic_long_t swapin_readahead_total = AT

Re: [PATCH -mm -v2 0/6] mm, swap: VMA based swap readahead

2017-06-30 Thread Huang, Ying
you have time to take a look at this patchset? Best Regards, Huang, Ying [snip]

[PATCH -mm -v2 0/6] mm, swap: VMA based swap readahead

2017-06-29 Thread Huang, Ying
The swap readahead is an important mechanism to reduce the swap in latency. Although pure sequential memory access pattern isn't very popular for anonymous memory, the space locality is still considered valid. In the original swap readahead implementation, the consecutive blocks in swap device ar

[PATCH -mm -v2 5/6] mm, swap: Add sysfs interface for VMA based swap readahead

2017-06-29 Thread Huang, Ying
From: Huang Ying The sysfs interface to control the VMA based swap readahead is added as follows: /sys/kernel/mm/swap/vma_ra_enabled Enable the VMA based swap readahead algorithm, or use the original global swap readahead algorithm. /sys/kernel/mm/swap/vma_ra_max_order Set the max order of

[PATCH -mm -v2 1/6] mm, swap: Add swap cache statistics sysfs interface

2017-06-29 Thread Huang, Ying
From: Huang Ying The swap cache stats could be gotten only via sysrq, which isn't convenient in some situations. So the sysfs interface for swap cache stats is added. The added sysfs directories/files are as follows: /sys/kernel/mm/swap /sys/kernel/mm/swap/cache_find_total /sys/k

[PATCH -mm -v2 4/6] mm, swap: VMA based swap readahead

2017-06-29 Thread Huang, Ying
From: Huang Ying The swap readahead is an important mechanism to reduce the swap in latency. Although pure sequential memory access pattern isn't very popular for anonymous memory, the space locality is still considered valid. In the original swap readahead implementation, the consec

[PATCH -mm -v2 6/6] mm, swap: Don't use VMA based swap readahead if HDD is used as swap

2017-06-29 Thread Huang, Ying
From: Huang Ying VMA based swap readahead will readahead the virtual pages that are contiguous in the virtual address space, while the original swap readahead will readahead the swap slots that are contiguous in the swap device. Although VMA based swap readahead is more correct for the swap

[PATCH -mm -v2 2/6] mm, swap: Add swap readahead hit statistics

2017-06-29 Thread Huang, Ying
From: Huang Ying The statistics for total readahead pages and total readahead hits are recorded and exported via the following sysfs interface. /sys/kernel/mm/swap/ra_hits /sys/kernel/mm/swap/ra_total With them, the efficiency of the swap readahead could be measured, so that the swap readahead
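The preview above describes the ra_hits/ra_total counters. A minimal sketch of measuring readahead efficiency from them (the counter values here are hypothetical; on a real system they would be read from the /sys/kernel/mm/swap/ra_hits and ra_total files named in the patch description):

```python
def readahead_hit_ratio(ra_hits: int, ra_total: int) -> float:
    """Fraction of readahead pages that were actually hit (used).

    Computed as ra_hits / ra_total, the two counters this patch
    exports via sysfs.
    """
    if ra_total == 0:
        return 0.0  # no readahead performed yet, avoid division by zero
    return ra_hits / ra_total

# Hypothetical counter snapshots for illustration:
print(readahead_hit_ratio(750, 1000))  # 0.75 -> readahead is mostly effective
```

A low ratio over time suggests the readahead window is too aggressive for the workload, which is the kind of tuning decision these counters are meant to inform.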

[PATCH -mm -v2 3/6] mm, swap: Fix swap readahead marking

2017-06-29 Thread Huang, Ying
From: Huang Ying In the original implementation, it is possible that existing pages in the swap cache (not newly read ahead) could be marked as readahead pages. This causes the swap readahead statistics to be wrong and influences the swap readahead algorithm too. This is fixed via

Re: mmotm 2017-06-23-15-03 uploaded

2017-06-26 Thread huang ying
80 SS:ESP: 0068:f54efd80 [ 10.670881] CR2: 001fe2b8 [ 10.671140] ---[ end trace f51518af57e6b531 ]--- I think this comes from the signed and unsigned int comparison on i386. The gcc version is, gcc (Debian 6.3.0-18) 6.3.0 20170516 Best Regards, Huang, Ying
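The report above attributes the i386 crash to a signed vs unsigned int comparison. A hedged illustration of this bug class (this emulates C's usual arithmetic conversions for 32-bit operands in Python; it is not the actual kernel code that broke):

```python
U32_MAX = 0xFFFFFFFF

def c_u32_compare_lt(a: int, b: int) -> bool:
    """Emulate `a < b` in C when one operand is a 32-bit unsigned int:
    the signed operand is converted to unsigned first, so a negative
    value wraps around to a huge positive one."""
    return (a & U32_MAX) < (b & U32_MAX)

# Mathematically -1 < 10, but after the implicit conversion -1 becomes
# 4294967295, so the C comparison yields the opposite result.
print(-1 < 10)                   # True  (mathematical / Python result)
print(c_u32_compare_lt(-1, 10))  # False (C signed/unsigned comparison)
```

Bugs of this shape often surface only on one architecture because type widths and promotion rules differ between, e.g., i386 and x86_64.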

[PATCH -mm -v2 06/12] Test code to write THP to swap device as a whole

2017-06-23 Thread Huang, Ying
From: Huang Ying To support delaying the splitting of a THP (Transparent Huge Page) until after it is swapped out, we need to enhance the swap writing code to write a THP as a whole. This will improve swap write IO performance. As Ming Lei pointed out, this should be based on multipage bvec support, which

[PATCH -mm -v2 02/12] mm, THP, swap: Support to reclaim swap space for THP swapped out

2017-06-23 Thread Huang, Ying
From: Huang Ying Normal swap slot reclaiming can be done when the swap count reaches SWAP_HAS_CACHE. But for a swap slot which is backing a THP, all swap slots backing the THP must be reclaimed together, because the swap slot may be used again when the THP is swapped out again later. So
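A toy model of the constraint described above (the types are invented for the sketch; SWAP_HAS_CACHE is modelled as the only remaining reference on a slot, and 0x40 is assumed from the kernel's swap map encoding): a cluster backing a THP may be reclaimed only when every one of its slots has dropped to the cache-only state.

```python
SWAP_HAS_CACHE = 0x40  # assumed flag value from the kernel's swap map encoding

def cluster_reclaimable(swap_map: list[int]) -> bool:
    """All slots backing one THP must be reclaimed together, so the
    cluster is reclaimable only when every slot holds SWAP_HAS_CACHE
    alone (no process still references any swap entry in it)."""
    return all(count == SWAP_HAS_CACHE for count in swap_map)

# 4-slot toy cluster (a real THP cluster has HPAGE_PMD_NR slots):
print(cluster_reclaimable([SWAP_HAS_CACHE] * 4))       # True: whole cluster idle
print(cluster_reclaimable([SWAP_HAS_CACHE, 1, 0, 0]))  # False: slot 1 still mapped
```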

[PATCH -mm -v2 01/12] mm, THP, swap: Support to clear swap cache flag for THP swapped out

2017-06-23 Thread Huang, Ying
From: Huang Ying Previously, swapcache_free_cluster() was used only in the error path of shrink_page_list(), to free the just-allocated swap cluster if splitting the THP (Transparent Huge Page) failed. In this patch, it is enhanced to clear the swap cache flag (SWAP_HAS_CACHE) for the swap

[PATCH -mm -v2 05/12] block, THP: Make block_device_operations.rw_page support THP

2017-06-23 Thread Huang, Ying
From: Huang Ying The .rw_page in struct block_device_operations is used by the swap subsystem to read/write the page contents from/into the corresponding swap slot in the swap device. To support the THP (Transparent Huge Page) swap optimization, .rw_page is enhanced to read/write

[PATCH -mm -v2 11/12] mm, THP, swap: Delay splitting THP after swapped out

2017-06-23 Thread Huang, Ying
From: Huang Ying In this patch, splitting transparent huge page (THP) during swapping out is delayed from after adding the THP into the swap cache to after swapping out finishes. After the patch, more operations for the anonymous THP reclaiming, such as writing the THP to the swap device

[PATCH -mm -v2 00/12] mm, THP, swap: Delay splitting THP after swapped out

2017-06-23 Thread Huang, Ying
From: Huang Ying Hi, Andrew, could you help me to check whether the overall design is reasonable? Hi, Johannes and Minchan, Thanks a lot for your review to the first step of the THP swap optimization! Could you help me to review the second step in this patchset? Hi, Hugh, Shaohua, Minchan and

[PATCH -mm -v2 03/12] mm, THP, swap: Make reuse_swap_page() works for THP swapped out

2017-06-23 Thread Huang, Ying
From: Huang Ying After adding support for delaying THP (Transparent Huge Page) splitting until after swap-out, it is possible that some page table mappings of the THP are turned into swap entries. So reuse_swap_page() needs to check the swap count in addition to the map count as before. This patch done

[PATCH -mm -v2 10/12] memcg, THP, swap: Make mem_cgroup_swapout() support THP

2017-06-23 Thread Huang, Ying
From: Huang Ying This patch makes mem_cgroup_swapout() work for transparent huge pages (THP), moving the memory cgroup charge from memory to swap for a whole THP. This will be used for the THP swap support, where a THP may be swapped out as a whole to a set of (HPAGE_PMD_NR) contiguous

[PATCH -mm -v2 09/12] memcg, THP, swap: Avoid to duplicated charge THP in swap cache

2017-06-23 Thread Huang, Ying
From: Huang Ying For a THP (Transparent Huge Page), tail_page->mem_cgroup is NULL. So to check whether the page is charged already, we need to check the head page. This was not an issue before, because it was impossible for a THP to be in the swap cache. But after we add delay

[PATCH -mm -v2 07/12] mm, THP, swap: Support to split THP for THP swapped out

2017-06-23 Thread Huang, Ying
From: Huang Ying After adding swap-out support for THP (Transparent Huge Page), it is possible that a THP in the swap cache (partly swapped out) needs to be split. To split such a THP, the swap cluster backing the THP needs to be split too, that is, the CLUSTER_FLAG_HUGE flag needs to be cleared
