in some cases since
> THP may get split during page reclaim, so a part of the tail pages gets
> reclaimed instead of the whole 512 pages, while nr_scanned is still
> accounted by 512, particularly for direct reclaim. But this should not
> be a significant issue.
>
> Cc: "Huang, Ying"
Josef Bacik writes:
> On Fri, May 24, 2019 at 03:46:17PM +0800, Huang, Ying wrote:
>> "Huang, Ying" writes:
>>
>> > "Huang, Ying" writes:
>> >
>> >> Hi, Josef,
>> >>
>> >> kernel test robot writes:
>
"Huang, Ying" writes:
> "Huang, Ying" writes:
>
>> Hi, Josef,
>>
>> kernel test robot writes:
>>
>>> Greeting,
>>>
>>> FYI, we noticed a -12.4% regression of fio.write_bw_MBps due to commit:
>>>
>>>
From: Huang Ying
Since commit 4b3ef9daa4fc ("mm/swap: split swap cache into 64MB trunks"),
the address_space associated with the swap device is freed after swapoff.
So swap_address_space() users that touch the address_space need some
kind of mechanism to prevent the address_
From: Huang Ying
When swapin is performed, after getting the swap entry information from
the page table, the system swaps in the swap entry without any lock held
to prevent the swap device from being swapped off. This may cause a race
like the one below,
CPU 1 CPU 2
rs as well.
>
> So account those counters in base pages instead of accounting a THP as
> one page.
>
> This change may result in a lower steal/scan ratio in some cases since
> a THP may get split during page reclaim, then a part of the tail pages
> gets reclaimed instead of the whole 512 page
Yang Shi writes:
> On 5/9/19 7:12 PM, Huang, Ying wrote:
>> Yang Shi writes:
>>
>>> Since commit bd4c82c22c36 ("mm, THP, swap: delay splitting THP after
>>> swapped out"), THP can be swapped out in a whole. But, nr_reclaimed
>>> still gets in
12 pages in worst case.
>
> This change may result in more reclaimed pages than scanned pages shown
> by /proc/vmstat, since scanning one head page would reclaim 512 base pages.
>
> Cc: "Huang, Ying"
> Cc: Johannes Weiner
> Cc: Michal Hocko
> Cc: Mel Gorman
>
"Huang, Ying" writes:
> Hi, Josef,
>
> kernel test robot writes:
>
>> Greeting,
>>
>> FYI, we noticed a -12.4% regression of fio.write_bw_MBps due to commit:
>>
>>
>> commit: 302167c50b32e7fccc98994a91d40ddbbab04e52 ("btrfs: don
g a particular type of I/O action as specified by the user.
> test-url: https://github.com/axboe/fio
>
>
Do you have time to take a look at this regression?
Best Regards,
Huang, Ying
Theodore Ts'o writes:
> On Wed, Mar 13, 2019 at 03:26:39PM +0800, huang ying wrote:
>> >
>> >
>> > commit: fde872682e175743e0c3ef939c89e3c6008a1529 ("ext4: force inode
>> > writes when nfsd calls commit_metadata()")
>> > https:/
s commit_metadata()")
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
It appears that this is a performance regression caused by a
functionality fix. So should we just ignore this?
Best Regards,
Huang, Ying
Linus Torvalds writes:
> On Wed, Feb 27, 2019 at 5:19 PM Huang, Ying wrote:
>>
>> So I think in the heavily contended situation, we should put the fields
>> accessed by the rwsem holder in a different cache line from the rwsem.
>> But in the un-contended situation, we should put t
Waiman Long writes:
> On 02/27/2019 08:18 PM, Huang, Ying wrote:
>> Waiman Long writes:
>>
>>> On 02/26/2019 12:30 PM, Linus Torvalds wrote:
>>>> On Tue, Feb 26, 2019 at 12:17 AM Huang, Ying wrote:
>>>>> As for fixing. Should we care about
Waiman Long writes:
> On 02/26/2019 12:30 PM, Linus Torvalds wrote:
>> On Tue, Feb 26, 2019 at 12:17 AM Huang, Ying wrote:
>>> As for fixing. Should we care about the cache line alignment of struct
>>> inode? Or its size is considered more important because there
Daniel Jordan writes:
> On Tue, Feb 26, 2019 at 02:49:05PM +0800, Huang, Ying wrote:
>> Do you have time to take a look at this patch?
>
> Hi Ying, is this handling all places where swapoff might cause a task to read
> invalid data? For example, why don't other reads o
ng, then cause the regression. The parent
commit happens to have the right cache line layout, while the first bad
commit doesn't.
As for fixing: should we care about the cache line alignment of struct
inode? Or is its size considered more important, because there may be a
huge number of s
Hi, Daniel and Andrea,
"Huang, Ying" writes:
> From: Huang Ying
>
> When swapin is performed, after getting the swap entry information from
> the page table, system will swap in the swap entry, without any lock held
> to prevent the swap device from being swapoff.
Greg Kroah-Hartman writes:
> On Thu, Feb 21, 2019 at 03:18:22PM +0800, Huang, Ying wrote:
>> Greg Kroah-Hartman writes:
>>
>> > On Thu, Feb 21, 2019 at 11:10:49AM +0800, kernel test robot wrote:
>> >> On Tue, Feb 19, 2019 at 01:19:04PM +0100, Greg Kroah-H
19%        936  will-it-scale.per_thread_ops
>>
>>
>>
>> commit a36dc70b810afe9183de2ea18faa4c0939c139ac
>> Author: 0day robot
>> Date: Wed Feb 20 14:21:19 2019 +0800
>>
>> backfile klist_node in struct device for debugging
>>
>> Signed-off-by: 0day robot
>>
>> diff --git a/include/linux/device.h b/include/linux/device.h
>> index d0e452fd0bff2..31666cb72b3ba 100644
>> --- a/include/linux/device.h
>> +++ b/include/linux/device.h
>> @@ -1035,6 +1035,7 @@ struct device {
>>         spinlock_t              devres_lock;
>>         struct list_head        devres_head;
>>
>> +       struct klist_node       knode_class_test_by_rongc;
>>         struct class            *class;
>>         const struct attribute_group **groups;  /* optional groups */
>
> While this is fun to worry about alignment and structure size of 'struct
> device' I find it odd given that the syscalls and userspace load of
> those test programs have nothing to do with 'struct device' at all.
>
> So I can work on fixing up the alignment of struct device, as that's a
> nice thing to do for systems with 30k of these in memory, but that
> shouldn't affect a workload of a constant string of signal calls.
Hi, Greg,
I don't think this is an issue of struct device. As you said, struct
device isn't accessed much during the test. struct device may share a
slab page with some other data structures (signal related, or fd
related, as in some other test cases), so that the alignment of those
data structures is affected, which caused the performance regression.
Best Regards,
Huang, Ying
> thanks,
>
> greg k-h
Wei Yang writes:
> On Thu, Feb 21, 2019 at 12:46:18PM +0800, Huang, Ying wrote:
>>Wei Yang writes:
>>
>>> On Thu, Feb 21, 2019 at 11:10:49AM +0800, kernel test robot wrote:
>>>>On Tue, Feb 19, 2019 at 01:19:04PM +0100, Greg Kroah-Hartman wrote:
>>&g
>>     447722 ± 22%     546258 ± 10%  will-it-scale.time.involuntary_context_switches
>>     226995 ± 19%     269751        will-it-scale.workload
>>        787 ± 19%        936        will-it-scale.per_thread_ops
>>
>>
>>
>>commit a36dc70b810afe9183de2ea18faa4c0939c139ac
>>Author: 0day robot
>>Date: Wed Feb 20 14:21:19 2019 +0800
>>
>>backfile klist_node in struct device for debugging
>>
>>Signed-off-by: 0day robot
>>
>>diff --git a/include/linux/device.h b/include/linux/device.h
>>index d0e452fd0bff2..31666cb72b3ba 100644
>>--- a/include/linux/device.h
>>+++ b/include/linux/device.h
>>@@ -1035,6 +1035,7 @@ struct device {
>>         spinlock_t              devres_lock;
>>         struct list_head        devres_head;
>>
>> +       struct klist_node       knode_class_test_by_rongc;
>>         struct class            *class;
>>         const struct attribute_group **groups;  /* optional groups */
>
> Hmm... because this is not properly aligned?
>
> struct klist_node {
>         void              *n_klist;  /* never access directly */
>         struct list_head  n_node;
>         struct kref       n_ref;
> };
>
> Except struct kref has one "int" type, others are pointers.
>
> But... I am still confused.
I guess that because the size of struct device changed, it caused some
alignment changes elsewhere in the system, which influenced the
benchmark score.
Best Regards,
Huang, Ying
>>
>>Best Regards,
>>Rong Chen
Andrea Arcangeli writes:
> On Thu, Feb 14, 2019 at 04:07:37PM +0800, Huang, Ying wrote:
>> Before, we choose to use stop_machine() to reduce the overhead of hot
>> path (page fault handler) as much as possible. But now, I found
>> rcu_read_lock_sched() is just a wrapp
Tim Chen writes:
> On 2/11/19 10:47 PM, Huang, Ying wrote:
>> Andrea Parri writes:
>>
>>>>> + if (!si)
>>>>> + goto bad_nofile;
>>>>> +
>>>>> + preempt_disable();
>>>>> + if (!(si->flags &
Daniel Jordan writes:
> On Mon, Feb 11, 2019 at 04:38:46PM +0800, Huang, Ying wrote:
>> +struct swap_info_struct *get_swap_device(swp_entry_t entry)
>> +{
>> +        struct swap_info_struct *si;
>> +        unsigned long type, offset;
>> +
>> +        i
Yang Shi writes:
> swap_vma_readahead()'s comment is missing; just add it.
>
> Cc: Huang Ying
> Cc: Tim Chen
> Cc: Minchan Kim
> Signed-off-by: Yang Shi
Thanks!
Reviewed-by: "Huang, Ying"
Best Regards,
Huang, Ying
> ---
> v5: Fixed the comments p
Yang Shi writes:
> swap_vma_readahead()'s comment is missing; just add it.
>
> Cc: Huang Ying
> Cc: Tim Chen
> Cc: Minchan Kim
> Signed-off-by: Yang Shi
> ---
> mm/swap_state.c | 17 +
> 1 file changed, 17 insertions(+)
>
> diff --g
ld backport the fix from 4.14 on. But Hugh thinks
it may be rare for KSM pages to be in the swap device at swapoff,
so nobody has reported the bug so far.
Best Regards,
Huang, Ying
Huang Ying writes:
> KSM pages may be mapped to the multiple VMAs that cannot be reached
> from one anon_v
ly via checking
PageTransCompound(). That is how this patch works.
Fixes: e07098294adf ("mm, THP, swap: support to reclaim swap space for THP
swapped out")
Signed-off-by: "Huang, Ying"
Reported-and-Tested-and-Acked-by: Hugh Dickins
Cc: Rik van Riel
Cc: Johannes Weiner
Hi, Daniel,
Daniel Jordan writes:
> +Aneesh
>
> On Fri, Dec 14, 2018 at 02:27:40PM +0800, Huang Ying wrote:
>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
>> index bd2543e10938..49df3e7c96c7 100644
>> --- a/mm/huge_memory.c
>> +++ b/mm/huge_memory.c
Daniel Jordan writes:
> On Fri, Dec 14, 2018 at 02:27:43PM +0800, Huang Ying wrote:
>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
>> index 1cec1eec340e..644cb5d6b056 100644
>> --- a/mm/huge_memory.c
>> +++ b/mm/huge_memory.c
>> @@ -33,6 +33,8 @@
>>
PMD swap mappings to the corresponding swap
cluster. So when clearing the SWAP_HAS_CACHE flag, the huge swap
cluster will only be split if the PMD swap mapping count is 0.
Otherwise, we will keep it as a huge swap cluster, so that we can
swap in a THP in one piece later.
Signed-off-by: "
During MADV_WILLNEED, for a PMD swap mapping, if THP swapin is enabled
for the VMA, the whole swap cluster will be swapped in. Otherwise, the
huge swap cluster and the PMD swap mapping will be split and fall back
to PTE swap mappings.
Signed-off-by: "Huang, Ying"
Cc: "Kirill A. Shutem
During mincore(), for a PMD swap mapping, the swap cache will be looked
up. If the resulting page isn't a compound page, the PMD swap mapping
will be split and fall back to PTE swap mapping processing.
Signed-off-by: "Huang, Ying"
Cc: "Kirill A. Shutemov"
Cc: Andrea Arcang
ed by misaligned address processing issue in __split_huge_swap_pmd().
Signed-off-by: "Huang, Ying"
Cc: "Kirill A. Shutemov"
Cc: Andrea Arcangeli
Cc: Michal Hocko
Cc: Johannes Weiner
Cc: Shaohua Li
Cc: Hugh Dickins
Cc: Minchan Kim
Cc: Rik van Riel
Cc: Dave Hansen
Cc: Naoya
in is
disabled, the huge swap cluster and the PMD swap mapping will be split
and fallback to normal page swapin.
Signed-off-by: "Huang, Ying"
Cc: "Kirill A. Shutemov"
Cc: Andrea Arcangeli
Cc: Michal Hocko
Cc: Johannes Weiner
Cc: Shaohua Li
Cc: Hugh Dickins
Cc: Minchan Kim
C
refactoring; there is no functional change in
this patch.
Signed-off-by: "Huang, Ying"
Cc: "Kirill A. Shutemov"
Cc: Andrea Arcangeli
Cc: Michal Hocko
Cc: Johannes Weiner
Cc: Shaohua Li
Cc: Hugh Dickins
Cc: Minchan Kim
Cc: Rik van Riel
Cc: Dave Hansen
Cc: Naoya Horig
continuation failed to allocate a page with
GFP_ATOMIC, we need to unlock the spinlock and try again with
GFP_KERNEL.
Signed-off-by: "Huang, Ying"
Cc: "Kirill A. Shutemov"
Cc: Andrea Arcangeli
Cc: Michal Hocko
Cc: Johannes Weiner
Cc: Shaohua Li
Cc: Hugh Dickins
Cc: Minc
The original code is only for PMD migration entries; it is revised to
support PMD swap mappings.
Signed-off-by: "Huang, Ying"
Cc: "Kirill A. Shutemov"
Cc: Andrea Arcangeli
Cc: Michal Hocko
Cc: Johannes Weiner
Cc: Shaohua Li
Cc: Hugh Dickins
Cc: Minchan Kim
Cc: Rik van Riel
swapping is
used, so that we can take full advantage of THP including its high
performance for swapout/swapin.
Signed-off-by: "Huang, Ying"
Cc: "Kirill A. Shutemov"
Cc: Andrea Arcangeli
Cc: Michal Hocko
Cc: Johannes Weiner
Cc: Shaohua Li
Cc: Hugh Dickins
Cc: Minchan Kim
lback to PTE processing.
Signed-off-by: "Huang, Ying"
Cc: "Kirill A. Shutemov"
Cc: Andrea Arcangeli
Cc: Michal Hocko
Cc: Johannes Weiner
Cc: Shaohua Li
Cc: Hugh Dickins
Cc: Minchan Kim
Cc: Rik van Riel
Cc: Dave Hansen
Cc: Naoya Horiguchi
Cc: Zi Yan
Cc: Daniel J
For a PMD swap mapping, zap_huge_pmd() will clear the PMD and call
free_swap_and_cache() to decrease the swap reference count and maybe
free or split the huge swap cluster and the THP in swap cache.
Signed-off-by: "Huang, Ying"
Cc: "Kirill A. Shutemov"
Cc: Andrea Arcangeli
2 new /proc/vmstat fields are added, "thp_swapin" and
"thp_swapin_fallback", to count swapping a THP in from the swap device
in one piece and falling back to normal page swapin, respectively.
Signed-off-by: "Huang, Ying"
Cc: "Kirill A. Shutemov"
Cc: Andrea Arcangeli
Cc: Michal Ho
er() is changed to before unlocking sub-pages, so
that all sub-pages will be kept locked from when the THP is split until
the huge swap cluster is split. This makes the code much easier to
reason about.
Signed-off-by: "Huang, Ying"
Cc: "Kirill A. Shutemov"
Cc: Andrea Arcang
THP, add it into the swap cache. So later the contents
of the huge swap cluster can be read into the THP.
Signed-off-by: "Huang, Ying"
Cc: "Kirill A. Shutemov"
Cc: Andrea Arcangeli
Cc: Michal Hocko
Cc: Johannes Weiner
Cc: Shaohua Li
Cc: Hugh Dickins
Cc: Minchan Kim
Hi, Andrew, could you help me to check whether the overall design is
reasonable?
Hi, Hugh, Shaohua, Minchan and Rik, could you help me to review the
swap part of the patchset? Especially [02/21], [03/21], [04/21],
[05/21], [06/21], [07/21], [08/21], [09/21], [10/21], [11/21],
[12/21], [20/21], [2
uster will be split if
its PMD swap mapping count is 0.
The first parameter of swap_duplicate() is changed to return the swap
entry for which add_swap_count_continuation() should be called, because
we may need to call it for a swap entry in the middle of a huge swap
cluster.
Signed-off-by: "Huang, Ying"
mapping count and probably free the swap space
and the THP in swap cache too.
Signed-off-by: "Huang, Ying"
Cc: "Kirill A. Shutemov"
Cc: Andrea Arcangeli
Cc: Michal Hocko
Cc: Johannes Weiner
Cc: Shaohua Li
Cc: Hugh Dickins
Cc: Minchan Kim
Cc: Rik van Riel
Cc: Dave Hansen
instead. Some functions enabled by CONFIG_ARCH_ENABLE_THP_MIGRATION
are for page migration only, they are still enabled only for that.
Signed-off-by: "Huang, Ying"
Cc: "Kirill A. Shutemov"
Cc: Andrea Arcangeli
Cc: Michal Hocko
Cc: Johannes Weiner
Cc: Shaohua Li
Cc: Hugh D
.
If the swap cluster mapped by a PMD swap mapping has been split
already, we will split the PMD swap mapping and unuse the PTEs.
Signed-off-by: "Huang, Ying"
Cc: "Kirill A. Shutemov"
Cc: Andrea Arcangeli
Cc: Michal Hocko
Cc: Johannes Weiner
Cc: Shaohua Li
Cc: Hugh Dickins
fallback to normal page swapin.
Signed-off-by: "Huang, Ying"
Cc: "Kirill A. Shutemov"
Cc: Andrea Arcangeli
Cc: Michal Hocko
Cc: Johannes Weiner
Cc: Shaohua Li
Cc: Hugh Dickins
Cc: Minchan Kim
Cc: Rik van Riel
Cc: Dave Hansen
Cc: Naoya Horiguchi
Cc: Zi Yan
Cc: Daniel
uster be freed.
Signed-off-by: "Huang, Ying"
Cc: "Kirill A. Shutemov"
Cc: Andrea Arcangeli
Cc: Michal Hocko
Cc: Johannes Weiner
Cc: Shaohua Li
Cc: Hugh Dickins
Cc: Minchan Kim
Cc: Rik van Riel
Cc: Dave Hansen
Cc: Naoya Horiguchi
Cc: Zi Yan
Cc: Daniel Jordan
---
arc
calling try_to_free_swap() instead,
which will check all PTE swap mappings inside the huge swap cluster.
This fix could be folded into the patch "mm, swap: rid swapoff of
quadratic complexity" in the -mm patchset.
Signed-off-by: "Huang, Ying"
Cc: Vineeth Remanan Pillai
Cc: Kelley Nielsen
C
yscall_64+0x12/0x65
do_syscall_64+0x57/0x65
entry_SYSCALL_64_after_hwframe+0x49/0xbe
Signed-off-by: "Huang, Ying"
Cc: Vineeth Remanan Pillai
Cc: Kelley Nielsen
Cc: Rik van Riel
Cc: Matthew Wilcox
Cc: Hugh Dickins
---
mm/swapfile.c | 1 +
1 file changed, 1 insertion(+)
diff --
Huang Ying writes:
> Hi, Andrew, could you help me to check whether the overall design is
> reasonable?
>
> Hi, Hugh, Shaohua, Minchan and Rik, could you help me to review the
> swap part of the patchset? Especially [02/21], [03/21], [04/21],
> [05/21], [06/21], [07/21], [08
The help text of CONFIG_THP_SWAP is updated to reflect the latest
progress of THP (Transparent Huge Page) swap optimization.
Signed-off-by: "Huang, Ying"
Reviewed-by: Dan Williams
Cc: "Kirill A. Shutemov"
Cc: Andrea Arcangeli
Cc: Michal Hocko
Cc: Johannes Weiner
Cc: Shaohua
Daniel Jordan writes:
> On Sat, Dec 01, 2018 at 08:34:06AM +0800, Huang, Ying wrote:
>> Daniel Jordan writes:
>> > What do you think?
>>
>> I think that swapoff() which is the main user of try_to_unuse() isn't a
>> common operation in practica
Hi, Daniel,
Daniel Jordan writes:
> Hi Ying,
>
> On Tue, Nov 20, 2018 at 04:54:36PM +0800, Huang Ying wrote:
>> diff --git a/mm/swap_state.c b/mm/swap_state.c
>> index 97831166994a..1eedbc0aede2 100644
>> --- a/mm/swap_state.c
>> +++ b/mm/swap_state.c
>
rocesses exited within
14s, another 3 exited within 100s. For this commit, the first process
exited at 203s. That is, this commit makes memory allocation fairer
among processes, so that processes proceed at more similar speeds. But
this also raises the system memory footprint, which triggered much more
swap and thus a lower benchmark score.
In general, memory allocation fairness among processes should be a good
thing. So I think the report should have been a "performance
improvement" instead of a "performance regression".
Best Regards,
Huang, Ying