Re: [PATCH v6 0/6] workingset protection/detection on the anonymous LRU list

2020-06-25 Thread Joonsoo Kim
On Wed, Jun 17, 2020 at 2:26 PM, wrote:
>
> From: Joonsoo Kim 
>
> Hello,
>
> This patchset implements workingset protection and detection on
> the anonymous LRU list.
>
> * Changes on v6
> - rework to reflect a new LRU balance model
> - remove memcg charge timing stuff on v5 since alternative is already
> merged on mainline
> - remove readahead stuff on v5 (reason is the same with above)
> - clear shadow entry if corresponding swap entry is deleted
> (mm/swapcache: support to handle the exceptional entries in swapcache)
> - change experiment environment
> (from ssd swap to ram swap, for fast evaluation and for reducing side-effect 
> of I/O)
> - update performance number

Hello, Johannes.

Could you review the v6 patchset?

Some minor things have changed, so I would really welcome another review
of the patchset. In particular, "mm/swap: implement workingset detection
for anonymous LRU" hasn't received any ack yet. :)

Thanks.


[PATCH v6 0/6] workingset protection/detection on the anonymous LRU list

2020-06-16 Thread js1304
From: Joonsoo Kim 

Hello,

This patchset implements workingset protection and detection on
the anonymous LRU list.

* Changes on v6
- rework to reflect a new LRU balance model
- remove memcg charge timing stuff on v5 since alternative is already
merged on mainline
- remove readahead stuff on v5 (reason is the same with above)
- clear shadow entry if corresponding swap entry is deleted
(mm/swapcache: support to handle the exceptional entries in swapcache)
- change experiment environment
(from ssd swap to ram swap, for fast evaluation and for reducing side-effect of 
I/O)
- update performance number

* Changes on v5
- change memcg charge timing for the swapped-in page (fault -> swap-in)
- avoid readahead if previous owner of the swapped-out page isn't me
- use another lruvec to update the reclaim_stat for a new anonymous page
- add two more cases to fix up the reclaim_stat

* Changes on v4
- In the patch "mm/swapcache: support to handle the exceptional
entries in swapcache":
-- replace the word "value" with "exceptional entries"
-- add to handle the shadow entry in add_to_swap_cache()
-- support the huge page
-- remove the registration code for shadow shrinker

- remove the patch "mm/workingset: use the node counter
if memcg is the root memcg" since workingset detection for
anonymous page doesn't use shadow shrinker now
- minor style fixes

* Changes on v3
- rework the patch, "mm/vmscan: protect the workingset on anonymous LRU"
(use almost same reference tracking algorithm to the one for the file
mapped page)

* Changes on v2
- fix a critical bug that uses out of index lru list in
workingset_refault()
- fix a bug that reuses the rotate value for previous page

* SUBJECT
workingset protection

* PROBLEM
In the current implementation, a newly created or swapped-in anonymous
page starts on the active list. Growing the active list triggers
rebalancing of the active/inactive lists, so old pages on the active
list are demoted to the inactive list. Hence, hot pages on the active
list aren't protected at all.

The following is an example of this situation.

Assume that there are 50 hot pages on the active list and that the system
can hold 100 pages in total. Numbers denote the number of pages on the
active/inactive list (active | inactive). (h) stands for hot pages and
(uo) stands for used-once pages.

1. 50 hot pages on active list
50(h) | 0

2. workload: 50 newly created (used-once) pages
50(uo) | 50(h)

3. workload: another 50 newly created (used-once) pages
50(uo) | 50(uo), swap-out 50(h)

As we can see, the hot pages are swapped out, which will cause swap-ins
later.

* SOLUTION
Since this is what we want to avoid, this patchset implements workingset
protection. As on the file LRU list, a newly created or swapped-in
anonymous page starts on the inactive list. Also, as on the file LRU
list, the page is promoted once it is referenced enough. This simple
modification changes the above example as follows.

1. 50 hot pages on active list
50(h) | 0

2. workload: 50 newly created (used-once) pages
50(h) | 50(uo)

3. workload: another 50 newly created (used-once) pages
50(h) | 50(uo), swap-out 50(uo)

The hot pages remain on the active list. :)
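
The two walkthroughs above can be reproduced with a toy model. This is
only a sketch, not kernel code: the names (run, TOTAL, ACTIVE_MAX) are
made up for illustration, and it assumes the active list is capped at
half of memory, which matches the numbers in the example but is not how
the kernel actually balances the lists.

```python
from collections import deque

TOTAL = 100       # pages the toy system can hold (from the example)
ACTIVE_MAX = 50   # assumed cap on the active list, for illustration only


def run(start_on_active):
    """Create 100 used-once pages on top of 50 hot pages.

    Returns (hot pages still on the active list, hot pages swapped out).
    The left end of each deque is the list head, the right end is the tail.
    """
    active = deque(f"h{i}" for i in range(50))
    inactive = deque()
    swapped_out = []

    for i in range(100):
        page = f"uo{i}"
        if start_on_active:
            # Old behaviour: new anonymous pages start on the active list,
            # and the oldest active pages are demoted to make room.
            active.appendleft(page)
            while len(active) > ACTIVE_MAX:
                inactive.appendleft(active.pop())
        else:
            # Patchset behaviour: new anonymous pages start on the inactive list.
            inactive.appendleft(page)
        # Reclaim from the inactive tail when memory is full.
        while len(active) + len(inactive) > TOTAL:
            swapped_out.append(inactive.pop())

    hot_active = sum(p.startswith("h") for p in active)
    hot_swapped = sum(p.startswith("h") for p in swapped_out)
    return hot_active, hot_swapped


print("start on active  :", run(True))
print("start on inactive:", run(False))
```

With new pages starting on the active list, all 50 hot pages end up
swapped out; with new pages starting on the inactive list, the hot pages
stay where they are and only used-once pages are reclaimed.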

* EXPERIMENT
I tested this scenario on my test bed and confirmed that the problem
occurs with the current implementation. I also checked that it is fixed
by this patchset.


* SUBJECT
workingset detection

* PROBLEM
The later part of the patchset implements workingset detection for
the anonymous LRU list. There is a corner case in which workingset
protection can cause thrashing. If we can avoid this thrashing with
workingset detection, we get better performance.

The following is an example of thrashing due to workingset protection.
(wh) stands for pages that will become hot.

1. 50 hot pages on active list
50(h) | 0

2. workload: 50 newly created (will be hot) pages
50(h) | 50(wh)

3. workload: another 50 newly created (used-once) pages
50(h) | 50(uo), swap-out 50(wh)

4. workload: 50 (will be hot) pages
50(h) | 50(wh), swap-in 50(wh)

5. workload: another 50 newly created (used-once) pages
50(h) | 50(uo), swap-out 50(wh)

6. repeat 4, 5

Without workingset detection, the pages in this kind of workload are
never promoted and the thrashing continues forever.
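
The loop in steps 4 and 5 can be illustrated with a toy model. This is a
sketch under simplifying assumptions, not kernel code: the function name
thrash and the page labels are made up, a page is promoted only when it
is referenced a second time while still on the inactive list, and there
is no workingset detection at all.

```python
from collections import deque

TOTAL = 100  # pages the toy system can hold (from the example)


def thrash(rounds):
    """Repeat: touch 50 will-be-hot pages, then create 50 used-once pages.

    Returns (number of swap-ins, number of will-be-hot pages promoted).
    """
    active = deque(f"h{i}" for i in range(50))  # left end is the list head
    inactive = deque()
    swapped = set()
    swap_ins = 0
    next_uo = 0

    def touch(page):
        nonlocal swap_ins
        if page in active:
            return
        if page in inactive:
            # Second reference while resident: promote to the active list.
            inactive.remove(page)
            active.appendleft(page)
            return
        if page in swapped:
            swapped.discard(page)
            swap_ins += 1
        # New and swapped-in pages both start on the inactive list.
        inactive.appendleft(page)
        while len(active) + len(inactive) > TOTAL:
            swapped.add(inactive.pop())

    for _ in range(rounds):
        for i in range(50):
            touch(f"wh{i}")
        for _ in range(50):
            touch(f"uo{next_uo}")
            next_uo += 1

    promoted = sum(p.startswith("wh") for p in active)
    return swap_ins, promoted


print(thrash(5))
```

Every round after the first swaps in all 50 will-be-hot pages again, and
none of them ever reaches the active list, because each one is reclaimed
before its next reference arrives.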

* SOLUTION
Therefore, this patchset implements workingset detection.
All the infrastructure for workingset detection is already implemented,
so there is not much work to do. First, extend the workingset detection
code to deal with the anonymous LRU list. Then, make the swap cache handle
the exceptional value for the shadow entry. Lastly, install/retrieve
the shadow value into/from the swap cache and check the refault distance.
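
To make the shadow entry and refault distance idea concrete, here is a
minimal sketch. It is an assumption-laden toy, not the patchset's code:
the class ToyLruvec and its methods are hypothetical, the shadow entries
live in a plain dict rather than in the swap cache, the age counter only
advances on eviction, and the rule "activate if the refault distance is
no larger than the active list" is a simplification modelled on the file
LRU logic.

```python
class ToyLruvec:
    """Hypothetical model of shadow entries and the refault distance check."""

    def __init__(self):
        self.nonresident_age = 0   # advances once per eviction in this toy
        self.shadows = {}          # page -> age recorded when it was evicted

    def evict(self, page):
        # Install a shadow entry in place of the evicted page.
        self.shadows[page] = self.nonresident_age
        self.nonresident_age += 1

    def refault(self, page, active_size):
        """Return True if the swapped-in page should be activated."""
        eviction = self.shadows.pop(page, None)
        if eviction is None:
            return False           # no shadow entry: treat as a new page
        refault_distance = self.nonresident_age - eviction
        # A small distance means the page would have stayed resident if the
        # inactive list had been larger by roughly the active list's size.
        return refault_distance <= active_size


lru = ToyLruvec()
lru.evict("wh0")
for i in range(49):
    lru.evict(f"uo{i}")
print(lru.refault("wh0", active_size=50))
```

In the thrashing example, a will-be-hot page refaults after about 50
other evictions, so with 50 pages on the active list this check would
activate it and break the swap-in/swap-out loop.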

* EXPERIMENT
I wrote a test program that imitates the above scenario and confirmed
that the problem exists. Then, I checked that this patchset fixes it.

My test setup is a virtual machine with 8 CPUs and 6100 MB of memory.
However, only about 280 MB of that memory is available to the test
program, because the system uses a large RAM-backed swap and a large
ramdisk to capture the trace.

The test scenario is as follows.

1.