[PATCH v3] mlock: fix mlock count can not decrease in race condition

2017-05-25 Thread Yisheng Xie
Kefeng reported that when running the following test, the Mlocked
counter in /proc/meminfo increases permanently:

 [1] testcase
 linux:~ # cat test_mlockall.sh
 grep Mlocked /proc/meminfo
 for j in `seq 0 10`
 do
	for i in `seq 4 15`
	do
		./p_mlockall >> log &
	done
	sleep 0.2
 done
 # wait some time to let the mlock counter decrease; 5s may not be enough
 sleep 5
 grep Mlocked /proc/meminfo

 linux:~ # cat p_mlockall.c
 #include <sys/mman.h>
 #include <stdlib.h>
 #include <stdio.h>

 #define SPACE_LEN	4096

 int main(int argc, char **argv)
 {
	int ret;
	void *adr = malloc(SPACE_LEN);

	if (!adr)
		return -1;

	ret = mlockall(MCL_CURRENT | MCL_FUTURE);
	printf("mlockall ret = %d\n", ret);

	ret = munlockall();
	printf("munlockall ret = %d\n", ret);

	free(adr);
	return 0;
 }

In __munlock_pagevec() we should decrement NR_MLOCK for each page where
we clear the PageMlocked flag. Commit 1ebb7cc6a583 ("mm: munlock: batch
NR_MLOCK zone state updates") introduced a bug where we don't decrement
NR_MLOCK for pages where we clear the flag but fail to isolate them
from the LRU list (e.g. when the pages are on some other CPU's per-cpu
pagevec). Since PageMlocked stays cleared, the NR_MLOCK accounting gets
permanently disrupted.

Fix it by counting the number of pages whose PageMlocked flag is
cleared, as the sketch below illustrates.
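
For illustration, here is a minimal user-space model of the two
accounting schemes (hypothetical code, not part of the patch; the page
states are invented). It shows how the old formula misses the NR_MLOCK
decrement for a page whose flag is cleared but whose isolation fails,
while the new scheme also keeps the successful-isolation fastpath free
of any counting:

 #include <stdio.h>
 #include <stdbool.h>

 #define NR	4

 int main(void)
 {
	/*
	 * page 0: flag set, isolation succeeds
	 * page 1: flag set, isolation fails (e.g. the page sits on
	 *         another CPU's per-cpu pagevec)
	 * page 2: flag already clear
	 * page 3: flag set, isolation succeeds
	 */
	bool mlocked[NR]  = { true, true,  false, true };
	bool isolated[NR] = { true, false, false, true };
	int putback = 0, delta_new = -NR, delta_old;

	for (int i = 0; i < NR; i++) {
		if (mlocked[i]) {		/* TestClearPageMlocked() */
			if (isolated[i])
				continue;	/* fastpath: no counting */
		} else {
			delta_new++;		/* flag was already clear */
		}
		putback++;			/* page goes to pvec_putback */
	}
	delta_old = -NR + putback;

	/*
	 * Pages 0, 1 and 3 had their flag cleared, so NR_MLOCK must
	 * drop by 3. The old formula credits page 1 back because it
	 * was put back, yielding -2 and leaking one count forever;
	 * the new scheme correctly yields -3.
	 */
	printf("old delta: %d, new delta: %d\n", delta_old, delta_new);
	return 0;
 }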

Fixes: 1ebb7cc6a583 ("mm: munlock: batch NR_MLOCK zone state updates")
Signed-off-by: Yisheng Xie 
Reported-by: Kefeng Wang 
Tested-by: Kefeng Wang 
Suggested-by: Vlastimil Babka 
Acked-by: Vlastimil Babka 
Cc: Joern Engel 
Cc: Mel Gorman 
Cc: Michel Lespinasse 
Cc: Hugh Dickins 
Cc: Rik van Riel 
Cc: Johannes Weiner 
Cc: Michal Hocko 
Cc: Xishi Qiu 
Cc: zhongjiang 
Cc: Hanjun Guo 
Cc: 
---
v2:
 - initialize delta_munlocked to -nr so the fastpath doesn't need to do the increment - Vlastimil

v3:
 - reword the changelog to make it clearer - Vlastimil

Hi Andrew:
Could you please help to fold this?

Thanks
Yisheng Xie

 mm/mlock.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/mm/mlock.c b/mm/mlock.c
index c483c5c..b562b55 100644
--- a/mm/mlock.c
+++ b/mm/mlock.c
@@ -284,7 +284,7 @@ static void __munlock_pagevec(struct pagevec *pvec, struct zone *zone)
 {
 	int i;
 	int nr = pagevec_count(pvec);
-	int delta_munlocked;
+	int delta_munlocked = -nr;
 	struct pagevec pvec_putback;
 	int pgrescued = 0;
 
@@ -304,6 +304,8 @@ static void __munlock_pagevec(struct pagevec *pvec, struct zone *zone)
 				continue;
 			else
 				__munlock_isolation_failed(page);
+		} else {
+			delta_munlocked++;
 		}
 
 		/*
@@ -315,7 +317,6 @@ static void __munlock_pagevec(struct pagevec *pvec, struct zone *zone)
 		pagevec_add(&pvec_putback, pvec->pages[i]);
 		pvec->pages[i] = NULL;
 	}
-	delta_munlocked = -nr + pagevec_count(&pvec_putback);
 	__mod_zone_page_state(zone, NR_MLOCK, delta_munlocked);
 	spin_unlock_irq(zone_lru_lock(zone));
 
-- 
1.7.12.4


