On 12/3/20 12:36 PM, Zhaoyang Huang wrote:
> "Free swap -4kB" shows up on my system. It is caused by
> get_swap_page_of_type or get_swap_pages racing with show_mem. Remove the race
> here.
> 
> Signed-off-by: Zhaoyang Huang <[email protected]>
> ---
>  mm/swapfile.c | 7 +++----
>  1 file changed, 3 insertions(+), 4 deletions(-)
> 
> diff --git a/mm/swapfile.c b/mm/swapfile.c
> index cf63b5f..13201b6 100644
> --- a/mm/swapfile.c
> +++ b/mm/swapfile.c
> @@ -974,6 +974,8 @@ int get_swap_pages(int n_goal, swp_entry_t swp_entries[], int entry_size)
>       /* Only single cluster request supported */
>       WARN_ON_ONCE(n_goal > 1 && size == SWAPFILE_CLUSTER);
>  
> +     spin_lock(&swap_avail_lock);
> +
>       avail_pgs = atomic_long_read(&nr_swap_pages) / size;
>       if (avail_pgs <= 0)
>               goto noswap;

This goto leaves the function with swap_avail_lock still held, so that's a bug.
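
To illustrate the shape an early exit would need if the lock really were taken
first, here is a minimal user-space sketch. All names are hypothetical: an
atomic_flag stands in for swap_avail_lock and a plain long for nr_swap_pages.
It is not a proposed fix, only a model of "every exit after the lock also
unlocks".

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical user-space stand-ins, not kernel symbols. */
static atomic_flag model_lock = ATOMIC_FLAG_INIT; /* plays swap_avail_lock */
static long model_nr_pages;                       /* plays nr_swap_pages */

static void model_lock_acquire(void)
{
	while (atomic_flag_test_and_set(&model_lock))
		; /* spin until the flag was previously clear */
}

static void model_lock_release(void)
{
	atomic_flag_clear(&model_lock);
}

/* True if nobody holds the lock; leaves the lock free on return. */
bool model_lock_is_free(void)
{
	if (atomic_flag_test_and_set(&model_lock))
		return false; /* it was already held */
	atomic_flag_clear(&model_lock);
	return true;
}

/* Returns how many pages were reserved, at most n_goal. */
long model_get_pages(long n_goal)
{
	long n_ret = 0;

	assert(n_goal >= 0);
	model_lock_acquire();
	if (model_nr_pages <= 0)
		goto out_unlock; /* the early exit still drops the lock */
	if (n_goal > model_nr_pages)
		n_goal = model_nr_pages;
	model_nr_pages -= n_goal;
	n_ret = n_goal;
out_unlock:
	model_lock_release();
	return n_ret;
}
```

With the counter at 0, model_get_pages(1) returns 0 and model_lock_is_free()
is still true afterwards, which is what the patched hunk above fails to do.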

> @@ -986,8 +988,6 @@ int get_swap_pages(int n_goal, swp_entry_t swp_entries[], int entry_size)
>  
>       atomic_long_sub(n_goal * size, &nr_swap_pages);
>  
> -     spin_lock(&swap_avail_lock);
> -

Is the problem that while we adjust n_goal with a min3(..., avail_pgs), somebody
else can decrease nr_swap_pages in the meantime and then we underflow? If so,
the spin lock doesn't seem to eliminate all such cases, since e.g.
get_swap_page_of_type doesn't run under the same lock, AFAIK.
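
The interleaving I have in mind can be modelled in user space as below. This is
only a sketch of the suspected race, with hypothetical names: model_nr_pages
stands in for nr_swap_pages, and the two steps of a caller (clamp, then
subtract) are split into separate functions so the ordering can be replayed
deterministically. The kB conversion assumes 4 kB pages.

```c
#include <assert.h>

static long model_nr_pages; /* hypothetical stand-in for nr_swap_pages */

/* Step 1 of a caller: read the counter and clamp the request to it. */
long model_clamp(long n_goal)
{
	long avail = model_nr_pages;

	assert(n_goal >= 0);
	if (avail <= 0)
		return 0;
	return n_goal < avail ? n_goal : avail;
}

/* Step 2 of a caller: subtract the previously clamped amount. */
void model_commit(long n)
{
	model_nr_pages -= n;
}

/* Counter as a show_mem-like reader would report it, assuming 4 kB pages. */
long model_free_kb(void)
{
	return model_nr_pages * 4;
}
```

If the counter is 1 and two callers both run model_clamp(1) before either runs
model_commit(), the counter ends up at -1, and model_free_kb() reports -4.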

>  start_over:
>       node = numa_node_id();
>       plist_for_each_entry_safe(si, next, &swap_avail_heads[node], avail_lists[node]) {
> @@ -1061,14 +1061,13 @@ swp_entry_t get_swap_page_of_type(int type)
>  
>       spin_lock(&si->lock);
>       if (si->flags & SWP_WRITEOK) {
> -             atomic_long_dec(&nr_swap_pages);
>               /* This is called for allocating swap entry, not cache */
>               offset = scan_swap_map(si, 1);
>               if (offset) {
> +                     atomic_long_dec(&nr_swap_pages);
>                       spin_unlock(&si->lock);
>                       return swp_entry(type, offset);
>               }
> -             atomic_long_inc(&nr_swap_pages);

This hunk looks safer, unless I'm missing something. Did you check whether it's
enough to prevent the negative values on your systems?
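
For comparison, a small user-space model of the two orderings in this hunk. The
names are hypothetical: model_scan() is a trivial stand-in for scan_swap_map()
that succeeds only when a slot is left, and each function returns the counter
value that a concurrent reader could observe at that point in the call.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

static atomic_long model_free_pages; /* hypothetical nr_swap_pages stand-in */

/* Hypothetical stand-in for scan_swap_map(): succeeds if a slot is left. */
static bool model_scan(long slots_left)
{
	return slots_left > 0;
}

/* Old ordering: decrement first, roll back if the scan fails. */
long model_alloc_dec_first(long slots_left)
{
	long seen;

	assert(slots_left >= 0);
	atomic_fetch_sub(&model_free_pages, 1);
	/* What a racing reader sees before the rollback below. */
	seen = atomic_load(&model_free_pages);
	if (!model_scan(slots_left))
		atomic_fetch_add(&model_free_pages, 1);
	return seen;
}

/* New ordering: decrement only after the scan has succeeded. */
long model_alloc_dec_after(long slots_left)
{
	assert(slots_left >= 0);
	if (model_scan(slots_left))
		atomic_fetch_sub(&model_free_pages, 1);
	return atomic_load(&model_free_pages);
}
```

Starting from a counter of 0 with no slots left, the first variant exposes -1
for a moment even though it ends at 0, while the second never leaves 0.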

>       }
>       spin_unlock(&si->lock);
>  fail:
> 
