It seems merge windows is closed so bump up.

On Mon, Jun 15, 2015 at 10:06:54PM +0900, Minchan Kim wrote:
> We want to know per-process workingset size for smart memory management
> on userland and we use swap(ex, zram) heavily to maximize memory efficiency
> so workingset includes swap as well as RSS.
> 
> On such system, if there are lots of shared anonymous pages, it's
> really hard to figure out exactly how many each process consumes
> memory(ie, rss + wap) if the system has lots of shared anonymous
> memory(e.g, android).
> 
> This patch introduces SwapPss field on /proc/<pid>/smaps so we can get
> more exact workingset size per process.
> 
> Bongkyu tested it. Result is below.
> 
> 1. 50M used swap
> SwapTotal: 461976 kB
> SwapFree: 411192 kB
> 
> $ adb shell cat /proc/*/smaps | grep "SwapPss:" | awk '{sum += $2} END {print 
> sum}';
> 48236
> $ adb shell cat /proc/*/smaps | grep "Swap:" | awk '{sum += $2} END {print 
> sum}';
> 141184
> 
> 2. 240M used swap
> SwapTotal: 461976 kB
> SwapFree: 216808 kB
> 
> $ adb shell cat /proc/*/smaps | grep "SwapPss:" | awk '{sum += $2} END {print 
> sum}';
> 230315
> $ adb shell cat /proc/*/smaps | grep "Swap:" | awk '{sum += $2} END {print 
> sum}';
> 1387744
> 
> * from v1
>   * add more description - Andrew
>   * swp_swacount fix on !CONFIG_SWP - Sergey
>   * add what PSS is to proc.txt - Andrew
>     * Bring quote from lwn.net - Corbet
>       * http://lwn.net/Articles/230975/
> 
> Cc: Hugh Dickins <[email protected]>
> Cc: Sergey Senozhatsky <[email protected]>
> Cc: Jonathan Corbet <[email protected]>
> Report-and-Tested-by: Bongkyu Kim <[email protected]>
> Signed-off-by: Minchan Kim <[email protected]>
> ---
>  Documentation/filesystems/proc.txt | 18 +++++++++++-----
>  fs/proc/task_mmu.c                 | 18 ++++++++++++++--
>  include/linux/swap.h               |  6 ++++++
>  mm/swapfile.c                      | 42 
> ++++++++++++++++++++++++++++++++++++++
>  4 files changed, 77 insertions(+), 7 deletions(-)
> 
> diff --git a/Documentation/filesystems/proc.txt 
> b/Documentation/filesystems/proc.txt
> index c3b6b301d8b0..cfc765e6cfa6 100644
> --- a/Documentation/filesystems/proc.txt
> +++ b/Documentation/filesystems/proc.txt
> @@ -423,6 +423,7 @@ Private_Dirty:         0 kB
>  Referenced:          892 kB
>  Anonymous:             0 kB
>  Swap:                  0 kB
> +SwapPss:               0 kB
>  KernelPageSize:        4 kB
>  MMUPageSize:           4 kB
>  Locked:              374 kB
> @@ -432,16 +433,23 @@ the first of these lines shows the same information as 
> is displayed for the
>  mapping in /proc/PID/maps.  The remaining lines show the size of the mapping
>  (size), the amount of the mapping that is currently resident in RAM (RSS), 
> the
>  process' proportional share of this mapping (PSS), the number of clean and
> -dirty private pages in the mapping.  Note that even a page which is part of a
> -MAP_SHARED mapping, but has only a single pte mapped, i.e.  is currently used
> -by only one process, is accounted as private and not as shared.  "Referenced"
> -indicates the amount of memory currently marked as referenced or accessed.
> +dirty private pages in the mapping.
> +
> +The "proportional set size" (PSS) of a process is the count of pages it has
> +in memory, where each page is divided by the number of processes sharing it.
> +So if a process has 1000 pages all to itself, and 1000 shared with one other
> +process, its PSS will be 1500.
> +Note that even a page which is part of a MAP_SHARED mapping, but has only
> +a single pte mapped, i.e.  is currently used by only one process, is 
> accounted
> +as private and not as shared.
> +"Referenced" indicates the amount of memory currently marked as referenced or
> +accessed.
>  "Anonymous" shows the amount of memory that does not belong to any file.  
> Even
>  a mapping associated with a file may contain anonymous pages: when 
> MAP_PRIVATE
>  and a page is modified, the file page is replaced by a private anonymous 
> copy.
>  "Swap" shows how much would-be-anonymous memory is also used, but out on
>  swap.
> -
> +"SwapPss" shows proportional swap share of this mapping.
>  "VmFlags" field deserves a separate description. This member represents the 
> kernel
>  flags associated with the particular virtual memory area in two letter 
> encoded
>  manner. The codes are the following:
> diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
> index 6dee68d013ff..d537899f4b25 100644
> --- a/fs/proc/task_mmu.c
> +++ b/fs/proc/task_mmu.c
> @@ -446,6 +446,7 @@ struct mem_size_stats {
>       unsigned long anonymous_thp;
>       unsigned long swap;
>       u64 pss;
> +     u64 swap_pss;
>  };
>  
>  static void smaps_account(struct mem_size_stats *mss, struct page *page,
> @@ -492,9 +493,20 @@ static void smaps_pte_entry(pte_t *pte, unsigned long 
> addr,
>       } else if (is_swap_pte(*pte)) {
>               swp_entry_t swpent = pte_to_swp_entry(*pte);
>  
> -             if (!non_swap_entry(swpent))
> +             if (!non_swap_entry(swpent)) {
> +                     int mapcount;
> +
>                       mss->swap += PAGE_SIZE;
> -             else if (is_migration_entry(swpent))
> +                     mapcount = swp_swapcount(swpent);
> +                     if (mapcount >= 2) {
> +                             u64 pss_delta = (u64)PAGE_SIZE << PSS_SHIFT;
> +
> +                             do_div(pss_delta, mapcount);
> +                             mss->swap_pss += pss_delta;
> +                     } else {
> +                             mss->swap_pss += (u64)PAGE_SIZE << PSS_SHIFT;
> +                     }
> +             } else if (is_migration_entry(swpent))
>                       page = migration_entry_to_page(swpent);
>       }
>  
> @@ -638,6 +650,7 @@ static int show_smap(struct seq_file *m, void *v, int 
> is_pid)
>                  "Anonymous:      %8lu kB\n"
>                  "AnonHugePages:  %8lu kB\n"
>                  "Swap:           %8lu kB\n"
> +                "SwapPss:        %8lu kB\n"
>                  "KernelPageSize: %8lu kB\n"
>                  "MMUPageSize:    %8lu kB\n"
>                  "Locked:         %8lu kB\n",
> @@ -652,6 +665,7 @@ static int show_smap(struct seq_file *m, void *v, int 
> is_pid)
>                  mss.anonymous >> 10,
>                  mss.anonymous_thp >> 10,
>                  mss.swap >> 10,
> +                (unsigned long)(mss.swap_pss >> (10 + PSS_SHIFT)),
>                  vma_kernel_pagesize(vma) >> 10,
>                  vma_mmu_pagesize(vma) >> 10,
>                  (vma->vm_flags & VM_LOCKED) ?
> diff --git a/include/linux/swap.h b/include/linux/swap.h
> index cee108cbe2d5..afc9eb3cba48 100644
> --- a/include/linux/swap.h
> +++ b/include/linux/swap.h
> @@ -432,6 +432,7 @@ extern unsigned int count_swap_pages(int, int);
>  extern sector_t map_swap_page(struct page *, struct block_device **);
>  extern sector_t swapdev_block(int, pgoff_t);
>  extern int page_swapcount(struct page *);
> +extern int swp_swapcount(swp_entry_t entry);
>  extern struct swap_info_struct *page_swap_info(struct page *);
>  extern int reuse_swap_page(struct page *);
>  extern int try_to_free_swap(struct page *);
> @@ -523,6 +524,11 @@ static inline int page_swapcount(struct page *page)
>       return 0;
>  }
>  
> +static inline int swp_swapcount(swp_entry_t entry)
> +{
> +     return 0;
> +}
> +
>  #define reuse_swap_page(page)        (page_mapcount(page) == 1)
>  
>  static inline int try_to_free_swap(struct page *page)
> diff --git a/mm/swapfile.c b/mm/swapfile.c
> index a7e72103f23b..7a6bd1e5a8e9 100644
> --- a/mm/swapfile.c
> +++ b/mm/swapfile.c
> @@ -875,6 +875,48 @@ int page_swapcount(struct page *page)
>  }
>  
>  /*
> + * How many references to @entry are currently swapped out?
> + * This considers COUNT_CONTINUED so it returns exact answer.
> + */
> +int swp_swapcount(swp_entry_t entry)
> +{
> +     int count, tmp_count, n;
> +     struct swap_info_struct *p;
> +     struct page *page;
> +     pgoff_t offset;
> +     unsigned char *map;
> +
> +     p = swap_info_get(entry);
> +     if (!p)
> +             return 0;
> +
> +     count = swap_count(p->swap_map[swp_offset(entry)]);
> +     if (!(count & COUNT_CONTINUED))
> +             goto out;
> +
> +     count &= ~COUNT_CONTINUED;
> +     n = SWAP_MAP_MAX + 1;
> +
> +     offset = swp_offset(entry);
> +     page = vmalloc_to_page(p->swap_map + offset);
> +     offset &= ~PAGE_MASK;
> +     VM_BUG_ON(page_private(page) != SWP_CONTINUED);
> +
> +     do {
> +             page = list_entry(page->lru.next, struct page, lru);
> +             map = kmap_atomic(page) + offset;
> +             tmp_count = *map;
> +             kunmap_atomic(map);
> +
> +             count += (tmp_count & ~COUNT_CONTINUED) * n;
> +             n *= (SWAP_CONT_MAX + 1);
> +     } while (tmp_count & COUNT_CONTINUED);
> +out:
> +     spin_unlock(&p->lock);
> +     return count;
> +}
> +
> +/*
>   * We can write to an anon page without COW if there are no other references
>   * to it.  And as a side-effect, free up its swap: because the old content
>   * on disk will never be read, and seeking back there to write new content
> -- 
> 1.9.1
> 

-- 
Kind regards,
Minchan Kim
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Reply via email to