On 04/08/2014 06:34 AM, Mel Gorman wrote:
> zone_reclaim_mode causes processes to prefer reclaiming memory from local
> node instead of spilling over to other nodes. This made sense initially when
> NUMA machines were almost exclusively HPC and the workload was partitioned
> into nodes. The NUMA penalties were sufficiently high to justify reclaiming
> the memory. On current machines and workloads it is often the case that
> zone_reclaim_mode destroys performance but not all users know how to detect
> this. Favour the common case and disable it by default. Users that are
> sophisticated enough to know they need zone_reclaim_mode will detect it.
> 
> Signed-off-by: Mel Gorman <mgor...@suse.de>

Reviewed-by: Zhang Yanfei <zhangyan...@cn.fujitsu.com>

> ---
>  Documentation/sysctl/vm.txt | 17 +++++++++--------
>  mm/page_alloc.c             |  2 --
>  2 files changed, 9 insertions(+), 10 deletions(-)
> 
> diff --git a/Documentation/sysctl/vm.txt b/Documentation/sysctl/vm.txt
> index d614a9b..ff5da70 100644
> --- a/Documentation/sysctl/vm.txt
> +++ b/Documentation/sysctl/vm.txt
> @@ -751,16 +751,17 @@ This is value ORed together of
>  2    = Zone reclaim writes dirty pages out
>  4    = Zone reclaim swaps pages
>  
> -zone_reclaim_mode is set during bootup to 1 if it is determined that pages
> -from remote zones will cause a measurable performance reduction. The
> -page allocator will then reclaim easily reusable pages (those page
> -cache pages that are currently not used) before allocating off node pages.
> -
> -It may be beneficial to switch off zone reclaim if the system is
> -used for a file server and all of memory should be used for caching files
> -from disk. In that case the caching effect is more important than
> +zone_reclaim_mode is disabled by default.  For file servers or workloads
> +that benefit from having their data cached, zone_reclaim_mode should be
> +left disabled as the caching effect is likely to be more important than
>  data locality.
>  
> +zone_reclaim may be enabled if it's known that the workload is partitioned
> +such that each partition fits within a NUMA node and that accessing remote
> +memory would cause a measurable performance reduction.  The page allocator
> +will then reclaim easily reusable pages (those page cache pages that are
> +currently not used) before allocating off node pages.
> +
>  Allowing zone reclaim to write out pages stops processes that are
>  writing large amounts of data from dirtying pages on other nodes. Zone
>  reclaim will write out dirty pages if a zone fills up and so effectively
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 3bac76a..a256f85 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -1873,8 +1873,6 @@ static void __paginginit init_zone_allows_reclaim(int nid)
>       for_each_online_node(i)
>               if (node_distance(nid, i) <= RECLAIM_DISTANCE)
>                       node_set(i, NODE_DATA(nid)->reclaim_nodes);
> -             else
> -                     zone_reclaim_mode = 1;
>  }
>  
>  #else        /* CONFIG_NUMA */
> 
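
For anyone who wants to check what a given machine ends up with, here is a
minimal userspace sketch that reads the sysctl and decodes the ORed bits
described in the vm.txt hunk above (1 = zone reclaim on, 2 = write out dirty
pages, 4 = swap pages). The helper names are hypothetical and not part of the
patch, and it assumes a CONFIG_NUMA kernel that exposes
/proc/sys/vm/zone_reclaim_mode.

```c
#include <stdio.h>

/* Bit values as documented in Documentation/sysctl/vm.txt. */
#define ZRM_RECLAIM_ON  1   /* zone reclaim on */
#define ZRM_WRITE       2   /* zone reclaim writes dirty pages out */
#define ZRM_SWAP        4   /* zone reclaim swaps pages */

/* Hypothetical helper: parse the sysctl text, return -1 if it is not a
 * non-negative integer. */
int parse_zone_reclaim_mode(const char *text)
{
	int mode;

	if (text == NULL || sscanf(text, "%d", &mode) != 1 || mode < 0)
		return -1;
	return mode;
}

/* Hypothetical helper: read the mode from a file such as
 * /proc/sys/vm/zone_reclaim_mode; return -1 if it cannot be read
 * (for example on a kernel built without CONFIG_NUMA). */
int read_zone_reclaim_mode(const char *path)
{
	char buf[32];
	FILE *f = fopen(path, "r");

	if (f == NULL)
		return -1;
	if (fgets(buf, sizeof(buf), f) == NULL) {
		fclose(f);
		return -1;
	}
	fclose(f);
	return parse_zone_reclaim_mode(buf);
}

/* Hypothetical helper: format the three documented bits into out. */
void describe_zone_reclaim_mode(int mode, char *out, size_t len)
{
	snprintf(out, len, "reclaim=%s write=%s swap=%s",
		 (mode & ZRM_RECLAIM_ON) ? "on" : "off",
		 (mode & ZRM_WRITE) ? "on" : "off",
		 (mode & ZRM_SWAP) ? "on" : "off");
}
```

Calling read_zone_reclaim_mode("/proc/sys/vm/zone_reclaim_mode") and passing
the result to describe_zone_reclaim_mode() should report all three bits off on
a default boot with this patch applied, unless an administrator has set the
sysctl explicitly.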


-- 
Thanks.
Zhang Yanfei