Updates Documentation/vm/numa_memory_policy.txt and Documentation/filesystems/tmpfs.txt to describe optional mempolicy mode flags.

Cc: Paul Jackson <[EMAIL PROTECTED]>
Cc: Christoph Lameter <[EMAIL PROTECTED]>
Cc: Lee Schermerhorn <[EMAIL PROTECTED]>
Cc: Andi Kleen <[EMAIL PROTECTED]>
Signed-off-by: David Rientjes <[EMAIL PROTECTED]>
---
 Documentation/filesystems/tmpfs.txt     |   11 ++++++++
 Documentation/vm/numa_memory_policy.txt |   41 +++++++++++++++++++++++++++----
 2 files changed, 47 insertions(+), 5 deletions(-)

diff --git a/Documentation/filesystems/tmpfs.txt b/Documentation/filesystems/tmpfs.txt
--- a/Documentation/filesystems/tmpfs.txt
+++ b/Documentation/filesystems/tmpfs.txt
@@ -92,6 +92,17 @@ NodeList format is a comma-separated list of decimal numbers and ranges,
 a range being two hyphen-separated decimal numbers, the smallest and
 largest node numbers in the range. For example, mpol=bind:0-3,5,7,9-15
 
+It is possible to specify a static NodeList by appending '=static' to
+the memory policy mode in the mpol= argument. This will require that
+tasks or VMAs restricted to a subset of allowed nodes are only allowed
+to effect the memory policy over those nodes. No remapping of the
+NodeList when the policy is rebound, which is the default behavior, is
+allowed when '=static' is specified. For example:
+
+mpol=bind=static:NodeList	will only allocate from each node in
+				the NodeList without remapping the
+				NodeList if the policy is rebound
+
 Note that trying to mount a tmpfs with an mpol option will fail if the
 running kernel does not support NUMA; and will fail if its nodelist
 specifies a node which is not online. If your system relies on that
diff --git a/Documentation/vm/numa_memory_policy.txt b/Documentation/vm/numa_memory_policy.txt
--- a/Documentation/vm/numa_memory_policy.txt
+++ b/Documentation/vm/numa_memory_policy.txt
@@ -135,9 +135,11 @@ most general to most specific:
 
 Components of Memory Policies
 
-	A Linux memory policy is a tuple consisting of a "mode" and an optional set
-	of nodes. The mode determine the behavior of the policy, while the
-	optional set of nodes can be viewed as the arguments to the behavior.
+	A Linux memory policy consists of a "mode", optional mode flags, and an
+	optional set of nodes. The mode determines the behavior of the policy,
+	the optional mode flags determine the behavior of the mode, and the
+	optional set of nodes can be viewed as the arguments to the policy
+	behavior.
 
 	Internally, memory policies are implemented by a reference counted
 	structure, struct mempolicy. Details of this structure will be discussed
@@ -145,7 +147,12 @@ Components of Memory Policies
 
 	Note: in some functions AND in the struct mempolicy itself, the mode
 	is called "policy". However, to avoid confusion with the policy tuple,
-	this document will continue to use the term "mode".
+	this document will continue to use the term "mode". Since the mode and
+	optional mode flags are stored in the same struct mempolicy member
+	(specifically, pol->policy), you must use mpol_mode(pol->policy) to
+	access only the mode and mpol_flags(pol->policy) to access only the
+	flags. Any function with a formal of type enum mempolicy_mode only
+	refers to the mode.
 
 	Linux memory policy supports the following 4 behavioral modes:
 
@@ -231,6 +238,28 @@ Components of Memory Policies
 	the temporary interleaved system default policy works in this
 	mode.
 
+	Linux memory policy supports the following optional mode flag:
+
+	MPOL_F_STATIC_NODES: This flag specifies that the nodemask passed by
+	the user should not be remapped if the task or VMA's set of accessible
+	nodes changes after the memory policy has been defined.
+
+	    Without this flag, anytime a mempolicy is rebound because of a
+	    change in the set of accessible nodes, the node (Preferred) or
+	    nodemask (Bind, Interleave) is remapped to the new set of
+	    accessible nodes. This may result in nodes being used that were
+	    previously undesired. With this flag, the policy is either
+	    effected over the user's specified nodemask or the Default
+	    behavior is used.
+
+	    For example, consider a task that is attached to a cpuset with
+	    mems 1-3 that sets an Interleave policy over the same set. If
+	    the cpuset's mems change to 3-5, the Interleave will now occur
+	    over nodes 3, 4, and 5. With this flag, however, since only
+	    node 3 is accessible from the user's nodemask, the "interleave"
+	    only occurs over that node. If no nodes from the user's
+	    nodemask are now accessible, the Default behavior is used.
+
 MEMORY POLICY APIs
 
 Linux supports 3 system calls for controlling memory policy. These APIS
@@ -251,7 +280,9 @@ Set [Task] Memory Policy:
 	Set's the calling task's "task/process memory policy" to mode
 	specified by the 'mode' argument and the set of nodes defined
 	by 'nmask'. 'nmask' points to a bit mask of node ids containing
-	at least 'maxnode' ids.
+	at least 'maxnode' ids. Optional mode flags may be passed by
+	combining the 'mode' argument with the flag (for example:
+	MPOL_INTERLEAVE | MPOL_F_STATIC_NODES).
 
 	See the set_mempolicy(2) man page for more details
 
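
The cpuset example in the MPOL_F_STATIC_NODES text (mems 1-3 changing to 3-5) can be worked through with a small userspace model. This is only a sketch of the behavior the documentation describes, not the kernel's rebind code: the toy_* names and the one-word nodemask type are hypothetical, and the remap assumes the user's nodemask is a subset of the old allowed set.

```c
/* Toy nodemask: bit n set means node n is in the set (hypothetical type). */
typedef unsigned long toy_nodemask_t;

#define TOY_MAX_NODES ((int)(8 * sizeof(toy_nodemask_t)))

/* Number of nodes in 'mask'. */
static int toy_weight(toy_nodemask_t mask)
{
	int count = 0;

	for (; mask; mask &= mask - 1)
		count++;
	return count;
}

/* Position of 'node' among the set bits of 'mask' (0-based). */
static int toy_ordinal(toy_nodemask_t mask, int node)
{
	return toy_weight(mask & ((1UL << node) - 1));
}

/* Node id of the n-th set bit (0-based) of 'mask', or -1 if there is none. */
static int toy_nth_node(toy_nodemask_t mask, int n)
{
	for (int node = 0; node < TOY_MAX_NODES; node++)
		if ((mask & (1UL << node)) && n-- == 0)
			return node;
	return -1;
}

/*
 * Default rebind (no flag): each user node keeps its relative position,
 * so the k-th node of the old allowed set maps to the k-th node of the
 * new allowed set. Assumption: user nodes outside old_allowed are ignored.
 */
toy_nodemask_t toy_rebind_remap(toy_nodemask_t user,
				toy_nodemask_t old_allowed,
				toy_nodemask_t new_allowed)
{
	toy_nodemask_t result = 0;
	int new_weight = toy_weight(new_allowed);

	if (new_weight == 0)
		return 0;
	for (int node = 0; node < TOY_MAX_NODES; node++) {
		if (!(user & old_allowed & (1UL << node)))
			continue;
		int pos = toy_ordinal(old_allowed, node) % new_weight;
		result |= 1UL << toy_nth_node(new_allowed, pos);
	}
	return result;
}

/*
 * Static rebind (MPOL_F_STATIC_NODES): the user's nodemask is never
 * remapped, only restricted to what is still accessible. An empty
 * result stands for "fall back to the Default behavior".
 */
toy_nodemask_t toy_rebind_static(toy_nodemask_t user,
				 toy_nodemask_t new_allowed)
{
	return user & new_allowed;
}
```

With nodes 1-3 written as 0x0e and nodes 3-5 as 0x38, toy_rebind_remap(0x0e, 0x0e, 0x38) gives 0x38 (interleave over 3, 4, and 5), while toy_rebind_static(0x0e, 0x38) gives 0x08 (node 3 only).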
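
Because the patch stores the mode and the optional mode flags in one integer, it may help to see how a combined value such as MPOL_INTERLEAVE | MPOL_F_STATIC_NODES splits back into its two parts. The sketch below is illustrative only: the TOY_ constants and the toy_mpol_mode()/toy_mpol_flags() helpers are hypothetical stand-ins for the mpol_mode() and mpol_flags() accessors the patch names, and the numeric values are assumptions chosen to resemble <linux/mempolicy.h>, not copied from it.

```c
/* Assumed mode values, in the order the documentation lists the modes. */
enum toy_mempolicy_mode {
	TOY_MPOL_DEFAULT,
	TOY_MPOL_PREFERRED,
	TOY_MPOL_BIND,
	TOY_MPOL_INTERLEAVE
};

/* Assumed flag bit, placed well above the range used by the modes. */
#define TOY_MPOL_F_STATIC_NODES	(1U << 15)

/* Mask covering every optional mode flag (only one exists in this sketch). */
#define TOY_MPOL_MODE_FLAGS	TOY_MPOL_F_STATIC_NODES

/* Return only the mode part of a combined policy value. */
unsigned int toy_mpol_mode(unsigned int policy)
{
	return policy & ~TOY_MPOL_MODE_FLAGS;
}

/* Return only the optional mode flags of a combined policy value. */
unsigned int toy_mpol_flags(unsigned int policy)
{
	return policy & TOY_MPOL_MODE_FLAGS;
}
```

A caller would build the value with a bitwise OR, for example TOY_MPOL_INTERLEAVE | TOY_MPOL_F_STATIC_NODES, and pass it where the 'mode' argument is expected; the receiving side then applies the two helpers to recover the mode and the flags separately.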