Thanks for your review. I'd like to add the following interdiff
commit 3ae104892fb695830d69d557c9706e202014030d Author: Klaus Aehlig <[email protected]> Date: Tue Sep 1 20:35:20 2015 +0200 Interdiff diff --git a/doc/design-memory-over-commitment.rst b/doc/design-memory-over-commitment.rst index 2a84085..5489a9a 100644 --- a/doc/design-memory-over-commitment.rst +++ b/doc/design-memory-over-commitment.rst @@ -10,9 +10,9 @@ overcommitment in Ganeti. Background ========== -Memory is a non-preemptable resource, thus cannot be shared, e.g., +Memory is a non-preemptable resource, and thus cannot be shared, e.g., in a round-robin fashion. Therefore, Ganeti is very careful to make -sure, there is always enough physical memory for the memory promised +sure there is always enough physical memory for the memory promised to the instances. In fact, even in an N+1 redundant way: should one node fail, its instances can be relocated to other nodes while still having enough physical memory for the memory promised to all instances. @@ -20,15 +20,15 @@ having enough physical memory for the memory promised to all instances. Overview over the current memory model -------------------------------------- -To make decissions, ``htools`` query the following parameter from Ganeti. +To make decisions, ``htools`` query the following parameters from Ganeti. - The amount of memory used by each instance. This is the state-of-record backend parameter ``maxmem`` for that instance (maybe inherited from group-level or cluster-level backend paramters). It tells the hypervisor the maximal amount of memory that instance may use. -- The state-of-world paramters for the node memory. They are collected - live and are hypervisor specific. The following paramters are collected. +- The state-of-world parameters for the node memory. They are collected + live and are hypervisor specific. The following parameters are collected. - memory_total: the total memory size on the node @@ -37,7 +37,7 @@ To make decissions, ``htools`` query the following parameter from Ganeti. - memory_dom0: the memory used by the node itself, if available For Xen, the amount of total and free memory are obtained by parsing - the outout of Xen ``info`` command (e.g., ``xm info``). The dom0 + the output of Xen ``info`` command (e.g., ``xm info``). The dom0 memory is obtained by looking in the output of the ``list`` command for ``Domain-0``. @@ -51,12 +51,12 @@ To make decissions, ``htools`` query the following parameter from Ganeti. Current state and shortcomings ============================== -While the current model of never over provisioning memory serves well +While the current model of never over committing memory serves well to provide reliability guarantees to instances, it does not suit well situations were the actual use of memory in the instances is spiky. Consider a scenario where instances only touch a small portion of their memory most -of the time, but occasionally use a large amount memory. Then, at any moment, -a large fraction of the memory used for the instances sits there without +of the time, but occasionally use a large amount of memory. Then, at any moment, +a large fraction of the memory used for the instances sits around without being actively used. By swapping out the not actively used memory, resources can be used more efficiently. @@ -64,15 +64,15 @@ Proposed changes ================ We propose to support over commitment of memory if desired by the -administrator. Memory will change from beeing a hard constraint to +administrator. Memory will change from being a hard constraint to being a question of policy. The default will be not to over commit memory. Extension of the policy by a new parameter ------------------------------------------ -The instance policy is extenden by a new real-number field ``memory-ratio``. -Policies on groups inherit this paramter from the cluster wide policy in the +The instance policy is extended by a new real-number field ``memory-ratio``. +Policies on groups inherit this parameter from the cluster wide policy in the same way as all other parameters of the instance policy. When a cluster is upgraded from an earlier version not containing @@ -100,9 +100,9 @@ Therefore, for ``kvm`` we will report as ``dom0`` the state-of-record backend paramter ``memory_dom0`` for the ``kvm`` hypervisor. As a hypervisor backend paramter, it is run-time tunable and inheritable at group level. If this paramter is not present, a default value of ``1024M`` will be used, -which is conservative estimate of the amount of memory used by Ganeti on a +which is a conservative estimate of the amount of memory used by Ganeti on a medium-sized cluster. The reason for using a state-of-record value is to -have a stable amount of reserved memory, irrsespectively of the current activity +have a stable amount of reserved memory, irrespective of the current activity of Ganeti. @@ -112,9 +112,11 @@ Changes to the memory policy The memory policy will be changed in that we assume that one byte of physical node memory can hold ``memory-ratio`` bytes of instance memory, but still only one byte of Ganeti memory. Of course, in practise -this has to be backed by swap sapce; it is the administrator responsibility +this has to be backed by swap space; it is the administrator responsibility to ensure that each node has swap of at -least ``(memory-ratio - 1.0) * (memory_total - memory_dom0)``. +least ``(memory-ratio - 1.0) * (memory_total - memory_dom0)``. Ganeti +will warn if the amount of swap space is not big enough. + The new memory policy will be as follows. @@ -134,9 +136,9 @@ The new memory policy will be as follows. memory-ratio this is considered a hard violation, otherwise it is considered a soft violation. -- The definition of N+1 redunandcy (including +- The definition of N+1 redundancy (including :doc:`design-shared-storage-redundancy`) is kept literally as is. - Note, however, that the meaning does change, the definition depends + Note, however, that the meaning does change, as definition depends on the notion of allowed moves, which is changed by this proposal. @@ -147,6 +149,9 @@ The only place where the Ganeti core handles memory is when ``gnt-cluster verify`` verifies N+1 redundancy. This code will be changed to follow the new memory model. +Additionally, ``gnt-cluster verify`` will warn if the sum of available memory +and swap space is not at least as big as the used memory. + Changes to ``htools`` --------------------- @@ -166,8 +171,12 @@ discardable memory will serve as an indication of memory activity; as usual, the measure will be the standard deviation of the relative value (i.e., the ratio of non-discardable memory to available memory). The weighting for this metric component will have to be -determined by experimentation; as starting point we will use the -status quo value of ``1.0``. +determined by experimentation and will depend on the memory ratio; +for a memory ratio of ``1.0`` the weight will be ``0.0``, as memory +need not be taken into account if no over-commitment is in place. +As a starting point for experimenation we will use weight ``0.0`` +if the memory ratio is ``1.0`` and weight ``1.0`` if it is bigger +than ``1.0``. -- Klaus Aehlig Google Germany GmbH, Dienerstr. 12, 80331 Muenchen Registergericht und -nummer: Hamburg, HRB 86891 Sitz der Gesellschaft: Hamburg Geschaeftsfuehrer: Graham Law, Christine Elizabeth Flores
