[ https://issues.apache.org/jira/browse/MESOS-3165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15021116#comment-15021116 ]
Joris Van Remoortere commented on MESOS-3165:
---------------------------------------------
{code}
commit f0b846e15aa941ac3273e42197dfc34d39f11b25
Author: Alexander Rukletsov <[email protected]>
Date: Sun Nov 22 13:16:13 2015 -0500
Quota: Persisted quota to registry for set request.
Originally https://reviews.apache.org/r/39974
Review: https://reviews.apache.org/r/40401
commit 209fad094f36babd2e1011cae34ac3865de09f0c
Author: Alexander Rukletsov <[email protected]>
Date: Sun Nov 22 13:05:59 2015 -0500
Quota: Added registry tests.
Originally https://reviews.apache.org/r/39983
Review: https://reviews.apache.org/r/40400
commit 154f6b0ee5e9dfafb6f777eee58991bc43dffdcf
Author: Alexander Rukletsov <[email protected]>
Date: Sun Nov 22 12:10:19 2015 -0500
Quota: Introduced quota registry operations.
Originally https://reviews.apache.org/r/38958
Review: https://reviews.apache.org/r/40399
{code}
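For readers unfamiliar with the registrar, the shape of the quota operations named in the issue below can be pictured with a simplified sketch. This is not the committed code: `Registry`, `QuotaEntry`, and `Operation` here are hypothetical stand-ins for the real protobuf-backed types, only the names `AddQuota` and `RemoveQuota` come from the issue description, and the rule that adding a quota for a role that already has one is a no-op is an assumption made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the registry state; NOT the actual Mesos types.
struct QuotaEntry {
  std::string role;
  double cpus;
};

struct Registry {
  std::vector<QuotaEntry> quotas;
};

// An operation mutates the registry and reports whether it changed anything,
// so the caller knows whether the registry must be written out again.
class Operation {
public:
  virtual ~Operation() = default;
  virtual bool perform(Registry* registry) = 0;
};

// Adds a quota for a role. Assumption: a role that already has a quota is
// left untouched and the operation reports "no mutation".
class AddQuota : public Operation {
public:
  explicit AddQuota(QuotaEntry entry) : entry_(std::move(entry)) {}

  bool perform(Registry* registry) override {
    assert(registry != nullptr);
    for (const QuotaEntry& existing : registry->quotas) {
      if (existing.role == entry_.role) {
        return false;
      }
    }
    registry->quotas.push_back(entry_);
    return true;
  }

private:
  QuotaEntry entry_;
};

// Removes the quota for a role, if one is present.
class RemoveQuota : public Operation {
public:
  explicit RemoveQuota(std::string role) : role_(std::move(role)) {}

  bool perform(Registry* registry) override {
    assert(registry != nullptr);
    auto it = std::remove_if(
        registry->quotas.begin(),
        registry->quotas.end(),
        [this](const QuotaEntry& entry) { return entry.role == role_; });
    const bool removed = it != registry->quotas.end();
    registry->quotas.erase(it, registry->quotas.end());
    return removed;
  }

private:
  std::string role_;
};
```

In this sketch a caller would construct an operation, apply it with `perform(&registry)`, and persist the registry only when the call returns true.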
> Persist and recover quota to/from Registry
> ------------------------------------------
>
> Key: MESOS-3165
> URL: https://issues.apache.org/jira/browse/MESOS-3165
> Project: Mesos
> Issue Type: Task
> Components: master, replicated log
> Reporter: Alexander Rukletsov
> Assignee: Alexander Rukletsov
> Labels: mesosphere
>
> To persist quotas across failovers, the Master should save them in the
> registry. To support this, we shall:
> * Introduce a Quota state variable in registry.proto;
> * Extend the Operation interface so that it supports a ‘Quota’ accumulator
> (see src/master/registrar.hpp);
> * Introduce AddQuota / RemoveQuota operations;
> * Recover quotas from the registry on failover to the Master’s
> internal::master::Role struct;
> * Extend RegistrarTest with quota-specific tests.
> NOTE: The Registry variable can be rather big for production clusters (see
> MESOS-2075). While it should be fine for MVP to add quota information to
> the registry, we should consider storing Quota separately, as it does not
> need to be in sync with slave updates. However, adding more variables is
> currently not supported by the registrar.
> While the Agents are reregistering (note they may fail to do so), the
> information about what part of the quota is allocated is only partially
> available to the Master. In other words, the state of the quota allocation is
> reconstructed as Agents reregister. During this period, some roles may be
> under quota from the perspective of the newly elected Master.
> The same problem exists on the allocator side: it may think the cluster is
> under quota and may eagerly try to satisfy quotas before enough Agents
> reregister, which may result in resources being allocated to frameworks
> beyond their quota. To address this issue and also to avoid panicking and
> generating under-quota alerts, the Master should give a certain amount of
> time for the majority (e.g. 80%) of the Agents to reregister before reporting
> any quota status and notifying the allocator about granted quotas.
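The waiting rule proposed in the last paragraph of the description can be sketched as a small gate. This is only an illustration under stated assumptions: the class `ReregistrationGate` and its methods are hypothetical names, not Mesos APIs, and treating an expired timeout as "ready" even when the threshold was not reached is one possible reading of "a certain amount of time". The 80% figure is the example value from the description.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>

// Hypothetical helper, not part of Mesos: decides when the Master may start
// reporting quota status and notifying the allocator after a failover.
class ReregistrationGate {
public:
  using Clock = std::chrono::steady_clock;

  ReregistrationGate(
      std::size_t expectedAgents,
      unsigned thresholdPercent,
      std::chrono::seconds timeout,
      Clock::time_point start)
    : expected_(expectedAgents),
      thresholdPercent_(thresholdPercent),
      timeout_(timeout),
      start_(start),
      reregistered_(0) {
    assert(thresholdPercent <= 100);
  }

  // Called once per Agent that successfully reregisters.
  void agentReregistered() {
    if (reregistered_ < expected_) {
      ++reregistered_;
    }
  }

  // True once enough Agents are back, or (assumption) once the timeout has
  // elapsed, whichever happens first.
  bool ready(Clock::time_point now) const {
    if (now - start_ >= timeout_) {
      return true;
    }
    // Integer arithmetic avoids floating-point rounding at the boundary.
    return reregistered_ * 100 >= expected_ * thresholdPercent_;
  }

private:
  std::size_t expected_;
  unsigned thresholdPercent_;
  std::chrono::seconds timeout_;
  Clock::time_point start_;
  std::size_t reregistered_;
};
```

With 10 expected Agents and a threshold of 80, the gate stays closed while 7 Agents have reregistered and opens on the eighth; if Agents never return, it opens when the timeout expires so that quota handling is not blocked indefinitely.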
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)