Benjamin Bannier created MESOS-7800:
---------------------------------------
Summary: Tasks with many labels can cause disproportionally huge
allocations
Key: MESOS-7800
URL: https://issues.apache.org/jira/browse/MESOS-7800
Project: Mesos
Issue Type: Bug
Components: agent, master
Reporter: Benjamin Bannier
{{mesos.proto}} provides the {{Labels}} message so others can add free-form
data to a number of messages. For e.g., {{TaskInfo}} and {{ExecutorInfo}}, we
explicitly document:
{quote}
Therefore, labels should be used to tag tasks with light-weight meta-data.
{quote}
However, we never enforce this requirement.
This becomes problematic, e.g., in the agent, where a {{TaskInfo}} will likely be
copied often, e.g., due to multiple levels of dispatches. I have measured that
a single {{Label}} can trigger 50-100 concurrent copies in flight on the
agent's container launch path; our general assumption here seems to be that
while a {{TaskInfo}} is not necessarily small, it is still not huge.
If users embed a lot of data into, e.g., {{TaskInfo}} {{labels}}, this can
temporarily blow up the agent process's memory footprint, which in turn can
get the agent killed by the OS.
Due to the potential negative effects of huge {{labels}}, we should evaluate how
we can limit the amount of data we accept from users. This could mean limiting
the size of the {{TaskInfo}} or {{Labels}} we accept, measured, e.g., by the
message's {{ByteSizeLong}}. A value somehow related to {{ARG_MAX}} seems
intuitive, but I am not sure whether we can go as low as the POSIX-mandated
minimum of 4096 bytes.
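
As a rough illustration of what such a size check could look like, here is a minimal, self-contained sketch. It is not Mesos code: the {{Label}} struct is a hypothetical stand-in for the protobuf message, {{payloadBytes}} only approximates what {{ByteSizeLong}} would report (it ignores protobuf framing overhead), and {{kMaxLabelsBytes}} and {{validateLabels}} are names made up for this example. The default limit of 4096 is just the POSIX minimum mentioned above, not a recommendation.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for the protobuf `Label` message (key/value pair).
struct Label
{
  std::string key;
  std::string value;
};

// Assumed limit for illustration only; 4096 is the POSIX-mandated
// minimum value of ARG_MAX.
constexpr std::size_t kMaxLabelsBytes = 4096;

// Approximates the payload size by summing key and value lengths.
// A real check would use the message's `ByteSizeLong()` instead.
std::size_t payloadBytes(const std::vector<Label>& labels)
{
  std::size_t total = 0;
  for (const Label& label : labels) {
    total += label.key.size() + label.value.size();
  }
  return total;
}

// Returns an error message if the labels exceed `limit` bytes,
// and `std::nullopt` if they are acceptable.
std::optional<std::string> validateLabels(
    const std::vector<Label>& labels,
    std::size_t limit = kMaxLabelsBytes)
{
  const std::size_t size = payloadBytes(labels);
  if (size > limit) {
    return "Labels payload of " + std::to_string(size) +
           " bytes exceeds the limit of " + std::to_string(limit) + " bytes";
  }
  return std::nullopt;
}
```

A check along these lines would presumably run once when the message is accepted, so that oversized messages are rejected before they are copied around on the launch path.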