[ 
https://issues.apache.org/jira/browse/MESOS-7800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Rukletsov updated MESOS-7800:
---------------------------------------
    Labels: mesosphere reliability  (was: mesosphere)

> Tasks with many labels can cause disproportionally huge allocations
> -------------------------------------------------------------------
>
>                 Key: MESOS-7800
>                 URL: https://issues.apache.org/jira/browse/MESOS-7800
>             Project: Mesos
>          Issue Type: Bug
>          Components: agent, master
>            Reporter: Benjamin Bannier
>              Labels: mesosphere, reliability
>         Attachments: stat_all_task_labels.dat, stat_individual_labels.dat
>
>
> {{mesos.proto}} provides the {{Labels}} message so others can attach 
> free-form data to a number of messages. For e.g., {{TaskInfo}} and 
> {{ExecutorInfo}}, we explicitly document:
> {quote}
> Therefore, labels should be used to tag tasks with light-weight meta-data.
> {quote}
> However, we never enforce this requirement.
> This becomes problematic, e.g., in the agent, where a {{TaskInfo}} is likely 
> to be copied often, for instance due to multiple levels of dispatches. I have 
> measured that a single {{Label}} can have 50-100 copies in flight 
> concurrently on the agent's container launch path; our general assumption 
> here seems to be that while a {{TaskInfo}} is not necessarily small, it is 
> still not huge.
> If users embed a lot of data into, e.g., {{TaskInfo}} {{labels}}, this can 
> temporarily blow up the agent process's memory footprint, which can get the 
> agent killed by the OS.
> Given the potential negative effects of huge {{labels}}, we should evaluate 
> how to limit the amount of data we accept from users. This could mean 
> limiting the size of the {{TaskInfo}} or {{Labels}} we accept, measured, 
> e.g., by the message's {{ByteSizeLong}}. A value somehow related to 
> {{ARG_MAX}} seems intuitive, but I am not sure whether we can go as low as 
> the POSIX-mandated minimum of 4096.
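
As a rough illustration of the size check proposed in the description above, here is a minimal sketch. It is not Mesos code: the `Label` struct is a hypothetical stand-in for the generated protobuf type, `approximateSize`, `withinLimit`, `transientFootprint`, and `kMaxLabelsBytes` are names made up for this example, and the limit of 4096 is only the candidate value floated in the ticket. With real protobuf messages one would presumably call `ByteSizeLong()` on the message rather than summing string lengths.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the protobuf `Label` message (a key plus an
// optional value). A real check would operate on the generated types.
struct Label {
  std::string key;
  std::string value;
};

// Hypothetical limit: the POSIX minimum for ARG_MAX mentioned in the ticket.
// Whether a value this low is workable is an open question there.
constexpr std::size_t kMaxLabelsBytes = 4096;

// Approximate payload size of a set of labels. This only sums the string
// lengths; protobuf's `ByteSizeLong()` would also count wire-format overhead.
std::size_t approximateSize(const std::vector<Label>& labels) {
  std::size_t total = 0;
  for (const Label& label : labels) {
    total += label.key.size() + label.value.size();
  }
  return total;
}

// Returns true if the labels fit within `limit` bytes.
bool withinLimit(
    const std::vector<Label>& labels,
    std::size_t limit = kMaxLabelsBytes) {
  return approximateSize(labels) <= limit;
}

// Rough estimate of the transient memory held by label data when `copies`
// copies are in flight at once (the ticket reports 50-100 on the agent's
// container launch path).
std::size_t transientFootprint(std::size_t labelsBytes, std::size_t copies) {
  return labelsBytes * copies;
}
```

Under these assumptions, a task carrying 10 MiB of label data with 100 copies in flight would transiently hold about 1000 MiB, while labels capped at 4096 bytes would hold about 400 KiB in the same situation.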



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
