[ https://issues.apache.org/jira/browse/MESOS-7564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Joseph Wu reassigned MESOS-7564:
--------------------------------

    Assignee: Joseph Wu

> Introduce a heartbeat mechanism for v1 HTTP executor <-> agent communication.
> -----------------------------------------------------------------------------
>
>                 Key: MESOS-7564
>                 URL: https://issues.apache.org/jira/browse/MESOS-7564
>             Project: Mesos
>          Issue Type: Bug
>          Components: agent, executor
>            Reporter: Anand Mazumdar
>            Assignee: Joseph Wu
>            Priority: Critical
>              Labels: api, mesosphere, v1_api
>
> Currently, we do not have heartbeats for executor <-> agent communication.
> This is especially problematic in scenarios where IPFilters are enabled, since
> the default conntrack keep-alive timeout is 5 days. Once that timeout has
> elapsed, the executor is not notified via a socket disconnection when
> the agent process restarts. The executor is then killed if it has not
> re-registered by the time the agent recovery process completes.
> Enabling application-level heartbeats or TCP keepalives is a possible
> way to fix this issue.
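
The two options named in the description can be sketched roughly as follows. This is only an illustration in Python, not Mesos code: the function and class names (enable_tcp_keepalive, HeartbeatTracker), the timer values, and the "three missed heartbeats" threshold are all assumptions made for the example. The TCP_KEEPIDLE, TCP_KEEPINTVL and TCP_KEEPCNT options are platform-specific (available on Linux), so the sketch applies them only where the socket module exposes them.

```python
import socket
import time


def enable_tcp_keepalive(sock, idle_secs=60, interval_secs=10, probe_count=5):
    """Turn on TCP keepalive probes for `sock` (values are illustrative)."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Per-socket timer tuning is platform-specific; skip options that the
    # local socket module does not define.
    tuning = (
        ("TCP_KEEPIDLE", idle_secs),       # idle time before the first probe
        ("TCP_KEEPINTVL", interval_secs),  # gap between unanswered probes
        ("TCP_KEEPCNT", probe_count),      # unanswered probes before reset
    )
    for name, value in tuning:
        option = getattr(socket, name, None)
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)
    return sock


class HeartbeatTracker:
    """Hypothetical application-level check: is the peer overdue?"""

    def __init__(self, interval_secs, missed_allowed=3, now=time.monotonic):
        # Treat the peer as stale once this many seconds pass in silence.
        self._deadline_secs = interval_secs * missed_allowed
        self._now = now
        self._last_seen = now()

    def record(self):
        """Call whenever a heartbeat (or any message) arrives from the peer."""
        self._last_seen = self._now()

    def peer_is_stale(self):
        """True if no heartbeat has been recorded within the deadline."""
        return self._now() - self._last_seen > self._deadline_secs
```

With keepalive probes sent more often than the connection-tracking timeout, the tracked connection should not sit idle long enough to be dropped silently, and a dead peer eventually surfaces as a socket error. The tracker takes the application-level route instead: the receiver of heartbeats calls record() on each one and treats peer_is_stale() as a signal to reconnect or re-register.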