[
https://issues.apache.org/jira/browse/MESOS-4598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15133497#comment-15133497
]
Joseph Wu commented on MESOS-4598:
----------------------------------
I agree that we need a centralized fix.
I see the following scenarios:
|| || Subprocess uses libprocess || Subprocess is something else ||
|| Subprocess sets/inherits the same {{PORT}} by accident | Bind failure -> exit. Option #1 above prevents accidental inheritance. | Nothing happens (?) |
|| Subprocess sets a different {{PORT}} on purpose | Bind success (?) | Nothing happens (?) |
My thought for a complete fix is the following changes:
* If the {{subprocess}} call gets {{environment = None()}}, we should
automatically remove {{LIBPROCESS_PORT}} from the inherited environment.
** I'd prefer not to unset {{LIBPROCESS_PORT}} on initialization because this
makes it harder to catch the upper-left error above. Also, the V1 HTTP
scheduler library tests will eventually need to re-initialize libprocess
between tests.
* The parts of
[{{executorEnvironment}}|https://github.com/apache/mesos/blame/master/src/slave/containerizer/containerizer.cpp#L265]
dealing with libprocess & libmesos should be refactored into libprocess as a
helper. We would use this helper for the Containerizer, Fetcher, and
ContainerLogger module.
* If the {{subprocess}} call is given an environment where {{LIBPROCESS_PORT ==
os::getenv("LIBPROCESS_PORT")}}, we can probably return an {{Error}}
immediately, or log a warning and unset the env var locally.
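To make the first and last bullets concrete, here is a rough sketch of the environment handling, written against the standard library only. The names {{Environment}}, {{ChildEnvironment}}, and {{prepareChildEnvironment}} are hypothetical and are not part of libprocess or stout. The parent environment is passed in as a map instead of being read via {{os::getenv}}, so the logic stays self-contained. The sketch takes the "return an {{Error}}" option; the "warn and unset" variant would erase the key instead.

```cpp
#include <map>
#include <optional>
#include <string>

// Hypothetical stand-in for the environment map handed to `subprocess`.
using Environment = std::map<std::string, std::string>;

// Hypothetical result type: the environment for the child, or an error.
struct ChildEnvironment
{
  Environment environment;
  std::optional<std::string> error;
};

// Sketch of the proposed handling (an assumption, not actual libprocess code):
//  * No explicit environment: inherit the parent's, minus LIBPROCESS_PORT.
//  * Explicit environment repeating the parent's LIBPROCESS_PORT: error out.
//  * Otherwise: pass the requested environment through unchanged.
ChildEnvironment prepareChildEnvironment(
    const Environment& parent,
    const std::optional<Environment>& requested)
{
  const std::string key = "LIBPROCESS_PORT";

  if (!requested.has_value()) {
    // Everything else (e.g. LIBPROCESS_IP) is still inherited.
    Environment inherited = parent;
    inherited.erase(key);
    return ChildEnvironment{inherited, std::nullopt};
  }

  const auto parentPort = parent.find(key);
  const auto childPort = requested->find(key);

  if (parentPort != parent.end() &&
      childPort != requested->end() &&
      parentPort->second == childPort->second) {
    // A libprocess-using child would fail to bind this port.
    return ChildEnvironment{
        Environment{},
        std::string("Subprocess would reuse the parent's ") + key +
          " (" + childPort->second + ")"};
  }

  return ChildEnvironment{*requested, std::nullopt};
}
```

With a parent environment containing {{LIBPROCESS_PORT=5051}} and {{LIBPROCESS_IP}}, passing no environment yields a child environment that keeps the IP but drops the port. Passing an environment that repeats port 5051 yields an error, and a different port passes through as requested.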
> Logrotate ContainerLogger should not remove IP from environment.
> ----------------------------------------------------------------
>
> Key: MESOS-4598
> URL: https://issues.apache.org/jira/browse/MESOS-4598
> Project: Mesos
> Issue Type: Bug
> Affects Versions: 0.27.0
> Reporter: Joseph Wu
> Assignee: Joseph Wu
> Labels: mesosphere
>
> The {{LogrotateContainerLogger}} starts libprocess-using subprocesses.
> Libprocess initialization will attempt to resolve the IP from the hostname.
> If a DNS service is not available, this step will fail, which terminates the
> logger subprocess prematurely.
> Since the logger subprocesses live on the agent, they should use the same
> {{LIBPROCESS_IP}} supplied to the agent.